
Use the App Autoscaler

In this topic you will learn how to use the App Autoscaler. Cloud Foundry offers simple elasticity for your applications: you can easily scale application instances up and down as you like. The App Autoscaler service lets you automate this scaling of application instances based on a rule set you define yourself.

The App Autoscaler service component is based on an open source project. You can check out its code in the public App Autoscaler repository.

The App AutoScaler provides the capability to adjust the computation resources for Cloud Foundry applications through:

  • Dynamic scaling based on application performance metrics
  • Scheduled scaling based on time

App AutoScaler ensures that your application has the right number of instances to maintain service quality while you pay only for the resources you really need. With auto-scaling, your application gains the following benefits:

  • Better service quality: Auto-scaling monitors your application's performance and scales it out or in horizontally to match resource demand. In this way, the service quality target defined in your policy can be maintained.
  • Better availability: Auto-scaling prepares enough application instances in advance when a workload spike is predictable for a given period of time, avoiding overload and application crashes during the spike.
  • Cost reduction: Auto-scaling lets you pay only for the resources you really need. You do not need to do capacity planning or deploy your application with a fixed number of instances; it saves money by starting new instances when your application needs them and destroying them when it does not.

App AutoScaler is offered as a Cloud Foundry service via the Open Service Broker API. You first need to provision and bind an App AutoScaler service instance through the Cloud Foundry CLI or the AppCloud Portal.

Terminal window
cf create-service autoscaler autoscaler-free-plan <service_instance_name>

Bind the service instance to your application


You can attach a scaling policy together with the service binding by passing the policy file name as a parameter to the bind command.

Terminal window
cf bind-service <app_name> <service_instance_name> -c <policy_file_name>

To disconnect App AutoScaler from your application, unbind the service instance. This also removes the autoscaling policy. Once no application is bound, you can deprovision the service instance.

Terminal window
cf unbind-service <app_name> <service_instance_name>
cf delete-service <service_instance_name>

Sample scaling policy JSON-file:

{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {
      "metric_type": "cpu",
      "threshold": 5,
      "operator": ">=",
      "adjustment": "+1"
    },
    {
      "metric_type": "cpu",
      "threshold": 5,
      "operator": "<",
      "adjustment": "-1"
    }
  ]
}
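Before binding a policy like the one above, it can help to sanity-check it. The following sketch is our own illustration and not part of the service; the `validate_policy` helper and its checks are made up, but the field names and the adjustment regex follow the policy schema described below.

```python
import json
import re

# Adjustment strings must match this format per the policy schema,
# e.g. "+1", "-2", "-50%".
ADJUSTMENT_RE = re.compile(r"^[-+][1-9]+[0-9]*[%]?$")

def validate_policy(policy):
    """Return a list of problems found in the policy (empty if it looks OK)."""
    problems = []
    if policy["instance_min_count"] < 1:
        problems.append("instance_min_count must be at least 1")
    if policy["instance_max_count"] < policy["instance_min_count"]:
        problems.append("instance_max_count must be >= instance_min_count")
    for rule in policy.get("scaling_rules", []):
        if rule["operator"] not in (">", "<", ">=", "<="):
            problems.append("bad operator: " + rule["operator"])
        if not ADJUSTMENT_RE.match(rule["adjustment"]):
            problems.append("bad adjustment: " + rule["adjustment"])
    return problems

policy = json.loads("""
{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {"metric_type": "cpu", "threshold": 5, "operator": ">=", "adjustment": "+1"},
    {"metric_type": "cpu", "threshold": 5, "operator": "<", "adjustment": "-1"}
  ]
}
""")
print(validate_policy(policy))  # → []
```

Running such a check locally catches schema mistakes before `cf bind-service` rejects the policy.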

App AutoScaler requires a policy file written in JSON with the following schemas:

Name | Type | Required | Description
instance_min_count | int | true | minimum number of instances
instance_max_count | int | true | maximum number of instances
scaling_rules | JSON-array<scaling_rules> | false | dynamic scaling rules, see Scaling Rules below
schedules | JSON-object<schedules> | false | scheduled scaling rules, see Schedules below
Scaling Rules

Name | Type | Required | Description
metric_type | string | true | the metric to monitor, see the list of supported metrics below
threshold | int | true | the boundary value; when the metric exceeds it, a breach is recorded
operator | string | true | >, <, >=, <=
adjustment | string | true | the adjustment of the instance count on each scaling event; supports the regex format ^[-+][1-9]+[0-9]*[%]?$, e.g. +5 means adding 5 instances, -50% means shrinking to half of the current size
breach_duration_secs | int, seconds | false | how long the metric must keep breaching before a scaling event fires
cool_down_secs | int, seconds | false | how long to wait before the next scaling event can fire
  • CPU - “cpu”, a short name of “cpu utilization”, is the cpu usage of your application in percentage;
  • Memoryused - “memoryused” represents the absolute value of the used memory of your application. The unit of “memoryused” metric is “MB”;
  • Memoryutil - “memoryutil”, a short name of “memory utilization”, is the used memory of the total memory allocated to the application in percentage. For example, if the memory usage of the application is 100MB and memory quota is 200MB, the value of “memoryutil” is 50%;
  • Responsetime - “responsetime” represents the average amount of time the application takes to respond to a request in a given time period. The unit of “responsetime” is “ms” (milliseconds);
  • Throughput - “throughput” is the total number of processed requests in a given time period. The unit of “throughput” is “rps” (requests per second);
  • Custom metric - You can define your own metric name and emit your own metric to App Autoscaler to trigger dynamic scaling. Only letters, digits, and “_” are allowed in a metric name, and the name is limited to a maximum of 100 characters.
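The custom metric naming rule above can be expressed as a simple check. This is an illustrative sketch, not an official validator; the `is_valid_metric_name` helper is our own, and we assume the name may start with any allowed character.

```python
import re

# Letters, digits, and "_" only, at most 100 characters,
# per the custom-metric naming rule.
METRIC_NAME_RE = re.compile(r"^[A-Za-z0-9_]{1,100}$")

def is_valid_metric_name(name):
    return METRIC_NAME_RE.match(name) is not None

print(is_valid_metric_name("queue_depth"))  # True
print(is_valid_metric_name("queue-depth"))  # False: "-" is not allowed
```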
Schedules

Name | Type | Required | Description
timezone | string | true | using the timezone definitions of Java
recurring_schedule | JSON-array<recurring_schedule> | false | schedules that take effect repeatedly, see Recurring Schedule below
specific_date | JSON-array<specific_date> | false | schedules that take effect only once, see Specific Date below
Recurring Schedule

Name | Type | Required | Description
start_date | string, “yyyy-mm-dd” | false | the start date of the schedule; must be in the future
end_date | string, “yyyy-mm-dd” | false | the end date of the schedule; must be in the future
start_time | string, “hh:mm” | true | the start time of the schedule
end_time | string, “hh:mm” | true | the end time of the schedule
days_of_week / days_of_month | array | false | recurring days of the week or month, e.g. [1,2,...,7] or [1,2,...,31]
instance_min_count | int | true | minimum number of instances for this schedule
instance_max_count | int | true | maximum number of instances for this schedule
initial_min_instance_count | int | false | the initial minimum number of instances for this schedule
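For illustration, a policy fragment using these schedule fields might look like the following. The timezone, times, day numbers, and instance counts are all made-up example values:

```json
{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "schedules": {
    "timezone": "Europe/Berlin",
    "recurring_schedule": [
      {
        "start_time": "08:00",
        "end_time": "18:00",
        "days_of_week": [1, 2, 3, 4, 5],
        "instance_min_count": 2,
        "instance_max_count": 6,
        "initial_min_instance_count": 3
      }
    ]
  }
}
```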

If one schedule overlaps another, the schedule that starts first takes precedence and the later one is ignored completely, for example:

Schedule #1: ———sssssssssss——————————————
Schedule #2: ——————————ssssssssss————————
Schedule #3: ————————————————sssssssss———

With the above definition, schedule #1 and #3 will be applied, while schedule #2 is ignored.
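The first-start-wins rule in the diagram can be sketched as follows. This mirrors the behavior described above with times simplified to integers; it is our own illustration, not the service's actual implementation.

```python
# First-start-wins overlap resolution, as in the diagram above.
# Schedules are (name, start, end) tuples; times are simplified to integers.
def resolve_overlaps(schedules):
    """Keep each schedule unless it overlaps an already-accepted one."""
    accepted = []
    for name, start, end in sorted(schedules, key=lambda s: s[1]):
        if all(start >= a_end or end <= a_start for _, a_start, a_end in accepted):
            accepted.append((name, start, end))
    return [name for name, _, _ in accepted]

schedules = [
    ("#1", 3, 14),   # applied
    ("#2", 10, 20),  # overlaps #1, which started first → ignored
    ("#3", 16, 25),  # starts after #1 ends → applied
]
print(resolve_overlaps(schedules))  # → ['#1', '#3']
```

Note that schedule #3 is accepted even though it overlaps #2, because #2 was already discarded.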