
Use the App Autoscaler

In this topic you will learn how to use the App Autoscaler. Cloud Foundry offers simple elasticity for your applications: you can easily scale application instances up and down as you like. The App Autoscaler service lets you automate this scaling of application instances based on a rule set you define yourself.

The App Autoscaler service component is based on an open source project. You can check out its code in the public App Autoscaler repository.

The App AutoScaler provides the capability to adjust the computation resources for Cloud Foundry applications through:

  • Dynamic scaling based on application performance metrics
  • Scheduled scaling based on time

App AutoScaler ensures that your application has the right number of instances to maintain service quality while you pay only for the resources you really need. With auto-scaling, your application gains the following benefits:

  • Better service quality: Auto-scaling monitors your application's performance and scales it out or in horizontally to match resource demand. In this way, the service quality target defined in your policy can be maintained.
  • Better availability: Auto-scaling prepares enough application instances in advance when a workload spike is predictable for a given period of time, avoiding overload and application crashes during the spike.
  • Cost reduction: Auto-scaling lets you pay only for the resources you really need. You do not need to do capacity planning or deploy your application with a fixed number of instances; it saves money by starting new instances when your application needs them and destroying them when it does not.

App AutoScaler is offered as a Cloud Foundry service via the Open Service Broker API. You first need to provision and bind an App AutoScaler service instance through the Cloud Foundry CLI or the AppCloud Portal.

Terminal window
cf create-service autoscaler autoscaler-free-plan <service_instance_name>

Bind the service instance to your application


You can attach a scaling policy together with the service binding by passing the policy file name as a parameter to the bind command.

Terminal window
cf bind-service <app_name> <service_instance_name> -c <policy_file_name>

To disconnect App AutoScaler from your application, unbind the service instance. This also removes the autoscaling policy. Once no application is bound, you can deprovision the service instance.

Terminal window
cf unbind-service <app_name> <service_instance_name>
cf delete-service <service_instance_name>

Sample scaling policy JSON-file:

{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {
      "metric_type": "cpu",
      "threshold": 5,
      "operator": ">=",
      "adjustment": "+1"
    },
    {
      "metric_type": "cpu",
      "threshold": 5,
      "operator": "<",
      "adjustment": "-1"
    }
  ]
}
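Before binding a policy like the one above, it can help to sanity-check it. The following sketch is our own illustration and not part of the service; the `validate_policy` helper and its checks are made up, but the field names and the adjustment regex follow the policy schema described below.

```python
import json
import re

# Adjustment strings must match this format per the policy schema,
# e.g. "+1", "-2", "-50%".
ADJUSTMENT_RE = re.compile(r"^[-+][1-9]+[0-9]*[%]?$")

def validate_policy(policy):
    """Return a list of problems found in the policy (empty if it looks OK)."""
    problems = []
    if policy["instance_min_count"] < 1:
        problems.append("instance_min_count must be at least 1")
    if policy["instance_max_count"] < policy["instance_min_count"]:
        problems.append("instance_max_count must be >= instance_min_count")
    for rule in policy.get("scaling_rules", []):
        if rule["operator"] not in (">", "<", ">=", "<="):
            problems.append("bad operator: " + rule["operator"])
        if not ADJUSTMENT_RE.match(rule["adjustment"]):
            problems.append("bad adjustment: " + rule["adjustment"])
    return problems

policy = json.loads("""
{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {"metric_type": "cpu", "threshold": 5, "operator": ">=", "adjustment": "+1"},
    {"metric_type": "cpu", "threshold": 5, "operator": "<", "adjustment": "-1"}
  ]
}
""")
print(validate_policy(policy))  # → []
```

Running such a check locally catches schema mistakes before `cf bind-service` rejects the policy.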

App AutoScaler requires a policy file written in JSON with the following schemas:

Name | Type | Required | Description
instance_min_count | int | true | minimum number of instances
instance_max_count | int | true | maximum number of instances
scaling_rules | JSON-array<scaling_rules> | false | dynamic scaling rules, see Scaling Rules below
schedules | JSON-object<schedules> | false | scheduled scaling rules, see Schedules below
Scaling Rules

Name | Type | Required | Description
metric_type | string | true | the metric to monitor, see the list of supported metrics below
threshold | int | true | the boundary value; when the metric exceeds it, a breach is recorded
operator | string | true | >, <, >=, <=
adjustment | string | true | the adjustment of the instance count on each scaling event; supports the regex format ^[-+][1-9]+[0-9]*[%]?$, e.g. +5 means adding 5 instances, -50% means shrinking to half of the current size
breach_duration_secs | int, seconds | false | how long the metric must keep breaching before a scaling event fires
cool_down_secs | int, seconds | false | how long to wait before the next scaling event can fire
  • CPU - “cpu”, a short name of “cpu utilization”, is the cpu usage of your application in percentage;
  • Memoryused - “memoryused” represents the absolute value of the used memory of your application. The unit of “memoryused” metric is “MB”;
  • Memoryutil - “memoryutil”, a short name of “memory utilization”, is the used memory of the total memory allocated to the application in percentage. For example, if the memory usage of the application is 100MB and memory quota is 200MB, the value of “memoryutil” is 50%;
  • Responsetime - “responsetime” represents the average amount of time the application takes to respond to a request in a given time period. The unit of “responsetime” is “ms” (milliseconds);
  • Throughput - “throughput” is the total number of processed requests in a given time period. The unit of “throughput” is “rps” (requests per second);
  • Custom metric - You can define your own metric name and emit your own metric to App Autoscaler to trigger dynamic scaling. Only letters, digits, and “_” are allowed in a metric name, and the name is limited to a maximum of 100 characters.
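The custom metric naming rule above can be expressed as a simple check. This is an illustrative sketch, not an official validator; the `is_valid_metric_name` helper is our own, and we assume the name may start with any allowed character.

```python
import re

# Letters, digits, and "_" only, at most 100 characters,
# per the custom-metric naming rule.
METRIC_NAME_RE = re.compile(r"^[A-Za-z0-9_]{1,100}$")

def is_valid_metric_name(name):
    return METRIC_NAME_RE.match(name) is not None

print(is_valid_metric_name("queue_depth"))  # True
print(is_valid_metric_name("queue-depth"))  # False: "-" is not allowed
```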
Schedules

Name | Type | Required | Description
timezone | string | true | using the timezone definitions of Java
recurring_schedule | JSON-array<recurring_schedule> | false | schedules that take effect repeatedly, see Recurring Schedule below
specific_date | JSON-array<specific_date> | false | schedules that take effect only once, see Specific Date below
Recurring Schedule

Name | Type | Required | Description
start_date | string, “yyyy-mm-dd” | false | the start date of the schedule; must be in the future
end_date | string, “yyyy-mm-dd” | false | the end date of the schedule; must be in the future
start_time | string, “hh:mm” | true | the start time of the schedule
end_time | string, “hh:mm” | true | the end time of the schedule
days_of_week / days_of_month | array | false | recurring days of the week or month, e.g. [1,2,...,7] or [1,2,...,31]
instance_min_count | int | true | minimum number of instances for this schedule
instance_max_count | int | true | maximum number of instances for this schedule
initial_min_instance_count | int | false | the initial minimum number of instances for this schedule
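For illustration, a policy fragment using these schedule fields might look like the following. The timezone, times, day numbers, and instance counts are all made-up example values:

```json
{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "schedules": {
    "timezone": "Europe/Berlin",
    "recurring_schedule": [
      {
        "start_time": "08:00",
        "end_time": "18:00",
        "days_of_week": [1, 2, 3, 4, 5],
        "instance_min_count": 2,
        "instance_max_count": 6,
        "initial_min_instance_count": 3
      }
    ]
  }
}
```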

If one schedule overlaps another, the schedule that starts first takes precedence and the later one is ignored completely, for example:

Schedule #1: ———sssssssssss——————————————
Schedule #2: ——————————ssssssssss————————
Schedule #3: ————————————————sssssssss———

With the above definition, schedule #1 and #3 will be applied, while schedule #2 is ignored.
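The first-start-wins rule in the diagram can be sketched as follows. This mirrors the behavior described above with times simplified to integers; it is our own illustration, not the service's actual implementation.

```python
# First-start-wins overlap resolution, as in the diagram above.
# Schedules are (name, start, end) tuples; times are simplified to integers.
def resolve_overlaps(schedules):
    """Keep each schedule unless it overlaps an already-accepted one."""
    accepted = []
    for name, start, end in sorted(schedules, key=lambda s: s[1]):
        if all(start >= a_end or end <= a_start for _, a_start, a_end in accepted):
            accepted.append((name, start, end))
    return [name for name, _, _ in accepted]

schedules = [
    ("#1", 3, 14),   # applied
    ("#2", 10, 20),  # overlaps #1, which started first → ignored
    ("#3", 16, 25),  # starts after #1 ends → applied
]
print(resolve_overlaps(schedules))  # → ['#1', '#3']
```

Note that schedule #3 is accepted even though it overlaps #2, because #2 was already discarded.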