Use the App AutoScaler

In this topic you will learn how to use the App AutoScaler. Cloud Foundry offers simple elasticity for your applications: you can easily scale application instances up and down as you like. The App AutoScaler service lets you automate this scaling of application instances up and down based on a ruleset you define yourself.

The App AutoScaler service component is based on an open source project. You can check out its code in the public App AutoScaler repository.

The App AutoScaler provides the capability to adjust the computation resources for Cloud Foundry applications through:

  • Dynamic scaling based on application performance metrics
  • Scheduled scaling based on time

App AutoScaler ensures that your application has the right number of instances to maintain service quality while you pay only for the resources you really need. With auto-scaling, your application gains the following benefits:

  • Better service quality: Auto-scaling monitors your application performance and scales your application out or in horizontally to match the resource need. In this way, the service quality target defined in your policy can be maintained.
  • Better availability: Auto-scaling prepares enough application instances in advance when a predictable workload spike is expected in a given period of time, avoiding overload and application crashes during the spike.
  • Cost reduction: Auto-scaling allows you to pay only for the resources you really need. You don’t have to do capacity planning or deploy your application with a fixed number of instances. It saves you money by starting new instances when your application needs them and destroying them when it does not.

App AutoScaler is offered as a Cloud Foundry service via the Open Service Broker API, so you need to provision and bind the App AutoScaler service through the Cloud Foundry CLI or the Portal first.

cf create-service autoscaler autoscaler-free-plan <service_instance_name>
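
For example, with a hypothetical instance name my-app-autoscaler, provisioning and verifying the instance could look like this (the exact plan name depends on what your marketplace offers):

# list the service offerings and plans available in your marketplace
cf marketplace
# create the App AutoScaler service instance under a name of your choice
cf create-service autoscaler autoscaler-free-plan my-app-autoscaler
# check the provisioning status of the new instance
cf service my-app-autoscaler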

Bind the service instance to your application

You can attach a scaling policy together with the service binding by providing the policy file name as a parameter of the service binding command.

cf bind-service <app_name> <service_instance_name> -c <policy_file_name>
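
A concrete invocation, using the hypothetical names from above and a policy file called autoscaler-policy.json (see the sample policy below), might look like this:

# bind the instance and attach the scaling policy in one step
cf bind-service my-app my-app-autoscaler -c autoscaler-policy.json
# verify that the binding exists
cf services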

To disconnect App AutoScaler from your application, unbind the service instance. This will remove the autoscaling policy as well. Furthermore, you can deprovision the service instance if no application is bound.

cf unbind-service <app_name> <service_instance_name>
cf delete-service <service_instance_name>
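
With the hypothetical names from above, the full cleanup sequence could look like this; cf delete-service asks for confirmation, and -f skips that prompt:

cf unbind-service my-app my-app-autoscaler
cf delete-service my-app-autoscaler -f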

Sample scaling policy JSON file:

{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {
      "metric_type": "cpu",
      "threshold": 5,
      "operator": ">=",
      "adjustment": "+1"
    },
    {
      "metric_type": "cpu",
      "threshold": 5,
      "operator": "<",
      "adjustment": "-1"
    }
  ]
}
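
Each scaling rule can usually be tuned further. The variant below adds breach_duration_secs (how long the threshold must be breached before a scaling action fires) and cool_down_secs (how long to wait after a scaling action); these optional field names come from the open-source App AutoScaler policy schema and should be checked against the documentation of your specific offering:

{
  "metric_type": "cpu",
  "threshold": 80,
  "operator": ">=",
  "breach_duration_secs": 120,
  "cool_down_secs": 120,
  "adjustment": "+1"
}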

App AutoScaler requires a policy file written in JSON. The dynamic scaling rules support the following metric types:

  • cpu - short for “CPU utilization”; the CPU usage of your application as a percentage.
  • memoryused - the absolute amount of memory used by your application. The unit of the memoryused metric is “MB”.
  • memoryutil - short for “memory utilization”; the memory used as a percentage of the total memory allocated to the application. For example, if the application uses 100MB of a 200MB memory quota, the value of memoryutil is 50%.
  • responsetime - the average amount of time the application takes to respond to a request in a given time period. The unit of responsetime is “ms” (milliseconds).
  • throughput - the total number of processed requests in a given time period. The unit of throughput is “rps” (requests per second).
  • Custom metric - you can define your own metric name and emit your own metric values to App AutoScaler to trigger dynamic scaling (see the sketch after this list). Only letters, digits, and underscores (”_”) are allowed in a metric name, and the length of the name is limited to 100 characters.
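
As a sketch of how a custom metric can be referenced in a policy, the scaling rule below uses a hypothetical metric name queue_length; how your application actually emits the metric values (typically via credentials obtained from the service binding) depends on your platform’s documentation:

{
  "metric_type": "queue_length",
  "threshold": 100,
  "operator": ">",
  "adjustment": "+1"
}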

If one schedule overlaps another, the schedule that starts first is guaranteed, while the overlapping later schedule is ignored completely. For example:

Schedule #1: ———sssssssssss——————————————
Schedule #2: ——————————ssssssssss————————
Schedule #3: ————————————————sssssssss———

With the above definition, schedules #1 and #3 will be applied, while schedule #2 is ignored.
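
For scheduled scaling, a policy can contain a schedules section in addition to scaling_rules. The sketch below follows the policy schema of the open-source App AutoScaler project; treat the field names (timezone, recurring_schedule, days_of_week, initial_min_instance_count) as assumptions to verify against the schema of your specific offering:

{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {
      "metric_type": "cpu",
      "threshold": 80,
      "operator": ">=",
      "adjustment": "+1"
    }
  ],
  "schedules": {
    "timezone": "Europe/Berlin",
    "recurring_schedule": [
      {
        "start_time": "08:00",
        "end_time": "18:00",
        "days_of_week": [1, 2, 3, 4, 5],
        "instance_min_count": 2,
        "instance_max_count": 6,
        "initial_min_instance_count": 3
      }
    ]
  }
}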