Autoscaling Hosts

Autoscale your Host workloads in DuploCloud

DuploCloud supports various ways to scale Host workloads, depending on the underlying AWS services being used.

Autoscaling Groups (ASG)
ECS Autoscaling
Autoscaling in Kubernetes

Autoscaling Groups (ASG)

Create Autoscaling groups to scale EC2 instances to your workload

Configure Autoscaling Groups (ASG) so that the number of EC2 instances scales with the application load. Autoscaling detects unhealthy instances and launches new EC2 instances to replace them. ASGs are also cost-effective because EC2 instances are created dynamically, as the application requires, within the minimum and maximum count limits.

For cluster autoscaling, enable the Cluster Autoscaler option in your Infrastructure before creating an ASG.

Creating Autoscaling Groups (ASG)

In the DuploCloud Portal, navigate to Cloud Services -> Hosts.
In the ASG tab, click Add. The Add ASG page is displayed.
In the Friendly Name field, enter the name of the ASG.
Select Availability Zone and Instance Type.
In the Instance Count field, enter the desired capacity for the Autoscaling group.
In the Minimum Instances field, enter the minimum number of instances. The Autoscaling group ensures that the total number of instances is always greater than or equal to the minimum number of instances.
In the Maximum Instances field, enter the maximum number of instances. The Autoscaling group ensures that the total number of instances is always less than or equal to the maximum number of instances.
Optionally, select Use for Cluster Autoscaling.
Select Advanced Options. The Advanced Options section displays.
Fill in additional fields as needed for your ASG.
Click Add. Your ASG is added and displayed in the ASG tab.
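
The count fields in the steps above amount to one invariant: minimum <= desired <= maximum. A minimal Python sketch of that rule follows. The `AsgCapacity` class, its field names, and `clamp_desired` are hypothetical illustrations and are not part of any DuploCloud or AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsgCapacity:
    """Hypothetical container for the three count fields on the Add ASG page."""
    minimum: int   # Minimum Instances
    desired: int   # Instance Count (desired capacity)
    maximum: int   # Maximum Instances

    def validate(self) -> None:
        # Counts cannot be negative.
        if self.minimum < 0:
            raise ValueError("minimum must be >= 0")
        # The group keeps the instance count within [minimum, maximum].
        if not (self.minimum <= self.desired <= self.maximum):
            raise ValueError(
                "expected minimum <= desired <= maximum, got "
                f"{self.minimum}, {self.desired}, {self.maximum}"
            )


def clamp_desired(requested: int, cap: AsgCapacity) -> int:
    """Clamp a requested instance count into the group's allowed range."""
    return max(cap.minimum, min(requested, cap.maximum))


cap = AsgCapacity(minimum=1, desired=2, maximum=5)
cap.validate()
print(clamp_desired(8, cap))  # prints 5
print(clamp_desired(0, cap))  # prints 1
```

Any scaling request outside the range is pulled back to the nearest limit, which is why the minimum and maximum act as hard bounds on cost and capacity.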

Viewing Hosts in Autoscaling Groups

To view the hosts in an Autoscaling group, follow these steps:

In the DuploCloud Portal, navigate to Cloud Services -> Hosts.
Select the ASG tab.
In the NAME column, select the ASG for which you want to view Hosts.
Select the Hosts tab. A list of individual Hosts displays.

Creating an Amazon EC2 Autoscaling Policy

Refer to AWS Documentation for detailed steps on creating Scaling policies for the Autoscaling Group.
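
For readers who script against AWS directly, the sketch below builds the arguments for a target tracking policy on an EC2 Auto Scaling group. The parameter names follow the AWS `PutScalingPolicy` API as exposed by boto3; the group name and policy name are hypothetical placeholders, and the AWS name of a group created by DuploCloud may differ from its Friendly Name, so look it up before use.

```python
def target_tracking_policy_kwargs(asg_name, policy_name, target_cpu_percent,
                                  disable_scale_in=False):
    """Build keyword arguments for the AWS PutScalingPolicy call.

    Tracks average CPU utilization across the group's instances.
    """
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
            "DisableScaleIn": disable_scale_in,
        },
    }


# Hypothetical names, for illustration only.
kwargs = target_tracking_policy_kwargs("example-asg", "cpu-target-50", 50)
print(kwargs["TargetTrackingConfiguration"]["TargetValue"])  # prints 50.0
```

With boto3 installed and AWS credentials configured, these arguments can be passed as `boto3.client("autoscaling").put_scaling_policy(**kwargs)`.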

Launch Templates

Managing Launch Template Versions for Autoscaling Groups (ASG) in DuploCloud

Launch templates define the configuration for instances in an Auto Scaling Group (ASG). They specify key settings such as the instance type, AMI, and other parameters that determine how new instances are launched. DuploCloud allows you to create multiple launch template versions, each with its own settings (for example, instance type and AMI), and to switch between versions as your requirements evolve. One version can be set as the default. Updates to the launch template apply to newly launched instances, and you can apply them to existing instances by using the Instance Refresh feature.

This feature is applicable to both Kubernetes Node ASGs and Docker Native ASGs.

Editing launch templates

Select the appropriate Tenant from the Tenant list box.
For Kubernetes-managed ASGs (Nodes), navigate to Kubernetes -> Nodes. For Docker Native ASGs (EC2 instances running Docker directly), navigate to Cloud Services -> Hosts.
Select the ASG tab.
In the NAME column, click on the ASG you wish to edit launch templates for.
Select the Launch Templates tab.
In the row of the version you wish to update, click the menu icon and select Edit (Create a new version). The Edit Launch Template (Create a new version) pane displays.

Configure the following launch template settings:
- Template Version Description: Provide a description for the new version.
- Instance Type: Select the type of EC2 instance to use for this version (for example, t3.medium or m5.large).
- Image ID: Specify the Amazon Machine Image (AMI) ID for the instances in this version. This defines the base image for launching new instances.
- Set as Default: Optionally, set the newly created version as the default launch template for the ASG. The default version automatically applies to all newly launched instances in the ASG.
Click Submit. The updated launch template version is created.

Changing the default launch template version

In DuploCloud, you can manage multiple versions of a launch template for your Auto Scaling Group (ASG). You may want to change the default version to ensure that new instances are launched with the desired configuration.

To change the default launch template version:

Select the Tenant from the Tenant list box.
For Kubernetes-managed ASGs (Nodes), navigate to Kubernetes -> Nodes. For Docker Native ASGs (EC2 instances running Docker directly), navigate to Cloud Services -> Hosts.
Select the ASG tab and click the name of the appropriate ASG.
Click on the Launch Templates tab.
Select Set as Default.

The selected version will now be the default for any new instances launched in the ASG. Existing instances will remain unchanged. To update existing instances, use the Instance Refresh feature.
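
The two portal operations above (creating a new version, then setting a default) have rough counterparts in the AWS EC2 API. The sketch below only builds the arguments for `CreateLaunchTemplateVersion` and `ModifyLaunchTemplate` as boto3 names them. It is not a description of how DuploCloud implements the feature, and the template ID, AMI ID, and version numbers are hypothetical placeholders.

```python
def new_version_kwargs(template_id, source_version, description,
                       instance_type=None, image_id=None):
    """Arguments for EC2 CreateLaunchTemplateVersion.

    Settings not overridden here are inherited from source_version.
    """
    data = {}
    if instance_type is not None:
        data["InstanceType"] = instance_type
    if image_id is not None:
        data["ImageId"] = image_id
    return {
        "LaunchTemplateId": template_id,
        "SourceVersion": str(source_version),  # the API expects a string
        "VersionDescription": description,
        "LaunchTemplateData": data,
    }


def set_default_kwargs(template_id, version):
    """Arguments for EC2 ModifyLaunchTemplate to change the default version."""
    return {"LaunchTemplateId": template_id, "DefaultVersion": str(version)}


# Hypothetical identifiers, for illustration only.
create = new_version_kwargs("lt-0123456789abcdef0", 1, "larger instances",
                            instance_type="m5.large")
default = set_default_kwargs("lt-0123456789abcdef0", 2)
print(create["LaunchTemplateData"])  # prints {'InstanceType': 'm5.large'}
print(default["DefaultVersion"])     # prints 2
```

With boto3 and credentials available, these would be passed to `boto3.client("ec2").create_launch_template_version(**create)` and `boto3.client("ec2").modify_launch_template(**default)`. As in the portal, changing the default affects only instances launched afterwards.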

Instance Refresh for ASG

Initiate an Instance Refresh for an Auto Scaling Group (ASG) within the DuploCloud Portal.

Instance refresh allows you to apply configuration changes to existing instances in your Auto Scaling Group (ASG). While updates to the ASG automatically apply to newly launched instances, an instance refresh is required to apply these changes to instances that are already running. This ensures that all instances in your ASG are consistent with the latest settings.

In general, this feature works for:

ASGs with EC2 Instances: It applies to Auto Scaling Groups that are managing EC2 instances, which can be part of a Kubernetes cluster.
ASGs using Launch Templates or Launch Configurations: The ASG must be configured to use a launch template or a launch configuration that defines how new instances are created.

Initiating an Instance Refresh for an Auto Scaling Group (ASG)

Select the appropriate Tenant from the Tenant list box.
For Kubernetes-managed ASGs (Nodes), navigate to Kubernetes -> Nodes. For Docker Native ASGs (EC2 instances running Docker directly), navigate to Cloud Services -> Hosts.
Select the ASG tab.
Select the name of the ASG you want to refresh from the NAME column.
Select the Launch Templates tab.
Click on the Actions menu, and select Start Instance Refresh. The Start Instance Refresh pane displays.

Choose the Instance Replacement Method:
- Launch before Termination: New instances are launched before the old ones are terminated, ensuring capacity is maintained throughout the refresh process and minimizing downtime.
- Terminate and Launch: Old instances are terminated first, and new ones are launched afterward. This method may temporarily reduce capacity until the new instances are fully launched and healthy.
- Custom Behavior: Define a custom instance replacement strategy to meet specific timing or instance replacement policies based on your needs.
Set the Min and Max Healthy Percentage:
- Min Healthy Percentage: Specifies the minimum percentage of instances that must remain healthy during the refresh to avoid capacity issues.
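- Max Healthy Percentage: Specifies the maximum percentage of the group's desired capacity that can be in service during the refresh. Values above 100 let the ASG launch replacement instances before terminating old ones, and the gap between the minimum and maximum controls how many instances are replaced at once.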
Define the Instance Warmup time, which is the duration to wait before considering a newly launched instance as healthy. This ensures the instance has time to fully initialize.
Optionally, select Update and Launch Template to apply any new configurations to the instances being replaced.
- Version: If updating the launch template, choose the desired version of the launch template to apply to the instances being replaced.
Click Start to initiate the refresh process. The EC2 instances within the ASG will begin updating according to the selected replacement method.

Note: The instance refresh process can take some time to complete depending on the number of instances and the selected update method. Please allow adequate time for the instances to be updated and replaced.
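
The settings above correspond to the preferences of the AWS `StartInstanceRefresh` API. The sketch below builds and validates those arguments using the boto3 parameter names. The percentage ranges reflect the AWS limits as understood at the time of writing, and the mapping of the portal's replacement methods onto specific percentages is an assumption for illustration, not confirmed DuploCloud behavior. The group name is a hypothetical placeholder.

```python
def instance_refresh_kwargs(asg_name, min_healthy=90, max_healthy=None,
                            warmup_seconds=300):
    """Build keyword arguments for the AWS StartInstanceRefresh call."""
    if not 0 <= min_healthy <= 100:
        raise ValueError("min_healthy must be between 0 and 100")
    if warmup_seconds < 0:
        raise ValueError("warmup_seconds must be >= 0")
    prefs = {
        "MinHealthyPercentage": min_healthy,
        "InstanceWarmup": warmup_seconds,
    }
    if max_healthy is not None:
        if not 100 <= max_healthy <= 200:
            raise ValueError("max_healthy must be between 100 and 200")
        if max_healthy - min_healthy > 100:
            raise ValueError("max_healthy - min_healthy must not exceed 100")
        prefs["MaxHealthyPercentage"] = max_healthy
    return {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        "Preferences": prefs,
    }


# Assumed mapping: "Launch before Termination" keeps 100% healthy and
# allows temporary extra capacity; "Terminate and Launch" dips below 100%.
launch_first = instance_refresh_kwargs("example-asg", 100, 110)
terminate_first = instance_refresh_kwargs("example-asg", 90, 100)
print(launch_first["Preferences"]["MaxHealthyPercentage"])     # prints 110
print(terminate_first["Preferences"]["MinHealthyPercentage"])  # prints 90
```

With boto3 and credentials available, either dictionary can be passed as `boto3.client("autoscaling").start_instance_refresh(**launch_first)`.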

Scale to or from Zero

Scale to or from zero when creating Autoscaling Groups in DuploCloud

DuploCloud allows you to scale to or from zero in Amazon EKS clusters by enabling the Scale from Zero option within the Advanced Options when creating an Autoscaling Group. This feature intelligently adjusts the number of instances in your cluster, dynamically scaling up when demand increases and down to zero when resources are not in use. Reducing resource allocation during idle periods leads to significant cost savings.

When to Use Scale to Zero

Autoscaling to zero is ideal for Kubernetes workloads that don’t always require 100% availability, such as:

Non-Critical Workloads: Batch processing jobs, data analysis tasks, and other non-customer-facing services that can be scaled down to zero during off-peak hours (e.g., nights or weekends).

Dev/Test Environments: Development and testing environments that can be scaled up when developers need them and scaled down when not in use.

Background Jobs: Workloads with background jobs running in Kubernetes that are only needed intermittently, such as those triggered by specific events or scheduled at certain times.

When Autoscaling to Zero Should Not Be Used

Autoscaling to zero is not suitable for all workloads. Avoid using this feature for:

Customer-Facing Applications: Frontend web applications that must always be available should not use autoscaling to zero, as it can cause downtime and negatively impact user experience.

Workloads Outside Kubernetes: If background jobs or other processes are not running in Kubernetes, autoscaling to zero will not apply. Different scaling strategies are required for these environments.

Advantages of Scaling to Zero

Scaling to or from zero with AWS Autoscaling Groups (ASG) offers several advantages depending on the context and requirements of your application:

Cost Savings: By scaling down to zero instances during periods of low demand, you minimize costs associated with running and maintaining instances. This pay-as-you-go model ensures you only pay for resources when they are actively being used.
Resource Efficiency: Scaling to zero ensures that resources are not wasted during periods of low demand. By terminating instances when they are not needed, you optimize resource utilization and prevent over-provisioning, leading to improved efficiency and reduced infrastructure costs.
Flexibility: Scaling to zero provides the flexibility to dynamically adjust your infrastructure in response to changes in workload. It allows you to efficiently allocate resources based on demand, ensuring that your application can scale up or down seamlessly to meet varying levels of traffic.
Simplified Management: With automatic scaling to zero, you can streamline management tasks associated with provisioning and de-provisioning instances. The ASG handles scaling operations automatically, reducing the need for manual intervention and simplifying infrastructure management.

Advantages of Scaling from Zero

Rapid Response to Increased Demand: Scaling from zero allows your infrastructure to quickly respond to spikes in traffic or sudden increases in workload. By automatically launching instances as needed, you ensure that your application can handle surges in demand without experiencing performance degradation or downtime.
Improved Availability: Scaling from zero helps maintain optimal availability and performance for your application by ensuring that sufficient resources are available to handle incoming requests. This proactive approach to scaling helps prevent resource constraints and ensures a consistent user experience even during peak usage periods.
Enhanced Scalability: Scaling from zero enables your infrastructure to scale out horizontally, adding additional instances as demand grows. This horizontal scalability allows you to seamlessly handle increases in workload and accommodate a growing user base without experiencing bottlenecks or performance issues.
Elasticity: Scaling from zero provides elasticity to your infrastructure, allowing it to expand and contract based on demand. This elasticity ensures that you can efficiently allocate resources to match changing workload patterns, resulting in optimal resource utilization and cost efficiency.

Spot Instances for AWS

Create Autoscaling Groups (ASG) with Spot Instances in the DuploCloud platform

Spot Instances are spare capacity priced at a significant discount compared to On-Demand Instances. Users specify the maximum price (bid) they will pay per hour for a Spot Instance. The instance is launched if the current Spot price is below the user's bid. Since Spot Instances can be interrupted when spare capacity is unavailable, applications using Spot Instances must be fault-tolerant and able to handle interruptions.

Spot Instances are supported only for Autoscaling Groups (ASG) with EKS.

Enabling Spot Instances when Creating Autoscaling Groups

Follow the steps in the section Creating Autoscaling Groups (ASG). Before clicking Add, select the box to access Advanced Options. Enable Use Spot Instances and enter your bid, in dollars, in the Maximum Spot Price field.

Creating Services using Spot Instances

Follow the steps in Creating Services using Autoscaling Groups. On the Add Service page, in Basic Options, select Tolerate spot instances.

Tolerations are entered by default in the Other Container Config field, under Advanced Options on the Add Service page.

ECS Autoscaling

ECS Autoscaling scales the desired count of tasks for the ECS Service configured in your infrastructure. The average CPU or memory utilization of your tasks is used to increase or decrease the desired count.

Step 1: Add Auto-Scaling for Targets

Navigate to Cloud Services -> ECS. Select the ECS Task Definition for which you want to enable Autoscaling, then select Add Scaling Target.

Set the MinCapacity (minimum value 2) and MaxCapacity to complete the configuration.

Step 2: Add Scaling Policy

Once Autoscaling for Targets is configured, add a Scaling Policy.

Provide the following details:

Policy Name - The name of the scaling policy.
Policy Dimension - The metric type tracked by the target tracking scaling policy. Select a value from the dropdown.
Target Value - The target value for the metric.
Scalein Cooldown - The amount of time, in seconds, after a scale-in activity completes before another scale-in activity can start.
ScaleOut Cooldown - The amount of time, in seconds, after a scale-out activity completes before another scale-out activity can start.
Disable ScaleIn - When selected, this target tracking scaling policy is never used to scale in the Autoscaling group.

This step creates the target tracking scaling policy and attaches it to the Autoscaling group.
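
Steps 1 and 2 correspond to two AWS Application Auto Scaling calls, `RegisterScalableTarget` and `PutScalingPolicy`. The sketch below builds the arguments for both using the boto3 parameter names. The cluster, service, and policy names are hypothetical placeholders, and the minimum of 2 for MinCapacity is taken from this page rather than from AWS.

```python
def _resource_id(cluster, service):
    # Application Auto Scaling identifies an ECS service by this path.
    return f"service/{cluster}/{service}"


def scalable_target_kwargs(cluster, service, min_capacity, max_capacity):
    """Arguments for RegisterScalableTarget (Step 1)."""
    if min_capacity < 2:
        raise ValueError("this page requires MinCapacity >= 2")
    if max_capacity < min_capacity:
        raise ValueError("MaxCapacity must be >= MinCapacity")
    return {
        "ServiceNamespace": "ecs",
        "ResourceId": _resource_id(cluster, service),
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def scaling_policy_kwargs(cluster, service, policy_name, metric_type,
                          target_value, scale_in_cooldown=300,
                          scale_out_cooldown=300, disable_scale_in=False):
    """Arguments for PutScalingPolicy (Step 2), target tracking only."""
    allowed = {"ECSServiceAverageCPUUtilization",
               "ECSServiceAverageMemoryUtilization"}
    if metric_type not in allowed:
        raise ValueError(f"metric_type must be one of {sorted(allowed)}")
    return {
        "PolicyName": policy_name,
        "ServiceNamespace": "ecs",
        "ResourceId": _resource_id(cluster, service),
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_value),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": metric_type,
            },
            "ScaleInCooldown": scale_in_cooldown,
            "ScaleOutCooldown": scale_out_cooldown,
            "DisableScaleIn": disable_scale_in,
        },
    }


# Hypothetical names, for illustration only.
target = scalable_target_kwargs("example-cluster", "example-service", 2, 6)
policy = scaling_policy_kwargs("example-cluster", "example-service",
                               "cpu-70", "ECSServiceAverageCPUUtilization", 70)
print(target["ResourceId"])  # prints service/example-cluster/example-service
```

With boto3 and credentials available, both dictionaries can be passed to a `boto3.client("application-autoscaling")` client via `register_scalable_target(**target)` and `put_scaling_policy(**policy)`.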

Step 3: View Scaling Target and Policy

View the Scaling Target and Policy details from the DuploCloud Portal. Update and delete operations are also supported from this view.

Autoscaling in Kubernetes

Autoscale your DuploCloud Kubernetes deployment

Prerequisites

Before autoscaling can be configured for your Kubernetes service, make sure that:

An Autoscaling Group (ASG) is set up in the DuploCloud Tenant
Cluster Autoscaler is enabled for your DuploCloud Infrastructure

Kubernetes Horizontal Pod Autoscaler (HPA)

Horizontal Pod Autoscaler (HPA) automatically scales the Deployment and its ReplicaSet. HPA checks the metrics configured in regular intervals and then scales the replicas up or down accordingly.
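
The scaling decision HPA makes at each interval follows the formula documented by Kubernetes: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The Python sketch below is a simplified model of that calculation; it ignores details such as pod readiness, missing metrics, and stabilization windows, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified model of the HPA replica calculation."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always kept within the configured replica bounds.
    return max(min_replicas, min(desired, max_replicas))


# 4 replicas at 90% average CPU against a 60% target scale out to 6.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))  # prints 6
# 63% is within the 10% tolerance of 60%, so nothing changes.
print(desired_replicas(4, 63, 60, min_replicas=2, max_replicas=10))  # prints 4
# Low load scales in, but never below the minimum.
print(desired_replicas(4, 20, 60, min_replicas=2, max_replicas=10))  # prints 2
```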

Configuring Services with HPA

You can configure HPA while creating a Deployment Service from the DuploCloud Portal.

In the DuploCloud Portal, navigate to Kubernetes -> Services. The Services page displays.
Create a new Service by clicking Add.
In Add Service - Basic Options, from the Replication Strategy list box, select Horizontal Pod Scheduler.
In the Horizontal Pod Autoscaler Config field, add your HPA configuration. Set the minimum and maximum replica counts in the resource attributes based on your requirements.
Click Next to navigate to Advanced Options.
In Advanced Options, in the Other Container Config field, ensure your resource attributes, such as Limits and Requests, are set to work with your HPA configuration.
At the bottom of the Advanced Options page, click Create.
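
As an illustration of the two fields used in these steps, the sketch below builds an HPA configuration and a matching resources block as Python dictionaries and prints them as JSON. The field names follow the Kubernetes autoscaling/v2 HorizontalPodAutoscaler spec and the container resources spec. The exact shape and format that the DuploCloud fields accept is an assumption here, and all values are placeholders.

```python
import json

# Assumed content for the Horizontal Pod Autoscaler Config field:
# keep average CPU utilization near 80% with 2 to 5 replicas.
hpa_config = {
    "minReplicas": 2,
    "maxReplicas": 5,
    "metrics": [
        {
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {"type": "Utilization", "averageUtilization": 80},
            },
        }
    ],
}

# Assumed content for the Other Container Config field. Utilization
# targets are measured against requests, so a CPU request must be set
# for the CPU metric above to be computable.
other_container_config = {
    "resources": {
        "requests": {"cpu": "250m", "memory": "256Mi"},
        "limits": {"cpu": "500m", "memory": "512Mi"},
    }
}

print(json.dumps(hpa_config, indent=2))
print(json.dumps(other_container_config, indent=2))
```

Every resource named in the HPA metrics should have a corresponding entry under requests; otherwise utilization cannot be calculated for that resource.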

For HPA-configured Services, Replicas is set to Auto in the DuploCloud Portal.

When your services are running, Replicas: Auto is displayed on the Service page.

Stopping a running Kubernetes service that is using an HPA

If a Kubernetes Service is running with a Horizontal Pod Autoscaler (HPA), you cannot stop the Service by clicking Stop in the Service's Actions menu in the DuploCloud Portal.

Instead, do the following to stop the service from running:

In the DuploCloud Portal, navigate to Kubernetes -> Containers and select the Service you want to stop.
From the Actions menu, select Edit.
From the Replication Strategy list box, select Static Count.
In the Replicas field, enter 0 (zero).
Click Next to navigate to the Advanced Options page.
Click Update to update the service.

Allowing real-time alerts for autoscaling Kubernetes nodes

When the Cluster Autoscaler flag is set and a Tenant has one or more ASGs, an unschedulable-pod alert is delayed by five (5) minutes to allow for autoscaling. You can configure the Infrastructure settings to bypass the delay and send the alerts in real time.

In the DuploCloud Portal, navigate to Administrator -> Infrastructure.
Click on the Infrastructure you want to configure settings for in the Name list.
Select the Settings tab.
Click the Add button. The Infra - Set Custom Data pane displays.
In the Setting Name list box, select Enables faults prior to autoscaling Kubernetes nodes.
Set the Enable toggle switch to enable the setting.
Click Set. DuploCloud now generates faults for unschedulable Kubernetes pods immediately, before autoscaling has a chance to add nodes.