Create Autoscaling groups to scale EC2 instances to your workload
Configure Autoscaling Groups (ASG) to ensure the application load is scaled based on the number of EC2 instances configured. Autoscaling detects unhealthy instances and launches new EC2 instances. ASG is also cost-effective as EC2 Instances are dynamically created per the application requirement within minimum and maximum count limits.
The Use for Cluster Autoscaling option will not be available until you enable the Cluster Autoscaler option in your Infrastructure.
In the DuploCloud Portal, navigate to Cloud Services -> Hosts.
In the ASG tab, click Add. The Add ASG page is displayed.
In the Friendly Name field, enter the name of the ASG.
Select Availability Zone and Instance Type.
In the Instance Count field, enter the desired capacity for the Autoscaling group.
In the Minimum Instances field, enter the minimum number of instances. The Autoscaling group ensures that the total number of instances is always greater than or equal to the minimum number of instances.
In the Maximum Instances field, enter the maximum number of instances. The Autoscaling group ensures that the total number of instances is always less than or equal to the maximum number of instances.
Select Use for Cluster Autoscaling.
Select Advanced Options.
Select the appropriate Image ID.
From the Agent Platform list box, select Linux Docker/Native to run a Docker service or select EKS Linux to run services using EKS. Fill in additional fields as needed for your ASG.
Optionally, enable Spot Instances.
Optionally, for EKS only, enable Scale from zero.
Click Add. Your ASG is added and displayed in the ASG tab.
View the Hosts created as part of ASG creation from the ASG Hosts tab.
Refer to AWS Documentation for detailed steps on creating Scaling policies for the Autoscaling Group.
The DuploCloud Portal provides the ability to configure Services based on the platforms EKS Linux and Linux Docker/Native. Select the ASG based on the platform used when creating services and Autoscaling groups. Optionally, if you previously enabled Spot Instances in the ASG, you can configure the Service to use Spot Instances by selecting Tolerate spot instances.
Create Autoscaling Groups (ASG) with Spot Instances in the DuploCloud platform
Spot Instances are spare capacity priced at a significant discount compared to On-Demand Instances. Users specify the maximum price (bid) they will pay per hour for a Spot Instance. The instance is launched if the current Spot price is below the user's bid. Since Spot Instances can be interrupted when spare capacity is unavailable, applications using Spot Instances must be fault-tolerant and able to handle interruptions.
Spot Instances are only supported for Auto-scaling Groups (ASG) with EKS
Follow the steps in the section Creating Autoscaling Groups (ASG). Before clicking Add, Click the box to access Advanced Options. Enable Use Spot Instances and enter your bid, in dollars, in the Maximum Spot Price field.
Follow the steps in Creating Services using Autoscaling Groups. In the Add Service page, Basic Options, Select Tolerate spot instances.
Tolerations will be entered by default in the Add Service page, Advanced Options, Other Container Config field.
Scale to or from zero when creating Autoscaling Groups in DuploCloud
DuploCloud allows you to scale to or from zero in Amazon EKS clusters by enabling the Scale from Zero option within the Advanced Options when creating an Autoscaling Group. This feature intelligently adjusts the number of instances in your cluster, dynamically scaling up when demand increases and down to zero when resources are not in use. Reducing resource allocation during idle periods leads to significant cost savings.
Autoscaling to zero is ideal for Kubernetes workloads that don’t always require 100% availability such as:
Non-Critical Workloads: Batch processing jobs, data analysis tasks, and other non-customer-facing services that can be scaled down to zero during off-peak hours (e.g., nights or weekends).
Dev/Test Environments: Development and testing environments that can be scaled up when developers need them and scaled down when not in use.
Background Jobs: Workloads with background jobs running in Kubernetes that are only needed intermittently, such as those triggered by specific events or scheduled at certain times.
Autoscaling to zero is not suitable for all workloads. Avoid using this feature for:
Customer-Facing Applications: Frontend web applications that must always be available should not use autoscaling to zero, as it can cause downtime and negatively impact user experience.
Workloads Outside Kubernetes: If background jobs or other processes are not running in Kubernetes, autoscaling to zero will not apply. Different scaling strategies are required for these environments.
Scaling to or from zero with AWS Autoscaling Groups (ASG) offers several advantages depending on the context and requirements of your application:
Cost Savings: By scaling down to zero instances during periods of low demand, you minimize costs associated with running and maintaining instances. This pay-as-you-go model ensures you only pay for resources when they are actively being used.
Resource Efficiency: Scaling to zero ensures that resources are not wasted during periods of low demand. By terminating instances when they are not needed, you optimize resource utilization and prevent over-provisioning, leading to improved efficiency and reduced infrastructure costs.
Flexibility: Scaling to zero provides the flexibility to dynamically adjust your infrastructure in response to changes in workload. It allows you to efficiently allocate resources based on demand, ensuring that your application can scale up or down seamlessly to meet varying levels of traffic.
Simplified Management: With automatic scaling to zero, you can streamline management tasks associated with provisioning and de-provisioning instances. The ASG handles scaling operations automatically, reducing the need for manual intervention and simplifying infrastructure management.
Rapid Response to Increased Demand: Scaling from zero allows your infrastructure to quickly respond to spikes in traffic or sudden increases in workload. By automatically launching instances as needed, you ensure that your application can handle surges in demand without experiencing performance degradation or downtime.
Improved Availability: Scaling from zero helps maintain optimal availability and performance for your application by ensuring that sufficient resources are available to handle incoming requests. This proactive approach to scaling helps prevent resource constraints and ensures a consistent user experience even during peak usage periods.
Enhanced Scalability: Scaling from zero enables your infrastructure to scale out horizontally, adding additional instances as demand grows. This horizontal scalability allows you to seamlessly handle increases in workload and accommodate a growing user base without experiencing bottlenecks or performance issues.
Elasticity: Scaling from zero provides elasticity to your infrastructure, allowing it to expand and contract based on demand. This elasticity ensures that you can efficiently allocate resources to match changing workload patterns, resulting in optimal resource utilization and cost efficiency.