Cloud Cost Optimization

Using a cloud services have many advantages and one of them is scale or grow infinitely. Having ability to create resources like server, database, etc. quickly sometime increase the operational cost of the organizations if not used properly. When people don’t follow the certain guidelines, they start paying high monthly bills for cloud services.

Here are some of the tips on how to optimize the cloud spending


AWS or any cloud service provider gives you option to scale on demand which means you can add more resources like compute, memory, storage etc. to your infrastructure as and when needed. There is no need to do capacity planning in advance for 6 months or 1 year of future need and provision the resources. You will end up paying for the resources which you need after 6 months or 1 years in future.

Design your application to be stateless which can scale horizontally. Using Autoscaling feature of EC2 services you can scale on demand. Using containers and container orchestration services like ECS or EKS, Kubernetes makes it simple to scale on demand.

Here is a reference architecture diagram of auto scalable containerized application.

You should monitor your resource metrics like CPU usage, Memory usage, Storage, Network, etc. and then scale the resources vertically as needed. For example, lets you start a database server on RDS with a small instance type and after some time your CPU usage is consistently high, you can upgrade the Database RDS to next level. You can use cloud monitoring tools like CloudWatch, Prometheus and Grafana, Datadog, etc to monitor your infrastructure resources.



As the cloud provides you an option to choose any infrastructure resource you want, often people tend to overprovision resources to overcome performance glitches in the code.

Performance and Scalability are often mixed and misunderstood terms. I define performance as response time of the system for a single transaction / user. We can take this as benchmark and do load testing and gradually increase number of transactions / users and find out how much load a single server can take. Beyond this you can scale horizontally.

Doing a performance testing of each transaction without load to find the benchmark is important and then to handle more users/transactions scale horizontally (or scale-out). Don’t increase the resources to hide the performance bottlenecks of the code. Bottomline is


AWS provides web services to manage the infrastructure. You can do all kinds of automations to use the infrastructure resources in the cloud.

Using AWS SDK, you can automatically stop and start the resource based on a schedule or on-demand specially in non-production environments. Keeping your development, testing environments running for 24×7 is not required. You can provide your developers and testers a mechanism to stop and start the environments on demand whenever they need and save the cost of AWS.

AWS tools like Trusted Advisor, CloudWatch can give you the data about utilization of the AWS resources.

Bottomline is


AWS provides discount on the bills if you commit a certain amount of usage of compute. The Saving plan is very flexible than using reserved instance. The reservation can be for 3 years all upfront payment to 1 year no-upfront payment. You can choose the plan based on your need and save around 20% or more on your compute resources like EC2, lambda, and Fargate.

You can use recommendation tools available in the AWS Cost Explorer for selecting the right saving plan.

You can also reserve RDS instances if you are certain about usage of your RDS instance in future. Bottomline is


AWS provides savings around 90% if you use spot instance of ec2 instead of on-demand. The only drawback is that spot instances can be stopped if your price is less that latest price. You can automate your infrastructure to provision new resources whenever the spot instances are taken away.

It is highly recommended to use Spot instances in the non-production environments in your ECS or EKS cluster or autoscaling group. You can mix spot instances with on-demand in production.


Always check for latest types in AWS instances and use them. They are not only better in performance but comparatively cheaper as well. You can upgrade the existing resources to latest one by stopping it changing the type.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.