IT Update: AWS Cost Savings 101
AWS recently announced that it has dropped its prices by 51% since launch! That’s massive, yet the majority of IT leaders I speak with face growing costs each month. Why? AWS may look cheap on the surface, but as it adds more and more high-value services (Kinesis, CloudFront, Lambda, continuous delivery, RDS and so on), you consume more and more of them, and your monthly spend grows.
I’m going to use a pub analogy. Imagine you’re en route to the grocery store to buy ingredients for a healthy home-made meal. On the way, however, your friend Steve calls: he’s seen that the local pub is having a two-for-one steak night. You quickly agree to join him for dinner. It’s a no-brainer, after all: no trip to the store, no cooking, no cleaning, and you don’t need to work out how many ingredients to buy. And if that wasn’t enough, you’ll only spend $10! You couldn’t even buy the ingredients for that!
When you get there, however, something unplanned happens. You notice the entrée looks good; it’s only an extra $5, so why not? You order a glass of red with your steak, and after dinner you have some dessert and a few beers. The next morning you wake up with a large bill and a sore head, and to make matters worse you have to explain to your boss (wife) how you spent $100.
This is how AWS has modelled its offering: it cuts the price of its entry points (EC2 etc.) but keeps adding value-adding services, increasing the average spend per client.
It is possible to have a cheap night at the pub, but it takes discipline, and AWS is no different. Over the last 18 months at Digital Turbine we’ve spent a lot of time fine-tuning our AWS environments to be more cost effective. Below I’ve mapped out our five key strategies and how we’ve avoided the AWS hangover.
Before you begin
- Right-size your AWS instances. Don’t use an XL if an M will suffice.
- Standardise the generation of EC2 instances you are using. This will give you more flexibility down the line.
1. Reserved Instances
A reserved instance (RI) is a commitment to pay AWS for a machine for a one- or three-year term. You can save up to 65% on the on-demand cost (in our case 40%). Think of the RI as a coupon: each hour AWS looks at the machines you are running, then looks at your RIs, and applies any that match to your bill.
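The coupon idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how AWS implements billing: it matches on instance type only, whereas real matching also depends on attributes such as platform and tenancy. The function name and the sample fleet are hypothetical.

```python
from collections import Counter


def apply_reservations(running, reserved):
    """Match one hour of running instances against the RIs held.

    running:  list of instance type names running this hour
    reserved: dict of instance type name -> number of RIs held
    Returns two Counters: instances billed against an RI, and
    instances billed at the on-demand rate.
    """
    covered, on_demand = Counter(), Counter()
    for instance_type, count in Counter(running).items():
        matched = min(count, reserved.get(instance_type, 0))
        if matched:
            covered[instance_type] = matched
        if count > matched:
            on_demand[instance_type] = count - matched
    return covered, on_demand


# Hypothetical hour: three m4.large and one c4.xlarge running, two m4.large RIs held.
covered, on_demand = apply_reservations(
    ["m4.large", "m4.large", "m4.large", "c4.xlarge"],
    {"m4.large": 2},
)
print(dict(covered))    # {'m4.large': 2}
print(dict(on_demand))  # {'m4.large': 1, 'c4.xlarge': 1}
```

Anything left in the on-demand bucket hour after hour is a candidate for the next RI purchase.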
We use Cloudability to help plan our reserved instances and have also written up a reserved instance purchasing strategy (based on Cloudability’s) outlined below.
Our goal over the course of the year is to get to 70% RI utilisation; that is, we want 70% of our EC2 instances to be covered by reservations.
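As a rough sketch of how that target turns into a number to buy, the arithmetic looks like this. The function names and the sample fleet are made up for illustration, and it treats all instances as interchangeable, which a real purchase plan would not.

```python
def covered_percent(running, reserved):
    """Percentage of running instances covered by a reservation."""
    if running == 0:
        return 0.0
    return 100 * min(reserved, running) / running


def reservations_to_buy(running, reserved, target_percent=70):
    """RIs still needed to reach the target, rounded up to whole instances."""
    needed = -(-target_percent * running // 100)  # integer ceiling division
    return max(0, needed - reserved)


# Hypothetical fleet: 20 instances running, 8 already reserved.
print(covered_percent(20, 8))      # 40.0
print(reservations_to_buy(20, 8))  # 6
```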
You need someone to own and drive the RI purchasing strategy, essentially someone to hold accountable at the end of each month. We’ve called them the RI Czar (thanks again Cloudability).
Purchase a conservative number of RIs each month. Many companies purchase their RIs annually. Their thinking is: “if our AWS infrastructure remained the same for a year straight, we could cover the whole year with one bulk Reserved Instance purchase and not worry about it for a whole year”. On the surface this may seem like a great idea, but it has several huge drawbacks:
- It assumes your AWS infrastructure will remain the same for a year straight.
- You risk missing out on new generations of CPUs and being tied to older, less performant generations.
- If you commit to one specific set of reservations until your yearly purchase rolls around, you won’t be able to address new needs as they arise.
- When your reserved instance anniversary rolls around the following year you will have to purchase them all again or face the Reserved Instance Cliff.
- It’s difficult to manage from a cash flow perspective.
Introducing the iterative approach
Rather than back yourself up against a Reserved Instance cliff, you can save yourself a lot of angst—and a lot of money—by buying your reservations at a more frequent cadence. This is the approach we have taken.
Our month 1 purchase will aim to cover all of the heavy-lifting parts of our infrastructure: EC2 instances with very high utilisation. This allows us to save on the large cost items quickly.
Month 2 onwards
Our following purchases will be much smaller. Each month, we will aim to address our changing needs. The outcome here is we will spend less time paying on-demand than we would have had we done one mega purchase annually. If and when we start using new instances, we can purchase RIs immediately without waiting until April of next year. Just as our needs are always changing, our selection of reservations changes too.
Monthly Calendar events
21st of each month
The RI Czar presents the purchase order for the following month. This should include:
- The number of RIs for each class.
- The impact on cash flow for the month.
- The savings this will represent.
28th of each month
The relevant stakeholders have until the 28th to raise any objections or propose alternatives.
1st of the month
The Czar actions the purchase order. Purchasing on the 1st of the month ensures there are no mid-month surprises as RIs expire.
2. Spot Priced Machines
Spot-priced machines let you pay the market price for AWS’s unused inventory. We often get EC2 instances 60-90% cheaper than the on-demand price.
How does it work?
When launching a spot instance you specify your bid threshold, i.e. the maximum amount you are willing to pay for the EC2 instance. While the spot price stays below your threshold, your machine lives. If the spot price exceeds your bid, your machine will be terminated with about two minutes’ warning. Our ingestion system runs purely on a spot fleet and, with some clever scripting, switches to on-demand during price spikes. Going forward we’re looking at scripting auto-scaling groups to use spot-priced instances.
Pinterest is one company that uses spot-priced machines to handle much of its load. It uses Reserved Instances for its normal load and spot-priced machines when scaling.
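The bid rule and the on-demand fallback can be illustrated with a small simulation. This is only a sketch of the decision logic, not our actual scripting, and it makes no AWS calls; the function name and all prices are hypothetical.

```python
def choose_capacity(spot_prices, bid, on_demand_price):
    """For each hourly spot price, decide where the workload runs and at what rate.

    The instance stays on spot until the price exceeds the bid; for those
    hours the workload falls back to on-demand.
    """
    plan = []
    for price in spot_prices:
        if price <= bid:
            plan.append(("spot", price))
        else:
            plan.append(("on-demand", on_demand_price))
    return plan


# Hypothetical hourly spot prices with one spike above a $0.06 bid.
for mode, rate in choose_capacity([0.03, 0.05, 0.12, 0.04], bid=0.06, on_demand_price=0.10):
    print(mode, rate)
```

In this made-up series only the third hour is billed at the on-demand rate; the rest run at the spot price.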
3. Shut it down
One of the selling points of the cloud is that you only pay for what you use. You can terminate machines when you are finished. Unfortunately, “can” and “do” are two very different things, and companies often waste money leaving machines running 24×7 when they could be off.
Shutting down all non-production environments outside office hours is one of the easiest ways to cut costs. We run our QA environment from 7am to 7pm, Monday to Friday, which saves us 64% on those machines.
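The 64% figure follows directly from the schedule: 12 hours a day for 5 days is 60 of the 168 hours in a week. A quick check, assuming the machines cost nothing while stopped (attached storage is ignored here); the function name is made up for illustration:

```python
HOURS_PER_WEEK = 7 * 24


def schedule_savings(hours_per_day, days_per_week):
    """Fraction of always-on cost saved by running only on the given schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


# 7am-7pm, Monday to Friday.
print(f"{schedule_savings(12, 5):.0%}")  # 64%
```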
4. Multi Tenant
This one is definitely not in the low-hanging-fruit category, as it involves moving workloads around, but co-hosting services offers enormous savings. Do you really need a shiny new EC2 instance for every Windows service your team writes, or can you host them all on a beefier machine or a cluster of machines?
We’ve done this and, in addition to the cost savings, have seen the following benefits:
- Fewer machines to maintain.
- Smaller spikes. If you have one machine with one service running at 60% and its load doubles due to a spike, you’re at 120% and in trouble. Contrast this with one machine running 10 services at a combined 60%: if one of them doubles, you are only at 66%.
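The spike arithmetic above can be checked with a short calculation. It assumes the load is split evenly across the services on the machine, which is a simplification; the function name is hypothetical.

```python
def load_after_spike(total_load_percent, services, factor=2):
    """Machine load after one of several equally loaded services spikes by `factor`."""
    per_service = total_load_percent / services
    return total_load_percent + per_service * (factor - 1)


print(load_after_spike(60, services=1))   # 120.0
print(load_after_spike(60, services=10))  # 66.0
```

The more services share a machine, the smaller the share of total load any single spike represents.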
This approach may not always be suitable, as it will depend on your client’s appetite for living on a shared environment.
5. Serve your data wisely
Our Ignite product serves APKs to Android devices. A lot of APKs. How much exactly? Last month we did 1.5 petabytes in US-West at a cost of $0.05 per GB. We also have Ignite services in South America and India, where data costs $0.08 and $0.19 per GB respectively.
What costs $55,000 in California costs $198,000 in São Paulo! If latency is not a concern, consider serving all of your data from one region, and if you are serving more than 500TB per month, talk to your AWS account manager about a preferential rate!
Reducing your AWS costs isn’t hard; it just takes discipline and a little time each month to ensure your processes are being followed. The rewards are well worth it.