SkySQL and Life After Amazon Redshift, Part 2
We already gutted Amazon Redshift in Part 1, so we might as well finish with a fatality. Yes, the Mortal Kombat in the arcade at the roller rink kind of fatality. And yes, I had custom roller skates. Will I share pictures in a future blog post? No.
Now, back to cloud data warehouses. Redshift is great if…
- You have money to burn, and enjoy bonfires.
- You love complexity, and prefer Reptangles over Legos.
If you don’t, it’s time for something better (and less expensive). Here’s the thing.
Redshift is up to 40% more expensive than SkySQL (and runs on a 15-year-old database).
If you just need a data warehouse for an hour, it doesn’t matter. What’s a few dollars? But we’re talking about thousands of dollars monthly, tens of thousands annually. It adds up, and fast.
Let’s dig in.
Redshift pricing
Amazon has three instance types available for Redshift, but recommends two, so let’s focus on those. The DC2 instance types store data on local SSDs. The RA3 instance types use Redshift Managed Storage (RMS) with data stored on local SSDs and spilling over to S3 when they are full.
Here are your options (all four of them):
Name | vCPU | Memory | Storage | Price |
dc2.large | 2 | 15GB | 0.16TB | $0.33/hour |
dc2.8xlarge | 32 | 244GB | 2.56TB | $6.40/hour |
ra3.4xlarge | 12 | 96GB | 64TB | $3.606/hour |
ra3.16xlarge | 48 | 384GB | 64TB | $14.424/hour |
Basically, you can choose between extra small and extra large. However, unless you choose the dc2.large instance type, Redshift requires a cluster with at least two (2) nodes, doubling the starting price. It’s like that shirt that’s only available in the wrong size, except you have to buy two of them.
Name | vCPU | Memory | Storage* | Price |
1x dc2.large | 2 | 15GB | 0.16TB | $0.33/hour |
2x dc2.8xlarge | 64 | 488GB | 2.56TB | $12.80/hour |
2x ra3.4xlarge | 24 | 192GB | 64TB | $7.212/hour |
2x ra3.16xlarge | 96 | 768GB | 64TB | $28.848/hour |
* A cluster of two (2) does not double the storage because the data is replicated, with both nodes storing all of the data.
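To make the two-node rule concrete, here is a small Python sketch that derives the minimum cluster price from the per-node prices above. The dictionary and function names are made up for illustration, and the minimum node counts simply encode the rule described in the text (only dc2.large can run as a single node).

```python
# Per-node hourly prices (USD), copied from the first Redshift table.
NODE_PRICE = {
    "dc2.large": 0.33,
    "dc2.8xlarge": 6.40,
    "ra3.4xlarge": 3.606,
    "ra3.16xlarge": 14.424,
}

# Minimum node counts as described in the text: only dc2.large can run alone.
MIN_NODES = {name: (1 if name == "dc2.large" else 2) for name in NODE_PRICE}


def min_cluster_price(instance_type: str) -> float:
    """Hourly price of the smallest cluster allowed for an instance type."""
    return round(NODE_PRICE[instance_type] * MIN_NODES[instance_type], 3)


for name in NODE_PRICE:
    print(f"{MIN_NODES[name]}x {name}: ${min_cluster_price(name):.3f}/hour")
```

The printed prices should line up with the Price column of the cluster table.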
SkySQL pricing
SkySQL, on the other hand, does not force you to use a separate, smaller set of instance types for data warehouses. You can create a data warehouse with any of the instance types available in SkySQL.
Name | vCPU | Memory | Storage * | Price |
Sky-4×15 | 4 | 15GB | Unlimited | $0.45/hour |
Sky-4×26 | 4 | 26GB | Unlimited | $0.63/hour |
Sky-8×30 | 8 | 30GB | Unlimited | $0.90/hour |
Sky-8×52 | 8 | 52GB | Unlimited | $1.27/hour |
Sky-16×60 | 16 | 60GB | Unlimited | $1.81/hour |
Sky-16×104 | 16 | 104GB | Unlimited | $2.53/hour |
Sky-32×120 | 32 | 120GB | Unlimited | $3.61/hour |
Sky-32×208 | 32 | 208GB | Unlimited | $5.07/hour |
Sky-64×240 | 64 | 240GB | Unlimited | $7.202/hour |
Sky-64×416 | 64 | 416GB | Unlimited | $10.141/hour |
* SkySQL data warehouse storage is unlimited, but like RMS, it does incur a cost. RMS is $0.0271/GB per month while SkySQL is $0.026/GB per month.
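If you want to see what those per-gigabyte rates mean for a real data set, here is a rough Python sketch. The function name is hypothetical, and it assumes decimal terabytes (1 TB = 1,000 GB) and that the full data set is billed at the listed rate, which may not match how either service actually meters storage.

```python
RMS_PER_GB_MONTH = 0.0271     # Redshift Managed Storage rate from the footnote
SKYSQL_PER_GB_MONTH = 0.026   # SkySQL rate from the footnote
GB_PER_TB = 1000              # assumption: decimal terabytes


def monthly_storage_cost(terabytes: float, price_per_gb: float) -> float:
    """Monthly storage bill in USD for a data set of the given size."""
    return round(terabytes * GB_PER_TB * price_per_gb, 2)


for tb in (1, 10, 64):
    rms = monthly_storage_cost(tb, RMS_PER_GB_MONTH)
    sky = monthly_storage_cost(tb, SKYSQL_PER_GB_MONTH)
    print(f"{tb:>2} TB: RMS ${rms:,.2f}/month, SkySQL ${sky:,.2f}/month")
```

Storage is close to a wash between the two. The real difference is in compute.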
Redshift vs. SkySQL – gigabytes of data
If you have a small amount of data, say gigabytes, the Redshift dc2.large instances will suffice. The price difference between Redshift and SkySQL is negligible.
vCPU/Memory | SkySQL instance | Redshift cluster | SkySQL price | Redshift price |
4/30GB | 1x Sky-4×26 | 2x dc2.large | $0.63/hour | $0.66/hour |
8/60GB | 1x Sky-8×52 | 4x dc2.large | $1.27/hour | $1.32/hour |
16/120GB | 1x Sky-16×104 | 8x dc2.large | $2.53/hour | $2.64/hour |
32/240GB | 1x Sky-32×208 | 16x dc2.large | $5.07/hour | $5.28/hour |
Redshift vs. SkySQL – a few terabytes of data
If you have a medium amount of data, say single-digit terabytes, you’re looking at Redshift dc2.8xlarge instance types. This is where Redshift prices start to take off. If you only need 32 vCPU, Redshift is now about 150% more expensive than SkySQL ($12.80 vs. $5.07 per hour) because you have to create a cluster with two (2) nodes. If you need 64 vCPU, it’s not as bad. Redshift is now about 25% more expensive than SkySQL ($12.80 vs. $10.14 per hour).
Note: The Redshift clusters below (2x dc2.8xlarge) can store up to 2.56TB of data (because the data is replicated). Beyond that, you’ll need to add more nodes or move up to the RA3 instance types. The SkySQL instance can store an unlimited amount of data.
vCPU/Memory | SkySQL instance | Redshift cluster | SkySQL price | Redshift price |
32/244GB | 1x Sky-32×208 | 2x dc2.8xlarge | $5.07/hour | $12.80/hour |
64/488GB | 1x Sky-64×416 | 2x dc2.8xlarge | $10.14/hour | $12.80/hour |
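The "more expensive" percentages are just a ratio of the two hourly prices. Here is a minimal Python sketch of that arithmetic using the rows above; the function name is made up for illustration.

```python
def percent_more_expensive(redshift_hourly: float, skysql_hourly: float) -> float:
    """How much more Redshift costs, as a percentage of the SkySQL price."""
    return (redshift_hourly / skysql_hourly - 1) * 100


# (label, SkySQL $/hour, Redshift $/hour), taken from the table rows above.
rows = [
    ("32 vCPU", 5.07, 12.80),
    ("64 vCPU", 10.14, 12.80),
]

for label, skysql, redshift in rows:
    pct = percent_more_expensive(redshift, skysql)
    print(f"{label}: Redshift is {pct:.0f}% more expensive")
```

Swap in the prices from any other row to check the premium for a different size.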
Redshift vs. SkySQL – many terabytes of data
If you have a large amount of data, say tens to hundreds of terabytes, you’ll need the Redshift ra3 instance types. This is where Redshift prices enter the upper atmosphere. If you only need 12 vCPU, Redshift is now 185% more expensive than SkySQL because you have to create a cluster with two (2) nodes. If you only need 24 vCPU, it’s not as bad. Redshift is now 40% more expensive than SkySQL.
Note: The Redshift clusters below (2x ra3.4xlarge) can store up to 64TB of data (because the data is replicated). Beyond that, you’ll need to add more nodes or move up to the ra3.16xlarge instances. The SkySQL instance can store an unlimited amount of data.
vCPU/Memory | SkySQL instance | Redshift cluster | SkySQL price | Redshift price |
12/96GB | 1x Sky-16×104 | 2x ra3.4xlarge | $2.53/hour | $7.212/hour |
24/192GB | 1x Sky-32×208 | 2x ra3.4xlarge | $5.07/hour | $7.212/hour |
48/384GB | 1x Sky-64×416 | 4x ra3.4xlarge | $10.14/hour | $14.424/hour |
In the examples above, I’ve aligned SkySQL and Redshift instance types based on vCPU, memory and storage – and using the minimum number of nodes. Because SkySQL instance types range from 2-64 vCPU, and have unlimited storage, you only need a single node.
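Hourly prices are easy to shrug off, so here is a rough Python sketch that turns the RA3 comparison into monthly and annual dollars. It assumes the warehouse runs around the clock at on-demand prices and uses 730 hours per month (8,760 hours per year divided by 12). Both are my assumptions, as is the function name.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12, running 24/7


def extra_monthly_cost(skysql_hourly: float, redshift_hourly: float) -> float:
    """Extra USD per month paid for Redshift over SkySQL at the given hourly prices."""
    return round((redshift_hourly - skysql_hourly) * HOURS_PER_MONTH, 2)


# (label, SkySQL $/hour, Redshift $/hour), taken from the RA3 comparison table.
rows = [
    ("12 vCPU", 2.53, 7.212),
    ("24 vCPU", 5.07, 7.212),
    ("48 vCPU", 10.14, 14.424),
]

for label, skysql, redshift in rows:
    monthly = extra_monthly_cost(skysql, redshift)
    print(f"{label}: +${monthly:,.2f}/month, +${monthly * 12:,.2f}/year")
```

That is the "thousands of dollars monthly, tens of thousands annually" from the intro, under those assumptions.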
The cloud-native advantage
With a single node, SkySQL scales up to 64 vCPU and provides high availability via Kubernetes, self-repair, persistent disks and object storage. If a node fails, Kubernetes will create a new one and attach the persistent disk. The data? It’s still on object storage. That’s what we call a cloud-native storage architecture. It’s what happens when you engineer a cloud database using current technology instead of a 15-year-old database.
When you’re ready to scale beyond 64 vCPU, SkySQL will have you covered. As mentioned previously, ColumnStore is fully distributed. In a future SkySQL update, we’re adding support for distributed deployments so you can scale out compute as much as you need to.
Welcome to life after Redshift!