Benchmarking database sharding in Akka

Andrzej Ludwikowski Senior Engineer, Akka Platform Engineering, Lightbend, Inc.

Akka 24.05 introduced a significant feature: database sharding for event storage. This innovation allows developers to achieve unprecedented throughput with ordinary relational databases like PostgreSQL: throughput that normally requires high-priced, specialized databases. In this post, we'll explore the power of this new feature through a series of benchmarks, demonstrating how it scales from one to eight database instances.

For a comprehensive understanding of Akka's innovative approach to database sharding, we encourage you to review the “The database is the bottleneck” article by Patrik Nordwall. That article provides an in-depth explanation of how Akka implements sharding, the concept of slice ranges, and how these are mapped to databases.
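To build a rough intuition for that mapping, the sketch below models the slice idea in a few lines of Scala: every persistence id is hashed into one of 1,024 event slices, and each database owns a contiguous range of slices. This is an illustrative simplification under stated assumptions, not the actual implementation, which lives in akka-persistence-r2dbc; the object and method names here are hypothetical.

```scala
// Illustrative sketch only: a simplified model of slice-based database sharding.
// The real logic lives in akka-persistence-r2dbc; names here are hypothetical.
object SliceSharding {

  // Akka maps every persistence id to one of 1024 event slices.
  val NumberOfSlices = 1024

  def sliceFor(persistenceId: String): Int =
    math.abs(persistenceId.hashCode % NumberOfSlices)

  // With N databases, each database owns a contiguous, equally sized slice range.
  def databaseIndexFor(persistenceId: String, numberOfDatabases: Int): Int =
    sliceFor(persistenceId) / (NumberOfSlices / numberOfDatabases)
}
```

In this simplified model, with 8 databases slices 0-127 would live in the first database, 128-255 in the second, and so on.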

Benchmark setup

Application

To evaluate the performance of Akka's new database sharding feature, we created a small Akka application that emulates a digital twin solution for IoT devices. Each device keeps track of its last ten updates, to simulate some complex failure-detection logic when processing a new update and to put some pressure on heap memory. A single update (write request) contains a 500-byte array of random data, which is appended to the event store. A read request returns the latest update from the device. Depending on the test, we keep 3k to 5k device entities per Akka node. Each Akka node is deployed as a Kubernetes pod on a separate host; in other words, one host = one Akka pod. We test request-response HTTP communication with JSON-based payload serialization. Internally, we use Protocol Buffers to exchange data between Akka nodes and for persistence.
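To make the setup more concrete, here is a minimal sketch of what such a device entity could look like as an Akka event-sourced behavior. It is an assumption-based illustration, not the benchmark application's actual code: the message names and reply types are invented, and the HTTP layer and failure-detection logic are omitted.

```scala
import akka.Done
import akka.actor.typed.ActorRef
import akka.persistence.typed.PersistenceId
import akka.persistence.typed.scaladsl.{Effect, EventSourcedBehavior}

// Minimal, hypothetical sketch of the digital-twin entity described above.
object DeviceEntity {

  sealed trait Command
  final case class RecordUpdate(payload: Array[Byte], replyTo: ActorRef[Done]) extends Command
  final case class GetLastUpdate(replyTo: ActorRef[Option[Array[Byte]]]) extends Command

  final case class UpdateRecorded(payload: Array[Byte])

  // The state keeps only the last ten updates, which puts some pressure on the heap.
  final case class State(recent: Vector[Array[Byte]]) {
    def add(payload: Array[Byte]): State =
      copy(recent = (recent :+ payload).takeRight(10))
  }

  def apply(deviceId: String): EventSourcedBehavior[Command, UpdateRecorded, State] =
    EventSourcedBehavior[Command, UpdateRecorded, State](
      persistenceId = PersistenceId("Device", deviceId),
      emptyState = State(Vector.empty),
      commandHandler = (state, command) =>
        command match {
          case RecordUpdate(payload, replyTo) =>
            // A write request appends a ~500-byte event to the event store.
            Effect.persist(UpdateRecorded(payload)).thenReply(replyTo)(_ => Done)
          case GetLastUpdate(replyTo) =>
            // A read request is answered from in-memory state and never hits the database.
            Effect.reply(replyTo)(state.recent.lastOption)
        },
      eventHandler = (state, event) => state.add(event.payload)
    )
}
```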

Database

Every RDS database used for the benchmarks has the same configuration:

  • Instance type: db.r6g.2xlarge (8 vCPUs, 64 GB RAM)
  • Storage: 100 GB io2, 5,000 IOPS

Test

For testing, we used Gatling as the load-generation tool. In each scenario we test a mixed load of 75% read operations and 25% update (write) operations. We gradually increase the throughput until we reach the maximum for a given setup and then hold it for 20 minutes.
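As an illustration, a much-simplified version of such a Gatling scenario might look like the sketch below. The base URL, endpoint paths, device-id range, payload handling and injection rates are assumptions made for the example, not the actual benchmark definition.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._
import scala.util.Random

// Simplified sketch of a 75% read / 25% write Gatling scenario.
class DeviceMixedLoadSimulation extends Simulation {

  private val httpProtocol = http.baseUrl("http://akka-app:8080") // hypothetical service endpoint

  // Feeder that picks a random device id per request.
  private val deviceIds = (1 to 100000).map(i => Map("deviceId" -> s"device-$i"))

  // A 500-byte random payload, base64-encoded into a JSON field
  // (the real test presumably varies the payload per request).
  private val bytes = new Array[Byte](500)
  Random.nextBytes(bytes)
  private val payload = java.util.Base64.getEncoder.encodeToString(bytes)

  private val scn = scenario("mixed read/write")
    .feed(deviceIds.random)
    .randomSwitch(
      75.0 -> exec(http("read latest").get("/devices/#{deviceId}/latest")),
      25.0 -> exec(
        http("append update")
          .post("/devices/#{deviceId}/updates")
          .body(StringBody(s"""{"data":"$payload"}"""))
          .asJson)
    )

  setUp(
    scn.inject(
      rampUsersPerSec(1000).to(10000).during(5.minutes), // ramp up to the target rate
      constantUsersPerSec(10000).during(20.minutes)      // hold the peak for 20 minutes
    )
  ).protocols(httpProtocol)
}
```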

Results

1 database instance

         Throughput (req/s)   Latency (ms, 99p)
All      108,000               9
Read      81,000               3
Write     27,000              14

To handle this load, the Akka cluster used 6 * EC2 m5.4xlarge (16 vCPUs) instances. That’s our baseline. If you are not familiar with Akka, keep in mind that thanks to the Actor Model, read requests do not touch the database at all: the latest state lives in the entity's memory. Handling 27k writes per second with a single Postgres database is also pretty impressive; using the AWS io2 storage type definitely helps to sustain that throughput.

2 database instances

         Throughput (req/s)   Latency (ms, 99p)
All      252,000               17
Read     189,000                8
Write     63,000               28

This time we pushed the throughput even further, sacrificing some latency. The Akka cluster used 12 * EC2 m5.4xlarge (16 vCPUs) instances.

4 database instances

         Throughput (req/s)   Latency (ms, 99p)
All      500,000               29
Read     375,000               15
Write    125,000               49

We can now see a very nice, close-to-linear scaling characteristic. To handle 500k req/s, the Akka cluster required 25 * EC2 m5.4xlarge (16 vCPUs) instances.

8 database instances

         Throughput (req/s)   Latency (ms, 99p)
All      1,000,000             34
Read       750,000             23
Write      250,000             52

I must admit, seeing 1M req/s in Grafana was very satisfying. Keep in mind that the latency was measured at very high resource saturation in all tests. It can be improved by switching to a less intensive load or increasing the number of Akka nodes.

[Grafana dashboard: requests and responses per second, 1-minute view]

This time we used a different instance configuration: the Akka cluster fully utilized 25 * EC2 m5.8xlarge (32 vCPUs) instances. Together with the databases, 864 vCPUs were required to handle this load. Generating it required 4 * c7i.4xlarge Gatling instances.

Analysis

A graph is worth more than a thousand words, so let’s examine this data from a different perspective.

Scaling Characteristic

The benchmark results reveal a compelling story of scalability. As we increase the number of database instances from one to eight, we observe a near-linear growth in throughput. This impressive scalability demonstrates the effectiveness of Akka's database sharding strategy.

Cost per 1k IOPS per Month

On the other hand, costs are close to a flat line, varying from $24.32 to $28.78 per 1,000 IOPS per month, where one IOPS here means one application-level read or write request per second. For all calculations, we used one-year reserved instance pricing.

Summary

The ability to distribute write operations across multiple databases while maintaining data consistency is a key factor in this performance boost. By alleviating the pressure on a single database, Akka allows each instance to operate more efficiently, resulting in a cumulative increase in overall system throughput.

However, it's important to note that perfect linear scaling is rare in real-world scenarios. Factors such as network latency, coordination overhead, and potential hotspots in data distribution may introduce some deviation from the ideal linear growth.

By enabling high throughput with ordinary PostgreSQL instances, Akka provides a cost-effective scaling solution. Organizations can leverage their existing PostgreSQL expertise and infrastructure while achieving performance levels that previously may have required specialized and more complex database solutions, which could have led to a major system rewrite.

For me, as an architect, the actual numbers are not that important. Whether it is 500k req/s or 501k req/s does not make a big difference. The most important aspect is that even if I decide to start with a moderate Postgres instance, I keep my options open for future scaling requirements. I can scale not only vertically but also horizontally. Some data migration is required for a live data set, but at least I’m not blocked. Not to mention that, from the application's perspective, not a single line of code needs to change to work with more than one database. It’s just a matter of changing the database configuration.
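As a rough, hypothetical sketch of what "changing the database configuration" means in practice, the idea is to tell the persistence plugin how many slice partitions and databases to use, plus a connection factory per database. The setting names below are assumptions from memory and should be checked against the akka-persistence-r2dbc documentation; the snippet only illustrates that scaling out is a configuration concern, not an application change.

```scala
import com.typesafe.config.ConfigFactory

object ShardedDatabaseConfig {
  // Hypothetical illustration only: exact keys are assumptions; see the
  // akka-persistence-r2dbc docs for the real data-partition and per-database settings.
  val eightDatabases = ConfigFactory.parseString(
    """
    akka.persistence.r2dbc.data-partition {
      # split the event slices into 8 partitions ...
      number-of-partitions = 8
      # ... backed by 8 separate PostgreSQL instances
      number-of-databases = 8
    }
    """)
}
```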