Today we are excited to reveal early preview releases of two major projects the Akka team has been working on to improve data streaming on the JVM: Akka Streams and Reactive Streams. While the projects are related, we would like to spend a little time discussing how each is important to Typesafe and the Akka community.
Both of these efforts address major challenges we have seen developers face when working with streaming data, whether for real-time content delivery or for bulk data transfers. The difficulty in these scenarios is keeping the data flowing while limiting the resources consumed on the systems through which the stream passes. The crucial ingredient to getting this right is flow control. The idea on which Reactive Streams is built is to protect each consumer of data from being overwhelmed by its producer by propagating back pressure. This is done in an asynchronous, non-blocking fashion, in accordance with the principles laid out in the Reactive Manifesto.
Reactive Streams is an engineering collaboration between heavy hitters in the area of streaming data on the JVM. With the Reactive Streams Special Interest Group, we set out to standardize a common ground for achieving statically typed, high-performance, low-latency, asynchronous streams of data with built-in non-blocking back pressure, with the goal of creating a vibrant ecosystem of interoperating implementations, and with a vision of one day making it into a future version of Java.
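At its core, the draft specification standardizes a small set of interfaces through which demand is signalled from consumers back to producers. As a rough illustration only (the normative definitions are Java interfaces published by the initiative, and the exact names in the draft may still change), the shape of that contract can be sketched in Scala like this:

```scala
// Illustrative Scala mirror of the kind of interfaces Reactive Streams standardizes.
// This is a sketch for explanation, not the normative API.

trait Publisher[T] {
  // A Subscriber registers its interest; the Publisher answers with a Subscription.
  def subscribe(subscriber: Subscriber[T]): Unit
}

trait Subscriber[T] {
  def onSubscribe(subscription: Subscription): Unit
  def onNext(element: T): Unit        // called at most as many times as was requested
  def onError(cause: Throwable): Unit
  def onComplete(): Unit
}

trait Subscription {
  // The back pressure channel: the Subscriber signals how many more elements it can handle.
  def request(n: Long): Unit
  def cancel(): Unit
}

// A transformation stage consumes one stream and produces another.
trait Processor[In, Out] extends Subscriber[In] with Publisher[Out]
```

The key point is that a Publisher may only emit as many elements as its Subscriber has requested, which is how back pressure propagates without blocking any threads.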
Transformations of data streams proceed in a series of steps which can be executed in parallel, in a pipelined fashion:
In the example depicted above, an incoming data stream is ingested into storage via a sequence of transformations that validate and enrich it, pulling in additional data from an external service. Back pressure flows in the opposite direction of the data, propagating all the way back to the source. When the system reaches its capacity, overflow handling kicks in to protect the system and keep it functioning: either the external source is slowed down (e.g. by propagating back pressure via TCP's congestion-control mechanisms) or requests are dropped in a controlled fashion.
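To make the direction of that signal concrete, here is a consumer sketched against the illustrative traits above: it requests elements in bounded batches and only asks for more once the current batch has been processed, so its producer can never run ahead of it. The BoundedSubscriber name and the batching policy are our own illustration, not part of the specification:

```scala
// A sketch of a self-protecting consumer, written against the illustrative
// traits shown earlier. It keeps at most `batchSize` elements in flight.
final class BoundedSubscriber[T](batchSize: Long)(handle: T => Unit) extends Subscriber[T] {
  private var subscription: Subscription = _
  private var remaining: Long = 0L

  override def onSubscribe(s: Subscription): Unit = {
    subscription = s
    remaining = batchSize
    s.request(batchSize)            // initial demand: this is the back pressure signal
  }

  override def onNext(element: T): Unit = {
    handle(element)
    remaining -= 1
    if (remaining == 0L) {          // batch drained: signal fresh demand upstream
      remaining = batchSize
      subscription.request(batchSize)
    }
  }

  override def onError(cause: Throwable): Unit = cause.printStackTrace()
  override def onComplete(): Unit = println("stream completed")
}
```

A real implementation would additionally have to follow the specification's rules around concurrency and error handling, but the demand-driven shape stays the same.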
The interoperability of Reactive Streams means that the processing steps in the example could be built from different libraries that all exchange data via the standardized interfaces while preserving non-blocking back pressure throughout.
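For example, later Akka Streams releases expose this interoperability directly: a pipeline can be materialized into a plain Reactive Streams Publisher, and a Publisher produced by any other conforming library can be used as a Source. The sketch below uses those later APIs (Sink.asPublisher and Source.fromPublisher, with an Akka 2.6-style setup); the exact names in this early preview may differ, but the idea is the same:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import org.reactivestreams.Publisher

object InteropSketch extends App {
  // Assumes Akka 2.6+, where the ActorSystem implicitly provides the materializer.
  implicit val system: ActorSystem = ActorSystem("interop-sketch")

  // Expose an Akka Streams pipeline as a plain Reactive Streams Publisher ...
  val publisher: Publisher[Int] =
    Source(1 to 1000).map(_ * 2).runWith(Sink.asPublisher(fanout = false))

  // ... which any other Reactive Streams implementation could now subscribe to.
  // Here we feed it straight back into Akka Streams to keep the sketch self-contained.
  Source.fromPublisher(publisher).runForeach(println)
}
```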
The Akka implementation of the Reactive Streams draft specification is based on Actors as its mechanism for execution, distribution and resilience. A fluent DSL allows the formulation of processing graphs like the one shown above, which are then turned into an underlying Actor network by a configurable translator. The current implementation runs every step within its own Actor to make full use of the available parallelism, but in the future we will add other variants, for example ones that optimize for lower latency by fusing multiple operations to run within a single Actor.
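As a taste of what that looks like in code, here is a minimal sketch of a pipeline shaped like the example above: validate, enrich via an external service, then store. It is written against the fluent DSL of later Akka Streams releases (the preview API is still evolving), and Record, Enriched, lookupExternal and store are hypothetical stand-ins for application code:

```scala
import akka.actor.ActorSystem
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Future

object PipelineSketch extends App {
  // Assumes Akka 2.6+, where the ActorSystem implicitly provides the materializer.
  implicit val system: ActorSystem = ActorSystem("pipeline-sketch")

  // Hypothetical domain types and services standing in for the stages in the example.
  final case class Record(raw: String)
  final case class Enriched(raw: String, extra: String)

  def lookupExternal(r: Record): Future[Enriched] =
    Future.successful(Enriched(r.raw, extra = "from-external-service")) // placeholder for a real call
  def store(e: Enriched): Future[Unit] =
    Future.successful(())                                               // placeholder for real storage

  val validate = Flow[Record].filter(_.raw.nonEmpty)                    // drop invalid records
  val enrich   = Flow[Record].mapAsync(parallelism = 4)(lookupExternal) // bounded number of calls in flight

  Source(List(Record("a"), Record(""), Record("b")))
    .via(validate)
    .via(enrich)
    .buffer(256, OverflowStrategy.backpressure)                         // bounded buffering instead of unbounded growth
    .mapAsync(parallelism = 1)(store)
    .runWith(Sink.ignore)
}
```

With the current implementation described above, each of these steps would run in its own Actor, and the demand signals flowing between them are what keep the pipeline within its resource bounds.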
Done reading about it and now want to try it out? Don’t hesitate to take the new “Akka Streams with Scala!” Activator template for a spin! And be sure to give us feedback to help shape Akka Streams and the Reactive Streams initiatives!