Akka Streams & HTTP provides reactive, asynchronous, and non-blocking streams for processing data and HTTP requests and responses. It builds upon Akka IO and the Reactive Streams initiative to allow stream-based topologies to be declared and run for tasks like processing big data, serving clients simultaneously with limited resources, and building distributed applications that integrate with external systems over HTTP. Key features include stream sources, sinks, and transformations along with a routing DSL for building HTTP servers and clients on top of Akka IO and HTTP Core.
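A minimal sketch of those core abstractions (source, transformation, sink), assuming Akka 2.6+ where the ActorSystem implicitly provides the stream Materializer; stage names are illustrative:

```scala
// A minimal sketch, assuming Akka 2.6+ (where the ActorSystem implicitly
// provides the stream Materializer); stage names are illustrative.
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}

object StreamsQuickstart extends App {
  implicit val system: ActorSystem = ActorSystem("quickstart")
  import system.dispatcher

  val numbers = Source(1 to 100)            // a stream source
  val doubler = Flow[Int].map(_ * 2)        // a reusable transformation
  val printer = Sink.foreach[Int](println)  // a stream sink

  // Declaring the topology does nothing by itself; runWith materializes
  // and runs it, yielding a Future that completes with the sink's result.
  numbers.via(doubler).runWith(printer).onComplete(_ => system.terminate())
}
```

Nothing runs until the blueprint is materialized by `runWith`; the `Flow` can be reused in any number of streams.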
The document discusses Konrad Malawski's talk on reactive streams at GeeCON 2014 in Krakow, Poland. It introduces reactive streams and their goals of standardized, back-pressured asynchronous stream processing. Reactive streams allow different implementations like RxJava, Reactor, Akka Streams, and Vert.x to interoperate using a common protocol. The document provides an example of integrating RxJava, Akka Streams, and Reactor streams to demonstrate this interoperability. It also discusses concepts like back pressure to prevent buffer overflows when processing streams.
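To make the interoperability concrete, here is a hedged sketch of one leg of such an integration, assuming RxJava 2 on the classpath: `Flowable` implements the Reactive Streams `Publisher` interface, so Akka Streams can consume it and re-publish its own output, with back-pressure crossing the library boundary via the common protocol. (The talk combined RxJava, Akka Streams, and Reactor; the Reactor leg is omitted here.)

```scala
// A hedged interop sketch, assuming RxJava 2: Flowable implements
// org.reactivestreams.Publisher, so Akka Streams can consume it and
// expose its own output as a Publisher again.
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import io.reactivex.Flowable
import org.reactivestreams.Publisher

object Interop extends App {
  implicit val system: ActorSystem = ActorSystem("interop")

  val rx: Flowable[Integer] = Flowable.range(1, 10)   // an RxJava source

  // Akka Streams consumes the RxJava Publisher, transforms the elements,
  // and re-publishes the result for any other implementation to subscribe.
  val out: Publisher[Int] =
    Source.fromPublisher(rx).map(_ * 2).runWith(Sink.asPublisher(fanout = false))

  Flowable.fromPublisher(out).subscribe((i: Int) => println(i))
}
```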
Things were easier when all our data used to be offline, analyzed overnight in batches. Now our data is online, in motion, and generated constantly. For architects, developers and their businesses, this means that there is an urgent need for tools and applications that can deliver real-time (or near real-time) streaming ETL capabilities. In this session by Konrad Malawski, author, speaker and Senior Akka Engineer at Lightbend, you will learn how to build these streaming ETL pipelines with Akka Streams, Alpakka and Apache Kafka, and why they matter to enterprises that are increasingly turning to streaming Fast Data applications.
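As a flavor of what such a pipeline can look like, here is a hedged sketch using the Alpakka Kafka connector (the akka-stream-kafka module); the topic, group id, and `loadToWarehouse` step are hypothetical placeholders:

```scala
// A hedged ETL sketch with the Alpakka Kafka connector (akka-stream-kafka).
// The topic, group id and loadToWarehouse step are hypothetical placeholders.
import akka.actor.ActorSystem
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.StringDeserializer

import scala.concurrent.Future

object EtlPipeline extends App {
  implicit val system: ActorSystem = ActorSystem("etl")
  import system.dispatcher

  val settings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("localhost:9092")
      .withGroupId("etl-example")

  // Hypothetical "load" step writing to a downstream store.
  def loadToWarehouse(row: String): Future[Unit] = Future(println(row))

  Consumer
    .plainSource(settings, Subscriptions.topics("events")) // Extract
    .map(_.value.trim.toUpperCase)                         // Transform (toy)
    .mapAsync(parallelism = 4)(loadToWarehouse)            // Load, back-pressured
    .runWith(Sink.ignore)
}
```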
Akka Streams and its amazing handling of stream back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially use cases where the amount of work grows as you process it, making you really value the back-pressure. This talk takes a sample web crawler use case where each processing pass expands to a larger and larger workload to process, and discusses how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts. In addition, we will also provide some constructive “rants” about the architectural components, the maturity or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
Presentation by Akara Sucharitakul about "Asynchronous Orchestration DSL on squbs" at Scala Bay Meetup
Get up and running quickly with Apache Kafka http://kafka.apache.org/
* Fast: A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.
* Scalable: Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers.
* Durable: Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
* Distributed by Design: Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
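For a quick start from code, a hedged producer sketch using Kafka's standard Java client from Scala; the broker address and topic name are placeholders:

```scala
// A quick-start producer sketch using Kafka's standard Java client from
// Scala; the broker address and topic name are placeholders.
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ProduceExample extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  // Records with the same key land in the same partition, preserving order.
  (1 to 5).foreach { i =>
    producer.send(new ProducerRecord("events", s"key-$i", s"message $i"))
  }
  producer.close()
}
```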
How many services do you have? 5, 10, 100? How do you even run a large number of services? A single microservice may be relatively simple. But services also mean distributed systems, which are inherently complex. 5 services are complex. A thousand services across many generations are at least 200 times as complex. How do we deal with such complexity? This talk discusses service architecture at Internet scale, the need for larger transaction density, larger horizontal and vertical scale, more predictable latencies under stress, and the need for standardization and visibility. We’ll dive into how we build our latest generation service infrastructure based on Scala and Akka to serve the needs of such a large-scale ecosystem. Lastly, have your cake and eat it too. No, we’re not keeping all the goodies only to ourselves. They are all there for you in open source.
The term 'streams' has been getting pretty overloaded recently–it's hard to know where to best use different technologies with streams in the name. In this talk by noted hAkker Konrad Malawski, we'll disambiguate what streams are and what they aren't, taking a deeper look into Akka Streams (the implementation) and Reactive Streams (the standard). You'll be introduced to a number of real life scenarios where applying back-pressure helps to keep your systems fast and healthy at the same time. While the focus is mainly on the Akka Streams implementation, the general principles apply to any kind of asynchronous, message-driven architectures.
This document discusses Akka Streams, which provide asynchronous back pressured stream processing in Akka. It describes key Akka Stream concepts like sources, sinks, and flows. It also discusses how Akka Streams integrate with other technologies like Reactive Streams, Kafka, HTTP servers and clients. Alpakka is mentioned as a community for developing Akka Stream connectors.
This document provides an overview and introduction to Akka Streams and Reactive Streams. Some key points:
- Reactive Streams is a standard for asynchronous stream processing with non-blocking back pressure to prevent issues like out-of-memory errors.
- Akka Streams is a toolkit for building powerful concurrent and distributed applications simply, using a Reactive Streams-compliant API. It includes sources, sinks, flows and other stages for stream processing.
- Examples show how to create simple stream graphs that process data asynchronously using the Akka Streams APIs, in both Java and Scala, in just a few lines of code. More complex examples demonstrate features like parallelization.
- The Alpakka community develops connectors that integrate Akka Streams with external technologies.
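In that spirit, a few-lines sketch of a simple graph plus `mapAsync` parallelization: up to four invocations of the (hypothetical) `enrich` service run concurrently, while back-pressure and element order are preserved:

```scala
// A few-lines graph plus mapAsync for parallelization. The enrich service
// is a hypothetical placeholder; up to four calls run concurrently while
// element order and back-pressure hold.
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Future

object Parallelized extends App {
  implicit val system: ActorSystem = ActorSystem("demo")
  import system.dispatcher

  def enrich(n: Int): Future[String] = Future(s"enriched-$n") // placeholder

  Source(1 to 1000)
    .mapAsync(parallelism = 4)(enrich) // parallel, order-preserving stage
    .runWith(Sink.foreach(println))
}
```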
This document provides an overview of Akka Streams, which is a toolkit for building highly concurrent, distributed, and resilient message-driven applications on the JVM. It discusses key aspects of Akka Streams including asynchronous back pressured stream processing using sources, sinks, and flows; non-linear stream topologies; Reactive Streams compatibility; the Java and Scala APIs; materialization; integrations with HTTP and Alpakka community connectors for technologies like Kafka, MQTT, and Cassandra; and opportunities to contribute to Akka Streams.
Presentation of specs2 functionality, from simple features to less well-known ones, plus an overview of the next release.
Akka HTTP is a toolkit for building scalable REST services in Scala. It provides a high-level API built on top of Akka actors and Akka Streams for writing asynchronous, non-blocking and resilient microservices. The document discusses Akka HTTP's architecture, routing DSL, directives, testing, and additional features like file uploads and websockets. It also compares Akka HTTP to other Scala frameworks and outlines pros and cons of using Akka HTTP for building REST APIs.
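A minimal routing-DSL sketch of the style discussed, assuming Akka HTTP 10.2+ where `Http().newServerAt` is available (earlier versions bind with `Http().bindAndHandle`):

```scala
// A minimal routing-DSL sketch, assuming Akka HTTP 10.2+ (older versions
// bind with Http().bindAndHandle instead of newServerAt).
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._

object HelloServer extends App {
  implicit val system: ActorSystem = ActorSystem("http-demo")

  // Directives nest into a routing tree: path, method, then completion.
  val route =
    pathPrefix("api") {
      path("hello") {
        get {
          complete("Hello, Akka HTTP!")
        }
      }
    }

  Http().newServerAt("localhost", 8080).bind(route)
}
```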
This document summarizes a presentation about Akka streams, which is a toolkit for building highly concurrent, distributed, and resilient message-driven applications on the JVM. It provides asynchronous back pressured stream processing using sources, sinks, and flows. Key features include actors for concurrency, clustering for location transparency and resilience, and integration with technologies like Kafka, Cassandra and HTTP. The document outlines how Akka streams work, how to write stream applications, and how streams can be used for scenarios like HTTP requests/responses and streaming data. It encourages contributions to Akka and discusses next steps like improved remoting and more stream connectors.
The document provides an overview of Akka HTTP, streams, and routes. It discusses core concepts like streams, elements, back-pressure, sources, sinks, flows, and runnable graphs. It explains that routes define the request handling logic and are assembled into routing trees. Routes can complete requests, reject them, or perform asynchronous processing and return route results.
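A hedged sketch of those three outcomes in one route: the inner `get` rejects non-GET requests (a sibling route or rejection handler may then take over), and `onComplete` finishes the request asynchronously; `lookupUser` is a hypothetical async call:

```scala
// A hedged sketch of the three route outcomes; lookupUser is a hypothetical
// asynchronous call. Non-GET requests are rejected by the inner directive
// and may be picked up by a sibling route instead.
import akka.http.scaladsl.model.StatusCodes
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import scala.concurrent.Future
import scala.util.{Failure, Success}

object RouteOutcomes {
  def lookupUser(id: Int): Future[String] = Future.successful(s"user-$id")

  val route: Route =
    path("users" / IntNumber) { id =>
      get {
        onComplete(lookupUser(id)) { // asynchronous route result
          case Success(user) => complete(user)
          case Failure(_)    => complete(StatusCodes.InternalServerError -> "lookup failed")
        }
      }
    }
}
```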
Building on the success of Reactive Extensions—first in Rx.NET and now in RxJava—we are taking Observers and Observables to the next level: by adding the capability of handling back-pressure between asynchronous execution stages we enable the distribution of stream processing across a cluster of potentially thousands of nodes. The project defines the common interfaces for interoperable stream implementations on the JVM and is the result of a collaboration between Twitter, Netflix, Pivotal, RedHat and Typesafe. In this presentation I introduce the guiding principles behind its design and show examples using the actor-based implementation in Akka.
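The common interfaces the project defines are deliberately small; rendered here in Scala for illustration (on the JVM the real artifacts are plain Java interfaces in org.reactivestreams with exactly these methods):

```scala
// The four interfaces defined by the standard (org.reactivestreams on the
// JVM), rendered in Scala for illustration.
trait Publisher[T] {
  def subscribe(s: Subscriber[_ >: T]): Unit
}

trait Subscriber[T] {
  def onSubscribe(s: Subscription): Unit
  def onNext(t: T): Unit
  def onError(t: Throwable): Unit
  def onComplete(): Unit
}

trait Subscription {
  def request(n: Long): Unit // the back-pressure signal: demand for n more
  def cancel(): Unit
}

trait Processor[T, R] extends Subscriber[T] with Publisher[R]
```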
"In this session, Twitter engineer Alex Payne will explore how the popular social messaging service builds scalable, distributed systems in the Scala programming language. Since 2008, Twitter has moved the development of its most critical systems to Scala, which blends object-oriented and functional programming with the power, robust tooling, and vast library support of the Java Virtual Machine. Find out how to use the Scala components that Twitter has open sourced, and learn the patterns they employ for developing core infrastructure components in this exciting and increasingly popular language."
Reactive streaming is becoming the best approach to handle data flows across asynchronous boundaries. Here, we present the implementation of a real-world application based on Akka Streams. After reviewing the basics, we will discuss the development of a data processing pipeline that collects real-time sensor data and sends it to a Kinesis stream. There are various possible points of failure in this architecture. What should happen when Kinesis is unavailable? If the data flow is not handled in the correct way, some information may get lost. Akka Streams gave us the tools to build reliable processing logic for the pipeline that avoids data loss and maximizes the robustness of the entire system.
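One building block for that kind of reliability, sketched under the pre-Akka-2.6.10 `withBackoff` signature (newer versions take a `RestartSettings` object): a `RestartSource` restarts the wrapped pipeline with exponential backoff when it fails, instead of tearing the whole stream down. `sensorData` and `kinesisFlow` are hypothetical stand-ins:

```scala
// A hedged resilience sketch: RestartSource restarts the wrapped pipeline
// with exponential backoff on failure. sensorData and kinesisFlow are
// hypothetical stand-ins for the real stages.
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, RestartSource, Sink, Source}
import scala.concurrent.duration._

object ResilientPipeline extends App {
  implicit val system: ActorSystem = ActorSystem("pipeline")

  def sensorData: Source[String, _] =
    Source.tick(1.second, 1.second, "reading")       // fake sensor
  def kinesisFlow: Flow[String, String, _] =
    Flow[String].map { r => println(s"sent $r"); r } // fake Kinesis writer

  val resilient = RestartSource.withBackoff(
    minBackoff = 1.second,   // first restart after one second
    maxBackoff = 30.seconds, // cap the delay
    randomFactor = 0.2       // jitter, to avoid synchronized retries
  ) { () => sensorData.via(kinesisFlow) }

  resilient.runWith(Sink.ignore)
}
```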
In this presentation, Akka Team Lead and author Roland Kuhn presents the freshly released final specification for Reactive Streams on the JVM. This work was done in collaboration with engineers representing Netflix, Red Hat, Pivotal, Oracle, Typesafe and others to define a standard for passing streams of data between threads in an asynchronous and non-blocking fashion. This addresses a common need in Reactive systems: handling streams of "live" data whose volume is not predetermined. The most prominent issue facing the industry today is that resource consumption needs to be controlled so that a fast data source does not overwhelm the stream destination. Asynchrony is needed in order to enable the parallel use of computing resources, whether on collaborating network hosts or on multiple CPU cores within a single machine. Here we'll review the mechanisms employed by Reactive Streams, discuss the applicability of this technology to a variety of problems encountered in day-to-day work on the JVM, and give an overview of the tooling ecosystem that is emerging around this young standard.
Akka Streams is an implementation of Reactive Streams, which is a standard for asynchronous stream processing with non-blocking backpressure on the JVM. In this talk we'll cover the rationale behind Reactive Streams, and explore the different building blocks available in Akka Streams. I'll use Scala for all coding examples, but Akka Streams also provides a full-fledged Java 8 API. After this session you will be all set and ready to reap the benefits of using Akka Streams!
The document provides an overview of Yardena Meymann's background and experience working with asynchronous programming in Scala. It discusses some of the common tools and approaches for writing asynchronous programs in Scala, including Futures, Actors, Streams, HTTP clients/servers, and integration with Kafka. It highlights some of the challenges of asynchronous programming and how different tools address issues like error handling, retries, and backpressure.
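As one small illustration of the retry concern, a hand-rolled recursive retry over `Future`s; this is a sketch, not any library's API (Akka also ships `akka.pattern.retry` for the scheduled, delayed variant):

```scala
// A sketch of retrying a failed asynchronous call: a hand-rolled recursive
// retry over Futures. Not any library's API; Akka also ships
// akka.pattern.retry for the scheduled, delayed variant.
import scala.concurrent.{ExecutionContext, Future}

object Retry {
  def retry[T](attempts: Int)(op: () => Future[T])(implicit ec: ExecutionContext): Future[T] =
    op().recoverWith {
      case _ if attempts > 1 => retry(attempts - 1)(op) // try again on failure
    }
}
```

Usage would look like `Retry.retry(3)(() => callService())`: after the third failed attempt the failure propagates to the caller.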
Reactive Streams 1.0.0 is now live, and so are our implementations in Akka Streams 1.0 and Slick 3.0. Reactive Streams is an engineering collaboration between heavy hitters in the area of streaming data on the JVM. With the Reactive Streams Special Interest Group, we set out to standardize a common ground for achieving statically-typed, high-performance, low-latency, asynchronous streams of data with built-in non-blocking back pressure, with the goal of creating a vibrant ecosystem of interoperating implementations and a vision of one day making it into a future version of Java.

Akka (recent winner of “Most Innovative Open Source Tech in 2015”) is a toolkit for building message-driven applications. With Akka Streams 1.0, Akka has incorporated a graphical DSL for composing data streams; an execution model that decouples the stream’s staged computation (its “blueprint”) from its execution, allowing for actor-based, single-threaded, and fully distributed and clustered execution; type-safe stream composition; an implementation of the Reactive Streams specification that enables back-pressure; and more than 20 predefined stream “processing stages” that provide common streaming transformations developers can tap into (for splitting, transforming, and merging streams, and more).

Slick is a relational database query and access library for Scala that enables loose coupling, minimal configuration requirements, and abstraction of the complexities of connecting with relational databases. With Slick 3.0, Slick now supports the Reactive Streams API for asynchronous stream processing with non-blocking back-pressure. Slick 3.0 also allows elegant mapping across multiple data types, static verification and type inference for embedded SQL statements, compile-time error discovery, and JDBC support for interoperability with all existing drivers.
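A hedged sketch of the interoperability those two releases enable together: Slick 3.0 streams query results as a Reactive Streams `Publisher`, which Akka Streams consumes directly, with non-blocking back-pressure end to end. The `users` table, the "mydb" config key, and the Slick 3.0-era `H2Driver` import are illustrative assumptions; a current Akka is assumed to provide the materializer:

```scala
// A hedged interop sketch: Slick 3.0's db.stream returns a
// DatabasePublisher, an org.reactivestreams.Publisher, which Akka Streams
// consumes directly. Table, config key and driver import are illustrative.
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import slick.driver.H2Driver.api._

object SlickToAkka extends App {
  implicit val system: ActorSystem = ActorSystem("slick-interop")

  class Users(tag: Tag) extends Table[String](tag, "users") {
    def name = column[String]("name")
    def * = name
  }
  val users = TableQuery[Users]
  val db = Database.forConfig("mydb")

  // Query results flow into the Akka stream with back-pressure intact.
  Source.fromPublisher(db.stream(users.result))
    .map(_.toUpperCase)
    .runWith(Sink.foreach(println))
}
```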
Akka Streams and its amazing handling of streaming with back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially in use cases where the amount of work continues to increase as you’re processing it. This is where back-pressure really shines. In this talk for Architects and Dev Managers by Akara Sucharitakul, Principal MTS for Global Platform Frameworks at PayPal, Inc., we look at how back-pressure based on Akka Streams and Kafka is being used at PayPal to handle very bursty workloads. In addition, Akara will also share experiences in creating a platform based on Akka and Akka Streams that currently processes over 1 billion transactions per day (on just 8 VMs), with the aim of helping teams adopt these technologies. In this webinar, you will:
* Start with a sample web crawler use case to examine what happens when each processing pass expands to a larger and larger workload to process.
* Review how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
* Look at lessons learned, plus some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
A session about the Reactor story and the pragmatic Reactive Streams specification. Talk recorded at SpringOne2GX 2014, Dallas.
This document discusses using Akka streams for dataflow and reactive programming. It begins with an overview of dataflow concepts like nodes, arcs, graphs, and features such as push/pull data, mutable/immutable data, and compound nodes. It then covers Reactive Streams including back pressure, the asynchronous non-blocking protocol, and the publisher-subscriber interface. Finally, it details how to use Akka streams, including defining sources, sinks, and flows to create processing pipelines as well as working with more complex flow graphs. Examples are provided for bulk exporting data to Elasticsearch and finding frequent item sets from transaction data.
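A minimal sketch of such a non-linear flow graph with Akka Streams' GraphDSL, fanning one input out to two branches and merging them back; stage names are illustrative:

```scala
// A minimal non-linear graph with GraphDSL: one input fanned out to two
// branches and merged back; stage names are illustrative.
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.ClosedShape
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}

object FanOutFanIn extends App {
  implicit val system: ActorSystem = ActorSystem("graphs")

  val graph = RunnableGraph.fromGraph(GraphDSL.create() { implicit b: GraphDSL.Builder[NotUsed] =>
    import GraphDSL.Implicits._

    val in     = Source(1 to 10)
    val bcast  = b.add(Broadcast[Int](2)) // fan-out node
    val merge  = b.add(Merge[Int](2))     // fan-in node
    val double = Flow[Int].map(_ * 2)
    val square = Flow[Int].map(n => n * n)
    val out    = Sink.foreach[Int](println)

    in ~> bcast ~> double ~> merge ~> out
          bcast ~> square ~> merge

    ClosedShape
  })

  graph.run()
}
```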
This talk is an introduction to stream processing with Apache Flink. I gave this talk at the Madrid Apache Flink Meetup on February 25th, 2016. The talk discusses Flink's features, shows its DataStream API, and explains the benefits of event-time stream processing. It gives an outlook on some features that will be added after the 1.0 release.
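For flavor, a classic word-count sketch against the DataStream API, written in the (now legacy) Flink Scala API of the 1.x era the talk covers; the tuple-index `keyBy(0)` form was later deprecated, and the wildcard import provides the implicit `TypeInformation` instances the API needs:

```scala
// A classic DataStream word count in the 1.x-era Flink Scala API; the
// wildcard import supplies the implicit TypeInformation instances.
import org.apache.flink.streaming.api.scala._

object WordCount {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    env
      .fromElements("to be or not to be", "that is the question")
      .flatMap(_.toLowerCase.split("\\W+"))
      .map((_, 1)) // (word, 1) pairs
      .keyBy(0)    // key by the word (tuple-index style of that era)
      .sum(1)      // running count per word
      .print()

    env.execute("streaming word count")
  }
}
```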
Building data pipelines shouldn't be so hard; you just need to choose the right tools for the task. We will review Akka and Spark streaming: how they work, how to use them, and when.
This is the first part of a mini-series where we discuss how to build distributed stateful real-time applications using actor model and messaging. The second part: https://www.slideshare.net/PeterCsala/akkademy-aka-how-to-build-stateful-distributed-systems-iiii
Slides from IT talk: «API Testing. Streamline your testing process. A step by step tutorial»
Code on GitHub: https://github.com/a-oleynik/soap-ui
Webinar on YouTube: https://www.youtube.com/watch?v=x2ALtuCjuUo
DataArt Wroclaw IT talk: https://www.meetup.com/ru-RU/DataArt-Wroclaw-IT-talk/events/246967484/?eventId=246967484
Wroclaw, 2018, February 15
Spark Streaming and Kafka Streams are two popular stream processing platforms. Spark Streaming uses micro-batching and allows for code reuse between batch and streaming jobs. Kafka Streams is embedded directly into Apache Kafka and leverages Kafka as its internal messaging layer. Both platforms support stateful stream processing operations like windowing, aggregations, and joins through distributed state stores. A demo application is shown that detects dangerous driving by joining truck position data with driver data using different streaming techniques.
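A hedged sketch of the join-based enrichment idea from the demo, written with the Kafka Streams Scala DSL (the kafka-streams-scala module); topic names and the over-speed predicate are hypothetical, and in newer Kafka versions the serde import moved to `org.apache.kafka.streams.scala.serialization.Serdes`:

```scala
// A hedged Kafka Streams sketch: a stream-table join enriches position
// events with driver data. Topic names and predicate are hypothetical.
import java.util.Properties
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

object DrivingMonitor extends App {
  val builder = new StreamsBuilder

  val positions = builder.stream[String, String]("truck-positions") // keyed by driver id
  val drivers   = builder.table[String, String]("drivers")          // driver master data

  // Enrich each position event with the driver record, then keep only
  // the suspicious ones.
  positions
    .join(drivers)((position, driver) => s"$position|$driver")
    .filter((_, enriched) => enriched.contains("OVERSPEED"))
    .to("dangerous-driving")

  val props = new Properties
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "driving-monitor")
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  new KafkaStreams(builder.build(), props).start()
}
```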
The document provides an overview of reactive programming and Spring WebFlux. It defines reactive programming as an asynchronous paradigm concerned with data streams and change propagation. It discusses why reactive programming is useful for handling back-pressure, communicating change, and improving scalability and performance. It also summarizes key concepts in reactive programming like Project Reactor's Mono and Flux types, and how Spring WebFlux allows developing reactive applications with annotated controllers or functional routing.
The document discusses Reactive Streams, which provide a standard for asynchronous stream processing between producers and consumers. Reactive Streams use back-pressure to prevent issues like out of memory errors from unbounded buffers or dropped messages from bounded buffers. They operate using a pull-based model where the fast producer will send no more than the amount of data requested by the consumer, ensuring demands are met without overflowing buffers. Major companies collaborated to develop the Reactive Streams specification to support stream processing across systems.
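The pull-based contract is easy to see in code. An illustration: this Subscriber requests exactly one element at a time, so even the fastest Publisher may never send more than was asked for (real subscribers usually request larger batches to amortize the signaling overhead):

```scala
// An illustration of the pull-based contract: demand is signaled one
// element at a time via Subscription.request.
import org.reactivestreams.{Subscriber, Subscription}

final class OneAtATime[T] extends Subscriber[T] {
  private var subscription: Subscription = _

  override def onSubscribe(s: Subscription): Unit = {
    subscription = s
    s.request(1)              // initial demand: exactly one element
  }

  override def onNext(t: T): Unit = {
    println(s"processed: $t") // stand-in for real (slow) processing
    subscription.request(1)   // demand the next element only when ready
  }

  override def onError(t: Throwable): Unit = t.printStackTrace()
  override def onComplete(): Unit = println("done")
}
```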
What are reactive streams? What is backpressure? Why Akka Streams? A quick look at the Akka Streams API.
This document provides an overview of stream processing with Apache Flink. It discusses the rise of stream processing and how it enables low-latency applications and real-time analysis. It then describes Flink's stream processing capabilities, including pipelining of data, fault tolerance through checkpointing and recovery, and integration with batch processing. The document also summarizes Flink's programming model, state management, and roadmap for further development.
Stream processing applications built on Apache Apex run on Hadoop clusters and typically power analytics use cases where availability, flexible scaling, high throughput, low latency and correctness are essential. These applications consume data from a variety of sources, including streaming sources like Apache Kafka, Kinesis or JMS, file based sources or databases. Processing results often need to be stored in external systems (sinks) for downstream consumers (pub-sub messaging, real-time visualization, Hive and other SQL databases etc.). Apex has the Malhar library with a wide range of connectors and other operators that are readily available to build applications. We will cover key characteristics like partitioning and processing guarantees, generic building blocks for new operators (write-ahead-log, incremental state saving, windowing etc.) and APIs for application specification.
- Scrapy is a framework for web scraping that allows extraction of structured data from HTML/XML through selectors like CSS and XPath. It provides features like an interactive shell, feed exports, encoding support, and more.
- Scrapy is built on top of the Twisted asynchronous networking framework, which provides an event loop and deferreds. It handles protocols and transports like TCP and HTTP across platforms.
- The Scrapy architecture includes components like the downloader, scraper, and item pipelines that communicate internally. Flow control is needed between these components to limit memory usage, using techniques like concurrent item limits, memory limits, and delays between calls.
Stream data processing is increasingly required to support business needs for faster actionable insight with growing volume of information from more sources. Apache Apex is a true stream processing framework for low-latency, high-throughput and reliable processing of complex analytics pipelines on clusters. Apex is designed for quick time-to-production, and is used in production by large companies for real-time and batch processing at scale.

This session will use an Apex production use case to walk through the incremental transition from a batch pipeline with hours of latency to an end-to-end streaming architecture with billions of events per day which are processed to deliver real-time analytical reports. The example is representative of many similar extract-transform-load (ETL) use cases with other data sets that can use a common library of building blocks. The transform (or analytics) piece of such pipelines varies in complexity and often involves business-logic-specific custom components.

Topics include:
* Pipeline functionality from event source through queryable state for real-time insights.
* API for application development and the development process.
* Library of building blocks, including connectors for sources and sinks such as Kafka, JMS, Cassandra, HBase and JDBC, and how they enable end-to-end exactly-once results.
* Stateful processing with event-time windowing.
* Fault tolerance with exactly-once result semantics, checkpointing, and incremental recovery.
* Scalability and low-latency, high-throughput processing with advanced engine features for auto-scaling, dynamic changes, and compute locality.
* Who is using Apex in production, and the roadmap.

Following the session, attendees will have a high-level understanding of Apex and how it can be applied to use cases at their own organizations.
https://berlinbuzzwords.de/17/session/batch-streaming-etl-apache-apex

Stream data processing is increasingly required to support business needs for faster actionable insight with growing volume of information from more sources. Apache Apex is a true stream processing framework for low-latency, high-throughput and reliable processing of complex analytics pipelines on clusters. Apex is designed for quick time-to-production, and is used in production by large companies for real-time and batch processing at scale.

This session will use an Apex production use case to walk through the incremental transition from a batch pipeline with hours of latency to an end-to-end streaming architecture with billions of events per day which are processed to deliver real-time analytical reports. The example is representative of many similar extract-transform-load (ETL) use cases with other data sets that can use a common library of building blocks. The transform (or analytics) piece of such pipelines varies in complexity and often involves business-logic-specific custom components.

Topics include:
* Pipeline functionality from event source through queryable state for real-time insights.
* API for application development and the development process.
* Library of building blocks, including connectors for sources and sinks such as Kafka, JMS, Cassandra, HBase and JDBC, and how they enable end-to-end exactly-once results.
* Stateful processing with event-time windowing.
* Fault tolerance with exactly-once result semantics, checkpointing, and incremental recovery.
* Scalability and low-latency, high-throughput processing with advanced engine features for auto-scaling, dynamic changes, and compute locality.
* Recent project development and roadmap.

Following the session, attendees will have a high-level understanding of Apex and how it can be applied to use cases at their own organizations.
Cloud computing, reactive systems, microservices: distributed programming has become the norm. But while the shift to loosely coupled message-based systems has manifest benefits in terms of resilience and elasticity, our tools for ensuring correct behavior have not grown at the same pace. Statically typed languages like Java and Scala allow us to exclude large classes of programming errors before the first test is run. Unfortunately, these guarantees are limited to the local behavior within a single process; the compiler cannot tell us that we are sending the wrong JSON structure to a given web service. Distribution therefore comes at the cost of having to write large test suites, with timing-dependent non-determinism. In this presentation we take a first peek at ways out of this dilemma. The principles are demonstrated on the simplest distributed system: Actors. We show how parameterized ActorRefs à la Akka Typed, together with effect tracking similar to HLists, can help us define what an Actor can and cannot do during its lifetime, and have the compiler yell at us when we do it wrong.
Distributed systems are becoming more and more commonplace, microservices and cloud deployments force this notion into the day to day routine of many developers. One of the great features of a strongly typed language like Scala—with its expressive type system—is that we can achieve a high level of safety and confidence by letting the compiler verify that our code behaves as specified. But can this safety be carried over into the interactions between distributed parts of an application? Can we for example safely compose Actor behaviours from different pieces? This presentation introduces some approaches to this problem, including Session Types and π-calculus, and discusses their limitations. The practical ramifications are illustrated using Akka Typed, with a preview of composable and reusable behaviors.
Our software needs to become reactive, and this realization is widely understood: we need to consider responsiveness, maintainability, elasticity and scalability from the outset. Not all systems need to implement all these to the same degree; specific project requirements will determine where effort is most wisely spent. But in the vast majority of cases the need to go reactive will demand that we design our applications differently. In this presentation we explore several architecture elements that are commonly found in reactive systems (like the circuit breaker, various replication techniques, or flow control protocols). These patterns are language agnostic and also independent of the abundant choice of reactive programming frameworks and libraries; they are well-specified starting points for exploring the design space of a concrete problem: thinking is strictly required!
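As one concrete instance of these patterns, a sketch of Akka's circuit breaker; the construction mirrors `akka.pattern.CircuitBreaker` from the Akka documentation, while the remote call is a hypothetical placeholder:

```scala
// A circuit breaker trips open after repeated failures or timeouts and
// fails calls fast until the reset timeout passes; callRemoteService is
// a hypothetical placeholder.
import akka.actor.ActorSystem
import akka.pattern.CircuitBreaker
import scala.concurrent.Future
import scala.concurrent.duration._

object BreakerExample extends App {
  implicit val system: ActorSystem = ActorSystem("breaker")
  import system.dispatcher

  val breaker = new CircuitBreaker(
    system.scheduler,
    maxFailures = 5,          // open after 5 consecutive failures
    callTimeout = 2.seconds,  // a slow call counts as a failure
    resetTimeout = 30.seconds // then probe again (half-open)
  )

  def callRemoteService(): Future[String] = Future("ok") // placeholder

  // When the breaker is open, calls fail immediately instead of piling up
  // on an already struggling downstream service.
  breaker.withCircuitBreaker(callRemoteService()).foreach(println)
}
```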
The new Actor representation in Akka Typed allows formulations that lend themselves to monadic interpretation or introspection. This leads us to explore possibilities for expressing and verifying dynamic properties like the adherence to a communication protocol between multiple agents as well as the safety properties of that protocol on a global level. Academic research in this area is far from complete, but there are interesting initial results that we explore in this session: precisely how much purity and reasoning can we bring to the distributed world?
The document discusses Akka Typed, which introduces typed actors to the Akka framework. It provides typed actors as a thin layer on top of untyped actors to add stronger safety properties through static types. Typed actors are defined by behaviors that specify how actors respond to different message types, rather than extending the Actor trait. This allows protocols and actor methods to be encoded through message types in a type-safe way. The document also discusses how session types can be used to define interaction protocols and explores open questions around supporting dynamic changes to protocols while retaining safety.
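A minimal sketch of that behavior-based style, written against the current Akka Typed API (which evolved from the experimental version the document describes); the `Counter` protocol is illustrative:

```scala
// The actor is defined by a Behavior over its message protocol, so the
// compiler rejects messages outside the protocol. Counter is illustrative.
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.{ActorSystem, Behavior}

object Counter {
  sealed trait Command // the actor's protocol
  final case class Increment(by: Int) extends Command
  case object Reset extends Command

  def counter(count: Int): Behavior[Command] =
    Behaviors.receiveMessage[Command] {
      case Increment(by) => counter(count + by) // next behavior holds new state
      case Reset         => counter(0)
    }
}

object TypedDemo extends App {
  val system: ActorSystem[Counter.Command] = ActorSystem(Counter.counter(0), "typed")
  system ! Counter.Increment(2) // accepted: part of the protocol
  // system ! "bump"            // would not compile: not a Counter.Command
}
```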