This document discusses Apache Kafka and Red Hat OpenShift Streams for Apache Kafka. It begins with an overview of what Apache Kafka is and its common use cases. It then demonstrates how Red Hat OpenShift Streams provides a managed Apache Kafka cluster as a service, including a dedicated cluster, configuration management, metrics, monitoring and other features to provide a streamlined developer experience. It concludes with information on trying OpenShift Streams for Apache Kafka and additional resources.
One of the great things about running applications in the cloud is that you only pay for the resources that you use. But that also makes it more important than ever for our applications to be resource-efficient. This becomes even more critical when we use serverless functions. Micronaut is an application framework that provides dependency injection, developer productivity features, and excellent support for Apache Kafka. By performing dependency injection, AOP, and other productivity-enhancing magic at compile time, Micronaut allows us to build smaller, more efficient microservices and serverless functions. In this session, we'll explore the ways that Apache Kafka and Micronaut work together to enable us to build fast, efficient, event-driven applications. Then we'll see it in action, using the AWS Lambda Sink Connector for Confluent Cloud.
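As a taste of what that compile-time wiring looks like, here is a minimal sketch using Micronaut's declarative Kafka support; the `orders` topic, payload types, and class names are illustrative, not taken from the session:

```java
import io.micronaut.configuration.kafka.annotation.KafkaClient;
import io.micronaut.configuration.kafka.annotation.KafkaKey;
import io.micronaut.configuration.kafka.annotation.KafkaListener;
import io.micronaut.configuration.kafka.annotation.OffsetReset;
import io.micronaut.configuration.kafka.annotation.Topic;

// Producer side: Micronaut generates the implementation of this
// interface at compile time; no reflection at runtime.
@KafkaClient
interface OrderProducer {
    @Topic("orders") // illustrative topic name
    void sendOrder(@KafkaKey String orderId, String payload);
}

// Consumer side: a method-per-topic listener, also wired at compile time.
@KafkaListener(offsetReset = OffsetReset.EARLIEST)
class OrderListener {
    @Topic("orders")
    void receive(@KafkaKey String orderId, String payload) {
        System.out.printf("Received order %s: %s%n", orderId, payload);
    }
}
```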
Joins in Kafka Streams and ksqlDB are a killer feature for data processing, and basic join semantics are well understood. However, in a streaming world, records are associated with timestamps that impact the semantics of joins: welcome to the fabulous world of _temporal_ join semantics. For joins, timestamps are as important as the actual data, and it is important to understand how they impact the join result. In this talk we deep-dive into the different types of joins, with a focus on their temporal aspects. Furthermore, we relate the individual join operators to the overall "time engine" of the Kafka Streams query runtime and explain its relationship to operator semantics. To help developers apply their knowledge of temporal join semantics, we provide best practices, tips and tricks to "bend" time, and configuration advice to get the desired join results. Last, we give an overview of recent developments, and an outlook on future work, that improve joins even further.
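For a concrete sense of the temporal aspect, here is a minimal sketch of a windowed stream-stream join in the Kafka Streams DSL; the topic names and the five-minute window are illustrative. Two records join only if their timestamps are within the configured time difference of each other:

```java
import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class ClickViewJoin {
    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> clicks = builder.stream("clicks"); // illustrative topics
        KStream<String, String> views = builder.stream("views");

        // Temporal semantics: a click and a view join only if their record
        // timestamps differ by at most 5 minutes. "NoGrace" means late
        // records are dropped instead of updating already-emitted results.
        clicks.join(
                views,
                (click, view) -> click + "/" + view,
                JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(5)))
              .to("click-view-joins");

        return builder.build();
    }
}
```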
The document discusses 4 reasons to use a cloud-native Kafka service like Confluent Cloud instead of managing Kafka yourself. It notes that managing Kafka requires significant investment of time and resources for tasks like architecture planning, cluster sizing, software upgrades, and more. A cloud-native service handles all operational overhead automatically so you can focus on your core business. Confluent Cloud specifically offers elastic scaling, infinite data retention, global access across clouds, and integrations to make it a complete data streaming platform.
Presentation from the South Bay.NET meetup on 3/30. Speaker: Matt Howlett, Software Engineer at Confluent. Apache Kafka is a scalable streaming platform that forms a key part of the infrastructure at many companies, including Uber, Netflix, Walmart, Airbnb, Goldman Sachs, and LinkedIn. In this talk Matt will give a technical overview of Kafka, discuss some typical use cases (from surge pricing to fraud detection to web analytics), and show you how to use Kafka from within your C#/.NET applications.
This three-day course teaches developers how to build applications that can publish and subscribe to data from an Apache Kafka cluster. Students will learn Kafka concepts and components, how to use Kafka and Confluent APIs, and how to develop Kafka producers, consumers, and streams applications. The hands-on course covers using Kafka tools, writing producers and consumers, ingesting data with Kafka Connect, and more. It is designed for developers who need to interact with Kafka as a data source or destination.
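As context for the producer-writing portion of such a course, here is a minimal, self-contained Java producer sketch; the broker address and topic name are placeholders, not course material:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // try-with-resources flushes and closes the producer on exit;
        // .get() blocks until the broker acknowledges the record.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("demo-topic", "key", "hello, kafka")).get();
        }
    }
}
```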
The distributed cache is becoming a popular technique to improve performance and simplify the data access layer when dealing with databases. Bringing the data as close as possible to the CPU allows unparalleled execution speed as well as horizontal scalability. This approach is often successful when used in a microservices design in which the cache is accessed only by a single API. However, it becomes more challenging if multiple applications are involved and changes are made to the database directly by other applications. The data held in the cache eventually becomes stale and no longer consistent with its underlying database. When consistency problems arise, the engineering team must address them through additional coding, which directly jeopardizes the team's ability to be agile between releases. This talk presents a set of patterns for cache-based architectures that aim to keep the caches always hot, using Apache Kafka and its connectors to accomplish that goal. It will be shown how to set up these patterns across different IMDGs such as Hazelcast, Apache Ignite, or Coherence. These patterns can be used in conjunction with different cache topologies such as cache-aside, read-through, write-behind, and refresh-ahead, making them reusable enough to serve as a framework for achieving data consistency in any architecture that relies on distributed caches.
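As an illustration of the underlying idea (not the talk's actual framework), the sketch below uses a plain Kafka consumer to keep an in-memory map hot from a stream of database change events; the map stands in for an IMDG such as Hazelcast or Ignite, and the topic name and broker address are assumptions:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CacheRefresher {
    // A plain map standing in for the IMDG (Hazelcast, Ignite, Coherence).
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker
        props.put("group.id", "cache-refresher");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("db.changes")); // assumed change topic
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofMillis(500))) {
                    if (rec.value() == null) {
                        cache.remove(rec.key());           // tombstone => row deleted
                    } else {
                        cache.put(rec.key(), rec.value()); // upsert keeps the cache hot
                    }
                }
            }
        }
    }
}
```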
Managing a distributed system like Apache Kafka can be extremely challenging, especially when you try to approach monitoring and management through a single centralized GUI. In this talk, see a demo of a more decoupled approach to Kafka management and monitoring, where data is centralized but access is distributed to scale to enterprise deployments, CI/CD pipelines, and much more.
Dual writes are a common source of issues in distributed event-driven applications. A dual write occurs when an application has to change data in two different systems - for instance, when an application needs to persist data in the database and send a Kafka message to notify other systems. If one of these two operations fails, you might end up with inconsistent data that can be hard to detect and fix. OpenShift Streams for Apache Kafka is Red Hat's fully hosted and managed Apache Kafka service, targeting development teams that want to incorporate streaming data and scalable messaging in their applications without the burden of setting up and maintaining a Kafka cluster infrastructure. Debezium is an open source distributed platform for change data capture. Built on top of Apache Kafka, it allows applications to react to inserts, updates, and deletes in your databases. In this session you will learn how you can leverage OpenShift Streams for Apache Kafka and Debezium to avoid the dual write issue in an event-driven application using the outbox pattern. More specifically, we will show you how to: provision a Kafka cluster on OpenShift Streams for Apache Kafka; deploy and configure Debezium to use OpenShift Streams for Apache Kafka; and refactor an application to leverage Debezium and OpenShift Streams for Apache Kafka to avoid the dual write problem.
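To make the outbox idea concrete, here is a hedged Java/JPA sketch; `Order`, `OutboxEvent`, and `toJson` are hypothetical helpers, not names from the session. The key point is that both rows are written in one local database transaction, and Debezium, not the application, produces the Kafka message:

```java
import jakarta.persistence.EntityManager;
import jakarta.transaction.Transactional;

// Order and OutboxEvent are assumed JPA entities (illustrative only);
// OutboxEvent maps to the outbox table that Debezium captures.
public class OrderService {

    private final EntityManager em;

    public OrderService(EntityManager em) {
        this.em = em;
    }

    // Both persists happen in ONE local database transaction. Debezium
    // tails the outbox table and publishes the event to Kafka, so the
    // application never performs a second, failure-prone Kafka write.
    @Transactional
    public void placeOrder(Order order) {
        em.persist(order);
        em.persist(new OutboxEvent(
                "Order",                   // aggregate type, used for routing
                order.getId().toString(),  // aggregate id -> Kafka message key
                "OrderCreated",            // event type
                toJson(order)));           // serialized payload
    }

    private String toJson(Order order) {
        // Placeholder: a real implementation would use a JSON mapper.
        return "{\"id\":\"" + order.getId() + "\"}";
    }
}
```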
This document discusses building microservices for data streaming and processing using Spring Cloud and Kafka. It provides an overview of Spring Cloud Stream and how it can be used to build event-driven microservices that connect to Kafka. It also discusses how Spring Cloud Data Flow can be used to orchestrate and deploy streaming applications and topologies. The document includes code samples of building a basic Kafka Streams processor application using Spring Cloud Stream and deploying it as part of a streaming data flow. It concludes by proposing a demonstration of these techniques.
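As a flavor of the functional programming model such a processor uses, here is a minimal Spring Cloud Stream Kafka Streams sketch (the destination topic names in the comment are illustrative, not from the document):

```java
import java.util.function.Function;
import org.apache.kafka.streams.kstream.KStream;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class UppercaseProcessor {

    // Spring Cloud Stream's Kafka Streams binder binds this Function to
    // input/output topics via configuration, by convention e.g.:
    //   spring.cloud.stream.bindings.process-in-0.destination=words
    //   spring.cloud.stream.bindings.process-out-0.destination=uppercased
    @Bean
    public Function<KStream<String, String>, KStream<String, String>> process() {
        return input -> input.mapValues(v -> v.toUpperCase());
    }
}
```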