Apache Samza

  • Living reference work entry
Encyclopedia of Big Data Technologies

Definition

Apache Samza is an open source framework for distributed processing of high-volume event streams. Its primary design goal is to support high throughput for a wide range of processing patterns, while providing operational robustness at the massive scale required by Internet companies. Samza achieves this goal through a small number of carefully designed abstractions: partitioned logs for messaging, fault-tolerant local state, and cluster-based task scheduling.
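The first two abstractions named above can be illustrated with a small, self-contained sketch. It is not Samza's API: the names `PartitionedLog`, `CountingTask`, `partition_for`, and `run` are hypothetical, the log lives in memory, and the per-key counter is an assumed stand-in for application logic. Cluster-based task scheduling is not modeled; the sketch simply runs one task per partition in a loop.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Route a key to a partition with a stable hash, so equal keys share a partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class PartitionedLog:
    """An append-only log split into partitions; each partition keeps its own order."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: str, value) -> int:
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p


class CountingTask:
    """Processes one input partition and keeps per-key counts in local state.

    Every state write is also appended to a changelog, so a replacement
    task can rebuild the same state after a failure.
    """

    def __init__(self, changelog: list):
        self.changelog = changelog
        self.state = {}
        # Rebuild local state by replaying the changelog (last write wins).
        for key, count in changelog:
            self.state[key] = count

    def process(self, key: str, value) -> None:
        count = self.state.get(key, 0) + 1
        self.state[key] = count
        self.changelog.append((key, count))


def run(events, num_partitions=2):
    """Append events to the log, then let one task consume each partition."""
    log = PartitionedLog(num_partitions)
    for key, value in events:
        log.append(key, value)
    changelogs = [[] for _ in range(num_partitions)]
    tasks = [CountingTask(changelogs[p]) for p in range(num_partitions)]
    for p, task in enumerate(tasks):
        for key, value in log.partitions[p]:
            task.process(key, value)
    return log, changelogs, tasks
```

Because all events for a given key land in the same partition, a task can keep its counts locally without coordinating with other tasks. Constructing a new `CountingTask` from an existing changelog reproduces the state of the task it replaces, which is the idea behind fault-tolerant local state in this simplified model.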

Overview

Stream processing plays an increasingly important part in the data management needs of many organizations. Event streams can represent many kinds of data: for example, the activity of users on a website, the movement of goods or vehicles, or the writes of records to a database.

Stream processing jobs are long-running processes that continuously consume one or more event streams, invoking some application logic on every event, producing derived output streams, and potentially writing output to databases for...

Author information

Correspondence to Martin Kleppmann.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this entry

Cite this entry

Kleppmann, M. (2018). Apache Samza. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_197-2

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_197-2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    Apache Samza
    Published:
    20 March 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_197-2

  2. Original

    Samza
    Published:
    19 February 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_197-1