skip to main content
keynote

Thinking in events: from databases to distributed collaboration software

Published: 28 June 2021 Publication History
  • Get Citation Alerts
  • Abstract

    In this keynote I give a subjective but systematic overview of the landscape of distributed event-based systems, with an emphasis on two areas I have worked on over the last decade: large-scale stream processing with Apache Kafka and associated tools, and real-time collaboration software in the style of Google Docs. While these may seem at first glance to be very different topics, there are also important points of overlap. This paper lays out a taxonomy of event-based systems that shows where their commonalities and differences lie. It also highlights some of the key trade-offs that arise in the implementation of event-based systems, drawing both from distributed systems theory and from experience of their practical deployment. Finally, the paper outlines a number of open research problems in this field.

    References

    [1]
    [n.d.]. Debezium. https://debezium.io/
    [2]
    Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael J Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills, Frances Perry, Eric Schmidt, and Sam Whittle. 2015. The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing. Proceedings of the VLDB Endowment 8, 12 (Aug. 2015), 1792--1803.
    [3]
    Tyler Akidau, Slava Chernyak, and Reuven Lax. 2018. Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing. O'Reilly Media.
    [4]
    Hagit Attiya, Amotz Bar-Noy, and Danny Dolev. 1995. Sharing memory robustly in message-passing systems. J. ACM 42, 1 (Jan. 1995), 124--142.
    [5]
    Hagit Attiya, Faith Ellen, and Adam Morrison. 2015. Limitations of Highly-Available Eventually-Consistent Data Stores. In ACM Symposium on Principles of Distributed Computing (PODC). ACM, 385--394.
    [6]
    Engineer Bainomugisha, Andoni Lombide Carreton, Tom van Cutsem, Stijn Mostinckx, and Wolfgang de Meuter. 2013. A Survey on Reactive Programming. Comput. Surveys 45, 4, Article 52 (Aug. 2013), 34 pages.
    [7]
    Philip A Bernstein, Sergey Bykov, Alan Geller, Gabriel Kliot, and Jorgen Thelin. 2014. Orleans: Distributed Virtual Actors for Programmability and Scalability. Technical Report MSR-TR-2014-41. Microsoft Research. https://www.microsoft.com/en-us/research/publication/orleans-distributed-virtual-actors-for-programmability-and-scalability/
    [8]
    Philip A. Bernstein and Nathan Goodman. 1983. Multiversion Concurrency Control---Theory and Algorithms. ACM Transactions on Database Systems 8, 4 (Dec. 1983), 465--483.
    [9]
    Dominic Betts, Julián Domínguez, Grigori Melnik, Fernando Simonazzi, and Mani Subramanian. 2012. Exploring CQRS and Event Sourcing. Microsoft. http://aka.ms/cqrs
    [10]
    Kenneth P Birman, André Schiper, and Pat Stephenson. 1991. Lightweight causal and atomic group multicast. ACM Transactions on Computer Systems 9, 3 (Aug. 1991), 272--314.
    [11]
    Jamie Brandon. 2021. Internal consistency in streaming systems. https://scattered-thoughts.net/writing/internal-consistency-in-streaming-systems/
    [12]
    Paris Carbone, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. 2015. Apache Flink: Stream and Batch Processing in a Single Engine. IEEE Data Engineering Bulletin 38, 4 (Dec. 2015), 28--38. http://sites.computer.org/debull/A15dec/p28.pdf
    [13]
    Tushar Deepak Chandra and Sam Toueg. 1996. Unreliable failure detectors for reliable distributed systems. J. ACM 43, 2 (March 1996), 225--267.
    [14]
    Bernadette Charron-Bost, Fernando Pedone, and André Schiper (Eds.). 2010. Replication: Theory and Practice. Vol. 5959. Springer LNCS.
    [15]
    Rada Chirkova and Jun Yang. 2012. Materialized Views. Foundations and Trends in Databases 4, 4 (Dec. 2012), 295--405.
    [16]
    Evan Czaplicki and Stephen Chong. 2013. Asynchronous Functional Reactive Programming for GUIs. In 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 411--422.
    [17]
    Shirshanka Das, Chavdar Botev, Kapil Surlaker, Bhaskar Ghosh, Balaji Varadarajan, Sunil Nagaraj, David Zhang, Lei Gao, Jemiah Westerman, Phanindra Ganti, Boris Shkolnik, Sajid Topiwala, Alexander Pachev, Naveen Somasundaram, and Subbu Subramaniam. 2012. All Aboard the Databus! Linkedin's Scalable Consistent Change Data Capture Platform. In 3rd ACM Symposium on Cloud Computing (SoCC).
    [18]
    Susan B Davidson, Hector Garcia-Molina, and Dale Skeen. 1985. Consistency in Partitioned Networks. Comput. Surveys 17, 3 (1985), 341--370.
    [19]
    Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. Dynamo: Amazon's highly available key-value store. In 21st ACM Symposium on Operating Systems Principles (SOSP). ACM, 205--220.
    [20]
    Martin Fowler. 2005. Event Sourcing. https://martinfowler.com/eaaDev/EventSourcing.html
    [21]
    Seth Gilbert and Nancy A Lynch. 2002. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News 33, 2 (June 2002), 51--59.
    [22]
    Victor B F Gomes, Martin Kleppmann, Dominic P Mulligan, and Alastair R Beresford. 2017. Verifying strong eventual consistency in distributed systems. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017).
    [23]
    Jim N Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. 1996. The dangers of replication and a solution. In ACM SIGMOD International Conference on Management of Data. ACM, 173--182.
    [24]
    Ashish Gupta and Inderpal Singh Mumick (Eds.). 1999. Materialized Views: Techniques, Implementations, and Applications. MIT Press.
    [25]
    Peter van Hardenberg and Martin Kleppmann. 2020. PushPin: Towards production-quality peer-to-peer collaboration. In 7th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC). ACM.
    [26]
    Pat Helland. 2015. Immutability Changes Everything. In 7th Biennial Conference on Innovative Data Systems Research (CIDR). http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
    [27]
    Pat Helland and Dave Campbell. 2009. Building on Quicksand. In 4th Biennial Conference on Innovative Data Systems Research (CIDR). https://database.cs.wisc.edu/cidr/cidr2009/Paper_133.pdf
    [28]
    Joseph M Hellerstein, Michael Stonebraker, and James Hamilton. 2007. Architecture of a Database System. Foundations and Trends in Databases 1, 2 (Nov. 2007), 141--259.
    [29]
    Heidi Howard and Richard Mortier. 2020. Paxos vs Raft: have we reached consensus on distributed consensus?. In 7th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC). ACM.
    [30]
    Ink & Switch. 2019. End-user programming. https://www.inkandswitch.com/end-user-programming.html
    [31]
    David R Jefferson. 1985. Virtual time. ACM Transactions on Programming Languages and Systems 7, 3 (July 1985), 404 -- 425.
    [32]
    Ralph Kimball and Margy Ross. 2013. The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). John Wiley & Sons.
    [33]
    Martin Kleppmann. 2015. Turning the database inside-out with Apache Samza. https://martin.kleppmann.com/2015/03/04/turning-the-database-inside-out.html
    [34]
    Martin Kleppmann. 2017. Designing Data-Intensive Applications. O'Reilly Media.
    [35]
    Martin Kleppmann. 2020. Moving Elements in List CRDTs. In 7th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC). ACM.
    [36]
    Martin Kleppmann, Alastair R Beresford, and Boerge Svingen. 2019. Online Event Processing. Commun. ACM 62, 5 (May 2019), 43--49.
    [37]
    Martin Kleppmann, Victor B F Gomes, Dominic P Mulligan, and Alastair R Beresford. 2018. OpSets: Sequential Specifications for Replicated Datatypes (Extended Version). https://arxiv.org/abs/1805.04263
    [38]
    Martin Kleppmann and Jay Kreps. 2015. Kafka, Samza and the Unix Philosophy of Distributed Data. IEEE Data Engineering Bulletin 38, 4 (Dec. 2015), 4--14. http://sites.computer.org/debull/A15dec/p4.pdf
    [39]
    Martin Kleppmann, Adam Wiggins, Peter van Hardenberg, and Mark McGranaghan. 2019. Local-First Software: You own your data, in spite of the cloud. In ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!). ACM, 154--178.
    [40]
    Jay Kreps. 2013. The Log: What every software engineer should know about real-time data's unifying abstraction. http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
    [41]
    Jay Kreps, Neha Narkhede, and Jun Rao. 2011. Kafka: a Distributed Messaging System for Log Processing. In 6th International Workshop on Networking Meets Databases (NetDB).
    [42]
    Raffi Krikorian. 2012. Timelines at Scale. In QCon San Francisco. https://www.infoq.com/presentations/Twitter-Timeline-Scalability/
    [43]
    Roland Kuhn. 2021. Local-First Cooperation. https://www.infoq.com/articles/local-first-cooperation/
    [44]
    Ajay Kulkarni and Ryan Booz. 2020. What the heck is time-series data (and why do I need a time-series database)? https://blog.timescale.com/blog/what-the-heck-is-time-series-data-and-why-do-i-need-a-time-series-database-dcf3b1b18563/
    [45]
    Leslie Lamport. 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July 1978), 558--565.
    [46]
    Leslie Lamport. 2001. Paxos Made Simple. ACM SIGACT News 32, 4 (Dec. 2001), 51--58.
    [47]
    João Leitão, José Pereira, and Luís Rodrigues. 2007. HyParView: A Membership Protocol for Reliable Gossip-Based Broadcast (37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks). IEEE, 419--429.
    [48]
    Linux Programmer's Manual. [n.d.]. select(2) - Linux manual page. https://www.man7.org/linux/man-pages/man2/select.2.html
    [49]
    Geoffrey Litt, Peter van Hardenberg, and Orion Henry. 2021. Cambria: Schema Evolution in Distributed Systems with Edit Lenses. In 8th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC). ACM, Article 8.
    [50]
    David C Luckham. 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems. Addison-Wesley.
    [51]
    Frank McSherry, Derek G Murray, Rebecca Isaacs, and Michael Isard. 2013. Differential dataflow. In 6th Biennial Conference on Innovative Data Systems Research (CIDR). http://cidrdb.org/cidr2013/Papers/CIDR13_Paper111.pdf
    [52]
    Jayadev Misra. 1986. Distributed Discrete-Event Simulation. Comput. Surveys 18, 1 (March 1986), 39--65.
    [53]
    Mozilla Developer Network. [n.d.]. Event reference. https://developer.mozilla.org/en-US/docs/Web/Events
    [54]
    Shadi A. Noghabi, Kartik Paramasivam, Yi Pan, Navina Ramesh, Jon Bringhurst, Indranil Gupta, and Roy H. Campbell. 2017. Samza: Stateful Scalable Stream Processing at LinkedIn. Proceedings of the VLDB Endowment 10, 12 (Aug. 2017), 1634--1645.
    [55]
    Diego Ongaro and John K Ousterhout. 2014. In Search of an Understandable Consensus Algorithm. In USENIX Annual Technical Conference (ATC). USENIX.
    [56]
    Michiel Overeem, Marten Spoor, and Slinger Jansen. 2017. The dark side of event sourcing: Managing data conversion. In 24th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 193--204.
    [57]
    Kevin De Porre, Florian Myter, Christophe De Troyer, Christophe Scholliers, Wolfgang De Meuter, and Elisa Gonzalez Boix. 2019. Putting Order in Strong Eventual Consistency. In IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS 2019). Springer, 36--56.
    [58]
    Yasushi Saito and Marc Shapiro. 2005. Optimistic Replication. Comput. Surveys 37, 1 (March 2005), 42--81.
    [59]
    Fred B Schneider. 1990. Implementing fault-tolerant services using the state machine approach: a tutorial. Comput. Surveys 22, 4 (Dec. 1990), 299--319.
    [60]
    Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011. Conflict-Free Replicated Data Types. In 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS 2011). Springer, 386--400.
    [61]
    Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011. A comprehensive study of Convergent and Commutative Replicated Data Types. Technical Report 7506. INRIA. http://hal.inria.fr/inria-00555588/
    [62]
    Supreeth Shastri, Vinay Banakar, Melissa Wasserman, Arun Kumar, and Vijay Chidambaram. 2020. Understanding and Benchmarking the Impact of GDPR on Database Systems. Proceedings of the VLDB Endowment 13, 7 (March 2020), 1064--1077.
    [63]
    Ben Stopford. 2017. Handling GDPR with Apache Kafka: How does a log forget? https://www.confluent.io/blog/handling-gdpr-log-forget/
    [64]
    Douglas B Terry, Marvin M Theimer, Karin Petersen, Alan J Demers, Mike J Spreitzer, and Carl H Hauser. 1995. Managing update conflicts in Bayou, a weakly connected replicated storage system. In 15th ACM Symposium on Operating Systems Principles (SOSP). ACM, 172--182.
    [65]
    Ivan Valkov, Natalia Chechina, and Phil Trinder. 2018. Comparing Languages for Engineering Server Software: Erlang, Go, and Scala with Akka. In 33rd Annual ACM Symposium on Applied Computing (SAC). ACM, 218--225.
    [66]
    Marko Vukolić. 2015. The Quest for Scalable Blockchain Fabric: Proof-of-Work vs. BFT Replication. In IFIP WG 11.4 International Workshop on Open Problems in Network Security (iNetSec). Springer, 112--125.
    [67]
    Guozhang Wang, Joel Koshy, Sriram Subramanian, Kartik Paramasivam, Mammad Zadeh, Neha Narkhede, Jun Rao, Jay Kreps, and Joe Stein. 2015. Building a replicated logging system with Apache Kafka. Proceedings of the VLDB Endowment 8, 12 (Aug. 2015), 1654--1655.
    [68]
    Alexey Zimarev. 2020. What is Event Sourcing? https://www.eventstore.com/blog/what-is-event-sourcing

    Cited By

    View all

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    DEBS '21: Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems
    June 2021
    207 pages
    ISBN:9781450385558
    DOI:10.1145/3465480
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 June 2021

    Check for updates

    Author Tags

    1. CRDTs
    2. event sourcing
    3. real-time collaboration
    4. state machine replication
    5. stream processing

    Qualifiers

    • Keynote

    Funding Sources

    Conference

    DEBS '21

    Acceptance Rates

    DEBS '21 Paper Acceptance Rate 7 of 26 submissions, 27%;
    Overall Acceptance Rate 130 of 553 submissions, 24%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 575
      Total Downloads
    • Downloads (Last 12 months)89
    • Downloads (Last 6 weeks)5

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media