A technical overview of three of the most representative key-value stores: Cassandra, Redis, and CouchDB. Focused on Ruby and Ruby on Rails development, this talk shows how to solve common problems and covers the most popular libraries, benchmarking, and the best use case for each store. This talk was part of Conferencia Rails 2009 in Madrid, Spain. http://app.conferenciarails.org/talks/43-key-value-stores-conviertete-en-un-jedi-master
These are the slides from my talk at the 2012 Sphinx Search Day in Santa Clara, California. They provide a high-level picture of where Sphinx is used at craigslist, a bit of history, issues, and future work.
The document discusses building a high performance Weibo platform. It covers using MySQL with user_timeline indexing initially, and then integrating caching with Memcached and Redis. It also discusses using NoSQL databases like MongoDB and HBase. The key topics are data structures like JSON, binary formats, and case studies of architectures with one or multiple data centers and regions for timelines. The overall goal is to build scalable timelines and optimize performance.
Apache Jackrabbit is just about to reach the 3.0 milestone based on a new architecture called Oak. Based on concepts like eventual consistency and multi-version concurrency control, and borrowing ideas from distributed version control systems and cloud-scale databases, the Oak architecture is a major leap ahead for Jackrabbit. This presentation describes the Oak architecture and shows what it means for the scalability and performance of modern content applications. Changes to existing Jackrabbit functionality are described and the migration process is explained.
Redis is a networked data structure server that provides fast, simple access to various data types like Strings, Lists, Sets, Sorted Sets and Hashes. It uses an abstract data type interface where operations take a key as the first parameter and match the type of object stored. For example, list operations like LPUSH take a key and value, and the LRANGE operation takes a key and range to return elements in a list. Redis supports multiple programming language clients and can be used for tasks like leader boards, shopping carts and user profiles.
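The list operations described above can be sketched without a running server. The following is a minimal in-memory model of the LPUSH and LRANGE semantics only; `ToyListStore` and its method names are hypothetical illustrations, not part of any Redis client library.

```python
class ToyListStore:
    """Toy in-memory model of Redis list commands (not a Redis client)."""

    def __init__(self):
        self._data = {}  # key -> Python list, head of the list at index 0

    def lpush(self, key, *values):
        # LPUSH inserts each value at the head, one after another,
        # so the last value pushed ends up first in the list.
        lst = self._data.setdefault(key, [])
        for value in values:
            lst.insert(0, value)
        return len(lst)  # like Redis, report the new list length

    def lrange(self, key, start, stop):
        # LRANGE takes inclusive start/stop offsets; negative offsets
        # count from the tail (-1 is the last element).
        lst = self._data.get(key, [])
        n = len(lst)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if stop < 0:
            return []  # stop falls before the head of the list
        return lst[start:stop + 1]


store = ToyListStore()
store.lpush("queue", "a")
store.lpush("queue", "b", "c")
print(store.lrange("queue", 0, -1))  # whole list, newest first
```

Every operation takes the key as its first parameter, which mirrors the abstract data type interface the summary describes.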
This document discusses Redis, a key-value store that is commonly used at Weibo for caching and storing relationship data. Redis has fast read and write performance but has limitations for large datasets due to its fully in-memory design. The document describes how Weibo uses Redis in conjunction with MySQL and Memcached to store relationship data for over 100 million users in a performant and scalable way. Challenges around high memory usage, persistence, and availability are also discussed.
This document discusses Ruby's role in data processing. It outlines the typical steps of data processing - collect, summarize, analyze, visualize. It then provides examples of open-source Ruby tools that can be used for each step, such as Fluentd for collection, and libraries for numerical analysis, bioinformatics, and machine learning. Services that use Ruby for collection and processing are also mentioned, like Log Analytics and Stackdriver Logging. The document encourages continuing to improve Ruby tools to make data processing better and celebrates Ruby's 25th anniversary.
You’ve got your Hadoop cluster, you’ve got your petabytes of unstructured data, you run mapreduce jobs and SQL-on-Hadoop queries. Something is still missing though. After all, we are not expected to enter SQL queries while looking for information on the web. Altavista and Google solved it for us ages ago. Why are we still requiring SQL or Java certification from our enterprise bigdata users? In this talk, we will look into how integration of SolrCloud into Apache Bigtop is now enabling building bigdata indexing solutions and ingest pipelines. We will dive into the details of integrating full-text search into the lifecycle of your bigdata management applications and exposing the power of Google-in-a-box to all enterprise users, not just a chosen few data scientists.
The document reviews languages that compile to JavaScript, including CoffeeScript, Dart, TypeScript, Traceur, Emscripten, Scala.js, ClojureScript, Kotlin, and others. It discusses their features like static typing, classes, async/await support, and ability to port other language ecosystems to run in the browser. It also covers architectures like Opa and Ur/Web that aim to use a single language across front-end, back-end, and databases.
This is an intro to Sphinx and PHP. It will take you through the very basics of how Sphinx works, how you can set up an index, and how to use the mysql client to search your index. Then, it culminates in a quick little PHP script that builds a small search interface around your index. I will be posting the example code into my github account soon. This presentation was given to the LV PHP meetup on August 5th.
This document discusses database performance optimization techniques when using Hibernate. It covers topics like connection management, identifier generation strategies, batching, fetching strategies, caching. The key message is that database access time consists of several components like connection acquisition, data access logic, statement submission/execution, result fetching. Optimizing these areas can significantly improve performance.
This document provides information about MongoDB and its suitability for e-commerce applications. It discusses how MongoDB allows for a flexible schema that can accommodate different product types like books, music albums, jeans, without needing to define all attributes in advance. This flexibility addresses the "data dilemma" that traditional relational databases have in modeling diverse e-commerce data. Examples of companies successfully using MongoDB for e-commerce are also provided.
(A talk given at Wix R&D in Dnipro, Ukraine in March 2017. Video available at https://www.youtube.com/watch?v=eIX33mQdkAI&feature=youtu.be) While microservices are conceptually simple, it's a deep rabbit hole to go down. Deceptively simple questions can have far-reaching implications: Which communication protocol should I choose? Is event-driven the way to go? What monitoring tools should I put in place? In this talk we'll cover some of the fundamental questions, outline the solutions adopted or developed by Wix, and share our hindsight on what worked well for us, what didn't and thoughts on future directions for our stack.
LevelDB is an embedded key-value store developed by Google that is optimized for fast read and write operations. It can be used as an embedded database or with networking protocols. LevelDB stores data on disk and treats keys and values as arbitrary byte arrays. The LevelUp Node.js wrapper provides an interface to perform common operations like put, get, delete, batch operations, and iterating over data streams. LevelDB can store JSON objects by setting the valueEncoding option to 'json'.
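To illustrate the operations listed above, here is a minimal Python sketch of a byte-oriented key-value store with JSON value encoding. It is a toy in-memory model, not LevelDB or LevelUp: the `ToyKV` class and its methods are hypothetical, it keeps nothing on disk, and its `batch` is not atomic the way a real LevelDB write batch is.

```python
import json


class ToyKV:
    """Toy key-value store: byte keys and values, JSON-encoded values."""

    def __init__(self):
        self._data = {}  # bytes -> bytes, as the underlying store sees it

    def put(self, key, value):
        # Encode the value as JSON, then store raw bytes.
        self._data[key.encode()] = json.dumps(value).encode()

    def get(self, key):
        # Raises KeyError if the key is absent.
        return json.loads(self._data[key.encode()])

    def delete(self, key):
        self._data.pop(key.encode(), None)

    def batch(self, ops):
        # Each op is a dict such as {"type": "put", "key": ..., "value": ...}
        # or {"type": "del", "key": ...}.
        for op in ops:
            if op["type"] == "put":
                self.put(op["key"], op["value"])
            elif op["type"] == "del":
                self.delete(op["key"])
            else:
                raise ValueError("unknown batch op: %r" % op["type"])

    def items(self):
        # Iterate in key order, decoding values on the way out.
        for raw_key in sorted(self._data):
            yield raw_key.decode(), json.loads(self._data[raw_key])


db = ToyKV()
db.put("user:2", {"name": "Grace"})
db.batch([
    {"type": "put", "key": "user:1", "value": {"name": "Ada"}},
    {"type": "del", "key": "user:2"},
])
print(list(db.items()))
```

The encoding step is the part that corresponds to the 'json' value encoding: callers work with objects, while the store itself only ever holds bytes.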
- Craigslist is a classified advertising website serving over 500 cities worldwide that handles over 20 billion pageviews and 50 million users per month. It allows users to post free classified ads for jobs, housing, items for sale, and other services. - The technical challenges for Craigslist include a high ad churn rate, growth in traffic volume, the need for data archiving and search capabilities, and maintaining the system with a small team. - Craigslist uses open source technologies like MySQL, memcached, Apache, and Sphinx to power its infrastructure while keeping it simple, efficient and low cost. It employs techniques like vertical and horizontal data partitioning and incremental indexing to handle its scale.
memcached: a distributed cache. memcached is the most popular caching solution for low-latency, high-throughput websites, and it drastically improves read times.
Sharding allows you to distribute load across multiple servers and keep your data balanced across those servers. This session will review MongoDB’s sharding support, including an architectural overview, design principles, and automation.
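As a rough illustration of distributing keys across servers, the sketch below routes each key to a shard by hashing it. This is a simplified stand-in, not MongoDB's mechanism: MongoDB partitions data into chunks covering ranges of the shard key and rebalances them automatically, which this toy does not model. The `shard_for` function and the shard names are hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key, shards):
    """Pick a shard for a key by hashing it (toy routing, no rebalancing)."""
    digest = hashlib.md5(key.encode()).digest()
    # Use the first 8 bytes of the digest as an integer, then take it
    # modulo the number of shards to get a stable shard index.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


shards = ["shard-a", "shard-b", "shard-c"]  # hypothetical server names
counts = Counter(shard_for("user:%d" % i, shards) for i in range(1000))
print(counts)  # how many of the 1000 keys landed on each shard
```

Plain modulo routing reshuffles most keys when a shard is added or removed, which is one reason production systems track ranges or chunks instead.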
This document summarizes Jeremy Zawodny's work with MySQL and search at Craigslist. It discusses how Craigslist uses MySQL for its classified listings but encountered scaling issues as traffic grew. To address this, Craigslist implemented the Sphinx search engine, which improved performance and allowed them to reduce their MySQL cluster size. The document also outlines Craigslist's data archiving strategy using eventual consistency and their goals for further optimizing their database and search infrastructure.
Social networks by their nature deal with large amounts of user-generated data that must be processed and presented in a time sensitive manner. Much more write intensive than previous generations of websites, social networks have been on the leading edge of non-relational persistence technology adoption. This talk presents how Germany's leading social networks Schuelervz, Studivz and Meinvz are incorporating Redis and Project Voldemort into their platform to run features like activity streams.
There are many fast data stores, and then there is Redis. Learn about this excellent NoSQL solution that is a powerful in-memory key-value store. Learn how to solve traditionally difficult problems with Redis, and how you can benefit from 100,000 reads/writes a second on commodity hardware. We’ll discuss how and when to use the different datatypes and commands to fit your needs. We’ll discuss the different PHP libraries with their pros and cons. We’ll then show some live examples on how to use it for a chatroom, and how Redis manages a billion data points for our dating matching system. Finally, we’ll discuss some of the upcoming features in the near future, such as clustering and scripting.
The document provides an overview of the 7 steps to create a JavaFX application: 1. Sketch - Come up with an idea and sketch the application UI. 2. Setup development environment - Download JavaFX SDK and JDK, choose an IDE like NetBeans, and setup a Maven project. 3. Draw UI - Literally draw the application UI using tools like JavaFX Production Suite or Illustrator. 4. Develop JavaFX - Learn JavaFX APIs and features, use third-party libraries, and develop the application code. 5. Build - Create JAR files for deployment using tools like the JavaFX packager, Maven, or Ant. 6.
This document discusses JavaFX and how it can be used to create rich desktop applications. It provides an overview of JavaFX 1.0 and 2.0, key APIs like properties, bindings, collections, timelines and transitions. It also covers controls, graphics, layouts, CSS, and the Scene Builder tool. The document discusses how JavaFX can be used with other JVM languages like Scala and Groovy. It lists several JavaFX related projects, books, and Twitter accounts that can provide additional resources.
PlayNice.ly's presentation at the first London Redis Meetup. A quick intro to Redis, then digging down into some schema design examples.
Updated version of the JavaFX Your Way talk for Devoxx. This includes additional JavaFX 2.0 API changes and an example of some Fantom code snippets.
Max Katz presents on building rich internet applications (RIA) with JavaFX. Some key points: - JavaFX is a tool for building RIA applications across devices like web, mobile, and desktop. It focuses on web and enterprise applications. - JavaFX uses a declarative and procedural expression language called JavaFX Script and integrates with Java classes. - Exadel has developed a JavaFX plugin for Eclipse that provides features like a JavaFX perspective and project creation wizard to help develop JavaFX applications. - Exadel also provides frameworks like Flamingo that connect JavaFX and Flex with backend technologies like Seam and Spring, and Fiji that integrates JSF with Flex or JavaFX
This is a talk that I gave on July 20, 2012 at the Southern California Python Interest Group meetup at Cross Campus, with food and drinks provided by Graph Effect.
This is the slide deck I presented in the tutorial section at PyCon India in Pune, India, in 2011.
In this talk I introduce Redis, explain how some big names (Twitter, Pinterest...) are using it, describe some pitfalls, and explain how we use Redis at teowaki.