Wikimedia Content API: A Cassandra Use-case

Eric Evans <eevans@wikimedia.org>
@jericevans
Berlin Buzzwords | June 6, 2016

Our Vision:
A world in which every single human can freely
share in the sum of all knowledge.

About:
● Global movement
● Largest collection of free, collaborative knowledge in human history
● 16 projects
● 16.5 billion total page views per month
● 58.9 million unique devices per day
● More than 13k new editors each month
● More than 75k active editors month-to-month

About: Wikipedia
● More than 38 million articles in 290 languages
● Over 10k new articles added per day
● 13 million edits per month
● Ranked #6 globally in web traffic

Wikitext
= Star Wars: The Force Awakens =
Star Wars: The Force Awakens is a 2015 American epic space opera
film directed, co-produced, and co-written by [[J. J. Abrams]].

HTML
<h1>
Star Wars: The Force Awakens
</h1>
<p>
Star Wars: The Force Awakens is a 2015 American epic space opera
film directed, co-produced, and co-written by
<a href="/wiki/J._J._Abrams" title="J. J. Abrams">
J. J. Abrams
</a>
</p>

Metadata
[[Foo|bar]]
<a rel="mw:WikiLink" href="./Foo">bar</a>

Metadata
[[Foo|{{echo|bar}}]]
<a rel="mw:WikiLink" href="./Foo">
<span about="#mwt1" typeof="mw:Object/Template"
data-parsoid="{...}" >bar</span>
</a>
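
One way a client might consume this metadata is to pick out the anchors Parsoid marks with rel="mw:WikiLink". The following is a minimal sketch using only the Python standard library; the class and function names are hypothetical, and the sample markup is the one from the slide above.

```python
from html.parser import HTMLParser


class WikiLinkCollector(HTMLParser):
    """Collects (href, text) pairs for anchors marked rel="mw:WikiLink"."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the wiki link currently open, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") == "mw:WikiLink":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def wiki_links(html):
    """Return every (href, link text) pair for wiki links found in html."""
    parser = WikiLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# The templated example from the slide: the visible text lives in a nested span.
SAMPLE = '''<a rel="mw:WikiLink" href="./Foo">
<span about="#mwt1" typeof="mw:Object/Template"
data-parsoid="{...}" >bar</span>
</a>'''

print(wiki_links(SAMPLE))
```

Because the link type is carried in the markup itself, the same code handles both the plain and the templated link without knowing anything about wikitext.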

Parsoid
● Node.js service
● Converts wikitext to HTML/RDFa
● Converts HTML/RDFa to wikitext
● Preserves semantics and syntax (avoids dirty diffs)!
● Expensive (slow)
● Resulting output is large

RESTBase
● Services aggregator / proxy (REST)
● Durable cache (Cassandra)
● Wikimedia’s content API (e.g. https://en.wikipedia.org/api/rest_v1?doc)
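
As an illustration of what a request against the content API could look like, here is a small sketch that builds the URL for a page's stored HTML. It assumes the /page/html/{title} route under rest_v1; the helper name is hypothetical and nothing is fetched here.

```python
from urllib.parse import quote


def page_html_url(domain, title):
    """Build the rest_v1 URL for the HTML rendering of a page.

    Assumes the /page/html/{title} route; safe="" makes sure that
    characters such as "/" and ":" in a title are percent-encoded.
    """
    return f"https://{domain}/api/rest_v1/page/html/{quote(title, safe='')}"


url = page_html_url("en.wikipedia.org", "Star_Wars:_The_Force_Awakens")
print(url)
```

The resulting URL can be passed to any HTTP client, for example urllib.request.urlopen.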

[Diagram: RESTBase instances backed by Cassandra, sitting in front of Parsoid and other services (...)]

Other use-cases
● Mobile content service
● Math formula rendering service
● Dumps
● ...

Environment
● 2 datacenters
● 3 racks per datacenter
● 18 hosts (16 core, 128G, SSDs)
● 54 nodes
● Deflate compression (~14-18%)
● 31T storage (~206T uncompressed)
● Cassandra 2.1.13 (moving to 2.2.6)
● Read-heavy workload (5:1)
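
The figures above can be cross-checked with a little arithmetic. This sketch assumes that the 31T and 206T numbers are cluster-wide totals, which the slide does not state explicitly.

```python
HOSTS = 18
NODES = 54
STORED_TB = 31          # on-disk (compressed) size, assumed cluster-wide
UNCOMPRESSED_TB = 206   # uncompressed size, assumed cluster-wide

# Several Cassandra instances run on each physical host.
nodes_per_host = NODES // HOSTS

# Compressed size as a fraction of the uncompressed size.
compressed_fraction = STORED_TB / UNCOMPRESSED_TB

# Average on-disk data per Cassandra instance and per host.
tb_per_node = STORED_TB / NODES
tb_per_host = STORED_TB / HOSTS

print(f"{nodes_per_host} nodes per host")
print(f"compressed to {compressed_fraction:.1%} of original size")
print(f"{tb_per_node:.2f}T per node, {tb_per_host:.2f}T per host")
```

Under that assumption the overall ratio lands inside the ~14-18% range quoted for deflate.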

Data model
CREATE TABLE data (
    domain text,
    title text,
    rev int,
    tid timeuuid,
    value blob,
    PRIMARY KEY ((domain, title), rev, tid)
) WITH CLUSTERING ORDER BY (rev DESC, tid DESC);
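
To make the effect of the partition key and clustering order concrete, here is a toy in-memory model of the table. It is only a sketch: the class is hypothetical, and tid is modelled as a plain integer timestamp rather than a real timeuuid.

```python
from collections import defaultdict


class ContentTable:
    """Toy model: one partition per (domain, title), rows ordered
    by (rev DESC, tid DESC) as in the CQL definition."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, domain, title, rev, tid, value):
        rows = self._partitions[(domain, title)]
        rows.append((rev, tid, value))
        # Keep the partition in clustering order: newest revision first,
        # and within a revision, newest render first.
        rows.sort(key=lambda row: (row[0], row[1]), reverse=True)

    def latest(self, domain, title):
        """Newest render of the newest revision (first row of the partition)."""
        rows = self._partitions.get((domain, title))
        return rows[0] if rows else None

    def renders(self, domain, title, rev):
        """All stored renders of one revision, newest first."""
        rows = self._partitions.get((domain, title), [])
        return [row for row in rows if row[0] == rev]


table = ContentTable()
key = ("en.wikipedia.org", "Star_Wars:_The_Force_Awakens")
table.insert(*key, rev=717862573, tid=1, value=b"<html>old</html>")
table.insert(*key, rev=717873822, tid=2, value=b"<html>new</html>")
table.insert(*key, rev=717873822, tid=3, value=b"<html>re-render</html>")
print(table.latest(*key))
```

With descending clustering order, the most common read (the current render of the current revision) is the first row of the partition.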

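Data model
[Diagram: a single partition keyed by en.wikipedia.org + Star_Wars:_The_Force_Awakens, holding revisions 717862573 and 717873822, each with one or more tid (timeuuid) entries such as 97466b12...7c7a913d3d8a, 1f2dd66c...7c7a913d3d8a, 09877568...7c7a913d3d8a, bdebc9a6...7c7a913d3d8a and 827e2ec2...7c7a913d3d8a]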

Brotli compression
● Brought to you by the folks at Google; Successor to deflate
● Cassandra implementation (https://github.com/eevans/cassandra-brotli)
● Initial results very promising
● Better compression, lower cost (apples-apples)
● And, wider windows are possible (apples-oranges)
○ GC/memory permitting
○ Example: level=1, lgblock=4096, chunk_length_kb=4096, yields 1.73% compressed size!
○ https://phabricator.wikimedia.org/T122028
● Stay tuned!
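
The "wider windows" point can be illustrated without Brotli itself. The sketch below uses zlib (deflate) from the Python standard library as a stand-in, with synthetic data, and compresses the same bytes in independent chunks of different sizes, roughly the way chunk_length_kb splits an SSTable. The numbers it prints are not comparable to the results quoted above.

```python
import zlib


def compressed_fraction(data, chunk_size, level=6):
    """Compress data in independent chunks; return compressed/original size."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += len(zlib.compress(data[start:start + chunk_size], level))
    return total / len(data)


# Synthetic stand-in for many near-identical renders of the same article.
line = b"<p>Star Wars: The Force Awakens is a 2015 American epic space opera film.</p>\n"
data = b"".join(line + str(i).encode() for i in range(2000))

small_chunks = compressed_fraction(data, 256)
large_chunks = compressed_fraction(data, 32 * 1024)

print(f"256-byte chunks: {small_chunks:.1%} of original size")
print(f"32 KiB chunks:   {large_chunks:.1%} of original size")
```

Larger chunks let the compressor see more of the repetition between revisions. Deflate's window is capped at 32 KiB, so it cannot benefit from chunks much beyond that, whereas Brotli allows far larger windows, which is what makes multi-megabyte chunk lengths worth trying.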

Compaction
● The cost of having log-structured storage
● Asynchronously (post-write) optimize data on disk for reads
● At a minimum, reorganize into fewer files
○ Dropping what is obsolete
○ Expiring TTLs
○ Removing deleted (aka tombstoned) data (after a fashion)
● Reorganize data so results are nearer each other
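
The minimum described above can be sketched as a merge of several immutable tables into one. This is a deliberately simplified model with hypothetical names: each table is a dict of key to (write timestamp, value, expiry), a value of None stands for a tombstone, and tombstones are dropped immediately, whereas Cassandra retains them for a grace period (gc_grace_seconds) first.

```python
def compact(sstables, now):
    """Merge tables into one, keeping only live, current cells.

    sstables: iterable of dicts mapping key -> (write_ts, value, expires_at)
              value None marks a tombstone; expires_at None means no TTL.
    now:      current time, in the same units as expires_at.
    """
    newest = {}
    for table in sstables:
        for key, cell in table.items():
            # Last write wins: keep the cell with the highest timestamp.
            if key not in newest or cell[0] > newest[key][0]:
                newest[key] = cell

    merged = {}
    for key in sorted(newest):           # output stays sorted by key
        write_ts, value, expires_at = newest[key]
        if value is None:
            continue                     # deleted (tombstoned) data
        if expires_at is not None and expires_at <= now:
            continue                     # expired TTL
        merged[key] = (write_ts, value, expires_at)
    return merged


older = {"a": (1, "v1", None), "b": (1, "x", None), "c": (1, "short-lived", 5)}
newer = {"a": (2, "v2", None), "b": (2, None, None)}
print(compact([older, newer], now=10))
```

Here "a" is overwritten, "b" is deleted and "c" has expired, so three cells across two files collapse into one cell in one file.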

Compaction strategies
● Size-tiered
○ Combines tables of similar size
○ Oblivious to column distribution; Works best for workloads with no overwrites/deletes
○ Minimal IO
● Leveled
○ Small, fixed size files in levels of exponentially increasing size
○ Files have non-overlapping ranges within a level
○ Very efficient reads, but also quite IO intensive
● Date-tiered
○ For append-only, totally ordered data
○ Avoids mixing old data with new
○ Cold data eventually ceases to be compacted

Compaction strategies
● Date-tiered
○ For append-only, totally ordered data
○ Avoids mixing old data with new
○ Cold data eventually ceases to be compacted (OMG, THIS!)

DTCS: Well...no, actually
● Hard to reason about
● Optimizations easily defeated
● See: https://phabricator.wikimedia.org/T126221

DTCS: So now what?
● Size-tiered compaction? Might as well.
● TimeWindowCompactionStrategy (https://github.com/jeffjirsa/twcs)? Maybe...
● Reduce node density?

G1GC
● Early adopters of G1 (aka “Garbage First”)
● Successor to Concurrent Mark Sweep (CMS)
● Incremental parallel compacting collector
● More predictable performance than CMS

Humongous objects
● Anything >= ½ region size is classified as Humongous
● Humongous objects are allocated into Humongous Regions
● Only one object for a region (wastes space, creates fragmentation)
● Until 1.8u40, humongous regions collected only during full collections (Bad)
● Since 1.8u40, collected at the end of the marking cycle, during the cleanup phase (Better)
● Treated as exceptions, so should be exceptional
○ For us, that means 8MB regions
● Enable GC logging and have a look!
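
The sizing rule above is easy to work through numerically. This sketch uses hypothetical helper names and follows the slide's ">= half a region" wording; the region size itself would be set with the JVM option -XX:G1HeapRegionSize.

```python
import math

MB = 1024 * 1024


def is_humongous(object_bytes, region_bytes):
    """True if the object is at least half a region (the slide's rule)."""
    return object_bytes >= region_bytes // 2


def regions_needed(object_bytes, region_bytes):
    """Whole regions a humongous object occupies."""
    return math.ceil(object_bytes / region_bytes)


def wasted_bytes(object_bytes, region_bytes):
    """Space left unusable in the object's last region."""
    return regions_needed(object_bytes, region_bytes) * region_bytes - object_bytes


REGION = 8 * MB  # the region size mentioned above

for size_mb in (3, 4, 9):
    size = size_mb * MB
    print(
        f"{size_mb}MB object: humongous={is_humongous(size, REGION)}, "
        f"regions={regions_needed(size, REGION)}, "
        f"wasted={wasted_bytes(size, REGION) // MB}MB"
    )
```

With 8MB regions, only allocations of 4MB and up are treated as humongous, so raising the region size is one way to keep typical large buffers out of that category.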

“Many smaller-sized Cassandra nodes is
always better than fewer, dense ones.”
— Everyone

Motivation
● Compaction
● GC
● ...

What we do
● Processes (yup)
● Puppetized configuration
○ /etc/cassandra-a/
○ /etc/cassandra-b/
○ systemd units
○ Etc
● Shared RAID-0

What we should have done
● Virtualization
● Containers
● Blades
● Not processes

Cassandra: The Good
● Fault-tolerance
● Availability
● Datacenter / rack awareness
● Visibility
● Ubiquity
● Nice, helpful people (tickets, IRC, etc)

Cassandra: The Bad
● Usability
○ Compaction
○ Streaming
○ JMX
○ etc
● Vertical scaling
● JVM

Cassandra: The Ugly
● Upgrading
● Release process
