The Most Used and
Most Underappreciated
Database
versus
● Embedded
● Direct I/O
● Compact
● Portable
● Fast
versus
BerkeleyDB
Kyoto Cabinet
LevelDB
RocksDB
etc ...
● Query Language
● Transactions
● Concurrency
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciated database engine by SQLite.org - Richard Hipp
SQLite is used in...
● Every Android device (~2 billion)
● Every Mac and iOS device (~1 billion)
● Every Win10 machine (~500 million)
● Every Chrome and Firefox browser (~2 billion)
● Every Skype, iTunes, WhatsApp (~2 billion)
● Millions of other applications
● Many billions of running instances
● 100s of billions, perhaps trillions, of databases
More Copies of SQLite Than...
● Linux
● Windows
● MacOS and iOS
● All other database engines combined
● Any application
● Any other library¹
¹except maybe zlib
One File Of C-code
sqlite3.c
● 204K lines
● 125K SLOC¹
● 7.2MB
Also: sqlite3.h
● 10.7K lines
● 1.5K SLOC
● 0.5MB
¹SLOC: “Source Lines Of Code” - Lines of code
not counting comments and blank lines.
Small Footprint
sqlite3.c + sqlite3.h (7.8 MB) → gcc -Os → sqlite3.o (0.49 MB)
Low Dependency
● memcmp()
● memcpy()
● memmove()
● memset()
● strcmp()
● strlen()
● strncmp()
gcc -c -DSQLITE_ZERO_MALLOC -DSQLITE_ENABLE_MEMSYS5 \
    -DSQLITE_OS_OTHER -DSQLITE_THREADSAFE=0 \
    -DSQLITE_OMIT_LOCALTIME \
    sqlite3.c
Single File Database
● One file holds
thousands of tables,
indexes, and views
● Space efficient
● Send a complete
database as an email
attachment
● Name it whatever you
like.
whatever.db
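
The single-file idea can be tried with Python's bundled sqlite3 module. This is a minimal sketch, not something from the talk: the table, index, and view names are made up for illustration, and the file is placed in a temporary directory.

```python
import os
import sqlite3
import tempfile


def build_single_file_db(path):
    """Create one database file holding a table, an index, and a view."""
    con = sqlite3.connect(path)
    con.executescript("""
        CREATE TABLE fruit(id INTEGER PRIMARY KEY, name TEXT, price REAL);
        CREATE INDEX fruit_name ON fruit(name);
        CREATE VIEW cheap_fruit AS SELECT name FROM fruit WHERE price < 1.0;
        INSERT INTO fruit(name, price) VALUES ('Orange', 0.85), ('Mango', 2.10);
    """)
    con.commit()
    con.close()


def list_schema_objects(path):
    """Return (type, name) for every object stored in the file."""
    con = sqlite3.connect(path)
    rows = con.execute(
        "SELECT type, name FROM sqlite_master ORDER BY type, name"
    ).fetchall()
    con.close()
    return rows


# The file name is arbitrary; SQLite does not care about the extension.
db_path = os.path.join(tempfile.mkdtemp(), "whatever.db")
build_single_file_db(db_path)
schema_objects = list_schema_objects(db_path)
print(schema_objects)
```

Everything created above lives in the one file at db_path, which could be copied or attached to an email as-is.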
Open File Format
● sqlite.org/fileformat.html
● Cross-platform
– 32-bit ↔ 64-bit
– little-endian ↔ big-endian
● Backwards-compatible
● Readable by 3rd-party
tools
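
Because the format is documented, the file header can be read without SQLite at all. The sketch below assumes only what sqlite.org/fileformat.html states: the first 16 bytes are the string "SQLite format 3" plus a NUL byte, and the page size is a big-endian 16-bit integer at offset 16 (the value 1 stands for 65536). The helper name and the sample table are hypothetical.

```python
import os
import sqlite3
import struct
import tempfile


def read_header(path):
    """Parse the magic string and page size from a database file header."""
    with open(path, "rb") as f:
        header = f.read(100)  # the documented header is 100 bytes long
    magic = header[:16]
    (page_size,) = struct.unpack(">H", header[16:18])  # big-endian on every platform
    if page_size == 1:
        page_size = 65536
    return magic, page_size


# Build a small database so there is a file to inspect.
header_db_path = os.path.join(tempfile.mkdtemp(), "header_demo.db")
con = sqlite3.connect(header_db_path)
con.execute("CREATE TABLE t(x)")
con.commit()
(pragma_page_size,) = con.execute("PRAGMA page_size").fetchone()
con.close()

magic, page_size = read_header(header_db_path)
print(magic, page_size)
```

The fixed byte order is what lets the same file move between little-endian and big-endian machines.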
Faster Than The File System
● 35% faster reads and writes
● 20% less disk space
● https://sqlite.org/fasterthanfs.html
35% faster
than
open()
read()
close()
Faster Than The File System
https://sqlite.org/fasterthanfs.html
[Figure: bar chart, axis "Time" from 0.0 to 6.0; bars: SQLite, android, ubuntu, mac, win7, win10]
Time to read 100,000 BLOBs with average size of 10,000 bytes
from SQLite versus directly from a file on disk.
Faster Than The File System
https://sqlite.org/fasterthanfs.html
[Figure: bar chart, axis "Time" from 0.0 to 3.5; bars: SQLite, ubuntu, mac, win7, win10]
Time to write 10,000 BLOBs with average size of 10,000 bytes
into SQLite versus writing directly into a file on disk.
Documentation
● Hundreds of pages of stand-alone HTML
– Extensively hyperlinked
– Downloadable as a single tarball
● Source code is 29% comment
– Useful comments, not boilerplate
● Thousands of assert() statements
● Test cases linked to documentation
● More documentation than source code
Aviation-Grade Testing
● DO-178B development process
● 100% MC/DC, as-deployed, with independence
results
in
● Very few bugs
● Refactor and optimize without breaking things
https://sqlite.org/testing.html
Performance Improvements
[Figure: chart of CPU cycles (axis 0% to 350%) by release, versions 3.6.7 through 3.20.0, years 2009 to 2018]
Copyright
“Lite”?
Limits
● 140 terabytes per database
● 125 databases per connection
● 2 gigabyte strings & BLOBs
● 64 tables in a join
● 32,767 columns per table
● 1 simultaneous writer + N readers
● No arbitrary limits on the number of tables or
the number of rows in a table
17.5 petabytes
per connection
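
The 17.5 petabyte figure follows directly from the first two limits. A quick check of the arithmetic, using decimal units (1 PB = 1000 TB) as an assumption:

```python
TB_PER_DATABASE = 140            # from the slide
DATABASES_PER_CONNECTION = 125   # from the slide

total_tb = TB_PER_DATABASE * DATABASES_PER_CONNECTION  # terabytes reachable from one connection
total_pb = total_tb / 1000                              # decimal petabytes

print(total_tb, "TB =", total_pb, "PB")
```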
SQL Features
● Tables, indexes,
views, & triggers
● Clustered & covering
indexes
● Partial indexes
● Expression indexes
● Common table
expressions
● Row values
● ACID, power-safe,
nested transactions
● R-Tree indexes
● Full-text search
● JSON support
● Table-valued
functions
● Correlated
subqueries
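
A few of these features can be exercised from Python's bundled sqlite3 module. This is a sketch under assumptions: the table and index names are invented, and the SQLite library linked into Python must be recent enough for row values (3.15.0 or later).

```python
import sqlite3

# isolation_level=None leaves transaction control entirely to the SQL below.
con = sqlite3.connect(":memory:", isolation_level=None)
con.executescript("""
    CREATE TABLE tab(fruit TEXT, qty INTEGER, price REAL);
    INSERT INTO tab VALUES ('Orange', 3, 0.85), ('Mango', 0, 2.10), ('Apple', 7, 0.50);
    -- Partial index: only rows with stock on hand are indexed.
    CREATE INDEX tab_in_stock ON tab(fruit) WHERE qty > 0;
    -- Expression index: indexes a computed value rather than a raw column.
    CREATE INDEX tab_lower_fruit ON tab(lower(fruit));
""")

# Common table expression (recursive form): generate the integers 1..5.
numbers = [n for (n,) in con.execute("""
    WITH RECURSIVE cnt(n) AS (
        SELECT 1 UNION ALL SELECT n + 1 FROM cnt WHERE n < 5
    )
    SELECT n FROM cnt
""")]

# Row values: compare two columns against a tuple in one expression.
row_value_match = con.execute(
    "SELECT fruit FROM tab WHERE (qty, price) = (3, 0.85)"
).fetchall()

# Nested transactions via SAVEPOINT: undo the inner step, keep the outer one.
con.execute("BEGIN")
con.execute("INSERT INTO tab VALUES ('Kiwi', 1, 0.40)")
con.execute("SAVEPOINT inner_step")
con.execute("INSERT INTO tab VALUES ('Durian', 1, 9.99)")
con.execute("ROLLBACK TO inner_step")
con.execute("COMMIT")
fruits = sorted(f for (f,) in con.execute("SELECT fruit FROM tab"))

print(numbers, row_value_match, fruits)
```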
“Lite”
● Memory usage
● Complication
● Dependencies
● License Issues
“Heavy”
● Features
● Reliability
● Capabilities
● Freedom
Storage Decision Checklist
Remote Data?
Concurrent Writers?
Big Data?
Otherwise
Gazillion transactions/sec?
Storage Decision Checklist FAIL!
fopen()
No!
Remote Data?
Concurrent Writers?
Big Data?
Otherwise
Gazillion transactions/sec?
SQLite = Data Container
Application
(1) Gather data from
the cloud
(2) Transmit one SQLite database
file to the device
(3) Use locally
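
The three steps above can be sketched as follows. The "cloud" data, the readings table, and the transmit() helper are hypothetical; a plain byte copy stands in for whatever network transfer an application would really use.

```python
import os
import sqlite3
import tempfile


def build_container(path, rows):
    """(1) Gather data and pack it into one SQLite database file."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE readings(sensor TEXT, value REAL)")
    con.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    con.commit()
    con.close()


def transmit(src_path, dst_path):
    """(2) Stand-in for a network transfer: copy the file's bytes."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(src.read())


def use_locally(path):
    """(3) Open the received file on the device and query it."""
    con = sqlite3.connect(path)
    result = con.execute(
        "SELECT sensor, value FROM readings ORDER BY sensor"
    ).fetchall()
    con.close()
    return result


tmp = tempfile.mkdtemp()
server_file = os.path.join(tmp, "server.db")
device_file = os.path.join(tmp, "device.db")

build_container(server_file, [("a", 1.5), ("b", 2.5)])
transmit(server_file, device_file)
local_rows = use_locally(device_file)
print(local_rows)
```

The sender closes its connection before the copy, so the file on disk is complete and no journal is left behind.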
File Format
● Row-store
● Variable-length records
● Forest of B-trees
– One B-tree for each table and each index
– Table key: PRIMARY KEY or ROWID
– Index key: indexed columns + table key
● Stable, cross-platform, byte-order independent, and
carefully documented
● SERIALIZABLE, power-safe transactions
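
The "one B-tree for each table and each index" point is visible in the schema table, which records a root page for every B-tree. A minimal sketch with invented table names; codes is declared WITHOUT ROWID so that its key is the PRIMARY KEY, while tab is keyed by ROWID.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tab(fruit TEXT, price REAL);                  -- table key: ROWID
    CREATE TABLE codes(code TEXT PRIMARY KEY, label TEXT)
        WITHOUT ROWID;                                         -- table key: PRIMARY KEY
    CREATE INDEX tab_fruit ON tab(fruit);                      -- index key: fruit + ROWID
""")

# Each row names one B-tree and the page where its root lives.
btrees = con.execute(
    "SELECT type, name, rootpage FROM sqlite_master ORDER BY rootpage"
).fetchall()

for kind, name, rootpage in btrees:
    print(f"{kind:5} {name:10} root page {rootpage}")

root_pages = [rootpage for _, _, rootpage in btrees]
names = {name for _, name, _ in btrees}
```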
Ins & Outs of SQLite
SQL → Compile SQL into bytecode → Prep'ed Stmt → Bytecode Interpreter → Result
Bytecode Interpreter ↔ B-Tree Storage Engine
EXPLAIN SELECT price FROM tab WHERE fruit='Orange'

addr  opcode         p1    p2    p3    p4               p5  comment
----  -------------  ----  ----  ----  ---------------  --  -------------
0     Init           0     12    0                      00  Start at 12
1     OpenRead       0     2     0     3                00  root=2 iDb=0; tab
2     Explain        0     0     0     SCAN TABLE tab   00
3     Rewind         0     10    0                      00
4     Column         0     0     1                      00  r[1]=tab.Fruit
5     Ne             2     9     1     (BINARY)         69  if r[2]!=r[1] goto 9
6     Column         0     2     3                      00  r[3]=tab.Price
7     RealAffinity   3     0     0                      00
8     ResultRow      3     1     0                      00  output=r[3]
9     Next           0     4     0                      01
10    Close          0     0     0                      00
11    Halt           0     0     0                      00
12    Transaction    0     0     1     0                01
13    TableLock      0     2     0     tab              00  iDb=0 root=2 write=0
14    String8        0     2     0     Orange           00  r[2]='Orange'
15    Goto           0     1     0                      00

Opcode documentation: https://www.sqlite.org/opcode.html
Documentation generated from comments in the vdbe.c source file.
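
A listing like this can be produced from Python by prefixing a statement with EXPLAIN. This is a sketch only: the slide does not show the schema of tab, so the middle column here is a guess, and the exact opcodes differ between SQLite versions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Assumed schema: fruit is column 0 and price is column 2, matching the
# Column opcodes on the slide; the qty column in between is hypothetical.
con.execute("CREATE TABLE tab(fruit TEXT, qty INTEGER, price REAL)")

# EXPLAIN returns one row per bytecode instruction instead of running the query.
program = con.execute(
    "EXPLAIN SELECT price FROM tab WHERE fruit='Orange'"
).fetchall()

for addr, opcode, p1, p2, p3, p4, p5, comment in program:
    print(f"{addr:<4} {opcode:<14} {p1:<4} {p2:<4} {p3:<4} {p4 or '':<15} {p5}")

opcodes = [row[1] for row in program]
```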
Business Model
● Technical support contracts
– Annual maintenance subscription
– Technical support
– SQLite Consortium membership
● Proprietary extensions
– Encryption
– Compression
● Keep expenses very low
If the software is free, how do we get money to live?
Primary
Revenue
Source
The Future
● Support through 2050
● 5 or 6 releases per year
● Keep it stable, backwards-compatible, free, and
organic
● Constantly improving query planner
● Better performance
SQLite added to
Mac OS X 10.4 (Tiger)
2004-06-28
Small, Fast, Reliable
Why aren't you using it more?
