Postgres Vienna DB Meetup 2014
- 2. Michael Renner
@terrorobe
https://pganalyze.com
My name is Michael Renner
Twitter handle - which has already gotten me into trouble.
Web operations, strong interest in databases, scaling and
performance.
PG enthusiast since 2004
If you've got questions - please just ask!
- 4. Postgres.
A free RDBMS done right
Relational database management system
It does SELECT, INSERT, UPDATE, DELETE
In a sane & maintainable way.
Tries hard to not surprise users, hype resistant.
- 6. One major release per year
Five years maintenance
Multiple maintenance releases per year
Does...
- 7. Friendly & Competent
Community
• http://www.postgresql.org/list/
• Freenode: #postgresql(-de)
• http://pgconf.(de|eu|us)
More often than not, consultants from various companies hang out
in the channels
- 8. 9.4 ante portas
~Sep 2014
http://www.postgresql.org/docs/devel/static/release-9-4.html
That being said, the next major release will come after the summer;
extrapolating from past releases, it should be here around September.
It'll bring quite a few new features; I selected a few interesting ones.
- 10. Calculate 95th percentile
postgres=# SELECT percentile_disc(0.95) WITHIN GROUP(ORDER BY i) FROM
generate_series(1,100) AS s(i);
percentile_disc
-----------------
95
(1 row)
...calculate percentiles
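To make the semantics concrete, here is a minimal Python sketch of what a discrete percentile does: it returns the first value in the ordering whose position (as a fraction of the row count) equals or exceeds the requested fraction. The function name mirrors the SQL aggregate for readability only; this is an illustration, not how Postgres implements it.

```python
def percentile_disc(fraction, values):
    """Return the first value whose relative position >= fraction."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    if not ordered:
        return None  # no input rows, no result
    n = len(ordered)
    for position, value in enumerate(ordered, start=1):
        # position / n is the share of rows at or before this value
        if position / n >= fraction:
            return value
    return ordered[-1]


p95 = percentile_disc(0.95, range(1, 101))
median_disc = percentile_disc(0.5, [1, 2, 3, 4])
```

Because the result is always one of the input values, no interpolation happens; that is the difference to a continuous percentile.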
- 12. New JSON functions
$ SELECT * FROM json_to_recordset(
'[
{"name":"e","value":2.718},
{"name":"pi","value":3.141},
{"name":"tau","value":6.283}
]', TRUE)
AS x (name text, value numeric);
name | value
------+-------
e | 2.718
pi | 3.141
tau | 6.283
(3 rows)
http://www.postgresql.org/docs/devel/static/functions-json.html
http://www.depesz.com/2014/01/30/waiting-for-9-4-new-json-functions/
...and to complement the new data type, there are also new accessor functions
- 15. A tale of sorrows
or: "Brewer hates us"
If you've got a strong stomach, read through:
http://aphyr.com/tags/jepsen
It is a tale of sorrows, and it is not limited to Postgres or SQL databases.
Getting distributed database systems right is _HARD_.
Even the distributed database poster children get it wrong.
- 16. Brewer's CAP Theorem
• it is impossible for a distributed system to
simultaneously provide these guarantees:
• Consistency
• Availability
• Partition tolerance
In a nutshell
Consistency - all nodes see the same data at the same time
Availability - a guarantee that every request receives a response about
whether it was successful or failed
Partition tolerance - the system continues to operate despite arbitrary message
loss or failure of part of the system
Brewer says: It's impossible to get all three
Managers like things available & partition tolerant
- 17. PG Mantra:
Scale up, not out
Postgres, in the past, solved this problem by not dealing with it in the first
place!
To avoid the problem altogether, most people will usually tell you to
just scale up:
throw more/bigger hardware at the problem and be done with it.
- 18. Real world says:
"NO"
But that's not always possible.
You might need geo-redundant database servers, or you might run in an
environment where "scaling up" is not a feasible option (hello EC2!)
- 19. So we need replication.
What are our options?
So we need replication... Postgres has a bit of a Perl problem - TMTOWTDI
- 20. shared storage
...one of the oldest options
Usually achieved by using a SAN or DRBD
An HA solution is tacked on top of it: if one server goes down, the other starts up
- 21. Trigger-based
Add a trigger to all replicated tables
Changes get written to a separate table
Daemon reads changes from source DB and writes to destination DB
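The three steps above can be sketched with Python's built-in sqlite3 module standing in for Postgres. All table, trigger and function names (items, change_log, replicate_once) are made up for this illustration, and the sketch assumes primary keys are never updated.

```python
import sqlite3


def make_source():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
        -- separate table that collects every change
        CREATE TABLE change_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT, id INTEGER, name TEXT);
        CREATE TRIGGER items_ins AFTER INSERT ON items BEGIN
            INSERT INTO change_log (op, id, name) VALUES ('I', NEW.id, NEW.name);
        END;
        CREATE TRIGGER items_upd AFTER UPDATE ON items BEGIN
            INSERT INTO change_log (op, id, name) VALUES ('U', NEW.id, NEW.name);
        END;
        CREATE TRIGGER items_del AFTER DELETE ON items BEGIN
            INSERT INTO change_log (op, id, name) VALUES ('D', OLD.id, NULL);
        END;
    """)
    return db


def make_destination():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    return db


def replicate_once(source, destination, last_seq):
    """One pass of the 'daemon': apply logged changes newer than last_seq."""
    rows = source.execute(
        "SELECT seq, op, id, name FROM change_log WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, op, row_id, name in rows:
        if op == "I":
            destination.execute(
                "INSERT INTO items (id, name) VALUES (?, ?)", (row_id, name))
        elif op == "U":
            destination.execute(
                "UPDATE items SET name = ? WHERE id = ?", (name, row_id))
        else:
            destination.execute("DELETE FROM items WHERE id = ?", (row_id,))
        last_seq = seq
    destination.commit()
    return last_seq


source, destination = make_source(), make_destination()
source.execute("INSERT INTO items (name) VALUES ('row1')")
source.execute("INSERT INTO items (name) VALUES ('row2')")
source.execute("UPDATE items SET name = 'row1b' WHERE id = 1")
source.execute("DELETE FROM items WHERE id = 2")
source.commit()

position = replicate_once(source, destination, 0)
replicated = destination.execute(
    "SELECT id, name FROM items ORDER BY id").fetchall()
```

Note the cost this makes visible: every write on the source turns into two writes (the row itself plus the change_log entry).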
- 22. Statement-based
or "The proxy approach"
Connect to middleware instead of real database
All queries executed on middleware will be sent to many databases
That's fine until one of the servers isn't reachable!
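A rough sketch of the proxy approach and its failure mode, again using sqlite3 as a stand-in. StatementProxy and UnreachableBackend are hypothetical names for this illustration and do not correspond to any particular middleware product.

```python
import sqlite3


class UnreachableBackend:
    """Stand-in for a database server that cannot be reached."""

    def execute(self, sql, params=()):
        raise ConnectionError("backend unreachable")


class StatementProxy:
    """Sends every statement to all configured backends."""

    def __init__(self, backends):
        self.backends = backends

    def execute(self, sql, params=()):
        failed = []
        for index, backend in enumerate(self.backends):
            try:
                backend.execute(sql, params)
            except ConnectionError:
                failed.append(index)  # this backend missed the statement
        return failed


db_a = sqlite3.connect(":memory:")
db_b = sqlite3.connect(":memory:")
proxy = StatementProxy([db_a, db_b])

proxy.execute("CREATE TABLE z (whatever TEXT)")
proxy.execute("INSERT INTO z (whatever) VALUES (?)", ("row1",))

# The second server drops off the network; the proxy keeps going.
proxy.backends[1] = UnreachableBackend()
failed = proxy.execute("INSERT INTO z (whatever) VALUES (?)", ("row2",))

count_a = db_a.execute("SELECT count(*) FROM z").fetchone()[0]
count_b = db_b.execute("SELECT count(*) FROM z").fetchone()[0]
```

After the outage the two databases have silently diverged, and the middleware alone has no record of which statements the second server needs to catch up on.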
- 23. (Write Ahead) Log-based
And the most common one
* Postgres writes all changes it makes to the table & index files into a log, which
is used during crash recovery
* Send log contents to a secondary server
* Secondary server does "continuous crash recovery"
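As a toy model of the idea (assuming a key/value store instead of table and index pages, and an in-memory list instead of WAL files), the following sketch shows a primary that logs each change before applying it and a standby that keeps replaying whatever log it has not seen yet. Primary and Standby are illustrative names only.

```python
class Primary:
    def __init__(self):
        self.wal = []    # append-only list of (key, value) change records
        self.data = {}

    def set(self, key, value):
        self.wal.append((key, value))  # write to the log first...
        self.data[key] = value         # ...then change the actual data


class Standby:
    def __init__(self):
        self.data = {}
        self.replayed = 0  # number of log records already applied

    def replay(self, wal):
        """Apply all log records that arrived since the last call."""
        for key, value in wal[self.replayed:]:
            self.data[key] = value
        self.replayed = len(wal)


primary = Primary()
standby = Standby()

primary.set("a", 1)
primary.set("b", 2)
standby.replay(primary.wal)   # first batch of shipped log

primary.set("a", 3)
standby.replay(primary.wal)   # "continuous crash recovery": keep replaying

# Crash recovery is the same replay, just starting from an empty state.
recovered = Standby()
recovered.replay(primary.wal)
```

The standby never sees the original statements, only their effects, which is why it ends up as an exact copy.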
- 24. What should you use?
With all those options the question that comes up is...
and since "it depends" is probably not a sufficient answer for most of you
- 26. Two flavors
• Log-Shipping
• Completed WAL-segments are copied to
slave and applied there
• Streaming replication
• WAL records are streamed to slave
servers as they are generated
• Can also be configured for synchronous
replication
Log-based replication in Postgres comes in two flavors
- 27. On WAL handling
• Server generates WAL with every
modifying operation, 16MB segments
• Normally gets rotated after successful
checkpoint
• Lots of conditions and config settings
that can change the behaviour
• Slave needs base copy from master + all
WAL files to reach consistent state
- 28. Master config
$ $EDITOR pg_hba.conf
host replication replication 192.0.2.0/24 trust
$ $EDITOR postgresql.conf
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = 32
http://wiki.postgresql.org/wiki/Streaming_Replication
http://www.postgresql.org/docs/current/static/warm-standby.html
This is a strict streaming replication example, no log archiving
If the slave server is offline too long, it needs to be freshly initialized from the
master.
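A back-of-the-envelope calculation for how long "too long" is with the settings above: wal_keep_segments = 32 at 16MB per segment means the master keeps at least 512MB of WAL around. The write rate used below is a made-up number purely for illustration; measure your own.

```python
SEGMENT_SIZE_MB = 16      # WAL segment size from the slides
wal_keep_segments = 32    # value from the master config above

retained_mb = SEGMENT_SIZE_MB * wal_keep_segments


def outage_budget_seconds(retained_mb, wal_rate_mb_per_s):
    """Rough time a slave can be offline before needed WAL is recycled."""
    return retained_mb / wal_rate_mb_per_s


# Hypothetical workload: the master generates 2MB of WAL per second.
budget = outage_budget_seconds(retained_mb, 2)
```

With these assumed numbers the slave may be gone for a little over four minutes; on a busier master the window shrinks accordingly.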
- 29. Slave config
$ pg_basebackup -R -D /path/to/cluster --host=master --port=5432 --username=replication
$ $EDITOR postgresql.conf
hot_standby = on
$ $EDITOR recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=master port=5432 user=replication'
trigger_file = '/path/to/trigger'
- 30. Caveats
• Slaves are 100% identical to master
• No selective replication (DBs,Tables, etc.)
• No slave-only indexes
• WAL segment handling can be tricky
• Slave Query conflicts due to master TXs
• Excessive disk space usage on master
• Broken replication due to already-recycled
segments on master
But when running log-based replication, there are things to look out for
- 31. Coming in 9.4
Q3 2014
All of this works out of the box with 9.3
There are a few new things coming in Postgres 9.4
- 32. Logical decoding
One of the most interesting additions is logical decoding
Master Server generates a list of tuple modifications
Similar to trigger-based replication, but much more efficient and easier to
maintain
Conceptually similar to the "row-based replication" format in MySQL
- 33. $ INSERT INTO z (whatever) VALUES ('row2');
INSERT 0 1
$ SELECT * FROM pg_logical_slot_get_changes('depesz', null, null, 'include-xids', '0');
location | xid | data
------------+-----+------------------------------------------------------------
0/5204A858 | 932 | BEGIN
0/5204A858 | 932 | table public.z: INSERT: id[integer]:1 whatever[text]:'row2'
0/5204A928 | 932 | COMMIT
(3 rows)
http://www.depesz.com/2014/03/06/waiting-for-9-4-introduce-logical-decoding/
Here's an example of what logical decoding will produce
You can find more extensive examples at Hubert "depesz" Lubaczewski's blog
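To give an idea of what a consumer has to do with this output, here is a small Python sketch that parses a data line of the shape shown in the example. It only targets that INSERT-style line; other change types and output plugins may format their data differently, and parse_change is a hypothetical helper, not part of Postgres.

```python
import re

# "table <schema.table>: <OPERATION>: <columns...>"
CHANGE_RE = re.compile(
    r"^table (?P<table>\S+): (?P<op>INSERT|UPDATE|DELETE): (?P<cols>.*)$")

# "name[type]:value" where value is either 'quoted' or a bare token
COLUMN_RE = re.compile(
    r"(?P<name>\w+)\[(?P<type>[^\]]+)\]:(?P<value>'(?:[^']|'')*'|\S+)")


def parse_change(data):
    """Turn one decoded data line into a dict, or None for BEGIN/COMMIT."""
    match = CHANGE_RE.match(data)
    if match is None:
        return None
    columns = {}
    for col in COLUMN_RE.finditer(match.group("cols")):
        value = col.group("value")
        if value.startswith("'"):
            # strip the surrounding quotes and undo doubled single quotes
            value = value[1:-1].replace("''", "'")
        columns[col.group("name")] = (col.group("type"), value)
    return {
        "table": match.group("table"),
        "op": match.group("op"),
        "columns": columns,
    }


change = parse_change(
    "table public.z: INSERT: id[integer]:1 whatever[text]:'row2'")
begin = parse_change("BEGIN")
```

All values come back as strings; mapping them to proper types on the receiving side is left out of this sketch.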
- 34. Replication slots
Replication slots are an additional feedback mechanism between slave and
master to communicate which WAL files are still needed
Also the backbone for logical replication
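The feedback idea can be modelled in a few lines of Python. This is a simplified sketch, assuming whole-segment granularity and a made-up Master class; real slots track WAL positions rather than segment numbers.

```python
class Master:
    def __init__(self):
        self.segments = []      # segment numbers still on disk
        self.next_segment = 0
        self.slots = {}         # slot name -> oldest segment still needed

    def create_slot(self, name):
        self.slots[name] = self.next_segment

    def write_segment(self):
        self.segments.append(self.next_segment)
        self.next_segment += 1

    def confirm(self, name, segment):
        """Slave reports: everything before `segment` has been received."""
        self.slots[name] = segment

    def recycle(self):
        """Drop only segments that no slot needs any more."""
        keep_from = min(self.slots.values(), default=self.next_segment)
        self.segments = [s for s in self.segments if s >= keep_from]


master = Master()
master.create_slot("standby1")

for _ in range(5):
    master.write_segment()        # segments 0..4
master.confirm("standby1", 3)     # slave still needs 3 and later
master.recycle()
after_confirm = list(master.segments)

for _ in range(3):
    master.write_segment()        # slave is offline, sends no feedback
master.recycle()
while_offline = list(master.segments)
```

The second half shows the trade-off: segments are no longer recycled out from under a slave, but a slave that stays away makes WAL pile up on the master.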
- 36. What's coming in 9.5+?
These were the things that are already included in 9.4;
for the coming development cycles there are already a few things in the pipeline
- 37. Logical replication
cont'd
What's currently missing is a reliable consumer for the data generated by 9.4
logical decoding
People, mostly Andres Freund from 2ndQuadrant, are working on this topic
and I expect that there's more to talk about next year with 9.5
It will be possible to build Galera-like systems with the infrastructure