PostgreSQL 9.4 introduced logical decoding, but what is it and how can it be used? Besides being a precursor to bi-directional replication, it has plenty of use cases, many of which don't even require you to implement a plugin. We'll look at trigger-less auditing, partial replication and full statement replication.
1. Fun Things to do with Logical Decoding
Mike Fowler
PGDay UK 2015
2. Fun Things to do with Logical Decoding
Mike Fowler, mike@mlfowler.com mike.fowler@rantandrave.com
PGDayUK 2015
Overview
● What is Logical Decoding?
● Replication Slots
● Output Plugins
● Enabling Logical Decoding
● What does Logical Decoding make possible?
3. About Me
● Head of Systems Engineering at Rant & Rave
● Been using PostgreSQL for over 10 years (since 7.4)
● Contributed some XML support
– XMLEXISTS/xpath_exists()
– xml_is_well_formed()
● Contributed a number of bugfixes to the JDBC
driver
4. What is Logical Decoding?
“The process of extracting all persistent changes to
a database's tables into a coherent, easy to
understand format which can be interpreted
without detailed knowledge of the database's
internal state”
“Implemented by decoding the contents of the
write-ahead log, which describe changes on a
storage level, into an application-specific form such
as a stream of tuples or SQL statements”
5. No, really - what is Logical Decoding?
● The results of logical decoding give a perspective on the
logical changes made during a transaction
● The process of logical decoding allows human
understanding of the physical changes made during a
transaction
● Achieved by creating a replication slot with a plugin to
produce data for a receiver
– Basic example plugin available in contrib
– New utility pg_recvlogical to consume plugin output
– Can roll your own to match your requirements
6. Some PostgreSQL Basics
● PostgreSQL is a transactional database
● All changes in PostgreSQL are recorded in a Write
Ahead Log (WAL) before any data on disk is actually
changed
● The WAL is the key to replication
– Essentially log based binary replication
– Master runs the walsender to transmit new segments
– Slaves run walreceivers to receive the segments
7. A Basic Example
We'll use a simple table called sample:
CREATE TABLE sample (
id SERIAL PRIMARY KEY,
data JSON,
captured TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
);
8. A Basic Example
INSERT INTO sample (data)
VALUES
('{"beverage":"coffee","temp":96}');
● The backend takes our INSERT and calculates the
values for the omitted columns id and captured
● The backend also determines exactly what needs to
be changed on disk and commits this to the WAL
9. Replication Slots
● 9.4 introduced replication slots
● Capability to retain WAL segments on the server
instead of relying on continuous archiving
● Allows master to retain segments even when a
slave is offline
● There are two types of replication slot
– Physical
– Logical
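The two slot types are created with different built-in functions; a minimal sketch (the slot names here are illustrative):

```sql
-- a physical slot, used by streaming replication clients
SELECT * FROM pg_create_physical_replication_slot('standby_slot');

-- a logical slot always names the output plugin that will decode for it
SELECT * FROM pg_create_logical_replication_slot('audit_slot', 'test_decoding');
```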
10. Logical Replication Slots
● Entries in a logical slot are WAL segments that have
been decoded by means of an output plugin
● Output plugins are responsible for transforming the
data from the WAL into the format the consumer of
a logical replication slot desires
● contrib/test_decoding is a working example of the
interface and can be used to play
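Slots created this way show up in the pg_replication_slots system view; a quick way to see what exists and whether anything is consuming it (columns as of 9.4):

```sql
SELECT slot_name, plugin, slot_type, active, restart_lsn
FROM pg_replication_slots;
```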
11. Output Plugin Rules
● Only transactions that have been successfully
flushed to disk are decoded
● Transactions that are rolled back never get decoded
● Changes to UNLOGGED or TEMPORARY tables are
also not decoded
● Concurrent transactions are decoded in commit
order
12. Making an Output Plugin
● Consists of building a shared library that implements a number of
functions as detailed in section 46.6 of the manual
– An initialization function
– Transaction begin, row change and transaction end callbacks
– Optional plugin start & stop callbacks
– 9.5 also adds an optional origin filter callback
● For a given transaction the begin & end is called once with the row
change called for each changed row even if multiple rows were
changed in a single statement
13. Output Plugin Considerations
● Keep the plugin simple
– It's running as part of your database
– It's going to process all your changes
● Offload complex filter/analysis to the slot consumer
● Drop the slot if it's no longer in use to stop using
server resources
SELECT pg_drop_replication_slot('pgday_slot');
14. Enabling Logical Decoding
● Enable replication slots (will require restart)
– wal_level must be set to 'logical'
– max_replication_slots must be at least 1
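In postgresql.conf the corresponding settings might look like this (the slot count is an arbitrary illustrative value):

```
wal_level = logical
max_replication_slots = 4
```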
● Create a replication slot
pgday15=# SELECT * FROM
pg_create_logical_replication_slot('pgday_slot',
'test_decoding');
slot_name | xlog_position
------------+---------------
pgday_slot | 0/16C9A78
(1 row)
15. Revisiting the Basic Example
We used a simple table called sample:
CREATE TABLE sample (
id SERIAL PRIMARY KEY,
data JSON,
captured TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
);
16. Revisiting the Basic Example
INSERT INTO sample (data)
VALUES ('{"beverage":"coffee","temp":96}');
● The backend took our INSERT and calculated the values for
the omitted columns id and captured
● The backend also determined exactly what needed to be
changed on disk and committed it to the WAL
● This time the test_decoding plugin was invoked when the
WAL segment entered the logical replication slot
17. Consuming the Plugin Output
● All data from the plugin will remain in the slot until
consumed
● Data can be consumed in two ways
– SQL interface
● Useful for experimentation and demonstration
● Can be used to form a basic programmatic integration with non-C
applications (e.g. Java using JDBC)
– pg_recvlogical
● Similar to pg_receivexlog
● Can be used to output to a file for later consumption
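With the SQL interface the difference between peeking and consuming matters; both functions are in core (arguments: slot name, upto_lsn, upto_nchanges):

```sql
-- peek: returns changes but leaves them in the slot
SELECT * FROM pg_logical_slot_peek_changes('pgday_slot', NULL, NULL);

-- get: returns changes and removes them from the slot
SELECT * FROM pg_logical_slot_get_changes('pgday_slot', NULL, NULL);
```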
The Output from our Example
pgday15=# SELECT * FROM
pg_logical_slot_get_changes('pgday_slot', NULL, NULL);
location | xid | data
-----------+-----+----------------------------------
0/16C9ED0 | 723 | BEGIN 723
0/16C9ED0 | 723 | table public.sample: INSERT:
id[integer]:1
data[json]:'{"beverage":"coffee","temp":96}'
captured[timestamp without tz]:'2015-07-03 19:52:39.497161'
0/16CA000 | 723 | COMMIT 723
(3 rows)
pg_recvlogical
● Use pg_recvlogical to record to a single file
– -f option to specify output file
– -F to specify frequency of fsync in seconds
● Run pg_recvlogical on a separate server
– -h to specify hostname or ip
– -p port
● Can also create/delete slots
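Putting those options together, a typical invocation might look like the following (the hostname is illustrative, the slot name is the one created earlier, and a running server is assumed):

```shell
# Stream decoded changes from pgday_slot into a file,
# fsyncing the output file every 10 seconds
pg_recvlogical -h db.example.com -p 5432 -d pgday15 \
    --slot pgday_slot --start -f changes.log -F 10
```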
Some simple LD Use Cases
● Trigger-less auditing
– Keep a record of all changes in transaction order
– Change only query log
● Trigger auditing
● Basic throughput analysis
Limitations of test_decoding
● Only details the new/changed row
● No detail of a WHERE clause
● No timestamps in output to help cross reference
● Changes to indexed key columns don't make for easy reading
table public.sample: UPDATE: old-key: id[integer]:3 new-tuple: id[integer]:4
data[json]:'{"beverage":"coffee","temp":96}' captured[timestamp without time
zone]:'2015-07-07 00:07:56.305964'
Enhancing test_decoding
● Plugins can use most of the backend infrastructure
– Can use output functions
– Read-only access to relations
– Cannot use anything that requires a transaction
● Information logged to the WAL can be tuned with
REPLICA IDENTITY
– Can be adjusted per table with ALTER TABLE
– DEFAULT keeps the old values of the PRIMARY KEY columns
– USING INDEX keeps the values of the columns covered by the index
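For example, against the sample table from earlier (the index name assumed here, sample_pkey, is the default one created for its SERIAL PRIMARY KEY):

```sql
-- Log the entire old row on UPDATE and DELETE
ALTER TABLE sample REPLICA IDENTITY FULL;

-- Or log only the old values of columns covered by a given unique index
ALTER TABLE sample REPLICA IDENTITY USING INDEX sample_pkey;

-- Or log nothing about the old row at all
ALTER TABLE sample REPLICA IDENTITY NOTHING;
```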
Example: decoder_raw
● Enhanced version of test_decoding by Michael Paquier
● Appends WHERE clauses based on data derived from
REPLICA IDENTITY
– If DEFAULT, PRIMARY KEY column(s) appear in WHERE
● Removes the commit number from BEGIN and COMMIT
● Available on GitHub:
https://github.com/michaelpq/pg_plugins/tree/master/decoder_raw
decoder_raw Output
=# SELECT slot_name
FROM
pg_create_logical_replication_slot('custom_slot',
'decoder_raw');
slot_name
--------------
custom_slot
(1 row)
=# CREATE TABLE aa (a int primary key, b text);
CREATE TABLE
=# INSERT INTO aa VALUES (1, 'aa'), (2, 'bb');
INSERT 0 2
=# UPDATE aa SET b = 'cc' WHERE a = 1;
UPDATE 1
=# DELETE FROM aa WHERE a = 1;
DELETE 1
=# SELECT data
FROM pg_logical_slot_peek_changes('custom_slot',
NULL, NULL, 'include-transaction', 'on');
data
----------------------------------------------------
BEGIN;
COMMIT;
BEGIN;
INSERT INTO public.aa (a, b) VALUES (1, 'aa');
INSERT INTO public.aa (a, b) VALUES (2, 'bb');
COMMIT;
BEGIN;
UPDATE public.aa SET a = 1, b = 'cc' WHERE a = 1 ;
COMMIT;
BEGIN;
DELETE FROM public.aa WHERE a = 1 ;
COMMIT;
(12 rows)
Partial Replication
● Can be achieved in two ways
– Plugin can filter out unneeded tables
– Slot consumer can discard unneeded tables
● Useful if you have write nodes independent of a
cluster
● Streaming partial replication is possible
– If output of plugin is well structured pg_recvlogical
could be piped straight into a psql client
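With a plugin like decoder_raw, whose output is plain SQL, that pipeline is a one-liner (the standby hostname is illustrative; -f - sends the decoded stream to stdout):

```shell
# Replay decoded SQL statements onto another PostgreSQL server as they arrive
pg_recvlogical -d pgday15 --slot custom_slot --start -f - \
    | psql -h standby.example.com -d pgday15
```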
Full Statement Replication
● Changes can be streamed to other DBMS platforms
– Allows replication to non-PostgreSQL servers
– Useful for keeping a backup during a migration to
PostgreSQL
● Differs from most forms of statement replication
– The master executes the statement and determines the
resulting changes
– Column values have already been computed, so they will be
identical on the replica (e.g. CURRENT_TIMESTAMP)
Conclusion
● Logical Decoding exposes meta-data about
committed transactions
● Plugins can be written to enrich and format the
meta-data for placement in replication slots
● Data in replication slots can be consumed by
external programs using SQL or
pg_recvlogical to perform a number of useful
tasks
Questions?
Mike Fowler
PGDay UK 2015
Thank you!
Mike Fowler
PGDay UK 2015