Latest Blog Posts

What tables were touched within given range of wal LSN?
Posted by Hubert 'depesz' Lubaczewski on 2024-07-19 at 15:00
We had a case recently where one of our DBs failed over to a new primary. To this server (the old primary, and then the new primary) we had a connection from some kind of CDC tool (Debezium, I think). The thing is that while there was a failover, this tool read (using logical decoding) changes on the old primary … Continue reading "What tables were touched within given range of wal LSN?"
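
(One way to approach the question in the title, as a sketch of my own rather than necessarily the approach from the post: list the WAL records in the LSN range with pg_waldump --start <LSN> --end <LSN>, then map each relfilenode it reports back to a table. The relfilenode 16385 below is a made-up example.)

SELECT pg_filenode_relation(0, 16385);   -- 0 means the database's default tablespace
-- or, for any tablespace:
SELECT relname FROM pg_class WHERE relfilenode = 16385;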

Magic Tricks for Postgres psql: Settings, Presets, Echo, and Saved Queries
Posted by Elizabeth Garrett Christensen in Crunchy Data on 2024-07-19 at 12:00

As I’ve been working with the Postgres psql CLI, I’ve picked up a few good habits from my Crunchy Data co-workers that make my terminal database environment easier to work with. I wanted to share a couple of my favorite things I’ve found that make getting around Postgres better. If you’re just getting started with psql, or haven’t ventured too far out of the defaults, this is the post for you. I’ll walk you through some of the friendliest psql settings and how to create your own preset settings file.

Some of the most helpful psql commands

Formatting psql output

Postgres has an expanded display mode, which presents your query results as batches of column names and values, instead of a huge wide list of columns that expands the display to the right.

A sample expanded display looks like this:

-[ RECORD 1 ]------------------------------
id         | 1
name       | Alice Johnson
position   | Manager
department | Sales
salary     | 75000.00
-[ RECORD 2 ]------------------------------
id         | 2
name       | Bob Smith
position   | Developer
department | Engineering
salary     | 65000.00
--Automatically format expanded display for wide columns
\x auto

I have a tutorial up about using basic psql if you’re just getting started and want to try these commands out.

Table column borders in psql output

If you’re not using the expanded display, you can have psql draw some fancy column outlines with the \pset linestyle setting.

--Outline table borders and separators using Unicode characters
\pset linestyle unicode

That will get you query output that looks like this:

┌────┬───────┬─────┐
│ id │ name  │ age │
├────┼───────┼─────┤
│  1 │ Alice │  30 │
│  2 │ Bob   │  25 │
└────┴───────┴─────┘

Show query run times in psql

This will show, at the bottom of the output, how long the query took to run, in milliseconds:

-- Always show query time
\timing

Create a preset for your null values in psql

This will work with emojis or really anything UTF-8 compatible:

-- Set Null char output to differentiate 
[...]
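
(The setting in question is psql's null display string; a minimal example of my own, not the elided snippet from the post:)

\pset null '∅'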

Row pattern recognition feature for PostgreSQL
Posted by Tatsuo Ishii in SRA OSS LLC on 2024-07-19 at 06:14

What is row pattern recognition feature?

Row pattern recognition (RPR) is a feature defined in the SQL standard. It allows you to search for a sequence of rows matching a pattern. Since I am working on this feature for PostgreSQL, I would like to give a brief introduction to RPR.
 
Consider a table holding date and daily stock price of a company. 
 company  |   tdate    | price
----------+------------+-------
 company1 | 2024-07-01 |   100
 company1 | 2024-07-02 |   200
 company1 | 2024-07-03 |   150
 company1 | 2024-07-04 |   140
 company1 | 2024-07-05 |   150
 company1 | 2024-07-06 |    90
 company1 | 2024-07-07 |   110
 company1 | 2024-07-08 |   130
 company1 | 2024-07-09 |   120
 company1 | 2024-07-10 |   130
(10 rows)
Suppose you want to find a sequence of rows in which the stock price rises once or more and then falls once or more. For example, the stock price on July 1 is 100, then rises to 200 on July 2, and then falls to 150 and 140 on July 3 and July 4. RPR allows users to write this kind of query in an intuitive way.
 

RPR syntax

 To express the query in RPR, you first define row pattern  variables.
 
DEFINE
START AS TRUE,
UP AS price > PREV(price),
DOWN AS price < PREV(price)

Here DEFINE is a keyword to start the definition. START, UP and DOWN are row pattern variable names. The right-hand side of AS is a logical expression which the pattern variable needs to satisfy. The logical expression "TRUE" means always true. PREV() is a special function that can only be used in RPR (it cannot be used elsewhere, for example in a SELECT target list or a WHERE clause). PREV takes a column name as an argument, and returns the previous row's value for that column. So if today's price is higher than yesterday's price, "price > PREV(price)" returns TRUE. DOWN has the opposite meaning.

Once the row pattern variables are defined, you can define patterns to search for by using the row pattern variables.

PATTERN (START, UP+, DOWN+)

Here PATTERN is a k

[...]
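
(For orientation, here is a sketch of how DEFINE and PATTERN could fit together in a complete query, written against a hypothetical stock table with the columns shown above. The window-clause placement follows the proposal being worked on for PostgreSQL; the exact syntax is still in flux and may differ from the eventual implementation.)

SELECT company, tdate, price,
       first_value(price) OVER w AS start_price,
       last_value(price) OVER w AS end_price
FROM stock
WINDOW w AS (
  PARTITION BY company
  ORDER BY tdate
  ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  PATTERN (START UP+ DOWN+)
  DEFINE
    START AS TRUE,
    UP AS price > PREV(price),
    DOWN AS price < PREV(price)
);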

pgroll 0.6.0 update
Posted by Andrew Farries on 2024-07-19 at 00:00
Learn about the latest changes in pgroll in the 0.6.0 release as we continue to build and turn it into a first-class open-source schema migration tool for Postgres.

PostgreSQL Performance Farm 2024 Progress Update
Posted by Mark Wong on 2024-07-18 at 00:26

It feels like it was time to revisit the PostgreSQL Performance Farm that Tomas Vondra envisioned back in 2010.  Between that and several Google Summer of Code iterations starting in 2018, the project just didn't seem to gain enough traction.  See Ilaria Battiston's presentation in 2022 for a demo.

On a whim, I spent a few days proofing whether something like BuildBot might work. The idea was to see if I could get something basic working, in order to get a feel for whether leveraging an established continuous integration framework might be worth the trade-offs compared to building our own system. The result was a system running 4 TPC-derived workloads that simply run the tests quickly, not necessarily interestingly, just to produce data.

I added some additional desired functionality after a few more days:

  • Trigger only when there are code changes.
  • Define Buildbot worker-specific test and PostgreSQL parameters.
  • Override the Buildbot worker-defined test and PostgreSQL parameters.
  • Submit patches against PostgreSQL and the test kits.
How I implemented the added functionality probably needs some review, but I liked how things were looking enough to spend a little more time on it.

Next I wanted to see how much effort is needed to mine the data, especially since the results of the tests are not captured in a structured way. I'm pretty sure it would take me much longer than a few days to write a new BuildBot view, so I instead opted to try scripting something to munge the data and plot metrics. I declared success by being able to do so quickly. Here's an example of results from one system, vanillaleaf, a Buildbot worker, showing the DBT-2 test metric per commit per branch:


The PostgreSQL community has access to a couple of systems for processor and memory intensive workloads, thanks to Equinix Metal, OSUOSL and IBM.  Both systems are continuing to run small tests so if building an interface to peruse results is successful, then I'll have more to show in the near future.

Postgres major version upgrades with minimal downtime
Posted by Carlos Pérez-Aradros Herce on 2024-07-18 at 00:00
With the beta release of dedicated clusters, we've added the ability to move branches between clusters. Xata customers can now perform Postgres major version upgrades with minimal downtime.

PostgreSQL Berlin July 2024 Meetup
Posted by Andreas Scherbaum on 2024-07-17 at 15:09

On July 16th, 2024, we had the PostgreSQL July Meetup in Berlin. Adjust hosted and Neon sponsored the Meetup at their Berlin headquarters in Prenzlauer Berg, near the TV Tower.

Exploring PostgreSQL 17: A Developer’s Guide to New Features – Part 3: The COPY Command Gets More User-Friendly
Posted by Deepak Mahto on 2024-07-17 at 14:12

PostgreSQL 17 Beta was released on May 23, 2024, introducing a host of exciting new features anticipated to be part of the official PostgreSQL 17 release. In this blog series, we’ll delve into these features and explore how they can benefit database developers and migration engineers transitioning to the latest PostgreSQL version.

The first part of the blog was on newer features of PL/pgSQL – the procedural language in PostgreSQL 17.

The second part of the blog was on newer features related to NOT NULL constraints and performance in PostgreSQL 17.

PostgreSQL 17 – Enhancements with the COPY Command

The COPY command is the default native option for loading flat files, primarily in text or CSV format, into PostgreSQL. If you are a data engineer or database developer looking for ways to load flat files into PostgreSQL, you must take care of quality checks and encapsulate a mechanism to report or discard failed records.

Until PostgreSQL 16, if quality checks were to be done on the database side, one option was to preload flat files into staging tables with all columns as text or a generic data type, and later move legitimate and discarded data to the relevant tables.

With PostgreSQL 17, the COPY command is more user-friendly and provides options for handling data type incompatibilities and logging failed rows or records. Based on your use case, you now have the flexibility to avoid staging tables and to load directly into your preferred target tables, with all quality checks on data types.

PostgreSQL 17 – ON_ERROR/LOG_VERBOSITY

Let’s start by creating sample target tables and a CSV file with intentional data type issues. The sample table also has check and NOT NULL constraints defined on it.

CREATE TABLE copy17 (
    col1 integer,
    col2 text DEFAULT 'x'::text,
    col3 text NOT NULL,
    col4 text,
    col5 boolean,
    col6 date,
    col7 timestamp without time zone,
    CONSTRAINT copy17_col4_check CHECK (length(col4) > 2)
);

The CSV file contain

[...]
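
(To illustrate the two new options, a minimal sketch; the file path is an assumption, since the excerpt cuts off before the author's example. ON_ERROR ignore skips rows that fail data type conversion instead of aborting the whole load, and LOG_VERBOSITY verbose emits a NOTICE for each skipped row.)

COPY copy17 FROM '/tmp/copy17.csv'
WITH (FORMAT csv, ON_ERROR ignore, LOG_VERBOSITY verbose);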

PGDay UK 2024 - Schedule published
Posted by Dave Page in PGDay UK on 2024-07-16 at 10:35

Join us on 11th September 2024 in London, for a day of talks on the World's Most Advanced Open Source Database coupled with the usual valuable hallway track. This event is aimed at all users and developers of PostgreSQL and is your chance to meet and exchange ideas and knowledge with like-minded database fanatics in London.

Schedule

We are pleased to announce that the schedule for PGDay UK 2024 has now been published. You can see the fantastic selection of talks we have planned at:

https://2024.pgday.uk/schedule/

The team would like to thank all those who submitted talks, as well as the program committee who had a long and difficult job selecting the talks!

Registration

Registration for attendees is now open. For more information and to secure your seat, please visit:

https://2024.pgday.uk/registration/

Sponsors

Sponsor the event and take your chance to present your services or products to the PostgreSQL community - or see it as a give-back opportunity. The benefactor sponsorship level also includes a free entrance ticket. Please head to:

https://2024.pgday.uk/become-sponsor/

for more details.

See you there!

As usual, if you have any questions, don't hesitate to contact us at contact@pgday.uk.

We look forward to seeing you in London in September!

PGDay UK 2024 is a PostgreSQL Europe event run according to the PostgreSQL Community Conference Recognition programme. Local logistics and other services are provided by Slonik Enterprises Ltd. on a not-for-profit basis.


Keyset pagination with descending order
Posted by Laurenz Albe in Cybertec on 2024-07-16 at 05:49
Banner image © Markus Winand 2014

Keyset pagination is the most performant way to retrieve a large result set page by page. However, the neat trick with composite type comparison doesn't always work. This article explains why and how you can work around that shortcoming.

An example table for paginated queries

We create a table with a million rows:

CREATE TABLE sortme (
   id bigint PRIMARY KEY,
   val1 integer NOT NULL,
   val2 timestamp with time zone NOT NULL,
   val3 text NOT NULL
);

-- deterministic, but sort of random values
INSERT INTO sortme
SELECT i,
       abs(hashint8(i)) % 200 + 1,
       TIMESTAMPTZ '2024-01-01 00:00:00' +
          (abs(hashint8(-i)) % 200) * INTERVAL '1 hour',
       substr(md5(i::text), 1, 2)
FROM generate_series(1, 1000000) AS i;

-- get statistics and set hint bits
VACUUM (ANALYZE) sortme;

Normal keyset pagination

If we want to ORDER BY val1, val2, the first query would look like

SELECT val1, val2, val3, id
FROM sortme
ORDER BY val1, val2, id
LIMIT 50;

The primary key id was added to the result list and the ORDER BY clause to provide uniqueness and thereby guarantee a stable order. Assuming that the last row returned from that query was (1, '2024-01-01 01:00:00+01', 'e3', 920198), the query for the next page would be

SELECT val1, val2, val3, id
FROM sortme
WHERE (val1, val2, id) > (1, '2024-01-01 01:00:00+01', 920198)
ORDER BY val1, val2, id
LIMIT 50;

With a multi-column index on (val1, val2, id), both queries would be lightning fast, as would be the queries for all following pages.

The problem with keyset pagination in descending and mixed order

Matters become more complicated as soon as we need descending sort order. A fairly simple case is if we need to sort the result by a single column (for example, val3). Again, we need the unique column id as a tie-breaker. The solution is to use descending sort order for both columns:

-- first page
SELECT val1, val2, val3, id
FROM sortme
ORDER BY val3 DESC, id DESC
LIMIT 50;

-- next pa
[...]
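
(The excerpt cuts off here; the natural continuation for this all-descending case is a sketch like the one below, assuming for illustration that the previous page ended with val3 = 'e3' and id = 920198. With both ORDER BY columns descending, the composite comparison simply flips from > to <, and an index on (val3 DESC, id DESC), or on (val3, id) scanned backwards, supports both queries.)

-- next page
SELECT val1, val2, val3, id
FROM sortme
WHERE (val3, id) < ('e3', 920198)
ORDER BY val3 DESC, id DESC
LIMIT 50;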

Introducing multi-version schema migrations
Posted by Andrew Farries on 2024-07-16 at 00:00
Today's release of multi-version schema migrations addresses one of the most common pain points of application deployment - keeping your application code and database schema in sync. You can now present two versions of your schema, both old and new, to client applications.

RFC: PGXN Meta Spec v2
Posted by David Wheeler in Tembo on 2024-07-15 at 19:15

Two bits of news on the “PGXN v2” project.

PGXN RFCs: The Book

First, I’ve moved the RFC process (again, sorry) from PGXN Discussions, which were a bit fussy about Markdown formatting and don’t support inline comments, to the PGXN RFCs project, where the use of pull requests on CommonMark Markdown documents addresses these issues. This process borrows heavily from the Rust RFCs project, right down to publishing accepted RFCs as a “book” site.

So I’d also like to introduce rfcs.pgxn.org, a.k.a., the PGXN RFCs Book.

It currently houses only one RFC: Meta Spec v1, dating from 2010. This document defines the structure of the META.json file required in archives published on PGXN.

But I expect many more RFCs to be drafted in the coming years, starting with draft RFC–2, the binary distribution RFC I POCed a few weeks ago. There has already been some great feedback in that pull request, in addition to the previous discussion. More eyes will make it even better.

PGXN Meta Spec v2 RFC

Last week I also iterated on the PGXN Metadata Sketch several times to produce draft RFC–3: Meta Spec v2. This represents a major reworking of the original spec in an attempt to meet the following goals:

  • Allow more comprehensive dependency specification, to enable packagers to identify and install system dependencies and dependencies from other packaging systems, like PyPI and CPAN
  • Adopt more industry-standard formats like SPDX License Expressions and purls.
  • Improve support for multiple types of Postgres extensions, including apps, LOADable modules, background workers, and TLEs.
  • Improve curation and evaluation via categories, badging, and additional download links.

There’s a lot here, but I hope the result can better serve the community for the next decade, and enable lots of new services and features.

The proof will be in the applicatio

[...]

Implementing UUIDs v7 in pure SQL
Posted by Daniel Vérité on 2024-07-15 at 18:14
In May 2024, the IETF standard on UUIDs (Universally Unique IDentifiers) was updated with RFC 9562, finally making UUID Version 7 official. This version is known to be a much better choice for database indexes than previous ones, since its values are generated in an already-sorted sequence. PostgreSQL does not yet have a built-in function to generate UUIDs v7, but of course several extensions do exist. The ones I found tend to require a compilation step and superuser privileges to install, as they’re written in “untrusted languages” like C or Rust. However, UUID-v7 functionalities can be implemented in pure SQL so they can be installed easily everywhere. In this post, let’s see how to do that.
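
(As a taste of what “pure SQL” can look like here, below is a commonly circulated sketch of a UUIDv7 generator, not necessarily the implementation from the post. It overwrites the first 48 bits of a random UUID with the Unix epoch in milliseconds and then flips the version nibble to 7; gen_random_uuid() requires PostgreSQL 13 or later.)

CREATE OR REPLACE FUNCTION uuid_generate_v7() RETURNS uuid
LANGUAGE sql VOLATILE
AS $$
  SELECT encode(
    set_bit(
      set_bit(
        -- replace the first 6 bytes with the millisecond Unix timestamp
        overlay(uuid_send(gen_random_uuid())
                PLACING substring(int8send((extract(epoch FROM clock_timestamp()) * 1000)::bigint) FROM 3)
                FROM 1 FOR 6),
        52, 1),  -- these two bits turn the version nibble from 4 into 7
      53, 1),
    'hex')::uuid;
$$;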

Autovacuum Tuning Basics
Posted by Tomas Vondra in EDB on 2024-07-15 at 17:16

A few weeks ago I covered the basics of tuning checkpoints, and in that post I also mentioned autovacuum as the second common source of performance issues (based on what we see on the mailing list and at our customers). Let me follow up on that with this post about how to tune autovacuum, to minimize the risk of performance issues. In this post I'll briefly explain why we even need autovacuum (dead rows, bloat, and how autovacuum deals with them), and then move to the main focus of this blog post - tuning. I’ll go over all the relevant configuration options, and some basic rules for tuning them.
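
(For a flavor of the knobs involved, these are the kinds of settings such tuning touches; the values below are illustrative examples, not recommendations from the post, and busy_table is a hypothetical name.)

-- make autovacuum trigger earlier and let it do more work per round
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
SELECT pg_reload_conf();

-- or per table, for a particularly write-heavy one
ALTER TABLE busy_table SET (autovacuum_vacuum_scale_factor = 0.01);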

Ozgun Erdogan
Posted by Andreas 'ads' Scherbaum on 2024-07-15 at 13:15
PostgreSQL Person of the Week Interview with Ozgun Erdogan: I’m originally from Istanbul. After college, I moved to the Bay Area for grad school and then up to Seattle for my first programming gig at Amazon. After about four years, I wanted to learn more and co-founded Citus Data with two of the smartest guys I know.

How expensive is it to maintain extended statistics?
Posted by Andrei Lepikhov in Postgres Professional on 2024-07-14 at 23:24

In the previous post, I passionately advocated for integrating extended statistics and, moreover, creating them automatically. But what if it is too computationally demanding to keep statistics fresh?

This time, I will roll up my sleeves, get into the nitty-gritty and shed light on the burden extended statistics put on the digital shoulders of the database instance. Let's set aside the cost of using this type of statistics during planning and focus on one aspect - how much time we will spend in an ANALYZE command execution.

I understand how boring numbers look sometimes, as well as benchmarks. However, my vast experience in computational physics and analysing long listings full of numbers shows that it can be a fantastic source of inspiration.

So, let's start and create a test table:

DROP TABLE IF EXISTS bench;
CREATE TABLE bench (
  x1 integer, x2 integer, x3 integer, x4 integer,
  x5 integer, x6 integer, x7 integer, x8 integer
) WITH (autovacuum_enabled = false);
INSERT INTO bench (x1,x2,x3,x4,x5,x6,x7,x8) (
  SELECT x%11,x%13,x%17,x%23,x%29,x%31,x%37,x%41
  FROM generate_series(1,1E6) AS x
);

Why eight columns, you might wonder? - This deliberate choice is due to the hard limit of extended statistics - STATS_MAX_DIMENSIONS, which allows only eight columns or expressions in its definition clause.

Let me compare the performance of plain statistics with extended ones. I'll consider different variations of 'ndistinct', 'MCV' and 'dependencies' types. Additionally, I'll include a comparison with a statistic type called 'Joinsel', which builds and uses histograms, MCV, and distinct statistics over a predefined set of columns, treating them as a single value of a composite type. It can be found in standard and enterprise variants of the private Postgres Professional fork.

To measure execution time, use "\timing on". I'm going to observe how much time it takes if we build statistics over two, four and eight columns. I will take the surrogate test for plain statistics by creating the 'bench'

[...]
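
(For reference, extended statistics of the kinds compared above are declared roughly like this; an illustrative sketch against the bench table, since the exact definitions used in the benchmark are not shown in the excerpt.)

CREATE STATISTICS bench_stat (ndistinct, dependencies, mcv)
  ON x1, x2, x3, x4 FROM bench;
\timing on
ANALYZE bench;  -- the time reported here is what the benchmark measures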

Use pg_easy_replicate for setting up Logical Replication and Switchover in PostgreSQL
Posted by Shayon Mukherjee on 2024-07-13 at 13:11
Logical replication is a powerful feature in PostgreSQL that allows for real-time data replication between databases. It can be used for performing major version upgrades using a blue/green setup where you have two databases, allowing you to test and switch over to a new version with minimal downtime. Logical replication can also be used to facilitate database migrations between different environments, using the same technique and tooling. In this post, I will describe the process of setting up simple replication and switchover between two databases using pg_easy_replicate.
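
(Under the hood, the logical replication that pg_easy_replicate orchestrates boils down to a publication on the source and a subscription on the target; a hand-written sketch with made-up connection details, not the tool's own commands.)

-- on the source database
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- on the target database
CREATE SUBSCRIPTION upgrade_sub
  CONNECTION 'host=source-host dbname=app user=replicator'
  PUBLICATION upgrade_pub;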

SaaS on Rails on PostgreSQL — POSETTE 2024
Posted by Andrew Atkinson on 2024-07-13 at 00:00

In this talk attendees will learn how Ruby on Rails and PostgreSQL can be used to create scalable SaaS applications, focusing on schema and query design, and leveraging database capabilities.

We’ll define SaaS concepts, B2B, B2C, and multi-tenancy. Although Rails doesn’t natively support SaaS or multi-tenancy, solutions like Bullet Train and Jumpstart Rails can be used for common SaaS needs.

Next we’ll cover database designs from the Apartment and acts_as_tenant gems which support multi-tenancy concepts, then connect their design concepts to Citus’s row and schema sharding capabilities from version 12.0.

We’ll also cover PostgreSQL’s LIST partitioning and how to use it for efficient detachment of unneeded customer data.
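
(As an illustration of that idea, my own sketch rather than one from the talk: LIST partitioning by tenant lets one customer's data be detached as a unit. Table and tenant names are hypothetical.)

CREATE TABLE events (
    tenant_id bigint NOT NULL,
    payload   jsonb
) PARTITION BY LIST (tenant_id);

CREATE TABLE events_tenant_42 PARTITION OF events FOR VALUES IN (42);

-- later, when that tenant's data is no longer needed
ALTER TABLE events DETACH PARTITION events_tenant_42;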

We’ll cover the basics of leveraging Rails 6.1’s Horizontal Sharding for database-per-tenant designs.

Besides the benefits for each tool, limitations will be described so that attendees can make informed choices.

Attendees will leave with a broad survey of building multi-tenant SaaS applications, having reviewed application level designs and database designs, to help them put these into action in their own applications.

💻 Slide Deck

🎥 YouTube Recording

Checking Your Privileges, 2
Posted by Christophe Pettus in pgExperts on 2024-07-10 at 17:48

I turned the last blog post into a talk; you can get the slides here.

Guide to Auditing and Monitoring Access in PostgreSQL
Posted by Umair Shahid in Stormatics on 2024-07-10 at 08:52

In the data-driven world of today, maintaining the security and integrity of your database is paramount. Auditing and monitoring access to your database are critical components of an effective security strategy. These processes help ensure that only authorized users are accessing sensitive information and that any unauthorized access attempts are detected and addressed promptly.

PostgreSQL is renowned for its robustness, extensibility, and adherence to standards. It offers a rich set of features for auditing and monitoring, which can be leveraged to enhance security and compliance.

Purpose and Scope of the Blog:

This blog aims to provide an in-depth guide on auditing and monitoring access in PostgreSQL. We will explore various tools, techniques, and best practices to help you implement effective auditing and monitoring strategies in your PostgreSQL environment.

The post Guide to Auditing and Monitoring Access in PostgreSQL appeared first on Stormatics.

Mentoring Program Updates
Posted by Robert Haas in EDB on 2024-07-09 at 14:04

Here are a few updates on the mentoring program that I announced in a previous blog post.

First, I ended up keeping applications open for 12 days. During that time, I got 34 applications. Many of the 9 committer-mentors who initially offered to mentor one person ended up deciding that they would be willing to mentor two people, so I was able to send out 14 acceptances, which I think is a great result, especially considering that this is the first time we've done something like this.

Read more »

CloudNativePG Recipe 9 - Inspecting the network and the storage in a CloudNativePG cluster
Posted by Gabriele Bartolini in EDB on 2024-07-09 at 11:20

In this blog post, I delve into the crucial aspects of networking and storage within a CloudNativePG cluster deployed on Kubernetes. I explore how Kubernetes services manage external and internal communications for PostgreSQL clusters, ensuring high availability and seamless failover. Additionally, I examine the role of PersistentVolumeClaims in handling PostgreSQL data storage, offering insights into effective resource management. This article provides an example of the kind of knowledge DBAs need to acquire when managing PostgreSQL in cloud-native environments, highlighting the importance of collaboration with infrastructure teams and developers to ensure robust and resilient cluster operations.

Using AI to generate data structures for PostgreSQL
Posted by Hans-Juergen Schoenig in Cybertec on 2024-07-09 at 05:26

Artificial Intelligence is the motto of the day. Everybody is talking about it, everybody seems to know what it means and the insane number of buzzwords floating around on the internet is just mind-boggling. It seems that in some areas of the industry, the number of slides and marketing videos exceeds the number of real-world use cases and sometimes even the amount of code that actually exists.

However, as I tend to prefer the real PostgreSQL world over the fake marketing side of things, I decided to show some of the things we recently experimented with, to give you some insights into what can be done with AI and to point out where the limitations are.

First step: Generating data models for PostgreSQL

One of the most obvious things one might do with AI is to help people create basic data models. This use case presents itself as a perfect fit for all the new LLMs (Large Language Models) including ChatGPT and similar models.

For this little demo, I used a local deployment of “llama3”, which has been provided by Meta. It is a general-purpose model, but it happens to do a decent job and produces reasonable drafts that can serve as the basis for expanding your SQL code quickly and efficiently.

How can anybody make use of an LLM when dealing with PostgreSQL? Well, the first thing to do is to install an Open Source tool called “ollama”. It will help you to download, deploy, and run the language model of your choice. Ollama runs as a server and offers a really simple interface. I ran these models on my local Apple M1 chip. OS X is able to automatically make use of your GPU. On Linux, a bit more effort is needed to run on your NVIDIA chip, but you will also be rewarded with better response times in general, assuming you are not using an outdated chip.

In general, I found it useful to use Python to interact with AI models. However, for testing, the command line works just fine.

Generating PostgreSQL code using Ollama:

Before we dive into the Python code we can use the command line interf

[...]

Say hello to the Talking Postgres podcast
Posted by Claire Giordano in CitusData on 2024-07-09 at 00:03

The TL;DR of this blog post is simple: the “Path To Citus Con” podcast for developers who love Postgres has been renamed—and the new name is Talking Postgres.

And if you’re just hearing about the Talking Postgres podcast for the first time, it is a monthly podcast for developers who love Postgres, with amazing guests from the Postgres world who talk about the human side of Postgres, databases, and open source.

Listening to the Talking Postgres podcast is the next best thing to being in the hallway at a Postgres conference, eavesdropping on other people’s conversations and learning from the experiences of experts. As Floor Drees says, it’s as if you’re sharing a coffee with them.

Past podcast guests include (in order of appearance) some amazing Postgres, database, and open source people such as: Simon Willison, Marco Slot, Abdullah Ustuner, Burak Yucesoy, Melanie Plageman, Samay Sharma, Álvaro Herrera, Boriss Mejías, Grant Fritchey, Ryan Booz, Chelsea Dole, Floor Drees, Andres Freund, Heikki Linnakangas, Dimitri Fontaine, Vik Fearing, Lukas Fittl, Rob Treat, Jelte Fennema-Nio, Marco Slot, Arda Aytekin, Chris Ellis, Michael Christofides, Aaron Wislang, and Teresa Giacomini. The podcast is produced by the Postgres team at Microsoft—and I have the privilege of being your host.

So whether you’re an existing listener or new to this podcast, we hope you enjoy the Talking Postgres podcast—and help to spread the word about the new name.

Talking Postgres logo
Figure 1: The new “Talking Postgres with Claire Giordano” podcast name (formerly called Path To Citus Con) is depicted here with the same elephant mascot we’ve always used.

Some key things to know about the Talking Postgres podcast

  • Why did we rename the podcast?: Guests & friends repeatedly—and I mean repeatedly—nudged us to rename the podcast to a name that makes it more clear what the podcast is about. And at the end of the day it’s about Postgres things! So while the podcast was born in March 2023 as a pre-event to last year’s Cit
[...]

Automated Major Version Upgrades in Percona Operator for PostgreSQL
Posted by Sergey Pronin in Percona on 2024-07-08 at 15:41
PostgreSQL major versions are released every year, with each release delivering better performance and new features. With such rapid innovation, it is inevitable that there will be a need to upgrade from one version to another. Upgrade procedures are usually very complex and require thorough planning. With the 2.4.0 release of Percona Operator for PostgreSQL, […]

Access Controls in PostgreSQL
Posted by Umair Shahid in Stormatics on 2024-07-08 at 15:41

Secure your PostgreSQL database with effective access control. Learn role management, privileges, and Row-Level Security (RLS) techniques.
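
(For a taste of the RLS technique mentioned, a minimal sketch; the table, column, and policy names are assumptions, not taken from the article.)

ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- each role sees only the rows it owns
CREATE POLICY accounts_isolation ON accounts
  USING (owner = current_user);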

The post Access Controls in PostgreSQL appeared first on Stormatics.

Grant Zhou
Posted by Andreas 'ads' Scherbaum on 2024-07-08 at 14:00
PostgreSQL Person of the Week Interview with Grant Zhou: I was born in Qingdao, China, and have been living in Vancouver, Canada since 2013. I am the founder of Highgo Software North America and lead the company’s PostgreSQL open source development team, focusing on PostgreSQL contributions and the IvorySQL project. Additionally, I serve as the secretary of the International Consultant Committee of the PostgreSQL China Association and as the Organizer & Asia Liaison at PostgresConf.

Introducing Go SQL/JSON Path and Playground
Posted by David Wheeler in Tembo on 2024-07-08 at 13:59

For a personal project, I needed to parse and execute PostgreSQL-compatible jsonpath expressions.1 So I’ve spent just about every spare evening and weekend the last several months porting Postgres jsonpath to Go, and it’s finally ready to ship.

Introducing Go SQL/JSON, featuring the path package. This project provides full support for all of the PostgreSQL 17 jsonpath features2 in the Go programming language. An example:

package main

import (
      "context"
      "encoding/json"
      "fmt"
      "log"

      "github.com/theory/sqljson/path"
      "github.com/theory/sqljson/path/exec"
)

func main() {
      var value any
      err := json.Unmarshal([]byte(`{"a":[1,2,3,4,5]}`), &value)
      if err != nil {
              log.Fatal(err)
      }

      p := path.MustParse("$.a[*] ? (@ >= $min && @ <= $max)")
      res, err := p.Query(
              context.Background(),
              value,
              exec.WithVars(exec.Vars{"min": float64(2), "max": float64(4)}),
      )
      if err != nil {
              log.Fatal(err)
      }

      fmt.Printf("%v\n", res)
    // Output: [2 3 4]
}

I think the API is decent, but may implement better patterns as I discover them. Overall I’m quite satisfied with how it turned out, and just how well its implementation and performance compare to the original.

🛝 Playground

But why stop there? One of the nice things about this project is that Go supports compiling applications into WebAssembly (a.k.a. Wasm) via Go WebAssembly. Borrowing from the Goldmark project, I created and published the sqljson/path playground and populated the docs with links for all of its examples.

Now anyone can experiment with SQL/JSON path expressions, and share links to demonstrate patterns and techniques. The Playground is a stateless JavaScript/Wasm web application: data persists only in permalink URLs.3

🛝 Try this example Playground permalink right now!4

The

[...]

Basics of Tuning Checkpoints
Posted by Tomas Vondra in EDB on 2024-07-08 at 12:00
On systems doing a non-trivial number of writes, tuning checkpoints is crucial for getting good performance. Yet checkpoints are one of the areas where we often identify confusion and configuration issues, both on the community mailing lists and during support and consulting for our customers. This post is meant to explain what checkpoints are - their purpose and how the database implements them - and then also how to tune them.
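
(For a flavor of what such tuning looks like, the main checkpoint-related settings are shown below; the values are illustrative examples, not recommendations from the post.)

ALTER SYSTEM SET checkpoint_timeout = '30min';        -- spread checkpoints further apart
ALTER SYSTEM SET max_wal_size = '16GB';               -- avoid triggering them by WAL volume
ALTER SYSTEM SET checkpoint_completion_target = 0.9;  -- spread the writes over the interval
SELECT pg_reload_conf();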

Parallel Queries in Postgres
Posted by Elizabeth Garrett Christensen in Crunchy Data on 2024-07-08 at 10:00

Many folks are surprised to hear that Postgres has parallel queries out of the box. This was released in small batches across a half dozen versions of Postgres, so the major fanfare for having parallelism got a little bit lost.

By default Postgres is configured for two parallel workers. The Postgres query planner will assemble a few plans for any given query and will estimate the additional overhead of performing parallel queries, and make a go or no-go decision. Depending on the settings and the calculations of the query planner, parallel queries are typically used by large and long running queries — like warehouse or analytical workloads.
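
(One of the main settings behind that decision is max_parallel_workers_per_gather, which caps how many workers the planner may plan for; a minimal sketch of inspecting and changing it for a session, where the value 4 is just an example.)

SHOW max_parallel_workers_per_gather;      -- 2 by default
SET max_parallel_workers_per_gather = 4;   -- allow up to 4 workers for this session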

A simplified diagram of the process for how Postgres decides to use the parallel query capabilities.

Below is the output of a sample EXPLAIN when the query planner uses 2 parallel workers:

EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS)
SELECT * FROM orders WHERE order_date > '2024-01-01';

                                     QUERY PLAN
-------------------------------------------------------------------------------------
Gather  (cost=1000.00..11000.00 rows=10000 width=123) (actual time=0.051..12.345 rows=8500 loops=1)
  Output: order_id, customer_id, order_date, order_amount
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=50 read=1000
  ->  Parallel Seq Scan on public.orders  (cost=0.00..9000.00 rows=4167 width=123) (actual time=0.028..10.045 rows=2833 loops=3)
        Output: order_id, customer_id, order_date, order_amount
        Filter: (orders.order_date > '2024-01-01'::date)
        Rows Removed by Filter: 11667
        Buffers: shared hit=50 read=1000
Planning Time: 0.123 ms
Execution Time: 12.567 ms

Parallel queries were first introduced in Postgres 9.6, and we are currently at 16, with version 17 right around the corner. Every major release since version 10 has included performance improvements for parallel queries and has implemented parallelism into indexes, joins, and vacuum. Starting small and expanding is a common theme of Postgres development. Even PostGIS and pgvector take advantage of parallelism for queries and index builds.

[...]
