I have a PostgreSQL database and I noticed a weird behaviour while working with indexes and partitions. The engine version is 10.21.

Now, I have a table with this structure:

guid varchar(50) PK
guid_a varchar(50)
data text
part_key varchar(2)

There are other columns but they are irrelevant. The query I have to run on this table looks like this:

select * from mytable where guid_a = 'jxxxxx-xxxxxxx' and data like '%7263628%';

Let me explain a few things. The column guid_a contains a code that identifies a person in the format 'jxxxx-xxxxxxx', where each 'x' is a digit. The first two digits go from 00 to 99, so, for example:

j01xxx-xxxxxx
j02xxx-xxxxxx
...
j99xxx-xxxxxx

I created an index on this column, and then I also created an index on the data column using the pg_trgm module. Running the query, I get a huge performance improvement. So far, so good.
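
For reference, a minimal sketch of the two indexes described above. The index names and the choice of GIN (rather than GiST) for the trigram index are assumptions for illustration, not necessarily the exact statements that were run.

```sql
-- pg_trgm ships with PostgreSQL's contrib modules and must be enabled per database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Plain B-tree index for the equality condition on guid_a
-- (index name is hypothetical).
CREATE INDEX mytable_guid_a_idx ON mytable (guid_a);

-- Trigram index so that LIKE '%...%' on data can use an index.
-- GIN is assumed here; a GiST index with gist_trgm_ops is the alternative.
CREATE INDEX mytable_data_trgm_idx ON mytable USING gin (data gin_trgm_ops);
```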

I also decided to use partitions (the table has 6.4 million records), so I created 99 partitions (by list) on the column part_key, which contains only the first two digits of the guid_a value. Each of the 99 partitions holds about 65 thousand rows on average and has the same indexes I described above. This improved the performance again. Obviously the query has an additional condition on part_key, so that the engine knows which partition to query.
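
A rough sketch of that partitioned setup, as it could look on PostgreSQL 10. The table names mytable_part and mytable_part_01, the GIN index type, and the literal values are assumptions for illustration; only one of the partitions is shown. In version 10 a partitioned parent cannot carry a primary key or indexes, so the indexes are created on each partition.

```sql
-- Partitioned parent (hypothetical name); PostgreSQL 10 does not allow
-- a primary key or indexes on the parent itself.
CREATE TABLE mytable_part (
    guid     varchar(50) NOT NULL,
    guid_a   varchar(50),
    data     text,
    part_key varchar(2)
) PARTITION BY LIST (part_key);

-- One partition per two-digit prefix; repeat for the other values.
CREATE TABLE mytable_part_01 PARTITION OF mytable_part FOR VALUES IN ('01');

-- Same two indexes as on the non-partitioned table, created per partition.
-- Assumes the pg_trgm extension is already installed in this database.
CREATE INDEX ON mytable_part_01 (guid_a);
CREATE INDEX ON mytable_part_01 USING gin (data gin_trgm_ops);

-- The extra part_key condition lets the planner exclude the other partitions.
SELECT *
FROM mytable_part
WHERE part_key = '01'
  AND guid_a = 'j01xxx-xxxxxx'
  AND data LIKE '%7263628%';
```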

Now the weird stuff. I removed the trgm index on the table without the partitions and, surprise surprise: it's faster. Even faster than the partitioned table. Even removing the trgm indexes on the partitioned table.

What I noticed in the EXPLAIN output is that the query on the non-partitioned table makes the engine go for an index scan only (shouldn't it then also make another scan for the second condition, on the data column?).

On the partitioned table, on the other hand, it goes for a bitmap index scan, then a heap scan, and then an append. This apparently costs more than using the index over all 6.4 million rows.

I ran several tests with different values and got the same results.
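
The plans and timings can be compared with EXPLAIN (ANALYZE, BUFFERS), which executes the query and reports the actual node types, row counts, and buffer hits. A sketch, assuming the partitioned copy is called mytable_part (a hypothetical name) and using placeholder values:

```sql
-- Non-partitioned table
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM mytable
WHERE guid_a = 'j01xxx-xxxxxx'
  AND data LIKE '%7263628%';

-- Partitioned table (hypothetical name), with the extra part_key condition
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM mytable_part
WHERE part_key = '01'
  AND guid_a = 'j01xxx-xxxxxx'
  AND data LIKE '%7263628%';
```

Running each statement a few times helps separate cold-cache from warm-cache timings, which matters when the differences are only a few milliseconds.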

Performance:

On average:

11 ms on the partitioned table
9 ms on the non-partitioned table with one index only, on guid_a
20 ms on the non-partitioned table with two indexes, the second on the data column using trgm

What's going on here?
