Split comma separated values into target table with fixed number of columns

Question

I have a table with a single column in a Postgres 13.1 database. It consists of many rows with comma-separated values - around 20 elements at most.

I want to split the data into multiple columns. But I have only a limited number of columns say 5 and more than 5 CSV values in a single row, so excess values must be shifted to new/next row). How to do this?

Example:

a1, b1, c1
a2, b2, c2, d2, e2, f2
a3, b3, c3, d3, e3, f3, g3, h3, i3, j3
a4
a5, b5, c5
'
'
'

Columns are only 5, so the output would be like:

c1 c2 c3 c4 c5
---------------
a1 b1 c1
a2 b2 c2 d2 e2 
f2
a3 b3 c3 d3 e3
f3 g3 h3 i3 j3
a4
a5 b5 c5
'
'
'

@ErwinBrandstetter Yes, the order in which they are in. There's no logic part in the order. Thanks — Martin2000, Commented Feb 20, 2021 at 12:27
@ErwinBrandstetter Hello. Is there any function or a query that puts the values from the right side into the columns? Like a1 value under c5, b1 under c4, and c1 under c3? — Martin2000, Commented Feb 23, 2021 at 9:07

Erwin Brandstetter · Accepted Answer · 2021-02-24 12:18:43Z

It is typically bad design to store CSV values in a single column. If at all possible, use an array or a properly normalized design instead.

While stuck with your current situation ...

For known small maximum number of elements

A simple solution without trickery or recursion will do:

SELECT id, 1 AS rnk
     , split_part(csv, ', ', 1) AS c1
     , split_part(csv, ', ', 2) AS c2
     , split_part(csv, ', ', 3) AS c3
     , split_part(csv, ', ', 4) AS c4
     , split_part(csv, ', ', 5) AS c5
FROM   tbl
WHERE  split_part(csv, ', ', 1) <> '' -- skip empty rows

UNION ALL
SELECT id, 2
     , split_part(csv, ', ', 6)
     , split_part(csv, ', ', 7)
     , split_part(csv, ', ', 8)
     , split_part(csv, ', ', 9)
     , split_part(csv, ', ', 10)
FROM   tbl
WHERE  split_part(csv, ', ', 6) <> '' -- skip empty rows

-- three more blocks to cover a maximum "around 20"

ORDER  BY id, rnk;

db<>fiddle here

id being the PK of the original table.
This assumes ', ' as separator, obviously.
You can adapt easily.

Split comma separated column data into additional columns

For unknown number of elements

Various ways. One way use regexp_replace() to replace every fifth separator before unnesting ...

-- for any number of elements
SELECT t.id, c.rnk
     , split_part(c.csv5, ', ', 1) AS c1
     , split_part(c.csv5, ', ', 2) AS c2
     , split_part(c.csv5, ', ', 3) AS c3
     , split_part(c.csv5, ', ', 4) AS c4
     , split_part(c.csv5, ', ', 5) AS c5
FROM   tbl t
     , unnest(string_to_array(regexp_replace(csv, '((?:.*?,){4}.*?),', '\1;', 'g'), '; ')) WITH ORDINALITY c(csv5, rnk)
ORDER  BY t.id, c.rnk;

db<>fiddle here

This assumes that the chosen separator ; never appears in your strings. (Just like , can never appear.)

The regular expression pattern is the key: '((?:.*?,){4}.*?),'

(?:) ... “non-capturing” set of parentheses
() ... “capturing” set of parentheses
*? ... non-greedy quantifier
{4}? ... sequence of exactly 4 matches

The replacement '\1;' contains the back-reference \1.

'g' as fourth function parameter is required for repeated replacement.

Fill from right to left

(Like you added in How to put values starting from the right side into columns?)
Simply count down numbers like:

SELECT t.id, c.rnk
     , split_part(c.csv5, ', ', 5) AS c1
     , split_part(c.csv5, ', ', 4) AS c2
     , split_part(c.csv5, ', ', 3) AS c3
     , split_part(c.csv5, ', ', 2) AS c4
     , split_part(c.csv5, ', ', 1) AS c5
FROM ...

db<>fiddle here

Thank you so much for the help. Question- what does '<>' this means in WHERE clause? Though I am new to this. Is there any option to do it with hard coding like what if in the future there is a row with more than 20 elements? Please help. — Martin2000, Commented Feb 20, 2021 at 12:45
@Martin2000: <> is the "not equal" operator. If the first column is not equal to the empty string (''), then the row isn't noise. See: stackoverflow.com/a/23767625/939860 — Erwin Brandstetter, Commented Feb 20, 2021 at 12:49
@Martin2000: I added solutions for an arbitrary number of elements, and to fill from right to left. — Erwin Brandstetter, Commented Feb 24, 2021 at 12:19

bobflux · Accepted Answer · 2021-02-20 13:47:37Z

CREATE UNLOGGED TABLE foo( x TEXT );
\copy foo FROM stdin
a1, b1, c1
a2, b2, c2, d2, e2, f2
a3, b3, c3, d3, e3, f3, g3, h3, i3, j3
a4
a5, b5, c5
\.

From lines to single column...

SELECT (ROW_NUMBER() OVER () - 1)/5 AS r, u FROM (SELECT unnest(string_to_array(x,', ')) u from foo) y;
 r | u
---+----
 0 | a1
 0 | b1
 0 | c1
 0 | a2
 0 | b2
 1 | c2
 1 | d2
...etc

...and back to lines of known length.

SELECT r,array_agg(u) a FROM (
 SELECT (ROW_NUMBER() OVER () - 1)/5 AS r, u FROM (
  SELECT unnest(string_to_array(x,', ')) u from foo) y) y1 
GROUP BY r ORDER BY r;
 r |    a
---+------------------
 0 | {a1,b1,c1,a2,b2}
 1 | {c2,d2,e2,f2,a3}
 2 | {b3,c3,d3,e3,f3}
 3 | {g3,h3,i3,j3,a4}
 4 | {a5,b5,c5}

After this you can insert it into a table using a[] for each column. What to do with the last line is left as an exercise to the reader...

S-Man · Accepted Answer · 2021-02-23 14:12:47Z

Answer to related question: How to put values starting from the right side into columns?

The accepted great answer from @ErwinBrandstetter can be easily adapted to required right-to-left output.

You just need to change to order of the split parts. So you don't return split parts 1-5 and 6-10 but 5-1 and 10-6:

demo:db<>fiddle

SELECT id, 1 AS rnk
     , split_part(csv, ', ', 5) AS c1
     , split_part(csv, ', ', 4) AS c2
     , split_part(csv, ', ', 3) AS c3
     , split_part(csv, ', ', 2) AS c4
     , split_part(csv, ', ', 1) AS c5
FROM   tbl
WHERE  split_part(csv, ', ', 1) <> '' -- skip empty rows

UNION ALL
SELECT id, 2
     , split_part(csv, ', ', 10)
     , split_part(csv, ', ', 9)
     , split_part(csv, ', ', 8)
     , split_part(csv, ', ', 7)
     , split_part(csv, ', ', 6)
FROM   tbl
WHERE  split_part(csv, ', ', 6) <> '' -- skip empty rows

-- more?

ORDER  BY id, rnk;

What do you mean with "strong logic"? Sry, the question is not quite clear to me. — S-Man, Commented Feb 23, 2021 at 14:58

Kevzz8.15.17 · Accepted Answer · 2021-02-20 12:21:27Z

0

You need to do this in the whatever backend layer you are using.

First, convert the CSV rows to array of string

Then, use logic something like this to add Values to the database

int row = 0; // database row index - can be used to just have a count

final int MAX_COLUMNS = 5;

for(int i = 0; i<rows.length; i++) {
    // Convert csv row string to array of each value.
    String [] values = rows[i].split(",");

    // Dividing whole row into chunks of size of number of columns

    for(int j = 0; j < (values.length/(MAX_COLUMNS)) + 1; j++) {
       Add Values [MAX_COLUMNS*j,MAX_COLUMNS*j+(MAX_COLUMNS - 1)] to the row [row + j]
      row++;
    }

}

answered Feb 20, 2021 at 12:21

Kevzz8.15.17

463 bronze badges

Could you please tell me in Postgres? I am new to this and don't know the syntax you are using or how to convert your syntax into Postgres.
– Martin2000
Commented Feb 20, 2021 at 16:15

Add a comment |

Collectives™ on Stack Overflow

Split comma separated values into target table with fixed number of columns

4 Answers 4

For known small maximum number of elements

For unknown number of elements

Fill from right to left

Not the answer you're looking for? Browse other questions tagged
sql
postgresql
split
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

For known small maximum number of elements

For unknown number of elements

Fill from right to left

Not the answer you're looking for? Browse other questions tagged sqlpostgresqlsplit or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
sql
postgresql
split
or ask your own question.