
I'm inserting records in bulk using the COPY statement in PostgreSQL. I've noticed that the sequence is not being updated, so when I try to insert a record later, it fails with a duplicate ID error. Should I manually update the sequence number to get the number of records after performing COPY? Isn't there a way to have COPY itself increment the sequence behind the table's primary key field? Please clarify. Thanks in advance!

For instance, if I insert 200 records, COPY works fine and my table shows all the records. When I manually insert a record later, I get a duplicate sequence ID error. This implies that it didn't increment the sequence IDs during COPY, as it does during normal INSERTs. Instead of setting the sequence to the maximum ID myself, isn't there any mechanism to make the COPY command increment the sequence during the bulk load?

3 Answers


You ask:

Should I manually update the sequence number to get the number of records after performing COPY?

Yes, you should, as documented here:

Update the sequence value after a COPY FROM:

BEGIN;
COPY distributors FROM 'input_file';
SELECT setval('serial', max(id)) FROM distributors;
END;

You write:

it didn't increment the sequence IDs during COPY, as it does during normal INSERTs

But that is not so! :) When you perform a normal INSERT, you typically do not specify an explicit value for the SEQUENCE-backed primary key. If you did, you would run into the same problem you are having now:

postgres=> create table uh_oh (id serial not null primary key, data char(1));
NOTICE:  CREATE TABLE will create implicit sequence "uh_oh_id_seq" for serial column "uh_oh.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "uh_oh_pkey" for table "uh_oh"
CREATE TABLE
postgres=> insert into uh_oh (id, data) values (1, 'x');
INSERT 0 1
postgres=> insert into uh_oh (data) values ('a');
ERROR:  duplicate key value violates unique constraint "uh_oh_pkey"
DETAIL:  Key (id)=(1) already exists.

Your COPY command, of course, is supplying an explicit id value, just like the example INSERT above.
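
To resync after a load like this, one option is to point the sequence at the current maximum id. A minimal sketch, continuing the uh_oh example above and assuming the table is not empty (pg_get_serial_sequence only looks up the name of the sequence behind the serial column, here uh_oh_id_seq):

```sql
-- Move the sequence behind uh_oh.id up to the largest id already in the table.
SELECT setval(pg_get_serial_sequence('uh_oh', 'id'), max(id)) FROM uh_oh;

-- The next default-generated id is now max(id) + 1, so this no longer collides.
INSERT INTO uh_oh (data) VALUES ('a');
```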

  • The sequence only increments when a value is "consumed" from it to fill in the default during an INSERT (internally, via the nextval function). If you provide values for your id yourself, the sequence is not used, so it doesn't move.
    – bobflux
    Commented Feb 1, 2012 at 9:12

I realize that this is a bit old but maybe someone might still be looking for the answer.

As others said, COPY works much like INSERT: when inserting into a table that has a sequence, you simply don't mention the sequence field at all and it is taken care of for you. COPY works in exactly the same way. But doesn't COPY require ALL fields in the table to be present in the text file? No, it doesn't; that is only the default behavior.

To COPY and leave the sequence column out, do the following:

COPY $YOURSCHEMA.$YOURTABLE(col1,col2,col3,col4) FROM '$your_input_file' DELIMITER ',' CSV HEADER;

No need to manually update the sequence afterwards; it works as intended and, in my testing, is just about as fast.
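
As a rough illustration of the column-list form, here is a sketch with made-up names: the items table, its columns, and the /tmp/items.csv path are all assumptions, and the file is assumed to hold a header line plus only the name and qty columns.

```sql
-- Hypothetical table: id is backed by an implicit sequence.
CREATE TABLE items (id serial PRIMARY KEY, name text, qty integer);

-- id is not listed, so each loaded row takes its id from the column
-- default (nextval on the sequence).
COPY items (name, qty) FROM '/tmp/items.csv' DELIMITER ',' CSV HEADER;

-- A later plain INSERT continues from where the COPY left off.
INSERT INTO items (name, qty) VALUES ('widget', 3);
```

COPY ... FROM 'file' reads the file on the database server; from psql, \copy accepts the same column list and reads the file on the client side instead.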

  • Just to add to this, the source (input) file needs to contain only the columns you are copying. Postgres is not "smart": it does not look at column headers to match up the columns you have named in the COPY command. So, in @Phobos's example above, the input file must have only 4 columns. Just hoping to save someone some hassle. This answer helped me.
    – Chris Hart
    Commented Feb 28, 2015 at 21:15

You could COPY into a sister table, then insert into mytable (col1, col2, col3) select * from sister. Because id is left out of the insert, its default fires and that increments the sequence.

If your loaded data has the id field, don't select it for the insert: insert into mytable (col1, col2, col3) select col1, col2, col3 from sister
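
A sketch of the sister-table route, assuming hypothetical text columns col1 to col3 on mytable, a made-up /tmp/input.csv path, and an input file with a header line that also carries an id column to be discarded:

```sql
-- The sister table mirrors the file layout; its id is a plain integer, not a serial.
CREATE TEMP TABLE sister (id integer, col1 text, col2 text, col3 text);

COPY sister FROM '/tmp/input.csv' DELIMITER ',' CSV HEADER;

-- Leave id out so mytable's default draws a fresh value from the sequence per row.
INSERT INTO mytable (col1, col2, col3)
SELECT col1, col2, col3 FROM sister;
```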
