Postgres: INSERT if does not exist already

Question

I'm using Python to write to a postgres database:

sql_string = "INSERT INTO hundred (name,name_slug,status) VALUES ("
sql_string += hundred + ", '" + hundred_slug + "', " + status + ");"
cursor.execute(sql_string)

But because some of my rows are identical, I get the following error:

psycopg2.IntegrityError: duplicate key value  
  violates unique constraint "hundred_pkey"

How can I write an 'INSERT unless this row already exists' SQL statement?

I've seen complex statements like this recommended:

IF EXISTS (SELECT * FROM invoices WHERE invoiceid = '12345')
UPDATE invoices SET billed = 'TRUE' WHERE invoiceid = '12345'
ELSE
INSERT INTO invoices (invoiceid, billed) VALUES ('12345', 'TRUE')
END IF

But firstly, is this overkill for what I need, and secondly, how can I execute one of those as a simple string?

Regardless of how you solve this issue, you shouldn't generate your query like that. Use parameters in your query and pass the values separately; see stackoverflow.com/questions/902408/… — Thomas Wouters, Commented Nov 1, 2010 at 14:36
As of Postgres 9.5 (currently in beta2) there is a new upsert-like feature; see: postgresql.org/docs/9.5/static/sql-insert.html#SQL-ON-CONFLICT — Ezequiel Moreno, Commented Nov 11, 2015 at 14:08
@AP257 Why haven't you accepted any answer yet? For example Arie's answer is very useful and highly upvoted. — Honza Zidek, Commented Oct 11, 2021 at 20:22

Arie · Accepted Answer · 2020-04-30 13:19:44Z

1153

+250

Postgres 9.5 (released 2016-01-07) and later offer an "upsert" capability through the ON CONFLICT clause of INSERT:

INSERT ... ON CONFLICT DO NOTHING/UPDATE

It avoids many of the subtle problems that the workarounds proposed in some other answers can run into under concurrent writes.
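From Python, the ON CONFLICT form combines naturally with query parameters. The following is a minimal sketch, not a definitive implementation. It assumes a psycopg2-style DB-API cursor (%s placeholders) and a primary key or unique constraint on the question's hundred table; the function name insert_if_absent is hypothetical.

```python
# Sketch only: assumes a psycopg2-style DB-API cursor (%s placeholders)
# and a primary key or unique constraint on the table `hundred`.
INSERT_IF_ABSENT_SQL = (
    "INSERT INTO hundred (name, name_slug, status) "
    "VALUES (%s, %s, %s) "
    "ON CONFLICT DO NOTHING"
)


def insert_if_absent(cursor, name, name_slug, status):
    """Insert one row; return True if it was inserted, False if it already existed."""
    # Values are passed separately from the SQL text, never concatenated into it.
    cursor.execute(INSERT_IF_ABSENT_SQL, (name, name_slug, status))
    # rowcount is 1 when the row went in and 0 when the conflict was skipped.
    return cursor.rowcount == 1
```

Omitting the conflict target is allowed with DO NOTHING; it then applies to a violation of any unique constraint on the table.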

edited Apr 30, 2020 at 13:19

user330315

answered Jul 31, 2015 at 9:33

Arie

12k1 gold badge16 silver badges12 bronze badges

2

@TusharJain prior to PostgreSQL 9.5 you can do an "old fashioned" UPSERT (with a CTE), but you may run into race conditions and it will not be as performant as the 9.5 style. There is good detail about upsert on this blog (in the updated area at the bottom), including some links if you want to read more.
– Skyguard
Commented Apr 1, 2017 at 16:03
99

For those who need them, here are two simple examples. (1) INSERT if not exists, else do nothing: INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH') ON CONFLICT (did) DO NOTHING; (2) INSERT if not exists, else UPDATE: INSERT INTO distributors (did, dname) VALUES (5, 'Gizmo Transglobal'), (6, 'Associated Computing, Inc') ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname; These examples are from the manual: postgresql.org/docs/9.5/static/sql-insert.html
– AnnieFromTaiwan
Commented Apr 22, 2017 at 9:10
63

There is one caveat/side effect. In a table with a sequence column (serial or bigserial), the sequence is incremented at every insert attempt, even if no row is inserted.
– Grzegorz Luczywo
Commented Jul 2, 2017 at 12:10
4

It'd be better linking to INSERT documentation instead of pointing to release. Doc link: postgresql.org/docs/9.5/static/sql-insert.html
– borjagvo
Commented Nov 3, 2017 at 7:36
14

If you need to ON CONFLICT DO NOTHING RETURNING id, then read this answer stackoverflow.com/a/42217872/368691.
– Gajus
Commented Oct 22, 2018 at 15:57


Basil Bourque · Accepted Answer · 2014-11-19 06:37:21Z

488

How can I write an 'INSERT unless this row already exists' SQL statement?

There is a nice way of doing conditional INSERT in PostgreSQL:

INSERT INTO example_table
    (id, name)
SELECT 1, 'John'
WHERE
    NOT EXISTS (
        SELECT id FROM example_table WHERE id = 1
    );

CAVEAT This approach is not 100% reliable for concurrent write operations, though. There is a very tiny race condition between the SELECT in the NOT EXISTS anti-semi-join and the INSERT itself. It can fail under such conditions.

edited Nov 19, 2014 at 6:37

Basil Bourque

326k113 gold badges903 silver badges1.2k bronze badges

answered Nov 12, 2012 at 10:27

John Doe

9,61413 gold badges51 silver badges69 bronze badges

How safe is this assuming that the "name"-field has a UNIQUE constraint? Will it ever fail with unique-violation?
– agnsaft
Commented Dec 29, 2012 at 19:57
2

This works fine. The only problem is the coupling I guess: what if one modifies the table such that more columns are unique. In that case all the scripts must be modified. It would be nice if there was a more generic way to do this...
– willeM_ Van Onsem
Commented Aug 1, 2014 at 13:17
1

Is it possible to use it with RETURNS id, for example, to get the id whether it has been inserted or not?
– Olivier Pons
Commented Jan 3, 2016 at 10:58
2

@OlivierPons yes, it's possible. Add RETURNING id at the end of the query and it will return either the new row's id or nothing, if no row has been inserted.
– pumbo
Commented Feb 15, 2016 at 10:08
9

I have found this to be unreliable. It appears Postgres sometimes executes the insert before it has executed the select, and I end up with a duplicate key violation even though the record hasn't been inserted yet. Try to use version >= 9.5 with ON CONFLICT.
– Michael Silver
Commented May 28, 2017 at 2:38


Michael Easter · Accepted Answer · 2023-06-03 13:45:34Z

63

One approach is to create an unconstrained table (no unique indexes), insert all your data into it, and then insert into your hundred table by selecting the distinct rows from it.

At a high level, the steps are as follows. In my example I assume that all three columns together form the unique key, so in the last step change the NOT EXISTS condition to compare only the unique columns of the hundred table.

Create temporary table. See docs here.

 CREATE TEMPORARY TABLE temp_data (name TEXT, name_slug TEXT, status INT);  -- column types assumed; match them to hundred

INSERT Data into temp table.

 INSERT INTO temp_data(name, name_slug, status);

Add any indexes to the temp table.

Do main table insert.

 INSERT INTO hundred(name, name_slug, status) 
     SELECT DISTINCT name, name_slug, status
     FROM temp_data
     WHERE NOT EXISTS (
         SELECT 'X' 
         FROM hundred
         WHERE 
             hundred.name          = temp_data.name
             AND hundred.name_slug = temp_data.name_slug
             AND hundred.status    = temp_data.status
     );
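The staging-table steps can be driven from Python in one function. This is a rough sketch under stated assumptions: a psycopg2-style cursor on a connection in its default transactional (non-autocommit) mode, and TEXT/TEXT/INT column types, which are a guess and should be matched to the real hundred table. The name load_via_staging is hypothetical.

```python
# Sketch only: assumes a psycopg2-style cursor inside an open transaction and
# TEXT/TEXT/INT column types (an assumption; match them to `hundred`).
def load_via_staging(cursor, rows):
    """Bulk-load (name, name_slug, status) tuples into hundred, skipping duplicates."""
    # Step 1: unconstrained staging table, dropped automatically at commit.
    cursor.execute(
        "CREATE TEMPORARY TABLE temp_data "
        "(name TEXT, name_slug TEXT, status INT) ON COMMIT DROP"
    )
    # Step 2: load everything, duplicates included.
    cursor.executemany(
        "INSERT INTO temp_data (name, name_slug, status) VALUES (%s, %s, %s)",
        rows,
    )
    # Step 3: copy only distinct rows that hundred does not already contain.
    cursor.execute(
        "INSERT INTO hundred (name, name_slug, status) "
        "SELECT DISTINCT t.name, t.name_slug, t.status "
        "FROM temp_data t "
        "WHERE NOT EXISTS ("
        "  SELECT 1 FROM hundred h"
        "  WHERE h.name = t.name"
        "    AND h.name_slug = t.name_slug"
        "    AND h.status = t.status)"
    )
    # Number of rows actually added to hundred.
    return cursor.rowcount
```

If status can be NULL, the = comparison never matches and such rows are inserted again; IS NOT DISTINCT FROM is the NULL-safe comparison in PostgreSQL.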

edited Jun 3, 2023 at 13:45

Michael Easter

24.3k9 gold badges78 silver badges110 bronze badges

answered Nov 1, 2010 at 15:36

Kuberchaun

29.9k8 gold badges52 silver badges59 bronze badges

3

This is the fastest way I have found to do mass inserts when I do not know if the row already exists.
– nate c
Commented Nov 18, 2010 at 3:51
1

select 'X'? can someone clarify? This is simply a select statement right: SELECT name,name_slug,status or *
– roberthuttinger
Commented Oct 8, 2014 at 14:33
6

Look up correlated subquery. 'X' could be changed to a 1 or even 'SadClown'. SQL requires there to be something, and 'X' is a common thing to use. It's small, it makes it obvious that a correlated subquery is being used, and it meets the requirements of what SQL requires.
– Kuberchaun
Commented Oct 9, 2014 at 16:08
6

You mentioned "insert all your data into(assuming temp table) and do a select distinct from that". In that case, should it not be SELECT DISTINCT name, name_slug, status FROM temp_data?
– gibbz00
Commented Jun 20, 2019 at 21:32
Would this solution be reliable for concurrent write operations? Won't we be expecting race conditions between the INSERT and the SELECT in the subquery?
– RianLauw
Commented Dec 8, 2021 at 11:49


tuanngocptn · Accepted Answer · 2019-05-16 06:12:39Z

21

This is exactly the problem I faced, and my version is 9.5.

I solved it with the SQL query below.

-- Caution: because of FROM example_table, nothing is inserted when example_table is empty.
INSERT INTO example_table (id, name)
SELECT 1 AS id, 'John' AS name FROM example_table
WHERE NOT EXISTS(
            SELECT id FROM example_table WHERE id = 1
    )
LIMIT 1;

Hope that will help someone who has the same issue with version >= 9.5.

Thanks for reading.

answered May 16, 2019 at 6:12

tuanngocptn

1,38114 silver badges22 bronze badges

5

This answer is the same as @John Doe, where the caveat of concurrent write operations has been pointed out.
– RianLauw
Commented Dec 8, 2021 at 11:37
@RianLauw: yeah right, That is the same about 80% ~ 90%. But you can use both way to find the better answer for you.
– tuanngocptn
Commented Dec 9, 2021 at 4:37


Quassnoi · Accepted Answer · 2010-11-01 14:28:27Z

20

At the time of writing (2010), PostgreSQL supported neither MERGE nor ON DUPLICATE KEY UPDATE, so you had to do it in two statements:

UPDATE  invoices
SET     billed = 'TRUE'
WHERE   invoiceid = '12345';

INSERT
INTO    invoices (invoiceid, billed)
SELECT  '12345', 'TRUE'
WHERE   '12345' NOT IN
        (
        SELECT  invoiceid
        FROM    invoices
        )

You can wrap it into a function:

CREATE OR REPLACE FUNCTION fn_upd_invoices(id VARCHAR(32), billed VARCHAR(32))
RETURNS VOID
AS
$$
        UPDATE  invoices
        SET     billed = $2
        WHERE   invoiceid = $1;

        INSERT
        INTO    invoices (invoiceid, billed)
        SELECT  $1, $2
        WHERE   $1 NOT IN
                (
                SELECT  invoiceid
                FROM    invoices
                );
$$
LANGUAGE 'sql';

and just call it:

SELECT  fn_upd_invoices('12345', 'TRUE')

answered Nov 1, 2010 at 14:28

Quassnoi

422k91 gold badges624 silver badges619 bronze badges

1

Actually, this doesn't work: I can call INSERT INTO hundred (name, name_slug, status) SELECT 'Chichester', 'chichester', NULL WHERE 'Chichester' NOT IN (SELECT NAME FROM hundred); any number of times, and it keeps inserting the row.
– AP257
Commented Mar 9, 2011 at 13:23
1

@AP257: CREATE TABLE hundred (name TEXT, name_slug TEXT, status INT); INSERT INTO hundred (name, name_slug, status) SELECT 'Chichester', 'chichester', NULL WHERE 'Chichester' NOT IN (SELECT NAME FROM hundred); INSERT INTO hundred (name, name_slug, status) SELECT 'Chichester', 'chichester', NULL WHERE 'Chichester' NOT IN (SELECT NAME FROM hundred); SELECT * FROM hundred. There is one record.
– Quassnoi
Commented Mar 9, 2011 at 16:30
2

Postgres does support ON DUPLICATE KEY UPDATE. It's called ON CONFLICT (column) DO UPDATE SET
– kolypto
Commented Sep 20, 2020 at 0:45
1

@kolypto: sure, since 2016. Feel free to edit the answer.
– Quassnoi
Commented Sep 20, 2020 at 0:52
Supports MERGE too since v15.
– milan
Commented Jan 23 at 12:53


score 16 · Accepted Answer · 2012-03-30 10:11:14Z

16

You can make use of VALUES - available in Postgres:

INSERT INTO person (name)
    SELECT name FROM person
    UNION 
    VALUES ('Bob')
    EXCEPT
    SELECT name FROM person;

edited Mar 30, 2012 at 10:11

answered Mar 30, 2012 at 9:02

user1200257

17

SELECT name FROM Person <--- what if there's a billion rows in person?
– Henley Wing Chiu
Commented Apr 24, 2013 at 0:34
2

I think this is a nice quick way to solve the issue, but only when you're sure the source table will never grow huge. I've got a table which will never have more than 1000 rows, so I can use this solution.
– Leonard
Commented Feb 9, 2015 at 8:34
1

WOW, this is exactly what I needed. I was worried I'd need to create a function or a temp table, but this precludes all that--thank you!
– Alkanshel
Commented Feb 3, 2016 at 22:24
@HenleyChiu has quite a good point. Since those selects are against an existing table, maybe we can add a where clause to each, ensuring we only select 'Bob''s row?
– stackh34p
Commented Sep 26, 2020 at 16:43


ktr · Accepted Answer · 2012-05-21 15:32:19Z

10

I know this question is from a while ago, but thought this might help someone. I think the easiest way to do this is via a trigger. E.g.:

Create Function ignore_dups() Returns Trigger
As $$
Begin
    If Exists (
        Select
            *
        From
            hundred h
        Where
            -- Assuming all three fields are primary key
            h.name = NEW.name
            And h.name_slug = NEW.name_slug
            And h.status = NEW.status
    ) Then
        Return NULL;
    End If;
    Return NEW;
End;
$$ Language plpgsql;

Create Trigger ignore_dups
    Before Insert On hundred
    For Each Row
    Execute Procedure ignore_dups();

Execute this code from a psql prompt (or however you like to execute queries directly on the database). Then you can insert as normal from Python. E.g.:

sql = "Insert Into hundred (name, name_slug, status) Values (%s, %s, %s)"
cursor.execute(sql, (hundred, hundred_slug, status))

Note that as @Thomas_Wouters already mentioned, the code above takes advantage of parameters rather than concatenating the string.

answered May 21, 2012 at 15:32

ktr

7369 silver badges15 bronze badges

If anyone else was wondering too, from the docs: "Row-level triggers fired BEFORE can return null to signal the trigger manager to skip the rest of the operation for this row (i.e., subsequent triggers are not fired, and the INSERT/UPDATE/DELETE does not occur for this row). If a nonnull value is returned then the operation proceeds with that row value."
– Pete
Commented Feb 24, 2013 at 4:45
Exactly this answer I was looking for. Clean code, using function + trigger instead of select statement. +1
– Jacek Krawczyk
Commented Feb 3, 2016 at 10:03
I love this answer, use function and trigger. Now I find another way to break the deadlock using functions and triggers...
– Sukma Saputra
Commented May 16, 2019 at 4:22


Zoey Hewll · Accepted Answer · 2017-10-14 04:32:45Z

8

There is a nice way of doing a conditional INSERT in PostgreSQL using a WITH query (CTE), for example:

WITH a as(
select 
 id 
from 
 schema.table_name 
where 
 column_name = your_identical_column_value
)
INSERT into 
 schema.table_name
(col_name1, col_name2)
SELECT
    value_1, value_2  -- the values to insert; no parentheses, which would build one composite value
WHERE NOT EXISTS (
     SELECT
         id
     FROM
         a
        )
  RETURNING id

edited Oct 14, 2017 at 4:32

Zoey Hewll

5,1552 gold badges21 silver badges37 bronze badges

answered Oct 14, 2017 at 4:26

Ritesh Jha

2913 silver badges6 bronze badges

1

This is essentially the same as the answer (here stackoverflow.com/a/13342031/5294065) by @John Doe, where the subquery is just a CTE, right?
– RianLauw
Commented Dec 8, 2021 at 11:50


Brian Burns · Accepted Answer · 2021-09-20 13:32:40Z

8

We can simplify the query using an upsert:

insert into invoices (invoiceid, billed) 
  values ('12345', 'TRUE') 
  on conflict (invoiceid) do 
    update set billed=EXCLUDED.billed;

edited Sep 20, 2021 at 13:32

Brian Burns

21.5k10 gold badges90 silver badges78 bronze badges

answered Mar 26, 2021 at 17:12

Enggar R Hariawan

3893 silver badges5 bronze badges


Pavel Francírek · Accepted Answer · 2015-03-26 12:56:12Z

4

INSERT ... WHERE NOT EXISTS is a good approach, and race conditions can be avoided by wrapping it in a transaction "envelope" that locks the table:

BEGIN;
LOCK TABLE hundred IN SHARE ROW EXCLUSIVE MODE;
INSERT ... ;
COMMIT;
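The same envelope can be issued from Python. A minimal sketch, assuming a psycopg2-style connection with autocommit off (so the first statement opens the transaction implicitly and no explicit BEGIN is needed) and assuming that name alone is the unique key of hundred. The function name insert_with_table_lock is hypothetical.

```python
# Sketch only: assumes a psycopg2-style connection with autocommit off and
# that `name` is the unique key of `hundred` (both are assumptions).
def insert_with_table_lock(conn, name, name_slug, status):
    """Lock hundred, insert the row if absent, commit; return True if inserted."""
    cursor = conn.cursor()
    try:
        # Blocks other writers to the table until this transaction ends.
        cursor.execute("LOCK TABLE hundred IN SHARE ROW EXCLUSIVE MODE")
        cursor.execute(
            "INSERT INTO hundred (name, name_slug, status) "
            "SELECT %s, %s, %s "
            "WHERE NOT EXISTS (SELECT 1 FROM hundred WHERE name = %s)",
            (name, name_slug, status, name),
        )
        inserted = cursor.rowcount == 1
        conn.commit()  # releases the table lock
        return inserted
    except Exception:
        conn.rollback()  # also releases the lock
        raise
    finally:
        cursor.close()
```

The lock serializes all writers to the table, so this trades throughput for correctness; on 9.5 and later, ON CONFLICT achieves the same result without a table-level lock.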

answered Mar 26, 2015 at 12:56

Pavel Francírek

1883 silver badges10 bronze badges


score 3 · Accepted Answer · 2016-07-15 10:04:15Z

3

It's easy with rules:

CREATE RULE file_insert_defer AS ON INSERT TO file
WHERE (EXISTS ( SELECT * FROM file WHERE file.id = new.id)) DO INSTEAD NOTHING

But it fails with concurrent writes ...

edited Jul 15, 2016 at 10:04

answered Jul 12, 2016 at 8:52

user6578371


i000174 · Accepted Answer · 2023-07-22 06:17:47Z

2

You can also use MERGE introduced in Postgres 15

However, it is not as straightforward as INSERT … ON CONFLICT DO NOTHING

edited Jul 22, 2023 at 6:17

answered Jul 21, 2023 at 10:06

i000174

1,20711 silver badges17 bronze badges

I don't think merge works in all cases. Firstly, I don't think you want to insert for all non-matches and secondly on an empty table the insert would still fail even though the id does not exist.
– weberc2
Commented Feb 7 at 20:48


Christoph · Accepted Answer · 2013-11-23 11:29:38Z

The approach with the most upvotes (from John Doe) does work for me to some extent, but in my case I get only 180 of the expected 422 rows. I couldn't find anything wrong and there are no errors at all, so I looked for a different simple approach.

Using IF NOT FOUND THEN after a SELECT just works perfectly for me.

(described in PostgreSQL Documentation)

Example from documentation:

SELECT * INTO myrec FROM emp WHERE empname = myname;
IF NOT FOUND THEN
  RAISE EXCEPTION 'employee % not found', myname;
END IF;

bcorso · Accepted Answer · 2015-02-01 21:24:34Z

1

psycopg2's cursor class has the attribute rowcount:

This read-only attribute specifies the number of rows that the last execute*() produced (for DQL statements like SELECT) or affected (for DML statements like UPDATE or INSERT).

So you could try UPDATE first and INSERT only if rowcount is 0.

But depending on activity levels in your database you may hit a race condition between UPDATE and INSERT where another process may create that record in the interim.
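The UPDATE-then-INSERT idea can be sketched as follows, using the invoices table from the question. It assumes a psycopg2-style cursor; the function name update_or_insert_invoice is hypothetical. As noted above, another session can still insert the same key between the two statements, so this sketch does not remove the race.

```python
# Sketch only: assumes a psycopg2-style DB-API cursor. Not safe against a
# concurrent insert of the same invoiceid between the two statements.
def update_or_insert_invoice(cursor, invoice_id, billed):
    """Try UPDATE first; INSERT only if no row was updated."""
    cursor.execute(
        "UPDATE invoices SET billed = %s WHERE invoiceid = %s",
        (billed, invoice_id),
    )
    if cursor.rowcount == 0:
        # No existing row matched, so create it.
        cursor.execute(
            "INSERT INTO invoices (invoiceid, billed) VALUES (%s, %s)",
            (invoice_id, billed),
        )
        return "inserted"
    return "updated"
```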

edited Feb 1, 2015 at 21:24

bcorso

46.5k10 gold badges64 silver badges76 bronze badges

answered Nov 1, 2010 at 14:59

johnbaum

6644 silver badges5 bronze badges

1

Presumably wrapping these queries in a transaction would alleviate the race condition.
– Daniel Lyons
Commented Feb 28, 2012 at 22:19


Boodoo · Accepted Answer · 2018-08-30 12:34:05Z

1

Your column "hundred" seems to be defined as the primary key and therefore must be unique, which is not the case here. The problem isn't with the query; it is with your data.

I suggest you add an id column of serial type to handle the primary key.

answered Aug 30, 2018 at 12:34

Boodoo

351 silver badge8 bronze badges


opena · Accepted Answer · 2019-02-21 03:30:13Z

1

If many of your rows are identical, checking for each one yourself means a lot of repeated checks. Instead, you can send every row and let the database decide whether to insert it, using the ON CONFLICT clause as follows:

  sql_string = ("INSERT INTO hundred (name, name_slug, status) VALUES (%s, %s, %s) "
                "ON CONFLICT ON CONSTRAINT hundred_pkey DO NOTHING;")
  cursor.execute(sql_string, (hundred, hundred_slug, status))

edited Feb 21, 2019 at 3:30

answered Feb 21, 2019 at 3:18

opena

73 bronze badges


Ahmed Mohamed · Accepted Answer · 2023-07-13 15:31:51Z

0

In Postgres version 9.5 or higher you can use ON CONFLICT to avoid constraint errors, as @Arie mentioned above. For more options related to this INSERT query, refer to the Postgres docs. An alternative solution is to use try/catch to handle the runtime error.
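The try/catch alternative might look like the sketch below. It assumes a psycopg2-style cursor inside an open transaction (autocommit off), since SAVEPOINT only works within a transaction block. The exception class is passed in by the caller, for example psycopg2.IntegrityError, so that the sketch does not import a driver itself; the function and savepoint names are hypothetical.

```python
# Sketch only: assumes a psycopg2-style cursor inside an open transaction.
# `duplicate_error` is the driver's exception class for constraint
# violations, e.g. psycopg2.IntegrityError, supplied by the caller.
def insert_skipping_duplicates(cursor, row, duplicate_error):
    """Insert (name, name_slug, status); return False if it violates a constraint."""
    # A failed statement aborts the whole transaction in PostgreSQL,
    # so set a savepoint to roll back to.
    cursor.execute("SAVEPOINT before_insert")
    try:
        cursor.execute(
            "INSERT INTO hundred (name, name_slug, status) VALUES (%s, %s, %s)",
            row,
        )
    except duplicate_error:
        # Undo only the failed insert; earlier work in the transaction survives.
        cursor.execute("ROLLBACK TO SAVEPOINT before_insert")
        return False
    cursor.execute("RELEASE SAVEPOINT before_insert")
    return True
```

Compared with ON CONFLICT DO NOTHING this costs extra round trips per row, so it is mainly of interest on versions older than 9.5.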

answered Jul 13, 2023 at 15:31

Ahmed Mohamed

303 bronze badges


buddemat · Accepted Answer · 2022-10-20 05:46:34Z

-1

INSERT INTO invoices (invoiceid, billed) (
    SELECT '12345','TRUE' WHERE NOT EXISTS (
        SELECT 1 FROM invoices WHERE invoiceid='12345' AND billed='TRUE'
        )
)

edited Oct 20, 2022 at 5:46

buddemat

5,15214 gold badges32 silver badges54 bronze badges

answered Oct 13, 2022 at 9:01

tothocga

1


Jeff Fairley · Accepted Answer · 2014-09-26 06:53:00Z

I was looking for a similar solution, trying to find SQL that would work in PostgreSQL as well as HSQLDB. (HSQLDB was what made this difficult.) Using your example as a basis, this is the format that I found elsewhere.

sql = "INSERT INTO hundred (name,name_slug,status)"
sql += " ( SELECT " + hundred + ", '" + hundred_slug + "', " + status
sql += " FROM hundred"
sql += " WHERE name = " + hundred + " AND name_slug = '" + hundred_slug + "' AND status = " + status
sql += " HAVING COUNT(*) = 0 );"

Patrick · Accepted Answer · 2016-07-25 23:44:37Z

Here is a generic python function that given a tablename, columns and values, generates the upsert equivalent for postgresql.

import json

def upsert(table_name, id_column, other_columns, values_hash):

    template = """
    WITH new_values ($$ALL_COLUMNS$$) as (
      values
         ($$VALUES_LIST$$)
    ),
    upsert as
    (
        update $$TABLE_NAME$$ m
            set
                $$SET_MAPPINGS$$
        FROM new_values nv
        WHERE m.$$ID_COLUMN$$ = nv.$$ID_COLUMN$$
        RETURNING m.*
    )
    INSERT INTO $$TABLE_NAME$$ ($$ALL_COLUMNS$$)
    SELECT $$ALL_COLUMNS$$
    FROM new_values
    WHERE NOT EXISTS (SELECT 1
                      FROM upsert up
                      WHERE up.$$ID_COLUMN$$ = new_values.$$ID_COLUMN$$)
    """

    all_columns = [id_column] + other_columns
    all_columns_csv = ",".join(all_columns)
    all_values_csv = ','.join([query_value(values_hash[column_name]) for column_name in all_columns])
    set_mappings = ",".join([ c+ " = nv." +c for c in other_columns])

    q = template
    q = q.replace("$$TABLE_NAME$$", table_name)
    q = q.replace("$$ID_COLUMN$$", id_column)
    q = q.replace("$$ALL_COLUMNS$$", all_columns_csv)
    q = q.replace("$$VALUES_LIST$$", all_values_csv)
    q = q.replace("$$SET_MAPPINGS$$", set_mappings)

    return q


def query_value(value):
    """Render a Python value as a SQL literal (single quotes are doubled)."""
    if value is None:
        return "NULL"
    if isinstance(value, str):
        return "'%s'" % value.replace("'", "''")
    if isinstance(value, dict):
        return "'%s'" % json.dumps(value).replace("'", "''")
    # bool, int and any other type fall back to their string form,
    # so that ",".join() in upsert() always receives strings
    return str(value)


if __name__ == "__main__":

    my_table_name = 'mytable'
    my_id_column = 'id'
    my_other_columns = ['field1', 'field2']
    my_values_hash = {
        'id': 123,
        'field1': "john",
        'field2': "doe"
    }
    print(upsert(my_table_name, my_id_column, my_other_columns, my_values_hash))

el fuser · Accepted Answer · 2016-11-08 11:00:38Z

-19

The solution is simple, but not immediate.
If you want to use this statement, you must make one change to the database:

ALTER USER user SET search_path to 'name_of_schema';

After this change, INSERT will work correctly.

answered Nov 8, 2016 at 11:00

el fuser

6241 gold badge6 silver badges10 bronze badges

