I have two sets of tables (e.g. a.1, a.2, a.3 and b.1, b.2, b.3) created using slightly different logic. Analogous tables in the two schemas have exactly the same columns (e.g. a.1 has the same columns as b.1). I believe the tables in the two schemas should contain exactly the same data, but I want to test that belief. I therefore want a query that compares two analogous tables and returns the rows that are not in both. Is there an easy way to do that without manually writing the join? In other words, can I write a query where I only have to change the table names to compare a different pair of tables, leaving the rest of the query unchanged?

To be a bit more explicit, I'm looking to do something like the following:

select * 
from a.1 
where (all columns in a.1) not in (select * from b.1);

If I could write something like this then all I would have to do to compare a.2 to b.2 would be to change the table names. However, it's not clear to me how to come up with the (all columns in a.1) piece in a general way.

Based on a recommendation in the comments, I've created the following showing the kind of thing I'd like to see:

https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=ad0141b0daf8f8f92e6e3fa8d57e67ad

5
  • A full anti-join? That may be easy in some databases, and not so easy on other ones. Which database engine are you using? Commented Aug 18, 2022 at 18:14
  • These are on DB2. Commented Aug 18, 2022 at 18:15
  • A minimal reproducible example would make this much clearer.
    – jarlh
    Commented Aug 18, 2022 at 18:20
  • OK, I will work on producing that Commented Aug 18, 2022 at 18:26
  • @jarlh Please see the link here: dbfiddle.uk/… Commented Aug 18, 2022 at 18:46

3 Answers

1

I was looking for the EXCEPT operator.

So

select * 
from a.1 
where (all columns in a.1) not in (select * from b.1);

can be written as

select * from a.1 
except
select * from b.1

The db<>fiddle linked in the question gives an explicit example of what I wanted.
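
Note that EXCEPT works in one direction only and removes duplicate rows, so the query above reports rows of a.1 that are missing from b.1 but not the reverse. A minimal sketch of a two-way comparison follows. The table names a.t1 and b.t1 and the columns c1 and c2 are hypothetical placeholders standing in for a.1 and b.1, and the sketch assumes you are allowed to create tables in schemas a and b:

```sql
-- Hypothetical sample tables; only the two table names change per comparison.
create table a.t1 (c1 int, c2 varchar(10));
create table b.t1 (c1 int, c2 varchar(10));

insert into a.t1 values (1, 'x'), (2, 'y'), (3, 'z');
insert into b.t1 values (1, 'x'), (2, 'y'), (4, 'w');

-- Rows present in one table but not in the other, labelled by where they were found.
select 'only in a' as src, d.*
from (select * from a.t1 except select * from b.t1) as d
union all
select 'only in b' as src, d.*
from (select * from b.t1 except select * from a.t1) as d;
```

With this sample data the query returns the row (3, 'z') labelled 'only in a' and the row (4, 'w') labelled 'only in b'. If the tables can contain duplicate rows and the number of copies matters, EXCEPT ALL may be a better fit than EXCEPT, since it keeps duplicates instead of collapsing them.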

0

If you have a primary key to match rows between the tables, then you can try a full anti-join. For example:

select a.id as aid, b.id as bid
from a
full join b on b.id = a.id
where a.id is null or b.id is null

If the tables are:

A: 1, 2, 3
B: 1, 2, 4

The result is:

AID  BID
---- ----
null    4   -- means it's present in B, but not in A
   3 null   -- means it's present in A, but not in B

See running example at db<>fiddle.

Of course, if your tables do not have a primary key, or if the rows are inconsistent (same PK, different data), then you'll need to adjust the query.
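
As a rough sketch of one such adjustment, the query below also flags rows that share a key but carry different data. It assumes both tables have a non-null key column id plus a single data column val; val is a hypothetical name, and every additional column you want compared would need its own set of conditions:

```sql
-- Assumes a and b both have columns id (key) and val (hypothetical data column).
select coalesce(a.id, b.id) as id,
       case
           when a.id is null then 'only in b'
           when b.id is null then 'only in a'
           else 'data differs'
       end as status
from a
full join b on b.id = a.id
where a.id is null
   or b.id is null
   -- Null-safe inequality: a plain <> would not flag a null on one side only.
   or a.val <> b.val
   or (a.val is null and b.val is not null)
   or (a.val is not null and b.val is null);
```

The explicit null checks are there because a comparison with null evaluates to unknown, which would otherwise let a row with a missing value on one side slip through unreported.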

1
  • Thank you for the answer. I apologize if I was not clear in my question, however in this case I need to specify the column names I want to join on. Furthermore, this only compares the column names I have specified. I want to compare all columns without having to explicitly name them if there is a way to do so. Please see the following link, which might help clarify my request: dbfiddle.uk/… Commented Aug 18, 2022 at 18:47
0

As an alternative you can try this:

select 'a1' t,* from (
    select a1.*,row_number() over (partition by c1 order by 1) as rn from a1
    minus
    select b1.*,row_number() over (partition by c1 order by 1) as rn from b1
)
union all
select 'b1' t,* from (
    select b1.*,row_number() over (partition by c1 order by 1) as rn from b1
    minus
    select a1.*,row_number() over (partition by c1 order by 1) as rn from a1
) 

fiddle

edit: you can shorten the query by precalculating the rn part, instead of doing the same calculation again.
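
One possible shape for that shortened version, sketched with common table expressions and not run against the fiddle. The names a_rn, b_rn and d are hypothetical, and the window definition is copied unchanged from the query above:

```sql
-- Compute the row numbers once per table, then reuse them in both directions.
with a_rn as (
    select a1.*, row_number() over (partition by c1 order by 1) as rn from a1
),
b_rn as (
    select b1.*, row_number() over (partition by c1 order by 1) as rn from b1
)
select 'a1' as t, d.* from (select * from a_rn minus select * from b_rn) as d
union all
select 'b1' as t, d.* from (select * from b_rn minus select * from a_rn) as d;
```

The row number only distinguishes repeated rows reliably when rows with the same c1 are identical in every other column; otherwise the numbering within a partition is arbitrary and the two tables may be numbered differently.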
