86

I have two tables with similar column names and I need to return records from the left table which are not found in the right table? I have a primary key(column) which will help me to compare both tables. Which join is preferred?

6

8 Answers 8

118

Try This

SELECT f.*
FROM first_table f LEFT JOIN second_table s ON f.key=s.key
WHERE s.key is NULL

For more please read this article : Joins in Sql Server

enter image description here

5
  • 5
    key is IS NULL not = NULL Commented Mar 20, 2017 at 3:41
  • @HaydenThring s.key = NULL means no matching row found in second table,so condition is true i think
    – Shamseer K
    Commented Mar 22, 2017 at 3:52
  • sorry i meant "the key to getting this to work is" not to be confused with use of "key" as a variable in this answer Commented Mar 22, 2017 at 5:37
  • In the left join, A&B will the output as per the diagram. Can you clarify me if adding "WHERE s.key is NULL" at the end of statement will remove the results in the B? Commented Oct 3, 2019 at 7:04
  • NOT EXISTS might be better: dba.stackexchange.com/questions/121034/… Commented May 8, 2023 at 7:24
65

If you are asking for T-SQL then lets look at fundamentals first. There are three types of joins here each with its own set of logical processing phases as:

  1. A cross join is simplest of all. It implements only one logical query processing phase, a Cartesian Product. This phase operates on the two tables provided as inputs to the join and produces a Cartesian product of the two. That is, each row from one input is matched with all rows from the other. So if you have m rows in one table and n rows in the other, you get m×n rows in the result.
  2. Then are Inner joins : They apply two logical query processing phases: A Cartesian product between the two input tables as in a cross join, and then it filters rows based on a predicate that you specify in ON clause (also known as Join condition).
  3. Next comes the third type of joins, Outer Joins:

    In an outer join, you mark a table as a preserved table by using the keywords LEFT OUTER JOIN, RIGHT OUTER JOIN, or FULL OUTER JOIN between the table names. The OUTER keyword is optional. The LEFT keyword means that the rows of the left table are preserved; the RIGHT keyword means that the rows in the right table are preserved; and the FULL keyword means that the rows in both the left and right tables are preserved.

    The third logical query processing phase of an outer join identifies the rows from the preserved table that did not find matches in the other table based on the ON predicate. This phase adds those rows to the result table produced by the first two phases of the join, and uses NULL marks as placeholders for the attributes from the nonpreserved side of the join in those outer rows.

Now if we look at the question: To return records from the left table which are not found in the right table use Left outer join and filter out the rows with NULL values for the attributes from the right side of the join.

0
12

I also like to use NOT EXISTS. When it comes to performance if index correctly it should perform the same as a LEFT JOIN or better. Plus its easier to read.

SELECT Column1
FROM TableA a
WHERE NOT EXISTS ( SELECT Column1
                   FROM Tableb b
                   WHERE a.Column1 = b.Column1
                 )
3

This page gives a decent breakdown of the different join types, as well as venn diagram visualizations to help... well... visualize the difference in the joins.

As the comments said this is a quite basic query from the sounds of it, so you should try to understand the differences between the joins and what they actually mean.

Check out http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/

You're looking for a query such as:

DECLARE @table1 TABLE (test int)
DECLARE @table2 TABLE (test int)

INSERT INTO @table1
(
    test
)
SELECT 1
UNION ALL SELECT 2

INSERT INTO @table2
(
    test
)
SELECT 1
UNION ALL SELECT 3

-- Here's the important part
SELECT  a.*
FROM    @table1 a
LEFT    join @table2 b on a.test = b.test -- this will return all rows from a
WHERE   b.test IS null -- this then excludes that which exist in both a and b

-- Returned results:

2
2

I can't add anything but a code example to the other two answers: however, I find it can be useful to see it in action (the other answers, in my opinion, are better because they explain it).

DECLARE @testLeft TABLE (ID INT, SomeValue VARCHAR(1))
DECLARE @testRight TABLE (ID INT, SomeOtherValue VARCHAR(1))

INSERT INTO @testLeft (ID, SomeValue) VALUES (1, 'A')
INSERT INTO @testLeft (ID, SomeValue) VALUES (2, 'B')
INSERT INTO @testLeft (ID, SomeValue) VALUES (3, 'C')


INSERT INTO @testRight (ID, SomeOtherValue) VALUES (1, 'X')
INSERT INTO @testRight (ID, SomeOtherValue) VALUES (3, 'Z')

SELECT l.*
FROM 
    @testLeft l
     LEFT JOIN 
    @testRight r ON 
        l.ID = r.ID
WHERE r.ID IS NULL 
1
  • if you are using Mysql replace the last line WHERE r.ID IS NULL with WHERE ISNULL(r.ID)
    – Ananda
    Commented Jul 24, 2016 at 9:01
2

select * from left table where key field not in (select key field from right table)

2
  • Can you post an explanation for why your code works? also use the formatting tools to better format your answer.
    – Spoody
    Commented Apr 12, 2018 at 19:58
  • 1
    This is one of the correct ways to solve the question. But, its performance is not good if we work with 10k, 20k records
    – NoName
    Commented Nov 1, 2018 at 8:29
0

I also like to use NOT IN. When it comes to performance if index correctly it should perform the same as a LEFT JOIN or better. Plus its easier to read.

SELECT Column1
FROM TableA a
WHERE NOT EXISTS ( SELECT Column2
                   FROM Tableb b
                   WHERE a.Column1 = b.Column2
                 )
0

This is an example from real life work, I was asked to supply a list of users that bought from our site in the last 6 months but not in the last 3 months.

For me, the most understandable way I can think of is like so:

--Users that bought from us 6 months ago and between 3 months ago.
DECLARE @6To3MonthsUsers table (UserID int,OrderDate datetime)
INSERT @6To3MonthsUsers
    select u.ID,opd.OrderDate
        from OrdersPaid opd
        inner join Orders o
        on opd.OrderID = o.ID
        inner join Users u
        on o.BuyerID = u.ID
        where 1=1 
        and opd.OrderDate BETWEEN DATEADD(m,-6,GETDATE()) and DATEADD(m,-3,GETDATE())

--Users that bought from us in the last 3 months
DECLARE @Last3MonthsUsers table (UserID int,OrderDate datetime)
INSERT @Last3MonthsUsers
    select u.ID,opd.OrderDate
        from OrdersPaid opd
        inner join Orders o
        on opd.OrderID = o.ID
        inner join Users u
        on o.BuyerID = u.ID
        where 1=1 
        and opd.OrderDate BETWEEN DATEADD(m,-3,GETDATE()) and GETDATE()

Now, with these 2 tables in my hands I need to get only the users from the table @6To3MonthsUsers that are not in @Last3MonthsUsers table.

There are 2 simple ways to achieve that:

  1. Using Left Join:

    select distinct a.UserID
    from @6To3MonthsUsers a
    left join @Last3MonthsUsers b
    on a.UserID = b.UserID
    where b.UserID is null
    
  2. Not in:

    select distinct a.UserID
    from @6To3MonthsUsers a
    where a.UserID not in (select b.UserID from @Last3MonthsUsers b)
    

Both ways will get me the same result, I personally prefer the second way because it's more readable.

Not the answer you're looking for? Browse other questions tagged or ask your own question.