Finding differences between two tables in SQL
Problem
Sometimes you need to determine if there are any differences between two sets of data. For instance, I recently refactored a complicated view, which returns roughly 60k rows of 26 columns. The original view contained 7 correlated subqueries, which were bogging down performance. Pro-tip: don’t use correlated subqueries. After removing all of the correlated subqueries, I wanted to see if the before and after resulted in any difference among the roughly 60k rows.
Solution
This is simple to do generically, with a combination of the union, except, and intersect set operators in T-SQL.
Example
For example, given the following SQL, which creates two tables that differ on their row with Id = 2, we can use this trick to see what is different:
This will result in the following output:
Id | First | Last | Age |
---|---|---|---|
2 | Christine | Bateman | 38 |
2 | Mora | Grissom | 12 |
Caveat
This approach does not let you know if there are duplicates in one table. For instance, if you have two rows in one table that are identical, and only one row in the second table that matches them, this query finds no differences between the tables. You may still need to rely on comparing the counts for a final verification. For instance, the following is an example of this verification when comparing two views: