How to delete duplicate rows in SQL

H

To find and delete duplicate rows from a table, you can execute the query below. Note that this is for Oracle. You can use it for other databases but you need to change the ‘rowid’ then.

delete from
   table_name A
where
  a.rowid >
   any (
     select B.rowid
     from table_name B
     where A.column_name = B.column_name
   );

You need to change the ‘table_name’ by your table name an the ‘column_name’ by your column you want to check on.

To explain it simply, I will show an example.

Take, for example, a table named persons. In that table, there are the following fields: id, username, first_name, last_name, emailaddress

Let’s say we want to check that username and emailaddress are unique, we need to use the following query:

delete from
   persons A
where
  a.rowid >
   any (
     select B.rowid
     from persons B
     where A.username = B.username
     and A.emailaddress = B.emailaddress
   );

Don’t replace rowid by id. In oracle rowid will always be 1,2,3,4… If you want to use it for another sql language, search google for rowid alternatives.

Result:

  • Person[id=123, username=laurent, first_name=laurent, last_name=hinoul, emailaddress=info@laurenthinoul.com]
  • Person[id=456, username=laurent, first_name=test, last_name=test, emailaddress=info@laurenthinoul.com]

The first one will be kept and the last one will be removed because username and emailaddress are equal and the first rowid is less than the second one.

Add comment

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Tag Cloud

Categories