Use this space to put some text. Update this text in HTML

468x60 banner ad

Advertise with Us

Powered by Blogger.
Showing posts with label Deleting Duplicate Records in Sql Server. Show all posts
Showing posts with label Deleting Duplicate Records in Sql Server. Show all posts

Wednesday, 11 May 2016

Deleting Duplicate Records in Sql Server


Sometimes when we don’t use primary or unique key in the SQL-Server table, then most of the chances to insert duplicate row into the SQL-Server table. It’s quite difficult task to delete the duplicate row from the SQL-Server table.

For the prevention of inserting duplicate rows into the SQL-Server table, we always use primary key or unique key in the SQL-Server table.

First Example:

Using the same ROW_NUMBER() function you can also easily delete the Duplicate Records present inside a table.
I have a [dbo].[tm_Employee] Table which is having duplicate records. I need to delete the duplicate records which are having both the First_name and Last_name same.

So, with the help of ROW_NUMBER()all the duplicate records present in the below table can be removed easily.

Below are the SQL-Server table Structure and Insert Query

CREATE TABLE [dbo].[tm_Employee](
      [First_name] [nvarchar](300) NULL,
      [Last_name] [nvarchar](300) NULL,
      [Salary] [decimal](20, 2) NULL,
      [Joining_date] [datetime] NULL,
      [Department] [nvarchar](200) NULL,
      [City] [nvarchar](250) NULL,
      [Age] [int] NULL
) ON [PRIMARY]


INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Singh', 8000.00, 1894-06-28, 'Insurance', 'Delhi',25)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Rai', 12000.00, 1894-06-27, 'Insurance', 'Delhi',32)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Singh', 6000.00, 1894-06-28, 'Services', 'Delhi',35)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)

SELECT * FROM [dbo].[tm_Employee]



The query for Delete Duplicate Records:-

We can check duplicate record on the basis of  First_name and Last_name both are same then it means record has duplicate record.

Note: - If First_name and Last_name are not same means record has not duplicate record.

You can select below query simultaneously and then press execute query command.


WITH TempEmployee (First_name, Last_name, duplicateRecordCount)
AS
(
 SELECT
   First_name,
   Last_name,
   ROW_NUMBER()OVER(PARTITION BY First_name, Last_name ORDER BY First_name)
   AS duplicateRecordCount
 FROM [dbo].[tm_Employee]
)

DELETE
FROM TempEmployee
WHERE duplicateRecordCount > 1


TempEmployee is used for Temporary purpose; Instead of TempEmployee you can give any name as per convenance upto you.

After the execution of the above query, [dbo].[tm_Employee] table will have the following records.


SELECT * FROM [dbo].[tm_Employee]




Note: Row_Number() function is present in SQL Server 2005 and in later version(s)



Second Example:

If your table has identity column or auto increament column. Then you can have to delete the duplicate record by the following sub-query.

Here I have taken [dbo].[tm_Employee1] table with the [ID] as identity column.

Below are the Table structure and Insert query:

CREATE TABLE [dbo].[tm_Employee1]
(
      [ID] INT IDENTITY(1,1) NOT NULL,
      [First_name] [nvarchar](300) NULL,
      [Last_name] [nvarchar](300) NULL,
      [Salary] [decimal](20, 2) NULL,
      [Joining_date] [datetime] NULL,
      [Department] [nvarchar](200) NULL,
      [City] [nvarchar](250) NULL,
      [Age] [int] NULL
) ON [PRIMARY]


INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Singh', 8000.00, 1894-06-28, 'Insurance', 'Delhi',25)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Rai', 12000.00, 1894-06-27, 'Insurance', 'Delhi',32)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Singh', 6000.00, 1894-06-28, 'Services', 'Delhi',35)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee1] ([First_name], [Last_name], [Salary], [Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)

SELECT * FROM [dbo].[tm_Employee1]

--Selecting distinct records
SELECT * FROM [dbo].[tm_Employee1] E1
WHERE E1.ID = (SELECT MAX(ID) FROM [dbo].[tm_Employee1] E2
WHERE E2.First_name = E1.First_name AND E1.Last_name = E2.Last_name)

The query for Delete Duplicate Records:-

--Deleting duplicates
DELETE [dbo].[tm_Employee1]
WHERE ID < (SELECT MAX(ID) FROM [dbo].[tm_Employee1] E2
WHERE E2.First_name = tm_Employee1.First_name AND E2.Last_name = tm_Employee1.Last_name)

SELECT * FROM [dbo].[tm_Employee1]