Sometimes when we don’t use primary or unique key in the SQL-Server table, then most of the chances to insert duplicate row into the SQL-Server table. It’s quite difficult task to delete the duplicate row from the SQL-Server table.
For the prevention of inserting duplicate rows into the
SQL-Server table, we always use primary key or unique key in the SQL-Server table.
First Example:
Using the same ROW_NUMBER()
function you can also easily delete the Duplicate Records present inside a
table.
I have a [dbo].[tm_Employee] Table
which is having duplicate records. I need to delete the duplicate records
which are having both the First_name and Last_name same.
So, with the help of ROW_NUMBER()all the duplicate records present in the below table can
be removed easily.
Below are the SQL-Server table
Structure and Insert Query
|
CREATE TABLE [dbo].[tm_Employee](
[First_name] [nvarchar](300) NULL,
[Last_name] [nvarchar](300) NULL,
[Salary] [decimal](20, 2) NULL,
[Joining_date] [datetime] NULL,
[Department] [nvarchar](200) NULL,
[City] [nvarchar](250) NULL,
[Age] [int] NULL
) ON [PRIMARY]
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Deepak', 'Singh', 8000.00, 1894-06-28, 'Insurance', 'Delhi',25)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Satish', 'Rai', 12000.00, 1894-06-27, 'Insurance', 'Delhi',32)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Pankaj', 'Singh', 6000.00, 1894-06-28, 'Services', 'Delhi',35)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee] ([First_name], [Last_name],
[Salary], [Joining_date], [Department],
[City],[Age])
VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
SELECT * FROM
[dbo].[tm_Employee]
The
query for Delete Duplicate Records:-
We can check duplicate record on
the basis of First_name and Last_name
both are same then it means record has duplicate record.
Note: - If First_name and Last_name are not same means record has
not duplicate record.
You can select below query simultaneously
and then press execute query command.
WITH TempEmployee (First_name,
Last_name, duplicateRecordCount)
AS
(
SELECT
First_name,
Last_name,
ROW_NUMBER()OVER(PARTITION BY First_name,
Last_name ORDER BY
First_name)
AS
duplicateRecordCount
FROM [dbo].[tm_Employee]
)
DELETE
FROM TempEmployee
WHERE duplicateRecordCount >
1
TempEmployee
is used for Temporary purpose;
Instead of TempEmployee you can give any name
as per convenance upto you.
After the execution of
the above query, [dbo].[tm_Employee] table will have the following records.
SELECT * FROM
[dbo].[tm_Employee]
Note: Row_Number() function is present in SQL
Server 2005 and in later version(s)
Second Example:
If your table
has identity column or auto increament column. Then you can have to delete the
duplicate record by the following sub-query.
Here I have
taken [dbo].[tm_Employee1]
table with the [ID] as identity
column.
Below are the Table structure and Insert query:
CREATE TABLE [dbo].[tm_Employee1]
(
[ID]
INT IDENTITY(1,1) NOT NULL,
[First_name] [nvarchar](300) NULL,
[Last_name] [nvarchar](300) NULL,
[Salary] [decimal](20, 2) NULL,
[Joining_date] [datetime] NULL,
[Department] [nvarchar](200) NULL,
[City] [nvarchar](250) NULL,
[Age] [int] NULL
) ON [PRIMARY]
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Kumar', 5000.00, 1905-06-07, 'Banking', 'Delhi',18)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Deepak', 'Singh', 8000.00, 1894-06-28, 'Insurance', 'Delhi',25)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Singh', 7000.00, 1894-06-27, 'Banking','Mumbai',28)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Satish', 'Rai', 12000.00, 1894-06-27, 'Insurance', 'Delhi',32)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Singh', 6000.00, 1894-06-28, 'Services', 'Delhi',35)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Pankaj', 'Kumar',6000.00, 1894-06-28, 'Services', 'kolkata',31)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
INSERT [dbo].[tm_Employee1] ([First_name],
[Last_name], [Salary],
[Joining_date], [Department], [City],[Age]) VALUES ('Abhishek', 'Kumar', 650000.00, 1894-06-28, 'Services', 'Delhi',24)
SELECT * FROM [dbo].[tm_Employee1]
--Selecting
distinct records
SELECT * FROM [dbo].[tm_Employee1] E1
WHERE E1.ID = (SELECT MAX(ID) FROM [dbo].[tm_Employee1]
E2
WHERE E2.First_name = E1.First_name AND E1.Last_name = E2.Last_name)
The query for Delete Duplicate
Records:-
--Deleting
duplicates
DELETE [dbo].[tm_Employee1]
WHERE ID < (SELECT MAX(ID) FROM [dbo].[tm_Employee1] E2
WHERE E2.First_name =
tm_Employee1.First_name AND E2.Last_name = tm_Employee1.Last_name)
SELECT * FROM [dbo].[tm_Employee1]
|