Pages

Tuesday, September 28, 2010

How to remove Duplicate records from table


This is the frequently asked question in an interview. How to remove or select distinct all data with the unique value. You all are knows that whenever we write distinct keyword with column it will returns unique combination of all columns.
In following query if you use distinct then it will return all rows then how to eliminate particular column duplication.You can not use distinct at this moment. I have a solution for this.
Just copy and paste following code in your SQL Server 2005 query window and then run it.
 
DECLARE @EMPDATA TABLE(EMPNO BIGINT PRIMARY KEY,EMAIL VARCHAR(50))

INSERT INTO @EMPDATA VALUES(1,'parag@gmail.com')
INSERT INTO @EMPDATA VALUES(2,'deven@gmail.com')
INSERT INTO @EMPDATA VALUES(3,'parag@gmail.com')
INSERT INTO @EMPDATA VALUES(4,'jig@gmail.com')
INSERT INTO @EMPDATA VALUES(5,'dharmik@gmail.com')
INSERT INTO @EMPDATA VALUES(6,'jig@gmail.com')
INSERT INTO @EMPDATA VALUES(7,'sandy@gmail.com')
INSERT INTO @EMPDATA VALUES(8,'dharmik@gmail.com')

;WITH CTE
AS
(
   SELECT ROW_NUMBER()OVER(PARTITION BY EMAIL ORDER BY  EMPNO)RANK,*
     FROM @EMPDATA
)

SELECT EMPNO,EMAIL FROM CTE
WHERE RANK=1
ORDER BY EMPNO

In above query it will just select the unique value if you want to remove then write DELETE statement where rank = 1.