SQL Server Duplicate Checking

Asked: 9/6/2008 Last edited by: Shef Updated: 11/25/2015 Views: 1934

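Q:

What is the best way to identify duplicate records in a SQL Server table?

For example, I want to find the last duplicate email received in a table (the table has a primary key, a received date, and an email field).

Sample data: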

1  01/01/2008 [email protected]
2  02/01/2008 [email protected]
3  01/12/2008 [email protected]
sql-server

A:

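0 votes Rob Rolnick 9/6/2008 #1

Couldn't you join the list on the email field and then see which nulls you get in the result?

Or better yet, count the instances of each email address, and return only those with a count > 1.

Or even use the email and ID fields, and return the entries where the email is the same and the IDs are different. (To avoid getting each pair twice, don't use != but rather < or >.)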
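
The last suggestion might look like the following sketch. It assumes a table named EmailTable with columns id, receiveddate and email; the question does not give these names, so they are placeholders.

```sql
-- Assumed schema: EmailTable(id, receiveddate, email); the names are hypothetical.
-- With a.id < b.id each duplicate pair is returned once; with a.id <> b.id
-- every pair would appear twice, as (a, b) and again as (b, a).
SELECT a.id AS first_id,
       b.id AS second_id,
       a.email
FROM EmailTable AS a
JOIN EmailTable AS b
  ON a.email = b.email
 AND a.id < b.id;
```

An address that occurs n times produces n(n-1)/2 rows here, so for heavily duplicated data the counting approach is likely to be easier to read.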

9 votes SQLMenace 9/6/2008 #2

Something like this

select email ,max(receiveddate) as MaxDate
from YourTable
group by email 
having count(email) > 1
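
The query above returns each duplicated email and its latest date, but not the primary key. One possible way to get the whole row is to join the result back to the table; this is a sketch that assumes the key column is called id, which the question does not state.

```sql
-- Assumed schema: YourTable(id, receiveddate, email); id is a hypothetical key column.
SELECT t.id, t.receiveddate, t.email
FROM YourTable AS t
JOIN (
    -- Duplicated emails together with their most recent received date
    SELECT email, MAX(receiveddate) AS MaxDate
    FROM YourTable
    GROUP BY email
    HAVING COUNT(email) > 1
) AS d
  ON d.email = t.email
 AND d.MaxDate = t.receiveddate;
```

If two rows share both the email and the latest date, both are returned.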
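0 votes Michael Sharek 9/6/2008 #3

Try this

-- "table" is a reserved word, so a placeholder name is used; id is the assumed primary key
select * from YourTable a, YourTable b
where a.email = b.email
and a.id < b.id -- without this, every row also matches itself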
0 votes palehorse 9/6/2008 #4
SELECT [id], [receivedate], [email]
FROM [mytable]
WHERE [email] IN ( SELECT [email]
    FROM [mytable]
    GROUP BY [email]
    HAVING COUNT([email]) > 1 )
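0 votes enigmatic 9/6/2008 #5

Do you want a list of the last items? If so, you could use:

-- Keeps the rows for which no later row with the same email exists
SELECT [info] FROM [table] t WHERE NOT EXISTS (SELECT * FROM [table] tCheck WHERE tCheck.[email] = t.[email] AND tCheck.[date] > t.[date])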

If you want a list of all duplicated email addresses, use GROUP BY to collect similar data, then a HAVING clause to make sure the count is greater than 1:

SELECT [email] FROM [table] GROUP BY [email] HAVING COUNT(*) > 1

If you want the last duplicated email (a single result), simply add "TOP 1" and "ORDER BY":

SELECT TOP 1 [email] FROM [table] GROUP BY [email] HAVING COUNT(*) > 1 ORDER BY MAX([date]) DESC
0 votes Brian 9/6/2008 #6

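If you have a surrogate key, it is relatively easy to use the GROUP BY syntax mentioned in SQLMenace's post. Essentially, group by all the fields that make two or more rows "the same".

Example for deleting duplicate records.

-- Column types are assumed for illustration
CREATE TABLE people (ID INT PRIMARY KEY, Name VARCHAR(100), Address VARCHAR(200), DOB DATETIME)

-- Keep the lowest ID in each group of identical rows and delete the rest
DELETE FROM people WHERE ID NOT IN (
    SELECT MIN(ID) FROM people GROUP BY Name, Address, DOB
)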
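
On SQL Server 2005 or later, the same cleanup can also be sketched with ROW_NUMBER() and a common table expression. This is an alternative to the NOT IN form, not part of the original answer, and it assumes the people table described in this answer.

```sql
-- Number the rows inside each group of identical (Name, Address, DOB) values,
-- lowest ID first, then delete everything except the first row of each group.
-- Any statement placed before WITH must be terminated with a semicolon.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Name, Address, DOB ORDER BY ID) AS rn
    FROM people
)
DELETE FROM numbered
WHERE rn > 1;
```

Running the inner SELECT on its own first, with the other columns added, is a cheap way to review which rows would be removed.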
1 vote user2051770 2/8/2013 #7

Try something like this:

SELECT * FROM (
  SELECT *, 
  -- Number the rows within each email, newest first
  ROW_NUMBER() OVER (PARTITION BY Email ORDER BY ReceivedDate DESC) AS RowNumber 
  FROM EmailTable
) a
WHERE RowNumber = 1 -- the most recent row for each email

See http://www.technicaloverload.com/working-with-duplicates-in-sql-server/
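
The query in this answer also returns addresses that occur only once. To keep only the latest row of addresses that really are duplicated, a windowed COUNT(*) could be added, as in this sketch. It assumes the key column of EmailTable is named ID, which the thread does not confirm.

```sql
-- Assumed schema: EmailTable(ID, ReceivedDate, Email); ID is a hypothetical column name.
SELECT ID, ReceivedDate, Email
FROM (
    SELECT ID, ReceivedDate, Email,
           -- 1 marks the most recent row for each email
           ROW_NUMBER() OVER (PARTITION BY Email ORDER BY ReceivedDate DESC) AS RowNumber,
           -- number of rows that share this email
           COUNT(*) OVER (PARTITION BY Email) AS EmailCount
    FROM EmailTable
) AS a
WHERE RowNumber = 1
  AND EmailCount > 1;
```

When two rows for the same email have the same ReceivedDate, ROW_NUMBER() picks one of them arbitrarily; adding ID to the ORDER BY would make the choice deterministic.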