T-SQL: Deleting all duplicate rows but keeping one(T-SQL:删除所有重复的行但保留一个)
问题描述
I have a table with a very large amount of rows. Duplicates are not allowed but due to a problem with how the rows were created I know there are some duplicates in this table. I need to eliminate the extra rows from the perspective of the key columns. Some other columns may have slightly different data but I do not care about that. I still need to keep one of these rows however. SELECT DISTINCT won't work because it operates on all columns and I need to suppress duplicates based on the key columns.
How can I delete the extra rows but still keep one efficiently?
You didn't say what version you were using, but in SQL 2005 and above, you can use a common table expression with the OVER Clause. It goes a little something like this:
WITH cte AS (
SELECT[foo], [bar],
row_number() OVER(PARTITION BY foo, bar ORDER BY baz) AS [rn]
FROM TABLE
)
DELETE cte WHERE [rn] > 1
Play around with it and see what you get.
(Edit: In an attempt to be helpful, someone edited the ORDER BY clause within the CTE. To be clear, you can order by anything you want here, it needn't be one of the columns returned by the cte. In fact, a common use-case here is that "foo, bar" are the group identifier and "baz" is some sort of time stamp. In order to keep the latest, you'd do ORDER BY baz desc)
这篇关于T-SQL:删除所有重复的行但保留一个的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:T-SQL:删除所有重复的行但保留一个
基础教程推荐
- SQL 效率:WHERE IN 子查询 vs. JOIN 然后 GROUP 2021-01-01
- 需要 MySQL 5.1 中的抽象触发器来更新审计日志 2021-01-01
- 将 SQL Server DateTime 列迁移到 DateTimeOffset 2021-01-01
- 是否可以执行按位分组功能? 2021-01-01
- 无法解决整理冲突 2021-01-01
- 如何使用 mysql.connector 禁用查询缓存 2022-01-01
- 在 SQL 中连接多个表 2021-01-01
- SQL:使用来自具有相同列名的两个表中的数据... 2021-01-01
- SSMS 中的权限问题:“对象 'extended_properties'、数据库 'mssqlsystem_resource'、... 错误 229)上的 SELECT 权限被拒绝" 2022-01-01
- SQL Server 实例在登录协商期间返回无效或不受支持的协议版本 2021-01-01
