Adding a sequential value to a column which has duplicate values

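Asked by: Diya Nair 11 5C  Asked: 11/17/2022  Last edited by: marc_s  Updated: 11/18/2022  Views: 45

Q:

I have a table in Postgres with a column that holds distinct alphanumeric values in the pattern 1234P001. However, due to some error, the column contains duplicate values; for example, 1234P001 appears three times.

I want to replace the duplicate 1234P001 values with 1234P002, 1234P003, and 1234P004. How can I do this in PostgreSQL?

I tried using a sequence, but it did not work.

Tags: postgresql sequence auto-increment

Comments

0 votes S-Man 11/17/2022
How can you be sure that the newly created 1234P003 does not already exist?
0 votes Stefanov.sm 11/17/2022
Perhaps 1234P001.01, 1234P001.02 would be better?

Answers:

0 votes Stefanov.sm 11/18/2022 #1

This can be done with a temporary table and the row_number window function. Here is an illustration.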

-- Prepare a test case
create table the_table (id integer, the_column text);
insert into the_table values 
(1, '1234P001'), 
(2, '1235P001'), 
(3, '1234P001'), 
(4, '1236P001'), 
(5, '1235P001'), 
(6, '1234P001');


create temporary table the_temp_table as 
 select *, row_number() over (partition by the_column order by id) ord 
 from the_table ;

update the_temp_table
 set the_column = the_column||'.'||ord::text where ord > 1;

truncate table the_table;

insert into the_table(id, the_column)
 select id, the_column from the_temp_table;

select * from the_table order by the_column;
id|the_column|
--+----------+
 1|1234P001  |
 3|1234P001.2|
 6|1234P001.3|
 2|1235P001  |
 5|1235P001.2|
 4|1236P001  |
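
The same renaming can probably be done in place, without truncating and reloading the table. The following is only a sketch, not part of the original answer: it assumes that id is unique in the_table and that no other session is writing to the table. It repeats the test case so that it runs on its own, and it uses a CTE together with PostgreSQL's UPDATE ... FROM.

```sql
-- Test case repeated from the answer above so the sketch is self-contained
create table the_table (id integer, the_column text);
insert into the_table values
(1, '1234P001'),
(2, '1235P001'),
(3, '1234P001'),
(4, '1236P001'),
(5, '1235P001'),
(6, '1234P001');

-- Number the rows inside each group of equal values, then suffix every
-- row except the first one of its group (assumption: id is unique)
with numbered as (
  select id,
         row_number() over (partition by the_column order by id) as ord
  from the_table
)
update the_table t
   set the_column = t.the_column || '.' || n.ord::text
  from numbered n
 where n.id = t.id
   and n.ord > 1;

select * from the_table order by the_column;
```

Because the first row of each group keeps its original value, rows that were never duplicated are left untouched.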

Comments

0 votes Diya Nair 11 5C 11/19/2022
Thanks. This is a good solution, simple and effective.
0 votes Marmite Bomber 11/18/2022 #2

Using this sample data to illustrate the concept:

create table tab (id varchar(8) );
insert into tab(id) values 
('1234P001'), 
('1234P001'), 
('1234P001'), 
('1234P002'), 
('1234P004'), 
('1234P004'),
('1234P005');

First, you need to identify the duplicated keys, using count .. over

select id,
count(*) over (partition by id) > 1  is_dup
from tab;

id      |is_dup|
--------+------+
1234P001|true  |
1234P001|true  |
1234P001|true  |
1234P002|false |
1234P004|true  |
1234P004|true  |
1234P005|false |

Assign a unique sequence number to each duplicated row (you will see why shortly)

with dup as (
select id,
count(*) over (partition by id) > 1  is_dup
from tab
)
select id,  
row_number() over (order by id) dup_idx
from dup
where is_dup;

id      |dup_idx|
--------+-------+
1234P001|      1|
1234P001|      2|
1234P001|      3|
1234P004|      4|
1234P004|      5|

Now generate all keys that do not exist yet, following your key schema (here a prefix of length 5 followed by a 3-digit integer)

with free_key as (
select distinct substring(id,1,5)||lpad(idx::text,3,'0') id 
from tab
cross join generate_series(1,10) as t(idx) /* increase the count up to 999 if required */
except 
select id from tab)
select id,
row_number() over (order by id) free_id_idx
from free_key;

id      |free_id_idx|
--------+-----------+
1234P003|          1|
1234P006|          2|
1234P007|          3|
1234P008|          4|
1234P009|          5|
1234P010|          6|

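In the last step, simply join the table of duplicated keys with the unassigned keys on the unique index to get the mapping from old_id to a unique new_id

Note: I use an outer join. If you get an empty new_id, there is a problem: your schema has no free key left to fix the duplicate.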

with dup as (
select id,
count(*) over (partition by id) > 1  is_dup
from tab
),
dup2 as (
select id,  
row_number() over (order by id) dup_idx
from dup
where is_dup),
free_key as (
select distinct substring(id,1,5)||lpad(idx::text,3,'0') id 
from tab
cross join generate_series(1,10) as t(idx) /* increase the count up to 999 if required */
except 
select id from tab),
free_key2 as (
select id,
row_number() over (order by id) free_id_idx
from free_key)
select dup2.id old_id, free_key2.id new_id
from dup2
left outer join free_key2
on dup2.dup_idx = free_key2.free_id_idx;

old_id  |new_id  |
--------+--------+
1234P001|1234P003|
1234P001|1234P006|
1234P001|1234P007|
1234P004|1234P008|
1234P004|1234P009|
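
The query above only lists the mapping. The following sketch shows one way it might be written back to the table; it is not part of the original answer. Since tab has no unique column, it assumes that the PostgreSQL system column ctid can be used to address individual rows for the duration of a single statement, and that no other session modifies tab meanwhile. The name row_ref is introduced here for illustration only.

```sql
-- Sample data repeated from the answer above so the sketch is self-contained
create table tab (id varchar(8) );
insert into tab(id) values
('1234P001'),
('1234P001'),
('1234P001'),
('1234P002'),
('1234P004'),
('1234P004'),
('1234P005');

with dup as (
select ctid as row_ref,   /* physical row address, used as a row identifier */
id,
count(*) over (partition by id) > 1  is_dup
from tab
),
dup2 as (
select row_ref, id,
row_number() over (order by id, row_ref) dup_idx
from dup
where is_dup),
free_key as (
select distinct substring(id,1,5)||lpad(idx::text,3,'0') id
from tab
cross join generate_series(1,10) as t(idx) /* increase the count up to 999 if required */
except
select id from tab),
free_key2 as (
select id,
row_number() over (order by id) free_id_idx
from free_key)
update tab
set id = free_key2.id
from dup2
join free_key2
on dup2.dup_idx = free_key2.free_id_idx
where tab.ctid = dup2.row_ref;

select id from tab order by id;
```

Unlike the outer join in the mapping query, the inner join here leaves a duplicated row unchanged when no free key remains, so it is worth running the mapping query first and checking for empty new_id values.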

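Comments

0 votes Diya Nair 11 5C 11/19/2022
Thanks. This is indeed a workable solution as well, and I can do what I wanted. Thanks.
0 votes Marmite Bomber 11/21/2022
The best reward is an upvote and/or an accept, @DiyaNair115C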