Asked by: Diya Nair 11 5C  Asked: 11/17/2022  Last edited by: marc_s  Updated: 11/18/2022  Views: 45
Adding a sequential value to a column which has duplicate values
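Q:
I have a table in Postgres with a column that holds distinct alphanumeric values in the pattern 1234P001. However, due to some bug, there are duplicate values in the column, for example 1234P001 appearing three times.
I want to replace the duplicate 1234P001 values with 1234P002, 1234P003 and 1234P004. How can I do this in PostgreSQL?
I tried using a sequence, but it did not work.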
A:
0 votes
Stefanov.sm
11/18/2022
#1
This can be done with a temporary table and the row_number window function. Here is an illustration.
-- Prepare a test case
create table the_table (id integer, the_column text);
insert into the_table values
(1, '1234P001'),
(2, '1235P001'),
(3, '1234P001'),
(4, '1236P001'),
(5, '1235P001'),
(6, '1234P001');
create temporary table the_temp_table as
select *, row_number() over (partition by the_column order by id) ord
from the_table ;
update the_temp_table
set the_column = the_column||'.'||ord::text where ord > 1;
truncate table the_table;
insert into the_table(id, the_column)
select id, the_column from the_temp_table;
select * from the_table order by the_column;
id | the_column |
---|---|
1 | 1234P001 |
3 | 1234P001.2 |
6 | 1234P001.3 |
2 | 1235P001 |
5 | 1235P001.2 |
4 | 1236P001 |
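The same renumbering can probably be done in place, without the temporary table and the truncate. The following is a minimal sketch, not part of the original answer. It assumes that the_table.id uniquely identifies each row, which holds for the test case above but is an assumption about the real table.

```sql
-- Sketch: number the rows per the_column value, then update only the repeats.
-- Assumption: the_table.id is unique, so it can be used to match rows.
with numbered as (
  select id,
         row_number() over (partition by the_column order by id) as ord
  from the_table
)
update the_table t
set the_column = t.the_column || '.' || n.ord::text
from numbered n
where t.id = n.id
  and n.ord > 1;   -- the first occurrence keeps its original value
```

Because only rows with ord > 1 are touched, the first occurrence of each value stays unchanged, as in the temporary-table version.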
Comments
0 votes
Diya Nair 11 5C
11/19/2022
Thanks. That's a good solution. Simple and effective.
0 votes
Marmite Bomber
11/18/2022
#2
Using this sample data to illustrate the concept:
create table tab (id varchar(8) );
insert into tab(id) values
('1234P001'),
('1234P001'),
('1234P001'),
('1234P002'),
('1234P004'),
('1234P004'),
('1234P005');
First you need to identify the duplicated keys, using count .. over
select id,
count(*) over (partition by id) > 1 is_dup
from tab;
id |is_dup|
--------+------+
1234P001|true |
1234P001|true |
1234P001|true |
1234P002|false |
1234P004|true |
1234P004|true |
1234P005|false |
Assign a unique sequence number to each duplicated row (you will see why shortly)
with dup as (
select id,
count(*) over (partition by id) > 1 is_dup
from tab
)
select id,
row_number() over (order by id) dup_idx
from dup
where is_dup;
id |dup_idx|
--------+-------+
1234P001| 1|
1234P001| 2|
1234P001| 3|
1234P004| 4|
1234P004| 5|
Now generate all keys that do not exist yet according to your key schema (here a prefix of length 5 followed by a 3-digit integer)
with free_key as (
select distinct substring(id,1,5)||lpad(idx::text,3,'0') id
from tab
cross join generate_series(1,10) as t(idx) /* increase the count up to 999 if required */
except
select id from tab)
select id,
row_number() over (order by id) free_id_idx
from free_key;
id |free_id_idx|
--------+-----------+
1234P003| 1|
1234P006| 2|
1234P007| 3|
1234P008| 4|
1234P009| 5|
1234P010| 6|
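In the last step, simply join the rows with duplicated keys to the unassigned keys on the unique index to get a unique old_id to new_id resolution.
Note: I use an outer join. If you get an empty new_id, there is a problem: your key schema has no free key left to fix it.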
with dup as (
select id,
count(*) over (partition by id) > 1 is_dup
from tab
),
dup2 as (
select id,
row_number() over (order by id) dup_idx
from dup
where is_dup),
free_key as (
select distinct substring(id,1,5)||lpad(idx::text,3,'0') id
from tab
cross join generate_series(1,10) as t(idx) /* increase the count up to 999 if required */
except
select id from tab),
free_key2 as (
select id,
row_number() over (order by id) free_id_idx
from free_key)
select dup2.id old_id, free_key2.id new_id
from dup2
left outer join free_key2
on dup2.dup_idx = free_key2.free_id_idx;
old_id |new_id |
--------+--------+
1234P001|1234P003|
1234P001|1234P006|
1234P001|1234P007|
1234P004|1234P008|
1234P004|1234P009|
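The query above only produces a mapping. It also assigns a new key to every copy of a duplicated id, including the first one. One possible way to apply the idea to the table is sketched below. It is not part of the original answer and rests on two assumptions: the first copy of each duplicated id keeps its value, and the system column ctid is used as a row handle because tab has no primary key. The names row_ref and copy_no are made up for this illustration.

```sql
-- Sketch only: give each surplus copy of a duplicated id the next free key.
-- Assumption: ctid identifies a row within this single statement
-- (it is not a stable identifier across statements).
with dup as (
  select ctid as row_ref,
         id,
         row_number() over (partition by id order by ctid) as copy_no
  from tab
),
dup2 as (
  select row_ref,
         row_number() over (order by id, copy_no) as dup_idx
  from dup
  where copy_no > 1            -- the first copy keeps its id
),
free_key as (
  select distinct substring(id,1,5)||lpad(idx::text,3,'0') as id
  from tab
  cross join generate_series(1,10) as t(idx) /* increase up to 999 if required */
  except
  select id from tab
),
free_key2 as (
  select id,
         row_number() over (order by id) as free_id_idx
  from free_key
)
update tab
set id = free_key2.id
from dup2
join free_key2 on dup2.dup_idx = free_key2.free_id_idx
where tab.ctid = dup2.row_ref;

-- Check afterwards: this should return no rows if every duplicate got a free key.
select id, count(*) from tab group by id having count(*) > 1;
```

Unlike the outer join in the mapping query, the inner join here leaves a surplus row unchanged when no free key is left, so the check query at the end is worth running. Like the original, the free-key generation takes the prefix from the existing ids, so with several prefixes in the table the free keys are not matched to the prefix of the duplicate.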
Comments
0 votes
Diya Nair 11 5C
11/19/2022
Thanks. This is indeed also a workable solution, and I can do what I wanted. Thanks.
0 votes
Marmite Bomber
11/21/2022
The best reward is an upvote and/or an accept, @DiyaNair115C