Asked by: gevaraweb  Asked: 11/17/2023  Updated: 11/17/2023  Views: 58
Pandas - compute the sum of numerical characters of every string in a dataframe
Q:
I have a dataframe with a column "dname". It contains many rows of 2LD domain names, e.g.
123ask
example92
what3ver
...
I want to find the sum of the digits in each string, and create a new column in the dataframe with those values, i.e.
6
11
3
...
I have:
df['numeric'] = df.dname.apply(lambda v : sum(x, x.isdigit() for x in v))
It does not work :( Any help? Thanks in advance!
Answers:
2 votes
mozway
11/17/2023
#1
Using your approach:
df['numeric'] = [sum(int(x) for x in s if x.isnumeric())
                 for s in df['dname']]
# or
df['numeric'] = df['dname'].apply(lambda s: sum(int(x) for x in s
                                                if x.isnumeric()))
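For context, the attempt in the question raises a SyntaxError: `sum(x, x.isdigit() for x in v)` passes an unparenthesized generator expression alongside another argument, and it never converts the characters to integers. A minimal, self-contained sketch of the corrected idea follows. The helper name `digit_sum` and the extra row `'nodigits'` are assumptions added for illustration, and `str.isdecimal` is swapped in for `isnumeric` as a slightly stricter test, since `int()` rejects some characters that `isnumeric` accepts (for example `'½'`).

```python
import pandas as pd

def digit_sum(s):
    # digit_sum is a hypothetical helper name, not part of pandas.
    # Keep only decimal digit characters and add up their integer values.
    return sum(int(ch) for ch in s if ch.isdecimal())

# 'nodigits' is an extra illustrative row that is not in the question.
df = pd.DataFrame({'dname': ['123ask', 'example92', 'what3ver', 'nodigits']})
df['numeric'] = df['dname'].apply(digit_sum)
print(df)
```

A string without any digit yields 0 here, because `sum` of an empty generator is 0.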
You can also extract all the digits, then convert them to integers and use groupby.sum:
df['numeric'] = (df['dname'].str.extractall(r'(\d)')[0]
                 .astype(int).groupby(level=0).sum()
                 )
Output:
       dname  numeric
0     123ask        6
1  example92       11
2   what3ver        3
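One caveat with the `extractall` variant: rows that contain no digit at all produce no match, so they are missing from the grouped result and end up as NaN after assignment. A possible way to handle that, sketched here under the assumption that such rows should count as 0, is to reindex the sums against the original index. The sample row `'nodigits'` is made up for illustration.

```python
import pandas as pd

# 'nodigits' is an illustrative row that is not in the question.
df = pd.DataFrame({'dname': ['123ask', 'nodigits', 'what3ver']})

# One row per matched digit, indexed by (original row, match number).
sums = (df['dname'].str.extractall(r'(\d)')[0]
          .astype(int)
          .groupby(level=0).sum())

# Rows with no match are absent from `sums`; reindex restores them with 0.
df['numeric'] = sums.reindex(df.index, fill_value=0)
print(df)
```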
Comments
1 vote
gevaraweb
11/17/2023
Thank you very much, it works! The code is very short.
0 votes
user_stack_overflow
11/17/2023
#2
Here it is:
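import pandas as pd

def summ(n):
    # Sum the decimal digits of a non-negative integer
    sum1 = 0
    while n > 0:
        sum1 = sum1 + (n % 10)
        n = n // 10  # integer division; int(n / 10) loses precision for large n
    return sum1

df = pd.DataFrame()
df['dname'] = ['123ask', 'example92', 'what3ver']

# Collect the digit characters of each string
list1 = []
for s in df['dname']:
    ss = ''
    for b in s:
        if 48 <= ord(b) <= 57:  # ASCII codes of '0' to '9'
            ss = ss + b
    list1.append(ss)
df['digits'] = list1
print(df)

df['sum1'] = [summ(int(a)) for a in df['digits']]
print(df)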
Output:
       dname  digits  sum1
0     123ask     123     6
1  example92      92    11
2   what3ver       3     3
0 votes
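Note that this approach calls `int()` on the collected digit string, which raises a ValueError when a name contains no digits (the string is then empty). A compact sketch of the same idea with a guard for that case is shown below. Treating such rows as 0 is an assumption, and the row `'nodigits'` is added only for illustration.

```python
import pandas as pd

def summ(n):
    # Sum the decimal digits of a non-negative integer.
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total

# 'nodigits' is an illustrative row that is not in the original answer.
df = pd.DataFrame({'dname': ['123ask', 'example92', 'nodigits']})

# Keep only the ASCII digit characters of each string.
df['digits'] = [''.join(c for c in s if '0' <= c <= '9') for s in df['dname']]

# An empty digit string is mapped to 0 instead of being passed to int().
df['sum1'] = [summ(int(d)) if d else 0 for d in df['digits']]
print(df)
```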
PaulS
11/17/2023
#3
Another possible solution:
df['sum'] = (df['dname'].str.replace(r'\D', '', regex=True)
             .map(lambda x: sum(map(int, x))))
Another possible solution:
# DataFrame.map requires pandas >= 2.1; older versions use applymap
df['sum'] = (df['dname'].str.split('', expand=True)
             .map(lambda x: int(x) if x is not None and x.isnumeric()
                  else 0).sum(axis=1))
Or:
df['sum'] = (df['dname'].str.replace(r'\D', '', regex=True)
             .str.split('').map(
                 lambda x: sum([int(y) for y in x if y != ''])))
Output:
       dname  sum
0     123ask    6
1  example92   11
2   what3ver    3
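To make the `str.replace` variant easier to follow, the sketch below splits it into its two steps and prints the intermediate result. The variable name `digits` is an assumption introduced for illustration; the sample data comes from the question.

```python
import pandas as pd

df = pd.DataFrame({'dname': ['123ask', 'example92', 'what3ver']})

# Step 1: strip every character that is not a digit.
digits = df['dname'].str.replace(r'\D', '', regex=True)
print(digits.tolist())

# Step 2: sum the integer value of each remaining character.
df['sum'] = digits.map(lambda x: sum(map(int, x)))
print(df['sum'].tolist())
```

Since `sum` over an empty `map` is 0, a name without digits would simply give 0 with this variant.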