How to iterate over a list in a column in pandas to find a match?

Asked by: pheonix  Asked: 4/11/2023  Last edited by: Bhargav - Retarded Skills, pheonix  Updated: 4/11/2023  Views: 55

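Q:

I have a column in which each row holds a list of terms, and I want to find out whether specific words match:

['Home', 'grocery', 'cake']
['Home', 'grocery', 'Biscuit', 'Oreo']

I am trying to find matches from this list: terms = ['cake', 'biscuit']

Expected output:

Column B
['Home', 'grocery', 'cake']
['Home', 'grocery', 'Biscuit', 'Oreo']
Tags: python pandas dataframe numpy match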

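A:

2 votes  jezrael  4/11/2023  #1

Convert the list of terms to a set and use set.isdisjoint in a list comprehension, converting the values of each list to lowercase first. The two assignments below are equivalent: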

terms = ['cake', 'biscuit']
S = set(terms)

df['Column B'] = [not set(y.lower() for y in x).isdisjoint(S) for x in df['meta']]

df['Column B'] = [not set(map(str.lower, x)).isdisjoint(S) for x in df['meta']]
print (df)
                             meta  Column B
0           [Home, grocery, cake]      True
1  [Home, grocery, Biscuit, Oreo]      True

Without converting to lowercase, Biscuit is not matched:

df['Column B'] = [not set(x).isdisjoint(S) for x in df['meta']]
print (df)
                             meta  Column B
0           [Home, grocery, cake]      True
1  [Home, grocery, Biscuit, Oreo]     False
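
For anyone who wants to run the above end to end, here is a minimal self-contained sketch. The construction of df is an assumption (the question does not show it); the column name meta is taken from the printed output, and the variable names insensitive and sensitive are only illustrative.

```python
import pandas as pd

# Assumed reconstruction of the question's data; only the column name
# 'meta' and the row contents come from the printed output above.
df = pd.DataFrame({
    "meta": [
        ["Home", "grocery", "cake"],
        ["Home", "grocery", "Biscuit", "Oreo"],
    ]
})

terms = ["cake", "biscuit"]
S = set(terms)

# Case-insensitive: lowercase every value before testing for overlap with S.
insensitive = [not set(map(str.lower, x)).isdisjoint(S) for x in df["meta"]]

# Case-sensitive: 'Biscuit' != 'biscuit', so the second row has no overlap.
sensitive = [not set(x).isdisjoint(S) for x in df["meta"]]

df["Column B"] = insensitive
print(insensitive)  # [True, True]
print(sensitive)    # [True, False]
```

isdisjoint returns True when the two sets share no element, which is why the result is negated to get a "has a match" flag.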

3 votes  mozway  4/11/2023  #2

You can use a set intersection:

terms = {'cake', 'biscuit'}

df['Column B'] = [bool(set(x)&terms) for x in df['meta']]

If case doesn't matter (e.g. 'Biscuit'/'biscuit'), use str.lower (or str.casefold) to lowercase the strings:

df['Column B'] = [bool(set(map(str.lower, x))&terms) for x in df['meta']]

Output:

                             meta  Column B
0           [Home, grocery, cake]      True
1  [Home, grocery, Biscuit, Oreo]      True
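
A possible variation on the intersection approach, sketched under a few assumptions: the DataFrame is rebuilt by hand because the question does not show how it was created, has_match is a hypothetical helper name, and str.casefold is swapped in for str.lower. Because the intersection is itself a set, the same expression can also report which terms were found, not only whether any were.

```python
import pandas as pd

def has_match(values, terms):
    """Hypothetical helper: True if any value equals a term, ignoring case."""
    folded_terms = {t.casefold() for t in terms}
    return bool({v.casefold() for v in values} & folded_terms)

# Assumed reconstruction of the question's data.
df = pd.DataFrame({
    "meta": [
        ["Home", "grocery", "cake"],
        ["Home", "grocery", "Biscuit", "Oreo"],
    ]
})

terms = {"cake", "biscuit"}

# Boolean flag per row, as in the answer above.
df["Column B"] = [has_match(x, terms) for x in df["meta"]]

# The intersection itself lists the terms that matched in each row.
matched = [sorted({v.casefold() for v in x} & terms) for x in df["meta"]]

print(df["Column B"].tolist())  # [True, True]
print(matched)                  # [['cake'], ['biscuit']]
```

The matched line assumes the entries in terms are already lowercase, as they are in the question; has_match folds both sides, so it does not depend on that.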