Identify specific integers in a column of mixed ints and strings

Question:

I have a column in a pandas df, specialty, that looks like this:

0         1,5
1           1
2     1,2,4,6    
3           2
4           1
5         1,5
6           3
7           3
8           1
9         2,3

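I want to create a new column, is_1, that contains 1 for every row of specialty that contains a 1 and 0 for every row that does not.

I'm not sure how to do this with a column of mixed dtypes. Would I just use str.contains() inside an np.where() call? Like this: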

np.where((part_chars['specialty'] == 1) | part_chars['specialty'].str.contains('1'), 1, 0)

Yes, that works...

Tags: python, pandas, numpy, integer, object type

part_chars['is_1'] = (part_chars['specialty'].astype(str)
                          .str.contains(r'\b1\b').astype(int))
print(part_chars)

# Output
  specialty  is_1
0       1,5     1
1         1     1
2   1,2,4,6     1
3         2     0
4         1     1
5       1,5     1
6         3     0
7         3     0
8         1     1
9       2,3     0

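Alternative: str.split

# astype(str) first, so that integer entries are split too instead of becoming NaN
part_chars['is_1'] = (part_chars['specialty'].astype(str).str.split(',', expand=True)
                          .eq('1').any(axis=1).astype(int))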
print(part_chars)

# Output
  specialty  is_1
0       1,5     1
1         1     1
2   1,2,4,6     1
3         2     0
4         1     1
5       1,5     1
6         3     0
7         3     0
8         1     1
9       2,3     0

2 votes, mozway, 3/30/2023, #2

Use str.contains with a regex that matches 1 only as a standalone number (\b marks a word boundary):

part_chars['is_1'] = (part_chars['specialty'].astype(str)
                      .str.contains(r'\b1\b').astype(int)
                     )

Output:

  specialty  is_1
0       1,5     1
1         1     1
2   1,2,4,6     1
3         2     0
4         1     1
5       1,5     1
6         3     0
7         3     0
8         1     1
9       2,3     0
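
To see what the word boundaries change, here is a small sketch comparing a plain substring match with \b1\b. The sample values (including 21, 11 and 10) are assumptions chosen to show the edge cases, not data from the question:

```python
import pandas as pd

# Hypothetical sample: mixed ints and strings, plus values that merely contain the digit 1
s = pd.Series([1, '1,5', '21', '11', '2,3', 10], dtype=object)
as_text = s.astype(str)

# Plain substring match: any value containing the character '1' is flagged
substring = as_text.str.contains('1').astype(int)

# Word-boundary match: only a standalone 1 is flagged
whole = as_text.str.contains(r'\b1\b').astype(int)

print(pd.DataFrame({'specialty': s, 'substring': substring, 'whole_token': whole}))
```

In this sample the substring version flags 21, 11 and 10 as well, while the \b1\b version flags only the first two rows.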

Your solution:

import numpy as np
import pandas as pd

part_chars = pd.DataFrame({'specialty': ['1,5', '1', '1,2,4,6', '2', '1', '1,5', '3', '3', '1', '2,3', '21']})
part_chars['is_1'] = np.where((part_chars['specialty'] == 1) | part_chars['specialty'].str.contains('1'), 1, 0)

Output:

   specialty  is_1
0        1,5     1
1          1     1
2    1,2,4,6     1
3          2     0
4          1     1
5        1,5     1
6          3     0
7          3     0
8          1     1
9        2,3     0
10        21     1  # might be unwanted
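
If a value such as 21 should not count, one option besides the regex is to split each entry on commas and test for the exact token. A minimal sketch, assuming entries may be ints or comma-separated strings; has_token is a hypothetical helper written for this example, not a pandas function:

```python
import pandas as pd

def has_token(value, token='1'):
    """Return 1 if `token` is one of the comma-separated items in `value`, else 0."""
    # str() lets real ints and strings go through the same path
    items = [item.strip() for item in str(value).split(',')]
    return int(token in items)

# Hypothetical sample with real ints, strings, and the tricky value '21'
part_chars = pd.DataFrame({'specialty': ['1,5', 1, '1,2,4,6', 2, '21']})
part_chars['is_1'] = part_chars['specialty'].map(has_token)
print(part_chars)
```

This is slower than the vectorized string methods on large frames, but the matching rule is explicit and easy to adapt to other tokens.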
