Is there a way using pandas str.replace to replace a word ONLY when it occurs by itself, rather than as part of a longer string? - 解网

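Q:

I have a dataframe in which I want to replace "Blah" only when it occurs by itself as a single item/cell/entry, not when it is part of a longer string such as "Blah guh". See the example below: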

import pandas as pd

data = {"Col": ["Blah", "Blah gah", "Blah bluh"], "Subs": ["one", "two", "three"]}
df = pd.DataFrame(data)

Expected output:

Col	Subs
Blah ALL	one
Blah gah	two
Blah bluh	three

I tried using word boundaries, but that just replaced Blah in all three entries:

df["Col"] = df["Col"].str.replace(r'\bBlah\b', "Blah ALL", regex=True)

Col	Subs
Blah ALL	one
Blah ALL gah	two
Blah ALL bluh	three

Pretty sure I'm missing something obvious here.
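A short sketch of why the word-boundary pattern hits every row may help here. It uses only the standard `re` module, and the variable names `values`, `boundary_hits` and `anchored_hits` are illustrative, not part of the original question. `\b` only requires a transition between a word character and a non-word character, and the space after "Blah" counts as one, so "Blah gah" and "Blah bluh" match as well. Matching against the whole string is what excludes them.

```python
import re

values = ["Blah", "Blah gah", "Blah bluh"]

# \b is satisfied by the start of the string and by the space after "Blah",
# so the pattern is found inside every value, not just the lone "Blah".
boundary_hits = [bool(re.search(r"\bBlah\b", v)) for v in values]

# fullmatch requires the pattern to cover the entire string,
# so only the value that is exactly "Blah" matches.
anchored_hits = [bool(re.fullmatch(r"Blah", v)) for v in values]

print(boundary_hits)  # [True, True, True]
print(anchored_hits)  # [True, False, False]
```

In a regex passed to `str.replace`, the same whole-string requirement can be expressed with the `^` and `$` anchors.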

python pandas regex

Timing

This is also much faster.

On 30k rows:

# replace
3.9 ms ± 450 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# str.replace with regex=True
37.2 ms ± 3.57 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
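The exact calls behind these two timings are not shown on this page, so the following is only a plausible sketch of the two approaches being compared. It assumes that "replace" means `Series.replace` with a plain string, which matches whole cell values, and that the regex variant is `str.replace` with the pattern anchored by `^` and `$`. The names `exact` and `anchored` are illustrative.

```python
import pandas as pd

data = {"Col": ["Blah", "Blah gah", "Blah bluh"], "Subs": ["one", "two", "three"]}
df = pd.DataFrame(data)

# Assumed reading of "replace": Series.replace with a plain string compares
# against the whole cell value, so "Blah gah" and "Blah bluh" are left alone.
exact = df["Col"].replace("Blah", "Blah ALL")

# Assumed reading of "str.replace with regex=True": the pattern is anchored
# with ^ and $ so it only matches a cell that is exactly "Blah".
anchored = df["Col"].str.replace(r"^Blah$", "Blah ALL", regex=True)

print(exact.tolist())     # ['Blah ALL', 'Blah gah', 'Blah bluh']
print(anchored.tolist())  # ['Blah ALL', 'Blah gah', 'Blah bluh']
```

Both produce the expected output from the question. The exact-value version avoids running a regex over every cell, which would be consistent with the faster timing reported above.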
