Asked by: i30817 Asked: 11/4/2023 Updated: 11/4/2023 Views: 49
Best way to replace a fixed set of roman numerals by actual numbers in freeform mixed strings? [duplicate]
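Q:
I want to replace Roman numerals in strings with actual numbers, both to normalize the strings and to prepare them for fuzzy near-equality checks. What I'm doing at the moment is somewhat wasteful performance-wise.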
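import regex  # third-party package; the standard-library re module would also work here

def replaceRoman(source, romana, number):
    # Replace the numeral only when it sits between the allowed border characters.
    source = regex.sub(rf"([\s']|^){romana}([\s,]|$)", rf"\g<1>{number}\g<2>", source)
    return source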
...
st = replaceRoman(st, "XVIII", "18")
st = replaceRoman(st, "XVII", "17")
st = replaceRoman(st, "XVI", "16")
st = replaceRoman(st, "XIII", "13")
st = replaceRoman(st, "XII", "12")
st = replaceRoman(st, "XIV", "14")
st = replaceRoman(st, "XV", "15")
st = replaceRoman(st, "XIX", "19")
st = replaceRoman(st, "XX", "20")
st = replaceRoman(st, "XI", "11")
st = replaceRoman(st, "VIII", "8")
st = replaceRoman(st, "VII", "7")
st = replaceRoman(st, "VI", "6")
st = replaceRoman(st, "III", "3")
st = replaceRoman(st, "II", "2")
st = replaceRoman(st, "IV", "4")
st = replaceRoman(st, "V", "5")
st = replaceRoman(st, "IX", "9")
st = replaceRoman(st, "X", "10")
st = replaceRoman(st, "I", "1")
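It has to be a regex for the sake of some safety at the borders: the numeral must be delimited by whitespace, an apostrophe, or the start of the string on the left, and by whitespace, a comma, or the end of the string on the right. The peculiar replacement order was meant to prevent a specific false positive, where a shorter numeral replaces part of a longer one. Now that I think of it, though, that is a holdover from before I used a regex to test the borders, and the ordering isn't needed.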
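To illustrate the point about borders and ordering, here is a small sketch that reuses the pattern from the question with the standard-library `re` module. The name `replace_roman` is just an illustrative copy of `replaceRoman`. It also shows one caveat that seems worth knowing: because the capturing groups consume the delimiter, a numeral directly following another replaced numeral is missed.

```python
import re

def replace_roman(source, romana, number):
    # Same pattern as in the question, using the standard-library re module.
    return re.sub(rf"([\s']|^){romana}([\s,]|$)", rf"\g<1>{number}\g<2>", source)

# The borders already stop "I" from matching inside "II", whatever the call order.
print(replace_roman("Rocky II", "I", "1"))   # Rocky II
print(replace_roman("Rocky II", "II", "2"))  # Rocky 2

# Caveat: the groups consume the delimiter, so a numeral that directly
# follows another match cannot reuse that space as its left border.
print(replace_roman("II II", "II", "2"))     # 2 II
```

Zero-width lookarounds instead of capturing groups would avoid that last case, since they check the border without consuming it.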
I would like to do this in a single pass, with or without a regex (preferably without). Any suggestions?
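One possible single-pass approach, offered as a sketch rather than a definitive answer: build one alternation out of the fixed numeral set and look up each match in a dictionary. It still uses a regex, because the border checks are easiest to express that way. The names `ROMAN_TO_ARABIC`, `ROMAN_PATTERN`, and `replace_romans` are made up for this example, and the borders are assumed to be the same as in `replaceRoman` (whitespace, apostrophe, or start on the left; whitespace, comma, or end on the right).

```python
import re

# Fixed numeral set from the question (I through XX).
ROMAN_TO_ARABIC = {
    "I": "1", "II": "2", "III": "3", "IV": "4", "V": "5",
    "VI": "6", "VII": "7", "VIII": "8", "IX": "9", "X": "10",
    "XI": "11", "XII": "12", "XIII": "13", "XIV": "14", "XV": "15",
    "XVI": "16", "XVII": "17", "XVIII": "18", "XIX": "19", "XX": "20",
}

# Try longer numerals first; the trailing lookahead would reject a
# too-short prefix anyway, so this ordering is only a convenience.
_alternatives = "|".join(sorted(ROMAN_TO_ARABIC, key=len, reverse=True))

# Lookarounds check the borders without consuming them:
#   before: start of string, whitespace, or an apostrophe
#   after:  end of string, whitespace, or a comma
ROMAN_PATTERN = re.compile(rf"(?<![^\s'])(?:{_alternatives})(?![^\s,])")

def replace_romans(source: str) -> str:
    # One pass over the string; each match is looked up in the table.
    return ROMAN_PATTERN.sub(lambda m: ROMAN_TO_ARABIC[m.group()], source)

print(replace_romans("Final Fantasy VII"))     # Final Fantasy 7
print(replace_romans("Part II, Chapter XIV"))  # Part 2, Chapter 14
print(replace_romans("II III"))                # 2 3
```

As with the original calls, a standalone English "I" or "X" is also treated as a numeral, so whether that is acceptable depends on the strings being normalized.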
A: No answers yet
Comments
import re
# rn_to_int is a Roman-numeral-to-integer helper that the comment does not define.
pat = re.compile(r'\b(?=[MCDLXVI])M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')
pat.sub(lambda m: str(rn_to_int(m.group())), 'II men landed on the moon on July XX MCMLXIX')
Output: 2 men landed on the moon on July 20 1969
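The comment relies on an `rn_to_int` helper that it does not show. The following is a minimal sketch of what such a helper might look like; the implementation, the `_VALUES` table, and the `replace_all_romans` wrapper are assumptions made for this example, not the commenter's code. One deviation from the comment is labeled in the code: as far as I can tell, the pattern can also match the empty string at the start of a word that begins with one of the numeral letters (for example "Mars"), so the replacement function leaves empty matches alone.

```python
import re

_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def rn_to_int(numeral: str) -> int:
    """Hypothetical helper: convert a well-formed Roman numeral to an int."""
    total = 0
    # Pair each symbol with the one after it (a space pads the last pair).
    for current, following in zip(numeral, numeral[1:] + " "):
        value = _VALUES[current]
        # A smaller symbol before a larger one is subtractive (IV, IX, XL, ...).
        if following in _VALUES and value < _VALUES[following]:
            total -= value
        else:
            total += value
    return total

# Pattern taken verbatim from the comment.
pat = re.compile(
    r"\b(?=[MCDLXVI])M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b"
)

def replace_all_romans(text: str) -> str:
    # Assumption added here: skip empty matches, which would otherwise
    # insert a "0" in front of words such as "Mars".
    return pat.sub(lambda m: str(rn_to_int(m.group())) if m.group() else "", text)

print(replace_all_romans("II men landed on the moon on July XX MCMLXIX"))
# 2 men landed on the moon on July 20 1969
```

Unlike the fixed list in the question, this pattern uses `\b` word boundaries and accepts numerals up to the thousands, so it replaces more than the I to XX range.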