Python: replace/remove substrings from an input string with a loop, mutating the original string on each iteration

Asked by: Sean.realitytester Asked: 7/29/2023 Updated: 7/31/2023 Views: 66

Q:

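I want to process an input string and remove any items that match the pattern of text enclosed in square brackets, e.g. [like this].

I have been able to identify a pattern, using a regular expression, that lets me isolate the bracketed items and build a list of all of them.

However, I would like to replace each of these occurrences with "" and mutate the original string each time, so that they are removed from the original string.

As I understand it, string.replace("example", "") returns a copy of the original string.

How would I go about removing each match and then returning the mutated input string?

So far, my code for this function is as follows.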

def remove_square_brackets(input):
  """remove square brackets e.g [verse] from song lyrics """
  pattern1 = r"\[.*?\]" 
  squares = re.findall(pattern1, input) #this works, here we have seperated by chorus verse. 
  #can also be used to remove these bookmarks from the phrases themselves. 
  print("squares are", squares) #this is picking up all the squares
  for square in squares:
        print(square)
        if square in input:
             print("found in the string!")
             #replace does not mutate, rather it makes a copy therefore this will not work,. 
             input.replace(square,"")
  return input
python string function replace mutation

Comments

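2 upvotes juanpa.arrivillaga 7/29/2023
"And mutate the original string": str objects are immutable, but you don't need to mutate it here.
0 upvotes Barmar 7/29/2023
Just assign the result back to the original variable: string = string.replace(...)
1 upvote Barmar 7/29/2023
Why are you using a loop and .replace() instead of re.sub()?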

A:

0 upvotes John Collins 7/29/2023 #1

Removing all lyrics section headings enclosed in brackets

Lyrics (example)

lyrics = """[Verse 1]
I'm the next act waiting in the wings
I'm an animal trapped in your hot car
I am all the days that you choose to ignore

[Chorus]
You are all I need
You're all I need
I'm in the middle of your picture
Lying in the reeds

[Verse 2]
I am a moth who just wants to share your light
I'm just an insect trying to get out of the night
I only stick with you because there are no others

[Chorus]
You are all I need
You're all I need
I'm in the middle of your picture
Lying in the reeds

[Outro]
It's all wrong, it's all wrong, it's all wrong
It's alright, it's alright, it's alright
It's all wrong, it's alright
It's alright, it's alright"""

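If I understand the output you're after, and you want to completely remove all the lyric headings enclosed in brackets, then the pattern you have is perfect, and this code will work: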

import re

# Same non-greedy pattern as in the question: '[', as few characters as possible, ']'.
HEADING_PATTERN = r"\[.*?\]"

def get_lyrics_headings(lyrics):
    """Return all instances of headings (e.g., '[Chorus]') from song lyrics."""
    return re.findall(HEADING_PATTERN, lyrics)

def remove_square_brackets(lyrics):
    """Remove bracketed headings (e.g., '[Verse 1]') from song lyrics."""
    headings = get_lyrics_headings(lyrics)
    print(f"Found the following lyrics sections:\n{headings}")
    return re.sub(HEADING_PATTERN, "", lyrics)

This results in:

>>> raw_lyrics = remove_square_brackets(lyrics)
Found the following lyrics sections:
['[Verse 1]', '[Chorus]', '[Verse 2]', '[Chorus]', '[Outro]']
>>> print(raw_lyrics)
I'm the next act waiting in the wings
I'm an animal trapped in your hot car
I am all the days that you choose to ignore


You are all I need
You're all I need
I'm in the middle of your picture
Lying in the reeds


I am a moth who just wants to share your light
I'm just an insect trying to get out of the night
I only stick with you because there are no others


You are all I need
You're all I need
I'm in the middle of your picture
Lying in the reeds


It's all wrong, it's all wrong, it's all wrong
It's alright, it's alright, it's alright
It's all wrong, it's alright
It's alright, it's alright
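
The output above still has an empty line wherever a heading used to be. If those are unwanted, one option is to let the pattern consume the heading's own line break as well. The following is a minimal sketch, not part of the original answer; the function name `remove_heading_lines` and the short sample string are made up for illustration, and it assumes each heading sits alone on its own line.

```python
import re

def remove_heading_lines(lyrics):
    """Remove bracketed headings that occupy a whole line, including the line break."""
    # With re.MULTILINE, ^ and $ match at the start and end of every line.
    # [^\]\n]* keeps the match inside a single pair of brackets on one line,
    # and \n? also consumes the newline that ends the heading line, if any.
    return re.sub(r"^\[[^\]\n]*\]$\n?", "", lyrics, flags=re.MULTILINE)

# Made-up sample input for illustration.
sample = "[Verse 1]\nline one\nline two\n\n[Chorus]\nline three"
cleaned = remove_heading_lines(sample)
print(cleaned)
```

Bracketed text in the middle of a line is deliberately left alone by this pattern, so it is not a drop-in replacement for the function above if inline tags also need to go.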

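Replacing brackets with double quotes

If you want to replace the lyrics brackets ("[" or "]") with double quotes (e.g., '[Chorus]' becomes '"Chorus"'), then you need to change the regex pattern slightly.

For example: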

import re

def remove_square_brackets(lyrics):
    """Replace square brackets with double quotes (e.g., '[verse]' becomes '"verse"')."""
    return re.sub(r"\[|\]", '"', lyrics)

This will result in:

"Verse 1"
I'm the next act waiting in the wings
I'm an animal trapped in your hot car
I am all the days that you choose to ignore

"Chorus"
You are all I need
You're all I need
I'm in the middle of your picture
Lying in the reeds

"Verse 2"
I am a moth who just wants to share your light
I'm just an insect trying to get out of the night
I only stick with you because there are no others

"Chorus"
You are all I need
You're all I need
I'm in the middle of your picture
Lying in the reeds

"Outro"
It's all wrong, it's all wrong, it's all wrong
It's alright, it's alright, it's alright
It's all wrong, it's alright
It's alright, it's alright
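
Since each bracket is a single character, the same replacement can also be done without a regular expression. A minimal sketch using `str.translate` follows; the function name `brackets_to_quotes` is hypothetical and does not come from the answer above.

```python
def brackets_to_quotes(lyrics):
    """Replace every '[' and ']' with a double quote."""
    # str.maketrans maps each character of the first argument to the
    # character at the same position in the second argument.
    table = str.maketrans("[]", '""')
    return lyrics.translate(table)

print(brackets_to_quotes("[Chorus]\nYou are all I need"))
```
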
0 upvotes Akhilesh Pandey 7/29/2023 #2

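The str.replace() method in Python does not change the original string; it returns a new string with the replacements made. This is because Python strings are immutable: once created, they cannot be changed.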
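
A quick way to see this for yourself (a small illustrative sketch; the variable names and sample text are arbitrary):

```python
original = "[Chorus] You are all I need"

# str.replace() builds and returns a new string ...
result = original.replace("[Chorus] ", "")

# ... while the string that `original` refers to is left untouched.
print(original)
print(result)
```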

You can assign the returned value back to your original variable to get the effect of "mutating" the original string, as shown below:

import re

def remove_square_brackets(input):
  """remove square brackets e.g [verse] from song lyrics """
  pattern1 = r"\[.*?\]" 
  squares = re.findall(pattern1, input)  # this works; the headings (chorus, verse) are separated out here.
  # can also be used to remove these bookmarks from the phrases themselves.
  print("squares are", squares)  # this is picking up all the squares
  for square in squares:
        print(square)
        if square in input:
             print("found in the string!")
             # replace() does not mutate; it returns a new string, so assign the result back.
             input = input.replace(square,"")
  return input

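You can simplify your function, because there is no need to find the occurrences of the pattern and then replace them in separate steps. You can use the re.sub() function directly to replace every occurrence of the pattern with an empty string.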

import re

def remove_square_brackets(input):
  """remove square brackets e.g [verse] from song lyrics """
  pattern1 = r"\[.*?\]" 
  # Substitute all occurrences of the pattern with an empty string.
  input = re.sub(pattern1, "", input)
  return input

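This function returns the same result as the previous version (minus the debugging print calls), but with less code, and it will probably run faster, especially on large inputs, since it only needs to scan the input string once.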
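
Whether the single re.sub() call is actually faster depends on the input, so it is worth measuring rather than assuming. A rough sketch with the standard `timeit` module is given below; the helper names `remove_with_loop` and `remove_with_sub`, the sample text, and the repetition counts are all arbitrary choices for illustration, and no particular timing is implied.

```python
import re
import timeit

PATTERN = r"\[.*?\]"

def remove_with_loop(text):
    """Loop version: find every match, then replace them one by one."""
    for square in re.findall(PATTERN, text):
        text = text.replace(square, "")
    return text

def remove_with_sub(text):
    """Single-call version using re.sub()."""
    return re.sub(PATTERN, "", text)

# Made-up sample input: a short lyric fragment repeated many times.
sample = "[Verse 1]\nsome line\n[Chorus]\nanother line\n" * 200

# Both approaches should produce the same text.
assert remove_with_loop(sample) == remove_with_sub(sample)

# Time each approach; the absolute numbers depend on the machine and input.
loop_time = timeit.timeit(lambda: remove_with_loop(sample), number=20)
sub_time = timeit.timeit(lambda: remove_with_sub(sample), number=20)
print(f"loop + replace: {loop_time:.4f}s")
print(f"re.sub:         {sub_time:.4f}s")
```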

Comments

0 upvotes Sean.realitytester 7/29/2023
Thank you! That was a big help.
1 upvote Dev. Francesca Mazzeo 7/29/2023 #3

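Use input = input.replace(square, "") to assign the changed string back to the input variable inside the loop. This ensures the bracketed items are removed from the input variable one at a time, and the final result is returned correctly.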

import re


def remove_square_brackets(input):
    """Remove square brackets e.g [verse] from song lyrics."""

    # Define the regular expression pattern to match square brackets and the text inside them.
    pattern1 = r"\[.*?\]"

    # Find all occurrences of the pattern1 in the input string using regular expression.
    squares = re.findall(pattern1, input)

    # Iterate through each occurrence of square brackets in the input string.
    for square in squares:
        # Print the square bracket occurrence found in the input string.
        print(square, "found in string!")

        if square in input:
            # Replace the square bracket occurrence and reassign it to the input variable.
            input = input.replace(square, "")

    # Return the modified input string with square brackets removed.
    return input


# Example usage:
lyrics = "This is [verse] some [chorus] example [bridge] lyrics."
result = remove_square_brackets(lyrics)
print(result)

Here is the output:

[verse] found in string!
[chorus] found in string!
[bridge] found in string!
This is  some  example  lyrics.
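
Note the doubled spaces left behind in the last line of the output, where each bracketed word used to be. If single spacing is wanted, one possibility is to remove the whitespace in front of each bracketed item along with it. This is only a sketch under that assumption; `remove_square_brackets_tidy` is a hypothetical name, and the rule may not suit every input (for example, a bracketed item at the very start of a line keeps the space after it).

```python
import re

def remove_square_brackets_tidy(text):
    """Remove bracketed items together with any spaces or tabs just before them."""
    return re.sub(r"[ \t]*\[.*?\]", "", text)

lyrics = "This is [verse] some [chorus] example [bridge] lyrics."
print(remove_square_brackets_tidy(lyrics))
```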