Asked by: tpk Asked: 10/20/2008 Last edited by: codeforestertpk Updated: 8/12/2022 Views: 41538
bash regex with quotes?
Q:
The following code
number=1
if [[ $number =~ [0-9] ]]
then
echo matched
fi
works. But if I try to use quotes in the regex, it stops working:
number=1
if [[ $number =~ "[0-9]" ]]
then
echo matched
fi
I also tried "\[0-9\]". What am I missing?
Funnily enough, the Advanced Bash-Scripting Guide suggests this should work.
Bash version 3.2.39.
A:
131 votes
Vinko Vrsalovic
10/20/2008
#1
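It changed between 3.1 and 3.2. I guess the advanced guide needs an update.
This is a terse description of the new features added to bash-3.2 since the release of bash-3.1. As always, the manual page (doc/bash.1) is the place to look for complete descriptions.
- New Features in Bash
snip
f. Quoting the string argument to the [[ command's =~ operator now forces string matching, as with the other pattern-matching operators.
Sadly, this breaks existing scripts that quote the regex, unless you had the foresight to store patterns in variables and use those instead of writing the regexes inline. Example below.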
$ bash --version
GNU bash, version 3.2.39(1)-release (i486-pc-linux-gnu)
Copyright (C) 2007 Free Software Foundation, Inc.
$ number=2
$ if [[ $number =~ "[0-9]" ]]; then echo match; fi
$ if [[ $number =~ [0-9] ]]; then echo match; fi
match
$ re="[0-9]"
$ if [[ $number =~ $re ]]; then echo MATCH; fi
MATCH
$ bash --version
GNU bash, version 3.00.0(1)-release (i586-suse-linux)
Copyright (C) 2004 Free Software Foundation, Inc.
$ number=2
$ if [[ $number =~ "[0-9]" ]]; then echo match; fi
match
$ if [[ "$number" =~ [0-9] ]]; then echo match; fi
match
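Since quoting now forces literal matching, quotes can also be applied to just part of a pattern. The following is a minimal sketch of that idea, assuming bash 3.2 or newer with compat31 off; the helper name is_dotted_version is hypothetical and not part of the original answer.

```bash
# Hypothetical helper: succeeds only for strings such as "1.5" or "10.25".
is_dotted_version() {
    # The quoted "." is matched literally; the unquoted parts stay regex syntax.
    [[ $1 =~ ^[0-9]+"."[0-9]+$ ]]
}

is_dotted_version 1.5 && echo "1.5 accepted"
is_dotted_version 1x5 || echo "1x5 rejected"

# A fully quoted pattern behaves like a plain substring search:
[[ 'a[0-9]b' =~ "[0-9]" ]] && echo "literal [0-9] found"
[[ 'a7b' =~ "[0-9]" ]] || echo "digit not matched by the quoted pattern"
```

Quoting only the characters that must be literal keeps the rest of the expression readable as a regex.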
Comments
28 votes
Pavel Šimerda
10/30/2016
This is really fun. Quoted regexes no longer work. Unquoted regexes with spaces don't work. Variable-based regexes work even if they contain spaces. What a mess.
1 vote
ingyhere
8/24/2021
Interestingly, this works: if [[ $number =~ ["0-9"] ]]; then echo match; fi
0 votes
siulkilulki
9/27/2021
It's so disappointing that we need to rely on the echo or compat31 workaround...
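The point about spaces raised in the comments above can be illustrated with a short sketch, assuming bash 3.2 or newer. The helper name is_word_and_number and the sample strings are made up for illustration.

```bash
# Hypothetical helper: accepts strings such as "build 42".
is_word_and_number() {
    local re='^[a-z]+ [0-9]+$'   # the space is part of the pattern
    [[ $1 =~ $re ]]
}

is_word_and_number 'build 42' && echo "variable pattern matched"

# Inline alternative: a backslash keeps the space inside the same word.
[[ 'build 42' =~ ^[a-z]+\ [0-9]+$ ]] && echo "escaped-space pattern matched"
```

The variable form is usually easier to read, because the pattern can be written inside single quotes exactly as the regex engine should see it.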
23 votes
Nicholas Sushkin
6/28/2011
#2
Bash 3.2 introduced a compatibility option, compat31, which reverts bash's regular expression quoting behavior to that of 3.1.
Without compat31:
$ shopt -u compat31
$ shopt compat31
compat31 off
$ set -x
$ if [[ "9" =~ "[0-9]" ]]; then echo match; else echo no match; fi
+ [[ 9 =~ \[0-9] ]]
+ echo no match
no match
With compat31:
$ shopt -s compat31
+ shopt -s compat31
$ if [[ "9" =~ "[0-9]" ]]; then echo match; else echo no match; fi
+ [[ 9 =~ [0-9] ]]
+ echo match
match
Link to patch: http://ftp.gnu.org/gnu/bash/bash-3.2-patches/bash32-039
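If only one test needs the old behavior, the option change can be confined to a subshell so it does not leak into the rest of the script. This is a sketch under the assumption that the running bash still provides the compat31 shopt option; the helper name legacy_regex_match and its exit code 2 are hypothetical choices.

```bash
# Hypothetical helper: run one quoted-regex test with bash 3.1 semantics.
legacy_regex_match() {
    (
        # The subshell keeps the option change from affecting the caller.
        shopt -s compat31 2>/dev/null || exit 2   # option not available
        [[ $1 =~ "$2" ]]
    )
}

legacy_regex_match 9 '[0-9]' && echo "match"

# The caller's own setting is untouched (assuming it was off to begin with).
shopt -q compat31 || echo "compat31 is still off in the caller"
```

The subshell costs a fork per call, so this is better suited to occasional checks than to tight loops.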
9 votes
Ankur Agarwal
9/11/2013
#3
GNU bash, version 4.2.25(1)-release (x86_64-pc-linux-gnu)
Some examples of string matching and regex matching
$ if [[ 234 =~ "[0-9]" ]]; then echo matches; fi # string match
$
$ if [[ 234 =~ [0-9] ]]; then echo matches; fi # regex match
matches
$ var="[0-9]"
$ if [[ 234 =~ $var ]]; then echo matches; fi # regex match
matches
$ if [[ 234 =~ "$var" ]]; then echo matches; fi # string match after substituting $var as [0-9]
$ if [[ 'rss$var919' =~ "$var" ]]; then echo matches; fi # string match after substituting $var as [0-9]
$ if [[ 'rss$var919' =~ $var ]]; then echo matches; fi # regex match after substituting $var as [0-9]
matches
$ if [[ "rss\$var919" =~ "$var" ]]; then echo matches; fi # string match won't work
$ if [[ "rss\\$var919" =~ "$var" ]]; then echo matches; fi # string match won't work
$ if [[ "rss'$var'""919" =~ "$var" ]]; then echo matches; fi # $var is substituted on LHS & RHS and then string match happens
matches
$ if [[ 'rss$var919' =~ "\$var" ]]; then echo matches; fi # string match !
matches
$ if [[ 'rss$var919' =~ "$var" ]]; then echo matches; fi # string match failed
$
$ if [[ 'rss$var919' =~ '$var' ]]; then echo matches; fi # string match
matches
$ echo $var
[0-9]
$
$ if [[ abc123def =~ "[0-9]" ]]; then echo matches; fi
$ if [[ abc123def =~ [0-9] ]]; then echo matches; fi
matches
$ if [[ 'rss$var919' =~ '$var' ]]; then echo matches; fi # string match due to single quotes on RHS $var matches $var
matches
$ if [[ 'rss$var919' =~ $var ]]; then echo matches; fi # Regex match
matches
$ if [[ 'rss$var' =~ $var ]]; then echo matches; fi # Above e.g. really is regex match and not string match
$
$ if [[ 'rss$var919[0-9]' =~ "$var" ]]; then echo matches; fi # string match RHS substituted and then matched
matches
$ if [[ 'rss$var919' =~ "'$var'" ]]; then echo matches; fi # trying to string match '$var' fails
$ if [[ '$var' =~ "'$var'" ]]; then echo matches; fi # string match still fails as single quotes are omitted on RHS
$ if [[ \'$var\' =~ "'$var'" ]]; then echo matches; fi # this string match works as single quotes are included now on RHS
matches
6 votes
Digital Trauma
2/14/2014
#4
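As mentioned in other answers, putting the regular expression in a variable is a general way to stay compatible across bash versions. You can also use the following workaround to achieve the same thing while keeping the regular expression inside the conditional expression: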
$ number=1
$ if [[ $number =~ $(echo "[0-9]") ]]; then echo matched; fi
matched
$
Comments
0 votes
Near Privman
8/11/2022
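Using command substitution incurs a small performance penalty, which can add up in some cases (e.g., when performing a large number of checks in a loop).
2 votes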
Near Privman
8/12/2022
#5
Using a local variable performs slightly better than using command substitution.
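One rough way to check this on a given machine is to time both variants in a loop. The sketch below assumes bash; the function names and the iteration count are arbitrary choices, and the timings will vary from system to system.

```bash
# Hypothetical benchmark helpers: both count how many loop indices contain a digit.
count_with_substitution() {
    local i hits=0
    for ((i = 0; i < $1; i++)); do
        # Command substitution forks a subshell on every iteration.
        if [[ $i =~ $(echo "[0-9]") ]]; then hits=$((hits + 1)); fi
    done
    echo "$hits"
}

count_with_variable() {
    local i hits=0 re='[0-9]'
    for ((i = 0; i < $1; i++)); do
        # The pattern is expanded from a variable, with no extra process.
        if [[ $i =~ $re ]]; then hits=$((hits + 1)); fi
    done
    echo "$hits"
}

time count_with_substitution 1000
time count_with_variable 1000
```

Both functions print the same count; only the elapsed time reported by time should differ.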
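For larger scripts, or collections of scripts, it may make sense to use a utility function, both to keep stray local variables from cluttering the code and to reduce verbosity. This seems to work well: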
# Bash's built-in regular expression matching requires the regular expression
# to be unquoted (see https://stackoverflow.com/q/218156), which makes it harder
# to use some special characters, e.g., the dollar sign.
# This wrapper works around the issue by using a local variable, which means the
# quotes are not passed on to the regex engine.
regex_match() {
local string regex
string="${1?}"
regex="${2?}"
# shellcheck disable=SC2046 `regex` is deliberately unquoted, see above.
[[ "${string}" =~ ${regex} ]]
}
Usage example:
if regex_match "${number}" '[0-9]'; then
echo matched
fi
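The wrapper is most useful for patterns with characters that are awkward to write unquoted, such as the dollar sign mentioned in its comment. Here is a small sketch with made-up sample strings; the function body is repeated from above (without the comments) only so that the snippet runs on its own.

```bash
# Copied from the answer above so that this sketch is self-contained.
regex_match() {
    local string regex
    string="${1?}"
    regex="${2?}"
    [[ "${string}" =~ ${regex} ]]
}

# Single quotes keep $ and \ away from the shell; the regex engine sees \$.
if regex_match 'total: $5' '\$[0-9]+$'; then
    echo "found a dollar amount"
fi

# A pattern containing a space needs no extra escaping.
if regex_match 'hello world' '^hello world$'; then
    echo "matched a phrase"
fi
```

Because the pattern arrives as an ordinary quoted argument, callers can use normal shell quoting rules without worrying about how [[ ]] parses the right-hand side.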