Why does Regex.Escape support number sign and whitespace?

Question:

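As far as I understand, Regex.Escape helps build regular expression patterns by automatically converting a pattern that contains regex metacharacters into an "escaped" version of that pattern.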

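So if you want to match only abc.abc, you cannot use abc.abc as the pattern, because the period (or dot) is a regex metacharacter that happens to be the wildcard. Regex.Escape conveniently converts abc.abc to abc\.abc, which can be used as a regex pattern that does not treat the dot as a wildcard.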
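The difference between the raw and the escaped pattern can be checked directly. The snippet below is a minimal sketch; the input "abcXabc" and the variable name escaped are illustrative choices, not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

// Escape the dot so that it is matched literally instead of as a wildcard.
string escaped = Regex.Escape("abc.abc");
Console.WriteLine(escaped); // prints abc\.abc

// Unescaped, the dot matches any character, so "abcXabc" is accepted.
Console.WriteLine(Regex.IsMatch("abcXabc", "abc.abc")); // True

// Escaped, only a literal dot is accepted at that position.
Console.WriteLine(Regex.IsMatch("abcXabc", escaped));   // False
Console.WriteLine(Regex.IsMatch("abc.abc", escaped));   // True
```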

Regex.Escape "escapes" 14 characters...

Escapes a minimal set of characters (\, *, +, ?, |, {, [, (,), ^, $, ., #, and white space) by replacing them with their escape codes.
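The documented set can be inspected one character at a time. This is a small sketch, assuming the 14 characters quoted above; the variable name escapedSet is illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// The 14 characters named in the Regex.Escape documentation,
// written as a verbatim string so the backslash is taken literally.
string escapedSet = @"\*+?|{[()^$.# ";

foreach (char c in escapedSet)
{
    // Each of these characters comes back prefixed with a backslash.
    string result = Regex.Escape(c.ToString());
    Console.WriteLine($"'{c}' -> '{result}'");
}
```

The last two entries, the number sign and the space, are the ones the question is about.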

However, it seems that only 12 characters do not match themselves. Every other character does match itself.

. $ ^ { [ ( | ) * + ? \

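Here is a link to a reference that says there are 12 characters that do not match themselves.

The exact wording that makes this claim...

Characters other than those listed in the "Character or sequence" column have no special meaning in regular expressions; they match themselves.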

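The difference is the last 2 characters mentioned in the Regex.Escape documentation: white space and the number sign. Why does Regex.Escape escape more characters than the 12 that do not match themselves?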

.NET regex

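Answer:

RegexOptions.IgnorePatternWhitespace lets you write self-documenting patterns. Specifically, the option lets you add white space that does not become part of the pattern, and add comments after a number sign that do not become part of the pattern. In effect, it turns white space into a pattern metacharacter that is ignored, and it turns the number sign into a pattern metacharacter that starts a comment: the number sign and everything after it, up to the end of the line, are ignored. We add these ignored characters because they are documentation.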

Using this option means that white space must be escaped to be literal, and the number sign must be escaped to be literal.

using System.Diagnostics;
using System.Text.RegularExpressions;

Regex r;

// in normal mode whitespace is literal and numberSign is literal
r = new Regex("a a a #comment");
Debug.Assert(r.IsMatch("aaa") == false);
Debug.Assert(r.IsMatch("a a a #comment"));

// in normal mode, escaping whitespace and numberSign is needless but harmless
// escaping a meta char makes it literal; in normal mode neither whitespace nor
// numberSign is a meta char, so escaping takes a literal and leaves it literal
r = new Regex(Regex.Escape("a a a #comment"));
Debug.Assert(r.IsMatch("aaa") == false);
Debug.Assert(r.IsMatch("a a a #comment"));

r = new Regex("a a a #comment", RegexOptions.IgnorePatternWhitespace); // the option renders the whitespace and comment insignificant
Debug.Assert(r.IsMatch("aaa"));
Debug.Assert(r.IsMatch("a a a #comment") == false);

// escape whitespace and escape numberSign by human coded escape sequences
r = new Regex(@"a\ a\ a\ \#comment", RegexOptions.IgnorePatternWhitespace);
Debug.Assert(r.IsMatch("aaa") == false);
Debug.Assert(r.IsMatch("a a a #comment"));

// escape whitespace and escape numberSign by using Regex.Escape
r = new Regex(Regex.Escape("a a a #comment"), RegexOptions.IgnorePatternWhitespace);
Debug.Assert(r.IsMatch("aaa") == false);
Debug.Assert(r.IsMatch("a a a #comment"));

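As an aside... an option that enables 2 extra metacharacters reminds me of the hyphen. The hyphen is not in the "normal" set of 12 metacharacters, but inside square brackets the hyphen becomes a metacharacter.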

r = new Regex("a-z[P-R]");

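The first hyphen is a literal. The second hyphen defines a range inside a character group (P through R), and that group matches exactly one character.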
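The two roles of the hyphen can be verified with a few inputs. This is a minimal sketch; the test strings are illustrative and were chosen only to separate the literal hyphen from the range.

```csharp
using System;
using System.Text.RegularExpressions;

var r = new Regex("a-z[P-R]");

// "a-z" is matched literally, then Q falls inside the range P-R.
Console.WriteLine(r.IsMatch("a-zQ")); // True

// Outside brackets the hyphen is not a range, so "amz" does not satisfy "a-z".
Console.WriteLine(r.IsMatch("amzQ")); // False

// Inside brackets the hyphen is a range operator, so a literal '-' is not in [P-R].
Console.WriteLine(r.IsMatch("a-z-")); // False
```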
