Asked by: Peter Jaspers  Asked: 10/25/2023  Last edited by: Peter Jaspers  Updated: 10/25/2023  Views: 45
How can a small difference in input string cause a large difference in std::regex_search performance
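Q:
How can a small change in the command string cause such a large difference in performance? Is my regex wrong?
The intention is to extract the (possibly multi-line) string between the two "|>" tokens.
I compiled this code using Visual Studio with the Microsoft C++ compiler.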
#include <chrono>
#include <regex>
#include <string>

bool testCommandMatching() {
    // Expected capture: everything between the two "|>" tokens.
    const std::string group(R"(
gcc
src\hello.c
-o bin\hello
)");
    const std::regex cmdRe(R"(^\|>((?:.*\s*)*)\|>)");
    bool bothMatched = true;
    {
        // Input ends right after the closing "|>".
        std::cmatch match;
        const std::string command(R"(|>
gcc
src\hello.c
-o bin\hello
|>)");
        auto start = std::chrono::system_clock::now();
        bool matched = std::regex_search(command.c_str(), match, cmdRe, std::regex_constants::match_continuous);
        auto end = std::chrono::system_clock::now();
        // elapsed is less than 10 ms
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
        bothMatched = bothMatched && group == (matched ? match[1] : std::string(""));
    }
    {
        // Same input, but with extra text after the closing "|>".
        std::cmatch match;
        const std::string command(R"(|>
gcc
src\hello.c
-o bin\hello
|> bin\hello)");
        auto start = std::chrono::system_clock::now();
        bool matched = std::regex_search(command.c_str(), match, cmdRe, std::regex_constants::match_continuous);
        auto end = std::chrono::system_clock::now();
        // elapsed is more than 1000 ms. How is this possible?
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
        bothMatched = bothMatched && group == (matched ? match[1] : std::string(""));
    }
    return bothMatched;
}
A: No answers yet
Comments
^\|>(.*(?:\r?\n.*)*)\|>
.*.*.*
(?:.*\s*)*
(?>.*\s*)*
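The fragments in the comments seem to point at the usual diagnosis. In (?:.*\s*)* both inner quantifiers can match the same characters (a space satisfies both . and \s) and both can match nothing, so the group behaves much like .*.*.* repeated. When the greedy match has swallowed the trailing " bin\hello" and the final \|> then fails, a backtracking engine may try a number of splits of that tail that grows exponentially with its length before it gives back enough input. The first input ends in "|>", so only two characters have to be returned and the match finishes quickly. Below is a minimal sketch that uses the first pattern from the comments, where every repetition of the inner group has to start with a line break. The helper name extractCommand and the use of std::optional (C++17) are assumptions made for illustration and are not part of the original post.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): returns the text between
// the leading "|>" and the last "|>" in `command`, or std::nullopt when the
// input does not start with "|>" or has no closing token.
std::optional<std::string> extractCommand(const std::string& command) {
    // Pattern taken from the comments. Each repetition of the inner group
    // must begin with a line break, which '.' does not match, so there is
    // only one way to divide the input between the quantifiers.
    static const std::regex cmdRe(R"(^\|>(.*(?:\r?\n.*)*)\|>)");
    std::smatch match;
    if (!std::regex_search(command, match, cmdRe,
                           std::regex_constants::match_continuous)) {
        return std::nullopt;
    }
    return match[1].str();
}
```

Called with either of the two command strings from the question, this should return the same capture as group, because the text after the closing "|>" is handed back one character at a time instead of being re-split in many ways.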