提问人:Ben 提问时间:8/9/2016 更新时间:4/2/2021 访问量:768
正则表达式查找逗号分隔符不在引号中,也不在括号中
Regex find comma delimiters not in quotes AND not in parenthesis
问:
我的最终目标是开发一个函数来将 Access IIF() 语句转换为 T-SQL (2008) CASE WHEN 语句。我已经有一个 VBA 例程,可以找到 IIF 和匹配的右括号,即使它在引号之外。它是递归的,可以在不使用正则表达式的情况下查找嵌套的 IIF() 语句。当我缩小到 IIF 括号内的文本时,我需要确定分隔三个参数的两个逗号分隔符。当括号在引号内时,我遇到了麻烦。如何设置正则表达式以在处理表达式的其余部分之前忽略引号中的任何内容?
我正在尝试创建一个表达式组,该组将查找单引号内的任何内容和括号内的任何内容,然后排除与该组匹配的任何内容,并找到逗号。(如果我说得不正确,请原谅我,因为“捕获组”和“非捕获组”有时会给我带来与我期望相反的结果)。
请注意,此解决方案必须与 VBScript 正则表达式支持一起使用,这与 JavaScript 风格基本相同。
condition、true、false < -- 这是我的 IIF 解析函数在尝试拆分为 3 个部分之前返回的字符串。
到目前为止,这是我拼凑的表达方式:
,(?=([^']*'[^']*')*(?![^']*'))(?![^(]*[)])
这适用于此:
a=1, nz(b,0), Left('xy,z',2)
但这些线路更具挑战性。我找不到适用于所有这些的表达方式。
a=1, '1st)', '(2nd)'
left(right(a,5),1)='b', '1st)', '(2nd)'
a=1, Left('a,bc',1) , 'xy,z'
这是我一直在研究的 Regex101:
答:
答案是,我试图做的事情不适合正则表达式。相反,我通过编写一个处理嵌套括号、匹配引号和杂项逗号的解析器来解决这个问题。
我希望你能找到这个答案,因为你需要一个函数将 IIF() 语句转换为 CASE WHEN,而不是我复杂的正则表达式查询。此函数的主要用例是将 Access SQL 转换为 SQL Server 2008 及更早版本的 T-SQL。SQL Server 2012SQL Server 2012SQL Server 2012SQL Server 2012SQL Server 2012SQL Server 2012SQL Server 2012SQL Server 2012SQL Server 201您可以在任何 VBA 编辑器中使用此 VBA 函数。如果 Access 不方便,则可以使用 Excel。
下面是一个 VBA 函数,用于将 Access IIF() 语句转换为 T-SQL CASE WHEN 语句
Public Function ReplaceIIFwithCASE(ByVal strInput As String) As String
' Parse the passed string and find the first "IIF(" and parse it into
' a standard T-SQL CASE WHEN statement. If a nested IIF is found,
' recurse and call this function again.
'
' Ben Sacherich - May 2016-Feb 2017
'
' This supports:
' IIF() embedded anywhere in the input string.
' Nested IIF() statements.
' The input string containing multiple IIF() statements on the same level.
' Strings between open and close IIF parenthesis that contains literal commas or commas for other functions.
' Example: IIF(a=1, nz(b,0) , Left('xy,z',2))
'
' Be aware:
' This does not optimize the CASE statement in places where a nested IIF could
' be written as a single CASE statement.
' It will fail if text inside IIF() containes the pipe character |. If you
' need to process statements with this character, modify this routine
' to use another temporary character for |.
'
' Try these in the Immediate window:
' ? ReplaceIIFwithCASE("IIF(a=1, nz(b,0) , Left('xy,z',2))")
' ? ReplaceIIFwithCASE("IIf(x='one',IIf(Abs(Z)=1,2,3),'three')")
' ? ReplaceIIFwithCASE("IIF(a=1,'1st)', '2nd)')")
' ? ReplaceIIFwithCASE("SELECT Name, IIF(Gender='M', 'Boy', 'Girl') FROM Students")
'
' How this works:
' Find "IIF(" in the passed string. Return original string if not found.
' Search for the matching closing parenthesis.
' When the match is found, recurse and make sure an "IIF(" is not nested.
' After recursing, replace the IIF with a CASE statement.
' - Once I find the inner part of an IIF this will use the Split function
' to delimit by commas "," into an array.
' - Then it looks at each array element. If it contains an odd number of
' single or double quote characters or different number of opening and
' closing parenthesis, it combines the array element part with the next
' part and tests again.
' - When there are matched single/double quotes and equivalent number of
' parenthesis it holds that part and appends the "|" character. This
' means that it has identified one of the 3 parameters that is passed
' to the IIF function.
' - Then it splits the string by the "|" character into three pieces
' and builds the CASE statement.
' Continue searching the passed string for another occurrence of "IIF(" (not nested).
Dim lngFuncStart As Long
Dim lngPosition As Long
Dim intStack As Integer
Dim strFunction As String
Dim strChar As String
Dim strQuoteChar As String
Dim bolInQuotes As Boolean
Dim strSplit() As String
Dim ReturnValue As String
On Error GoTo ErrorHandler
strFunction = "IIF("
strQuoteChar = "'" ' Define the style of quotes to look for and exclude.
bolInQuotes = False ' We are currently not inside quotes.
lngFuncStart = InStr(1, strInput, strFunction, vbTextCompare)
If lngFuncStart > 0 Then
lngFuncStart = lngFuncStart + Len(strFunction)
intStack = 1
lngPosition = lngFuncStart
Do While lngPosition <= Len(strInput)
' Use a WHILE loop instead of a FOR loop because the current and end positions will change inside the loop.
strChar = Mid(strInput, lngPosition, 1)
If strChar = strQuoteChar Then
bolInQuotes = Not bolInQuotes
' Move on to the next character
ElseIf bolInQuotes = False Then
' We are currently not inside quotes.
Select Case strChar
Case ")"
' Closing a group
intStack = intStack - 1
Case "("
' Starting new group
intStack = intStack + 1
End Select
If intStack = 0 Then ' Found closing parenthesis.
' See if there is a nested match. ### Recursive ###
ReturnValue = ReplaceIIFwithCASE(Mid(strInput, lngFuncStart, lngPosition - lngFuncStart))
' Begin parsing commas.
strSplit() = Split(ReturnValue, ",")
Dim strPart As String
Dim strRebuilt As String
Dim i As Integer
strRebuilt = ""
If UBound(strSplit()) > 2 Then ' There are more than 2 commas. Piece together the parts.
strPart = strSplit(0)
For i = 1 To UBound(strSplit)
If UBound(Split(strPart, "'")) Mod 2 = 0 _
And UBound(Split(strPart, """")) Mod 2 = 0 _
And UBound(Split(strPart, "(")) = UBound(Split(strPart, ")")) Then
' Number of single quotes is Even or Zero (matched)
' Number of double quotes is Even or Zero (matched)
' Number of parenthesis is matched
' Add the "|" symbol where the IIF should have commas.
strRebuilt = strRebuilt & "|" & strPart
strPart = strSplit(i)
Else
strPart = strPart & "," & strSplit(i)
End If
Next
ReturnValue = Mid(strRebuilt & "|" & strPart, 2)
strSplit() = Split(ReturnValue, "|")
End If
If UBound(strSplit) = 2 Then
' IIF has 3 parameters and is the normal case.
'--- Replace the IIF statement with CASE WHEN ---
' CASE statement help: https://msdn.microsoft.com/en-us/library/ms181765.aspx
ReturnValue = "(CASE WHEN " & Trim(strSplit(0)) & " THEN " & Trim(strSplit(1)) & " ELSE " & Trim(strSplit(2)) & " END)"
'ReturnValue = "(CASE WHEN...)"
If Right(Mid(strInput, 1, lngFuncStart - Len(strFunction) - 1), 2) = vbCrLf Then
' Don't bother to add a CrLf
Else
' Add a CrLf before the CASE statement to make identification easier.
' Comment this out if you don't want it added.
ReturnValue = vbCrLf & ReturnValue
End If
strInput = Mid(strInput, 1, lngFuncStart - Len(strFunction) - 1) & ReturnValue & Mid(strInput, lngPosition + 1)
Else
' Something is wrong. Return the original IIF statement.
' Known issues:
' Text inside IIF() contained pipe character |
' Text contained unmatched parenthesis, maybe inside of a literal string like '1st)'
ReturnValue = "IIF(" & ReturnValue & ") /*### Unable to parse IIF() ###*/ "
strInput = Mid(strInput, 1, lngFuncStart - Len(strFunction) - 1) & ReturnValue & Mid(strInput, lngPosition + 1)
End If
'--- Check to see if there is another function call following the one just addressed. ---
lngFuncStart = InStr(lngFuncStart + Len(ReturnValue) - Len(strFunction), strInput, strFunction, vbTextCompare)
If lngFuncStart > 0 Then
' Another IIF function call is at the same level as the one just processed.
lngFuncStart = lngFuncStart + Len(strFunction)
intStack = 1
lngPosition = lngFuncStart
Else
ReturnValue = strInput
Exit Do
End If
End If
End If
lngPosition = lngPosition + 1
Loop
Else
' Function not found in passed string.
ReturnValue = strInput
End If
ReplaceIIFwithCASE = ReturnValue
Exit Function
ErrorHandler:
MsgBox "Error #" & Err.Number & " - " & Err.Description & vbCrLf & "in procedure ReplaceIIFwithCASE()" _
& vbCrLf & "Input: " & strInput, vbExclamation, "Error"
End Function
您可以使用递归匹配:
(?>(?>\([^()]*(?R)?[^()]*\))|(?>'[^']*')|(?>[^()' ,]+))+
允许重复外部匹配。在里面,有三个选项。第一个匹配平衡:(?>...)+
()
(?>\([^()]*(?R)?[^()]*\)
第二个匹配单引号之间的任何内容:'...'
(?>'[^']*')
第三个匹配除 、 逗号或空格之外的任何内容:()
'
(?>[^()' ,]+)
评论
,(?=([^']*'[^']*')*(?![^']*'))(?![^(]*[)])