提问人:nightmare637 提问时间:8/5/2022 最后编辑:nightmare637 更新时间:8/7/2022 访问量:79
如何从数组中解析和过滤掉子字符串中的唯一名称?
How can I parse and filter out unique names in a substring from within an array?
问:
我有一个包含名称的字符串数组,例如以下内容(在 psuedocode 中):
string nameArray[7]
nameArray[0] = "JohnJacob"
nameArray[1] = "George"
nameArray[2] = "JacobBill"
nameArray[3] = "GeorgeBill"
nameArray[4] = "PatZach"
nameArray[5] = "BobJacob"
nameArray[6] = "MichaelScottChristine"
我想使这个数组没有出现不止一次的名称,甚至有一个子字符串。如果两个名字粘在一起也没关系;只要在数组的任何其他元素中都找不到任何子字符串。一个可能的结果可能如下所示:
nameArrayUpdated[0] = "John"
nameArrayUpdated[1] = "Jacob"
nameArrayUpdated[2] = "George"
nameArrayUpdated[3] = "Bill"
nameArrayUpdated[4] = "PatZach"
nameArrayUpdated[5] = "Bob"
nameArrayUpdated[6] = "MichaelScottChristine"
我最初的方法是遍历数组,并检查是否有任何元素是另一个元素的子字符串,然后删除该子字符串。像这样的东西(同样,伪代码):
for (i = 0; i < arrLength; i++)
{
str tmp = nameArray[i];
for (j = 0; j < arrLength; j++)
{
if (i != j)
{
if (nameArray[j].ContainsSubstring(tmp) == true)
nameArray[j].RemoveSubstring(tmp)
}
}
}
但由于几个原因,这是有问题的。首先,它需要未知数量的迭代才能通过数组。例如,“JohnJacob”不会在第一次迭代时分离,而是在第二次迭代时分离。其次,像“JacobBill”和“BobJacob”这样的字符串本身就被认为是唯一的,而不是其他任何东西的子字符串......尽管雅各布包含在这两个字符串中。
有什么实用的方法可以区分出这些名称吗?
答:
0赞
FaneDuru
8/6/2022
#1
你没有回答我的澄清问题......
请复制 Excel 标准模块中的下一个代码并运行第一个 ()。该代码假设我在评论中表达的理解是正确的:Sub
extractNamesArr
Sub extractNamesArr()
Dim arrN, arrFin, arr, mtch, strRepl As String, i As Long, j As Long, boolNot As Boolean
arrN = Split("JohnJacob,George,JacobBill,GeorgeBill,PatZach,BobJacob,MichaelScottChristine", ",")
arr = extractNB(arrN) 'create an array of twp dictionaries
'1. above array elements (as keys) and all extracted names (as item) per each key
'2. all unique extracted names, as dictionary keys
Debug.Print Join(arr(0).Keys, "|") ' just to visually see the dictionary keys
arrFin = arr(1).Keys 'place in an array the unique extracted names
Dim key, k
For Each key In arr(0).Keys 'iterate between the dictionary keys
For i = 0 To UBound(arr(0)(key))
For Each k In arr(0).Keys
If key <> k Then
If InStr(1, k, arr(0)(key)(i), vbBinaryCompare) > 0 Then
boolNot = True: Exit For
End If
End If
Next k
If boolNot Then Exit For
Next i
If Not boolNot Then 'none of the names belonging to the respective kay, NOT FOUND
For j = 0 To UBound(arr(0)(key))
mtch = Application.match(arr(0)(key)(j), arrFin, 0)
If j = UBound(arr(0)(key)) Then
arrFin(mtch - 1) = key
Else
strRepl = arrFin(mtch - 1) & "@#$%": arrFin(mtch - 1) = strRepl
arrFin = filter(arrFin, strRepl, False)
End If
Next j
End If
boolNot = False
Next key
Debug.Print Join(arrFin, "|") 'just to see the final returned array (unique names)
End Sub
Function extractNB(arr) As Variant
Dim arrInt, k As Long, El, i As Long, frstEl As Long
Dim dict As Object, dict1 As Object
Set dict = CreateObject("Scripting.Dictionary") 'dictionary to keep all initial composed array elements (as key) and its split names (as item)
Set dict1 = CreateObject("Scripting.Dictionary") 'dictionary to keep UNIQUE extracted names
frstEl = 1 'first letter to build the first name string
For Each El In arr 'iterate between the array elements
ReDim arrInt(Len(El)) 'ReDim the intermediary array to the string number of digits
For i = 2 To Len(El) 'iterate between string characters, starting from the second one:
If isUpper(Mid(El, i, 1)) Then 'if the character is Upper case:
arrInt(k) = Mid(El, frstEl, i - frstEl): k = k + 1 'place the Name inthe intermediary dictionary item array
dict1(arrInt(k - 1)) = vbNullString 'load the dictionary to keep all UNIQUE extracted names
frstEl = i 'adapt the first character number, to extract the next name
End If
Next i
If frstEl < i - 1 Then 'extract the last name:
arrInt(k) = Mid(El, frstEl, Len(El) - frstEl + 1): k = k + 1
dict1(arrInt(k - 1)) = vbNullString 'load the dictionary keeping all UNIQUE extracted names
End If
ReDim Preserve arrInt(k - 1) 'keep only the loaded array elements
dict.Add El, arrInt 'create the dictionary key and its array item
frstEl = 1: Erase arrInt: k = 0 'reinitialize the necessary variables
Next
extractNB = Array(dict, dict1) 'return an array containing the two dictionaries
End Function
Function isUpper(mstr As String) As Boolean 'Check if the character is Upper case
Select Case Asc(mstr)
Case 65 To 90
isUpper = True
Case 97 To 122
isUpper = False
Case Else
isUpper = False
End Select
End Function
请在测试后发送一些反馈。如果有什么不清楚的地方,请随时要求澄清。
我尝试注释所有不容易理解的代码行......
上一个:这是什么离散优化系列?
评论
UpperCase
obBill
BobBill
JacobBilly
nameArrayUpdated[7] = "JohnGeorge"