提问人:JHen 提问时间:8/24/2023 最后编辑:JHen 更新时间:8/26/2023 访问量:72
从大于 5 条记录的 csv 文件中提取数据
Pulling data from csv files larger than 5 records
问:
有关更多上下文,请参阅我的第一篇文章:第一篇文章
我需要将数据从 csv 文件提取到现有的 excel 文件中。我对自定义函数和整个 excel 的更深层次机制知之甚少,因此在 GPT 3.5 的帮助下,我开发了一个自定义函数,用于从具有 2 条记录的测试 csv 文件中提取数据。
以下是 GPT 给我的功能,根据我得到的反馈进行了一些调整:
Function GetCSVCellValueFromRecord(csvFilePath As String, recordIndex As Long, targetColumnName As String) As Variant
Dim csvContent As String
Dim lines() As String
Dim headers() As String
Dim columnIndex As Long
Dim i As Long
' Read the entire CSV file into a string
Open ThisWorkbook.Path & "\" & csvFilePath For Input As #1
csvContent = Input$(LOF(1), 1)
Close #1
' Split the CSV content into lines
lines = Split(csvContent, vbCrLf)
' Get the headers from the first line
headers = Split(lines(0), ",")
' Find the column index of the target data
columnIndex = -1
For i = LBound(headers) To UBound(headers)
If Trim(headers(i)) = targetColumnName Then
columnIndex = i
Exit For
End If
Next i
' Return an error if the column name is not found
If columnIndex = -1 Then
GetCSVCellValueFromRecord = CVErr(xlErrValue)
Exit Function
End If
' Check if the requested record index is within bounds
If recordIndex >= 1 And recordIndex <= UBound(lines) Then
Dim fields() As String
fields = Split(lines(UBound(lines) - recordIndex), ",") ' Remove the subtraction of 1 here
If UBound(fields) >= columnIndex Then
GetCSVCellValueFromRecord = Trim(fields(columnIndex))
Exit Function
End If
End If
' Return "N/A" if the record or data is not found
GetCSVCellValueFromRecord = "N/A"
End Function
我有一个包含所需 csv 文件名称的单元格,我的函数调用引用此单元格,以便使用 csv 中的信息更新所有相关字段。
函数调用示例:如果找不到记录,则返回“N/A”。=GetCSVCellValueFromRecord(F1, 1, "Comment")
该函数适用于包含 5 条记录的小型 csv 文件,但对于不存在的第 6 条记录,它返回指定列的名称,然后继续正确显示不存在的第 7 条到第 20 条记录的“N/A”。我不确定是什么原因导致了这个错误。当测试一个有 10 条记录的文件时,该函数会完全失败,并且所有字段都返回不正确的数据,并且应该存在的字段返回 N/A。
再一次,我不确定这个问题。请指教
csv 文件将最新的记录作为第一行,出于我的目的,我需要将最早的记录视为第一行
来自工作 csv 的数据:
Date,Time,Service provider,Client name,Client phone,Comment,Service category
24/8/2023,01:00 PM - 01:55 PM,Tim Robinson,Severus Mitsu,18765894838,"5.Checked only 1 mirror before moving off
",Manual General License
24/8/2023,10:15 AM - 11:10 AM,Tim Robinson,Severus Mitsu,18765894838,"4.Started the vehicle properly
",Manual General License
24/8/2023,09:20 AM - 10:15 AM,Tim Robinson,Severus Mitsu,18765894838,"3.Failed to disengage parking brake
",Manual General License
24/8/2023,08:25 AM - 09:20 AM,Tim Robinson,Severus Mitsu,18765894838,"2.Failed to engage seatbelt
",Manual General License
24/8/2023,07:30 AM - 08:25 AM,Tim Robinson,Severus Mitsu,18765894838,"1.Successfully Opened door
",Manual General License
来自损坏的 csv 的数据:
Date,Tim e,Service,"Service provider","Client name","Client phone",Comment,"Service category"
26-08-2023,"01:00 PM - 01:55 PM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"10. Excessive Speeding","Manual General License"
26-08-2023,"10:15 AM - 11:10 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"9.Ran the red lights","Manual General License"
26-08-2023,"09:20 AM - 10:15 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"8.Mastered Parallel Parking","Manual General License"
26-08-2023,"08:25 AM - 09:20 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"7.botched attempt at parallel parking","Manual General License"
26-08-2023,"07:30 AM - 08:25 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"6. Making good progress","Manual General License"
24-08-2023,"01:00 PM - 01:55 PM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"5.Checked only 1 mirror before moving off
","Manual General License"
24-08-2023,"10:15 AM - 11:10 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"4.Started the vehicle properly
","Manual General License"
24-08-2023,"09:20 AM - 10:15 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"3.Failed to disengage parking brake
","Manual General License"
24-08-2023,"08:25 AM - 09:20 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"2.Failed to engage seatbelt
","Manual General License"
24-08-2023,"07:30 AM - 08:25 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"1.Success fully Opened door
","Manual General License"
答:
某些带引号的字段值包含换行符。仅当您的文件在行之间和字段之间只有这些内容时,才有效拆分逗号:如果字段值可能包含其中任何一个,则不起作用。vbCrLf
为了更清楚起见,这将显示换行符在文件中的位置(vbCrLf/vbLf/vbCr 中的任何一个):
ff = FreeFile ' Read the entire CSV file into a string...
Open ThisWorkBook.path & "\" & csvFilePath For Input As ff
csvContent = Input$(LOF(ff), ff)
Close ff
csvContent = Replace(csvContent, vbCrLf, "<CrLf>")
csvContent = Replace(csvContent, vbCr, "<Cr>")
csvContent = Replace(csvContent, vbLf, "<Lf>")
csvContent = Replace(csvContent, "<CrLf>", "<CrLf>" & vbLf)
Debug.Print csvContent
某些记录的引号内有 CrLf - 这些不是分隔线的新行,而是嵌入在字段数据中的换行符。
Date,Tim e,Service,"Service provider","Client name","Client phone",Comment,"Service category"<CrLf>
26-08-2023,"01:00 PM - 01:55 PM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"10. Excessive Speeding","Manual General License"<CrLf>
26-08-2023,"10:15 AM - 11:10 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"9.Ran the red lights","Manual General License"<CrLf>
26-08-2023,"09:20 AM - 10:15 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"8.Mastered Parallel Parking","Manual General License"<CrLf>
26-08-2023,"08:25 AM - 09:20 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"7.botched attempt at parallel parking","Manual General License"<CrLf>
26-08-2023,"07:30 AM - 08:25 AM","Nissan Frontier #CO 026","Harold Richards","Severus Mitsu",+18765894838,"6. Making good progress","Manual General License"<CrLf>
24-08-2023,"01:00 PM - 01:55 PM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"5.Checked only 1 mirror before moving off<CrLf>
","Manual General License"<CrLf>
24-08-2023,"10:15 AM - 11:10 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"4.Started the vehicle properly<CrLf>
","Manual General License"<CrLf>
24-08-2023,"09:20 AM - 10:15 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"3.Failed to disengage parking brake<CrLf>
","Manual General License"<CrLf>
24-08-2023,"08:25 AM - 09:20 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"2.Failed to engage seatbelt<CrLf>
","Manual General License"<CrLf>
24-08-2023,"07:30 AM - 08:25 AM","L200 Mitsubishi #794 AL","Tim Robinson","Severus Mitsu",+18765894838,"1.Success fully Opened door<CrLf>
","Manual General License"<CrLf>
编辑:这成功地从“问题”文件中读取内容:
Sub Tester()
Debug.Print GetCSVCellValueFromRecord("Test2.csv", 1, "Comment")
'>> 1.Success fully Opened door
Debug.Print GetCSVCellValueFromRecord("Test2.csv", 2, "Comment")
'>> 2.Failed to engage seatbelt
Debug.Print GetCSVCellValueFromRecord("Test2.csv", 1, "Commentx")
'>> ?header?
Debug.Print GetCSVCellValueFromRecord("Test2.csv", 4, "Comment")
'>> 4.Started the vehicle properly
Debug.Print GetCSVCellValueFromRecord("Test2.csv", 30, "Comment")
'>> ?rec num?
End Sub
Function GetCSVCellValueFromRecord(csvFilePath As String, recordIndex As Long, targetColumnName As String) As Variant
Dim numLines As Long, lines As Collection
Dim headers() As String, fields() As String
Dim columnIndex As Long, i As Long
Set lines = CSVToLines(ThisWorkBook.path & "\" & csvFilePath)
numLines = lines.Count
If numLines = 0 Then
GetCSVCellValueFromRecord = "?no data?"
Exit Function
End If
' Find the column index of the target data
headers = lines(1) ' Get the headers from the first line
columnIndex = -1
For i = LBound(headers) To UBound(headers)
If Trim(headers(i)) = targetColumnName Then
columnIndex = i
Exit For
End If
Next i
' check the column name was found
If columnIndex = -1 Then
GetCSVCellValueFromRecord = "?header?"
Else
' record index is within bounds?
If recordIndex >= 1 And recordIndex <= lines.Count - 1 Then
fields = lines(numLines - (recordIndex - 1))
If UBound(fields) >= columnIndex Then
GetCSVCellValueFromRecord = Trim(fields(columnIndex))
Exit Function
End If
Else
GetCSVCellValueFromRecord = "?rec num?"
End If
End If
End Function
'Return the content from a CSV file as a Collection of
' arrays, one array per line
'Account for quoted fields containing commas, quotes, or newlines
Function CSVToLines(csvFilePath As String) As Collection
Dim csvContent As String, line As String, cNext As String
Dim c, col As New Collection
Dim i As Long, ff As Integer, length As Long, inQ As Boolean
ff = FreeFile ' Read the entire CSV file into a string...
Open csvFilePath For Input As ff
csvContent = Input$(LOF(ff), ff)
Close ff
csvContent = Replace(csvContent, vbCrLf, vbLf) & " " 'normalize newlines and add a space
length = Len(csvContent) - 1 'skip the added space when looping...
inQ = False 'not yet inside a quoted field
i = 1
Do While i < length
c = Mid(csvContent, i, 1)
cNext = Mid(csvContent, i + 1, 1)
If c = """" Then 'already in quotes?
If cNext = """" Then
line = line & c 'was a doubled-up quote - add one quote
i = i + 1 'skip the next character
Else
inQ = Not inQ 'switching in/out of a quoted field
End If
Else
If Not inQ Then
'Not in a quoted field - decide how to handle the character
Select Case c
Case vbLf 'not inside quotes, so this is the end of a line...
col.Add Split(line, Chr(0)) 'split fields on chr(0)
line = ""
Case ",": line = line & Chr(0) 'sub comma for chr(0) for later splitting
Case Else: line = line & c
End Select
Else
'In a quoted field: add each character unless a vblf
' change behaviour to suit your needs....
Select Case c
Case vbLf 'do nothing?
Case Else: line = line & c
End Select
End If
End If
i = i + 1
Loop
If Len(line) > 0 Then col.Add Split(line, Chr(0)) 'add any remaining line
Set CSVToLines = col
End Function
评论