Asked by: CyclikP  Asked: 8/12/2022  Last edited by: CyclikP  Updated: 8/15/2022  Views: 560
Delimiter after a quoted field, how to escape quote
Q:
I have a file like this:
info1;info2;info3";info4;info5
When it is parsed, I get this error:
Error: [42636] ETL-2106: Error while parsing row=0 (starting from 0) [CSV Parser found at byte 5 (starting with 0 at the beginning of the row) of 5 a field delimiter after an quoted field (with an additional whitespace) in file '~/path'. Please check for correct enclosed fields, valid field separators and e.g. unescaped field delimiters that are contained in the data (these have to be escaped)]
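I'm sure the cause is here: info3"; but I don't know how to fix it.
I also can't remove the quote, because it has to appear in the report.
The main part of the Python code is: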
# Transform data to valid CSV format: remove BOM, remove '=' sign, remove repeating quotes in Size column
decoded_csv = r.content.decode('utf-8').replace(u'\ufeff', '').replace('=', '')
print(decoded_csv)
cr = csv.reader(decoded_csv.splitlines(), delimiter=';')
lst = list(cr)[1:]
f = csv.writer(open(base_folder + 'txt/' + shop, "w+"), delimiter=';')
for row in lst:
    f.writerow(row[:-2])
After this code runs, I get a file like this:
info1;info2;"info3""";info4;info5
That is not what I need. But when I add quoting=csv.QUOTE_NONE, quotechar='' to the writer:
# Transform data to valid CSV format: remove BOM, remove '=' sign, remove repeating quotes in Size column
decoded_csv = r.content.decode('utf-8').replace(u'\ufeff', '').replace('=', '')
print(decoded_csv)
cr = csv.reader(decoded_csv.splitlines(), delimiter=';')
lst = list(cr)[1:]
f = csv.writer(open(base_folder + 'txt/' + shop, "w+"), delimiter=';', quoting=csv.QUOTE_NONE, quotechar='')
for row in lst:
    f.writerow(row[:-2])
I get what I need:
info1;info2;info3";info4;info5
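For reference, the difference between the two writer configurations can be reproduced in isolation. This is a minimal sketch, not the original script: the helper `write_row` is a hypothetical name, and the row is hard-coded instead of coming from the downloaded report. It uses `quotechar=None` rather than `quotechar=''`, on the assumption that a recent Python version is in use (the `csv` documentation notes that an empty `quotechar` is not allowed from 3.11 on).

```python
import csv
import io


def write_row(row, **writer_kwargs):
    """Write one row with csv.writer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=';', **writer_kwargs).writerow(row)
    return buf.getvalue()


# Sample row taken from the question; the third field ends with a quote.
row = ['info1', 'info2', 'info3"', 'info4', 'info5']

# Default settings (QUOTE_MINIMAL, doublequote=True): the field that
# contains a quote is enclosed in quotes and its inner quote is doubled.
default_line = write_row(row)

# QUOTE_NONE with no quote character: fields are written unchanged.
raw_line = write_row(row, quoting=csv.QUOTE_NONE, quotechar=None)

print(default_line, end='')  # info1;info2;"info3""";info4;info5
print(raw_line, end='')      # info1;info2;info3";info4;info5
```

With `QUOTE_NONE` the writer raises an error if a field contains the delimiter and no `escapechar` is set, so this variant only suits data that never contains a semicolon.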
This is the second step (Exasol), and this code returns the error:
MERGE INTO hst AS dst
USING (
    SELECT DISTINCT
        ar,
        ar_na
    FROM (
        IMPORT INTO
        (
            ar VARCHAR(100) UTF8 COMMENT IS 'ar',
            ar_na VARCHAR(100) UTF8 COMMENT IS 'ar na'
        )
        FROM CSV /*SS:R*/
        AT '&1'
        USER '&2'
        IDENTIFIED BY '&3'
        FILE '~/path'
        SKIP = 0
        ROW SEPARATOR = 'CRLF'
        COLUMN SEPARATOR = ';'
        TRIM
    )
    GROUP BY
        ar,
        ar_na
) src ON src.ar = dst.ar
WHEN MATCHED THEN UPDATE SET
    dst.ar_na = src.ar_na
WHEN NOT MATCHED THEN
    INSERT (
        ar,
        ar_na
    )
    VALUES (
        src.ar,
        src.ar_na
    );
If the file looks fine, like this, the whole script works:
info1;info2;info3;info4;info5
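A:
By default, Exasol treats the double quote (") as the column delimiter, that is, the character that encloses a field. This lets you specify values that contain the column separator (in your case, the semicolon). See the "Special characters" entry in the documentation.
You have two options here:
- Disable the column delimiter by passing COLUMN DELIMITER = '' to the IMPORT statement.
- Double every double quote in the CSV file. If the column delimiter occurs twice in a row, Exasol does not treat it as a delimiter.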
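The second option can be sketched in Python, the language of the question's preprocessing step. This is only an illustration: `double_quotes` is a hypothetical helper, and it assumes no field in the file is already enclosed in quotes, because enclosing quotes would be doubled as well. Whether Exasol accepts the result for a particular file is taken from the answer above and is not verified here.

```python
def double_quotes(text):
    """Return text with every double quote written twice."""
    return text.replace('"', '""')


# Sample line from the question.
line = 'info1;info2;info3";info4;info5'
doubled = double_quotes(line)

print(doubled)  # info1;info2;info3"";info4;info5
```

In the question's script this could be applied to each field before a writer configured with quoting=csv.QUOTE_NONE writes the row.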