提问人:Tireur 提问时间:8/25/2023 最后编辑:Tireur 更新时间:8/25/2023 访问量:52
CSV 导入 MySQL 被截断 - 数据多于列
CSV import into MySQL truncated - more data than columns
问:
对不起,如果似乎已经回答了很多次,但花了很多时间试图弄清楚。我有一个从CSV到MySQL的复杂导入。我的CSV文件是在不同的软件中生成的,更改它是不可能的。
我的导入代码:
LOAD DATA INFILE 'krupka_lezaky.csv'
INTO TABLE table_lezaky_krupka5
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@var1, @var2, NAZEV, DOPLNEK, @var5, @var6, @var7, @var8, @var9, @var10, @var11, @var12)
SET
ID_ZBOZI = REPLACE(@var1, '"', ''), -- Remove double quotes
ID_CISELNIK = REPLACE(@var2, '"', ''), -- Remove double quotes
MNOZSTVI = REPLACE(REPLACE(@var5, ',', '.'), '"', ''), -- Remove double quotes and commas
NCENA_BD = REPLACE(REPLACE(@var6, ',', '.'), '"', ''),
NCENA_SD = REPLACE(REPLACE(@var7, ',', '.'), '"', ''),
PCENA_SD = REPLACE(REPLACE(@var8, ',', '.'), '"', ''),
PCENA_RP = REPLACE(REPLACE(@var9, ',', '.'), '"', ''),
DAT_PRIJEM = STR_TO_DATE(@var10, '%d.%m.%Y %H:%i:%s'),
DAT_VYDEJ = STR_TO_DATE(@var11, '%d.%m.%Y %H:%i:%s'),
KOD_VZP = REPLACE(@var12, '"', ''), -- Remove double quotes
ID_PROVOZOVNA = 4; -- Set the fixed value for the 13th column
我的表格格式:
我的表代码:
CREATE TABLE `table_lezaky_krupka5` (
`ID_ZBOZI` int NOT NULL,
`ID_CISELNIK` varchar(5) DEFAULT NULL,
`NAZEV` varchar(200) DEFAULT NULL,
`DOPLNEK` varchar(200) DEFAULT NULL,
`MNOZSTVI` decimal(10,4) DEFAULT NULL,
`NCENA_BD` decimal(10,4) DEFAULT NULL,
`NCENA_SD` decimal(10,4) DEFAULT NULL,
`PCENA_SD` decimal(10,4) DEFAULT NULL,
`PCENA_RP` decimal(10,4) DEFAULT NULL,
`DAT_PRIJEM` datetime DEFAULT NULL,
`DAT_VYDEJ` datetime DEFAULT NULL,
`KOD_VZP` varchar(20) DEFAULT NULL,
`ID_PROVOZOVNA` int NOT NULL,
PRIMARY KEY (`ID_ZBOZI`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
我的csv文件:
"ID_ZBOZI","ID_CISELNIK","NAZEV","DOPLNEK","MNOZSTVI","NCENA_BD","NCENA_SD","PCENA_SD","PCENA_RP","DAT_PRIJEM","DAT_VYDEJ","KOD_VZP"
"11311782","*","5PreveMax Imunit nukleotidy+betaglukan","tbl 60+20","1,0000","238,5000","274,2800","365,0000","365,0000","21.07.2022 07:34:57","14.07.2022 14:32:58",""
"15100001","1","ACYLCOFFIN","POR TBL NOB 10","6,0000","307,0100","337,7300","470,0000","470,0000","20.07.2022 08:08:38","10.02.2023 11:07:09","9"
"15103351","1","AERIUS 5 MG","TBL 30X5MG","1,0000","109,9100","120,9000","160,0000","160,0000","27.07.2022 07:55:40","26.07.2022 07:55:05","26329"
"15144399","1","AESCULUS HIPPOCASTANUM","GRA 4GM 5CH","2,0000","148,3200","163,1600","230,0000","230,0000","06.01.2021 08:30:50","","62741"
"15100094","1","AGOLUTIN","INJ 5X2ML/60MG","1,0000","55,7400","61,3100","70,0000","70,0000","11.02.2022 08:13:57","07.02.2022 14:58:29","186149"
"15103137","*","Alpa bylinný masážní gel kaštan ","100ml","1,0000","35,5700","43,0400","65,0000","65,0000","23.12.2022 10:15:38","22.12.2022 09:14:17",""
"15102019","*","Alpa emulze s arnikou ","100ml","2,0000","81,1200","98,1600","138,0000","138,0000","15.02.2023 07:51:23","14.02.2023 10:51:55",""
"15104744","1","AMPRILAN 5","TBL 30X5MG","1,0000","23,6000","25,9600","34,0000","34,0000","23.12.2022 10:15:25","22.12.2022 16:29:00","23962"
"15120270","1","ANALERGIN NEO","5MG TBL FLM 90 II","1,0000","195,7900","215,3700","283,0000","283,0000","23.09.2022 06:48:38","22.09.2022 11:23:23","201946"
错误:
Error Code: 1262. Row 1 was truncated; it contained more data than there were input columns
我想我的问题是 CSV 文件的格式,我们使用逗号作为小数分隔符,然后我的十进制值也用双引号引起来。我试图绕过它,使用替代品来获得正确的格式,但仍然出现错误。任何帮助将不胜感激,谢谢。
答:
0赞
Tireur
8/25/2023
#1
感谢大家的帮助,我想我终于找到了解决方案。 有两个问题,
- CSV 文件中的某些日期为 null,因此str_to_date导致错误,在那里设置条件会有所帮助(至少我认为这是原因)
- 我以 UTF-8 编码重新保存了 CSV 文件,似乎原始文件是 ANSI 编码
我现在的代码是这样的:
LOAD DATA INFILE 'hranik_lezaky.csv'
INTO TABLE table_lezaky_hranik
CHARACTER SET latin1 -- Assuming ANSI encoding is similar to ISO 8859-1 (Latin-1)
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(@var1, @var2, NAZEV, DOPLNEK, @var5, @var6, @var7, @var8, @var9, @var10, @var11, @var12)
SET
ID_ZBOZI = @var1,
ID_CISELNIK = @var2,
MNOZSTVI = REPLACE(@var5, ',', '.'),
NCENA_BD = REPLACE(@var6, ',', '.'),
NCENA_SD = REPLACE(@var7, ',', '.'),
PCENA_SD = REPLACE(@var8, ',', '.'),
PCENA_RP = REPLACE(@var9, ',', '.'),
DAT_PRIJEM = IF(@var11 <> '', STR_TO_DATE(@var11, '%d.%m.%Y %H:%i:%s'), NULL),
DAT_VYDEJ = IF(@var11 <> '', STR_TO_DATE(@var11, '%d.%m.%Y %H:%i:%s'), NULL),
KOD_VZP = @var12,
ID_PROVOZOVNA = 1; -- Set the fixed value for the 13th column
最后,我设法导入了数据,现在我将尝试在此代码中更改导入的CSV文件的编码
编辑// 完成,谢谢:)
评论
REPLACE(@var1, '"', '')
ENCLOSED BY '"'
()