Asked by: Uadip | Asked: 11/10/2023 | Last edited by: Uadip | Updated: 11/13/2023 | Views: 33
How can I parse tr11.wc.arff and CSTR.arff data in Python for clustering?
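Q:
I am a student and new to this. I found a paper that uses tr11.wc.arff, tr23.arff, CSTR.arff, and similar datasets for clustering (the data is available at http://sites.labic.icmc.usp.br/ragero/arffs/). I am trying to use the same datasets for clustering in Python, but I cannot read or parse them, and I only understand the basics of the ARFF format. A sample of the contents of tr11.wc.arff: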
% ARFF format training set
@RELATION tr11.mat
@ATTRIBUTE outfit integer
@ATTRIBUTE hasn integer
@ATTRIBUTE calm integer
@ATTRIBUTE gene integer
.
.
.
@DATA
{28 1,30 9,33 3,258 1,329 1,346 2,351 2,352 1,353 7,367 2,376 2,379 1,381 1,385 4,387 1,391 1,392 2,404 3,405 1,1162 4,1221 1,1460 1,1462 1,1470 1,1498 4,1499 1,1501 1,1502 1,1505 2,1506 2,1563 1,1576 1,1695 1,1708 1,1743 1,1755 1,1779 1,1828 1,1877 1,1915 1,1934 1,1973 1,2008 1,2130 1,2133 1,2149 2,2173 2,2186 1,2202 1,2219 1,2231 2,2235 1,2276 1,2282 1,2284 1,2325 1,2369 2,2376 1,2390 1,2401 3,2431 1,2457 1,2467 2,2498 1,2587 1,2726 2,2744 2,2747 1,2769 2,2774 1,2796 1,3005 1,3025 1,3192 1,3203 1,3207 1,3224 1,3228 3,3267 1,3268 1,3269 1,3270 1,3337 1,3367 1,3384 1,3413 1,3451 2,3472 4,3488 3,3505 1,3524 1,3528 1,3545 1,3546 1,3552 1,3589 3,3623 1,3675 1,3688 1,3690 2,3705 2,3724 3,3727 1,3732 3,3803 3,3814 6,3819 3,3825 1,3826 12,3839 2,3841 2,3846 2,3849 13,3868 1,3870 3,3882 1,3890 1,3917 2,3928 2,3980 4,4022 1,4049 1,4100 7,4137 1,4138 1,4242 1,4346 1,4376 1,4399 3,4405 2,4428 1,4430 3,4455 5,4485 1,4509 1,4520 1,4527 1,4542 3,4600 2,4616 1,4728 1,4770 1,4804 1,4824 15,4854 2,4863 1,4896 1,4901 1,4903 7,4943 1,4952 3,4957 1,5098 1,5110 1,5122 2,5161 1,5170 4,5191 1,5394 1,5401 1,5421 5,5486 1,5489 1,5494 5,5508 1,5512 1,5515 2,5543 1,5578 1,5774 1,5789 1,5810 1,5824 1,5828 2,6113 1,6234 2,6309 1,6429 4}
{30 7,36 1,256
I have tried several libraries, but none of them could load the data. Please help!
I tried this:
from scipy.io import arff
import numpy as np
file = '/Users/uadip/Documents/tr11.wc.arff'
data, meta = arff.loadarff(file)
# Extract feature names and data
attributes = meta.names()
data = np.array(data.tolist())
and got this error:
Traceback (most recent call last):
  File "/Users/uadip/my_workspace/python/code_gen/final tests/pg_2.py", line 6, in <module>
    data, meta = arff.loadarff(file)
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/scipy/io/arff/_arffread.py", line 802, in loadarff
    return _loadarff(ofile)
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/scipy/io/arff/_arffread.py", line 867, in _loadarff
    a = list(generator(ofile))
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/scipy/io/arff/_arffread.py", line 865, in generator
    yield tuple([attr[i].parse_data(row[i]) for i in elems])
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/scipy/io/arff/_arffread.py", line 865, in <listcomp>
    yield tuple([attr[i].parse_data(row[i]) for i in elems])
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/scipy/io/arff/_arffread.py", line 223, in parse_data
    return float(data_str)
ValueError: could not convert string to float: '{28 1'
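The `'{28 1'` fragment in this error is consistent with the file using ARFF's sparse row format, in which each row is written as `{index value, index value, ...}` with 0-based attribute indices and omitted entries standing for 0. The SciPy documentation for `loadarff` notes that it cannot read sparse data, which would explain why a comma-separated piece such as `{28 1` reaches `float()`. As a minimal sketch, a single sparse row can be parsed by hand. `parse_sparse_row` is a hypothetical helper name, not part of any library, and the sketch assumes purely numeric values with no quoted strings.

```python
def parse_sparse_row(line):
    """Parse one sparse ARFF row such as '{28 1,30 9}' into {index: value}."""
    body = line.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("not a sparse ARFF row: %r" % line[:20])
    row = {}
    for pair in body[1:-1].split(","):
        pair = pair.strip()
        if not pair:
            continue  # an empty row '{}' yields no pairs
        # Each pair is '<attribute index> <value>' (assumed numeric).
        index, value = pair.split(None, 1)
        row[int(index)] = float(value)
    return row


print(parse_sparse_row("{28 1,30 9,33 3}"))
```

Every attribute index that does not appear in the returned dict is implicitly 0 for that row.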
I also tried other approaches without success. Another attempt:
import arff
file = '/Users/uadip/Documents/tr11.wc.arff'
data = arff.load(file)
data_values = []
for row in data:
    row_values = []
    for key in row:
        row_values.append(row[key])
    data_values.append(row_values)
Error:
Traceback (most recent call last):
  File "/Users/uadip/my_workspace/python/code_gen/final tests/pg_2.py", line 7, in <module>
    for row in data:
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/arff/__init__.py", line 240, in load
    for item in Reader(fhand):
  File "/Users/uadip/my_workspace/python/code_gen/venv/lib/python3.9/site-packages/arff/__init__.py", line 273, in __iter__
    field_type_text = space_separated[2].strip()
IndexError: list index out of range
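Since neither reader handled the file, one option may be to read it without an ARFF library at all. The following is a rough sketch under several assumptions: every row after `@DATA` is in the sparse `{index value, ...}` form shown in the sample, all values are numeric, and attribute names are unquoted and contain no spaces. `load_sparse_arff` and `to_dense` are hypothetical helper names, and the tiny `sample` list is made-up demo input rather than real data from tr11.wc.arff. A nominal class attribute, if the file has one, would need separate handling.

```python
def load_sparse_arff(lines):
    """Return (attribute_names, rows); each row is a dict {column_index: value}."""
    names, rows, in_data = [], [], False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not in_data:
            keyword = line.split(None, 1)[0].lower()
            if keyword == "@attribute":
                # Assumes unquoted attribute names without spaces.
                names.append(line.split()[1])
            elif keyword == "@data":
                in_data = True
            continue
        if not (line.startswith("{") and line.endswith("}")):
            raise ValueError("expected a sparse row, got: %r" % line[:30])
        row = {}
        for pair in line[1:-1].split(","):
            if pair.strip():
                index, value = pair.split(None, 1)
                row[int(index)] = float(value)  # assumes numeric values
        rows.append(row)
    return names, rows


def to_dense(rows, n_columns):
    """Expand sparse rows into lists, filling omitted entries with 0.0."""
    dense = []
    for row in rows:
        values = [0.0] * n_columns
        for index, value in row.items():
            values[index] = value
        dense.append(values)
    return dense


# Made-up demo input in the same layout as the sample in the question.
sample = [
    "% ARFF format training set",
    "@RELATION demo",
    "@ATTRIBUTE outfit integer",
    "@ATTRIBUTE hasn integer",
    "@ATTRIBUTE calm integer",
    "@DATA",
    "{0 2,2 5}",
    "{1 7}",
]
names, rows = load_sparse_arff(sample)
print(names)
print(to_dense(rows, len(names)))
```

For a real file, the same function could be called as `load_sparse_arff(open(path))`, since it only iterates over lines. With thousands of attributes a dense list of lists may be wasteful; if SciPy and scikit-learn are available, the row dicts could instead be assembled into a `scipy.sparse` matrix and passed to a clustering estimator such as `sklearn.cluster.KMeans`, which accepts sparse input. That step is not shown here.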
A: No answers yet