Asked by: Hack-R Asked: 3/20/2023 Updated: 3/20/2023 Views: 28
tf.data.experimental.CsvDataset ignores my feature specification (tensorflow)
Q:
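tf.data.experimental.CsvDataset seems to ignore my feature specification. I get the same problem when I use record_defaults generated programmatically from a SchemaGen schema. The data is synthetic, to rule out the data itself as a potential source of the error.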
I generate the data as follows:
import os
import random
import string

# Assumed example values; the original post does not show location_list
location_list = ['New York', 'London', 'Tokyo']

# Generate the dataset
data = []
for i in range(100):
    row = {}
    row['label'] = random.randint(1, 10)
    row['age'] = random.randint(50, 100)
    row['Location'] = random.choice(location_list)  # string categorical feature
    row['text'] = ''.join(random.choices(string.ascii_letters + string.digits, k=10)).encode()
    data.append(row)

# Write the dataset to a CSV file
# Create the directory if it doesn't exist
if not os.path.exists('temp123'):
    os.mkdir('temp123')
else:
    # Remove existing files if the directory already exists
    for filename in os.listdir('temp123'):
        file_path = os.path.join('temp123', filename)
        try:
            if os.path.isfile(file_path) or os.path.islink(file_path):
                os.unlink(file_path)
        except Exception as e:
            print(f'Failed to delete {file_path}. Reason: {e}')

with open('temp123/data.csv', 'w') as f:
    f.write('label,age,Location,text\n')
    for row in data:
        f.write(f"{row['label']},{row['age']},{row['Location']},{row['text'].decode()}\n")
I load the data like this:
import tensorflow as tf

# Define the feature specification for the CSV file
feature_types = {
    'label': tf.int32,
    'age': tf.int32,
    'Location': tf.string,
    'text': tf.string
}

# Load the CSV file
train_dataset = tf.data.experimental.CsvDataset(
    ['temp123/data.csv'],
    feature_types,
    header=True
)
Regardless, every column is loaded as a string:
<CsvDatasetV2 element_spec=( TensorSpec(shape=(), dtype=tf.string, name=None), TensorSpec(shape=(), dtype=tf.string, name=None), TensorSpec(shape=(), dtype=tf.string, name=None), TensorSpec(shape=(), dtype=tf.string, name=None))>
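A possible explanation, offered as a sketch rather than a confirmed diagnosis: the second argument of CsvDataset (record_defaults) is documented as a list of dtypes or default-value tensors, one per column. Iterating over a dict yields its keys, so the column names 'label', 'age', and so on may end up being treated as string default values, which would make every column tf.string. The example below passes the dict's values as a list instead. The temporary file path and the two sample rows are made up for illustration.

```python
import os
import tempfile

import tensorflow as tf

# Write a tiny CSV with the same header as in the question.
# The two data rows are invented sample values.
csv_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(csv_path, "w") as f:
    f.write("label,age,Location,text\n")
    f.write("3,72,Paris,a1B2c3D4e5\n")
    f.write("7,55,Tokyo,Z9y8X7w6V5\n")

feature_types = {
    "label": tf.int32,
    "age": tf.int32,
    "Location": tf.string,
    "text": tf.string,
}

# Pass the dtypes as a list in CSV column order, not the dict itself.
# Dicts keep insertion order (Python 3.7+), so values() matches the header
# here only because the dict was written in the same order as the columns.
record_defaults = list(feature_types.values())

dataset = tf.data.experimental.CsvDataset(
    [csv_path],
    record_defaults=record_defaults,
    header=True,
)

# Inspect the per-column dtypes and the first parsed row.
dtypes = [spec.dtype for spec in dataset.element_spec]
first_row = next(iter(dataset))
print(dtypes)
print([value.numpy() for value in first_row])
```

If the column order in the file can differ from the order of the dict, building the list explicitly from the header names would be safer than relying on values().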
A: No answers yet