Asked by: Ibrahim  Asked: 11/6/2023  Last edited by: Ibrahim  Updated: 11/6/2023  Views: 44
How to use a dictionary to clean a column that contains a list of keywords in Python
Q:
I have the following column in a dataset (sample exported as a dictionary):
'amenities': {1913: '[Cooking basics, Hair dryer, Fire extinguisher, Microwave, Refrigerator, Dedicated workspace, Pocket wifi, Lock on bedroom door, Dishes and silverware, Private living room, Essentials, Free street parking, Oven, Paid parking off premises, Hot water, Paid washer In building, Extra pillows and blankets, Hangers, Backyard, Shampoo, TV with standard cable, Stove, Heating, First aid kit, Self check-in, Iron, Smoke alarm, Lockbox, Bed linens, Kitchen, Patio or balcony, Coffee maker, Crib, Wifi]',
11765: '[Cooking basics, Hair dryer, Courtyard view, Coffee maker: drip coffee maker, Standalone high chair - available upon request, Outdoor furniture, Fire extinguisher, Coffee, Microwave, Shared backyard Fully fenced, Refrigerator, Dishes and silverware, TV, Essentials, Free washer In unit, Heating - split type ductless system, Oven, Hot water, Extra pillows and blankets, Toaster, Hangers, Long term stays allowed, Trash compactor, Shower gel, Private patio or balcony, Clothing storage: closet, Cleaning products, Drying rack for clothing, City skyline view, Shampoo, Central air conditioning, Garden view, Freezer, Hot water kettle, Host greets you, Dishwasher, Babysitter recommendations, Wifi, Elevator, Private entrance, First aid kit, Baby bath - available upon request, Safe, AC - split type ductless system, Room-darkening shades, Iron, Smoke alarm, Bed linens, Wine glasses, Kitchen, Body soap, Dining table, Sun loungers, Crib, Stainless steel electric stove, Conditioner, Pack n play/Travel crib - available upon request, Free parking on premises]',
9320: '[Air conditioning, Free street parking, Fire extinguisher, Dedicated workspace, First aid kit, Paid street parking off premises]'}
What I am trying to do is clean this column using a dictionary I created manually (see the sample below; the full one has more than 150 entries).
{'Silver refrigerator': 'Refrigerator',
'Electronia refrigerator': 'Refrigerator',
'Kunft refrigerator': 'Refrigerator',
'com zona de congelao refrigerator': 'Refrigerator',
'BEKO refrigerator': 'Refrigerator',
'Teka refrigerator': 'Refrigerator',
'Desconhecida refrigerator': 'Refrigerator',
'Pequeno com espao de congelao refrigerator': 'Refrigerator',
'SMEG refrigerator': 'Refrigerator',
'Indiferente refrigerator': 'Refrigerator',
'ORIMA refrigerator': 'Refrigerator',
'Hotpoint refrigerator': 'Refrigerator',
'JOCEL refrigerator': 'Refrigerator',
'Frigorico com congelador de encastre - BALAY refrigerator': 'Refrigerator',
'Grote Koelkast refrigerator': 'Refrigerator',
'SMEG refrigerator': 'Refrigerator',
'Samsung refrigerator': 'Refrigerator',
'Americano refrigerator': 'Refrigerator',
'Candy refrigerator': 'Refrigerator',
'Bosch refrigerator': 'Refrigerator',
'Lg refrigerator': 'Refrigerator',
'Resort access': 'Resort access'}
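Specifically, I want to check whether a key of the dictionary appears in the list and, if so, replace it with the corresponding value.
I wrote the following function, but it does not work. The output is a list of lists in which each list is just a single letter. When I run the same function on a simple example, it works fine. What am I doing wrong?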
def clean_words(word_list, replacement_dict):
    cleaned_words = [replacement_dict.get(word, word) for word in word_list]
    return cleaned_words

df['amenities'] = df['amenities'].apply(clean_words, replacement_dict=replacement_dict)
A:
1 upvote
mozway
11/6/2023
#1
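The problem is that you do not have a list but a string. Unfortunately, because the inner strings are not quoted, this is not a valid representation of a Python list, so you cannot use ast.literal_eval.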
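To see both points concretely, here is a minimal sketch using only the standard library. The variable raw is a shortened, made-up cell value in the same format as the question's data, and first_items and parsed_ok are names chosen for illustration.

```python
import ast

# A shortened cell value in the same format as the 'amenities' column.
raw = '[Air conditioning, Free street parking, Fire extinguisher]'

# Iterating over a string yields its characters, which is why the
# original list comprehension returned single letters.
first_items = [ch for ch in raw][:4]
print(first_items)  # ['[', 'A', 'i', 'r']

# The items inside the brackets are unquoted, so the text is not a
# valid Python literal and ast.literal_eval refuses to parse it.
try:
    ast.literal_eval(raw)
    parsed_ok = True
except (ValueError, SyntaxError):
    parsed_ok = False
print(parsed_ok)  # False
```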
One option is to split the string:
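def clean_words(word_list, replacement_dict):
    # word_list is a string such as '[a, b, c]': drop the brackets, split on ', ',
    # map each item through the dictionary, then rebuild the same string format.
    cleaned_words = '[%s]' % ', '.join(
        [replacement_dict.get(word, word) for word
         in word_list[1:-1].split(', ')])
    return cleaned_words

df['amenities'] = df['amenities'].apply(clean_words, replacement_dict=replacement_dict)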
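If real Python lists are more convenient than a rebuilt string, the same split can feed a list directly. The following is only a sketch of that variant, not part of the answer above: parse_amenities and clean_amenities are hypothetical helper names, the two-entry replacement_dict and the rows mapping are made-up sample data, and the code assumes no single amenity contains ', '. Dropping duplicates after replacement is an extra assumption about what "clean" should mean, since several variants map to the same value.

```python
def parse_amenities(raw):
    """Turn '[a, b, c]' into ['a', 'b', 'c'] (assumes no item contains ', ')."""
    inner = raw.strip()[1:-1]
    # An empty '[]' should give an empty list rather than [''].
    return [item.strip() for item in inner.split(', ')] if inner else []


def clean_amenities(raw, replacement_dict):
    """Replace known variants, then drop duplicates while keeping order."""
    replaced = [replacement_dict.get(item, item) for item in parse_amenities(raw)]
    # dict.fromkeys preserves first-seen order (an assumed extra cleaning step).
    return list(dict.fromkeys(replaced))


# Made-up sample data in the same format as the question.
replacement_dict = {'SMEG refrigerator': 'Refrigerator',
                    'Bosch refrigerator': 'Refrigerator'}
rows = {0: '[SMEG refrigerator, Wifi, Bosch refrigerator]', 1: '[]'}

cleaned = {idx: clean_amenities(raw, replacement_dict) for idx, raw in rows.items()}
print(cleaned)  # {0: ['Refrigerator', 'Wifi'], 1: []}
```

With pandas, the same function could be applied in the style used above, for example df['amenities'].apply(clean_amenities, replacement_dict=replacement_dict), which would store lists in the column instead of strings.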
Comments
1 upvote
Ibrahim
11/6/2023
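Thanks! It worked. I was confused about whether I had lists, because every time I looked at the data I only saw lists. The quotes ('') showed up only when I exported to a dictionary, and I assumed they came from the dictionary.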