Asked by: Jay Cheng · Asked: 2/6/2021 · Updated: 2/6/2021 · Views: 49
Pandas: Create one function to read JSON, then another function to create the DataFrame
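Q:
I want to create one function that fetches data from an API, and then another function that creates and cleans the corresponding DataFrame for use.
The first function is shown below and works fine: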
def get_data():
    print('start download the 1st set')
    confirm_details = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Fenhanced_sur_covid_19_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content
    print('complete download the 1st set')
    print('start download the 2nd set')
    latest_situ = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Flatest_situation_of_reported_cases_covid_19_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content
    print('complete download the 2nd set')
    print('start download the final set')
    residential = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Fbuilding_list_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content
    print('complete download the final set')

get_data()
The second function is shown below, but it gives me the error "NameError: name 'confirm_details' is not defined":
def clean_confirm_df():
    confirm_df = pd.read_json(io.StringIO(confirm_details.decode('utf-8')))
    confirm_df.columns = confirm_df.columns.str.replace(" ", "_" )
    confirm_df.columns = confirm_df.columns.str.replace('/', "_")
    confirm_df.columns = confirm_df.columns.str.replace("*", "")
    confirm_df.columns = confirm_df.columns.str.strip()
    confirm_df['Report_date'] = pd.to_datetime(confirm_df['Report_date'], dayfirst=True)
    confirm_df.rename(columns = {'Confirmed_probable': 'Confirmed'}, inplace = True)
    confirm_df = confirm_df.drop(['Name_of_hospital_admitted', 'Date_of_onset'], axis = 1)
    confirm_df['HK_Non-HK_resident'] = confirm_df['HK_Non-HK_resident'].str.upper()
    confirm_df.head()

clean_confirm_df()
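I looked at the first function and can see that 'confirm_details' is defined there. I have also checked that the code that creates the corresponding DataFrames (confirm_df, latest_situ_df and residential_df) works fine when run on its own.
I am teaching myself Python and pandas, so I would appreciate any advice on how to change my code to make it work.
Thanks.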
A:
0 votes
Rob Raymond
2/6/2021
#1
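As per the comments: structure your code so that you are aware of the scope of your variables. confirm_details is assigned inside get_data(), so it is local to that function and is gone once the function returns. You are assuming everything is global, which would be a very bad thing... Return the downloaded data from get_data() and pass it to clean_confirm_df() as an argument.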
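The NameError in the question comes from variable scope: a name assigned inside a function is local to that function. Below is a minimal sketch of that rule with no network calls. The names download_local and download_returned are hypothetical and the payload bytes are made up for illustration.

```python
def download_local():
    # 'payload' is a local name: it exists only while this function runs.
    payload = b'[{"a": 1}]'


def download_returned():
    # Returning the value hands it to the caller instead of discarding it.
    payload = b'[{"a": 1}]'
    return payload


download_local()
try:
    payload  # the local name from download_local() is not visible here
    lookup_failed = False
except NameError as exc:
    lookup_failed = True
    print(f"NameError: {exc}")

# The caller keeps whatever the function returns under a name of its own.
data = download_returned()
print(data)
```

The same return-and-pass pattern applied to the code from the question: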
import io

import pandas as pd
import requests


def get_data():
    # Collect each download in a dict and return it, so the caller receives the data explicitly
    ret = {}
    print('start download the 1st set')
    ret["confirm_details"] = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Fenhanced_sur_covid_19_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content
    print('complete download the 1st set')
    print('start download the 2nd set')
    ret["latest_situ"] = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Flatest_situation_of_reported_cases_covid_19_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content
    print('complete download the 2nd set')
    print('start download the final set')
    ret["residential"] = requests.get('https://api.data.gov.hk/v2/filter?q=%7B%22resource%22%3A%22http%3A%2F%2Fwww.chp.gov.hk%2Ffiles%2Fmisc%2Fbuilding_list_eng.csv%22%2C%22section%22%3A1%2C%22format%22%3A%22json%22%7D').content
    print('complete download the final set')
    return ret


def clean_confirm_df(data):
    # 'data' is the dict returned by get_data()
    confirm_df = pd.read_json(io.StringIO(data["confirm_details"].decode('utf-8')))
    confirm_df.columns = confirm_df.columns.str.replace(" ", "_")
    confirm_df.columns = confirm_df.columns.str.replace('/', "_")
    # regex=False makes it explicit that "*" is a literal character, not a regex
    confirm_df.columns = confirm_df.columns.str.replace("*", "", regex=False)
    confirm_df.columns = confirm_df.columns.str.strip()
    confirm_df['Report_date'] = pd.to_datetime(confirm_df['Report_date'], dayfirst=True)
    confirm_df.rename(columns={'Confirmed_probable': 'Confirmed'}, inplace=True)
    confirm_df = confirm_df.drop(['Name_of_hospital_admitted', 'Date_of_onset'], axis=1)
    confirm_df['HK_Non-HK_resident'] = confirm_df['HK_Non-HK_resident'].str.upper()
    # Return the cleaned DataFrame instead of only displaying it
    return confirm_df


mydata = get_data()
df = clean_confirm_df(mydata)
# to_markdown() needs the optional 'tabulate' package
print(df.head().to_markdown())

Output:
start download the 1st set
complete download the 1st set
start download the 2nd set
complete download the 2nd set
start download the final set
complete download the final set
| | Case_no. | Report_date | Gender | Age | Hospitalised_Discharged_Deceased | HK_Non-HK_resident | Case_classification | Confirmed |
|---:|-----------:|:--------------------|:---------|------:|:-----------------------------------|:---------------------|:----------------------|:------------|
| 0 | 1 | 2020-01-23 00:00:00 | M | 39 | Discharged | NON-HK RESIDENT | Imported case | Confirmed |
| 1 | 2 | 2020-01-23 00:00:00 | M | 56 | Discharged | HK RESIDENT | Imported case | Confirmed |
| 2 | 3 | 2020-01-24 00:00:00 | F | 62 | Discharged | NON-HK RESIDENT | Imported case | Confirmed |
| 3 | 4 | 2020-01-24 00:00:00 | F | 62 | Discharged | NON-HK RESIDENT | Imported case | Confirmed |
| 4 | 5 | 2020-01-24 00:00:00 | M | 63 | Discharged | NON-HK RESIDENT | Imported case | Confirmed |
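The question also mentions latest_situ_df and residential_df. One possible way to avoid repeating the decoding and column-renaming steps for each of them is a small shared helper, sketched here under assumptions: json_bytes_to_df is a hypothetical name, and the sample payloads are made-up stand-ins for the dict returned by get_data(), not real API output.

```python
import io

import pandas as pd


def json_bytes_to_df(raw: bytes) -> pd.DataFrame:
    """Decode a JSON payload (bytes) into a DataFrame with tidy column names."""
    df = pd.read_json(io.StringIO(raw.decode("utf-8")))
    # Same renaming steps as clean_confirm_df(), treating each pattern literally.
    df.columns = (
        df.columns.str.replace(" ", "_", regex=False)
        .str.replace("/", "_", regex=False)
        .str.replace("*", "", regex=False)
        .str.strip()
    )
    return df


# Made-up sample with the same shape as the dict returned by get_data().
sample = {
    "confirm_details": b'[{"Case no.": 1, "Report date": "23/01/2020", "HK/Non-HK resident": "HK resident"}]',
    "latest_situ": b'[{"As of date": "08/01/2020", "Number of confirmed cases": 0}]',
}

# Build one DataFrame per payload, keyed by the same names as the input dict.
frames = {name: json_bytes_to_df(raw) for name, raw in sample.items()}
print(list(frames["confirm_details"].columns))
```

Dataset-specific steps such as parsing Report_date or dropping columns would still belong in a per-dataset function like clean_confirm_df(), which could call this helper first.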