Asked by: Prakhar Rathi  Asked: 3/6/2022  Updated: 3/6/2022  Views: 124
New DataFrame is being made when updating dataframe inside a loop
Q:
I am trying to make some changes to three dataframes in a loop, like this:
    for sheet in [f1, f2, f3]:
        sheet = preprocess_df(sheet)
The preprocess_df function looks like this:
    def preprocess_df(df):
        """ Making a function to preprocess a dataframe individually rather than all three together """
        # make column names uniform
        columns = [
            "Reporting_Type",
            "AA_name",
            "Date_DD/MM/YYYY",
            "Time_HHMMSS",
            "Type",
            "Name",
            "FI_Type",
            "Count_linked",
            "Average_timelag_FI_Notification",
            "FI_Ready_to_FI_request_ratio",
            "Count_Consent_Raised",
            "Actioned_to_raised_ratio",
            "Approved_to_raised_ratio",
            "FI_Ready_to_FI_request_ratio(Daily)",
            "Daily_Consent_Requests_Data_Delivered",
            "Total_Consent_Requests_Data_Delivered",
            "Consent_Requests_Data_Delivered_To_Raised_Ratio",
            "Daily_Consent_Requests_Raised",
            "Daily Consent_Requests_Data_Delivered_To_Raised_Ratio",
        ]
        # Set the sheet size
        df = df.iloc[:, :19]
        # Set the column names
        df.columns = columns
        return df
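I am basically updating the column names and fixing the dataframe size. The issue I am facing is that the sheet variable does get updated if I print the dataframe inside the loop; however, the original f1, f2 and f3 dataframes don't get updated. I think this is because the sheet variable creates a copy of f1 etc. rather than actually using the same dataframe. This seems like an extension of the pass-by-reference or pass-by-value concept. Is there a way to make in-place changes to all the sheets inside the loop?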
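The symptom described above can be reproduced with a minimal sketch. The two small frames and the helper keep_first_two are hypothetical stand-ins, not the real sheets; the sketch only illustrates that assigning to the loop variable rebinds the name sheet and leaves f1 and f2 referring to the original objects.

```python
import pandas as pd

# Hypothetical stand-ins for the real sheets (assumption, not the asker's data)
f1 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
f2 = pd.DataFrame({"A": [7, 8], "B": [9, 10], "C": [11, 12]})

def keep_first_two(df):
    # iloc slicing returns a new DataFrame object; the caller's object is untouched
    return df.iloc[:, :2]

for sheet in [f1, f2]:
    original = sheet
    sheet = keep_first_two(sheet)     # rebinds the local name `sheet` only
    rebound = sheet is not original   # True: `sheet` now refers to a different object

# f1 and f2 still have all three columns
shapes = (f1.shape, f2.shape)
```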
A:
0 votes
Stubborn
3/6/2022
#1
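Indeed, a new DataFrame is created when you do df = df.iloc[:, :19], so the object passed in by the caller is left unchanged.
However, you can work around this by using drop, which supports inplace=True: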
    import pandas as pd
    import numpy as np

    def preprocess_df(df):
        columns = [
            "a",
            "b",
        ]  # Swap this list with yours
        # Drop every column after the first 2, modifying the caller's object
        df.drop(df.columns[2:], inplace=True, axis=1)  # Replace 2 with 19 in your code
        df.columns = columns

    f1 = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])  # Just an example
    preprocess_df(f1)  # You can put this in your for loop
    print(f1)
The code above will output something like this:
       a  b
    0  0  1
    1  4  5
    2  8  9
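If in-place mutation is not a hard requirement, another option is to keep a function that returns a new frame and rebind the original names from the results. The following is only a sketch under stated assumptions: preprocess_df here is a simplified stand-in with two hypothetical column names, and f1, f2, f3 are small example frames rather than the real sheets.

```python
import numpy as np
import pandas as pd

def preprocess_df(df):
    # Simplified stand-in (assumption): keep the first 2 columns and rename them
    out = df.iloc[:, :2].copy()
    out.columns = ["a", "b"]
    return out

# Example frames only, not the real data
f1 = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("ABCD"))
f2 = pd.DataFrame(np.arange(12, 24).reshape(3, 4), columns=list("ABCD"))
f3 = pd.DataFrame(np.arange(24, 36).reshape(3, 4), columns=list("ABCD"))

# Rebind all three names at once from the processed results
f1, f2, f3 = [preprocess_df(f) for f in (f1, f2, f3)]
```

This avoids relying on inplace=True, at the cost of having to list the names on both sides of the assignment.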