Dataframe Append vs pandas Concat [duplicate]

Asked by: ACesar · Asked: 11/11/2023 · Updated: 11/11/2023 · Views: 53

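Q:

I am trying to append to a DataFrame. It works, but I have to use an older version of Python to do so. I haven't used pd.concat much in the past, and I am not sure how to structure my code so that it outputs the same DataFrame as append does.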

def new_df(df):
    Employee = []
    PayItem = ['E01', 'E40', 'E41', 'E42']
    Hours = []
    Dollars = []
    new = pd.DataFrame(columns = ['Key', 'PayItem', 'Hours', 'Dollars'])

    Employee = df.loc[0:27,'Employee Number']
    Key = Employee.values.tolist()
    Hours = df.loc[0:27,'Working Hours']

    for i in range(28):
        ok = []
        ok.append(np.nan)
        ok.append(df.loc[i, 'Product Commissions'])
        ok.append(df.loc[i, 'Membership Commissions'])
        ok.append(df.loc[i, 'Tips'])
        Dollars.append(ok)

    for i in range(len(Dollars)):
        for j in range(0,1):
            new = new.append({'Key' : Employee[i], 'PayItem' : PayItem[0], 'Hours': Hours[i],  'Dollars': Dollars[i][j]}, ignore_index=True)
            new = new.append({'Key' : Employee[i], 'PayItem' : PayItem[1], 'Hours': ' ', 'Dollars': Dollars[i][j+1]}, ignore_index=True)
            new = new.append({'Key' : Employee[i], 'PayItem' : PayItem[2], 'Hours': ' ', 'Dollars': Dollars[i][j+2]}, ignore_index=True)
            new = new.append({'Key' : Employee[i], 'PayItem' : PayItem[3], 'Hours': ' ', 'Dollars': Dollars[i][j+3]}, ignore_index=True)
        
    return new

I tried using pandas concat, but my formatting is wrong and I don't get the same result.

Tags: python, pandas, concatenation, append

Comments

0 votes · Bushmaster · 11/11/2023
Then use pd.concat in place of each append call, and so on:
new = pd.concat([new, pd.DataFrame([{'Key': Employee[i], 'PayItem': PayItem[0], 'Hours': Hours[i], 'Dollars': Dollars[i][j]}])])
new = pd.concat([new, pd.DataFrame([{'Key': Employee[i], 'PayItem': PayItem[1], 'Hours': ' ', 'Dollars': Dollars[i][j+1]}])])
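The pattern in the comment above can be sketched as a small helper. `append_row` is a hypothetical name used only for illustration, not a pandas function. It wraps the dict in a one-row DataFrame and concatenates it with `ignore_index=True`, which is roughly what `DataFrame.append(some_dict, ignore_index=True)` did before it was removed in pandas 2.0. The sample values are made up.

```python
import pandas as pd

def append_row(frame, row):
    """Hypothetical helper: add one dict as a new row via pd.concat."""
    # Wrapping the dict in a list makes pandas build a one-row DataFrame.
    # ignore_index=True renumbers the result 0..n-1, as append did.
    return pd.concat([frame, pd.DataFrame([row])], ignore_index=True)

# Made-up starting frame with a single E01 row.
new = pd.DataFrame({'Key': [101], 'PayItem': ['E01'],
                    'Hours': [8.0], 'Dollars': [float('nan')]})
new = append_row(new, {'Key': 101, 'PayItem': 'E40', 'Hours': ' ', 'Dollars': 12.5})
```

Each call still copies the whole frame, so this is a drop-in fix rather than an efficient one.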

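A:

0 votes · mcjeb · 11/11/2023 · #1

pd.concat() concatenates a list of pandas DataFrames, so you first need to build a DataFrame from the new entries. It is probably easier to build ok by hand. Something like this: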

import numpy as np
import pandas as pd

def new_df(df):
    PayItem = ['E01', 'E40', 'E41', 'E42']
    new = pd.DataFrame(columns=['Key', 'PayItem', 'Hours', 'Dollars'])

    Employee = df.loc[0:27, 'Employee Number']
    Hours = df.loc[0:27, 'Working Hours']

    for i in range(28):
        # One dollar amount per pay item; E01 has no amount, so it gets NaN.
        ok = [np.nan, df.loc[i, 'Product Commissions'], df.loc[i, 'Membership Commissions'], df.loc[i, 'Tips']]

        # The four rows for this employee, built as one small DataFrame.
        data = {'Key': [Employee[i]] * len(PayItem),
                'PayItem': PayItem,
                # Hours only on the E01 row, blank elsewhere, as in the append version.
                'Hours': [Hours[i], ' ', ' ', ' '],
                'Dollars': ok}

        new = pd.concat([new, pd.DataFrame(data)], ignore_index=True)

    return new

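Comments

1 vote · user19077881 · 11/11/2023
It would be more efficient to form all the data first (e.g. as a list of dictionaries) and then form the new DF just once, or with a single concat.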
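A minimal sketch of that suggestion, assuming the column names from the question and a default 0..n-1 index, so that the first 28 rows match `df.loc[0:27]`. The parameter `n_rows`, the names `pay_items` and `dollar_cols`, and the sample data are assumptions added for illustration. Every output row is collected as a dict and the DataFrame is built once at the end, so no frame is copied inside the loop.

```python
import numpy as np
import pandas as pd

def new_df(df, n_rows=28):
    """Sketch: collect every output row as a dict, then build the DataFrame once."""
    pay_items = ['E01', 'E40', 'E41', 'E42']
    dollar_cols = ['Product Commissions', 'Membership Commissions', 'Tips']
    rows = []
    # to_dict('records') yields one {column: value} dict per input row.
    for rec in df.head(n_rows).to_dict('records'):
        # E01 carries the hours and no dollar amount.
        rows.append({'Key': rec['Employee Number'], 'PayItem': pay_items[0],
                     'Hours': rec['Working Hours'], 'Dollars': np.nan})
        # E40, E41, E42 carry one dollar amount each and a blank Hours cell.
        for item, col in zip(pay_items[1:], dollar_cols):
            rows.append({'Key': rec['Employee Number'], 'PayItem': item,
                         'Hours': ' ', 'Dollars': rec[col]})
    return pd.DataFrame(rows, columns=['Key', 'PayItem', 'Hours', 'Dollars'])

# Made-up sample data, only to show the call.
sample = pd.DataFrame({
    'Employee Number': [101, 102],
    'Working Hours': [40.0, 35.5],
    'Product Commissions': [10.0, 0.0],
    'Membership Commissions': [5.0, 2.5],
    'Tips': [20.0, 15.0],
})
result = new_df(sample)
```

Each input row produces four output rows in the same order as the original append loop.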