How to keep column titles when inserting a sum row in Pandas dataframe

Question

I have a dataframe:

       Name    y1    y2   y3                  
 1     Ben     01    02   03
 2     Jane    04    05   06
 3     Sarah   07    07   06

I am trying to add in a row in my dataframe which provides a total of the rows in each column. My code is:

import pandas as pd

df = pd.DataFrame(np.insert(df.values, 0, values=[df.sum(axis=0)], axis=0))
df.set_value(0, 0,'total')
df.head()

This is successful, but also removes my column names like this:

       0       1     2    3                     
 0     Total   12    14   15
 1     Ben     01    02   03
 2     Jane    04    05   06
 3     Sarah   07    07   06

rather than returning this as desired:

       Name    y1    y2   y3                      
 0     Total   12    14   15
 1     Ben     01    02   03
 2     Jane    04    05   06
 3     Sarah   07    07   06

I have tried inserting

Index(['Name'], name=df.index.name)

to

df = pd.DataFrame(np.insert(df.values, 0, values=[df.sum(axis=0)], Index(['Name'], name=df.index.name) axis=0))

but this just returns the error

TypeError: unhashable type: 'Index'

Where am I going wrong?

stackoverflow.com/questions/24284342/…
– BENY
Commented Mar 21, 2018 at 13:27 — BENY, Commented Mar 21, 2018 at 13:27
Dream of each OP - a lot of perfect answers :)
– jezrael
Commented Mar 21, 2018 at 14:00 — jezrael, Commented Mar 21, 2018 at 14:00

Scott Boston · Accepted Answer · 2018-03-21 13:31:44Z

2

IIUC, you can do it this way, using select_types, assign, and pd.concat:

pd.concat([df.select_dtypes(include=np.number)
             .sum()
             .to_frame()
             .T
             .assign(Name='Total'),df])

Output:

    Name  y1  y2  y3
0  Total  12  14  15
1    Ben   1   2   3
2   Jane   4   5   6
3  Sarah   7   7   6

answered Mar 21, 2018 at 13:31

Scott Boston

152k15 gold badges153 silver badges198 bronze badges

Add a comment |

Bal Krishna Jha · Accepted Answer · 2018-03-21 14:04:30Z

2

One way to avoid this is to add a new row via .loc, then move it to the top:

df.loc[len(df)+1] = ['Total'] + df.iloc[:, 1:].sum(axis=0).tolist()

df = df.loc[[df.index[-1]] + df.index[:-1].tolist(), :]

#     Name  y1  y2  y3
# 4  Total  12  14  15
# 1    Ben   1   2   3
# 2   Jane   4   5   6
# 3  Sarah   7   7   6

You can use df.reset_index afterwards if this is important to you.

edited Mar 21, 2018 at 14:04

Bal Krishna Jha

6,8162 gold badges43 silver badges47 bronze badges

answered Mar 21, 2018 at 13:14

jpp

163k35 gold badges292 silver badges349 bronze badges

Add a comment |

jezrael · Accepted Answer · 2018-03-21 13:57:13Z

1

Solution with np.insert should be very fast, but is necessary create index with non numeric columns first:

#create index from `Name` column
df = df.set_index('Name')

#add first value to index
idx = np.insert(df.index, 0, 'Total')
#add columns and index parameters to DataFrame contructor and last reset index
df = pd.DataFrame(np.insert(df.values, 0, df.sum(), axis=0), 
                  columns=df.columns, 
                  index=idx).reset_index()
print (df)
    Name  y1  y2  y3
0  Total  12  14  15
1    Ben   1   2   3
2   Jane   4   5   6
3  Sarah   7   7   6

answered Mar 21, 2018 at 13:57

jezrael

854k100 gold badges1.4k silver badges1.3k bronze badges

This worked perfectly on a much bigger data set too. I also substituted in .mean for .sum which may be useful to others looking to do similar. idx = np.insert(df.index, 0, 'Mean') df = pd.DataFrame(np.insert(df.values, 0, df.mean(), axis=0),
– Alex
Commented Mar 21, 2018 at 16:14

Add a comment |

Graipher · Accepted Answer · 2018-03-21 13:17:24Z

1

You can use pandas.concat to stack two dataframes:

import pandas as pd
df = ...

df_total = pd.DataFrame(df.iloc[:, 1:].sum(), columns=["Total"]).T.reset_index()
df_total.columns = df.columns
df = pd.concat([df_total, df])
#     Name  y1  y2  y3
# 0  Total  12  14  15
# 1    Ben   1   2   3
# 2   Jane   4   5   6
# 3  Sarah   7   7   6

edited Mar 21, 2018 at 13:17

answered Mar 21, 2018 at 13:11

Graipher

7,09128 silver badges48 bronze badges

Add a comment |

BENY · Accepted Answer · 2018-03-21 13:47:07Z

1

You can try

s=df.sum()    
s.loc['Name']='Total'
df.loc[0]=s    
df.sort_index()
Out[457]: 
    Name  y1  y2  y3
0  Total  12  14  15
1    Ben   1   2   3
2   Jane   4   5   6
3  Sarah   7   7   6

edited Mar 21, 2018 at 13:47

answered Mar 21, 2018 at 13:40

BENY

322k22 gold badges173 silver badges247 bronze badges

Add a comment |

Collectives™ on Stack Overflow

How to keep column titles when inserting a sum row in Pandas dataframe

5 Answers 5

Not the answer you're looking for? Browse other questions tagged
python
pandas
dataframe
indexing
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

Not the answer you're looking for? Browse other questions tagged pythonpandasdataframeindexing or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
python
pandas
dataframe
indexing
or ask your own question.