
When I sum an empty Series, I expect the result to be zero, as stated in the pandas documentation. Indeed, if I follow the example there, create an empty Series, and sum it, that is what I get:

(Pdb) empty = pd.Series([])
(Pdb) type(empty)
<class 'pandas.core.series.Series'>
(Pdb) empty.sum()
0.0

However, I have a DataFrame that, depending on the underlying data, can sometimes be empty. In that case, I get something like this:

(Pdb) prior
Empty DataFrame
Columns: [ILC, FCTC, AWD, PD]
Index: []
(Pdb) prior['PD'].sum()
False

This is unexpected, because

(Pdb) type(prior['PD'])
<class 'pandas.core.series.Series'>

This seems identical to the situation in the first code block. Can someone help me understand what I'm missing? Why does the sum in the second code block return False, whereas the first one returns a numerical value?

Edit

I've been asked to post the code that creates prior. Several steps go into creating it, but I can reproduce the issue by creating an empty DataFrame and summing over one of the columns. See below:

In [1]: import pandas as pd

In [2]: %paste
dfObj = pd.DataFrame(columns=['User_ID', 'UserName', 'Action'])

print("Empty Dataframe ", dfObj, sep='\n')

## -- End pasted text --
Empty Dataframe
Empty DataFrame
Columns: [User_ID, UserName, Action]
Index: []

In [3]: dfObj['User_ID'].sum()
Out[3]: False
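One thing worth checking here is the dtype of each object, since the thread does not confirm the cause. The sketch below assumes the difference comes from dtype: a DataFrame built from column names alone has no data to infer a numeric dtype from, whereas the Series in the first block evidently summed as a float. The name numeric_empty is hypothetical and only serves as a comparison.

```python
import pandas as pd

# Same construction as above: column names only, no rows, no dtype given.
dfObj = pd.DataFrame(columns=['User_ID', 'UserName', 'Action'])

# Hypothetical comparison object: an empty Series with an explicit numeric dtype.
numeric_empty = pd.Series([], dtype='float64')

# Print the dtypes side by side; a non-numeric dtype on the DataFrame column
# would be consistent with the non-numeric result of sum().
print(dfObj['User_ID'].dtype)
print(numeric_empty.dtype)

# With an explicit float dtype, the empty sum is the float zero.
print(numeric_empty.sum())
```

If the two dtypes differ, that would at least narrow the question down to how sum() treats an empty column of that dtype in the installed pandas version.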
  • Could you share your code for creating prior? It could also be related to 0 being a falsy value.
    – Dallan
    Commented May 26, 2020 at 21:37
  • @Dallan It's a SQL query that uses the IBM_DB module and pandas.read_sql() to pull data from DB2. AWD and PD are summed values, and ILC and FCTC are the group-by columns. There is a WHERE clause, but that's all there is to the query. If you still feel that posting the code would help, I can do so. Commented May 26, 2020 at 21:41
  • If it helps, I can recreate the same issue by creating an empty pandas dataframe and summing over one of the columns. Commented May 26, 2020 at 21:48
  • I also tried creating an empty dataframe and summing over one of the columns, but this resulted in the integer 0. Could you try explicit casting: int(prior['PD'].sum())?
    – Dallan
    Commented May 26, 2020 at 21:54
  • Explicitly casting it does fix the issue. I am still curious what the underlying cause of it returning False is, though. Thank you for your help. Commented May 26, 2020 at 21:56
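The casting workaround from the comments can be sketched in two ways: casting the result of sum(), as suggested above, or casting the column to a numeric dtype before summing. This is only a sketch; the empty prior below is a stand-in built from the column names shown in the question rather than the real query result, and the variable names total_cast_after and total_cast_before are hypothetical.

```python
import pandas as pd

# Stand-in for the real query result: same columns, no rows (an assumption).
prior = pd.DataFrame(columns=['ILC', 'FCTC', 'AWD', 'PD'])

# Option 1 (from the comments): cast whatever sum() returns to int.
# int(False) and int(0) both give 0, so this hides the odd return value.
total_cast_after = int(prior['PD'].sum())

# Option 2: give the column a numeric dtype first, so sum() itself
# returns a number for the empty case.
total_cast_before = prior['PD'].astype('float64').sum()

print(total_cast_after, total_cast_before)
```

Casting the column first keeps the fix at the point where the dtype is wrong, which may be preferable if PD is used in other numeric operations later on.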
