I need help with a pandas problem.

I am extracting data via APIs, and the results contain gaps in their ranks.

I need to account for these gaps in the dataset by giving the missing ranks an average value.

To do that, I need to insert a row into my dataframe for each missing rank.

Here's what my data looks like:

   rank timestamp value
0    1     21:50  3450
1    4     21:40  3442
2    5     21:41  5964
3    6     14:27  5258
4    7     13:10  3001
5    8     14:02  2782

Ranks 2 and 3 are missing.

So here's what I'm trying to get:

   rank timestamp value
0    1     21:50  3450
1    2      NaN   avg
2    3      NaN   avg
3    4     21:40  3442
4    5     21:41  5964
5    6     14:27  5258
6    7     13:10  3001
7    8     14:02  2782

I know approximately how to deal with columns, but I have no idea how to deal with rows.

Do you have any ideas?

I have already tried to use "append", but then I struggle to reindex my dataframe :/

1 Answer

You can use reindex to add missing ranks and fillna to fill missing values.

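import numpy as np
# Reindex on every rank from min to max (missing ranks become NaN rows), then fill 'value' with the column mean.
df = df.set_index('rank').reindex(np.arange(df['rank'].min(), df['rank'].max()+1)).reset_index()
df['value'] = df['value'].fillna(df['value'].mean()).round()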


    rank    timestamp   value
0   1       21:50       3450
1   2       NaN         3983
2   3       NaN         3983
3   4       21:40       3442
4   5       21:41       5964
5   6       14:27       5258
6   7       13:10       3001
7   8       14:02       2782
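
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same reindex + fillna approach. The dataframe is rebuilt by hand from the sample in the question; the names `full_ranks` and `filled` are only illustrative, and the final cast to int is an optional extra (reindexing introduces NaN, which turns the column into floats).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (ranks 2 and 3 are missing).
df = pd.DataFrame({
    "rank": [1, 4, 5, 6, 7, 8],
    "timestamp": ["21:50", "21:40", "21:41", "14:27", "13:10", "14:02"],
    "value": [3450, 3442, 5964, 5258, 3001, 2782],
})

# Every rank from the smallest to the largest, inclusive.
full_ranks = np.arange(df["rank"].min(), df["rank"].max() + 1)

# Reindexing on 'rank' creates all-NaN rows for the ranks that were absent.
filled = df.set_index("rank").reindex(full_ranks).reset_index()

# Fill the missing values with the mean of the observed values
# (mean() skips NaN by default), round, then cast back to int.
filled["value"] = (
    filled["value"].fillna(filled["value"].mean()).round().astype(int)
)

print(filled)
```

Note that this fills every gap with the mean of the whole column. If "average" should instead mean something like the average of the neighbouring rows, the fill step would need to change, but the reindex step stays the same.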
  • Oh yes ! That's perfect ! Thank you very much :)
    –  Diev
    Commented May 28, 2019 at 6:49
  • @Diev, thank you. If the question was answered completely, don't forget to mark it as accepted by ticking the check box next to the question. Happy coding!
    – Vaishali
    Commented May 28, 2019 at 14:14