My DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame(
{
'a': ['a', 'a', 'a', 'b', 'c', 'x', 'j', 'w'],
'b': [1, 1, 1, 2, 2, 3, 3, 3],
}
)
The expected output changes only column a:
     a  b
0    a  1
1    a  1
2    a  1
3  NaN  2
4  NaN  2
5  NaN  3
6  NaN  3
7  NaN  3
Logic:
The groups are based on b. If df.a.nunique() > 1 for a group, then a is set to np.nan for every row in that group.
This is my attempt. It works but I wonder if there is a one-liner/more efficient way to do it:
df['x'] = df.groupby('b')['a'].transform('nunique')
df.loc[df.x > 1, 'a'] = np.nan
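For comparison, here is a sketch of a single-expression variant that avoids the helper column x. It assumes Series.mask combined with the same grouped transform; the name `out` is only illustrative, and whether it is actually faster has not been measured.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': ['a', 'a', 'a', 'b', 'c', 'x', 'j', 'w'],
        'b': [1, 1, 1, 2, 2, 3, 3, 3],
    }
)

# Count distinct values of a within each b group, broadcast back to every row,
# then blank out (set to NaN) the rows whose group has more than one distinct value.
# `out` is a hypothetical name; assigning to a copy leaves df untouched.
out = df.assign(a=df['a'].mask(df.groupby('b')['a'].transform('nunique') > 1))
print(out)
```

Note that nunique ignores NaN by default, so a group containing one distinct value plus missing values would be left unchanged.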