First, it's a bad idea to store your numerics as strings in your dataframe. Use plain ints instead.
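As a rough illustration of why this matters, here is a minimal sketch. The frame below is hypothetical (it assumes your values arrived as strings like '1' and '2'), and astype(int) is only one way to convert:

```python
import pandas as pd

# Hypothetical frame whose numbers were entered as strings (an assumption).
df = pd.DataFrame({'A': ['1', '2', '1'], 'B': ['2', '4', '4']})

# Comparing a string column against an int matches nothing.
print((df['A'] == 1).sum())

# Convert once, up front. astype(int) raises if a value is not a valid integer.
df = df.astype(int)

# The same comparison now finds the two rows where A is 1.
print((df['A'] == 1).sum())
```

Converting once when the frame is built means every later comparison can use ordinary integers.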
Your code currently forms a predicate, slices the frame with it and then finds the size of the resulting frame. This is more work than necessary: the predicate itself is a series of booleans, and calling .sum() on it produces the number of matching values.
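To make the difference concrete, here is a small sketch. It assumes your original code resembled the slice-then-measure version; the variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1], 'B': [2, 4, 4]})

# The predicate is a boolean Series, one entry per row.
predicate = (df['A'] == 1) & (df['B'] == 2)

# Slice-then-measure: builds an intermediate frame just to count its rows.
count_via_slice = len(df[predicate])

# Summing the predicate: each True counts as 1, so no intermediate frame is built.
count_via_sum = int(predicate.sum())

print(count_via_slice, count_via_sum)
```

Both approaches give the same count; the second just skips constructing the sliced frame.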
Your current code is also not general-purpose. A general-purpose implementation could look like
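from typing import Any

import pandas as pd


def match_count(df: pd.DataFrame, **criteria: Any) -> int:
    """Count the rows where every named column equals the given value."""
    pairs = iter(criteria.items())
    # Seed the predicate with the first criterion (at least one is required)
    column, value = next(pairs)
    predicate = df[column] == value
    # AND in each remaining criterion
    for column, value in pairs:
        predicate &= df[column] == value
    # Each True counts as 1, so the sum is the number of matching rows
    return int(predicate.sum())


def test() -> None:
    df = pd.DataFrame(
        [[1, 2],
         [2, 4],
         [1, 4]],
        columns=['A', 'B'],
    )
    print(match_count(df, A=1, B=2))


if __name__ == '__main__':
    test()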