
Say, for example, I built a classification model for a mailing campaign that will be applied to 1M records. The positive class would be customers and the negative class would be non-customers.

For the mailing, we select the top 100,000 scoring records (highest predicted probabilities). We expect a low response rate, but we want to make sure we send mail pieces to all of the highest-scoring prospects.

We bin the model scores into deciles and select only from decile 1 (the highest-scoring bin), which we'll say corresponds to a predicted probability >= 0.10. The model has 95% recall and 5% precision at this threshold.
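For concreteness, here is roughly how those threshold metrics are computed; a minimal sketch with placeholder data standing in for the real holdout labels and scores:

    import numpy as np
    from sklearn.metrics import precision_score, recall_score

    # Placeholder holdout data -- stands in for the real labels and scores.
    rng = np.random.default_rng(0)
    y_prob = rng.uniform(size=100_000)            # model scores on a holdout set
    y_true = rng.binomial(1, 0.02, size=100_000)  # true labels, ~2% base rate (assumed)

    # Decile-1 cutoff from above: flag everything scoring >= 0.10.
    y_pred = (y_prob >= 0.10).astype(int)

    print("recall:   ", recall_score(y_true, y_pred))     # share of all customers that get mailed
    print("precision:", precision_score(y_true, y_pred))  # share of mailed records that are customers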

Does it make sense in this scenario to aim for the highest recall possible? Or is precision important here, so that I should optimize for that instead, or try to optimize both together?

The goal is to get the most customer conversions within that 100k. We already know there will be a good number of false positives, but we want as many true positives as possible.


1 Answer


Neither. What you really need is for the probability estimates to be accurate. It's not enough to optimize just one of them, and you can't optimize two metrics simultaneously: better precision tends to come at the cost of worse recall, and vice versa.

I suggest you minimize the cross-entropy loss function. It is designed to produce accurate estimates of the probability. In some sense, that optimizes both precision and recall and trades them off in a particular way; the specifics depend on which classifier you use.
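For example, with any classifier that outputs probabilities -- here a gradient boosting model via LightGBM, as one common choice -- you can minimize the binary log loss directly and then select the top of the ranking. A minimal sketch with made-up data, assuming a fixed mailing budget of k records:

    import numpy as np
    import lightgbm as lgb

    # Made-up training data for illustration only.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(20_000, 10))
    y = (X[:, 0] + rng.normal(size=20_000) > 2.0).astype(int)  # rare positive class

    # objective="binary" minimizes the binary log loss (cross-entropy).
    model = lgb.LGBMClassifier(objective="binary", n_estimators=200)
    model.fit(X, y)

    # Score the prospect file (reusing X here only for brevity) and
    # mail the top k by predicted probability.
    scores = model.predict_proba(X)[:, 1]
    k = 2_000                              # stand-in for the top 100k of 1M
    top_k = np.argsort(scores)[::-1][:k]

    # If the probabilities are well calibrated, the expected number of
    # conversions from the mailing is the sum of the selected probabilities.
    print("expected conversions in top k:", scores[top_k].sum())

With a fixed mailing size, which records get mailed depends only on the ranking of the scores; accurate (well-calibrated) probabilities additionally let you forecast how many conversions to expect.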

  • Sorry, I forgot to mention it above, but I am using a gradient boosting machine (LightGBM) and optimizing it on log loss (instead of AUC), which I think is doing what you suggested for the most part. I guess my question is more about whether a recall of 95% and precision of 5% is what I would want to see for this use case.
    – bbennett36
    Commented Mar 27, 2018 at 20:33
  • @bbennett36, in this context, log loss and cross-entropy loss are the same thing. And my answer remains the same: neither recall nor precision is a good measure in this context. Recall is about a binary classifier (one that outputs just "yes"/"no"), but you are really interested in something that outputs a probability. Optimizing the log loss will already trade off precision against recall in the optimal way, so my advice is to just optimize the log loss and not try to second-guess it.
    – D.W.
    Commented Mar 27, 2018 at 21:18
