I would like some insight into what I have been working on here

Question

I work in roofing sales which involves door-knocking at the entry level. Our daily numbers are printed in our GroupMe chat for our branch. I took it upon myself to do some analysis on those numbers. There are knocks, talks, walks, and contingencies (deals made).

I won't put you through everything I did, but I will tell you that I found a correlation of -0.41 between knocks and contingencies, meaning the more doors we knock, the fewer deals we make; there is also a high variance in door knocks, far outside the range of the daily numbers. I was very intrigued by this, as sales jobs are almost always called "a numbers game," meaning they rely on the law of averages: if you knock on a ton of doors, eventually you will make a sale.

I told leadership about this and I kind of got a "book burner" response from them. I was told not to share anything "negative" or tell people not to work to make more money; I abandoned discussing this with them and took it upon myself to dive deeper into it.

First, I calculated how many contingencies are signed on average out of 100 doors knocked. It is ~2.59. I used this as what I call "the assumed probability" in a binomial distribution function, and I plotted the probability of a sale at every number of door knocks (0 - 100). I found that the single most likely outcome over 100 knocks is 2 sales, at about 25% or so probability (the average being ~2.59 sales). The negative correlation told me that there is a point of diminishing returns, so I still did not believe that more equals more.
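For anyone who wants to check the first step, here is a minimal sketch of that binomial calculation. It assumes each knock is an independent trial with the same 2.59% chance of a contingency (the post's "assumed probability"); the function and variable names are mine, not the asker's.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each succeeding with probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_sale = 0.0259   # ~2.59 contingencies per 100 knocks, taken from the post
n_knocks = 100    # number of doors knocked

# Probability of every possible number of sales, 0 through n_knocks.
pmf = [binomial_pmf(k, n_knocks, p_sale) for k in range(n_knocks + 1)]

# The single most likely sales count, and the long-run average n * p.
most_likely = max(range(n_knocks + 1), key=lambda k: pmf[k])
expected_sales = n_knocks * p_sale

print(f"most likely number of sales: {most_likely} (probability {pmf[most_likely]:.3f})")
print(f"expected number of sales: {expected_sales:.2f}")
```

The most likely count (the mode) and the expected count (the mean) are different quantities, which is why 2 and ~2.59 both show up in the description.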

Second, I decided to do an expected value calculation for every single number of doors out of 100. The 2.59% chance of a sale remains constant at each door, of course; however, the penalty (resources used such as gas and time etc.) does increment with each door. I just called it $1 in gas per door, because that's about what it costs me. This told me that at 52 door knocks, you really have nothing left to gain. This made me realize the binomial distribution probability changes because the number of trials changes...
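The post does not show the exact calculation, so the following is only one possible reading of it, using the formula the asker gives in a comment under the question: (reward * probability of success) - (penalty * probability of failure), with the penalty growing by $1 per door. The reward of $2000 per contingency is a placeholder I made up for illustration; it is not a figure from the post, so the break-even door this prints should not be expected to match the 52 reported there.

```python
def door_expected_value(door_number, p_sale, reward, cost_per_door):
    """Expected value at a given door: reward * P(success) - penalty * P(failure).

    The penalty is the cumulative cost of all doors knocked so far,
    i.e. cost_per_door * door_number (an assumed reading of the post).
    """
    penalty = cost_per_door * door_number
    return reward * p_sale - penalty * (1 - p_sale)

def last_profitable_door(p_sale, reward, cost_per_door, max_doors=100):
    """Last door number (1..max_doors) whose expected value is still positive, or 0 if none."""
    profitable = [
        n for n in range(1, max_doors + 1)
        if door_expected_value(n, p_sale, reward, cost_per_door) > 0
    ]
    return profitable[-1] if profitable else 0

p_sale = 0.0259              # from the post
cost_per_door = 1.0          # "$1 in gas per door", from the post
hypothetical_reward = 2000.0 # ASSUMPTION: placeholder value, not from the post

print(last_profitable_door(p_sale, hypothetical_reward, cost_per_door))
```

Under this reading the break-even door depends entirely on the reward chosen, so the result is only as good as that input.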

Third, I went back to my spreadsheet and changed the number of trials to 52; this showed that the most likely outcome was 1 sale, and the probability was a little over 33% (which is greater than ~25% probability of 2 sales in 100 trials).
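The comparison between 52 and 100 trials can be checked with a short sketch like the one below. It makes the same assumption as the post (independent knocks, constant 2.59% chance each); the helper name is hypothetical.

```python
from math import comb

def mode_and_probability(n, p):
    """Return (most likely number of successes, its probability) for Binomial(n, p)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    k_best = max(range(n + 1), key=lambda k: pmf[k])
    return k_best, pmf[k_best]

p_sale = 0.0259  # from the post

for n_knocks in (52, 100):
    k, prob = mode_and_probability(n_knocks, p_sale)
    print(f"{n_knocks} knocks: most likely outcome is {k} sale(s), probability {prob:.3f}")
```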

Since this has all been based on real performance numbers at our branch, I feel like I have explained the negative correlation and shown (statistically at least) that the "rise and grind" strategy, if you will, of knocking on as many doors as possible every single day is counterproductive and causes sales reps to produce inconsistent numbers (which would explain the high variance).

My hypothesis now is that if all sales reps did 52 knocks per cycle (day, week, or some other time period), variance would be closer to zero or at least within range of the data, sales would increase, and the correlation would likely show a positive relationship between door knocks and contingencies as door knocks would generally be lower and the number of contingencies would be higher.

I would like to know from someone more experienced in this type of analysis if there is merit to my findings or if I have missed something. On a less formal note, I had a good bit of fun writing Python code to plot all of this.

It isn't clear to me what the observation units actually are, i.e., between what kind of observations you computed the negative correlation. At first sight it looks counterintuitive. If you knock on a certain number of doors and make a certain number of contingencies and then you go on and knock on more doors, the number of contingencies can't go down, as you don't lose what has already been achieved. So how can there be a negative correlation? The penalty doesn't enter the correlation computation and therefore cannot explain it!? Or do "contingencies" actually mean "contingencies per knock"? - Christian Hennig, Commented Jun 21 at 21:20
I took the array of the number of door-knocks each day and the array of the number of contingencies each day and calculated the correlation coefficient. This means on some days there was a low number of door-knocks and more contingencies, and on other days there was a higher number of door-knocks and fewer contingencies. Also, the penalty is in the expected value calculation: (reward * probability of success) - (penalty * probability of failure). - therealchriswoodward, Commented Jun 21 at 21:23
Not sure I've got enough detail to say much of help, but one thing to consider is whether the apparent negative correlation might be omitted variable bias (e.g. see the diagrams near the top of the Simpson's paradox article on Wikipedia). For example, consider that the salespeople are knocking on more doors when the prospects are worst. That could easily produce a negative correlation. I didn't read Christian's comments fully, so he may have already been addressing this in his comments. - Glen_b, Commented Jun 22 at 1:13
When I did door-to-door sales, if I ever got in the door there was then a fairly long time spent trying to convince the homeowner to make the purchase, and even more time spent filling out the paperwork if the homeowner agreed. That time spent with prospects necessarily reduced the time left in the day to knock on more doors. - EdM, Commented Jun 22 at 12:43
My first thought was exactly the same as @EdM. Rejection knocks may only take 1 minute from your sales guys ("Thanks, not interested"). An actual sale may take 30 minutes (pitch, explain, answer questions, get some paperwork signed, etc.). So yes, in any given day, the more sales you close, the less time you have for additional knocks. And the more you are rejected, the more knocks you can make. As far as high variance, it could be explained simply by the location (high/low income, home owners/renters, etc...) - jginestet, Commented Jun 22 at 22:20

EdM · Accepted Answer · 2024-06-29 13:57:53Z

Summarizing comments into an answer:

When a potential customer opens the door and allows the salesperson into the house, then the salesperson must take the time to make the sales pitch. If the pitch is successful, then there will be further time expended to fill out the paperwork and finish the contingency. With a limited number of working hours in a day, the extra time spent for each contingency will diminish the time available for knocking on more doors.

The causality is thus probably opposite to what might have been inferred from the raw correlations: it isn't that more door knocking leads to fewer contingencies, it's that more contingencies lower the time available to knock on doors. Once again, correlation does not necessarily mean causation.
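The time-budget mechanism can be illustrated with a deliberately stripped-down simulation. The per-knock chance of a contingency is held at the same 2.59% on every day, so knocking more never makes a sale less likely; the only link between the two counts is that each contingency uses up time. The 1-minute rejection and 30-minute sale come from jginestet's comment; the 240-minute working day, the seed, and all names are my assumptions. Because every simulated day has the same length, the correlation it prints is much stronger than the -0.41 observed in real data, where day length, territory, and other factors vary.

```python
import random

def simulate_day(day_minutes, p_sale, knock_minutes, sale_minutes, rng):
    """Knock until the day's time budget runs out; return (knocks, contingencies)."""
    time_left = day_minutes
    knocks = 0
    contingencies = 0
    while time_left >= knock_minutes:
        knocks += 1
        time_left -= knock_minutes
        if rng.random() < p_sale:
            contingencies += 1
            time_left -= sale_minutes  # pitch and paperwork eat into knocking time
    return knocks, contingencies

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
# ASSUMPTION: 240 working minutes per day; 1 min per knock, 30 min per sale.
days = [simulate_day(240, 0.0259, 1, 30, rng) for _ in range(1000)]
daily_knocks, daily_contingencies = zip(*days)

print(f"correlation: {pearson(daily_knocks, daily_contingencies):.2f}")
```

A negative correlation appears here even though, by construction, more knocking cannot reduce sales.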
