
TL;DR When votes of 20... 30... 100 users clearly indicate that only one or two answers are popular, it does not make sense to pretend that other answers are popular too.


In the current version of the “network hot” questions formula (AnswerCount * Qscore) *, all answers up to the cap of 10 are assumed to contribute equally to the question's "hotness score", even those downvoted into oblivion.

I suggest discarding answers when there is strong evidence that they do not provide good data points for question popularity, such as answers scoring less than 1/10... 1/20... 1/100 * * * of the top voted one.

Note that the hotness formula "specification" * already assumes excluding answers that are not considered "good data points". An example of this is the justification for discarding accepted answers:

Note that accepted answers weight not at all in hotness. This is intentional, as I feel accepted answers are a fine social contract, but not a good data point for question or answer quality.

As far as I can tell, the indiscriminate inclusion of low score answers in questions with lots of views and votes also goes against another underlying assumption in the formula "specification":

one assumes... there will be a lot more voting on the answer

Specifically, I suggest discarding answers with a score * less than (TopAnswerScore/10 - 1).

  • This way, answers at -2 or less are ignored when the question has answers with a non-negative score. When some answer reaches a score of 10 (qualifying for the Nice Answer badge), the formula would start discarding answers with a negative score. When there is an answer at +20, the formula would ignore those scoring less than +1; when there is an answer at +100, it would ignore those scoring less than +9, and so on...
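As a minimal sketch of the rule above (plain division is assumed here; the exact rounding is an implementation detail left open):

```python
# Discard answers scoring less than TopAnswerScore/10 - 1.
def surviving_answers(scores):
    """Keep only the answers that would still count toward hotness."""
    if not scores:
        return []
    cut = max(scores) / 10 - 1
    return [s for s in scores if s >= cut]

print(surviving_answers([0, -1, -2, -3]))     # top at 0: cut is -1, drops -2 and -3
print(surviving_answers([100, 40, 9, 8, 0]))  # top at +100: cut is +9, drops 8 and 0
```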

The negative impact of counting low quality answers is closely tied to the popularity of the MultiCollider* and the Hot Questions sidebar*, which use the hotness score to arrange the list of "hot questions"* displayed to Stack Exchange users.

When SE users visit questions from the top of the sidebar (previously from the collider), some of them choose to add their own answers. Since this audience involves hundreds of users, the number of answers brought into a question can quickly grow by 5, 10, 20...

By indiscriminately counting these answers toward the hotness score (even when voting evidence suggests the opposite), the formula pushes affected questions closer to the top of the sidebar, which in turn brings more visitors, who in turn add their answers, which in turn push the question even higher in the sidebar, and so on, over and over again, creating a positive feedback loop * of uncontrolled, artificial growth of the hotness score.

This uncontrolled hotness growth in turn defeats the time decay mechanism "embedded" in the formula (MAX(QAgeInHours + 1, 6) ^ 1.4), which in turn also feeds the positive feedback loop mentioned above.
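To illustrate: the decay term quoted above steadily cools a question, but a numerator inflated by incoming answers cancels it out (the numerator values below are made up for illustration; this is not the full formula):

```python
# Decay term from the formula: max(age_hours + 1, 6) ** 1.4
def decay(age_hours):
    return max(age_hours + 1, 6) ** 1.4

static_numerator = 500
for age in (6, 12, 24, 48):
    # assume one new answer per hour on a +100 question, i.e. +Qscore/5 = +20 each
    inflated = static_numerator + age * 20
    print(age, round(static_numerator / decay(age), 1), round(inflated / decay(age), 1))
```

The static question's score drops off as intended; the one collecting answers holds its position far longer.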

A popular question that quickly acquires 9 or 10 "noisy", low interest answers can sit at the top of the sidebar for hours, even when the number of really popular answers stays the same and there is no substantial increase in views (several hundred views from the sidebar do not come close to the thousands of views that arrive when a question becomes genuinely popular outside of Stack Exchange).

Discarding answers proven to be "insufficiently hot" would let the time decay component of the formula do its job.


Last but not least, the contribution of low scored posts to "question hotness" (which in turn blocks the intended time decay) makes questions with multiple low quality answers stick to the top of the sidebar for a long time, giving a wrong impression of what kinds of posts are welcome at Stack Exchange.

This makes it look like good questions are those with many meh answers, an effect amplified by these questions being highly visible to the sidebar audience of hundreds and thousands of SE users. Misguided users spread the acquired attitude further into other questions and answers, posting stuff modeled on what they saw in the "cool" ("hot") questions.

As far as I can tell, this jeopardizes the very idea * of making the Internet a better place.

Please stop counting proven low score answers in the hotness formula. Please roll the dice fairly: let user voting and time decay contribute to the hotness score as intended. Please promote less brain-damaging content for the sidebar audience to learn from.


Update. Functionally limited, but performance-friendly alternatives to this feature request:

  • those interested to see the issues I refer to at work right now are welcome to take a look at the question currently sitting on top of the collider with a funny score of 100. At 21 hours and about 2.5K views, it has 13 answers (maybe there are also deleted ones) voted as follows: 53, 26, 20, 16, 9, 7, 4, 4, 4, 3, 3, 2, -5. The suggested formula would ignore 7 of the 13 answers, making the score about 70 or even lower if one considers the effect of the broken positive feedback loop (less eyeballs -> less votes)
    – gnat
    Commented Aug 6, 2013 at 12:02
  • Given the further explanation, "Note that accepted answers weight not at all in hotness" seems to be more about the checkmark (i.e. the 'accepted' part) than the answer itself. It probably counts like any other answer. Otherwise, accepting an answer would reduce hotness? Seems weird. Commented Aug 15, 2013 at 19:22
  • for the record: importance of hotness formula correction - Q&A -- Conversation in The Whiteboard "this makes indirect "brain-damaging" impact network wide ...nobody is really protected, collider spreads it across all the SE network"
    – gnat
    Commented Aug 31, 2013 at 22:16
  • I don't have an answer or really even a useful reply here, but given its popularity I feel like I should note that we're in the midst of taking a good hard look at what and how we display in the network-wide "Genuine Stack Exchange Multicollider Doohickey Thingamajig" - expect a discussion on this from our design team at some point in the next 6-8 days.
    – Shog9
    Commented Sep 27, 2013 at 0:19
  • @Shog9 in case if that matters, I personally like how collider works now, except for the "positive feedback loop" I talk about in my request. I especially like how current formula is sensitive at picking promising questions at early stages. If you aim at more radical changes, I am merely mildly interested to learn about these...
    – gnat
    Commented Sep 27, 2013 at 6:43
  • ...a guy worth discussing radical changes with is likely Mysticial, per our discussion it looks like he gave these quite a lot of thought
    – gnat
    Commented Sep 27, 2013 at 6:44
  • @Shog9 - good news, thanks for sharing the update. Feel free to pull me into a chat if you want me to describe what I saw in more detail.
    – user194162
    Commented Sep 27, 2013 at 11:34
  • You can find the proposal here, @GlenH7, gnat.
    – Shog9
    Commented Sep 27, 2013 at 15:17
  • for the record: details on some tweaks in hotness formula (?) "Succeeding questions from the same site are penalized by increasing amounts. So, the first question from SO in the list gets multiplied by 1.0, the second by 0.98, the third by 0.96, etc)... Community wiki questions are penalized... The benefit of many answers is capped at 10, and we only look at the score of the top 3 answers. We only degrade based on question age, and not the last update date on a question, so questions don't pop back up to the top every time they're edited."
    – gnat
    Commented Oct 1, 2013 at 8:00
  • @gnat well, all the bounties ended up with a rejection... better than nothing, eh? ;) Commented Oct 1, 2013 at 13:09
  • @ShaWizDowArd if only rejection would somehow make the reported issues disappear. But, oh, it doesn't
    – gnat
    Commented Oct 1, 2013 at 13:23
  • @Shog9 6-8 days? I believe you created a new meme. Commented Oct 2, 2013 at 6:25
  • for the record: an item intended to address involved issues has been submitted to Feedback request: New top bar and MultiCollider redesign
    – gnat
    Commented Oct 25, 2013 at 11:56
  • for the record: feature request offering a solution modified to comply with decline reasons of this one is Reorder questions picked for hot list based on adjusted hotness score (discard some answers by voting evidence)
    – gnat
    Commented Jan 15, 2014 at 17:51
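For reference, the "7 of 13 answers" claim in the first comment of the thread above can be checked directly against the reported scores (assuming a plain fractional cut of 53/10 - 1 = 4.3):

```python
# Answer scores reported in the comment; top answer at +53.
scores = [53, 26, 20, 16, 9, 7, 4, 4, 4, 3, 3, 2, -5]
cut = max(scores) / 10 - 1          # 53/10 - 1 = 4.3
ignored = [s for s in scores if s < cut]
print(len(ignored), ignored)        # 7 [4, 4, 4, 3, 3, 2, -5]
```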

5 Answers

31
+600

TL;TMWtB

That's a brilliant idea, and it would bring some much needed balance back into the hotness values! This change should really cut down on the runaway questions.

* Too long; tell me what to believe


Data to back the assertions

Being an engineer, I have this crazy predisposition for having numbers to back assertions when making decisions. So I gathered some data, and I'm editing my answer to reflect what I found.

Near the end of August 2013, I noticed a question that looked like it would ride on top of the hotness collider for a while. Please note, I'm not suggesting anything was wrong with that question. I just thought it would provide some useful data for this meta question.

This is what I found. Apologies in advance for having to break up some of the images.

Baseline

Raw basis of question

Answers and totals

As-is Collider values

Times are GMT, and I'm not certain about the values I calculated for the denominator. The denominator is supposed to be the age-drag effect on the equation. Essentially, I took the calculated numerator and divided by the Collider value in order to determine what the denominator needed to be. The problem is that most of the time the denominator (age-drag effect) actually boosts the question for about the first 7 hours of the Q's life.

I assumed it was safe to ignore the denominator since it wasn't fully relevant to the proposed change.

Looking at the "percentage of numerator from ..." columns, we see that QScore * NumAnswers / 5 grows in overall impact on the collider value. In effect, the quality of the answers (aka the sum of their scores) is outweighed by the score of the question and the total number of answers. So we see that noisy answers help out the question's hotness value even though they don't help the question itself. Quite telling are the scores on the answers themselves - that's a long tail of low to no up-votes on answers.

Non-zero scored answers

So now what happens if we ignore all the zero scored (or lower) answers?

Non-zero answers

Our number of answers to consider drops off and holds steady at 9 throughout the time I watched this question. Notice that the relative effect of QScore * NumAnswers / 5 versus SumAnswers stays pretty consistent. My interpretation is that ignoring zero scored answers from the Collider formula would be a good step forward in curbing excessive hotness scores.

However, there's a problem in that the result of QScore * NumAnswers / 5 will always remain zero now until an answer gets an up-vote. That seems a little unfair to a fledgling question. Even a highly up-voted question won't gain traction in this category until an answer gets an up-vote.

10% threshold

So let's see what happens when we implement the 10% threshold as suggested by the OP (gnat).

10p Threshold

I think the first thing to notice is how SumAnswers starts to outweigh the other factors for the numerator as the question ages. Personally, I think this is the right approach for us to have. As a question ages and attracts answers, it really should be the quality of those answers that determine how hot a question should remain.

As an additional benefit, the impact from total views is a little greater and doesn't decay quite as quickly. With our current collider implementation, the number of views quickly drops off as a factor in the hotness score. In this case, the total views retains a little over half its original impact.

In case you're still following along, you might be wondering: what's the impact of my suggestion to have negatively voted answers drag down the hotness value? Honestly, it's not all that much. At the end, only 2 answers had a negative value at -1 each, so take 2 points off of SumAnswers. QScore * NumNegAnswers / 5 yields 17, which is about an additional 20% off the final numerator value from my observations.

Summary

Granted, it's only one question and one set of observations. And you can't draw a trend from only one point.

But I think there's enough information to show that this suggestion is worth considering. It would slow down how quickly hotness scores can ratchet up without punishing high quality questions. In fact, it would probably give up-and-coming high quality questions a better chance since the impact of the answer scores has a greater weight.

So please let the record show that engineers and their equations sometimes lead us to clearer discussions around change. :-)


Previous, and now a bit dated, thoughts

I think you have a solid suggestion for reining in the issues with the hotness formula, but I don't think your approach goes far enough. [edit: looks like the data shows it would be pretty decent without my suggestion]

Instead of just ignoring negatively scored answers, they should drag the hotness value down. Any answer with a negative score should have an equal, but opposite, effect on the hotness value. So whatever effect it would have had as a positively voted answer should be flipped around.

One of the problems that you're trying to address and describe in the under-damped feedback cycle is that poor questions attract poor answers and the current formula encourages that behavior. My suggestion is to extend your request and move into an over-damped control system so the formula will punish attracting poor answers.

This also provides an incentive to the community to down-vote the poor answers knowing that it will reduce the potential popularity of a poor question.

Adding my suggestion would get you the cooling effect you're looking for in a quicker fashion.
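A minimal sketch of this extension, assuming each answer feeds the numerator roughly Qscore/5 plus its own score (a reading of the discussion above, not the actual implementation):

```python
# Negative-score answers get their Qscore/5 share flipped into a penalty.
def answer_contribution(qscore, answer_score):
    base = qscore / 5
    if answer_score < 0:
        return -base + answer_score  # equal but opposite effect
    return base + answer_score

print(answer_contribution(50, 3))   # 13.0: a +3 answer on a +50 question helps
print(answer_contribution(50, -3))  # -13.0: a -3 answer now hurts just as much
```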

  • yeah I was pondering about this for a while. Preferred to keep it out of scope of this feature request mainly in order to keep it as a simple straight "bugfix" (no additional cooling, strictly removal of "bad data points"). As far as I can tell, for poor questions, this would work exactly as you suggest. Thing I can't make up my mind on is, what will it incentivize for good questions? Imagine a good question undeservedly polluted by a couple of meh braindumps from hotness lemmings, what is one supposed to do about these?...
    – gnat
    Commented Aug 6, 2013 at 18:06
  • ...Downvoting would drag the question down, would this be fair? No voting would make an impression that meh is acceptable, would this be OK? Flagging for deletion doesn't feel alright, too: "I for one would not want to rely on mod discretion splitting meh and okay answers neither at Programmers nor at Workplace (with all due respect)."
    – gnat
    Commented Aug 6, 2013 at 18:06
  • IMO, it's reasonably rare that a good question is going to attract a number of poor answers. One or two, sure, but not a lot. It's the poor questions attracting lots of poor answers. And they should all be burned (where's that graphic?). But I may be suffering from selective bias in this case. In my suggestion, the community can help shape this by tolerating "meh to poor" answers but also beating down the poor answers in order to shape the hotness of the question. So I see this as a good lever for the communities to watch over questions with.
    – user194162
    Commented Aug 6, 2013 at 18:36
  • when the question gets on top of collider, possibility of it to pick up a bit of crap from lemmings rises dramatically (lemmings are brainless, they sometimes even don't read the question). That's why I hesitated to include cool off part into request; if anything I'd rather prefer to first see how simple, straight change works and only after a while, if needed, proceed with more complicated adjustments from there
    – gnat
    Commented Aug 6, 2013 at 18:43
  • @GlenH7 Do you mean this one? Or one of the others?
    – Mark Booth
    Commented Aug 6, 2013 at 23:19
  • speaking of dampening, my last amplifier before I quit audiophilia was SET. Taught me that lots of dampening isn't always better (for the sake of precision, speakers I used it with were very easy load). I am not certain yet how this translates to Stack Exchange; after all "speakers" here don't feel like an easy load :)
    – gnat
    Commented Aug 6, 2013 at 23:58
  • @MarkBooth - I think the only answer is an unequivocal "yes." Those are hilarious.
    – user194162
    Commented Aug 7, 2013 at 3:16
  • @gnat In control terms, you never want a system to be under damped or over damped, you always want it to be critically damped and the closer we can get the hotness formula to that ideal the better in my opinion.
    – Mark Booth
    Commented Aug 7, 2013 at 8:07
  • @MarkBooth - that's a good point on terminology. And I think you're right - our idealized system would be a critically damped system as it would give middling questions a better chance at being corrected and then rightfully using the hotness score. At the moment, swinging the system into a not-too over damped state would be a refreshing change of pace.
    – user194162
    Commented Aug 7, 2013 at 13:41
  • @MarkBooth Damned engineers and their "equations" :) Without a door analogy in that article, I wouldn't have a slightest idea what that critical damping means
    – gnat
    Commented Aug 7, 2013 at 16:46
  • I don't really like this solution, because sometimes there are answers that are just crap on a really good question. Commented Aug 28, 2013 at 15:04
  • @KronoS: I think it is a sign of a 'bad' question when you become a magnet for bad answers.
    – user7116
    Commented Sep 12, 2013 at 15:19
  • "Even a highly up-voted question won't gain traction in this category until an answer gets an up-vote" -- I think this is a very valid concern. Suggested change addresses it by subtracting 1 from TopAnswer/10. As a result, at early stage, when there are not yet enough votes, it simply falls back to current score...
    – gnat
    Commented Sep 12, 2013 at 21:12
  • ...In your test example, early data points are missing, but very first one, at 13:15, with top answer being at relatively modest +30 looks good enough to extrapolate for cut at (TopAnswer/10-1). Up until top answer got to +19, there would be no cut at all (19/10-1=1-1=0, all answers still qualify) meaning that collider score would be the same as it was when you tracked it, no changes at all...
    – gnat
    Commented Sep 12, 2013 at 21:15
  • @gnat - I definitely agree with your comments there. It's "easier" to implement an algorithm where only positively scored answers count, but that's unfair to a great question that hasn't attracted up votes on the answers yet.
    – user194162
    Commented Sep 12, 2013 at 21:57
12
+100

Sorry, marking this as status-declined.

We love seeing people contributing in concrete ways (especially backed up by research!) but in this case, not knowing the implementation details makes it almost impossible for you guys to solve this one.

  1. The hotness formula that's used for the network hot question list is not the same as the one for the "hot" page. Just superficially comparing the two pages shows you that -- the hot page favors questions that are "instantaneously" hot. The collider algorithm does a lot to correct this, documented here. Because of that, a lot of the research here is based on a false premise.

  2. Specifically, we already cap the number of answers that contribute to the score at 10. This means that any answer after 10 does not contribute to the hotness score.

  3. In fact, views don't contribute either. They turned out to be inefficient to query, so somebody at some point just removed them from the calculation.

  4. This query is very much based on "what data is easily queryable?". We already have a denormalized column for AnswerCount, so using it in the formula is easy. Forcing the query to actually consider each answer and count it or not based on a formula is not possible in a query that has to cover so many posts. Adding a denormalized field just for this seems like overkill. It'd be much easier to attack this using the columns already in the DB.

Since this question is based on such a specific modification to the algorithm, I'm marking it status-declined. If you want to pursue this, I suggest opening a more open-ended feature request that demonstrates the problem you perceive ("Hot questions stay at the top of the supercollider for too long", or similar) and then we can start playing with the formula.

  • regarding the "more open-ended feature request" you mention, does this one fit: Don't let questions stick to the top of the hot questions list forever? Per my reading it points to the issue almost precisely as you describe: "Hot questions stay at the top of the supercollider for too long"
    – gnat
    Commented Oct 1, 2013 at 13:13
  • I think it's also worth mentioning that even the cap at 10 answers doesn't really address the issue of using "bad data points" in the formula - "Even with tweaked formula, stuffing 8 useless, zero-score answers into +50 question would have the same effect as giving 80 upvotes to answers. At +200 question, this would be like giving 320 (over three hundreds!) upvotes to answers... and there's even a real, recent example for that..."
    – gnat
    Commented Oct 1, 2013 at 13:42
  • Looks like this is the current formula: meta.stackexchange.com/a/61343/194162. Thanks for updating your other answer!
    – user194162
    Commented Oct 1, 2013 at 16:27
  • There are a few things I'd change about the formula: A hard cutoff after x days, scaling not only for traffic but also for voting behaviour and removing CW questions (due to the few sites still using CW for soft questions). I think all of those have feature requests already. I'd like to play around with the algorithm to make more specific suggestions and add some hard data to support them, but as far as I see the necessary data is just not accessible to regular users. Commented Oct 1, 2013 at 19:18
  • wrt "denormalized field" to keep evidence-based score, I never ever imagined having stuff like that for every question - it just does not make sense as there are only 100 questions in hot list anyway (half of them at collider). Just 1) select 200-300 top-hot questions using current formula as approximation 2) recalculate score for these using voting evidence and 3) select top 100 by re-calculated score. Simple as that
    – gnat
    Commented Oct 1, 2013 at 19:55
  • regarding views, in the context of this feature request, it is purely tangential whether formula counts these or not; request explicitly factors views out: "several hundreds views from collider do not come close to thousands views that come when question becomes fairly popular from outside..." (and yeah, at Programmers, we even tested and discussed this)
    – gnat
    Commented Oct 1, 2013 at 21:13
  • David, I wonder if you realize that the way you describe the formula ignores community feedback in quite a brutal way? Just think of what happens when users downvote a 5th, 6th, 7th low quality answer - the formula keeps stuffing the hotness score with QuestionScore/5 no matter what, only making matters worse, increasing question exposure with all the associated risks of getting more low quality answers (just what has been observed in referred studies)...
    – gnat
    Commented Oct 8, 2013 at 14:06
  • @gnat I'm not sure what you mean. When users downvote answers, AnswerScore decreases, which decreases the overall hotness. See the formula here: meta.stackexchange.com/a/61343/146719 Commented Oct 8, 2013 at 14:25
  • @DavidFullerton - Yes, down voted answers reduce the AnswerScore, but they don't degrade the QScore portion of the numerator. QScore is multiplied by number of answers, which means that a poor answer (up to the tenth one) feeds into the final score. For example, a question with 50 upvotes will pick up 10 collider points for every answer (up to 10) that it receives. The 5 or 6 poor answers on the question will still add 50 - 60 collider points in this example.
    – user194162
    Commented Oct 8, 2013 at 14:29
  • David, I specifically refer to "we only look at the score of the top 3 answers" (hope this part is really gone and is not just lost in edit). And anyway, did you do the math (using eg this example): adding an answer (any answer) increases hotness by 83/5=16, how many downvotes would you expect to compensate for that? Note BTW that the best community effort I've ever seen at Programmers was like 5-6 downvotes, and it was barely sufficient to overcome sympathy upvotes piling on, thanks to Trouble with popularity
    – gnat
    Commented Oct 8, 2013 at 14:31
  • To continue the example, let's say I downvote all 6 of the poor answers. The 6 answers have contributed +60 to the collider value while my DVs contribute only -6 to the collider value. So the poor answers have a disproportionate effect in attracting attention to the question. Even if gnat comes along and DVs the 6 along with my DVs, that's still only -12 to AnswerScore vs +60 from QScore*Answers. Giving that DVs are comparatively rare and there is minimal effect from them, the community can't rein in a poor question that's hit the collider.
    – user194162
    Commented Oct 8, 2013 at 14:42
  • found the reference for best community effort I mentioned: diamond/10K-only link to deleted post. 9 DVs on the question and up to 6 DVs on the answers. Yeah we try to do our best to keep site professional but we can only do so much
    – gnat
    Commented Oct 8, 2013 at 14:43
  • @gnat ah, I see what you mean. I agree, the current formula isn't ideal. It's on my list to look at more when I get some time. Commented Oct 8, 2013 at 17:14
  • thanks for understanding! in my experience isn't ideal is a very soft characterization
    – gnat
    Commented Oct 8, 2013 at 17:25
  • @gnat it's worth noting that the 10th -> final answers are pretty much chronologically ordered since they never received any upvotes. Based upon the algorithm clarifications, I don't know that those answers contribute anything to the collider score. They're shunted from the QScore multiplier and they have zero votes so no contribution to SumAnswers
    – user194162
    Commented Oct 23, 2013 at 13:41
9

For those interested in supporting, opposing, or adjusting the proposed change, here is a more detailed explanation of the suggested "cut value":

(TopAnswerScore/10-1)

Above is mostly based on my studies of questions listed in two meta posts:

For the sake of completeness, there were multiple other discussions involved, but the main focus was on the data in the above two anyway.

Studying the low quality *) answers in the questions listed in the above posts led me to believe that these tend to have low scores.

Also, it looked like the way the current formula tries to take this into account via sum(Ascores) is insufficient, due to the indiscriminate stuffing of these answers into Qanswers * Qscore: e.g. in a question scored +100, any answer with a zero score makes the same contribution as 20 upvotes to other answers.
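That arithmetic can be checked directly:

```python
# Per the Qanswers * Qscore / 5 term: what any answer, even a zero score one,
# adds to the numerator of a +100 question.
qscore = 100
per_answer = qscore / 5
print(per_answer)  # 20.0, i.e. the same as 20 upvotes spread over other answers
```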

I tried to figure out how the formula could be adjusted to better account for answers that have a low score.


My first approach was to consider ignoring negative score answers. This idea is extremely natural, for it's hard to imagine how these could be considered "popular". I think if I had proposed it at MSO back then, we'd have had a modified formula several months ago.

The problem was, it didn't work on the data I studied. There were just too few answers with a negative score to make any substantial difference. My guess is this is due to how sympathy upvotes work in hot questions. A negative score answer tends to stand out among the rest, and when there are hundreds or thousands of eyeballs on it, there is a good chance that someone will "like" it.

  • It was a sort of revelation that happened when I observed a particular hot question. Among many others, there were 3 answers of different degrees of crappiness, voted accordingly at -1, -2, -3. A few hours later, all 3 answers had been upvoted back to zero, and they kept staying there, buried in the sea of other mediocre, low score answers.

Okay, so I tried to figure out how it would work if the answers were cut at zero score. No luck either; there were too few to make a difference. Random upvotes in high view questions made it practically impossible to consider a cut at any constant value.

A cut at, say, +2 would probably be OK for a question with 1K views, but for one with 3K views it would fail miserably, and the positive feedback loop of fake popularity would kick in again. At 3K views, one would rather cut at something like +5, but such a cut would make no sense for questions with fewer views, at or below 1K. Dead end.


Messing with ideas of a cut at some constant score led me to believe that the cut had better be proportional to the score of the top voted post (question or answer). I tried TopPost/100, and it made good sense on the set provided in the sticky questions post but made too little difference on the examples in the answer quality post, so I decided to see how it would work if I cut harder.

At about this time, I also decided that I wanted it to be TopAnswer rather than TopPost, for I wanted the cut to be based on a comparison of more uniform kinds of posts (comparing scores between answers felt more reliable than between answers and the question).

If memory serves, cuts between TopAnswer/5 and TopAnswer/15 worked fairly well on my data sets, so I decided to pick a single value among these, the one that would be easiest to explain and understand.

TopAnswer/10 made it, as it blends well into the Stack Exchange "value system": a 10x score difference is the one that makes a Nice Answer, and the same difference separates a Nice from a Great answer. All right.


Okay, let it be a cut at about TopAnswerScore/10. Now I needed to figure out how to score questions at early stages, when there are just too few votes to make any sense of this.

My studies of how the hotness score works had led me to believe that at early stages the current formula works quite well (try studying it yourself; you'll find that it only really starts to mess things up when a question gets to higher views / votes), so I basically wanted it to keep working until there are sufficient votes to reliably switch to the "new cut".

I wanted the current formula to work until at least one of the answers gets enough votes to roughly qualify for the Good Answer badge, i.e. until about +25. I figured that subtracting 1 would make a good approximation of that, which produced the final version: TopAnswer/10 - 1. Quoting myself,

This way, answers at -2 or less are ignored when question has answers with non-negative score. When some answer reaches score 10 (qualifies for Nice Answer badge), formula would start discarding answers with negative score. When there is an answer at +20, formula would ignore those having less than +2, when there is an answer at +100, it would ignore those scoring less than +9, and so on...


*) "low quality answers" -- my evaluation of quality was subjective. Simply speaking, I read the answers and tried to figure out whether they made a fair attempt to address the question asked and provided an explanation. I tried to keep my own opinions out of it; i.e., one-liner slogans lacking an explanation were considered low quality even when I agreed 200%.


Performance considerations

As soon as I arrived at the conclusion that the cut had better be proportional to the score of the top voted post, I took a while to ponder performance. This algorithm looked substantially harder than the current formula, and even without knowing whether it was feasible or not, I wanted a variant that would be comparable to the current one in terms of performance.

For a comparable approach, I took into account that 1) an evidence-based score is really only needed for a limited number of posts (just those at the collider) and 2) the current formula tends to get in trouble only when posts are near the top of the collider.

The above brought me to the idea of pre-selecting candidates for the collider using the current formula (which is already proven to have acceptable performance) and only re-calculating and reordering these pre-selected posts, just a few hundred questions.

My idea for the number of questions to pre-select was about 2x (3x, 4x) the number needed for the collider (the more the better, as long as performance remains acceptable). The point of the 2x-3x-4x multiplier is to ensure that the "candidates pool" includes questions that have been out of the collider. As a result, even if currently hot questions have their initial approximate (old-fashioned) score screwed up by low interest answers, they are still guaranteed to compete against questions that gained popularity in a more natural way.
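A sketch of this two-pass selection (the scoring functions are stand-in callables, not the real formulas):

```python
# Preselect with the cheap current formula, then rescore only the small pool
# with the expensive evidence-based score.
def pick_hot(questions, current_score, evidence_score, hot_size=100, factor=3):
    # pass 1: cheap, approximate ranking; keep a 2x-4x candidate pool
    pool = sorted(questions, key=current_score, reverse=True)[:hot_size * factor]
    # pass 2: evidence-based re-scoring of the pre-selected posts only
    return sorted(pool, key=evidence_score, reverse=True)[:hot_size]
```

Only hot_size * factor questions ever see the expensive score, which is how the pool multiplier trades a bit of recall for bounded cost.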

Feature request based on performance considerations above is: Reorder questions picked for hot list based on adjusted hotness score (discard some answers by voting evidence).

Performance considerations - age decay

Update: another performance-friendly way to approximate the suggested feature is to make the aging factor depend on the number of answers, so that questions with more answers (5, 10, 20, 40...) start aging away sooner and more strongly.

The underlying concept is the same as in the proposed feature: a new answer is either 1) voted high enough to compensate for the increased age decay and keep popularity high, or 2) makes the hotness score decrease faster, thus lowering the question's exposure and decreasing the chances of further damage.

The purpose is also the same: to make for a more robust, self-correcting process, where low quality answers contribute to resolving the problem instead of making it worse.
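A sketch of this variant, keeping the decay shape from the current formula; the 0.02-per-answer strengthening of the exponent and the cap at 40 answers are illustrative guesses, not tuned values:

```python
# Same max(age + 1, 6) base as the current decay, but the exponent grows
# with the number of answers, so crowded questions cool off faster.
def decay_with_answers(age_hours, answer_count):
    exponent = 1.4 + 0.02 * min(answer_count, 40)
    return max(age_hours + 1, 6) ** exponent

# at the same age, a 20-answer question cools noticeably faster than a 3-answer one
print(decay_with_answers(24, 20) > decay_with_answers(24, 3))  # True
```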

Feature request based on performance considerations above is: Make hot questions with multiple answers age away faster on smaller / subjective-ish sites.

4
+200

I think this is a great idea, though any algorithm in the ballpark would probably be fine. For example, Qscore * (sum of AnswerScores) would do and would be at least a little easier to implement. It would also allow great answers (which we are theoretically optimizing for; pearls, not sand, and whatnot) to bring a question into the collider, and presumably we want to show off the great answers at least as much as the great questions.
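A sketch of this alternative numerator; note how piling on zero score answers no longer adds anything:

```python
def alt_numerator(qscore, answer_scores):
    # only votes on answers move the needle; answer count by itself does not
    return qscore * sum(answer_scores)

print(alt_numerator(50, [10, 5]))              # 750
print(alt_numerator(50, [10, 5, 0, 0, 0, 0]))  # still 750
```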

3
+500

Personally, I think this is a great idea. The collider, in my experience, clearly tends to attach itself to a fixed set of questions, which all tend to have the same thing in common: being the sort of thing everybody has an opinion about. If answers that lack distinct quality are not counted toward the chance to enter the collider, this will have a clear and distinct effect on questions of that nature, which are frequently not particularly good to begin with.

The synergy between questions that attract many answerers and the advertising the collider provides can clearly have only one effect, which is good when the many answerers attracted are adding quality, upvoted content. When it's everyone adding their two-cents opinions and doing no benefit to the information archive that is SE, however, it clearly benefits everyone to stem the tide that the collider brings into the question.
