2
$\begingroup$

On my website, users can either "like" or "dislike" a posted comment. I want to add a link that sorts comments by rating, so that the most liked ones appear at the top.

Of course, I cannot sort by the number of likes alone; I have to subtract the dislikes. But what if the difference (likes - dislikes) is the same for two comments, e.g. 10 - 8 = 2 and 5 - 3 = 2? I think in this case the comment with 10 likes and 8 dislikes should come before the comment with 5 likes and 3 dislikes.

So is there a formula that takes the number of likes and the number of dislikes and returns a meaningful rating that I can sort by?

$\endgroup$
6
  • $\begingroup$ Why can't you code this directly? Given a list, sort by largest "likes - dislikes" and, if two are equal, by largest "likes". $\endgroup$ Commented Nov 27, 2010 at 17:07
  • $\begingroup$ Yes, your criterion defines a perfectly good total order relation. Doesn't your programming language allow you to sort using arbitrary user-defined comparison functions? $\endgroup$
    – user856
    Commented Nov 27, 2010 at 18:23
  • 1
    $\begingroup$ ...Although you should really look at How Not To Sort By Average Rating by Evan Miller, which solves your underlying problem and not just the symptom. $\endgroup$
    – user856
    Commented Nov 27, 2010 at 18:58
  • $\begingroup$ There are many algorithms used for this problem because there are many ways to interpret and to use the data. One very different strategy would be to put comments on top if they've been rated many times, with a mix of likes and dislikes. That means people are interested in the comment and are likely to respond. (This could lead to intelligent debate or to name-calling, but both can result in more people visiting your site.) $\endgroup$ Commented Nov 28, 2010 at 5:48
  • $\begingroup$ Jonas Kibelbek, what would that algorithm look like? $\endgroup$
    – Anas
    Commented Nov 29, 2010 at 19:34
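
The ordering suggested in the first two comments (largest likes minus dislikes first, ties broken by more likes) can be sketched with a tuple sort key. This is only an illustration: the `comments` list and its field names are made up for the example and are not from the original post.

```python
# Hypothetical sample data; the field names are assumptions for this sketch.
comments = [
    {"id": "a", "likes": 5, "dislikes": 3},
    {"id": "b", "likes": 10, "dislikes": 8},
    {"id": "c", "likes": 7, "dislikes": 1},
]


def net_then_likes(comment):
    """Sort key: net score first, then raw like count as the tie-breaker."""
    return (comment["likes"] - comment["dislikes"], comment["likes"])


# Tuples compare element by element, so reverse=True puts the largest
# net score first and, among equal net scores, the most-liked comment first.
ranked = sorted(comments, key=net_then_likes, reverse=True)
print([c["id"] for c in ranked])
```

Here "b" (10 likes, 8 dislikes) lands ahead of "a" (5 likes, 3 dislikes), which is the tie-break the question asks for.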

3 Answers

2
$\begingroup$

You have to define a criterion that says when a pair $(L_1,D_1)$ for comment 1 is better than a pair $(L_2,D_2)$ for comment 2. One way is just to subtract, so $(L_1,D_1) \geq (L_2,D_2)$ if $L_1-D_1 \geq L_2-D_2$. But on Stack Exchange there are many more upvotes than downvotes, so maybe you want to weight dislikes more heavily: $(L_1,D_1) \geq (L_2,D_2)$ if $L_1-10D_1 \geq L_2-10D_2$, or some such. Maybe you want to compare on $\frac{L-D}{L+D}$ (defined only when $L+D>0$). There are many choices, and you need to consider your audience and their behavior to select one. Any such function maps $(L,D)$ to a number, which you can then sort by.
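
As a rough sketch, the three candidate criteria above can be written as small scoring functions. The function names are invented for this example, and returning 0 for a comment with no votes is an assumption, since the ratio is undefined there.

```python
def net_score(likes, dislikes):
    """Plain difference L - D."""
    return likes - dislikes


def weighted_score(likes, dislikes, dislike_weight=10):
    """L - w*D, with w = 10 as in the example above."""
    return likes - dislike_weight * dislikes


def normalized_score(likes, dislikes):
    """(L - D) / (L + D), which always lies in [-1, 1]."""
    total = likes + dislikes
    if total == 0:
        return 0.0  # assumption: treat an unrated comment as neutral
    return (likes - dislikes) / total


# Compare the two comments from the question under each criterion.
for likes, dislikes in [(10, 8), (5, 3)]:
    print(likes, dislikes,
          net_score(likes, dislikes),
          weighted_score(likes, dislikes),
          normalized_score(likes, dislikes))
```

The criteria disagree on the question's example: the plain difference ties the two comments, while both the weighted and the normalized scores rank 5 likes and 3 dislikes above 10 likes and 8 dislikes.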

$\endgroup$
1
$\begingroup$

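You can try $$f(L,D) = L - D + \frac{1}{D+2}.$$ Sorting by $f$ in ascending order sorts first by $L - D$ (ascending) and then, among ties, by $D$ (descending): since $0 < \frac{1}{D+2} \le \frac{1}{2}$, the fractional term never changes the order of two comments whose integer values of $L - D$ differ.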
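
A minimal sketch of this formula, using exact fractions so that ties are decided without floating-point rounding. The second function is a variant that is not part of the answer above: it subtracts the fractional term instead, which gives the tie-break the question asks for (more votes first).

```python
from fractions import Fraction


def tie_break_score(likes, dislikes):
    """f(L, D) = L - D + 1/(D + 2), computed exactly."""
    return likes - dislikes + Fraction(1, dislikes + 2)


def more_votes_first_score(likes, dislikes):
    """Variant (an assumption, not from the answer): L - D - 1/(D + 2).

    Among comments with equal L - D, the one with more dislikes (and
    therefore more likes) gets the larger score.
    """
    return likes - dislikes - Fraction(1, dislikes + 2)


pairs = [(10, 8), (5, 3), (3, 0)]
print(sorted(pairs, key=lambda p: tie_break_score(*p), reverse=True))
print(sorted(pairs, key=lambda p: more_votes_first_score(*p), reverse=True))
```

With $f$ as given and the highest score on top, 5 likes and 3 dislikes outranks 10 likes and 8 dislikes, because fewer dislikes give a larger fraction. The variant reverses that tie-break while still ordering primarily by $L - D$.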

$\endgroup$
2
  • $\begingroup$ Why do we add 2 to D in the denominator? $\endgroup$
    – Anas
    Commented Nov 29, 2010 at 19:37
  • $\begingroup$ We want $1/(D+2)$ to be less than $1$. $\endgroup$ Commented Nov 29, 2010 at 22:33
0
$\begingroup$

Here's what governs (or at least used to govern) the Reddit "best" comment sorting:

http://www.evanmiller.org/how-not-to-sort-by-average-rating.html

The lower bound of the Wilson score confidence interval is a conservative estimate of how good a comment is: given the votes so far, its true fraction of likes is probably "at least" this high.

However, if you want to draw people into conversations and also show new, untested comments near the top, you might want to consider the upper bound instead (replace the "+/-" in the formula with a "+").

That way, the comments are sorted by their optimistic potential ("at most" this good), given the current votes.

So, use the lower bound to put proven-good comments on top, and the upper bound if you want new comments (of unknown quality) to rank higher.
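
For reference, a minimal sketch of both bounds using the standard Wilson score interval for a binomial proportion. The function name, the default `z = 1.96` (roughly 95% confidence), and the choice to return the whole range `(0.0, 1.0)` for a comment with no votes are assumptions made for this example.

```python
import math


def wilson_bounds(likes, dislikes, z=1.96):
    """Return (lower, upper) Wilson score bounds on the fraction of likes."""
    n = likes + dislikes
    if n == 0:
        return (0.0, 1.0)  # assumption: no votes means no information
    p = likes / n  # observed fraction of likes
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return ((center - margin) / denom, (center + margin) / denom)


# Sort by the lower bound for "proven good" comments on top,
# or by the upper bound to favor comments with few votes so far.
votes = [(10, 8), (5, 3), (1, 0)]
by_lower = sorted(votes, key=lambda v: wilson_bounds(*v)[0], reverse=True)
by_upper = sorted(votes, key=lambda v: wilson_bounds(*v)[1], reverse=True)
print(by_lower)
print(by_upper)
```

With these defaults, the lower bound ranks 10 likes and 8 dislikes above 5 likes and 3 dislikes, while the upper bound reverses the two, because the comment with fewer votes has the wider interval.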

$\endgroup$
