Bayesian Ratings (to make sort by rating actually useful)

defucilis · Sep 27, 2023

Hello! Just a quick thought: at the moment, sorting by rating is basically useless, as a story with a single 5-star rating will be placed higher than a story with 10,000 5-star ratings and one 4-star rating.

This is easily fixed with a simple Bayesian rating system, which is very, very easy to implement.

You require two constant values that you can easily calculate from the database (the average rating across all stories, and the lower quartile of the number of ratings per story). You can generally just do this once and leave it, but you could always manually recalculate them a few times a year if you wanted. Or you could just use some common values of a 3.5-star average rating and 100 for the rating count :P
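For anyone who wants to see what calculating those two constants might look like, here's a rough Python sketch. The `stories` list and the `global_constants` helper are made up for illustration, and I'm assuming the global average is taken over every individual rating (weighted by each story's rating count); a plain mean of the per-story averages would be the other obvious choice.

```python
from statistics import quantiles

# Hypothetical sample data: (average_rating, number_of_ratings) per story.
stories = [
    (5.0, 1),
    (4.8, 12),
    (4.2, 40),
    (3.9, 150),
    (4.6, 900),
]


def global_constants(stories):
    """Return (global mean rating, lower quartile of per-story rating counts)."""
    total_ratings = sum(count for _, count in stories)
    # Mean over every individual rating, i.e. weighted by each story's count.
    global_mean = sum(avg * count for avg, count in stories) / total_ratings
    # quantiles(..., n=4) returns the three quartile cut points; index 0 is Q1.
    lower_quartile = quantiles([count for _, count in stories], n=4)[0]
    return global_mean, lower_quartile


global_mean, prior_count = global_constants(stories)
print(f"global mean = {global_mean:.2f}, prior count = {prior_count}")
```

On a real database this would presumably be a single aggregate query rather than Python, but the two numbers it produces are the same ones.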

Then you can just calculate a Bayesian rating value for all stories, which is basically a combination of the rating and how statistically likely it is that the story's 'true' rating matches that value (i.e. a 5-star story with one rating is less likely to actually be a 5-star story than a 5-star story with a thousand ratings).

Here's a page explaining what it is, how it works and how to calculate it: https://www.algolia.com/doc/guides/managing-results/must-do/custom-ranking/how-to/bayesian-average/
The formula is literally just:

globalDbFactor = globalAverageRating * globalNumberOfRatingsLowerQuartile //calculated once from the entire database

bayesianAverage = (storyAverageRating * storyNumberOfRatings + globalDbFactor) / (storyNumberOfRatings + globalNumberOfRatingsLowerQuartile)
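As a minimal sketch of how that could look in practice (Python here, with made-up story labels, and using the rough default constants of 3.5 and 100 mentioned above): the usual form of the Bayesian average adds the global average, weighted by the prior count, to the story's rating total, then divides by the story's rating count plus the prior count. The function and variable names are just illustrative.

```python
def bayesian_average(story_avg, story_count, global_mean, prior_count):
    """Pull a story's average rating toward the global mean.

    The fewer ratings a story has relative to prior_count, the closer
    the result stays to global_mean.
    """
    return (story_avg * story_count + global_mean * prior_count) / (
        story_count + prior_count
    )


# Rough default constants (assumed, not measured from a real database).
GLOBAL_MEAN = 3.5
PRIOR_COUNT = 100

# Hypothetical stories: (label, average_rating, number_of_ratings).
stories = [
    ("one 5-star rating", 5.0, 1),
    ("10,000 5-star ratings and one 4-star", (10_000 * 5 + 4) / 10_001, 10_001),
]

# Sort best-first by the Bayesian average instead of the raw average.
ranked = sorted(
    stories,
    key=lambda s: bayesian_average(s[1], s[2], GLOBAL_MEAN, PRIOR_COUNT),
    reverse=True,
)

for label, avg, count in ranked:
    score = bayesian_average(avg, count, GLOBAL_MEAN, PRIOR_COUNT)
    print(f"{label}: raw {avg:.3f}, bayesian {score:.3f}")
```

With these assumed constants, the story with a single 5-star rating drops to just above the global mean, while the heavily rated story keeps a score close to its raw average and sorts first.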

And in case 'number of ratings' isn't something that the database tracks, you can use a roughly equivalent metric like view count, favourite count, etc.

Anyway, please think about it! I feel like the ratio of benefit to development effort is pretty good on this one :)
When I implemented this on my site I was surprised and delighted at how good the results are!

AliceShiki · Sep 27, 2023

Tony is aware of how Bayesian averages work. He doesn't see the need to implement something like it because the current rating system works just fine.

Don't expect a change to happen.

TotallyHuman · Sep 27, 2023

AliceShiki said:
the current rating system works just fine.

How?

AliceShiki · Sep 27, 2023

TotallyHuman said:
How?

Because standard averages still give you a good notion of the overall feeling of the readers about the novel. If it has 100 5* votes and 10 1* votes, it doesn't matter what average method you use, you're still gonna get a high average regardless.

As for searches like the one the OP is talking about, the end user can easily disregard the stories with a small number of ratings, as they'll be able to see at a glance on the series' page that the story has few ratings, so this won't cause any issues. Once they look at the bigger stories, the standard method of calculating averages will serve its function just fine.

Essentially, the "problem" with standard average when taking ratings into account is just something that is applied to small stories that have yet to get a sizeable number of readers... And well, if you only wanna read stories with a high rating, it's trivial to exclude those stories anyways, so this ends up not being a problem.

So, the system works fine as is. Tony doesn't need to change anything.

And honestly, if he wanted to use Bayesian averages, he'd have already done that from the start, since he is aware of their existence. The fact that Scribblehub uses standard averages by default kinda points to Tony preferring this system.

If I were to make a guess, he made this choice probably to make it more intuitive for readers/authors and to avoid having people post "bug reports" about the average ratings not being calculated right, when they would just be using Bayesian averages instead of the standard averages, which would not be intuitive to the end user at all.

So yeah, current system is fine. Don't expect it to change.

defucilis · Oct 1, 2023

I think you might have misunderstood what I'm suggesting the problem is...

Just to illustrate: Here's what I get if I search for Adventure stories, sorted by Ratings:

Here is page 100:

And here is page 250:

You have to jump to page 294 to start viewing stories with average ratings of 4.9, at which point the number of views/favorites jumps way up since the average rating is now meaningful - although it's still extremely variable. So I wouldn't consider it 'trivial' to just ignore the stories with low numbers of views. And I also wouldn't consider a ranking system that requires you to skip past the first few hundred pages to be working as intended.

I definitely don't want to argue the point if you've made up your mind already, I just wanted to clarify that the sort-by-rating system is currently not useful. I'm personally happy with the site as is and will continue to use it happily, but wanted to draw your attention to an easy way to improve things. I've been having the issue where I want to view popular stories that are *also* rated highly. Sorting by popular/favourites tends to include stories that are super popular even if they're not all that good (often they're read a bunch because they appear at the top of the most-read list, a self-fulfilling prophecy), and sorting by rating is, as I've shown, not useful.

Regarding user intuition: Bayesian rating systems (or similar weighted ratings) are used across the vast majority of websites that display ratings, like IMDb, Amazon, Rotten Tomatoes, etc., so I don't think you'd need to worry about users being confused by it. Everyone intuitively understands that 99/100 people rating something five stars is worth more than 1/1 person rating something five stars.

AliceShiki · Oct 1, 2023

defucilis said:
You have to jump to page 294 to start viewing stories with average ratings of 4.9,

Only if you don't have any other filters.

Just put a few genres/tags filters in your search.

There is also a filter you can use to only show stories that have a set minimum of chapters. Use that too to get rid of the small stories that barely got any chapters.

And there. Problem solved.

greyblob · Oct 1, 2023

AliceShiki said:
Only if you don't have any other filters.

Just put a few genres/tags filters in your search.

There is also a filter you can use to only show stories that have a set minimum of chapters. Use that too to get rid of the small stories that barely got any chapters.

And there. Problem solved.

4.5 stars with 50 ratings
is different from
5.0 stars with 5 ratings

i think thats what OP means. seems like a good filter to me. nothing of what you say achieves this.

AliceShiki · Oct 1, 2023

greyblob said:
4.5 stars with 50 ratings
is different from
5.0 stars with 5 ratings

i think thats what OP means. seems like a good filter to me. nothing of what you say achieves this.

I understand what OP said, but I'm saying that it's trivial to exclude the series with 5 ratings from your search.

OP gave an example of a search with 0 filters to try making their point... But said search wouldn't be useful in the first place, because anyone using the Series Finder would already be using some Genres/Tags filter to get stories they actually care about.

And if you're someone that cares about rating, then you probably want the bigger stories that lots of people are reading and love, so you can easily put a restriction in your search to only show stories with 50+ chapters or whatever.

Once you do that you've basically automatically removed all the small stories that barely received any ratings from the equation, as it's basically guaranteed that those stories won't have reached a high chapter count... So your search by rating is now perfectly functional.

Which is why I'm saying that OP is complaining about a non-issue... In a realistic situation, the search by rating will always be able to get rid of the small stories that barely got rated by anybody, so it doesn't matter if a search with no filters shows a ton of small stories... Because nobody uses searches with no filters in the first place.

defucilis · Oct 2, 2023

I guess the point that I'm trying to make is that you shouldn't need to create a complex search with a ton of filters to see 'the top rated stories'. The actual sort-by-rating function isn't really doing much if you need to create a hugely complex filter to cull things that the sorting function is incorrectly putting at the top.

To replicate what I was basically hoping for in my sort-by-rating search (since I'm not really that fussy about genre, I just want to find high-quality stories that people really like), here's:

Action+Comedy stories
Excluding LitRPG stories
With ratings restricted to between 4.5 and 4.9 stars
With at least 10 ratings
With at least 50 chapters
With at least 10,000 favorites

The point I'm trying to make is that in this search (by this point pretty complex, and one that took me a few tries to iterate on and refine), the actual 'sort by rating' feature isn't really doing much of anything. I'm using all these other metrics to effectively stand in for what the rating of stories is meant to be.

I can tell I'm never going to convince you...I'm not even sure why I added this last post but oh well it's done now. :P
