Why Steam's thumbs up/thumbs down system works better than star ratings

Steam asks players a simple yes-or-no question. Not “rate this game from 1 to 5 stars” but “Would you recommend this game?”

Nearly all major platforms, such as Amazon, Google Play, and Yelp, use star ratings. Next to them, Steam’s review system can look less precise and less comprehensive.

But here’s the thing: it gives you better information for the decision you’re actually trying to make.

Answering the real question players have

When someone browses Steam, they’re not asking “is this a 7.4 or a 7.8?” They’re asking “should I buy this game or not?”

That’s already a yes/no question. You either spend your money and time on it, or you don’t. Instead of making reviewers translate their experience into a number, then making you translate that number back into a purchase decision, Steam just asks directly.

And the phrasing “Would you recommend this game to other players?” nudges reviewers toward being more objective: they have to think about other players, not just their own taste.

Reviews that answer the right question help people decide faster. The clarity helps with impulse buys, especially during sales - you see “Very Positive”, “85% of players recommend” and just grab it.

Star ratings are not that precise

Five-star systems look more precise but they’re actually just noisy.
Everyone uses the scale differently. For some people, 3 stars means “this was bad.” For others, 3 stars is genuinely “it was fine, nothing special.” Some people give 4 stars to anything good and save 5 for masterpieces. Others throw 5 stars at anything they didn’t hate.

Same numbers, completely different meanings.

Binary sidesteps this mess. The game either cleared the bar for you or it didn’t. There’s no ambiguous middle to misinterpret. And if you’re uncertain? The framing nudges you to just not leave a review.
When Steam says “78% of players recommend this,” you know what that means. When something says “4.1 stars average,” you have to do math in your head about what different star ratings mean to different people.
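
Here’s a rough Python sketch of that noise. Everything in it is made up for illustration: the two personal scales (GENEROUS and STRICT), the summarize helper, and the ten verdicts. It assumes the same eight-out-of-ten players liked the game and only varies how they translate that into stars.

```python
from statistics import mean

# Hypothetical personal scales: how each kind of reviewer turns
# "liked it" (True) / "didn't like it" (False) into stars.
GENEROUS = {True: 5, False: 3}
STRICT = {True: 4, False: 2}


def summarize(verdicts, scale):
    """Return (star average, share recommending) for a list of bool verdicts."""
    stars = [scale[v] for v in verdicts]
    return mean(stars), sum(verdicts) / len(verdicts)


# Invented data: 8 of 10 players liked the game.
verdicts = [True] * 8 + [False] * 2

generous_avg, generous_share = summarize(verdicts, GENEROUS)
strict_avg, strict_share = summarize(verdicts, STRICT)

print(f"generous crowd: {generous_avg} stars, {generous_share:.0%} recommend")
print(f"strict crowd:   {strict_avg} stars, {strict_share:.0%} recommend")
```

Under these assumptions the star average moves by a full point depending on who happened to review the game, while the recommend share stays at 80% both times.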

“Okay” games are basically failures

Here’s where games are different from most products: an “okay” game is way worse than an “okay” blender.

A 3-star blender still blends things. You can buy something that’s 60% as good as the best option and be reasonably happy, especially if it’s 60% the price.

But a game that’s “just okay”? That doesn’t mean you got 60% of the fun. It usually means you spent hours figuring out you’re not really having fun, wondering if it gets better, then eventually giving up and feeling vaguely annoyed about the whole thing.

Time is the resource you can’t get back. People don’t want to “find out” something is mid. They want clear signal upfront: is this worth my time or not?

Binary ratings push people away from leaving lukewarm middle ratings. You either recommend it or you don’t. This is actually more useful.

This changes what gets made

Binary ratings punish mediocrity and reward distinctiveness.

With stars, a game that’s “pretty good for everyone” might get 4.0 stars. Safe, acceptable, invisible. A weird niche game that some people absolutely love but others bounce off? Also 4.0 stars - the 5s from fans average out with the 2s from people who didn’t get it.
Same score, but the two games imply very different purchase decisions.
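
To make the averaging concrete, here’s a small Python sketch. The review counts are invented: one game where everyone gives 4 stars, and one where two thirds of reviewers give 5 and the rest give 2. The standard deviation is only there to show what the average hides.

```python
from statistics import mean, pstdev

# Invented review sets, not real data.
broad = [4] * 90               # "pretty good for everyone"
niche = [5] * 60 + [2] * 30    # loved by fans, bounced off by the rest

broad_avg, niche_avg = mean(broad), mean(niche)
broad_spread, niche_spread = pstdev(broad), pstdev(niche)

print(f"broad: {broad_avg:.1f} stars, spread {broad_spread:.2f}")
print(f"niche: {niche_avg:.1f} stars, spread {niche_spread:.2f}")
```

Both averages come out to 4.0, but the niche game’s ratings are spread out by about 1.4 stars while the broad game’s don’t vary at all. The single number on the store page can’t tell you which of the two you’re looking at.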

With binary ratings, the broad-but-mediocre game is screwed. You can’t hide in the comfortable 3.8-4.2 zone. But the niche game can score really well with its actual audience, because those players enthusiastically recommend something that felt like it was made for them.

The system pushes developers toward “be great for someone” instead of “be okay for everyone.” Hollow Knight isn’t for everyone. Neither is Dwarf Fortress. These games found their audiences by NOT trying to please everybody. Binary ratings make that strategy actually work.

Why not a “maybe” option?

Wouldn’t three options be better than two?

No, because now buyers have to figure out what someone else’s “maybe” would actually mean for them. You’re back to endless deliberation.

Steam has a better solution: the two-hour refund window (you can get a refund if you’ve played for less than two hours and bought the game within the past 14 days). Instead of capturing ambivalence in ratings, Steam just lets you try the game. Buy it, play it, refund it if it sucks. You get your own answer instead of trying to decode other people’s uncertainty.

How Steam adds nuance

Binary alone would be too simple. Steam layers stuff on top:

Labels for different percentages: 95%+ is “Overwhelmingly Positive,” 80-94% is “Very Positive,” and so on down the scale. The stronger labels also require a minimum number of reviews, so a handful of friends can’t make a tiny game look like a hit.

Recent vs. All-Time reviews: Shows you if a game got worse (or better) after patches. If you see “Recent: Mixed” but “All Time: Very Positive,” you know something changed. What you buy today might be substantially different from what existed at launch.

Helpful votes on reviews: The binary captures direction. Nuance lives in the actual written reviews. Steam boosts well-written reviews through community voting.

Language filters: A game might work great in one culture but not another. Filtering reviews by language shows you the verdict of a specific audience rather than a global average.
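
The label and recent-vs-all-time layers are easy to sketch in Python. Treat this as an illustration, not Steam’s actual logic: the 95% and 80% cutoffs come from the text above, but the function name review_label, the review minimums (50 and 500), the cutoffs below 80%, and all the review counts are assumptions made for the example.

```python
def review_label(positive, total, min_very=50, min_overwhelming=500):
    """Map thumbs-up counts to a summary label.

    The 95% and 80% cutoffs follow the article; the review minimums
    and every cutoff below 80% are assumed for illustration.
    """
    if total == 0:
        return "No reviews"
    share = positive / total
    # Stronger labels need both a high share and enough reviews.
    if share >= 0.95 and total >= min_overwhelming:
        return "Overwhelmingly Positive"
    if share >= 0.80 and total >= min_very:
        return "Very Positive"
    if share >= 0.80:
        return "Positive"
    if share >= 0.70:   # assumed cutoff
        return "Mostly Positive"
    if share >= 0.40:   # assumed cutoff
        return "Mixed"
    return "Mostly Negative or worse"   # lower tiers left out of this sketch


# Invented counts: a game with a strong history and a rough recent patch.
all_time = review_label(positive=9_000, total=10_000)   # 90% overall
recent = review_label(positive=120, total=200)          # 60% lately
friends_only = review_label(positive=10, total=10)      # 100%, but only 10 reviews

print(f"All Time: {all_time} | Recent: {recent}")
print(f"Ten friends: {friends_only}")
```

With these numbers the game shows “All Time: Very Positive” next to “Recent: Mixed,” and the ten-review game tops out at “Positive” despite a perfect score, which is the point of the minimum-review rule.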

Why Amazon can’t just switch

If binary is so good, why doesn’t Amazon use it?

Because shopping is different. When you search “USB-C cable” you get a page of basically identical products. Your first question isn’t “should I buy a USB-C cable?” It’s “which of these twelve identical-looking options will disappoint me least?” Stars help you scan a list of clones.

But games aren’t like that. You discover them through recommendations, not search. And each good game is genuinely different from the others - you’re not picking between twelve identical products. Binary works because you’re deciding on one specific game, not scanning clones.

Also, physical products fail differently. A “just okay” cable still charges your phone. The functional baseline is more forgiving. The gap between “good” and “great” actually matters when you’re trying to sort.

Why services need stars

Services (plumbers, restaurants, contractors) seem like they could use binary ratings, but they can’t.

Imagine a plumber who fixes your leak but shows up 40 minutes late and tracks mud through your house. What do you do with thumbs up/down?

You probably say “recommend” because saying “don’t recommend” signals “this person is incompetent or a scammer.” The job got done. You’re not going to tank someone’s livelihood over muddy carpets. But the meticulous plumber who’s on time and cleans up also gets a thumbs up. The system can’t tell them apart.

Stars fix this. Muddy plumber gets 3-4 stars. Meticulous plumber gets 5. Now the platform can rank them.

Also, the same underlying score reads very differently in each format. A 4.2-star average is 84% of the maximum, yet:

84% thumbs up = a 16% total failure rate. Terrifying.
4.2 stars = competent, but with some rough edges. Workable.

For services, the question isn’t just “did they do the job?” but also “how well did they do it?” and “how was the experience?” Stars let you answer these.

What about mobile games?

Mobile stores use stars. Why doesn’t binary work there?

PC gaming is deliberate. You sit down and commit your focus for an extended period. It’s like watching a movie. This means games need to be really good to be worth it.

Mobile gaming is different - passive, ambient. Playing while waiting in line, half-watching TV, killing time. The question shifts from “is this worth my focused attention?” to “is this roughly functional and not annoying?”

Plus mobile free-to-play games make money through retention and microtransactions, not upfront purchases. The “should I buy this?” framing that makes Steam’s system work doesn’t apply when there’s nothing to buy upfront.

Binary ratings work when the decision feels consequential. PC games meet that bar. Many mobile games don’t.

Bottom line

Rating systems work best when they match the actual decision people need to make.

Steam’s funnel is: browse → buy → don’t refund. That’s binary, so the ratings are binary. The question reviewers answer matches the question buyers are asking.

For stuff like games where mediocrity feels like failure, where “just okay” means wasted hours you’ll never get back, the binary question is the right one.