What You Like — First Results

This graph represents the games of this IF Comp, cumulatively rated by quality, according to the data I’ve collected so far.  This ranking predicts how well the games will do in the Comp — sort of.  There are two problems.

[Figure: ifcomp game qualities]

The first problem is that I need more respondents to tighten up this thing’s precision.  Right now, it’s fuzzy.  I’m looking for fifteen more people to give me their opinions on the games.

The other issue is that currently all factors are being weighted equally.  After the Comp, the idea will be to find how strongly each factor correlates with the games’ final scores, and to find a coefficient for each factor.

Then (assuming this works) we’ll have a weighting for each quality, allowing us to determine how much value it adds to a game.  The interesting part, to my mind, will be if there’s no such uniform scale available — if judges evaluate games variably, based on how they categorize them.
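As a rough sketch of what that post-Comp step might look like, the snippet below compares the current equal-weight totals with a per-factor Pearson correlation against final scores. The factor names, ratings, and scores are all invented for illustration and do not come from the survey, and a plain correlation is only a first look: fitting actual weights would more likely take a least-squares regression over all factors at once.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: average survey rating per factor for four made-up games,
# plus made-up final Comp scores. None of these numbers are real.
factor_ratings = {
    "puzzles": [7.0, 4.0, 6.0, 3.0],
    "npcs": [5.0, 6.0, 4.0, 5.5],
    "writing": [8.0, 5.0, 7.0, 4.0],
}
final_scores = [7.5, 5.0, 6.5, 4.0]

# Current approach: every factor counts equally toward a game's total.
n_games = len(final_scores)
equal_weight_totals = [
    sum(ratings[i] for ratings in factor_ratings.values())
    for i in range(n_games)
]

# After the Comp: how strongly does each factor track the final score?
correlations = {
    name: pearson(ratings, final_scores)
    for name, ratings in factor_ratings.items()
}

for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}")
```

If the judges really do weigh factors differently depending on the kind of game, correlations computed over all games at once would likely come out weak, and running the same calculation separately on each group of games would be one way to check.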

All of this means:  If you want to help make this happen, then take 20 minutes and fill out the survey.

And feel free to nag other Comp judges into taking the survey, too.


Published on November 15, 2009 at 6:13 pm

7 Comments

  1. I’m not certain I really believe in this project, but I guess that it’s worth a shot. I filled in the survey. :)

  2. I guess those bars are the averages of the respective question? Any chance to have those averages as numbers, too? It’s a little hard to make out the exact size.

  3. Victor – Many thanks! Even if the project fails, it may fail interestingly.

    Hannes – I’ll publish all the data, in easily-downloadable form, after the Comp — if I get enough people that it has some validity. That’s fair, isn’t it?

    -edit- I’ve had 23 unique people view this post in the 90 minutes since it went up, so getting 15 more survey completions over the next 16 hours shouldn’t be that difficult.

  4. I definitely evaluated the games variably depending on how I characterized them (well, almost definitely; I suppose that an analysis of my scores could surprise me by showing that I actually did give a uniform weighting to the factors). For instance, I’m sure I ranked the NPCs higher for Snowquest than for Duel that Spanned the Ages, but Snowquest gets an overall ranking of good from me while Duel gets an overall ranking of very good because the NPCs aren’t as important in Duel. It’d be interesting to try to correlate the poll results with the individual judges’ overall scores for each game, but that might involve so many variables that you wouldn’t get anything meaningful out of it.

  5. Is it just me? I don’t even remember the names of the games, let alone how I would rate them on individual factors, let alone on a 1-10 scale. :) SORRY. I’d be happy to complete a survey, but at this point I’d practically have to play all the games again… I hope those with better memories can contribute.

  6. I don’t know how well this will work in general, but my own ratings for IF in general tend to be a product of two factors: (how well does this work accomplish its own goals?) x (how worthy/interesting were those goals?)

    So for instance, I thought Blue Lacuna had some important flaws both in its content and its pacing, and therefore less than full points on the “accomplishment” part — but I had very high regard for the many ambitious things it was trying to do. I wound up giving it a 4 on IFDB, though a 4.5 would probably be closer to right if that rating were available.

    On the other hand, a game can be very polished but with a more manageable set of goals. Suveh Nux is extremely polished but small; still, even in its small scope it’s doing something very cool. Five stars. Does that mean I think it’s more important than Blue Lacuna? Not really. Have I spent as much time thinking about it as I have about BL? No. But the scores I give games aren’t really a reflection of how significant a contribution they are to the development of IF, which is a different category again and not something you can put a numeric value on.

    The upshot of that is that, for me, the quality of the puzzles will matter in the score of a game that is about puzzles… and not in one that isn’t. I’ve rated some games high because of their excellent NPCs, but that doesn’t mean a game with no NPCs or less-interactive NPCs will get an automatic score drop as a result.

  7. Guys,

    I’m just stopping in briefly; I have a lot to do today, as I’m moving to another city.

    Matt and Emily,

    Thanks for the insights. I’m closely watching the score of _Byzantine Perspective_. If it does better than its cumulative scores indicate it ought to, and especially in light of your reports, I’ll look at sorting the games into two populations, to see if that has greater predictive power. Then we’ll see if that applies to pure puzzle games, like _Gleaming the Cube_.


    Yeah, sorry about that. Maybe next year I’ll set up a survey per game at the start of the Comp, so people can fill them out as they go.


    We’re at 20 survey completions. For this to have +/- 10% validity, in comparison to the number of people taking the Comp, I need 10 more survey completions in the next two hours.

    Looks like we’re not going to make it.

