Dominion Strategy Forum


Author Topic: working on a new stats database - what questions do you want answered  (Read 9358 times)

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
+15

I am working on a log scraper/database tool for goko and am interested in what kind of stats questions people may have. Ultimately there will hopefully be a Councilroom style web interface, but don't get too excited as that is a long ways off. Progress is slow so far due to both the complexities of Dominion and goko's inconsistent logs... When complete I should be able to answer all kinds of things as I am capturing each turn, not just kingdom cards and players. To help get the data structure right, I am asking for the type of questions that you are curious about - they can be specific or vague. For example, how often does the player with more actions played win? What is the damage done by card x missing the 2nd shuffle? What is the most VP scored in a loss? And so on. Some things are a lot harder to code for than others (CR stuff is mostly easy), and some things may be impossible to answer if not pre-planned for, so I'm asking for feedback in advance. Thanks!
Logged

Awaclus

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 11808
  • Shuffle iT Username: Awaclus
  • (´。• ω •。`)
  • Respect: +12846
    • View Profile
    • Birds of Necama
Re: working on a new stats database - what questions do you want answered
« Reply #1 on: September 02, 2014, 06:09:53 pm »
0

First player advantage for each card would be pretty cool. Assuming that you're going to implement the CR features already.
Logged
Bomb, Cannon, and many of the Gunpowder cards can strongly effect gameplay, particularly in a destructive way

The YouTube channel where I make music | Download my band's Creative Commons albums for free

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
Re: working on a new stats database - what questions do you want answered
« Reply #2 on: September 02, 2014, 06:10:59 pm »
+10

One suggestion on implementation.  Don't worry about working around all the bugs/craziness you find in the logs.  Detect a problem and just throw the whole game away rather than adding hacks to the parsing code.  Aim to parse, say, 95% of the logs that are reasonably well formed.  Keep track of how many game logs you are discarding, but be willing to lose a few.

Writing log parsers and working around bugs sucks.
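
A minimal sketch of the "detect a problem and throw the game away" approach described above. The log shape here (a "game start" header and a "game over" footer) and all names such as parse_game and ParseStats are hypothetical, not goko's real format; the point is only the strict parser plus the discard counter.

```python
from dataclasses import dataclass


class MalformedLogError(Exception):
    """Raised when a log does not match the expected shape."""


@dataclass
class ParseStats:
    parsed: int = 0
    discarded: int = 0

    @property
    def discard_rate(self) -> float:
        total = self.parsed + self.discarded
        return self.discarded / total if total else 0.0


def parse_game(raw_log: str) -> dict:
    """Hypothetical strict parser: reject anything unexpected, no workarounds."""
    lines = [ln for ln in raw_log.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("game start"):
        raise MalformedLogError("missing header")
    if not lines[-1].startswith("game over"):
        raise MalformedLogError("missing footer")
    return {"turn_lines": lines[1:-1]}


def parse_all(raw_logs):
    """Parse every log, dropping whole games that fail and counting the drops."""
    stats = ParseStats()
    games = []
    for raw in raw_logs:
        try:
            games.append(parse_game(raw))
            stats.parsed += 1
        except MalformedLogError:
            stats.discarded += 1
    return games, stats
```

Watching discard_rate over time shows whether the parser is still within the "lose a few" budget or whether a new log quirk needs real attention.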
Logged

silverspawn

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5301
  • Shuffle iT Username: sty.silver
  • Respect: +3188
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #3 on: September 02, 2014, 06:20:03 pm »
0

First player advantage for each card would be pretty cool. Assuming that you're going to implement the CR features already.
agreed

I don't think the "when does more action played win" is particularly interesting, since low level play will dominate here, and it doesn't really tell much at all.

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
Re: working on a new stats database - what questions do you want answered
« Reply #4 on: September 02, 2014, 06:28:26 pm »
+11

One thing to learn from CR.

Everyone second guesses the global stats because there are lots of games from crappy players.

Pick some decent heuristic metric for "good player", (say, in top 200 at time of game), and in addition to the global stats, provide stats just from that subpool of players.
Logged

Beyond Awesome

  • Global Moderator
  • *****
  • Offline Offline
  • Posts: 2941
  • Shuffle iT Username: Beyond Awesome
  • Respect: +2466
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #5 on: September 02, 2014, 06:53:48 pm »
0

One thing to learn from CR.

Everyone second guesses the global stats because there are lots of games from crappy players.

Pick some decent heuristic metric for "good player", (say, in top 200 at time of game), and in addition to the global stats, provide stats just from that subpool of players.

Second this. Also, use the isotropish leaderboard to determine top 200, not the Goko board for obvious reasons.

First turn advantage of cards as has been said.

Plus win rate with a card vs. not getting said card

Also, data on opening splits
Logged

theblankman

  • Witch
  • *****
  • Offline Offline
  • Posts: 461
  • Respect: +383
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #6 on: September 02, 2014, 07:15:08 pm »
+1

Plus win rate with a card vs. not getting said card
Along the same lines, is there significant correlation between winning and:
- gaining your first copy of a card before opponent
- gaining more copies of a card than opponent
- playing a card for the first time before opponent
- playing a card more times than opponent

(Hypothesis: These will all turn out heavily in favor of my least favorite card: Cultist :) )
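
As a rough sketch of the first item in that list, the function below computes how often the player who gained a card first went on to win. The game record layout ("winner", "gains" as (turn, player, card) tuples in log order) is an assumption made for illustration, and a raw win rate like this is only a starting point, not a significance test.

```python
def first_gain_win_rate(games, card):
    """Fraction of games won by whoever gained `card` first.

    Each game is a dict with hypothetical keys:
      "winner": player name
      "gains":  list of (turn, player, card) tuples in log order
    Games in which nobody gained the card are skipped.
    Returns None if no game qualifies.
    """
    wins = total = 0
    for game in games:
        first = next(
            (player for _turn, player, c in game["gains"] if c == card),
            None,
        )
        if first is None:
            continue
        total += 1
        wins += first == game["winner"]
    return wins / total if total else None
```

The other three items (more copies gained, first play, more plays) would follow the same pattern with a different per-game comparison.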
Logged
it's a shame that full-random is the de facto standard

Awaclus

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 11808
  • Shuffle iT Username: Awaclus
  • (´。• ω •。`)
  • Respect: +12846
    • View Profile
    • Birds of Necama
Re: working on a new stats database - what questions do you want answered
« Reply #7 on: September 02, 2014, 07:36:32 pm »
+1

Plus win rate with a card vs. not getting said card
Along the same lines, is there significant correlation between winning and:
- gaining your first copy of a card before opponent
- gaining more copies of a card than opponent
- playing a card for the first time before opponent
- playing a card more times than opponent

(Hypothesis: These will all turn out heavily in favor of my least favorite card: Cultist :) )
Actually the "first time before opponent" stats could be graphs where X is the turn the card is first bought/gained and Y is the advantage it gives for the player who did it.
Logged

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #8 on: September 02, 2014, 07:53:16 pm »
0

One thing to learn from CR.

Everyone second guesses the global stats because there are lots of games from crappy players.

Pick some decent heuristic metric for "good player", (say, in top 200 at time of game), and in addition to the global stats, provide stats just from that subpool of players.

Yes, that had occurred to me.  I will do one of two things - either I'll retroactively calculate the TrueSkill ratings myself or I'll use the current Iso leaderboard.  The first is a bit more work, but would allow filtering based on a player's rating at game time, which should be more accurate than current rating.  Either way, it'll be something that will be a parameter that can be adjusted to preference.
Logged

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #9 on: September 02, 2014, 07:57:20 pm »
0


Also, data on opening splits

Do you mean win % with different opening pairs, win rate with 5/2 vs. 4/3, or something else?
Logged

Beyond Awesome

  • Global Moderator
  • *****
  • Offline Offline
  • Posts: 2941
  • Shuffle iT Username: Beyond Awesome
  • Respect: +2466
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #10 on: September 02, 2014, 08:00:49 pm »
+1


Also, data on opening splits

Do you mean win % with different opening pairs, win rate with 5/2 vs. 4/3, or something else?

I mean both of what you just said.
Logged

theblankman

  • Witch
  • *****
  • Offline Offline
  • Posts: 461
  • Respect: +383
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #11 on: September 02, 2014, 08:38:45 pm »
+1

Plus win rate with a card vs. not getting said card
Along the same lines, is there significant correlation between winning and:
- gaining your first copy of a card before opponent
- gaining more copies of a card than opponent
- playing a card for the first time before opponent
- playing a card more times than opponent

(Hypothesis: These will all turn out heavily in favor of my least favorite card: Cultist :) )
Actually the "first time before opponent" stats could be graphs where X is the turn the card is first bought/gained and Y is the advantage it gives for the player who did it.
Another possibility for X is number of turns between player A's first gain or play and B's.  This might be instructive in a case like Mercenary, i.e. is it a bigger deal if I play Merc three turns before you vs one turn. 

Meanwhile I thought of another: Correlation between the presence of a given card in a kingdom, and the variety of cards gained during games, i.e. which cards statistically lead to high-variety or low-variety decks (I have some suspects, like Cultist and Rebuild for low variety, Fairgrounds for high variety, but in this kind of analysis the surprises are often the interesting part). 
Logged

Awaclus

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 11808
  • Shuffle iT Username: Awaclus
  • (´。• ω •。`)
  • Respect: +12846
    • View Profile
    • Birds of Necama
Re: working on a new stats database - what questions do you want answered
« Reply #12 on: September 02, 2014, 08:54:30 pm »
0

Plus win rate with a card vs. not getting said card
Along the same lines, is there significant correlation between winning and:
- gaining your first copy of a card before opponent
- gaining more copies of a card than opponent
- playing a card for the first time before opponent
- playing a card more times than opponent

(Hypothesis: These will all turn out heavily in favor of my least favorite card: Cultist :) )
Actually the "first time before opponent" stats could be graphs where X is the turn the card is first bought/gained and Y is the advantage it gives for the player who did it.
Another possibility for X is number of turns between player A's first gain or play and B's.  This might be instructive in a case like Mercenary, i.e. is it a bigger deal if I play Merc three turns before you vs one turn. 

Meanwhile I thought of another: Correlation between the presence of a given card in a kingdom, and the variety of cards gained during games, i.e. which cards statistically lead to high-variety or low-variety decks (I have some suspects, like Cultist and Rebuild for low variety, Fairgrounds for high variety, but in this kind of analysis the surprises are often the interesting part).
I'm pretty sure Cultist results in high variety, as it adds up to 5 unique cards to the game and usually puts them all in the players' decks, and works well in engines. Rebuild is definitely one that leads to low variety, and Scout and others that basically remove one card slot from the kingdom by taking it and being completely useless.

One thing to learn from CR.

Everyone second guesses the global stats because there are lots of games from crappy players.

Pick some decent heuristic metric for "good player", (say, in top 200 at time of game), and in addition to the global stats, provide stats just from that subpool of players.

Yes, that had occurred to me.  I will do one of two things - either I'll retroactively calculate the TrueSkill ratings myself or I'll use the current Iso leaderboard.  The first is a bit more work, but would allow filtering based on a player's rating at game time, which should be more accurate than current rating.  Either way, it'll be something that will be a parameter that can be adjusted to preference.
And I guess for some (all?) of these stats, it shouldn't count if only one player is "good".
« Last Edit: September 02, 2014, 09:00:50 pm by Awaclus »
Logged

SCSN

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2227
  • Respect: +7140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #13 on: September 02, 2014, 09:04:26 pm »
+4

It would be really nice if you could associate with each log the rating of both players before the game so that you can calculate the predicted win % (the isotropish code is on AI's github I think).

Once you have that you can calculate the "skill factor" of each card: for each game containing card A you assign the value "predicted win% of player who won" to that game. Then just take the average of that number over all kingdoms containing A. For a high-skill card like King's Court or Golem this number will be a lot higher than for a card that levels the playing field like Familiar or Cultist. I think this method is better than simply counting the number of upsets as it takes into account the magnitude of an upset.

Ideally I'd like a way to dynamically choose what data to include for those numbers, e.g. only games played between players who both have at least a specifiable rating.
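
The "skill factor" described above can be sketched as follows. The record fields ("kingdom", "winner", "predicted_win", "ratings") are hypothetical, and the predicted win probabilities are assumed to have been computed beforehand from the pre-game ratings; this sketch does not implement the rating system itself.

```python
def skill_factor(games, card, min_rating=None):
    """Average pre-game predicted win probability of the eventual winner,
    taken over all games whose kingdom contains `card`.

    Each game is a dict with hypothetical keys:
      "kingdom":       set of card names
      "winner":        player name
      "predicted_win": {player: pre-game win probability}
      "ratings":       {player: pre-game rating}
    If min_rating is given, only games where every player meets it count.
    Returns None if no game qualifies.
    """
    values = []
    for game in games:
        if card not in game["kingdom"]:
            continue
        if min_rating is not None and min(game["ratings"].values()) < min_rating:
            continue
        values.append(game["predicted_win"][game["winner"]])
    return sum(values) / len(values) if values else None
```

A value near 0.5 would suggest the card levels the field, while a higher value means favorites tend to win; the min_rating argument is one way to get the dynamic filtering asked for in the last paragraph.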
Logged

theblankman

  • Witch
  • *****
  • Offline Offline
  • Posts: 461
  • Respect: +383
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #14 on: September 02, 2014, 11:14:58 pm »
0

I'm pretty sure Cultist results in high variety, as it adds up to 5 unique cards to the game and usually puts them all in the players' decks, and works well in engines. Rebuild is definitely one that leads to low variety, and Scout and others that basically remove one card slot from the kingdom by taking it and being completely useless.
I suppose it technically does, but anecdotally I think it causes "low variety strategies" where both players mainly want to buy and play lots of Cultists.  In that particular case I'd amend the criterion from just "gained" to "gained by choice, not due to an attack played by an opponent."  Or "gained during the player's turn," which is probably close enough and easier to implement. 
Logged

Beyond Awesome

  • Global Moderator
  • *****
  • Offline Offline
  • Posts: 2941
  • Shuffle iT Username: Beyond Awesome
  • Respect: +2466
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #15 on: September 03, 2014, 12:08:36 am »
+1

Maybe the impact of the opening cards missing the first shuffle can also be calculated.
Logged

Hydrad

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 142
  • Shuffle iT Username: Hidrad
  • Respect: +109
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #16 on: September 03, 2014, 12:13:28 am »
0

This feels like it will take a really long time to code all of these statistics in.
Logged
For anyone else, such a statement would be a scum tell.  For Hydrad, it's simply a tell that you're reading something from Hydrad.

Beyond Awesome

  • Global Moderator
  • *****
  • Offline Offline
  • Posts: 2941
  • Shuffle iT Username: Beyond Awesome
  • Respect: +2466
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #17 on: September 03, 2014, 12:21:54 pm »
0

This feels like it will take a really long time to code all of these statistics in.

True. Well, whatever is easiest to code first, I guess. I am curious to see just about any stats of DA and Guilds cards.
Logged

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #18 on: September 03, 2014, 01:12:37 pm »
+4

Thanks for all the feedback so far. It will be very helpful as development continues. I especially like the idea of using expected win %, as I think that will generate better ideas of card/opening/whatever strength.
 
This feels like it will take a really long time to code all of these statistics in.

The key is to intelligently set up the database from the beginning. Then it's just a matter of figuring out the query necessary to get the info you want. Something like adding player ratings is more work, but just once and then it can be tied to every stat.

don't get too excited as that is a long ways off. Progress is slow so far

As of this morning 96/204 action cards are fully implemented and tested after about 1 month of work. That is not to say it is only half done - new cards get easier and easier to add as new effects get rarer. I'd guess the scraping is 2/3rds done. Then answers can start being generated as the web interface is built and back data filled in.
Logged

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #19 on: September 09, 2014, 07:12:36 pm »
+4

Update: 120 action cards and most Kingdom treasures now implemented.  Counterfeit proved to be especially difficult, requiring several hours of effort to get right... While doing Counterfeit I came across the following game, with the craziest turn yet tested: http://gokosalvager.com/static/logprettifier.html?20140721/log.50a4d41ee4b03214bb781b08.1405960866300.txt#1-16

Among other things, it includes a 40 coin play of Diadem.

Logged

soulnet

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2142
  • Respect: +1751
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #20 on: September 10, 2014, 12:53:31 pm »
+1

There are probably a lot of people here who can code SQL (like me), and a significant percentage who either do not want to or cannot code a parser, host a web service, code a UI, etc. (like me).

So a good way to have a lot of statistics up and going in a short period of time would be:

Build the original database of game summaries with some of the data points and well indexed.
Have some sort of simple social network: allow people to write SQL select queries and save those queries, similar to what Dominiate allows for strategies. Then, each person can just play with the SQL and the favorites get saved and displayed nicely.
Maybe add some prewritten nice queries or predicates to be used in queries, like "consider only games among top 100 players" or something like that.
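
To make the three suggestions above concrete, here is a small sketch using Python's built-in sqlite3 module. The schema (game, kingdom_card), the worst_rank column, the top100_game view, and the saved-query registry are all invented for illustration; the real database layout is not described in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game (
    game_id    INTEGER PRIMARY KEY,
    winner     TEXT NOT NULL,
    worst_rank INTEGER NOT NULL   -- lowest leaderboard placing among the players
);
CREATE TABLE kingdom_card (
    game_id INTEGER NOT NULL REFERENCES game(game_id),
    card    TEXT NOT NULL
);
-- Index so "games containing card X" lookups stay cheap.
CREATE INDEX idx_kingdom_card ON kingdom_card(card, game_id);
-- A prewritten predicate: only games where every player is in the top 100.
CREATE VIEW top100_game AS
    SELECT * FROM game WHERE worst_rank <= 100;
""")

# Tiny made-up data set, just enough to run a query against.
conn.executemany("INSERT INTO game VALUES (?, ?, ?)",
                 [(1, "A", 50), (2, "B", 500), (3, "A", 80)])
conn.executemany("INSERT INTO kingdom_card VALUES (?, ?)",
                 [(1, "Mine"), (2, "Mine"), (3, "Village")])

# User-contributed SELECT statements, saved under a name.
SAVED_QUERIES = {
    "top100_games_with_card": """
        SELECT COUNT(*)
        FROM top100_game AS g
        JOIN kingdom_card AS k ON k.game_id = g.game_id
        WHERE k.card = ?
    """,
}


def run_saved(name, params=()):
    """Run a saved query by name with bound parameters."""
    return conn.execute(SAVED_QUERIES[name], params).fetchall()
```

For example, run_saved("top100_games_with_card", ("Mine",)) counts the top-100 games whose kingdom included Mine. Exposing the predicate as a view lets contributors reuse it without repeating the rank condition in every query.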

I would spend some time coding SQL for this, but I do not have time to learn and use the skills necessary for all the rest.

EDIT: If you worry about security, you can consider password-protecting adding custom SQL queries and sharing the password only with specific people you trust from the forums (or from somewhere else).
« Last Edit: September 10, 2014, 12:54:34 pm by soulnet »
Logged

blueblimp

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2849
  • Respect: +1559
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #21 on: September 10, 2014, 02:04:57 pm »
+2

Or just make available for download some kind of aggregated parsed game logs in a structured format. The two most imposing things for doing analysis on the logs, IMO, are collecting the game logs and parsing them.

On the original topic of the thread, I thought the most interesting and unexplored statistics in councilroom were the conditioned-gain and conditioned-buy statistics, such as "in what proportion of games with both Fool's Gold and Mine in the kingdom does a particular player gain Mine at some point".
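
A conditioned-gain statistic of that kind could be computed along these lines. The record layout ("kingdom", "gains_by_player") is an assumption for the sketch, not councilroom's actual data model.

```python
def conditioned_gain_rate(games, condition_cards, target_card):
    """Share of player-seats, in games whose kingdom contains every card in
    `condition_cards`, where that player gained `target_card` at some point.

    Each game is a dict with hypothetical keys:
      "kingdom":         iterable of card names
      "gains_by_player": {player: set of card names gained}
    Returns None if no game matches the condition.
    """
    condition = set(condition_cards)
    gained = seats = 0
    for game in games:
        if not condition <= set(game["kingdom"]):
            continue
        for cards in game["gains_by_player"].values():
            seats += 1
            gained += target_card in cards
    return gained / seats if seats else None
```

With the example from the post, conditioned_gain_rate(games, ["Fool's Gold", "Mine"], "Mine") gives the proportion of players who picked up Mine when both cards were available.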
« Last Edit: September 10, 2014, 02:09:47 pm by blueblimp »
Logged

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #22 on: September 19, 2014, 03:45:12 pm »
+3

Update: About 160 action cards are now implemented, minus a small list of special interactions/scenarios I have to track down logs for to see how goko handles them. I estimate the parser part of the project should be done the first week of October.
Logged

Polk5440

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1708
  • Respect: +1788
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #23 on: September 20, 2014, 09:11:34 am »
0

So, what is the benefit from doing this from scratch? It's easier? Did you try to talk to the Council Room guys to see where they are at, or whether you could actually get Council Room up and running without having to start over?
Logged

ThaddeusB

  • Young Witch
  • ****
  • Offline Offline
  • Posts: 133
  • Respect: +140
    • View Profile
Re: working on a new stats database - what questions do you want answered
« Reply #24 on: September 20, 2014, 04:49:14 pm »
0

So, what is the benefit from doing this from scratch? It's easier? Did you try to talk to the Council Room guys to see where they are at, or whether you could actually get Council Room up and running without having to start over?

Isotropic and Goko logs are completely different, so there isn't really any option other than to start from scratch as far as a parser goes. I am also enabling different stats by tracking each hand card by card.
Logged