Dominion Strategy Forum

Messages - Holger

576
Dominion General Discussion / Re: Interview with Donald X.
« on: April 17, 2014, 08:14:05 am »
Is there a specific reason why "+3 Actions" and "+2 Buys" were only used on a single card each (Crossroads and Squire), or was it a coincidence?

A "super Village" giving +3 Actions each time you play it might work at $4 (or $5?), I think. (Though it wouldn't be too interesting without some extra twist.)
+2 Buys isn't that much better than +1 Buy, but it might have been an interesting boost e.g. for Woodcutter (and/or Nomad Camp) to have a second Buy. (This would have made Woodcutter/Gardens quite strong, but still far weaker than Beggar/Gardens, and probably also weaker than Squire/Gardens, so it should be an okay combo.)

577
I'm unsure whether I prefer daily or gamely uncertainty decreases myself (or both, like Goko does).

Correct me if I'm wrong, but if I remember the TrueSkill documentation correctly, the per game uncertainty models the idea that the skill you play with for that game is itself drawn from a distribution. One never plays with a fixed underlying skill. For example, I may be watching tv, be under the weather, distracted by something outside, etc. These are factors separate from luck within the game itself. [The modeling assumption is that the parameter (beta, I think?) that describes the distribution from which you "draw" your skill every game is the same for every player.]
I think beta does account for the luck of the game, not a "skill distribution". Either way, there is a separate parameter gamma, which does systematically increase the uncertainty once for each game. It's this parameter which allows for rating decreases after a win.

Quote
The daily uncertainty increase is an artificial way of 1) encouraging play and decreasing leaderboard camping and/or 2) crudely modeling skill depreciation over time.

Agreed; but skill depreciation can also be "crudely" modeled by an uncertainty increase once per game instead, as the original TrueSkill algorithm does (and Goko too, in addition to its daily increase).


Quote
Quote
I do agree with michaeljb that the rating shouldn't drop after win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088 ).

Edit: added link.

Again, ranking systems are not achievement systems.

mu - 2*sigma is just one way to represent a two-parameter system as one number to create a leaderboard. If you prefer not having a rating decline purely because of uncertainty increase, then consider preferring a leaderboard based only on mu rather than changing the system itself.
I wouldn't mind a leaderboard based only on mu (if it doesn't lead to new players getting to #1 with 10 lucky wins); but on Goko, the leaderboard effectively IS the system because they don't publish mu and sigma separately (let alone the estimated win probabilities). And to me it makes no sense at all to decrease the rating for a win, no matter whether you consider the leaderboard as an "achievement system" or not. FWIW, the Goko leaderboard is used as a ranking system with the Salvager extension (requiring e.g. "4000+" opponents), although I'd consider it an achievement system due to the subtracted 2*sigma.


Quote
Ideally, the leaderboard/rating decline from a win against a weak player wouldn't necessarily impact the quality of matchmaking, either. I think Microsoft tries to match players on the highest expected probability of a draw (not the same thing as closest rank on the leaderboard). With good matchmaking, rating declines should almost never happen, anyway.

Certainly the probability of a rating decline also depends on the TrueSkill parameters, not only on good matchmaking. With the high number of complaints about it, it seems to have occurred quite frequently on Goko. (Goko does seem to have good "bot matchmaking", always choosing the bot closest to the rating as an opponent when starting a "Play bots" game. This didn't prevent the quoted Serf Bot matchmaking from being a rating trap.)

578
Just two nit-pickings: It was trisha_brooke, not Mr.Griggs, who told us the μ and σ details in the linked post (unless Griggs is the "engineer" she referred to).

Ah.  Yes, I think you're right.  I had assumed it was all part of the Q&A.

What is your second nit-pick?  ;)

There's no other, just that it should twice read Trisha instead of Griggs.  ;)

579
Great to have the elusive formula at last. 

  • μ can't go below 0 or above 10k: This one's real, at least for the Casual system.  But μ=0 in Pro mode is so horrifically bad that not one player has bumped up against the limit.  Same story for the alleged upper bound at 10k.

Not even Serf Bot, which has Isotropish mu=-18.7? Since Serf Bot has over 4,000 Pro games, this might make a difference for many weak players...

Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win.  If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ.  So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

As I understand it, the gamma (aka tau) parameter gives you an increase in uncertainty before every game that's meant to model the possibility that your skill has changed.  And I think you're right that it's really doing the wrong thing when you beat a much lower rated player... it's a compromise.  Without gamma, your sigma plummets and you can end up with a rating that lags your evolving skill level.
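
To make the effect concrete, here is a minimal sketch of the standard two-player TrueSkill update (no draws) with the per-game gamma step applied first. The value sigma0 = 2250 is the one quoted in this thread; BETA = sigma0/2 and GAMMA = sigma0/100 are assumed defaults, not confirmed Goko values, and the two example players are made up.

```python
import math

SIGMA0 = 2250.0          # initial sigma quoted in this thread
BETA = SIGMA0 / 2.0      # assumption: common default, not a confirmed Goko value
GAMMA = SIGMA0 / 100.0   # assumption: common default, not a confirmed Goko value


def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_after_win(winner, loser, beta=BETA, gamma=GAMMA):
    """Return the winner's new (mu, sigma) after a two-player game without draws."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    # Gamma step: inflate both variances before looking at the result.
    var_w = sigma_w ** 2 + gamma ** 2
    var_l = sigma_l ** 2 + gamma ** 2
    c = math.sqrt(2.0 * beta ** 2 + var_w + var_l)
    t = (mu_w - mu_l) / c
    v = pdf(t) / cdf(t)      # size of the mean shift
    w = v * (v + t)          # size of the variance shrink
    new_mu = mu_w + var_w / c * v
    new_sigma = math.sqrt(var_w * (1.0 - var_w / c ** 2 * w))
    return new_mu, new_sigma


def displayed_rating(mu, sigma):
    """Leaderboard value mu - 2*sigma."""
    return mu - 2.0 * sigma


# Hypothetical players: an established strong player beats a much weaker one.
strong = (6000.0, 100.0)
weak = (2000.0, 100.0)
new_mu, new_sigma = update_after_win(strong, weak)
print(displayed_rating(*strong), displayed_rating(new_mu, new_sigma))
```

With these made-up numbers the win was almost certain, so the outcome barely shrinks sigma, while the gamma step has already inflated it; mu goes up slightly but mu - 2*sigma goes down.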

There's an argument to be made for skipping gamma and just applying a daily increase in uncertainty...  Holger suggested that Isotropic might have been doing this.  I suppose it's a question of whether you think skill is more likely to change with time away from the game or with experience playing.

Also, what Kirian said. :)

I'm unsure whether I prefer daily or gamely uncertainty increases myself (or both, like Goko does). I do agree with michaeljb that the rating shouldn't drop after win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088 ).

Edit: added link.
2nd edit: fixed "sign error"

580
Great job on cracking Goko's system! Congratulations :D


Back in January 2013, CEO Ted Griggs told us that Goko's rating system tracks μ and σ for each player and displays your rating as μ-2σ.  Qvist immediately guessed that they were running TrueSkill, but other forum members were more skeptical.

[...]
Code:
Error-minimizing TrueSkill Parameters:
mu0:       5500.00
sigma0:    2250.00

Residual Error: 0.0000

Those are exactly the values that Mr. Griggs told us in Jan 2013, which makes it plausible that Goko has been using TS with these same parameters since day one.
Just two nit-pickings: It was trisha_brooke, not Mr.Griggs, who told us the μ and σ details in the linked post (unless Griggs is the "engineer" she referred to).

581
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 11, 2014, 08:07:07 am »
Probably this uses lazy evaluation, so it only looks at that last condition if it can't figure out the result from all the others. That's why you only saw the bug some turns. If it gets to that part it makes sense that it will perform that action though.
That's a possible explanation, but in the turns I posted, the simulator shouldn't even have looked at the Estate condition (if it's really lazy :P), since discarding an Estate wasn't even possible with a Copper revealed; also the command inside an if-clause should have triggered an error warning IMO. I was using a single "=" because that's used as the mathematical equality sign in many languages, but of course it's a command in JavaScript. :-[

I'm glad order has been restored to the universe. Those were some pretty surprising results!
I couldn't agree more. Though I would have liked Rebuild to be substantially weaker than some card...

582
#61 ▲4 Saboteur (Intrigue)

Saboteur is now better than all the treasure cards, [...]

#60 ▼5 Royal Seal (Prosperity)

And there's the fourth treasure card: Stash dropped 5 ranks and 5pp. [...]

I can spot two mistakes here...  :P ;D

583
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 10, 2014, 05:52:07 pm »
I've noticed that JoaT is implemented suboptimally: Jack discards a revealed Estate or Curse even when it has none in hand; instead it should usually draw and trash it (at least in the early game):

I've now tried a more reasonable discard strategy by adding

Quote
 
discardPriority: (state, my) -> [
    "Estate" if my.countInHand("Estate")+ my.countInHand("Curse")  > 0 \
    or state.countInSupply("Province") < 6 or my.coins=7
    "Curse" if my.countInHand("Estate")+ my.countInHand("Curse")  > 0 \
    or state.countInSupply("Province") < 3 or my.coins=7
    "Copper"
    "Province"
    "Duchy"
    "Jack of All Trades"
    null
 ]

to the DoubleJack bot. The improvement is astonishing: The win rate against BankWharf increases from 36% to 55%, it now wins 53-47 against DoubleWitch, 67-33 against DoubleMountebank.
And it wins 58-42 against Rebuild!!! Am I dreaming? :o I've let the simulator play 5,000 games to be sure. DoubleJack even beats RebuildRogue, RebuildHorse and RebuildDuke by 54-46 or better.

Does the discardPriority function I added mess something up with the opponent's strategy, or is Rebuild really worse than Jack when both are played reasonably (and without Shelters/Colonies)?
I was also surprised to see that the improved DoubleJack seemed to beat most strategies even better in Colony games (adding the "vanilla" Colony/Plat. gain rules), but there seems to be a serious bug in the simulator for these games; DoubleJack "cheats" by sometimes getting 7 extra Coins from nowhere every few turns (no Coin token cards available):
[...]

It seems I WAS dreaming, and the bug was in my own code (and not just in Colony games, apparently): "my.coins=7" sometimes sets the player Coins count from 0 to 7 in spite of being in an if clause; it should read "my.coins==7" instead. When fixing this, DoubleJack does lose against Rebuild and DoubleWitch.
(My improvement only increases the win rate by 0-4%, far from enough.)
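
The pitfall can be reproduced in a small sketch. Python does not allow a plain "=" inside a condition, so the hypothetical helper assign_coins below stands in for the CoffeeScript expression "my.coins=7", which stores 7 and evaluates to 7 (truthy). The Player class and both condition functions are illustrative only and are not part of Dominiate.

```python
class Player:
    """Illustrative stand-in for the simulator's player state."""

    def __init__(self, coins):
        self.coins = coins


def assign_coins(player, value):
    """Hypothetical stand-in for `my.coins = value`: stores the value and evaluates to it."""
    player.coins = value
    return value


def buggy_condition(junk_in_hand, provinces_left, player):
    # Mirrors `... > 0 or ... < 6 or my.coins=7`: the last operand assigns.
    # Because `or` short-circuits, the assignment only runs when the
    # first two tests are both false, so the bug shows up only on some turns.
    return bool(junk_in_hand > 0 or provinces_left < 6 or assign_coins(player, 7))


def fixed_condition(junk_in_hand, provinces_left, player):
    # Mirrors the corrected `my.coins==7`: a pure comparison, no side effect.
    return junk_in_hand > 0 or provinces_left < 6 or player.coins == 7


p = Player(coins=0)
print(buggy_condition(0, 8, p), p.coins)   # assignment branch is reached
q = Player(coins=0)
print(fixed_condition(0, 8, q), q.coins)   # coins are left alone
```

This also fits the observation that the extra Coins appeared only every few turns: the assignment is reached only when the earlier operands are false.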

584
Game Reports / Re: Baker Board - Cultist of Rebuild?
« on: April 09, 2014, 08:50:48 am »
According to SCSN's simulation results, pure Rebuild-BM does beat Cultist-BM. On this board, Altar helps both a Rebuild and a Cultist player, but it also counters Cultist by trashing Ruins. The starting coin token should slightly help Cultist, for whom early gains are more relevant. But Navigator and (probably) Moneylender are good support for Rebuild, and there are no Shelters or Colonies, so I think I'd go Rebuild-BM (gaining an Altar with the first $6, saving my coin token for this buy) on this board.
(Apart from not saving his coin token and not buying Moneylender before Rebuild on T1, shark_bait's opponent was unlucky in that his first Rebuild missed the reshuffle, and the first time he got to $5 "on his own" was only turn 6.)
Slightly is a major understatement. Being able to open Cultist is huge for a card that snowballs like that. If you can't open Cultist, then sure Rebuild looks better, but that's not the question here.

It's not really "huger" than having a 5/2 opening without coin tokens. But I agree that the token does help Cultist very much; what I meant is that it helps Rebuild almost as much - you can get an (almost) guaranteed Altar on T3/4 or get an extra Rebuild early.

Quote
Quote
With a Cultist chain running, you'd draw Rebuild dead half the time (and with the other half, you have to Rebuild your less profitable starting Estates first). Witch into Rebuild is better than either card alone, but I'm sure Cultist into Rebuild is worse than either single-card strategy.
I'm not sure what you're comparing to here. If your opponent ignores Cultist, you just go Cultist BM and win, like shark_bait did here. If you both go Cultist and get a fairly even Ruins split, I'm not sure buying Provinces will be that fast, and you won't be dead drawing as much as you seem to be saying.

I was considering the non-mirror case (because I think Rebuild+Support is preferable to pure Cultist), but my argument applies to any game where you've bought ~5 Cultists before going Rebuild. You'll usually buy at least that many in the non-mirror, but I think it's the best Cultist strategy also in the mirror (in the absence of other Action cards, Cultists are still Labs after the Ruins are gone; and after two piles are empty, you should probably start buying Duchies anyway). And with 5 Cultists out of ~20-25 cards in your deck, the chance of drawing Rebuild (or any other Action) dead is close to 50%. Even with only 3 Cultists in your deck, Rebuild would still be substantially nerfed.

I'm not quite sure which strategy is better here, but the actual game doesn't prove Cultist BM to be better because, as I said, shark_bait's opponent was very unlucky and also played suboptimally - buying Moneylender on turn 3/4 (instead of Navigator on T1) seems very bad against a Cultist player to me.

585
Goko Dominion Online / Re: Goko Dominion Salvager Discussion
« on: April 08, 2014, 03:58:54 pm »
"These people are going to cheat anyway, so let's enable them" doesn't sound very compelling to me. The fact that it's so tedious to do these things manually is a significant deterrent to doing them. For example, if a VP counter had never been implemented, I'm guessing (guessing!) that SheCantSayNo would still be playing Dominion Online and would probably not be making a habit of reading through the log every game in order to tally the score. He'd just be better at point counting.

Playing with slightly different rules on mutual agreement is not cheating. If a VP counter had been implemented as part of the rules for the physical game, I'm guessing (guessing!) that you would still be playing Dominion and would probably not be making a habit of trying to tally the score in your head instead. You'd just be worse at point counting.  ;D :P

So true! I think people should be able to play by whatever variants they want. But it seems weird to me that these variant games are ranked on the same Pro leaderboard as people who are actually playing Dominion for realsies.

As discussed before (and pointed out by JW), "playing Dominion for realsies" in your narrow definition is not possible on Goko, due to the existence of a complete game log, the missing starting-player rule,  the risk of fatal misclicks, the explicit pause for playing reactions etc. So like all of us, you're also just playing a Dominion variant on Dominion Online.

586
Goko Dominion Online / Re: Goko Dominion Salvager Discussion
« on: April 08, 2014, 02:41:00 pm »
"These people are going to cheat anyway, so let's enable them" doesn't sound very compelling to me. The fact that it's so tedious to do these things manually is a significant deterrent to doing them. For example, if a VP counter had never been implemented, I'm guessing (guessing!) that SheCantSayNo would still be playing Dominion Online and would probably not be making a habit of reading through the log every game in order to tally the score. He'd just be better at point counting.

Playing with slightly different rules on mutual agreement is not cheating. If a VP counter had been implemented as part of the rules for the physical game, I'm guessing (guessing!) that you would still be playing Dominion and would probably not be making a habit of trying to tally the score in your head instead. You'd just be worse at point counting.  ;D :P

587
Game Reports / Re: Baker Board - Cultist of Rebuild?
« on: April 08, 2014, 01:52:53 pm »
This game is a good example of not blindly playing into "power" cards.  They are not always the best strategy and even though in most cases Rebuild counters junkers pretty hard, there are always exceptions to the rule.  Both the lack of Rebuild support and ability to open Cultist provide a unique board where Rebuild is actually not able to outpace junking attacks.

I'm not so sure this game is an example of that concept. Isn't Cultist considered more of a power card than Rebuild? "Blindly" playing into this would actually not turn out wrong. I would guess the best strategy is going Cultists into Rebuilds (assuming your opponent also goes Cultist -- if not, just go money and Provinces, like you did), which seems like the most obvious thing to do.

According to SCSN's simulation results, pure Rebuild-BM does beat Cultist-BM. On this board, Altar helps both a Rebuild and a Cultist player, but it also counters Cultist by trashing Ruins. The starting coin token should slightly help Cultist, for whom early gains are more relevant. But Navigator and (probably) Moneylender are good support for Rebuild, and there are no Shelters or Colonies, so I think I'd go Rebuild-BM (gaining an Altar with the first $6, saving my coin token for this buy) on this board.
(Apart from not saving his coin token and not buying Moneylender before Rebuild on T1, shark_bait's opponent was unlucky in that his first Rebuild missed the reshuffle, and the first time he got to $5 "on his own" was only turn 6.)

B) If you go for Rebuild, I think the opening to do is Moneylender/Navigator over Silver/Navigator. Yes they are two terminals but you're keeping your deck thin and you can always pick out the good hands for Rebuild (and pick up an Altar when getting $6).
Sacrificing a coin token for a second $4 card which may easily collide on T3/4 seems suicidal to me. Unlike Cultist, Rebuild doesn't care much about keeping your deck thin anyway; a thin deck increases the risk of collision with the only rebuildable VP card. Save the coin token for a $5 or $6 buy instead.

Quote
My gut feeling says B) is way too slow. So I think A) is the call here. And of course when the ruins are dealt out, you can always switch to Rebuild yourself.
With a Cultist chain running, you'd draw Rebuild dead half the time (and with the other half, you have to Rebuild your less profitable starting Estates first). Witch into Rebuild is better than either card alone, but I'm sure Cultist into Rebuild is worse than either single-card strategy.

588
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 07, 2014, 02:49:23 pm »
Maybe I should also clarify that my surprise was not just that Jack beats Rebuild decisively, but that (unlike Rebuild) it beats practically every other BM strategy (the only defeat is against bane-less Young Witch, and very narrowly - about 48-52). The (oldish) common wisdom expressed in the Jack articles by theory and HME:

Quote from: theory, http://dominionstrategy.com/2011/12/05/hinterlands-jack-of-all-trades/
...[it] goes toe-to-toe with DoubleMountebank and DoubleWitch.

If you go in the simulator and pit single-card strategies against each other, [Double]Jack is going to win a lot. As far as I can tell, it only loses to Young Witch, Witch, Mountebank, Wharf, Masquerade, and Courtyard.

is apparently almost completely wrong; DoubleJack does beat all these other cards decisively, except Young Witch (unless their Dominiate strategies should all turn out to be as badly suboptimal as Jack's). So these articles (AdamH says of theory's that it "reflected the mentality of that time, which was that Jack [...] was overpowered[...]") still substantially underestimate Jack's power!

And the simulation results are for Estate games only - with Shelters DoubleJack should be even stronger. Only with (correctly implemented) Colonies might the other cards stand a chance...

589
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 07, 2014, 02:15:07 pm »
It should still lose against RebuildJack (add the updated strategy there as well), which is the only sensible comparison.

It does lose (even without the updating), but still I'm baffled by this result.
(I wouldn't call this the only reasonable comparison; on a board with Rebuild, Jack and a "better" Rebuild support like Horse Traders, I would have ignored Jack prior to this simulation.)
My impression was that almost everyone considered Rebuild to be the better 1-card BM strategy. Probably (?!) Rebuild remains the stronger card against engines due to the Duchy depletion, but still...

I suspect that what SCSN means is that on any board in which Double-Jack can face Rebuild-X, one of the options is Rebuild-Jack.

That's true, of course; but my point was that the relative strength of Rebuild and Jack can still matter if there's good enough support for one of the two that you can ignore the other. E.g. I could now imagine Jack-Mystic to be stronger than RebuildJack; so Rebuild may be ignorable on a sizable number of BM boards containing Jack.

I am actually not surprised at all of this result. Adding those rules to the Jack play has to be a big improvement. And Jack is already a very strong BM enabler that was reasonably close in power to rebuild strategies before. SCSN is of course very right about Jack also being a Prime Rebuild enabler (better than e.g. Horse Traders) As rebuild synergizes pretty well with Jack[...]
I don't think Rebuild and Jack synergize at all - the advantage of drawing an additional card when the two collide should be more than offset by the facts that Rebuild doesn't want Jack to trash all Estates, and the risk of drawing Rebuild dead with Jack; also the Silvers dilute the Rebuilds. But apparently Jack is strong enough that it still helps another card with which it has slight anti-synergy. That's why I had thought Horse Traders would be better support, because it does synergize with Rebuild (giving an almost guaranteed $5).

What I don't really understand is why double Witch has a higher win-rate against double Jack than double Mountebank, as you will block Mountebanks less often after the play-rules improvement. I guess there is a good reason for this (given such a huge difference in the win rates) but I can't see it right now.
I expected this from the Rebuild comparisons: DoubleWitch is one of the very few single-card strategies that beats Rebuild, while Rebuild beats DoubleMountebank. I think Witch is just the better BM card (as Awaclus and SCSN said).

I was also surprised to see that the improved DoubleJack seemed to beat most strategies even better in Colony games (adding the "vanilla" Colony/Plat. gain rules), but there seems to be a serious bug in the simulator for these games; DoubleJack "cheats" by sometimes getting 7 extra Coins from nowhere every few turns (no Coin token cards available):

Quote
(DoubleJack shuffles.)
[...]
Tableau: [Jack of All Trades, Smithy, Spice Merchant, Remodel, Jester, Vault, Treasury, Ambassador, Familiar, Bishop, Colony, Platinum, Potion]

[...]
== DoubleJack's turn 5 ==
DoubleJack plays Jack of All Trades.
DoubleJack gains Silver.
DoubleJack reveals and discards Copper.
DoubleJack draws 1 cards: [Jack of All Trades].
DoubleJack plays Copper.
DoubleJack plays Copper.
DoubleJack plays Silver.
DoubleJack plays Silver.
Coins: 13, Potions: 0, Buys: 1
Coin Tokens left: 0
DoubleJack buys Platinum.
DoubleJack draws 5 cards: [Estate, Copper, Copper, Copper, Copper].
[...]
== DoubleJack's turn 8 ==
DoubleJack plays Jack of All Trades.
DoubleJack gains Silver.
DoubleJack reveals and discards Copper.
DoubleJack draws 1 cards: [Copper].
DoubleJack plays Silver.
DoubleJack plays Platinum.
DoubleJack plays Copper.
DoubleJack plays Copper.
DoubleJack plays Copper.
Coins: 17, Potions: 0, Buys: 1
Coin Tokens left: 0
DoubleJack buys Colony.
DoubleJack draws 5 cards: [Silver, Copper, Copper, Silver, Jack of All Trades].

[...]

== DoubleJack's turn 9 ==
DoubleJack plays Jack of All Trades.
DoubleJack gains Silver.
(DoubleJack shuffles.)
DoubleJack reveals and discards Copper.
DoubleJack draws 1 cards: [Platinum].
DoubleJack plays Silver.
DoubleJack plays Silver.
DoubleJack plays Copper.
DoubleJack plays Copper.
DoubleJack plays Platinum.
Coins: 18, Potions: 0, Buys: 1
Coin Tokens left: 0
DoubleJack buys Colony.
DoubleJack draws 5 cards: [Colony, Copper, Copper, Silver, Copper].
[...]

590
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 06, 2014, 05:10:22 pm »
It should still lose against RebuildJack (add the updated strategy there as well), which is the only sensible comparison.

It does lose (even without the updating), but still I'm baffled by this result.
(I wouldn't call this the only reasonable comparison; on a board with Rebuild, Jack and a "better" Rebuild support like Horse Traders, I would have ignored Jack prior to this simulation.)
My impression was that almost everyone considered Rebuild to be the better 1-card BM strategy. Probably (?!) Rebuild remains the stronger card against engines due to the Duchy depletion, but still...

591
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 06, 2014, 03:56:05 pm »
I've noticed that JoaT is implemented suboptimally: Jack discards a revealed Estate or Curse even when it has none in hand; instead it should usually draw and trash it (at least in the early game):

I've now tried a more reasonable discard strategy by adding

Quote
 
discardPriority: (state, my) -> [
    "Estate" if my.countInHand("Estate")+ my.countInHand("Curse")  > 0 \
    or state.countInSupply("Province") < 6 or my.coins=7
    "Curse" if my.countInHand("Estate")+ my.countInHand("Curse")  > 0 \
    or state.countInSupply("Province") < 3 or my.coins=7
    "Copper"
    "Province"
    "Duchy"
    "Jack of All Trades"
    null
 ]

to the DoubleJack bot. The improvement is astonishing: The win rate against BankWharf increases from 36% to 55%, it now wins 53-47 against DoubleWitch, 67-33 against DoubleMountebank.
And it wins 58-42 against Rebuild!!! Am I dreaming? :o I've let the simulator play 5,000 games to be sure. DoubleJack even beats RebuildRogue, RebuildHorse and RebuildDuke by 54-46 or better.

Does the discardPriority function I added mess something up with the opponent's strategy, or is Rebuild really worse than Jack when both are played reasonably (and without Shelters/Colonies)?

592
Simulation / Re: Dominiate: a Dominion simulator that runs on the Web
« on: April 06, 2014, 12:41:47 pm »
I've noticed that JoaT is implemented suboptimally: Jack discards a revealed Estate or Curse even when it has none in hand; instead it should usually draw and trash it (at least in the early game):

Quote
== DoubleJack's turn 6 ==
DoubleJack plays Jack of All Trades.
DoubleJack gains Silver.
DoubleJack reveals and discards Estate.
DoubleJack draws 1 cards: [Copper].
DoubleJack plays Copper.
DoubleJack plays Copper.
DoubleJack plays Copper.
DoubleJack plays Silver.
Coins: 5, Potions: 0, Buys: 1
Coin Tokens left: 0
DoubleJack buys Silver.
(DoubleJack shuffles.)
DoubleJack draws 5 cards: [Copper, Copper, Copper, Gold, Silver].

This is especially bad behaviour in games against Cursers; no wonder DoubleJack currently loses to Witch and Mountebank...

593
Dominion Articles / Re: Article: Jack of All Trades, Advanced (draft)
« on: April 06, 2014, 12:29:38 pm »
Jack of All Trades, Advanced

The original article can be found on the wiki and was written by theory in late 2011, right after Hinterlands was released. The original article reflected the mentality of that time, which was that Jack was really strong for money, didn't really synergize with all that much, and was grossly overpowered and could only be beaten by the strongest engines (sound familiar? I really hope this article can be written about Rebuild some day).

I agree with most of this, but not with the "grossly overpowered" claim. theory only ever states that it's very strong for BM (as opposed to the initial impression mentioned of it being quite weak), and flat-out says that "DoubleJack isn’t unbeatable."  And the claim that Jack is one of the strongest BM cards is still correct, I think, if you ignore stronger cards that were released later (Rebuild, Cultist).* In Qvist's 2014 card list, it's still #4 among $4 cards, and none of the higher-ranked cards is good for BM.
I really wish your Rebuild hope comes true, but even Donald X. has admitted that it's too strong for a one-card strategy, something he never had to do for Jack...

* In the Dominiate simulator, it only loses to Wharf, Witch and (narrowly) Mountebank, but I think the latter two are due to suboptimal card implementation - Jack discards a Curse when it has none in hand, instead it should draw and trash it.
Edit: I was wrong, Jack seems to lose either way (although very narrowly against Wharf). In Province-Estate games, that is - Shelters may help Jack to win those match-ups, while Colonies are hopeless.

594
TL;DR: Isotropish and Isotropic use an identical updating algorithm, but with different parameter values.  I think it's these different values that give Isotropic lower uncertainties.

I haven't looked at your code, but you said previously that you just used the vanilla TrueSkill algorithm, which does increase the variance (by GAMMA^2) once for each game:

Quote from: http://research.microsoft.com/en-us/projects/trueskill/faq.aspx
So, what is going on here? Between any two games of a gamer, the TrueSkill ranking system assumes that the true skill of a gamer, that is, μ, can have changed slightly either up or down; this property is what allows the ranking system to adapt to a change in the skill of a gamer. Technically, this is achieved by a small increase in the σ of each participating gamer before the game outcome is incorporated.

Quote
sigma=sqrt(pl.skill[1] ** 2 + GAMMA ** 2)

...

(Bold by me.) So you apply GAMMA once per game, while Doug (according to qmech's quote above) ended up applying it only once per day, not per game.
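
As a rough sketch of why the two schedules differ, the snippet below applies only the inflation step sigma = sqrt(sigma^2 + GAMMA^2) either once per game or once per day. It deliberately ignores the outcome-based shrinking of sigma, and the values of sigma, GAMMA and the number of games are made-up assumptions, not Isotropic or Isotropish parameters.

```python
import math


def inflate(sigma, gamma):
    """One application of the dynamics step: add gamma^2 to the variance."""
    return math.sqrt(sigma ** 2 + gamma ** 2)


def sigma_after_inflations(sigma, gamma, times):
    """Apply the inflation step `times` times in a row (no game outcomes)."""
    for _ in range(times):
        sigma = inflate(sigma, gamma)
    return sigma


# Made-up example: 20 games played within a single day.
SIGMA = 100.0
GAMMA = 22.5
GAMES_TODAY = 20

per_game = sigma_after_inflations(SIGMA, GAMMA, GAMES_TODAY)  # gamma applied every game
per_day = sigma_after_inflations(SIGMA, GAMMA, 1)             # gamma applied once per day
print(per_game, per_day)
```

The more games a player fits into one day, the less inflation the per-day schedule injects per game, which is consistent with lower uncertainties under a once-per-day rule.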

I understand now.  I think our confusion is Microsoft's fault. ;)

I agree, but in another way, I think.   :P

Quote


This is Vanilla TrueSkill updating as described on Microsoft's TrueSkill detail page.  So I think what Microsoft means by the rather misleading "... small increase in the σ of each participating gamer before the game outcome is incorporated" is that the scaling factor c is not just the sum of the players' variances, but also includes 2β².  That's the sense in which the variance is increased.  The variable itself isn't increased by β², but its effect on the updating is.

I don't think this interpretation of β is correct; according to DougZ's code documentation, β is a measure of how random the game is, and is clearly distinct from gamma:

Quote
  beta is a measure of how random the game is.  You can think of it as
  the difference in skill (mean) needed for the better player to have
  an ~80% chance of winning.  A high value means the game is more
  random (I need to be *much* better than you to consistently overcome
  the randomness of the game and beat you 80% of the time); a low
  value is less random (a slight edge in skill is enough to win
  consistently).  The default value of beta is half of INITIAL_SIGMA
  (the value suggested by the Herbrich et al. paper).

  [...]
 
  gamma is a small amount by which a player's uncertainty (sigma) is
  increased prior to the start of each game.  This allows us to
  account for skills that vary over time; the effect of old games
  on the estimate will slowly disappear unless reinforced by evidence
  from new games.

Now the Microsoft formulas quoted don't even mention gamma. On the Details page, they seem to imply that the increase in uncertainty happens before these formulas are used: "Before starting to determine the new skill beliefs of all participating players for a new game outcome, the TrueSkill ranking system assumes that the skill of each player may have changed slightly between the current and the last game played by each player."
So I think they just neglected to give the corresponding equations for the uncertainty increase, and only give the more "interesting" change based on the game's outcome.
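
To make that two-stage reading concrete, here is a minimal sketch for a two-player game with a clear winner. It assumes no draw margin, and the function names are made up for illustration; it is not DougZ's or Microsoft's code. The γ step is applied first, and only then do β and the scaling factor c enter:

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for pdf/cdf


def inflate(sigma, gamma):
    """Stage 1: widen the uncertainty before the game (the gamma step)."""
    return sqrt(sigma ** 2 + gamma ** 2)


def update_win_loss(winner, loser, beta, gamma):
    """Stage 2: two-player update, assuming no draws (draw margin 0).

    winner and loser are (mu, sigma) tuples; returns the updated tuples.
    """
    mu_w, sig_w = winner[0], inflate(winner[1], gamma)
    mu_l, sig_l = loser[0], inflate(loser[1], gamma)

    # beta enters only here, through the scaling factor c.
    c2 = 2 * beta ** 2 + sig_w ** 2 + sig_l ** 2
    c = sqrt(c2)
    t = (mu_w - mu_l) / c
    v = _STD_NORMAL.pdf(t) / _STD_NORMAL.cdf(t)
    w = v * (v + t)

    new_winner = (mu_w + sig_w ** 2 / c * v,
                  sig_w * sqrt(1 - sig_w ** 2 / c2 * w))
    new_loser = (mu_l - sig_l ** 2 / c * v,
                 sig_l * sqrt(1 - sig_l ** 2 / c2 * w))
    return new_winner, new_loser
```

In this sketch β never changes a player's stored σ; it only damps the update through c, while γ is the only term that makes the stored σ grow.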

I also notice that DougZ does talk about an uncertainty increase per game here; so if qmech was right, this would be DougZ's old TrueSkill implementation, before he switched to an uncertainty increase per day. If this is the case, it's not so surprising that it leads to the same output as your script, because both implementations do it once per game...

595
I don't think I understand.  Isotropish doesn't increase variance after each game either.  I would expect that the more games you play per day under dougz's system, the closer you get to the floor your variance hits on Isotropish.  I don't get how your variance could actually get lower.

I haven't looked at your code, but you said previously that you just used the vanilla TrueSkill algorithm, which does increase the variance (by GAMMA^2) once for each game:

Quote from: http://research.microsoft.com/en-us/projects/trueskill/faq.aspx
So, what is going on here? Between any two games of a gamer, the TrueSkill ranking system assumes that the true skill of a gamer, that is, μ, can have changed slightly either up or down; this property is what allows the ranking system to adapt to a change in the skill of a gamer. Technically, this is achieved by a small increase in the σ of each participating gamer before the game outcome is incorporated.

Quote
gamma is a small amount by which a player's uncertainty (sigma) is
  increased prior to the start of each game.  This allows us to
  account for skills that vary over time; the effect of old games
  on the estimate will slowly disappear unless reinforced by evidence
  from new games.

This is a plausible algorithm, but I don't see it in dougz's code.
So this line
Quote
sigma=sqrt(pl.skill[1] ** 2 + GAMMA ** 2)
https://github.com/dougz/trueskill/blob/master/trueskill.py#L345
does something else?

Yeah, that updating is a normal part of the Vanilla TS algorithm. It happens once per game, not once per day. But going back to the original quote:

Quote
(For those interested in the details, I've set β = 25, γ = σ0 / 100 (applied daily), and the draw probability at 5%.)

... I understand your original statement now. I agree that dougz may have meant that he's doing the once-a-day updating in addition to or instead of the once-per-game updating. I think I'll wait until he chimes in to make any such changes though.

(Bold by me.) So you apply GAMMA once per game, while Doug (according to qmech's quote above) ended up applying it only once per day, not per game.

596
A crazy question about the calculations:  is there something intrinsic to the system that creates a floor for sigma?  There's literally no one--no matter how many games played--with a sigma under 9.92.  What's up with that?
This could be a natural property of the game.  I mean sometimes you do everything possible and still lose.

Possibly, but the isotropic leaderboard didn't look quite like that.
Could it possibly have something to do with the games played that count toward your rank?  The isotropic leaderboard includes all eligible games, so lespeutere, for example, has over 10,000 games counting toward his ranking.  Meanwhile since the isotropish includes only games played in the last month, lespeutere has just under 3000 games counting toward the isotropish leaderboard.  I haven't followed all of the implementation talk and such, so this is just a bystander's casual look at the rankings, but that seems to be the biggest difference in the two leaderboards.

It's pretty much a natural property of TrueSkill. There is a point at which you can't decrease your uncertainty any more, because the uncertainty increase per game balances the uncertainty decrease you get by playing.

This happened on Isotropic too. It had two different sources of uncertainty, though, one that was per-game and one that was per-day. The players with games >> days could get their uncertainty down to 6-ish.

But the per-day source only ever increased the uncertainty on Isotropic, according to Doug; so how could people get uncertainties substantially below 9.9 on Isotropic, but not on Isotropish?
(I don't think this can only be due to the number of games played either - e.g. jog now has more games on Goko, but a much lower uncertainty (8.6) on Isotropic.)

I think I've found the explanation (right in this thread  ;)):
so I still don't understand what γ = σ0 / 100 (applied daily) means.

In the early days Iso would increase the variance after each game.  Since that put a hard floor on how low the variance could go, it was later changed to happen once a day.  So "applied daily" means that you only fudge the variance once a day, rather than recalculating the (constant) gamma daily.

So apparently Isotropic only increased the variance per day, not also per game. Therefore its variances varied much more, depending on the number of games played per day. On Isotropish, the increase per game leads to an absolute floor, as we currently observe. Players with >1 game/day usually had lower variances on Isotropic, players with <1 game/day usually have lower variances now.
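
A rough way to see the floor is to iterate only the variance part of the update. The sketch below is a simplification, not either site's actual code: it assumes evenly matched players with equal variances, no draw margin, and σ0 = 25 (the thread only gives β = 25 and γ = σ0 / 100). It compares adding γ² before every game with adding it once per day:

```python
from math import sqrt, pi

# Assumed parameters: beta and gamma follow the values quoted in the thread;
# SIGMA0 = 25 is a guess for illustration.
SIGMA0 = 25.0
BETA = 25.0
GAMMA = SIGMA0 / 100.0

# For evenly matched players (t = 0) and no draw margin:
# v = pdf(0) / cdf(0), and w = v * (v + t) = v * v.
V_EVEN = (1.0 / sqrt(2.0 * pi)) / 0.5
W_EVEN = V_EVEN * V_EVEN


def shrink(var):
    """Variance after one game against an opponent with the same variance."""
    c2 = 2.0 * BETA ** 2 + 2.0 * var
    return var * (1.0 - var / c2 * W_EVEN)


def sigma_after(days, games_per_day, per_game):
    """Sigma after `games_per_day` games on each of `days` days.

    per_game=True adds GAMMA^2 before every game;
    per_game=False adds it once at the start of each day.
    """
    var = SIGMA0 ** 2
    for _ in range(days):
        if not per_game:
            var += GAMMA ** 2
        for _ in range(games_per_day):
            if per_game:
                var += GAMMA ** 2
            var = shrink(var)
    return sqrt(var)
```

Comparing sigma_after(300, 10, per_game=True) with sigma_after(300, 10, per_game=False) shows the effect: in this simplified model the per-game version levels off at a fixed sigma no matter how many games are played, while the per-day version keeps dropping further the more games are packed into each day.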

597
Goko Dominion Online / Re: Isotropish ratings
« on: April 04, 2014, 12:26:59 pm »
Not sure if this is the right topic for this...

But I don't understand something about the rankings... why is HvBoedefeld 3rd and SCSN 4th? Why is AI 8th and Perry Green 9th?

Level 53    64.02   ±   10.15    1   1643   Mic Qsenoch
Level 52    62.32   ±   10.03    2   2245   Stef
Level 49    59.14   ±   10.02    3   3308   HvBoedefeld
Level 49    59.61   ±    9.96    4   4469   SheCantSayNo
Level 48    58.40   ±   10.08    5   2220   Wandering Winder
Level 48    60.09   ±   11.13    6    600   Tao Chen
Level 47    57.86   ±   10.00    7   1380   hiroki
Level 47    57.37   ±   10.09    8   2581   Andrew Iannaccone
Level 47    57.64   ±   10.03    9   1813   Perry Green

That's really strange: in each of these pairs, the lower-ranked player has the higher μ and the lower 3σ...
There are also more mistakes further down the list, e.g. isaka (#49) should be behind Troninho and me (we both have μ-3σ of about 40.8; isaka has 40.3).
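
As a quick check of the nine rows quoted above, here is a small sketch that re-sorts them by μ-3σ. It assumes that the value after ± is 3σ and that the list is meant to be ordered by μ-3σ; both are my reading of the leaderboard, not something taken from its code.

```python
# (player, mu, three_sigma) copied from the nine rows quoted above.
ROWS = [
    ("Mic Qsenoch", 64.02, 10.15),
    ("Stef", 62.32, 10.03),
    ("HvBoedefeld", 59.14, 10.02),
    ("SheCantSayNo", 59.61, 9.96),
    ("Wandering Winder", 58.40, 10.08),
    ("Tao Chen", 60.09, 11.13),
    ("hiroki", 57.86, 10.00),
    ("Andrew Iannaccone", 57.37, 10.09),
    ("Perry Green", 57.64, 10.03),
]


def conservative_rating(row):
    """mu - 3*sigma, assuming the displayed +/- value is already 3*sigma."""
    _, mu, three_sigma = row
    return mu - three_sigma


def rank_by_conservative_rating(rows):
    """Return player names ordered from highest to lowest mu - 3*sigma."""
    ordered = sorted(rows, key=conservative_rating, reverse=True)
    return [name for name, _, _ in ordered]


if __name__ == "__main__":
    for place, name in enumerate(rank_by_conservative_rating(ROWS), start=1):
        print(place, name)
```

Under those assumptions, three adjacent pairs come out swapped relative to the posted list: SheCantSayNo and HvBoedefeld, Tao Chen and Wandering Winder, and Perry Green and Andrew Iannaccone.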

598
A crazy question about the calculations:  is there something intrinsic to the system that creates a floor for sigma?  There's literally no one--no matter how many games played--with a sigma under 9.92.  What's up with that?
This could be a natural property of the game.  I mean sometimes you do everything possible and still lose.

Possibly, but the isotropic leaderboard didn't look quite like that.
Could it possibly have something to do with the games played that count toward your rank?  The isotropic leaderboard includes all eligible games, so lespeutere, for example, has over 10,000 games counting toward his ranking.  Meanwhile since the isotropish includes only games played in the last month, lespeutere has just under 3000 games counting toward the isotropish leaderboard.  I haven't followed all of the implementation talk and such, so this is just a bystander's casual look at the rankings, but that seems to be the biggest difference in the two leaderboards.

It's pretty much a natural property of TrueSkill. There is a point at which you can't decrease your uncertainty any more, because the uncertainty increase per game balances the uncertainty decrease you get by playing.

This happened on Isotropic too. It had two different sources of uncertainty, though, one that was per-game and one that was per-day. The players with games >> days could get their uncertainty down to 6-ish.

But the per-day source only ever increased the uncertainty on Isotropic, according to Doug; so how could people get uncertainties substantially below 9.9 on Isotropic, but not on Isotropish?
(I don't think this can only be due to the number of games played either - e.g. jog now has more games on Goko, but a much lower uncertainty (8.6) on Isotropic.)

599
Goko Dominion Online / Re: Goko Dominion Salvager Discussion
« on: April 01, 2014, 03:59:37 pm »
Playing the game by the rules means not using a point counter. If two people both want to play a casual game with a point counter, fine! But to be unable to be random matched in a Pro game with someone because you prefer to play the game by the rules? That's nuts! Really, Pro games shouldn't allow the VP counter at all.

(Expanding on what silverspawn wrote:)
Regardless of the VP counter, Goko doesn't let anyone play "by the rules" anyway, because it provides a log of the entire game all the time, which is a much worse rules deviation than a VP counter alone.

Barring Masquerade (and time-outs), a really "professional" player could (re-)calculate the VPs from the log every time he makes a crucial decision; I much prefer an available VP counter to having to wait minutes for my opponent's calculation every few turns.

600
Game Reports / Re: Threefold repetition of Possesion
« on: March 24, 2014, 06:34:15 pm »
No, neither deck is notably better for the opponent if both players trash their Apprentices with Rats (as they should, precisely because Apprentice only helps the opponent). You don't get an 11-card hand without Apprentice. (Also, you can never buy card draw if the opponent doesn't get more than 2-3 Coppers at one time; Vault and Nobles cost $6.)

Your argument might apply if neither player had had a Rats at the time of the stalling, but hvb did have one.

Edit: hvb could have tried to force a stalemate by trashing his Rats before Rene could gain a Rats from it. But the game wasn't a stalemate at the time they stopped playing.

Actually, using Rats makes it possible for Rene to (nearly) guarantee a win by running out the Rats, Estates, and Curses. By playing Rats on his Possession turns, he'd be able to run out that pile. Then all he needs to do is buy two Copper, wait for either player to pick up an Estate, then buy Curses and trash them with Rats (being sure to never keep more than two in his deck at a time).

(There are 7 Estates in the trash and 6 in their decks, which leaves one in the supply.)

This only works if Rene keeps an Apprentice and trashes almost all the Rats as he gains them (with 20 Rats in his deck, he couldn't play Possession reliably any more). And then it's essentially Axxle's solution as "fixed" by amalloy, I think:

Whoever's ahead:

1) Buy one copper or curse.
2) Trash it with Rats.
3) Repeat steps 1 and 2 until you run out the copper, rats, curse piles.

edit: Oh, Rene Kuroi didn't have a rats.  Nevermind then.

edit2: Kirian's right, she could have gained one on a possession turn.  So it is winnable!

Better plan: tell your opponent that this is your strategy. Ask that they please resign so that neither of you have to sit through this.

You can't just do this, because if your deck becomes too full of Rats to play Possession frequently, your opponent gets the green light to start building back up to Provinces. Here, Apprentice provides a way to get rid of the excess Rats, so this game should be endable. But if you try just those steps 1-3 in a stalemate with no real trashing, I think you probably lose.
(It doesn't matter if you empty Estates or Coppers as third pile, except Estate is much faster.)

This is indeed a safer way to win than "my" method; to clarify, I only gave mine as a maybe more intuitive way to see that the game is not at all stalled yet. And with it, Rene would even have had a very good chance to win if he'd had slightly fewer VPs than his opponent at the time they broke off.
