The issue is what it algorithmically means to "know" how many cards you're going to draw, and how that interacts with the token.
If the answer is A (the one people seem to feel is intuitively correct), it's because the -1 Card token now works by changing a "+N Cards" instruction into a "+{N - 1} Cards" instruction, and, once everything has settled, the final modified instruction determines the number of cards you "know" you'll draw.
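To make that reading concrete, here is a rough Python sketch of interpretation A only. The function name and the choice to leave the token alone on a "+0 Cards" instruction are my own assumptions for illustration, not anything from the rulebook.

```python
def resolve_draw_instruction(n, token_on_deck):
    """Sketch of interpretation A (hypothetical helper, not official rules).

    Takes a "+N Cards" instruction and whether the -1 Card token is on
    the deck. Returns (cards_you_know_you_will_draw, token_still_on_deck).
    """
    # Assumption: the token only modifies an instruction that would
    # actually draw at least one card.
    if token_on_deck and n > 0:
        return n - 1, False  # "+N Cards" becomes "+{N - 1} Cards"
    return n, token_on_deck


# A "+3 Cards" instruction with the token on the deck.
known, token_left = resolve_draw_instruction(3, True)
print(known, token_left)
```

Under this model the number you "know" is just the first value returned, after the instruction has been modified.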
That raises other questions about "knowing." If you trash four Rats with Chapel, are you supposed to "know" you're going to draw four cards? If you are, the game rules now have a new layer in which you have to look at simultaneously triggered effects and combine them into single packets of "knowledge."
And what if it were three Rats and a Catacombs? Can you split your "knowledge" up so that you "know" you're drawing two cards (and draw them) before deciding what to gain, then "know" you're drawing the other one? Or do you have to "know" about all three draws at the same time and maybe trigger a reshuffle you didn't want yet?
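Here is a toy Python simulation of why the split matters. Everything in it is an illustrative assumption: the `Player` class, the two-card deck, the Estates in the discard pile, and Silver as the card gained from Catacombs. It only models drawing, gaining to the discard pile, and reshuffling when the deck runs out.

```python
import random


class Player:
    """Minimal hypothetical model: deck, discard pile, hand."""

    def __init__(self, deck, discard, seed=0):
        self.deck = list(deck)        # top of the deck is the end of the list
        self.discard = list(discard)
        self.hand = []
        self.reshuffles = 0
        self._rng = random.Random(seed)

    def draw(self, n):
        for _ in range(n):
            if not self.deck:
                if not self.discard:
                    return            # nothing left to draw
                # The deck is empty: the discard pile becomes the new deck.
                self.deck, self.discard = self.discard, []
                self._rng.shuffle(self.deck)
                self.reshuffles += 1
            self.hand.append(self.deck.pop())

    def gain(self, card):
        self.discard.append(card)


def new_player():
    return Player(deck=["Copper", "Copper"],
                  discard=["Estate", "Estate", "Estate"])


# Split "knowledge": draw two, resolve the Catacombs gain, then draw one.
split = new_player()
split.draw(2)
split.gain("Silver")
split.draw(1)

# Combined "knowledge": all three Rats draws happen as one packet first.
combined = new_player()
combined.draw(3)
combined.gain("Silver")

print("split:    Silver shuffled in?", "Silver" in split.deck + split.hand)
print("combined: Silver shuffled in?", "Silver" in combined.deck + combined.hand)
```

In the split ordering the gained card is in the discard pile when the reshuffle happens, so it goes into the new deck. In the combined ordering the third draw forces the reshuffle before the gain, and the gained card misses it.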