The thing about 3p games is not quite right. That's how my crappy prototype version worked, but DougZ implemented the more probabilistically correct version that takes the entire ordering into account.
TrueSkill assumes that every game you play has an unobservable "actual rating" behind it, and that players finish in order of their actual ratings. That actual rating falls somewhere in a fuzzy range around your mean skill rating (the 99% margin of error for that range is shown on the leaderboard), but nobody can tell where in the range it landed, because you might be lucky or unlucky, or having an on or off day.
But if A beats B, that says that A's actual rating (whatever it is) was higher than B's (whatever it is) for that game. TrueSkill takes that into account and updates their skill ranges to make that more probable.
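To make that update concrete, here is a rough sketch of the textbook two-player TrueSkill update with no draws. It is not the code running on the leaderboard: the function name `update_1v1` is made up for this post, the defaults (mu 25, sigma 25/3, beta 25/6) are the commonly quoted TrueSkill defaults rather than anything confirmed for this site, and the usual step that adds a little uncertainty back between games is left out.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for its pdf and cdf


def update_1v1(winner, loser, beta=25 / 6):
    """Return updated (mu, sigma) pairs after `winner` beats `loser`.

    Hypothetical helper: each player is a (mu, sigma) tuple, and beta is
    the assumed per-game noise on the "actual rating".
    """
    (mu_w, sig_w), (mu_l, sig_l) = winner, loser

    # Total spread of the difference between the two actual ratings.
    c = sqrt(2 * beta**2 + sig_w**2 + sig_l**2)
    t = (mu_w - mu_l) / c

    # Correction terms for "the difference was observed to be positive".
    v = _STD.pdf(t) / _STD.cdf(t)
    w = v * (v + t)

    # Means move apart; both ranges tighten.
    new_w = (mu_w + sig_w**2 / c * v, sig_w * sqrt(1 - sig_w**2 / c**2 * w))
    new_l = (mu_l - sig_l**2 / c * v, sig_l * sqrt(1 - sig_l**2 / c**2 * w))
    return new_w, new_l


if __name__ == "__main__":
    print(update_1v1((25.0, 25 / 3), (25.0, 25 / 3)))
```

The winner's mean goes up, the loser's goes down, and both sigmas shrink. An upset (a low-rated player beating a high-rated one) makes `t` negative, which makes `v` larger, so the ratings move more.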
In a three-player game where A > B > C, TrueSkill actually gets a lot of information about B's actual rating, because it is pinned in a narrow window between A's and C's. I'm not entirely clear what the practical difference is between this and doing the pairwise comparisons, except that your variance decreases more when you rank in the middle.
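One way to check the "middle player learns the most" claim is a brute-force simulation. This is only a sketch and not how TrueSkill computes anything (the real thing uses approximate message passing). It assumes three players with identical priors and the commonly quoted default numbers, draws skills and actual ratings at random, throws away every trial where the result was not A > B > C, and looks at the spread of what is left. `posterior_spread` is a name invented for this post.

```python
import random
from statistics import pstdev


def posterior_spread(n_samples=200_000, mu=25.0, sigma=25 / 3, beta=25 / 6, seed=0):
    """Estimate each player's skill spread given the result A > B > C.

    Rejection sampling under assumed identical priors N(mu, sigma^2) and
    per-game noise beta; returns {"A": std, "B": std, "C": std}.
    """
    rng = random.Random(seed)
    kept = {"A": [], "B": [], "C": []}
    for _ in range(n_samples):
        # Hidden skills, then the noisy per-game "actual ratings".
        skills = [rng.gauss(mu, sigma) for _ in range(3)]
        perf = [rng.gauss(s, beta) for s in skills]
        # Keep only the trials consistent with the observed ordering.
        if perf[0] > perf[1] > perf[2]:
            for name, s in zip("ABC", skills):
                kept[name].append(s)
    return {name: pstdev(vals) for name, vals in kept.items()}


if __name__ == "__main__":
    for name, spread in posterior_spread().items():
        print(name, round(spread, 2))
```

All three spreads come out below the prior sigma of 25/3, and B's is the smallest, which matches the variance dropping most for the player who finished in the middle. It says nothing about how that compares to chaining pairwise updates, which would need a separate experiment.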