Top Chef Mathematics

If you like food, Washington DC, hubris, or reality television, then chances are you are a fan of Bravo’s cooking competition Top Chef.  Every year the show takes a group of aspiring chefs, places them in a house in a new city, and throws weekly challenges their way.  Following the Survivor template, every week one chef is voted off, and at the end someone is crowned Top Chef (and given a large check).  This season, the action takes place in our nation’s capitol.

Now, a show such as this might seem to have very little to do with mathematics.  But look, and ye shall find.  In the second episode of this past season, the chefs were paired up for one of the challenges.  There were 16 chefs at the time, combining to make 8 pairs.  The pairing was determined by drawing knives: 16 knives were presented in a knife block, and each had a number on it from 1 to 8.  The number was printed on the blade, so each chef would walk to the block, draw a knife, and read the number.  The knives were not replaced afterwards.  Pairs were formed by people who drew the same number.

This dude loves the number 3.

In this particular episode, the first six numbers drawn were 2, 1, 3, 6, 7, and 7.  In particular, the first pair was formed on the 6th draw.  This leads to a natural question: how long would you expect it to take before the first pair is formed?  Six draws seemed a bit long to me (I would have expected the first pair to have been formed sooner), so I immediately set about trying to understand the answer to this question.

To ease ourselves into it, let’s simplify things.  Instead of 8 pairs, suppose there were only 3.  And instead of knives, which are dangerous and pointy, let’s suppose people were choosing balls from a bag.  Rather than differentiating the balls by writing numbers on them, let’s differentiate them by color.  So suppose you have a bag with 3 pairs of balls: one pair red, one pair green, and one pair blue.

The game is this: you draw a ball from a bag and put it aside.  You keep doing this until you have drawn a pair.  The question is how long it will take before the first pair is drawn.

Right away we see that you will get a pair some time between your second and your fourth draw.  Obviously you can’t get a pair after only drawing one ball, so you need a minimum of two draws.  On the other hand, if you don’t have a pair after drawing three, you must have one ball of each color, which means your fourth draw MUST give you a pair of some color.

Given this observation, we can now start to calculate probabilities.  What is the probability that you will have a pair after two draws?  Well, this happens precisely when your first and second draw are the same color.  The probability of this happening is equal to 1/5, since there is no restriction on the first ball you draw, but then there is a 1 in 5 chance that the second one you draw will be the other ball with the same color.

You have a 1 in 5 chance of picking the second green ball after picking the first one, for example.

What about the probability that you’ll have a pair after exactly three draws?  In order for this to happen, your second draw must be a different color than the first, and your third draw must be the same color as either your first or second draw.  Of the 5 balls remaining after your first draw, 4 will have a different color from the first, meaning that the probability of drawing a second ball which is a different color than the first is 4/5.  Similarly, the probability of drawing a third ball which is the same color as either the first or the second ball is 1/2 (see the picture below).  Thus, by the laws of conditional probability, the odds that you will have a pair after your third draw is 4/5 x 1/2 = 2/5.

You have a 2 in 4 (i.e. 1 in 2) chance of pulling a blue or green ball given that the results of your first two draws were blue and green, for example.

The same argument works when calculating the odds that the pair will come on the fourth draw.  There is no restriction on the first draw, there is a 4/5 chance that your second draw will be a different color from the first, there is a 1/2 chance that the third draw will be a different color from the second, and there is then a 100% chance that your fourth draw will be the same color as one of your earlier draws.  This again gives a probability of 2/5.  We see that the probabilities add up to one, as they should.

Given these probabilities we can also calculate the expected value: on average, how many draws will you need before you get a pair?  Since the probability of two draws is 1/5, the probability of three draws is 2/5, and the probability of four draws is 4/5, we see that the average is

2 x 1/5 + 3 x 2/5 + 4 x 2/5 = 16/5 = 3.2.

In other words, on average you will need 3.2 draws before you come up with a pair.

It’s more interesting, of course, to deal with c different colors, rather than just 3.  We can still perform this analysis, and try to find probabilities and expectations. Suppose we have c pairs of balls, each pair of a different color. We draw the balls without replacement from a bag until we find a pair of the same color, then we stop. We can define a random variable Y_c to be the draw on which we complete our first pair. For example, in the case c = 3 above, we saw that Y_3 = 2 with probability 1/5, Y_3 = 3 with probability 2/5, and Y_3 = 4 with probability 2/5.

As before, notice that Y_c must take a value between 2 and c + 1.  This is because we can’t draw a pair before our 2nd draw, and after c draws, the worst case scenario is for us to have each ball of a different color.  Since we have exhausted all color possibilities, the c + 1st draw must give us a pair (in essence we are applying the Pigeonhole Principle).  So, to describe the behavior of Y_c we need to calculate the probability that $Y_c = k$ for k between 2 and c + 1.

The same sort of argument as in the simple case c = 3 works here.  Suppose you want to calculate P(Y_c = k).  In order to find your first pair on the kth draw, you need to NOT draw a pair on your 2nd, 3rd, 4th, …, or k – 1st draw, and then have the color on the kth draw match one of the colors you have already drawn.  Since there are a total of 2c balls in the bag to begin with, we see that the odds of not drawing a pair on the 2nd draw is \frac{2c-2}{2c-1}, the odds of not drawing a pair on the 3rd draw is \frac{2c-4}{2c-2} (since there are 2c – 2 balls remaining, and you want to avoid 2 that are colors you’ve already drawn, leaving you with 2c – 2 – 2 = 2c – 4 options), and so on, so that the odds of not getting a pair on the k – 1st draw is \frac{2c-2(k-2)}{2c-(k-2)}.  Meanwhile, in order to draw a pair on your kth draw, you must pull one of the k – 1 colors that have already been pulled.  Since there are 2c – k + 1 balls remaining, the probability that this happens is \frac{k-1}{2c-k+1}.

Combining these, we see that

P(Y_c = k) = \frac{k-1}{2c-k+1}\prod_{j=1}^{k-2}\frac{2c-2j}{2c-j}

= \frac{k-1}{2c-k+1}2^{k-2}\prod_{j=1}^{k-2}\frac{c-j}{2c-j}

= 2^{k-2}\frac{k-1}{2c-k+1}\frac{(c-1)!}{(2c-1)!}\frac{(2c-k+1)!}{(c-k+1)!}

= \frac{2^{k-2}\left(\begin{array}{c}c-1\\k-2\end{array}\right)}{\left(\begin{array}{c}2c-1\\k-1\end{array}\right)}.

(Recall that the binomial coefficient \left(\begin{array}{c}n\\k\end{array}\right) is given by \left(\begin{array}{c}n\\k\end{array}\right) = \frac{n!}{k!(n-k)!}, and as usual n! = n x (n-1) x … 3 x 2 x 1 is the product of all integers from 1 to n.) With this formula, we can now see how likely it was for the first pairing on Top Chef to have occurred on or after the 6th draw.  In this case there are 8 pairs, so c = 8, and we see that

P(Y_8 \geq 6) = \sum_{k=6}^{9}P(Y_8 = k) = \sum_{k=6}^{9} \frac{2^{k-2}\left(\begin{array}{c}7\\k-2\end{array}\right)}{\left(\begin{array}{c}15\\k-1\end{array}\right)},

which comes out to 16/39, or roughly 41.03%.

Of course, there’s still the question of expectation: approximately how large do we expect Y_c to be (remember we saw that E(Y_3) = 16/5)?  I’ll spare you the details, but one can show that for general c, the expected value is given by

E(Y_c) = \frac{2^{2c}(c!)^2}{(2c)!}.

In particular, in the case c = 8 from Top Chef, we find that E(Y_8) = \frac{32768}{6435}, which is approximately 5.09.  So on average, for c = 8 we expect to find a pair after a little more than 5 draws.

One final question: what happens to the expected value as c grows large?  As it turns out, we can write the expected value in a very nice form in terms of the Gamma function (which one can think of as a generalization of the factorial to the entire real line).  Using the doubling formula \Gamma(z)\Gamma(z+1/2) = 2^{1-2z}\sqrt{\pi}\Gamma(2z), the interested reader can show that

E(Y_c) = \sqrt{\pi}\frac{\Gamma(c+1)}{\Gamma(c+1/2)}.

If one then uses Stirling’s formula to approximate the Gamma function, it follows that

E(Y_c) \approx \sqrt{c\pi},

in other words \frac{E(Y_c)}{\sqrt{c}} \rightarrow \sqrt{\pi} as c \rightarrow \infty.  What a wonderful asymptotic!  This tells us that the number of draws we will need from a bag of c pairs before obtaining our first pair grows like the square root of c times a factor of \sqrt{\pi}.  We can compare the estimate given here for E(Y_8) with the exact value computed above – in doing so, we find that E(Y_8) \approx \sqrt{8\pi} \approx 5.01.  So indeed, the approximation is fairly close to the true value (and the approximation will only get better as c grows).

There are many related questions one could ask.  For example, what if instead of pairs, we look at collections of triplets, or quadruplets?  What if we consider formation of the 2nd pair or 3rd pair instead of only considering the 1st pair?  What if we allow for different numbers of balls of each color (e.g. 2 red balls and 3 green balls)?  But I’ve already gone on too long, so I will leave these questions for another time.  I don’t know if these questions go by a certain name or not – I couldn’t find this particular problem anywhere.  If anyone knows of a paper or book where these problems are discussed, I would be much obliged.

In the mean time, I will close with a picture of Tom Colicchio looking like a badass.  Clearly the worlds of chefs and rock stars have collided – will mathematicians be next?

(Kudos to Dr. Moore for some helpful commentary.)

The Housekeeper and the Professor

Some time ago, I heard about a book from Japan called The Housekeeper and the Professor, written by Yoko Ogawa in 2003 and translated by Stephen Snyder last year.  As the title suggests, the book centers on the relationship between a housekeeper, her son, and a math professor.  The main conceit of the book is that the Professor suffered an accident some years before that impaired his memory, so that his short term memory only lasts around 80 minutes.  In other words, every day the housekeeper and her son come to visit the professor, it is as if they are meeting him for the first time.  He copes by clipping small notes to his clothing, and in spite of his disability he still dabbles in mathematics. 

One part Memento, one part A Beautiful Mind, the book was named a New York Times Book Review Editors’ Choice, and was popular enough in Japan to warrant a film adaptation (the Japanese language trailer for which can be found below).  Clearly, then, the book has resonated with people regardless of language.  But how does the book stack up from a mathematical perspective?

At first glance, it may sound like there’s a lot of fodder for me to complain about here.  After all, how many times have we seen mathematicians in popular culture with some sort of mental handicap?  Granted, memory loss is a new twist – usually insanity is the preferred condition.  Still, though, I don’t think it’s too much to ask for a mathematician who’s just a normal dude (or even, gasp, a normal lady).

Unfortunately, he is also but the latest entry in a long line of mathematicians in popular culture who are socially maladjusted.  He’s also incredibly reclusive – he has a strong aversion to crowds, and when he accompanies the housekeeper and her son to a baseball game midway through the novel, it’s fairly clear that he hasn’t been on an outing in some time.  One could explain these traits as a byproduct of his mental condition, of course: it’s natural for him to be shy around people when he is always meeting them for the first time, and there’s a danger in taking him out for too long lest he should forget what he’s doing out in the first place.  But part of me feels like these are convenient excuses for rehashing familiar tropes about people who study mathematics.

It’s not all bad, though.  In fact, I found myself able to forgive much of what I didn’t like about this portrayal of mathematicians, because there are many positive features about the professor as well.  For starters, the professor is able to form a close relationship to the housekeeper’s son (who he nicknames Root, because the child’s flat head of hear reminds him of the square root sign).  Even though the professor can’t remember who Root is from day to day, every time the boy comes to the professor’s house, the professor dotes on him like a father.  He obsesses over the safety of Root more than the housekeeper, and keeps Root in his mind as much as he can, given his circumstances.

Of course, his concern about Root wouldn’t be complete if it didn’t include concern over his mathematics education.  Here’s another thing Ogawa does quite well – she is able to not only show the professor’s love of mathematics, but she is also able to illustrate how that passion can inspire others.  The professor is a number theorist, and he is always looking for meaning behind numbers (something which, in the hands of a poorer story-teller, would no doubt incite my rage).  What makes the professor’s interest significant is that he never discusses numbers for the sake of random numerological connections – instead, he is able to take small observations and use them to hint at larger mathematical ideas.  Here’s one such example, from the day the three of them went to a baseball game (this may be my favorite passage from the book):

And when he noticed that his seat number was 714 and Root’s was 715, he began to lecture again and completely forgot to sit down.

“The home run record Babe Ruth set in 1935 is 714.  On April 8, 1974, Hank Aaron broke that record, hitting his 715th off of Al Downing of the Dodgers.  The product of 714 and 715 is equal to the product of the first seven prime numbers: 714 x 715 = 2 x 3 x 5 x 7 x 11 x 13 x 17 = 510510.  And, the sum of the prime factors of 714 equals the sum of the prime factors of 715: 714 = 2 x 3 x 7 x 17, 715 = 5 x 11 x 13; 2 + 3 + 7 + 17 = 5 + 11 + 13 = 29.  A pair of consecutive whole numbers with these properties is quite rare.  There are only 26 such pairs up to 20,000.  This one is the Ruth-Aaron pair.  Just like prime numbers, they are more rare as the numbers get larger.  And 5 and 6 are the smallest pair.  But the proof to show that those pairs are infinite in number is quite difficult*. . . . The important thing is that I’m sitting in 714 and you’re in 715, instead of the opposite.  It’s the young who have to break the old records.  That’s the way the world works, don’t you think?” (90)

Happy to have broken the home run record, or happy to be part of an interesting number phenomenon?

There are vignettes like this peppered throughout the book, where the professor will link an everyday observation to some kind of mathematics.  In the story, which is told from the housekeeper’s perspective, we see how this inspires the housekeeper to think about simple mathematical problems.  In spite of her lack of formal training, the professor is able to inspire in her a sense of mathematical curiosity (something which all math teachers should aspire to do).  Those of the mathematical persuasion are rarely shown as being able to interest people who are less mathematically inclined.  I’m glad to see that this book bucks that trend.

Moreover, this book does a better than average job of discussing what makes mathematics so appealing to those of us who study it.  Consider the following exchange between the housekeeper and the professor, after she apologizes for sending a proof of his to a journal via regular mail instead of express:

“No, there was no need to send it express.  Of course, it’s important to arrive at the correct answer before anyone else, but it’s just as important that the proof is elegant.”

“I had no idea a proof could be beautiful . . . or ugly.”

“Of course it can,” he said.  Getting up from the table, he came over to the sink where I was washing the dishes and peered at me as he continued.  “The truly correct proof is one that strikes a harmonious balance between strength and flexibility.  There are plenty of proofs that are technically correct but are messy and inelegant or counterintuitive.  But it’s not something you can put into words—explaining why a formula is beautiful is like trying to explain why the stars are beautiful.” (16)

While I don’t necessarily agree with the last part, I think it’s refreshing to find a discussion of what accounts for mathematical beauty in a book like this.  This type of discussion between a mathematical expert and a mathematical amateur is not often present in works that center on mathematics – more frequently, the conversation is between math experts, or is not about mathematics at all.  Providing this type of simple insight to a reader who may not have a mathematical background is certainly a plus.

Despite falling into some tired stereotypes, the professor emerges as a fully realized character.  His memory problems are much more than a gimmick, and while they enable certain stereotypes to persist, Ogawa also uses his disability to showcase a degree of empathy for other people that is not often found in portrayals of those who study mathematics.  Overall I found that I quite enjoyed the book – if you’ve got a lazy Sunday coming up (the book is short, so you could easily finish it in a weekend), I’d certainly recommend giving this story a shot.

*Actually, the question as to whether or not there are infinitely many such pairs is actually still open.  See here for more information.

Let’s Make a Deal with Paul the Octopus

As summer reaches its midpoint, we come to the end of another rousing year of World Cup soccer.  As with any international sporting event, fans all over the world have undoubtedly had their share of ups and downs.  Of all the countries in this year’s tournament, however, I think Germany may be receiving the most attention, for even though they didn’t make it into the finals, the Germans have one thing no other country has: a precognitive octopus.

At least, that is what the media would have us believe.  For the past several weeks, Paul the Octopus has captured the hearts, minds, and stomachs of people around the world.  He’s a charming octopus, to be sure, but it isn’t his good looks that have gotten him this far.  Instead, it’s his seeming ability to correctly predict the outcome of soccer matches.  As time has gone on and Paul’s predictions have continued to prove themselves accurate, the amount of press he has received has only increased.  Articles about him are everywhere on the internet: here‘s one discussing public outrage after he correctly predicted Spain to defeat Germany in the semifinals, and here‘s an article discussing his preference for Spain over the Netherlands in the finals.  Search for “Paul the Octopus” in Google News and you will find thousands of results.

This is one popular mollusk.

Of course, I suppose it’s possible that Paul really can see into the future, at least as far as soccer is concerned.  After all, he did correctly predict the winner of every game asked of him; an impressive feat, seeing as how his advice was requested a total of 8 times.  However, Occam’s Razor suggests that we should look for a simpler explanation.

A natural first choice is to guess that Paul was simply guessing randomly, and is very lucky.  The odds of this happening are small – assuming he has a 50% chance of picking correctly, the odds of him being right each one of the 8 times he predicted a winner in this World cup would be 1/28, which is only approximately .39%. Very low odds indeed.

This analysis ignores biases that may be present – in particular, Paul’s octopus vision may bias him towards certain flag designs (which, given the fact that he frequently chooses Germany, seems plausible).  The Wikipedia article I linked to discusses other sorts of possible bias.  However, these biases would only influence which box he selected – they wouldn’t necessarily affect the odds that his selection would correspond to the winning team (although it is possible that he is being persuaded to throw his chips in for the favored team, which would increase the likelihood of his success).  Either way, questions concerning how Paul makes his selection are more interesting to me, so let me focus for the moment on that.

First of all, one could easily argue that unlike flipping a coin, the trials here (i.e. Paul’s selections) are NOT independent. Indeed, what’s going on here may be very similar to an article in the New York Times a couple of years ago that discussed the lurking presence of the Monty Hall Problem in a classic experiment from psychology (which I discussed here).

The idea is quite simple.  Paul chooses between two opponents in the World Cup by selecting a piece of food from one of two boxes.  Each box is labeled with a country’s flag, and this is the most obvious distinction between the boxes.  Suppose, for the sake of argument, that Paul has in his mind a ranking of his preferences for the flags, starting with the one he likes the most, and ending with the one he likes the least.  Assuming between any pair of flags Paul prefers one to another, one could theoretically determine his preferences by giving him sufficiently many pairings of different flags.  Moreover, assuming he does have preferences, the game selections are no longer independent, because each game gives us some information about his preferences.

Let’s dig deeper and look at his selections throughout the World Cup.  Paul gave predictions for 8 games, and those 8 games involved 9 separate teams (only one game did not involve Germany).  In the first game, Germany versus Australia, Paul selected Germany.  Let us note that as: Now, already this gives us information about Paul’s preferences.  There are 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 9! = 362,880 possible ways to order these 9 teams, since you can choose any one of the 9 teams to be your favorite, any one of the 8 remaining to be your second favorite, and so on.  But given the above information, we already know that any choice with Australia ranked higher than Germany can not match Paul’s preferences.  This eliminates a surprising number of possible outcomes – half, in fact, since for every list in which Germany is ranked higher than Australia, we can flip these two countries to obtain a ranking in which Australia is higher than Germany.

This then affects the probability of all subsequent pairings!  Let’s take a look at the next game.  In the second game, Paul correctly predicted Serbia to defeat Germany.  We can write this as: This gives us even more information!  Now, not only do we know that Paul prefers Germany to Australia, we also know he prefers Serbia to Germany.

Moreover, suppose (as seems reasonable) we assume the probability that he prefers Germany to Australia is 50%.  Now, after the first match, we know he prefers Germany to Australia – the right followup question to ask is, what is the probability he prefers Serbia to Germany GIVEN that he prefers Germany to Australia?  In particular, we have a conditional probability question here – these two facts are not independent!  For example, the fact that Paul prefers Germany to Australia means that Germany cannot be last in his rankings – this should decrease the probability that he prefers Serbia to Germany.  And indeed it does: supposing that each of the 362,880 rankings is equally likely, the probability that someone will prefer Serbia to Germany given that they prefer Germany to Australia is not 50%, but is only 33%!

To see why, consider an arbitrary ranking of the 9 teams.  If we only consider the relative placement of Serbia (S), Germany (Ge), and Australia (A), there are 6 possible rankings among these three:

  1. S > Ge > A
  2. S > A > Ge
  3. Ge > S > A
  4. Ge > A > S
  5. A > Ge > S
  6. A > S > Ge.

However, if we also know that Ge > A, then rankings 2, 5, and 6 are eliminated, leaving us only with

  1. S > Ge > A
  2. Ge > S > A
  3. Ge > A > S.

In particular, of these three remaining, Serbia is ranked higher than Germany only once.  Therefore, the probability that S > G GIVEN that G > A is only 1/3, or 33%.  (If you prefer, you can use a counting argument to show this as well.)

Of course, just as the first match gave us information, so did the second.  Therefore, when it comes to the third match, we have even more information at our disposal.  The third match was between Germany and Ghana, and Paul correctly identified Germany.  In other words: Now the appropriate question to ask, of course, is: what is the probability that Paul prefers Germany to Ghana, given that he prefers Serbia to Germany and Germany to Australia?  Well, we know that Germany can’t be his first or his last choice, because it must be preceded by Serbia and followed by Australia.  Therefore, among these four countries, Germany must rank second or third.

If Germany ranks second, Serbia must be first, but we are free to make Australia third or fourth.  Similarly, if Germany is third, then Australia must be last, and we are free to make Serbia first or second.  In other words, we have four outcomes:

  1. S > Ge > Gh > A
  2. S > Ge > A > Gh
  3. S > Gh > Ge > A
  4. Gh > S > Ge > A.

Among these four choices, only 2 have Germany preferred over Ghana.  Thus the conditional probability that one would prefer Germany to Ghana is again 50%.

The interested reader can easily continue on in this fashion.  If you’re impatient, however, you can calculate the probability that Paul would have made the selections he did more directly.  All we need to know is who Paul selected in each match.  I’ll tell you that in the subsequent matches, Paul picked Germany over England, Germany over Argentina, Spain over Germany, Germany over Uruguay, and Spain over the Netherlands.

Given this information, suppose you want to know the probability that Paul selects Germany over Australia, Argentina, Uruguay, Ghana, and England, while he selected Spain over Germany and the Netherlands and Serbia over Germany.  As stated before, there are 9! possible lists of preferences.  In this case, it’s not hard to determine how many would lead to the behavior seen in this year’s World Cup.  Since Paul picked Germany over 5 teams, but behind 2 teams, we know that Germany can only be ranked 3rd or 4th (any higher and there wouldn’t be room for Spain and Serbia above, any lower and there wouldn’t be room for the 5 teams below).

If Germany is ranked 3rd, then we can choose to put either Spain or Serbia in 1st place.  Whichever one we don’t put in first place will then need to go in 2nd place, so that both countries are ranked higher than Germany.  After that, we are free to order the remaining countries however we like.  In other words, we see that there are 2 x 6! = 1,440 possible lists of preferences if Germany is ranked 3rd:

  1. 2 choices
  2. 1 choice
  3. Germany
  4. 6 choices
  5. 5 choices
  6. 4 choices
  7. 3 choices
  8. 2 choices
  9. 1 choice.

Meanwhile, if Germany is in 4th place, we need to figure out how many ways there are to choose the three teams above it.  Notice that since Australia, Argentina, Uruguay, Ghana and England must all be ranked lower than Germany, the three countries ranked above it must be Spain, Serbia, and the Netherlands.  However, since we also need the Netherlands to be ranked below Spain, this only gives us three possibilities for the ranking of the first three teams: Spain, Serbia, Netherlands; Spain, Netherlands, Serbia; and Serbia, Spain, Netherlands.  Once we have made that selection, however, we are free to choose the 5 teams below Germany however we please.  In other words, if Germany is 4th, there are 3 x 5! = 360 possible lists of preferences.

Combining these, we see that there are 1,800 possible lists of preferences that would lead to the behavior shown by Paul.  Since the total number of outcomes is 9!, this gives a probability of only 1,800/9!, or roughly .49%.  This is the same value you will get if you calculate the remaining conditional probabilities and multiply them together.

Of course, if one wants a more impressive number, one can always try to correct for bias in Paul’s selection.  For example, suppose we assume that Paul will never choose England or Australia if given the option of a different flag – this seems reasonable, given experiments on how his species sees (the other flags have more contrast and are more focused on horizontal shapes, which apparently his species is drawn to).  If we make this assumption, the number of potential preference lists drops to 7! x 2 = 10,080, in which case the probability that Paul would choose as he did jumps up to 17.9%!

There are valid concerns about this model, though.  For instance, given the choice between two flags, why should we assume that Paul will always choose the same one over the other?  Equivalently, why should we believe that Paul’s decisions follow a prescribed preference list?  Indeed, when Paul was used to make predictions in 2008, he selected Germany over Spain, unlike his selection of Spain over Germany in 2010.  In 2008, it was Germany who was the victor, and so Paul guessed incorrectly – perhaps he learns from his mistakes, after all.

Whatever the case, I doubt this octopus has any special ability.  And even if he does, I don’t know that that would necessarily be a good thing.  For we all know that when 8 limbs are combined with super powers, nothing good can come of it.

Is this the future that Paul's powers portend? I believe it is.

A New Birthday Problem

Last week, Slashdot posted an interesting link to a problem posed at the most recent Gathering 4 Gardner, a mathematical (or perhaps I should say mathemagical) convention created in honor of the late Martin Gardner.  The question, posed by Gary Foshee, is as follows: you have a friend with two children, one of whom is a boy born on a Tuesday.  What is the probability that the other child is a boy?

Forget about the Tuesday fact for a moment – if you have a friend with two children, one of whom is a boy, what is the probability that the other child is a boy?  You might expect that the answer should be 50%, since the sex of one child shouldn’t affect the sex of the other.  But this is not quite right, because you’re not told whether the boy is the older or younger child.

There are only four possibilities when one has two children, so the situation is easy to analyze.  With two kids, the four possibilities are boy boy, boy girl, girl boy, and girl girl.  If you know that one of the kids is a boy, this eliminates girl girl from the list of potential combinations, leaving us with the three outcomes boy boy, boy girl, and girl boy.  Of these three outcomes, we see that only the first has two boys, and so we conclude that the probability of the second child being a boy is 1/3, NOT 1/2!

There’s another way to answer this question, one that generalizes nicely to the more complicated question asked by Foshee.  There are two cases to consider: either the boy you know about is the younger child, or the older child.  If the boy you know about is the younger child, there are two possibilities for the older child (girl or boy).  Similarly, if the boy you know about is the older child, there are two possibilities for the younger child (girl or boy).  This gives four outcomes, but you have counted the boy boy outcome twice.  In other words, we see there are only three distinct outcomes, and only one of them has two boys, so again we see that the probability is 1/3.

Now let’s return to the original question.  Again, it seems like the fact that the boy was born on a Tuesday shouldn’t matter, but if we do the same analysis as above, we’ll see that this is not the case.  The information about the day does make the number of outcomes larger, however, so it’s easier to get mixed up.

As before, let’s split into two cases, depending on whether the Tuesday boy is younger or older.  If the younger child is the Tuesday boy, then there are 14 possible outcomes for the older child, since there are 2 choices for the sex of the child and 7 choices for the day of the week on which the child was born.  Similarly, if the older child is the Tuesday boy, there are once again 14 possible outcomes for the younger child.  However, notice that we have counted the outcome of two boys both born on Tuesday twice, just as we counted the outcome of two boys twice in the simpler problem.  Correcting for this double counting, we see that there are 14 + 14 – 1 = 27 possible outcomes.  Of these outcomes, 13 of them correspond to having a two boys – if the younger child is the Tuesday boy, there are 7 possible outcomes that will give us two boys, and if the older child is the Tuesday boy, there are again 7 possible outcomes.  As before, though, we’ve counted both children being Tuesday boys twice, so we subtract 1 to correct for this double-counting, which leaves us with a total of 7 + 7 – 1 = 13 desired outcomes.  This means that the probability of the second child being a boy is 13/27.  While still not equal to 1/2, this is much closer to 1/2 than 1/3.

Much like the Monty Hall problem, however, one can understand the mathematical reasoning behind this problem and still have trouble with the intuition.  After all, why should the day on which a child was born have any bearing on the sex of the other child?  On the face of it, the solution to this problem doesn’t make any sense.  One way to try and marry this solution to our intuition is to assign some numbers and explore the data, seeing where our intuition diverges from this picture.

Suppose we take a survey of 19,600 families with two children (you will understand why I’ve chosen this seemingly random number in a moment).  Of those families, suppose they are evenly divided among boy boy, boy girl, girl boy, and girl girl households.  In particular, we see that there are 4,900 families with two boys, 4,900 families with two girls, and 9,800 families with a boy and a girl (in half of these families the boy is older, and in half the boy is younger).

Now suppose we further subdivide the data according to the day of the week in which the child was born.  Suppose that a child is equally likely to have been born on any day of the week.  This means that, for example, among the 4,900 families with two boys, 4,900 ÷ 49 = 100 will have boys who were both born on Monday, 100 will have boys for which the oldest was born on Monday and the youngest Tuesday, 100 will have boys for which the oldest was born on Monday and the youngest Wednesday, and so on.  In other words, for every possible combination of sexes and days of birth, there will be 100 families that match that combination.

Suppose, now, that you take a random family in which one of the children is a boy born on Tuesday.  There will be 1,400 families for which the younger child is a born boy on Tuesday, and 1,400 families for which the older child is a boy born on Tuesday, which means there are 2,700 families total families under consideration (note that we have counted the 100 families with two boys born on Tuesday twice).  Of those 2,700 families, 1,300 will have two boys, for the same reason as above (you can also look at the diagram above – imagine each square as representing 100 families in the survey).  So, if we pick a FAMILY at random with a boy born on Tuesday, the probability that the other child is a boy is 1,300 out of 2,700, or 13/27.

Now, instead of picking a family at random, suppose we pick a BOY at random who was born on Tuesday, and ask for the probability that the boy’s sibling is also a boy.  In this case, note that there are 2,800 boys in our sample who were born on Tuesday – 1,400 who are the younger sibling and 1,400 who are the older sibling.  In particular, note that we don’t subtract out the families with two boys both born on Tuesday for double counting in this case; this is because we are choosing the boy, not the family, and choosing the younger boy born on Tuesday is different from choosing the older boy born on Tuesday.  Because there’s nothing to subtract out, we see that of these 2,800 boys, 1,400 have sisters and 1,400 have brothers.  Therefore the probability that the BOY has a brother is 1/2, the answer our intuition gave us from the beginning!

In other words, you can try to understand this seeming paradox as a difference in perspective.  If we look from the perspective of the family unit, the probability that a two-child FAMILY with one son born on Tuesday will have two sons is 13/27, slightly less than one half.  However, from the perspective of the children, the probability that a BOY born on Tuesday in a two-child family will have a brother as opposed to a sister is 1/2.  In this latter case, the day of the week really is irrelevant.

For some, however, this explanation still may not seem like enough.  I’d encourage anyone with a different approach to sound off below!

The Twilight Saga: A Mathematical Perspective

Living in Los Angeles, it’s hard not to be aware of the fact that the new Twilight movie, Eclipse, arrives in theaters today.  The series has developed an insatiable fan base of people willing to spend thousands of dollars to fly here in the hopes of scoring tickets to the premiere, which certainly indicates the film will be a success.  But of course, the film’s success was never in question: with the first two movies having grossed over $1 billion worldwide, the success of this latest entry in the franchise is a foregone conclusion.

Of course, the success of this franchise should not be viewed in isolation, but as just a part of the larger vampire pop culture renaissance.  HBO’s True Blood, also based on a book series involving a girl who knocks boots with the undead, is going strong into its third season this summer, and the CW’s Vampire Diaries will return for a second season this fall.  And just when I thought the market for vampire-themed programming had become saturated, ABC premiered its own summer show featuring blood suckers called The Gates.  Clearly there is a trend here, with the ever-growing popularity of the vampire at its center.  No doubt Eddie Murphy is rolling in his undead grave for not releasing Vampire in Brooklyn 10 years later.

While there are many words that could be used to describe these shows and movies that place supernatural love triangles at their center, “realistic” is not one of them.  Nevertheless, there are a handful of people who have taken a critical eye to the vampire phenomenon and have used mathematical models to gain insight into how the populations of such creatures might behave in real life.  Just like the fights between Team Edward and Team Jacob, however, the debate over whether vampires could actually exist rages on.

Not long ago, an article went around the web purporting that a team of physicists had proven that vampires could not exist.  The physicists, Costas Efthimiou and Sohang Gandhi, posted a paper to the arXiv in which they purport to use physics to dispel pop culture portrayals of ghosts and zombies, in addition to vampires.  Their argument for debunking vampires rests on the following assumptions:

  1. When a vampire bites a human, that human becomes a vampire (we will return to this assumption later).
  2. Vampires need to feed on a human once every month (a conservative estimate when compared to what popular culture would have us believe).
  3. Assume the first vampire came into existence in 1600, when the human population was roughly 500 million.
  4. Ignore human mortality rates due to other factors, and ignore human birth rates as well.

With these hypotheses, they show that vampires would wipe out humanity in just 2 1/2 years.  In fact, no matter the size of the initial human population, their model will lead to humanity’s extinction in a short amount of time.  This is true even if we assume a more conservative estimate on the length of time vampires can go between feedings.

The reason is simple.  Using their model, the first vampire will feed after one month, creating a new vampire (by assumption 1).  After 2 months, those 2 vampires will each feed, giving us a total of 4 vampires.  After 3 months, those 4 vampires will feed, giving us a total of 8 vampires.  The pattern continues – after n months, the vampire population will be 2n. In other words, the population of vampires will grow exponentially.  Moreover, because of the assumption on the birth and mortality rate of mankind, we see that as the population of vampires grows exponentially, so too must the population of humans shrink exponentially.  This means that at some point (sooner than you might think), humans would be wiped out.

The careful reader, however, will note a number of problems with this analysis.  For one, ignoring the birth rate of humans means that the model’s date of extinction is premature.  However, Efthimiou and Gandhi point out that even if we include the birth rate, that rate would not be high enough to counteract the explosion in the vampire population.  A more serious flaw, however, is in not considering the mortality rate of the vampires themselves.  After all, once people realize there are vampires in their midst, wouldn’t they fight back, or at least defend themselves so that not all of the vampires could feed?  Assuming that every vampire would be able to feed whenever necessary seems unrealistic.

What’s more, assuming that vampires can only satisfy themselves with human blood, it seems unreasonable to assume that vampires would feast so carelessly, without regard to the diminishing supply of their food.  If vampires killed all humans, they in turn would die (again), and so it seems reasonable to expect that vampires would apply a better strategy, one in which they kept the human species afloat so that they could themselves continue to exist.  Just ask Ethan Hawke.

In a brief article for Math Horizons, mathematician Dino Sejdinovic addresses these issues and highlights an article from 1982 that modeled the vampire outbreak more realistically, by including the human birth rate and vampire mortality rate.  In doing so, the mathematics becomes less fit for a general audience, but it also gives us a more interesting picture – regardless of the collective desire for human blood, vampires can act in a way that the ratio of vampires to humans reaches an eventual equilibrium.  In other words, it doesn’t seem right to throw out the idea of vampires based on purely mathematical arguments.

Interestingly, though, all of these analyses rest upon assumption (1), which states that humans always become vampires once bitten.  In the modern incarnation of these creatures, however, this assumption no longer appears to be valid.  For example, in both True Blood and The Vampire Diaries, the process of turning into a vampire requires consent (I guess it’s more romantic that way); not only must the vampire drink the human’s blood, but the human must also drink the vampire’s blood.  In this case, it is possible for vampires to satiate themselves without killing humans (provided the vampires can show enough restraint) or increasing their own population.

There also appear to be rules governing population control in vampire communities.  For example, in an episode of True Blood, one vampire is tasked with creating a new vampire as penance for murdering one of his own kind.  Are such rules keeping the population stable widespread?  How might such rules, in conjunction with a weakening of assumption (1), alter the vampires’ optimal strategy?  I will leave it to the curious reader to discover the answer.

Happy Tau Day?

In the past, I’ve used this blog as a platform to make clear my mixed feelings about Pi Day, a math themed holiday celebrated every year on March 14th (3/14, har har) in honor of the beloved mathematical constant \pi.  My thoughts on the subject can be found here.

It would seem that I am not alone in my frustration.  Michael Hartl, an educator and entrepreneur (as well as a Ph.D. graduate from Caltech), has just today launched a website in favor of Tau Day as a replacement for Pi Day.  However, his argument (based on a 2001 paper by Bob Palais) goes a step farther – he argues that \pi day shouldn’t be celebrated because \pi isn’t the fundamental constant we should be considering!  Rather, he argues that the true fundamental constant is 2\pi, which is approximately 6.283185… .  Hartl argues that this should be the fundamental constant of interest, and renames it \tau (for reasons given on the website).

Why should this be viewed as a more fundamental constant?  Recall how \pi is defined – it is the ratio of a circle’s circumference to its diameter.  But a circle itself is more naturally defined in terms of the radius, i.e. as the set of points whose distance from the center is equal to the radius.  Because of this, doesn’t it seem more natural to consider the ratio of a circle’s circumference to its radius, rather than the ratio of circumference to diameter?  Put another way, isn’t a more natural constant given by the circumference of a circle with radius 1 rather than the circumference of a circle with radius 1/2?  He offers plenty of other aesthetic examples for why 2\pi should be viewed as more fundamental, including references to the Bernoulli numbers and simple quadratic forms.

On the one hand, this may seem like a trivial issue – after all, the difference between \pi and \tau is only a factor of 2, and different normalizations of quantities are quite common in mathematics.  On the other hand, Hartl does make a convincing argument from a pedagogical point of view.  His strongest argument comes from trigonometry.  When students learn to convert between radians and degrees, they learn that 2\pi corresponds to full revolution.  From this, one sees that half of a revolution corresponds to an angle of \pi, 1/4 of a revolution corresponds to an angle of \pi/2, and so on.  But if we define the fundamental quantity to be \tau, then in radians, half a revolution is \tau/2, a quarter of a revolution is \tau/4, and the measure of c revolutions is given by c\tau for any number c.

Hartl concludes the following: “The unnecessary factors of 2 arising from the use of \pi are annoying enough by themselves, but far more serious is their tendency to cancel when divided by any even number. The absurd results, such as \pi/2 for a quarter circle, obscure the underlying relationship between angle measure and the circle constant. To those who maintain that it “doesn’t matter” whether we use \pi or \tau in teaching trigonometry … from the perspective of a beginner, using \pi instead of \tau is a pedagogical disaster.”

It’s an interesting argument, and one I think students would benefit from seeing.  \pi is fairly entrenched, so I’m not sure how much of a following Hartl will gain, but even if \pi remains the standard, offering students this viewpoint can only help them as they learn trigonometry.  For that reason, I for one will be endorsing Tau Day (6/28, get it?).  It certainly doesn’t sound as delicious as Pi Day, and the fact that students are out of school is a bit of a problem, but today is apparently the inaugural Tau Day, and these are wrinkles that I’m sure can be ironed out.

So happy Tau Day to you, no matter your preference!

(Big ups to James Hawkins for sending me the Tau Day link.)

Deep Sea Math Hunting

Every now and then an article pops up which highlights a link between mathematics and the animal kingdom, and I’ve been able to discuss several such links on this blog.  The latest entry into this category concerns the movement of sharks (and other ocean creatures) as they hunt for food.  A recent article in Nature has spawned a great deal of interest, and the topic has been discussed on the websites of Wired, Discovery, and Physics World.

What does the motion of sharks have to do with mathematics?  Well, suppose you are a shark.  Unfortunately, there are not yet any In-N-Out’s under water, so when it comes to food you are on your own.  What would be the best way to forage for your food?  With your heightened senses, you would undoubtedly be a formidable opponent in an area rich with prey, but what if you are in a more sparsely populated area?  What’s the best way for you to search for nutrition?  As it turns out, the best thing for you to do may be to follow a type of random motion known as a Lévy flight.

The following is from the aforementioned Wired article: “Computer models suggest Lévy flight is the optimal search pattern for predators in low-prey areas, and maximizes the chance of a random encounter.”  Visually speaking, a Lévy flight is characterized by short movements in random directions, interspersed with occasional longer trips in a particular direction.  Here is a sample from the Wikipedia article on the subject:

I bet little kids are awesome at drawing Lévy flights.

Suggestions of Lévy flight patterns in animals goes back to at least 1996, but a recent study led by David Sims of the Marine Biological Association Laboratory is the first to reach these conclusions on the basis of such a vast amount of data.  Indeed, by tagging marine animals with GPS locators, they were able to track the movement of 14 different predator species over a period of 5,700 days!  With some rare exceptions (such as the great white shark), they found that animals in areas with less food were more likely to move in a Lévy flight pattern.

Several articles (jokingly, I assume) point out that this should make us even warier of sharks, since they can use math to optimize their behavior.  As is often the case with these articles, the headline rarely reflects the reality.  Whether this behavior is learned or evolved, the fact that sharks may follow this pattern says as much about their mathematical ability as the resemblance of trees to fractal patterns says about their mathematical ability.  Mathematics is a language that we can use to describe nature, so the fact that we can use mathematics to describe this natural phenomenon shouldn’t necessarily be that surprising.  If a shark gets a 5 on the calculus AP, that’s when I’ll be surprised.

Here are some sharks in a group study session (courtesy of the Discovery article mentioned above).

Love and Marriage

I’ve previously discussed some mathematical approaches to dating.  Specifically, we have seen how choosing a partner can be modeled as a type of secretary problem, and, if you like, you can estimate the number of candidates you should consider by using a modified Drake’s equation.  However, as you know, building a lasting relationship is about more than choosing the right partner; maintaining a happy relationship takes work.  And even though most people go into a relationship believing they will not end up as a statistic, the unfortunate reality is that nearly half of all marriages in this country will end in divorce.

How can it be that despite the best intentions of many couples, such a significant proportion will not endure?  As one always should, we can turn to mathematics for possible answers.  In fact, José-Manuel Rey of the Department of Economic Analysis at the Universidad Complutense in Madrid has done just that, by proposing a mathematical model to explain the dynamics of long term relationships.  His model is based on the following assumptions: first, that the individuals in the couple have similar traits (this is spelled out more precisely in Rey’s paper, but basically this is so that he can consider one utility function for the couple rather than separate utility functions for each individual), and second, he assumes a so-called law of thermodynamics for sentimental relationships.  This law can be stated as follows: “There is tendency for the initial feeling for one another to fade away. This kind of inertia must be counteracted by conscious practices.”  In other words, the natural deterioration in a relationship must be counteracted by continual effort, such as a romantic evening, a heartfelt conversation, or listening to some slow jams by R. Kelly.

Given these assumptions, Rey constructs a model to try and explain the paradox (which he terms the failure paradox) in the fact that people who believe they will be together forever will, with a high degree of probability, end up separating.  And sure enough, his model offers us an explanation.  His work is perhaps best summarized by the following picture:


The graph above is of feeling (x axis) versus effort (y axis) – in other words, how satisfied is the couple in their relationship compared to how much effort they are willing to put into it?  Rey’s model asserts that there is a minimum happiness (denoted xmin above) below which a relationship cannot survive.  In other words, if your happiness crosses the red vertical line, your relationship is doomed.  The couple’s initial happiness is given by the value x0 – note that this is greater than the happiness of the equilibrium point, reflecting the common observation that happiness within a couple frequently decreases from its initial state (you can think of this as the honeymoon period, if you like).  This isn’t to say that couples in long term relationships are unhappy, for indeed the happiness at the equilibrium point is still above xmin; all this is saying is that happiness is lower at the equilibrium stage than it was initially.

What about the effort involved?  Heuristically speaking, the amount of effort you put into a relationship should increase your own happiness, up to a point, but then should begin to negatively affect your happiness.  Driving your partner to the airport every once in a while will probably make you feel good about yourself, but driving your partner to the airport every weekend will probably start to wear on you.  On the graph above, c* represents the turning point from when the amount of effort you put in positively affects your happiness to when it negatively affects your happiness.

Here is where an important feature of the model comes into play: the amount of effort at the equilibrium point is greater than c*!  In other words, in order to maintain equilibrium in your relationship, you need to be putting in more effort than you would optimally choose to.  It is this s0-called “effort gap” that Rey identifies as being the cause of so many problems.  Note that if the effort gap is large enough, then maintaining the appropriate level of effort may drive happiness below the minimum required value, and even if this is not the case, maintaining the appropriate level of effort will still require some loss of the couple’s happiness, since the effort required will always be greater than c*.  To put it more bluntly, relationships require sacrifice.

I’d encourage you to keep this in mind as you navigate the unpredictable seas of love.  When it comes to the effort involved in keeping your relationship afloat, mathematics warns you against complacency.  When in doubt, it’s always best to go the extra mile.  And if you need help wooing your special someone, you can always call on R. Kelly to help set the mood.

Aziz Ansari does a mighty fine impression of Kells.

RIP Martin Gardner

Not long ago, I wrote an article in commemoration of Martin Gardner’s 95th birthday.  Sadly, it seems this will be my last article in celebration of his birth, as he passed away late last month.

Through his passing, though, his influence has become even more apparent.  Perhaps because he published mathematical games in Scientific American for 25 years, the magazine has been the most visible in its veneration of him.  There are no less than six articles on Gardner at the SciAm website; while some are reprints of earlier articles, there is also new material from writers and mathematicians who were influenced in some way by Gardner’s unique career.  Since I can’t do justice to Gardner the way others already have, let me summarize what you can find if you’re interested in learning more about this stand-up fellow.

If you’d like to learn more about Gardner’s life, SciAm has reprinted to earlier essays on the man: one is a profile written by Philip Yam, originally published in 1995, and one is a tribute to Gardner’s influence written by Douglas Hofstadter, author of Gödel, Escher, Bach: an Eternal Golden Braid. For a look into Gardner’s interest in debunking pseudoscience, there is this article written by Michael Shermer, which was originally published in 2002.  Or, if you’d just like to try your hand at some recreational math puzzles, there are a couple of articles that give a sample of the material Gardner published in his column.  Here’s one for you to stew over, if you like (taken from the first link above):

Arrange four paper matches on a table as shown at right. They represent a martini glass. A match head goes inside to indicate the onion of a Gibson cocktail. The puzzle is to move just two matches so that the glass is re-formed, but the onion—which must stay where it is—winds up outside the glass. At the finish, the glass may be turned to the left or the right, or even be upside down, but it must be exactly the same shape as before.

Finally, there’s also this collection of musings from a few scientists and mathematicians who were influenced in their careers by Gardner’s work.  Hofstadter is quoted here, but for my money, a better quote can be found in his earlier article which I cited above:

There should be, it seems to me, be a prestigious national or international prize for writing about scientific ideas. As everybody knows, human civilization relies on science and technology more than at any time in the past, and that reliance can only increase. Yet the worldwide ignorance of and disdain for science, mathematics and precise thinking in general is appalling. Because of this tragic situation, people like Martin are precious purveyors of precious knowledge … if I dare say so, what Martin Gardner has done is of far greater originality than work that has won many people Nobel Prizes. Simultaneously achieving both depth and breadth is almost unheard of in today’s scientific world, but Martin Gardner is an exception, and it is a delight and a privilege to celebrate here his many achievements. Just as Martin’s writings have inspired me for decades, so they will undoubtedly continue to inspire other people for many decades to come.

RIP, Mr. Gardner.

Patient Problem Solving

Last year, I remarked on a TED talk from mathemagician Arthur Benjamin, who argued for the displacement of Calculus by Statistics in the hierarchy of high school mathematics.  This year, TED has sponsored a talk by high school math teacher Dan Meyer, who discusses what, in his view, are the major problems with the way mathematics is currently taught to kids, and what can be done to fix things.

His opening is spot on: “I teach high school math.  I sell a product to a market that doesn’t want it, but is forced by law to buy it.”  He goes on to argue that the problem with math education, a problem exacerbated by most textbooks, is that it discourages what he terms patient problem solving.  Problems in textbooks rarely reflect the types of problems one encounters in real life: textbook problems usually supply you with just the right amount of information, and the question is frequently just a matter of plugging values into an appropriate formula.  More complicated questions are frequently presented in multiple parts, so that students rarely have to think long and hard about any one thing.  Instead, they are offered bite-sized pieces of a larger problem, any one of which is relatively simple.  Taken together, the net result may be the solution to an interesting problem, but by holding students’ hands like this along the way, they never develop a taste for solving difficult problems.

This hand holding, Meyer argues, trains students to become impatient when they encounter an actual problem.  If a question isn’t broken down into easily digestible pieces, any one of which can be solved by plugging and chugging into one of a short list of formulas, students are trained to throw up their hands.  In a world fraught with problems, however, this does a disservice not just to students who study mathematics, but all students.

The video is a little over 10 minutes, but if you have the time I’d encourage you to watch it.  Meyer talks about potential solutions to this failure of our educational system, and compares the current state of affairs to watching episodes of Two and a Half Men.  And despite the presence of a fraction in the title, the comparison is not favorable.

Enjoy!

(Hat tip to Patrick for sending me the link.)