<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Math Goes Pop! &#187; probability</title>
	<atom:link href="http://www.mathgoespop.com/tag/probability/feed" rel="self" type="application/rss+xml" />
	<link>http://www.mathgoespop.com</link>
	<description>Ruminations on the Intersection Between Mathematics and Popular Culture</description>
	<lastBuildDate>Tue, 07 Feb 2012 04:49:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Playoff Probabilities</title>
		<link>http://www.mathgoespop.com/2011/10/playoff-probabilities.html</link>
		<comments>http://www.mathgoespop.com/2011/10/playoff-probabilities.html#comments</comments>
		<pubDate>Wed, 05 Oct 2011 19:33:02 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Sports]]></category>
		<category><![CDATA[baseball]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=1454</guid>
		<description><![CDATA[<p>Continuing with last week&#8217;s theme, and since we are in the midst of playoffs, I&#8217;d like to take a moment now to discuss another link between baseball and mathematics.  This link is particularly timely since the scuttlebutt on the internet suggests that next year the playoff rules for baseball will be changed: the number of teams <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2011/10/playoff-probabilities.html">Playoff Probabilities</a></span>]]></description>
			<content:encoded><![CDATA[<p>Continuing with <a href="http://www.mathgoespop.com/2011/09/moneyball.html">last</a> week&#8217;s theme, and since we are in the midst of playoffs, I&#8217;d like to take a moment now to discuss another link between baseball and mathematics.  This link is particularly timely since the <a href="http://www.baseball-reference.com/blog/archives/10800">scuttlebutt</a> on the internet suggests that next year the playoff rules for baseball will be changed: the number of teams competing for the World Series will increase from 8 to 10, and because of that, another round of playoff games will be introduced.</p>
<p>Currently, the playoffs consist of three rounds.  The first round is the Division Series, in which eight teams compete in a best-of-five match-up (equivalently, a first-to-three match-up, i.e. the first team to win three games wins the series).  The second and third rounds, better known as the Championship Series and World Series, are composed of four and two teams, respectively, but are both best-of-seven (equivalently, first-to-four).  Because of these three rounds of several games each, the playoff season is already quite long; therefore, the new proposed playoff round, it has been suggested, would be composed of either a single game between competing teams, or a best-of-three (first-to-two) series between the two teams.</p>
<p>Many people take issue with such a short series on the grounds of fairness.  In a season where each team plays 162 games, they say, it&#8217;s not fair for a team&#8217;s World Series hopes to ride on a single game, or even a short series composed of at most three games.  There are even those who suggest that the Division Series is too short, and that all three of the current rounds should be a best-of-seven.  These are noble sentiments, but are they reasonable?  We can use mathematics to try and answer this question.</p>
<p>Suppose two teams are meeting for a playoff series, and the probability that one team (call it, I don&#8217;t know, the <a href="http://sanfrancisco.giants.mlb.com/index.jsp?c_id=sf">Giants</a>) will win a single game is <em>p</em> (this model is fairly simple, and does not take into account advantages associated with the starting pitcher, for example, but let&#8217;s keep things basic for now).  Then the probability that this team will win a one game series is again <em>p</em>, since the series consists of a single game.</p>
<p>What if the series is three games long?  In this case, the Giants will win if they win the first two games, or split the first two games and win the third game.  So there are three winning outcomes: WW, WLW, or LWW.  The probability of the first outcome is <img src='http://s.wordpress.com/latex.php?latex=p%5E2&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^2' title='p^2' class='latex' />, while the second and third outcomes each have probability <img src='http://s.wordpress.com/latex.php?latex=p%5E2%281-p%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^2(1-p)' title='p^2(1-p)' class='latex' /> (a probability <em>p</em> of winning a game means a probability 1-<em>p</em> of losing it).  Adding these three probabilities gives the total probability that the team will win a best-of-three series:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=p%5E2%20%2B%202p%5E2%281-p%29%20%3D%20p%5E2%283-2p%29.&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^2 + 2p^2(1-p) = p^2(3-2p).' title='p^2 + 2p^2(1-p) = p^2(3-2p).' class='latex' /></p>
<p style="text-align: left;">In a best-of-five series, the Giants will win if they win three in a row, two of the first three and the fourth, or two of the first four and the fifth.  Using <a href="http://en.wikipedia.org/wiki/Combination">combinations</a> to count the possibilities, we see that in this case, the probability of the Giants winning the series is equal to</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=p%5E3%20%2B%20%5Cbinom%7B3%7D%7B2%7Dp%5E3%281-p%29%2B%5Cbinom%7B4%7D%7B2%7Dp%5E3%281-p%29%5E2&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^3 + \binom{3}{2}p^3(1-p)+\binom{4}{2}p^3(1-p)^2' title='p^3 + \binom{3}{2}p^3(1-p)+\binom{4}{2}p^3(1-p)^2' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%20p%5E3%2810-15p%2B6p%5E2%29.&#038;bg=T&#038;fg=000000&#038;s=0' alt='= p^3(10-15p+6p^2).' title='= p^3(10-15p+6p^2).' class='latex' /></p>
<p style="text-align: left;">With the same type of argument you can calculate the probability that the Giants win a best-of-seven series.  I&#8217;ll spare you the details: the result is <img src='http://s.wordpress.com/latex.php?latex=p%5E4%2835-84p%2B70p%5E2-20p%5E3%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^4(35-84p+70p^2-20p^3)' title='p^4(35-84p+70p^2-20p^3)' class='latex' />.</p>
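<p style="text-align: left;">The counting argument can be written once for any series length.  Here is a minimal Python sketch, assuming (as above) that each game is won independently with the same probability <em>p</em>; the function name <code>series_win_probability</code> and the <code>closed_forms</code> table are just illustrative choices.  It sums over the number of losses suffered before the clinching win, and checks the result against the four polynomials quoted in this post.</p>

```python
from math import comb


def series_win_probability(p: float, wins_needed: int) -> float:
    """Probability of winning a first-to-`wins_needed` series.

    Assumes every game is won independently with probability p.
    A series won with `losses` losses ends on a win, and the games
    before that final win can be arranged in
    comb(wins_needed - 1 + losses, losses) ways.
    """
    return sum(
        comb(wins_needed - 1 + losses, losses) * p**wins_needed * (1 - p) ** losses
        for losses in range(wins_needed)
    )


# Closed forms quoted in the post, keyed by the number of wins needed.
closed_forms = {
    1: lambda p: p,
    2: lambda p: p**2 * (3 - 2 * p),
    3: lambda p: p**3 * (10 - 15 * p + 6 * p**2),
    4: lambda p: p**4 * (35 - 84 * p + 70 * p**2 - 20 * p**3),
}

# The general sum and the polynomials should agree for any p.
for wins_needed, poly in closed_forms.items():
    for p in (0.3, 0.5, 0.55, 0.7):
        assert abs(series_win_probability(p, wins_needed) - poly(p)) < 1e-12
```

<p style="text-align: left;">Calling <code>series_win_probability(p, 5)</code> would extend the same model to a hypothetical best-of-nine series without any new algebra.</p>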
<p style="text-align: left;">In each case, the probability of winning the series is a <a href="http://en.wikipedia.org/wiki/Polynomial">polynomial</a> in <em>p</em>, the probability of winning a single game.  But how do these polynomials compare?  Let&#8217;s turn to technology to lead the way!</p>
<div id="attachment_1456" class="wp-caption aligncenter" style="width: 610px"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/10/Picture-5.png"><img class="size-full wp-image-1456" title="Picture 5" src="http://www.mathgoespop.com/wp-content/uploads/2011/10/Picture-5.png" alt="" width="600" height="532" /></a><p class="wp-caption-text">Probabilities for a one, three, five, and seven game series.</p></div>
<p>Above is a graph of these four functions &#8211; the <em>x</em>-axis represents the probability <em>p</em>, while the <em>y</em>-axis represents the probability of winning the series.  The dark blue graph is for a single-game series (the function is <em>p</em>), the light blue graph is for a three-game series (the function is <img src='http://s.wordpress.com/latex.php?latex=p%5E2%283-2p%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^2(3-2p)' title='p^2(3-2p)' class='latex' />), the light green graph is for a five-game series (the function is <img src='http://s.wordpress.com/latex.php?latex=p%5E3%2810-15p%2B6p%5E2%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^3(10-15p+6p^2)' title='p^3(10-15p+6p^2)' class='latex' />), and the red graph is for a seven-game series (the function is <img src='http://s.wordpress.com/latex.php?latex=p%5E4%2835-84p%2B70p%5E2-20p%5E3%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^4(35-84p+70p^2-20p^3)' title='p^4(35-84p+70p^2-20p^3)' class='latex' />).  What can we deduce from the picture above?</p>
<p>First, note that a longer series benefits the stronger team more than the weaker team &#8211; this makes intuitive sense, if you think about it.  Also, for teams that are perfectly evenly matched (i.e. <em>p = </em>0.5), the length of the series doesn&#8217;t affect the probability of winning the series, which is also 50% in each case.</p>
<p>But what about teams with a slight, moderate, or strong advantage over their competition?  How does the length of the series affect the probability of winning the series?  Let&#8217;s look at a small table of values, in the cases <em>p</em> = .55, <em>p</em> = .6, and <em>p</em> = .7.</p>

<table id="wp-table-reloaded-id-4-no-1" class="wp-table-reloaded wp-table-reloaded-id-4">
<thead>
	<tr class="row-1 odd">
		<th class="column-1">p</th><th class="column-2">Best of One Odds of Success</th><th class="column-3">Best of Three Odds of Success</th><th class="column-4">Best of Five Odds of Success</th><th class="column-5">Best of Seven Odds of Success</th>
	</tr>
</thead>
<tbody>
	<tr class="row-2 even">
		<td class="column-1">.550</td><td class="column-2">.550</td><td class="column-3">.575</td><td class="column-4">.593</td><td class="column-5">.608</td>
	</tr>
	<tr class="row-3 odd">
		<td class="column-1">.600</td><td class="column-2">.600</td><td class="column-3">.648</td><td class="column-4">.683</td><td class="column-5">.710</td>
	</tr>
	<tr class="row-4 even">
		<td class="column-1">.700</td><td class="column-2">.700</td><td class="column-3">.784</td><td class="column-4">.837</td><td class="column-5">.874</td>
	</tr>
</tbody>
</table>

<p>As you can see from the table (or the graph), the more evenly matched the teams, the less of a difference the length of the series makes.  If your team has a 55% chance of winning a given game, its chance of winning a seven game series is a little less than 6 percentage points higher than its chance of winning a single game.  With a 60% chance of winning a given game, the gain from a seven game series is 11 percentage points, and with a 70% chance of winning a given game, the gain is over 17 percentage points.</p>
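<p>If you don&#8217;t trust the algebra, the table can also be sanity-checked by brute force.  The sketch below is a Monte Carlo simulation under the same simplifying assumption (independent games, fixed <em>p</em>); the function name <code>simulate_series</code>, the trial count, and the fixed seed are illustrative choices, and the simulated values should only match the table to within about a percentage point.</p>

```python
import random


def simulate_series(p, wins_needed, trials=100_000, seed=0):
    """Estimate the chance of winning a first-to-`wins_needed` series.

    Each game is an independent coin flip won with probability p.
    A fixed seed keeps the estimate reproducible from run to run.
    """
    rng = random.Random(seed)
    series_won = 0
    for _ in range(trials):
        wins = losses = 0
        # Play games until one side reaches the required number of wins.
        while wins < wins_needed and losses < wins_needed:
            if rng.random() < p:
                wins += 1
            else:
                losses += 1
        if wins == wins_needed:
            series_won += 1
    return series_won / trials


# One row per value of p: best-of-one, -three, -five, and -seven.
for p in (0.55, 0.6, 0.7):
    row = [simulate_series(p, wins_needed) for wins_needed in (1, 2, 3, 4)]
    print(p, [round(estimate, 3) for estimate in row])
```

<p>A simulation like this is also easy to extend beyond the simple model, for example by letting <em>p</em> change from game to game, which the closed-form polynomials cannot handle directly.</p>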
<p>Note also that the change from a best-of-five series to a best-of-seven series isn&#8217;t really very large.  Even if your team is heavily favored (70% probability of winning each game), the change from a best-of-five series to a best-of-seven series is less than four percentage points.  With more evenly matched teams, the difference is even smaller, suggesting that expansion of the Division Series from a maximum of five to a maximum of seven games isn&#8217;t necessarily a great idea.</p>
<p>On the other hand, the largest change in probabilities comes with the jump from a best-of-one series to a best-of-three series.  While the change isn&#8217;t so significant for evenly matched teams (and more evenly matched teams would be most likely to play each other in this round under the suggested rule changes), for match-ups in which one team is heavily favored, the difference can be more significant.</p>
<p>Whether or not one wants longer series or shorter series depends, I suppose, on one&#8217;s baseball philosophy.  It certainly seems like having everything ride on a single game after a season of more than 150 games is a little unbalanced, but from a mathematical standpoint, the stronger team will most likely gain only a small advantage by moving to a three game series.  Of course, this simplified model can only tell us so much, and it&#8217;s possible that the advantages of a longer series are being underrepresented here.  To err on the side of caution, I&#8217;d be more inclined to support a best-of-three series, though whether or not this is possible without stretching the season too long is something that the folks who are paid better than me to think about these matters will have to decide.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2011/10/playoff-probabilities.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Scoreboard Stats</title>
		<link>http://www.mathgoespop.com/2011/05/scoreboard-stats.html</link>
		<comments>http://www.mathgoespop.com/2011/05/scoreboard-stats.html#comments</comments>
		<pubDate>Thu, 26 May 2011 21:28:19 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math in the News]]></category>
		<category><![CDATA[Sports]]></category>
		<category><![CDATA[baseball]]></category>
		<category><![CDATA[e]]></category>
		<category><![CDATA[poisson distribution]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=1247</guid>
		<description><![CDATA[<p>A couple of weeks ago I noticed this article on the Yahoo Sports page, which highlighted a statistically rare event that occurred in the American League on Sunday, May 8th.  On that day, 7 baseball games were played on the AL schedule, and in all of those games one team scored exactly 5 runs.  The post <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2011/05/scoreboard-stats.html">Scoreboard Stats</a></span>]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago I noticed <a href="http://sports.yahoo.com/mlb/blog/big_league_stew/post/Gimme-Five-American-League-scoreboard-features-?urn=mlb-wp5759">this</a> article on the Yahoo Sports page, which highlighted a statistically rare event that occurred in the American League on Sunday, May 8th.  On that day, 7 baseball games were played on the AL schedule, and in all of those games one team scored exactly 5 runs.  The post then links to <a href="http://news.yahoo.com/s/ap/20110509/ap_on_sp_ba_ne/bba5_alive">this</a> article from the AP, which gives this rare event the following context:</p>
<blockquote><p>It was the first time in 18 years that such a quirky thing happened with a full schedule. On Aug. 10, 1993, all seven NL games featured one team scoring precisely two runs, STATS LLC said.</p>
<p>The last time it occurred with five or more runs was July 20, 1955, when all four AL games had at least one team score exactly six, STATS LLC said.</p></blockquote>
<p>When I read this article, some questions immediately came to mind: exactly how rare is it for one team in each of 7 baseball games to score exactly 5 runs?  Also, if 7 teams in 7 games have the same score, which score are they most likely to share?  Are the 7 games with a common score of 2 more or less likely to occur than the 7 games with a common score of 5?</p>
<p>We can answer these questions with some (relatively) simple probability models, given some caveats.  I&#8217;d like to estimate these probabilities using only one parameter: the average number of runs a team scores during a game.  Of course, that average will vary from team to team, and also from year to year (in particular, runs per game have declined from the heyday of steroid-mania that gripped baseball at the turn of the millennium).  Due to different rules, there may also be variation between the American and National Leagues.  Let me ignore this, though, and consider only an average number of runs per game overall &#8211; what we lose in precision we will more than make up for in clarity.</p>
<div id="attachment_1249" class="wp-caption aligncenter" style="width: 320px"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/05/dingers.jpg"><img class="size-full wp-image-1249" title="dingers" src="http://www.mathgoespop.com/wp-content/uploads/2011/05/dingers.jpg" alt="" width="310" height="230" /></a><p class="wp-caption-text">Ahh, the late 90&#39;s, when it was easier to sock a few dingers.</p></div>
<p>The question remains: how many runs are scored on average in a baseball game?  I found some data online which is somewhat outdated, but I&#8217;ll stick to it for convenience (and, more importantly, out of laziness) &#8211; any alteration in this number is easy to propagate throughout the following discussion.  In <a href="http://www.hardballtimes.com/main/article/runs-per-game/">this</a> article from 2005, the author tabulated the average number of runs per game in MLB over a 5 year span from 2000-2004 (that&#8217;s over 12,000 games!).  He has a nice looking graph of the distribution of scores as well:</p>
<p><a href="http://www.hardballtimes.com/main/article/runs-per-game/"><img class="aligncenter size-full wp-image-1250" title="runspergame" src="http://www.mathgoespop.com/wp-content/uploads/2011/05/runspergame.gif" alt="" width="439" height="369" /></a>A savvy probability student might see the long tail of this probability distribution and liken it to the <a href="http://en.wikipedia.org/wiki/Poisson_distribution">Poisson distribution</a>, a distribution encountered in many probability courses, and which is frequently motivated by a desire to model &#8220;rare events.&#8221;  I put the term in quotations since what constitutes &#8220;rare&#8221; is frequently left undefined, and in any event, is not really pertinent to this discussion.</p>
<p>Let us suppose, then, that the number of runs scored per game by each team follows a Poisson distribution.  French aside, this means that the probability a team will score <em>n</em> runs is equal to</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=e%5E%7B-A%7D%5Cfrac%7BA%5En%7D%7Bn%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='e^{-A}\frac{A^n}{n!}' title='e^{-A}\frac{A^n}{n!}' class='latex' />,</p>
<p style="text-align: left;">where A is the average number of runs scored per game &#8211; in this case, 4.82, and <em>e</em> is the unsung hero sometimes known as <a href="http://www.mathgoespop.com/2010/01/e-day.html">Euler&#8217;s number</a>.  Don&#8217;t worry too much about this formula; if you prefer, the graph of the function <img src='http://s.wordpress.com/latex.php?latex=e%5E%7B-4.82%7D%5Cfrac%7B4.82%5En%7D%7Bn%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='e^{-4.82}\frac{4.82^n}{n!}' title='e^{-4.82}\frac{4.82^n}{n!}' class='latex' /> looks like this (courtesy of <a href="http://www.wolframalpha.com/">Wolfram Alpha</a>):</p>
<p style="text-align: left;"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/05/Picture-2.png"><img class="aligncenter size-full wp-image-1252" title="Poisson482" src="http://www.mathgoespop.com/wp-content/uploads/2011/05/Picture-2.png" alt="" width="320" height="193" /></a>Note that the fit isn&#8217;t perfect &#8211; this graph starts much lower at 0 than the graph of the actual data pictured above, for example &#8211; but there is precedent for using the Poisson distribution to model runs in a baseball game (<a href="http://www.jstor.org/pss/2684837">this</a> article provides one such example, but a subscription is required to view it in its entirety).  More careful analysis is possible, and can be found in resources like <a href="http://books.google.com/books?id=1mNZfyil2ecC&amp;lpg=PA168&amp;ots=oXZDh_q7X5&amp;dq=probability%20distribution%20of%20runs%20scored%20in%20a%20baseball%20game&amp;pg=PP1#v=onepage&amp;q=probability%20distribution%20of%20runs%20scored%20in%20a%20baseball%20game&amp;f=false">this</a> one, but again, I want to keep things relatively simple.</p>
<p style="text-align: left;">So, let us suppose that the probability that a team scores <em>n</em> runs is <img src='http://s.wordpress.com/latex.php?latex=e%5E%7B-4.82%7D%5Cfrac%7B4.82%5En%7D%7Bn%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='e^{-4.82}\frac{4.82^n}{n!}' title='e^{-4.82}\frac{4.82^n}{n!}' class='latex' />.  What then, is the probability that in a baseball game, one of the teams will score <em>n</em> runs?  Either team A can score <em>n</em> runs or team <em>B</em> can score <em>n</em> runs, but they can&#8217;t both score <em>n</em> runs since baseball games can&#8217;t end in a tie.  This means that the probability of A or B scoring <em>n</em> runs is simply the probability that A scores <em>n</em> runs plus the probability that <em>B</em> scores <em>n</em> runs, or <img src='http://s.wordpress.com/latex.php?latex=2e%5E%7B-4.82%7D%5Cfrac%7B4.82%5En%7D%7Bn%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='2e^{-4.82}\frac{4.82^n}{n!}' title='2e^{-4.82}\frac{4.82^n}{n!}' class='latex' />.</p>
<p style="text-align: left;">For the odds that this happens in all 7 games, we then raise this number to the 7th power (lurking under this is the assumption that runs scored in different games are <a href="http://en.wikipedia.org/wiki/Independence_%28probability_theory%29">independent</a>, which seems like an entirely reasonable assumption to make).  To summarize, we estimate that the probability that one team in each of 7 games scores <em>n</em> runs is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%282e%5E%7B-4.82%7D%5Cfrac%7B4.82%5En%7D%7Bn%21%7D%29%5E7.&#038;bg=T&#038;fg=000000&#038;s=0' alt='(2e^{-4.82}\frac{4.82^n}{n!})^7.' title='(2e^{-4.82}\frac{4.82^n}{n!})^7.' class='latex' /></p>
<p style="text-align: left;">If <em>n</em> = 5 (as it did earlier this month), the probability is roughly .064%.  In other words, if 7 AL games were played every day, you would expect this outcome once every 1,560 days or so.  Having said that, 5 is close to the most likely shared score: under this model the single-game probabilities peak at <em>n</em> = 4 (the largest whole number below the 4.82 average), with <em>n</em> = 5 a close second.  For comparison, when <em>n</em> = 2 the probability is only a paltry 0.000812%, making what happened on May 8th over 75 times more likely than what happened on August 10, 1993.  Of course, it&#8217;s not fair to compare these records to the 6 run record in 1955, since in that case only 4 games were played, rather than 7.  Nevertheless, it&#8217;s not difficult to adjust this model from 7 games to 4 games (or an arbitrary number of games).</p>
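<p style="text-align: left;">For anyone who wants to check the arithmetic, here is a short Python sketch of the model.  It assumes exactly what the post assumes (Poisson-distributed runs with an average of 4.82, at most one team per game hitting the target score, and independent games); the names <code>poisson_pmf</code> and <code>prob_shared_score</code> are illustrative, and the <code>games</code> parameter covers the 4-game case mentioned above.</p>

```python
from math import exp, factorial

MEAN_RUNS = 4.82  # average runs per team per game, as quoted in the post


def poisson_pmf(n, mean=MEAN_RUNS):
    """Poisson probability that a team scores exactly n runs."""
    return exp(-mean) * mean**n / factorial(n)


def prob_shared_score(n, games=7, mean=MEAN_RUNS):
    """Probability that one team scores exactly n runs in every game.

    Follows the post's model: 2 * pmf(n) per game (either team can hit
    the score, but not both), raised to the number of games.
    """
    return (2 * poisson_pmf(n, mean)) ** games


p5 = prob_shared_score(5)
p2 = prob_shared_score(2)
print(f"n = 5 in 7 games: {p5:.4%}, about once every {1 / p5:.0f} days")
print(f"n = 2 in 7 games: {p2:.6%}, so n = 5 is {p5 / p2:.0f} times more likely")
print(f"n = 6 in 4 games: {prob_shared_score(6, games=4):.4%}")
```

<p style="text-align: left;">Swapping in a different value for <code>MEAN_RUNS</code> is all it takes to redo the estimate with more recent scoring data.</p>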
<p style="text-align: left;">So, rather than some murky intuition telling us this event should be unlikely, with a little more effort we can attempt to quantify exactly how unlikely this event should be.  More sophisticated models for runs could be used, but perhaps that is a topic I will save for another day.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2011/05/scoreboard-stats.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parks and Recreation(al Mathematics)</title>
		<link>http://www.mathgoespop.com/2011/05/parks-and-recreational-mathematics.html</link>
		<comments>http://www.mathgoespop.com/2011/05/parks-and-recreational-mathematics.html#comments</comments>
		<pubDate>Tue, 10 May 2011 18:13:07 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math on TV]]></category>
		<category><![CDATA[charlie sheen]]></category>
		<category><![CDATA[combinatorics]]></category>
		<category><![CDATA[maximum likelihood]]></category>
		<category><![CDATA[parks and recreation]]></category>
		<category><![CDATA[Pigeonhole Principle]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[the office]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=1227</guid>
		<description><![CDATA[<p>Continuing last week&#8217;s trend of discussing mathematics in the context of NBC comedy, today I&#8217;d like to move from The Office to Parks and Recreation.  More specifically, I&#8217;d like to discuss local government wunderkind/aspiring club owner Tom Haverford, whose unique charm I cherish almost as much as Ron Swanson&#8216;s mustache.</p>
<p class="wp-caption-text">What a stud.</p>
<p>In a recent episode, <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2011/05/parks-and-recreational-mathematics.html">Parks and Recreation(al Mathematics)</a></span>]]></description>
			<content:encoded><![CDATA[<p>Continuing <a href="http://www.mathgoespop.com/2011/05/dunder-math-lin.html">last week&#8217;s</a> trend of discussing mathematics in the context of NBC comedy, today I&#8217;d like to move from <a href="http://en.wikipedia.org/wiki/The_Office_%28U.S._TV_series%29">The Office</a> to <a href="http://en.wikipedia.org/wiki/Parks_and_Recreation">Parks and Recreation</a>.  More specifically, I&#8217;d like to discuss local government wunderkind/aspiring club owner <a href="http://en.wikipedia.org/wiki/Tom_Haverford">Tom Haverford</a>, whose unique charm I cherish almost as much as <a href="http://en.wikipedia.org/wiki/Ron_Swanson">Ron Swanson</a>&#8216;s mustache.</p>
<div id="attachment_1230" class="wp-caption aligncenter" style="width: 434px"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/05/ron-swanson-pic.jpg"><img class="size-full wp-image-1230" title="ron-swanson-pic" src="http://www.mathgoespop.com/wp-content/uploads/2011/05/ron-swanson-pic.jpg" alt="" width="424" height="210" /></a><p class="wp-caption-text">What a stud.</p></div>
<p>In a recent episode, Tom Haverford waxed poetic on the slang he has invented to describe different types of food.  A clip is currently on YouTube (though I don&#8217;t know how long it will stay).</p>
<p style="text-align: center;"><object width="640" height="390"><param name="movie" value="http://www.youtube.com/v/FbR7mpX07Uw?fs=1&amp;hl=en_US" /><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><embed type="application/x-shockwave-flash" width="640" height="390" src="http://www.youtube.com/v/FbR7mpX07Uw?fs=1&amp;hl=en_US" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p style="text-align: left;">Here&#8217;s a list of the slang Tom uses:</p>
<p style="text-align: center;">desserts = &#8216;serts,<br />
entrees = tre-tre&#8217;s,<br />
sandwiches = sammies, sandoozles, or adamsandlers,<br />
cakes = big ole&#8217; cookies,<br />
noodles = long-ass rice,<br />
fried chicken = fry-fry chicky-chick,<br />
chicken parm = chicky-chicky parm-parm,<br />
chicken cacciatore = chicky catch,<br />
eggs = pre-birds or future birds,<br />
root beer = super water,<br />
tortillas = bean blankies.</p>
<p style="text-align: left;">Some folks had the brilliant idea to build on this new parlance by creating a website devoted to related slang that Tom Haverford might use.  The website, <a href="http://tomhaverfoods.com/">tomhaverfoods.com</a>, consists of one of several delightful pictures of Mr. Haverford, followed by a food item and an appropriate slang term.  Click on Tom&#8217;s face and you&#8217;ll get a new term.</p>
<p style="text-align: left;">
<div class="wp-caption aligncenter" style="width: 560px"><a href="http://tomhaverfoods.com/images/tom6.jpg"><img src="http://tomhaverfoods.com/images/tom6.jpg" alt="" width="550" height="368" /></a><p class="wp-caption-text">Watch out, ladies!</p></div>
<p>Here is where the math comes in: the slang items aren&#8217;t numbered, so one can&#8217;t be certain when one has seen all of these inspired terms.  Since the slang terms are (presumably) generated randomly, even if you click on Tom 1,000 times, there is a chance that there will be one slang term that simply hasn&#8217;t appeared.  The issue is further complicated by the fact that the volume of slang terms is surely growing (there is a link on the website for people to submit slang suggestions), but let me ignore this issue for now and simply assume that the website contains a fixed number (say <em>N</em>) of slang terms, and when you click on Tom&#8217;s face, one of those slang terms is selected at random to display.</p>
<p style="text-align: left;">Consider the following experiment: go to the website, and record the slang term that awaits you.  Then click on Tom&#8217;s face and record the next slang term.  Repeat this process until you encounter a slang term you&#8217;ve already recorded, and then stop.  This will give you a list of slang terms (say <em>k</em> of them).  The question, then, is the following: can we use <em>k</em> to estimate <em>N</em>?  In other words, can we use the number of unique slang terms we see to estimate how many total slang terms are on the website?</p>
<p style="text-align: left;">We can model this situation with some probability.  Let&#8217;s start with some preliminary analysis.  First, note that if there are <em>N</em> phrases on the website, the worst case is that you will get only one term, and the best case is that you obtain all <em>N</em> (this happens if you are lucky enough to have no repeats until you&#8217;ve seen every possible phrase, so that page view <em>N</em> + 1 must be a repeat, since there are only <em>N</em> total phrases &#8211; this is the <a href="http://en.wikipedia.org/wiki/Pigeonhole_principle">pigeonhole principle</a> at work).</p>
<p style="text-align: left;">Now, you will obtain only one slang term if the second term is the same as the first; since there are <em>N</em> terms total, the probability of the second term being the same as the first term is 1/<em>N</em>.  What about the probability that you obtain two terms?  This happens only if the second term is different from the first, but the third term is one of the previous two.  Since this gives <em>N</em> &#8211; 1 choices for the second page view and 2 for the third, the probability is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7BN-1%7D%7BN%7D%20%5Ccdot%20%5Cfrac%7B2%7D%7BN%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{N-1}{N} \cdot \frac{2}{N}.' title='\frac{N-1}{N} \cdot \frac{2}{N}.' class='latex' /></p>
<p style="text-align: left;">Let&#8217;s generalize this to find the probability that you obtain <em>k</em> terms for some <em>k</em> between 1 and <em>N</em>.  This happens precisely when the first <em>k</em> page views give new terms, but the (<em>k </em>+ 1)st view gives one of the previous <em>k</em> terms.  In order for the first <em>k</em> terms to all be distinct from one another, there are <em>N &#8211; </em>1 choices for the second term, <em>N</em> &#8211; 2 for the third, and so on, so that there are <em>N</em> &#8211; (k &#8211; 1) choices for the <em>k</em>th page view.  Then, since by this point you have seen <em>k</em> terms, this means there are <em>k</em> possible ways for the (<em>k</em> + 1)st view to be a repeat.  Or, mathematically speaking, the probability that the first repeat occurs at the (<em>k</em> + 1)st page view is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7BN-1%7D%7BN%7D%20%5Ccdot%20%5Cfrac%7BN-2%7D%7BN%7D%20%5Ccdot%20%5Cldots%20%5Ccdot%20%5Cfrac%7BN-%28k-1%29%7D%7BN%7D%20%5Ccdot%20%5Cfrac%7Bk%7D%7BN%7D%20%3D%20%5Cfrac%7B%28N-1%29%21%7D%7B%28N-k%29%21%7D%20%5Ccdot%20%5Cfrac%7Bk%7D%7BN%5Ek%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{N-1}{N} \cdot \frac{N-2}{N} \cdot \ldots \cdot \frac{N-(k-1)}{N} \cdot \frac{k}{N} = \frac{(N-1)!}{(N-k)!} \cdot \frac{k}{N^k}.' title='\frac{N-1}{N} \cdot \frac{N-2}{N} \cdot \ldots \cdot \frac{N-(k-1)}{N} \cdot \frac{k}{N} = \frac{(N-1)!}{(N-k)!} \cdot \frac{k}{N^k}.' class='latex' /></p>
<p style="text-align: left;">This formula is useful because it depends on <em>N</em>: if we have a value for <em>k</em>, we can determine what value of <em>N</em> makes this expression as large as possible &#8211; in other words, we can determine the most likely value of <em>N</em>, given <em>k</em> (this is the idea behind maximum likelihood estimation).</p>
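<p style="text-align: left;">As a quick check on the formula, here is a small Python sketch that computes it with exact fractions.  It assumes every term is equally likely on each click; the function name <code>first_repeat_probability</code> and the choice <em>N</em> = 10 are purely illustrative.  Since the first repeat has to show up somewhere between the 2nd and the (<em>N</em> + 1)st page view, the probabilities for <em>k</em> = 1 through <em>N</em> should add up to exactly 1.</p>

```python
from fractions import Fraction


def first_repeat_probability(total_terms, distinct_seen):
    """P(the first repeat occurs on page view distinct_seen + 1).

    total_terms is N and distinct_seen is k in the post's notation.
    Each page view is assumed to pick one of the N terms uniformly.
    """
    prob = Fraction(1)
    # Views 2 through k must each produce a term not seen before.
    for already_seen in range(1, distinct_seen):
        prob *= Fraction(total_terms - already_seen, total_terms)
    # View k + 1 must repeat one of the k terms seen so far.
    return prob * Fraction(distinct_seen, total_terms)


N = 10
distribution = [first_repeat_probability(N, k) for k in range(1, N + 1)]

assert distribution[0] == Fraction(1, N)                       # k = 1 case
assert distribution[1] == Fraction(N - 1, N) * Fraction(2, N)  # k = 2 case
assert sum(distribution) == 1                                  # a repeat must occur
```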
<p style="text-align: left;">Let&#8217;s get back to the matter at hand: discovering amazing terms for your favorite foods.  I conducted this experiment, and recorded six unique slang terms; my seventh term was a repeat, so I stopped.  Here is what I recorded:</p>
<p style="text-align: center;">raisins = old ass grapes<br />
hot wings = lil&#8217; flapperz<br />
shrimp = tiny ass lobster<br />
ketchup = kanye blood<br />
mountain dew = halo powerup<br />
gum = chew chew trains</p>
<p>(Sadly, I did not come across one of my personal favorites, funyuns = stank rings.)  Since I found 6 terms, I should take <em>k</em> = 6 in the above expression.  This means I want to choose <em>N</em> so that <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B%28N-1%29%21%7D%7B%28N-6%29%21%7D%20%5Ccdot%20%5Cfrac%7B6%7D%7BN%5E6%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{(N-1)!}{(N-6)!} \cdot \frac{6}{N^6}' title='\frac{(N-1)!}{(N-6)!} \cdot \frac{6}{N^6}' class='latex' /> is as large as possible. Let me simply call this expression <em>p</em>(<em>N</em>). By graphing <em>p</em>(<em>N</em>) for varying <em>N</em> (below is the graph for <em>N</em> between 6 and 200), we see that the maximum occurs at 19.  So, based on this data, the best guess as to the number of slang terms on the website is 19.</p>
<p><a href="http://www.mathgoespop.com/wp-content/uploads/2011/05/Picture-24.png"></a><a href="http://www.mathgoespop.com/wp-content/uploads/2011/05/Picture-25.png"><img class="aligncenter size-full wp-image-1239" title="pN6" src="http://www.mathgoespop.com/wp-content/uploads/2011/05/Picture-25.png" alt="" width="600" height="379" /></a><br />
Note that this problem is not unique to this website.  There are plenty of other internet destinations which use this same basic template.  For example, <a href="http://livethesheendream.com/">here</a> is one that will give you Charlie Sheen quotes (if you&#8217;re into that sort of thing).  In this case, I obtained 29 quotes before encountering my first repeat &#8211; this suggests that there are more Charlie Sheen quotes on the internet than Tom Haverford slang (using the same argument as above, the best estimate for <em>N</em> when <em>k </em>= 29 is an impressive 425).  This is an imbalance that I hope will be corrected over time.</p>
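<p>For readers who would rather compute than read a graph, here is a minimal Python sketch of the same search.  The function names and the search bound <code>N_max</code> are illustrative assumptions, not anything from the sites themselves; the code simply evaluates the probability above for each candidate <em>N</em> and keeps the largest.</p>

```python
from math import prod

def first_repeat_probability(N, k):
    """Probability that the first k views are distinct and view k + 1 is a repeat."""
    # (N-1)/N * (N-2)/N * ... * (N-(k-1))/N, then k/N for the repeat
    distinct = prod((N - j) / N for j in range(1, k))
    return distinct * k / N

def most_likely_N(k, N_max=2000):
    """Brute-force search over N in [k, N_max] (N_max is an assumed cap)."""
    return max(range(k, N_max + 1), key=lambda N: first_repeat_probability(N, k))

print(most_likely_N(6))   # 19
print(most_likely_N(29))  # 425
```

<p>A brute-force scan is enough here because the pool sizes involved are small; a much larger <em>k</em> would call for a larger cap.</p>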
<p>Of course, in practice one could gather more data before trying to estimate the number of Tom Haverford slang terms &#8211; rather than stopping after the first repeat, one could stop after the second, or third, etc.  This, in turn, would change the probability model, so I won&#8217;t get into it here.  I will say, though, that with a little bit more work one can show that the optimal choice of <em>N</em> given <em>k</em> distinct slang terms is the largest whole number such that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cleft%28%5Cfrac%7BN%7D%7BN-k%7D%20%5Cright%29%20%5Cleft%281-%5Cfrac%7B1%7D%7BN%7D%5Cright%29%5E%7Bk%2B1%7D%20%5Cgeq%201.&#038;bg=T&#038;fg=000000&#038;s=0' alt='\left(\frac{N}{N-k} \right) \left(1-\frac{1}{N}\right)^{k+1} \geq 1.' title='\left(\frac{N}{N-k} \right) \left(1-\frac{1}{N}\right)^{k+1} \geq 1.' class='latex' /></p>
<p style="text-align: left;">This is not exactly the simplest relationship between <em>k</em> and <em>N</em>, which makes a simple formula between the two values hard to come by.  This is unfortunate, but is not such an issue in this case, since you are free to click to your heart&#8217;s content, until you have claimed all the bounty that Tom&#8217;s manner of speech has to offer.</p>
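<p>Even without a closed formula, the inequality is easy to check by machine.  The sketch below clears denominators, so that the condition becomes (<em>N</em> &#8211; 1)<sup><em>k</em>+1</sup> &#8805; (<em>N</em> &#8211; <em>k</em>)<em>N</em><sup><em>k</em></sup>, and uses exact integer arithmetic.  It assumes the inequality holds up to some point and fails from then on, so walking upward from <em>N</em> = <em>k</em> finds the largest solution; the function names are my own.</p>

```python
def ratio_at_least_one(N, k):
    """Exact integer form of (N/(N-k)) * (1 - 1/N)**(k+1) >= 1, for N > k."""
    return (N - 1) ** (k + 1) >= (N - k) * N ** k

def largest_N(k):
    """Walk upward from N = k while the next value still satisfies the inequality."""
    N = k
    while ratio_at_least_one(N + 1, k):
        N += 1
    return N

print(largest_N(6))   # 19
print(largest_N(29))  # 425
```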
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2011/05/parks-and-recreational-mathematics.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Look, but don&#8217;t Scratch</title>
		<link>http://www.mathgoespop.com/2011/03/lookbutdontscratch.html</link>
		<comments>http://www.mathgoespop.com/2011/03/lookbutdontscratch.html#comments</comments>
		<pubDate>Thu, 03 Mar 2011 06:43:29 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math in the News]]></category>
		<category><![CDATA[lottery]]></category>
		<category><![CDATA[Pigeonhole Principle]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[wired]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=1090</guid>
		<description><![CDATA[<p>Ladies and gentlemen, please excuse my prolonged absence.  Life occasionally has a habit of getting in the way of the schedule that I&#8217;d like to keep; in this case, it means I haven&#8217;t been able to update over the past month.  Fear not though, for now I have returned, and I am ready to dish on <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2011/03/lookbutdontscratch.html">Look, but don&#8217;t Scratch</a></span>]]></description>
			<content:encoded><![CDATA[<p>Ladies and gentlemen, please excuse my prolonged absence.  Life occasionally has a habit of getting in the way of the schedule that I&#8217;d like to keep; in this case, it means I haven&#8217;t been able to update over the past month.  Fear not though, for now I have returned, and I am ready to dish on math and pop culture.</p>
<p>In that spirit, I would be remiss if I did not take a moment to mention <a href="http://www.wired.com/magazine/2011/01/ff_lottery/3/">this</a> article from Wired last month on the man who cracked the code for several scratch lottery ticket games.  Mohan Srivastava, geological statistician by day and mathematical rogue by night, discovered a pattern in certain scratch lottery tickets back in 2003, but I&#8217;m sure (as <a href="http://www.lotterypost.com/news/227079/1940460">this</a> article suggests) he&#8217;s received a bit more publicity since the Wired article hit.</p>
<p>I highly recommend reading the whole article, but I&#8217;ll outline the gist of his discovery here.  In order to do so, I&#8217;ll need to specify a type of scratch game he cracked.  The article focuses primarily on a tic-tac-toe themed scratcher shown below.</p>
<p><a href="http://www.wired.com/magazine/wp-content/images/19-02/ff_lottery4_f.jpg"><img class="alignleft" src="http://www.wired.com/magazine/wp-content/images/19-02/ff_lottery4_f.jpg" alt="" width="320" height="596" /></a>The left side of the ticket is what gets scratched &#8211; below each X and each O lies a number.  Once all of the ticket has been scratched, you can compare the uncovered numbers to the numbers on the eight 3&#215;3 grids.  If, in any of those grids, you can find three numbers in a row, column, or diagonal that match the hidden list, you are a winner.  Note the craftiness here &#8211; much like the McDonald&#8217;s monopoly game, it&#8217;s much more likely to get two numbers in a row rather than three, so that a ticket can seem tantalizingly close to being a winner.</p>
<p>In each square of each grid sits a number between 1 and 39.  Also, within each grid, no number appears more than once; however, since there are 72 squares total on the scratcher, some numbers must repeat themselves between grids (by the <a href="http://en.wikipedia.org/wiki/Pigeonhole_principle">pigeonhole principle</a>, if you like).  Some numbers may repeat several times (for example, 17 appears three times in the ticket on the left), while others will appear only once (such as 08).  The key to cracking the ticket, Srivastava realized, is to take note of the numbers that appear only once on the ticket.  Such numbers will be called &#8220;singletons.&#8221;</p>
<p>There are several singletons on the ticket presented here, and a more thorough analysis is given in the Wired article.  Most importantly, though, one of the grids has a row of singletons: 24, 12, and 29 are all singletons, and this sequence makes an appearance in the third row of the second grid.</p>
<p>What Srivastava observed was that if a ticket has a sequence of singletons in a winning row, column, or diagonal, then that ticket is likely to be a winner.  In particular, since you can determine all the singletons without scratching off the ticket, he realized that this game reveals information about the likelihood of winning!  In theory, one could (at least in 2003, before the game was pulled) make a career out of buying these tickets in bulk, scratching off the identified winners, and returning the remainder &#8211; Srivastava even went so far as to ask if lottery tickets could be returned, and found that indeed they could be (in fact, it seems as though this is not such an uncommon occurrence). Ultimately, the only reason why he exposed this fault was that he decided the effort involved in sticking it to the man wasn&#8217;t worth it &#8211; he thought he could earn roughly $600 a day by going through lottery tickets, but he earned more money and had more fun at his day job.</p>
<div id="attachment_1098" class="wp-caption aligncenter" style="width: 370px"><a href="http://www.imdb.com/title/tt0088850/"><img class="size-full wp-image-1098" title="Brewsters Millions " src="http://www.mathgoespop.com/wp-content/uploads/2011/03/brewstersmillions.jpg" alt="" width="360" height="163" /></a><p class="wp-caption-text">With the right lotto strategy, this could be you!</p></div>
<p>Kudos to Mr. Srivastava for his foray into mathematical badassery.  From a mathematical standpoint, there are a number of questions one can ask about this particular type of ticket.  Here&#8217;s one: what&#8217;s the probability that a number is a singleton?  Of course, these tickets can&#8217;t be completely random, as Srivastava observed, since &#8220;the lottery corporation needs to control the number of winning tickets. The game can&#8217;t be truly random. Instead, it has to generate the illusion of randomness while actually being carefully determined.&#8221;  Nevertheless, for argument&#8217;s sake let&#8217;s suppose the numbers on the ticket are random.</p>
<p>In this case, a number is a singleton if it appears on one grid and doesn&#8217;t appear on the remaining 7 grids.  What is the probability that a given number appears on a 3 x 3 grid?  Since there are 9 distinct numbers in the grid, drawn from a pool of 39, this probability equals 9/39 = 3/13.  If we fix a number between 1 and 39, and let <em>X</em> denote the number of times that number appears in the grids, then, treating the eight grids as independent, <em>X</em> follows a <a href="http://en.wikipedia.org/wiki/Binomial_distribution">binomial distribution</a> with <em>n</em> = 8 and <em>p</em> = 3/13.  In particular, <em>X</em> = 1 means that the number is a singleton, and <em>P</em>(<em>X</em> = 1) = 8*(3/13)*(10/13)<sup>7</sup>, which is approximately 29.4%.  We also see that the expected value of <em>X</em> (i.e. the expected number of times any given number will occur) is <em>np</em> = 24/13, which is around 1.85.  One can also find the probabilities that <em>X</em> takes on some other value.</p>
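<p>The arithmetic is easy to reproduce.  The following Python sketch evaluates the binomial probabilities under the same simplifying assumption (random numbers, independent grids); the constant and function names are illustrative only.</p>

```python
from math import comb

GRIDS = 8          # 3x3 grids on the ticket
P_APPEAR = 9 / 39  # chance that a fixed number lands in one grid

def prob_appears(times, n=GRIDS, p=P_APPEAR):
    """Binomial probability that a fixed number shows up in exactly `times` grids."""
    return comb(n, times) * p**times * (1 - p)**(n - times)

print(round(prob_appears(1), 4))   # 0.2942, the singleton probability
print(round(GRIDS * P_APPEAR, 2))  # 1.85, the expected number of appearances
```

<p>Calling <code>prob_appears(0)</code>, <code>prob_appears(2)</code>, and so on gives the other values of the distribution.</p>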
<p>Of course, one can also ask what happens if we vary the number of grids, the sizes of the grids, or the size of the number pool from which we draw.  Other questions abound as well: what are the odds of getting two singletons in a row?  Three singletons in a row?  How do the odds of winning change if the number of hidden values change (either in absolute terms, or as a proportion of the total pool of values)?  These questions, gentle reader, I will leave for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2011/03/lookbutdontscratch.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Test Taking, Part 3</title>
		<link>http://www.mathgoespop.com/2011/01/test-taking-part-3.html</link>
		<comments>http://www.mathgoespop.com/2011/01/test-taking-part-3.html#comments</comments>
		<pubDate>Sat, 22 Jan 2011 08:09:24 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math Gets Around]]></category>
		<category><![CDATA[exams]]></category>
		<category><![CDATA[immigration]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=1036</guid>
		<description><![CDATA[<p>If you&#8217;ll permit me this small indulgence, gentle reader, this week I&#8217;d like to return to a topic from last month.  More precisely, I&#8217;d like to continue the series of posts that discussed how one best ought to prepare for an exam in which all N questions are given beforehand, and one knows that M questions <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2011/01/test-taking-part-3.html">Test Taking, Part 3</a></span>]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ll permit me this small indulgence, gentle reader, this week I&#8217;d like to return to a topic from last month.  More precisely, I&#8217;d like to continue the series of posts that discussed how one best ought to prepare for an exam in which all <em>N</em> questions are given beforehand, and one knows that<em> M</em> questions will appear on the exam, of which the student must answer <em>K</em>.  In my <a href="http://www.mathgoespop.com/2010/12/humanities.html">first</a> post I discussed this problem in the context of preparing essays, while in my <a href="http://www.mathgoespop.com/2010/12/humanities2.html">second</a> I discussed it in the context of preparing for the US citizenship exam.</p>
<p>Apparently I&#8217;m not the only one who thought this a worthwhile problem.  This problem has also made an <a href="http://mindyourdecisions.com/blog/2011/01/18/math-problem-passing-the-citizenship-test/">appearance</a> at the fun-filled blog Mind Your Decisions (it&#8217;s an excellent discussion, so if this kind of thing suits you, check it out).  In the comments <a href="http://mindyourdecisions.com/blog/2011/01/18/math-problem-passing-the-citizenship-test/#comments">section</a>, discussion on this problem continues; in particular, one person proposed that the model should be modified to include the possibility of guessing.  This is an entirely reasonable thing to want, and thankfully it can be incorporated into the model without too much added effort.</p>
<p>Let me recall the notations I used when I discussed the problem earlier.  I&#8217;ve already mentioned <em>N</em>, <em>M</em>, and <em>K</em>.  Let&#8217;s let <em>n</em> represent the number of questions you can answer, and of those <em>n</em>, let <em>X</em> represent the number that actually appear on the exam.  As we&#8217;ve seen, <em>X</em> satisfies a hypergeometric distribution, and the probability that <em>X</em> is some value <em>k</em> is given by</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%20%3D%20k%29%20%3D%20%5Cfrac%7B%5Cleft%20%28%5Cbegin%7Bmatrix%7Dn%5C%5Ck%5Cend%7Bmatrix%7D%20%5Cright%20%29%20%5Cleft%20%20%28%5Cbegin%7Bmatrix%7DN-n%5C%5CM-k%5Cend%7Bmatrix%7D%20%5Cright%20%29%7D%7B%5Cleft%20%20%28%5Cbegin%7Bmatrix%7DN%5C%5CM%5Cend%7Bmatrix%7D%20%5Cright%20%29%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X = k) = \frac{\left (\begin{matrix}n\\k\end{matrix} \right ) \left  (\begin{matrix}N-n\\M-k\end{matrix} \right )}{\left  (\begin{matrix}N\\M\end{matrix} \right )}.' title='P(X = k) = \frac{\left (\begin{matrix}n\\k\end{matrix} \right ) \left  (\begin{matrix}N-n\\M-k\end{matrix} \right )}{\left  (\begin{matrix}N\\M\end{matrix} \right )}.' class='latex' /></p>
<p>Moreover, since you will pass the exam only if <em>X</em> is at least <em>K</em> (in other words, only if the number of questions on the exam that you can answer is at least the minimum number of correct answers needed to pass), the probability of passing is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%5Cgeq%20K%29%20%3D%20%5Cleft%20%28%5Cbegin%7Bmatrix%7DN%5C%5CM%5Cend%7Bmatrix%7D%20%5Cright%20%29%5E%7B-1%7D%20%20%5Csum_%7Bk%3DK%7D%5E%7B%20%5Cmin%5Cleft%20%5C%7B%20M%2Cn%20%5Cright%20%5C%7D%7D%20%5Cleft%20%20%28%5Cbegin%7Bmatrix%7Dn%5C%5Ck%5Cend%7Bmatrix%7D%20%5Cright%20%29%20%5Cleft%20%20%28%5Cbegin%7Bmatrix%7DN-n%5C%5CM-k%5Cend%7Bmatrix%7D%20%5Cright%20%29.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X\geq K) = \left (\begin{matrix}N\\M\end{matrix} \right )^{-1}  \sum_{k=K}^{ \min\left \{ M,n \right \}} \left  (\begin{matrix}n\\k\end{matrix} \right ) \left  (\begin{matrix}N-n\\M-k\end{matrix} \right ).' title='P(X\geq K) = \left (\begin{matrix}N\\M\end{matrix} \right )^{-1}  \sum_{k=K}^{ \min\left \{ M,n \right \}} \left  (\begin{matrix}n\\k\end{matrix} \right ) \left  (\begin{matrix}N-n\\M-k\end{matrix} \right ).' class='latex' /></p>
<p style="text-align: left;">This is all review from the earlier posts, and does not take into account the effect of guessing.  Let us now imagine how we can include this refinement into the model.</p>
<p style="text-align: left;">Intuitively, if we are allowed to guess, then the probability of our being able to pass should increase.  To make things as simple as possible, let&#8217;s assume that if you don&#8217;t know the answer to a question, you have a probability <em>p</em> of guessing correctly &#8211; in other words, the probability of a correct answer is the same for each question.  Let&#8217;s also assume that the probability of guessing correctly is independent of the number of questions on the exam that you can answer without guessing (we&#8217;ll use this assumption in a moment).</p>
<p style="text-align: left;">In this modified situation, you will pass if you can answer enough questions, or if you can guess enough correct answers in the event that you can&#8217;t answer the minimum number of questions with absolute certainty.  So, if <em>X </em>is at least <em>K</em>, nothing changes &#8211; you&#8217;ll simply answer the minimum number of questions and call it a day.  What&#8217;s new is the situation when <em>X &lt; K</em>, in which case you&#8217;ll need to guess in order to try and pass.</p>
<p style="text-align: left;">Roughly speaking, then, the probability of passing is equal to <em>P(X <span style="text-decoration: underline;">&gt;</span> K) </em>+ <em>P(X &lt; K</em> and you guess correctly enough times).  How many times is &#8220;enough&#8221;?  Well, <em>X</em> plus the number of correct guesses must be at least <em>K</em>.  Considering the cases <em>X = </em>0, 1, &#8230;, <em>K</em> &#8211; 1 separately, we can rewrite this as</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%5Cgeq%20K%29%20%2B%20%5Csum_%7Bi%3D0%7D%5E%7BK-1%7DP%5Cbegin%7Bpmatrix%7D%20X%20%3D%20i%5C%20%5Ctextup%7Band%20you%20guess%20at%20least%20%5Ctextit%7BK%20-%20i%7D%7D%5C%5C%5Ctextup%7Bof%20the%20other%20%5Ctextit%7BM%20-%20i%7D%20questions%20correctly%7D%20%5Cend%7Bpmatrix%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X\geq K) + \sum_{i=0}^{K-1}P\begin{pmatrix} X = i\ \textup{and you guess at least \textit{K - i}}\\\textup{of the other \textit{M - i} questions correctly} \end{pmatrix}.' title='P(X\geq K) + \sum_{i=0}^{K-1}P\begin{pmatrix} X = i\ \textup{and you guess at least \textit{K - i}}\\\textup{of the other \textit{M - i} questions correctly} \end{pmatrix}.' class='latex' /></p>
<p style="text-align: left;">Since the value of <em>X</em> is independent of the number of correct guesses you&#8217;ll make, this can be written as</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%5Cgeq%20K%29%20%2B%20%5Csum_%7Bi%3D0%7D%5E%7BK-1%7DP%28X%3Di%29P%5Cbegin%7Bpmatrix%7D%20%5Ctextup%7Byou%20guess%20at%20least%20%5Ctextit%7BK%20-%20i%7D%20of%20the%7D%5C%5C%5Ctextup%7Bother%20%5Ctextit%7BM%20-%20i%7D%20questions%20correctly%7D%20%5Cend%7Bpmatrix%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X\geq K) + \sum_{i=0}^{K-1}P(X=i)P\begin{pmatrix} \textup{you guess at least \textit{K - i} of the}\\\textup{other \textit{M - i} questions correctly} \end{pmatrix}.' title='P(X\geq K) + \sum_{i=0}^{K-1}P(X=i)P\begin{pmatrix} \textup{you guess at least \textit{K - i} of the}\\\textup{other \textit{M - i} questions correctly} \end{pmatrix}.' class='latex' /></p>
<p style="text-align: left;">Moreover, since we want the number of correct guesses to be between <em>K &#8211; i</em> and <em>M &#8211; i</em>, we can write the above as</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%5Cgeq%20K%29%20%2B%20%5Csum_%7Bi%3D0%7D%5E%7BK-1%7DP%28X%3Di%29%5Csum_%7Bj%3DK-i%7D%5E%7BM-i%7DP%5Cleft%20%28%20j%5C%20%5Ctextup%7Bcorrect%20guesses%20out%20of%7D%5C%20M-i%5Cright%20%29.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X\geq K) + \sum_{i=0}^{K-1}P(X=i)\sum_{j=K-i}^{M-i}P\left ( j\ \textup{correct guesses out of}\ M-i\right ).' title='P(X\geq K) + \sum_{i=0}^{K-1}P(X=i)\sum_{j=K-i}^{M-i}P\left ( j\ \textup{correct guesses out of}\ M-i\right ).' class='latex' /></p>
<p style="text-align: left;">Now, the probability occurring in the sum over <em>j</em> should hopefully look familiar to anyone with a basic background in probability.  The number of successes out of a fixed number of trials given that the probability of success is some number <em>p</em> follows the <a href="http://en.wikipedia.org/wiki/Binomial_distribution">Binomial distribution</a>, one of the first probability distributions encountered in any course in probability or statistics.  In particular, since we&#8217;ve said that the probability of a correct guess is <em>p</em>, knowledge of the binomial distribution tells us that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%5Cleft%20%28%20j%5C%20%5Ctextup%7Bcorrect%20guesses%20out%20of%7D%5C%20M-i%5Cright%20%29%20%3D%20%5Cbinom%7BM-i%7D%7Bj%7Dp%5Ej%281-p%29%5E%7BM-i-j%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P\left ( j\ \textup{correct guesses out of}\ M-i\right ) = \binom{M-i}{j}p^j(1-p)^{M-i-j}.' title='P\left ( j\ \textup{correct guesses out of}\ M-i\right ) = \binom{M-i}{j}p^j(1-p)^{M-i-j}.' class='latex' /></p>
<p style="text-align: left;">In summary, the probability of passing is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%5Cgeq%20K%29%20%2B%20%5Csum_%7Bi%3D0%7D%5E%7BK-1%7DP%28X%3Di%29%5Csum_%7Bj%3DK-i%7D%5E%7BM-i%7D%5Cbinom%7BM-i%7D%7Bj%7Dp%5Ej%281-p%29%5E%7BM-i-j%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X\geq K) + \sum_{i=0}^{K-1}P(X=i)\sum_{j=K-i}^{M-i}\binom{M-i}{j}p^j(1-p)^{M-i-j}.' title='P(X\geq K) + \sum_{i=0}^{K-1}P(X=i)\sum_{j=K-i}^{M-i}\binom{M-i}{j}p^j(1-p)^{M-i-j}.' class='latex' /></p>
<p style="text-align: left;">If we want a more explicit formula, we can also use our knowledge of the probability distribution for <em>X</em>.  Also, notice that <em>P</em>(<em>X = i</em>) is 0 if <em>n</em> &lt; <em>i </em>(the number of questions on the exam for which you know the answer can&#8217;t exceed the number of questions in total for which you know the answer), so we can write the probability of success in two ways, depending on whether <em>n &lt; K </em>or <em>n </em><span style="text-decoration: underline;">&gt;</span> <em>K</em>.</p>
<p style="text-align: left;">If <em>n</em> &lt; <em>K</em>, then the probability of passing becomes</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cbinom%7BN%7D%7BM%7D%5E%7B-1%7D%5Csum_%7Bi%3D0%7D%5E%7Bn%7D%5Cbinom%7Bn%7D%7Bi%7D%5Cbinom%7BN-n%7D%7BM-i%7D%5Csum_%7Bj%3DK-i%7D%5E%7BM-i%7D%5Cbinom%7BM-i%7D%7Bj%7Dp%5Ej%281-p%29%5E%7BM-i-j%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='\binom{N}{M}^{-1}\sum_{i=0}^{n}\binom{n}{i}\binom{N-n}{M-i}\sum_{j=K-i}^{M-i}\binom{M-i}{j}p^j(1-p)^{M-i-j}.' title='\binom{N}{M}^{-1}\sum_{i=0}^{n}\binom{n}{i}\binom{N-n}{M-i}\sum_{j=K-i}^{M-i}\binom{M-i}{j}p^j(1-p)^{M-i-j}.' class='latex' /></p>
<p style="text-align: left;">In the case that <em>n <span style="text-decoration: underline;">&gt;</span> K</em>, we have a contribution from the <em>P</em>(<em>X <span style="text-decoration: underline;">&gt;</span> K</em>) term, and so the total probability is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cbegin%7Bmatrix%7D%5Cbinom%7BN%7D%7BM%7D%5E%7B-1%7D%5Csum_%7Bk%3DK%7D%5E%7B%5Cmin%7B%28M%2Cn%29%7D%7D%5Cbinom%7Bn%7D%7Bk%7D%5Cbinom%7BN-n%7D%7BM-k%7D%2B%5C%5C%5Cbinom%7BN%7D%7BM%7D%5E%7B-1%7D%5Csum_%7Bi%3D0%7D%5E%7BK-1%7D%5Cbinom%7Bn%7D%7Bi%7D%5Cbinom%7BN-n%7D%7BM-i%7D%5Csum_%7Bj%3DK-i%7D%5E%7BM-i%7D%5Cbinom%7BM-i%7D%7Bj%7Dp%5Ej%281-p%29%5E%7BM-i-j%7D.%20%5Cend%7Bmatrix%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\begin{matrix}\binom{N}{M}^{-1}\sum_{k=K}^{\min{(M,n)}}\binom{n}{k}\binom{N-n}{M-k}+\\\binom{N}{M}^{-1}\sum_{i=0}^{K-1}\binom{n}{i}\binom{N-n}{M-i}\sum_{j=K-i}^{M-i}\binom{M-i}{j}p^j(1-p)^{M-i-j}. \end{matrix}' title='\begin{matrix}\binom{N}{M}^{-1}\sum_{k=K}^{\min{(M,n)}}\binom{n}{k}\binom{N-n}{M-k}+\\\binom{N}{M}^{-1}\sum_{i=0}^{K-1}\binom{n}{i}\binom{N-n}{M-i}\sum_{j=K-i}^{M-i}\binom{M-i}{j}p^j(1-p)^{M-i-j}. \end{matrix}' class='latex' /></p>
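<p style="text-align: left;">As a sanity check on the formulas above, here is a minimal Python sketch of the pass probability with guessing.  The function names are my own.  It relies on one small simplification: when <em>i</em> is at least <em>K</em>, the chance of getting at least <em>K</em> &#8211; <em>i</em> guesses right is 1, so a single sum over <em>i</em> covers both the <em>P</em>(<em>X</em> &#8805; <em>K</em>) term and the guessing term.</p>

```python
from math import comb

def hypergeom_pmf(i, N, M, n):
    """P(X = i): i of the n known questions land among the M exam questions."""
    return comb(n, i) * comb(N - n, M - i) / comb(N, M)

def binom_tail(trials, need, p):
    """P(at least `need` correct guesses in `trials` independent guesses)."""
    return sum(comb(trials, j) * p**j * (1 - p)**(trials - j)
               for j in range(max(need, 0), trials + 1))

def pass_probability(N, M, K, n, p):
    """Answer the questions you know, then guess the rest with success chance p."""
    return sum(hypergeom_pmf(i, N, M, n) * binom_tail(M - i, K - i, p)
               for i in range(M + 1))

# N = 100 questions, M = 10 on the exam, K = 6 needed, nothing memorized, p = 0.75
print(round(pass_probability(100, 10, 6, 0, 0.75), 4))  # 0.9219
```

<p style="text-align: left;">Setting <em>p</em> = 0 recovers the no-guessing model from the earlier posts.</p>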
<p style="text-align: left;">This is all well and good (and agrees with what commenter Scott derived in the comments of the Mind Your Decisions post), but what does it say about our example from last time (where <em>N </em>= 100, <em>M</em> = 10 and <em>K </em>= 6)?  As before, here are some graphs of the probability of success as a function of how many questions you can answer.  Note that any such graph depends on the probability <em>p</em>.  So, let&#8217;s illustrate two examples:</p>
<p style="text-align: left;">Case 1: <em>p</em> is a fixed value.  Here are the graphs corresponding to <em>p </em>= 0, <em>p </em>= .25, <em>p</em> = .375, <em>p</em> = .5, and <em>p</em> = .75 (i.e. the chances of you guessing correctly are 0%, 25%, 37.5%, 50%, or 75% &#8211; note that the case of <em>p</em> = 0 corresponds to the previous case where guessing isn&#8217;t a factor).</p>
<p style="text-align: left;"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/01/examgraph1.png"><img class="aligncenter size-full wp-image-1054" title="examgraph1" src="http://www.mathgoespop.com/wp-content/uploads/2011/01/examgraph1.png" alt="" width="646" height="430" /></a>Some highlights &#8211; without guessing, you need to know the answers to 55 questions in order to have at least 50% chance of passing.  With a 25% chance of guessing correctly, you only need to know the answers to 40 questions.  At 37.5%, the number of questions decreases to 28, at 50% it drops to 10 questions, and if you have a 75% chance of answering correctly, you have over a 90% chance of passing without knowing any answers at all!  If you want at least an 80% chance of passing, the number of answers becomes 67 (<em>p</em> = 0), 56 (<em>p</em> = .25), 48 (<em>p</em> = .375), and 35 (<em>p</em> = .5).</p>
<p style="text-align: left;">Case 2: <em>p</em> increases with <em>n</em>.  It seems reasonable to assume that the more answers you know, the better your chances of correctly guessing the answer to a question you don&#8217;t know, since you will be more knowledgeable in general.  In this particular example, I&#8217;ve taken <em>p</em> to equal <em>n</em>/<em>N</em> (in this case <em>n</em>/100).  Note that with this choice, initially the probability of success will be 0, but as <em>n</em> grows the probability of success should grow relatively rapidly.</p>
<p style="text-align: left;"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/01/examgraph2.png"><img class="aligncenter size-full wp-image-1055" title="examgraph2" src="http://www.mathgoespop.com/wp-content/uploads/2011/01/examgraph2.png" alt="" width="645" height="430" /></a>The above graph quantifies the above heuristics.  Note that the red line grows very rapidly, so that the probability of success is greater than 50% after memorizing 33 questions, more than 80% after 43 questions, and over 95% after slightly over half of the questions (53).</p>
<p style="text-align: left;">So there you have it.  If you are feeling lazy the next time you have to prepare for an exam, hopefully this will provide you some guidance as to the minimum amount of work you can do while still being reasonably confident that you won&#8217;t fail.</p>
<p style="text-align: left;">Hope you all have a great weekend!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2011/01/test-taking-part-3.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lost Winnings</title>
		<link>http://www.mathgoespop.com/2011/01/lostwinnings.html</link>
		<comments>http://www.mathgoespop.com/2011/01/lostwinnings.html#comments</comments>
		<pubDate>Thu, 13 Jan 2011 21:45:40 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math in the News]]></category>
		<category><![CDATA[Math on TV]]></category>
		<category><![CDATA[combinations]]></category>
		<category><![CDATA[lost]]></category>
		<category><![CDATA[lottery]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=1018</guid>
		<description><![CDATA[<p>Last week, two very lucky people won the Mega Millions lottery jackpot (here&#8216;s a profile on one of the winners).  This particular lottery is played in 41 out of the 50 states, and these two individuals will share a combined, pre-tax total of $380 million.</p>
<p>But are they so lucky after all?  Setting aside the common notion <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2011/01/lostwinnings.html">Lost Winnings</a></span>]]></description>
			<content:encoded><![CDATA[<p>Last week, two very lucky people won the Mega Millions lottery jackpot (<a href="http://www.megamillions.com/mcenter/pressrelease.asp?newsID=5A051296-4770-4426-A143-535A423640ED">here</a>&#8216;s a profile on one of the winners).  This particular lottery is played in 41 out of the 50 states, and these two individuals will share a combined, pre-tax total of $380 million.</p>
<p>But are they so lucky after all?  Setting aside the common notion that winning the lottery can actually do you more harm than good, some people are concerned because of the numbers themselves that made the winning ticket.</p>
<p>The numbers drawn for this particular lottery were 4, 8, 15, 25, 47, and 42.  Note that the last number is lower than the number that precedes it because it is the so-called &#8220;Mega Number,&#8221; which is drawn from a different pool than the first five.  For those of you with a penchant for televised dramas set in tropical locations, you may note that these numbers bear a striking similarity to Hurley&#8217;s <a href="http://en.wikipedia.org/wiki/Numbers_%28Lost%29">numbers</a> from <a href="http://en.wikipedia.org/wiki/Lost_%28TV_series%29">Lost</a>.</p>
<p><a href="http://www.tv.com/hurleys-numbers-arent-so-unlucky-anymore/webnews/249677.html"><img class="aligncenter size-full wp-image-1023" title="hurnums" src="http://www.mathgoespop.com/wp-content/uploads/2011/01/hurnums.jpg" alt="" width="480" height="401" /></a></p>
<p>As evidenced by the above image, Hurley&#8217;s numbers were 4, 8, 15, 16, 23, and 42.  In other words, 4 out of the 6 Mega Millions numbers matched Hurley&#8217;s!</p>
<p>Unfortunately, Lost fans will note that this is not necessarily a good thing; on the show, the numbers caused Hurley nothing but trouble (including, but not limited to, a meteor strike on his place of work).  Hurley (real name <a href="http://www.imdb.com/name/nm0306201/">Jorge Garcia</a>) himself wrote on his <a href="http://furtherdispatches.wordpress.com/2011/01/05/will-you-people-ever-learn/">blog</a>: &#8220;When will you people learn? The numbers are bad!&#8221;</p>
<div id="attachment_1027" class="wp-caption aligncenter" style="width: 404px"><a href="http://www.mathgoespop.com/wp-content/uploads/2011/01/hurley.jpg"><img class="size-full wp-image-1027" title="hurley" src="http://www.mathgoespop.com/wp-content/uploads/2011/01/hurley.jpg" alt="" width="394" height="222" /></a><p class="wp-caption-text">This is how Hurley feels about the numbers.</p></div>
<p>From a mathematical standpoint, though, I&#8217;m less interested in whether or not the numbers are cursed (if the show is any indication, this question has already been decisively settled), and more interested in how likely it is for the lottery jackpot to so closely match the numbers from the show.</p>
<p>Lottery odds are quite well understood.  What&#8217;s more, someone by the name of Durango Bill has a <a href="http://www.durangobill.com/MegaMillionsOdds.html">website</a> devoted to odds for the Mega Millions lottery (he also calculates that the odds of dying in a car accident on the way to buy a lottery ticket are almost 6 times as high as the odds of winning the lottery itself).  We don&#8217;t need all the information on this site, though, just some of it.</p>
<p>To calculate the odds, one needs to know how many numbers are in play for the lottery.  The first five numbers are drawn from a pool (without replacement) of 56, while the Mega Number is drawn from a pool of 46.  Since we are choosing 5 numbers from the original 56, the number of possible outcomes for the first five numbers is 56 <a href="http://en.wikipedia.org/wiki/Combination">choose</a> 5, or 3,819,816.  Clearly there are 46 different choices for the Mega Number.  Therefore, the total number of outcomes is the product 3,819,816 x 46 = 175,711,536.</p>
<p>(As an aside, note that this is much higher than the number of outcomes available if the Mega Number didn&#8217;t exist, and one simply chose 6 numbers from the pool of 56.  In this case, the number of outcomes would be 56 choose 6, or 32,468,436.  In other words, use of the Mega Number effectively makes the number of outcomes over 5 times larger, thereby significantly decreasing the likelihood of a jackpot!)</p>
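<p>For readers who want to check these counts themselves, here is a minimal Python sketch.  The variable names are my own, the pool sizes are the ones quoted above, and it assumes Python 3.8 or later for <code>math.comb</code>.</p>
<pre>
```python
from math import comb

# Pool sizes quoted in the post: 5 numbers from 56, plus 1 Mega Number from 46.
MAIN_POOL, MAIN_PICKS, MEGA_POOL = 56, 5, 46

main_outcomes = comb(MAIN_POOL, MAIN_PICKS)   # 56 choose 5
total_outcomes = main_outcomes * MEGA_POOL    # each main draw pairs with each Mega Number
no_mega_outcomes = comb(MAIN_POOL, 6)         # hypothetical game: 6 from 56, no Mega Number

print(main_outcomes, total_outcomes, no_mega_outcomes)
print(round(total_outcomes / no_mega_outcomes, 2))   # how many times larger
```
</pre>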
<p>Now, what are the odds that exactly three of the five numbers, in addition to the Mega Number, will match the Lost numbers?  Well, there&#8217;s only one way to match the Mega Number, but there are 5 choose 3 = 10 ways to match 3 of the 5 Lost numbers, and 51 choose 2 = 1,275 ways to fill the other 2 spots from the 51 non-Lost numbers.  Therefore, the total number of favorable outcomes is 1,275 x 10 = 12,750, which means the probability of this event occurring must be 12,750/175,711,536 (the proportion of total outcomes which are favorable), which amounts to around 1 in 13,781. In particular, this is 12,750 times as likely as winning the jackpot, for which the odds are 1/175,711,536.</p>
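<p>The same count can be reproduced in a few lines of Python.  This is only a sketch of the argument above (the names <code>favorable</code> and <code>total_outcomes</code> are my own), not anything official from the lottery.</p>
<pre>
```python
from math import comb

total_outcomes = comb(56, 5) * 46

# Choose which 3 of the 5 Lost numbers appear, then 2 of the 51 other numbers.
# The Mega Number has to be 42, so it contributes a single choice.
favorable = comb(5, 3) * comb(51, 2) * 1

print(favorable)                    # number of favorable draws
print(total_outcomes / favorable)   # the "1 in ..." figure
```
</pre>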
<p>&#8220;But wait!&#8221; you might say.  &#8220;Not only did the numbers match, but their positions matched too!&#8221;  In other words, the 4, 8, and 15 were the first three numbers in both lotteries.  If we take position into account, we could ask &#8220;What are the odds that 3 of the 5 numbers and the Mega Number match the Lost Numbers, and have the same position?&#8221;  You should expect that these odds are lower, since we are now further restricting the types of tickets that we count (for example, the ticket 1 2 4 8 15 42 would count only if position doesn&#8217;t matter).</p>
<p>If we fix the positions, then we want to count the number of possible lottery tickets of the form 4 8 15 <em>a b</em> 42, where <em>a</em> must be between 17 and <em>b</em> (since the numbers are listed in increasing order and 16 is not allowed), not including 23, and <em>b</em> must be between <em>a</em> and 56, not including 23.</p>
<p>To count these outcomes, we split into two cases.  First, if <em>a</em> is at least 17 and less than 23, then there are 6 choices for <em>a</em> (17, 18, 19, 20, 21, or 22) and for each choice of <em>a</em> there are 56 &#8211; <em>a</em> &#8211; 1 = 55 &#8211; <em>a</em> choices for <em>b</em> (since <em>b</em> must lie between <em>a</em> + 1 and 56, and can&#8217;t be 23).  Therefore, the total number of outcomes if <em>a</em> is less than 23 is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Ba%3D17%7D%5E%7B22%7D%2855-a%29%20%3D%20213.&#038;bg=T&#038;fg=000000&#038;s=0' alt='\sum_{a=17}^{22}(55-a) = 213.' title='\sum_{a=17}^{22}(55-a) = 213.' class='latex' /></p>
<p style="text-align: left;">Secondly, if <em>a</em> is greater than 23, then there are 32 choices for <em>a </em>(since <em>a</em> must lie between 24 and 55), and for each choice of <em>a</em> there are now 56 &#8211; <em>a</em> choices for <em>b</em>.  In particular, note that <em>a</em> can never be 56, since <em>a</em> must be less than <em>b</em>, and 56 is the highest possible number. Therefore, in this case, the number of outcomes is</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Ba%3D24%7D%5E%7B55%7D%2856-a%29%20%3D%20528.&#038;bg=T&#038;fg=000000&#038;s=0' alt='\sum_{a=24}^{55}(56-a) = 528.' title='\sum_{a=24}^{55}(56-a) = 528.' class='latex' /></p>
<p style="text-align: left;">From this, we see the number of favorable outcomes is now only 213 + 528 = 741, which makes the probability of a drawing with 4 of its 6 numbers (including the Mega Number) in the same positions as 4 of the 6 Lost Numbers only 741/175,711,536, or roughly 1 in 237,128.  In particular, taking position into account decreases the odds by a factor of over 17.</p>
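<p style="text-align: left;">If the two sums feel slippery, a brute-force enumeration is a reasonable sanity check.  The sketch below lists every ticket of the form 4 8 15 <em>a b</em> 42 directly; the names <code>EXCLUDED</code>, <code>tickets</code>, <code>low</code>, and <code>high</code> are my own.</p>
<pre>
```python
from math import comb

EXCLUDED = {16, 23}   # the two Lost numbers that must not appear

# All increasing pairs (a, b) drawn from 16..56, skipping the excluded values.
tickets = [
    (a, b)
    for a in range(16, 57)
    for b in range(a + 1, 57)
    if a not in EXCLUDED and b not in EXCLUDED
]

low = sum(1 for a, b in tickets if a in range(17, 23))   # first case: a below 23
high = len(tickets) - low                                # second case: a above 23

print(low, high, len(tickets))
print(comb(56, 5) * 46 / len(tickets))                   # the "1 in ..." figure
```
</pre>
<p style="text-align: left;">Equivalently, <em>a</em> and <em>b</em> are any two of the 39 allowed numbers between 17 and 56 (excluding 23), and 39 choose 2 gives the same total.</p>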
<p style="text-align: left;">To put it more succinctly, the odds are small.  But when the lottery is involved, one frequently encounters unlikely events such as this.  While it&#8217;s a cool coincidence, I think we can all agree it&#8217;s unwise to play the Lost Numbers when you buy your lottery tickets.  It just doesn&#8217;t make sense to choose the most popular possible combination of numbers &#8211; after all, you don&#8217;t want to share that Jackpot with anyone.</p>
<p style="text-align: left;">Just as importantly, of course, nobody wants to bet with cursed numbers.</p>
<p style="text-align: left;">(Other articles can be found <a href="http://latimesblogs.latimes.com/showtracker/2011/01/thank-you-hurley-lost-numbers-pay-off-as-winning-mega-millions-numbers.html">here</a> and <a href="http://latimesblogs.latimes.com/showtracker/2011/01/jorge-garcia-and-carlton-cuse-respond-to-mega-millions-winning-lost-numbers.html">here</a>.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2011/01/lostwinnings.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Addendum to Math Gets Around: The Humanities</title>
		<link>http://www.mathgoespop.com/2010/12/humanities2.html</link>
		<comments>http://www.mathgoespop.com/2010/12/humanities2.html#comments</comments>
		<pubDate>Fri, 17 Dec 2010 16:58:16 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math Gets Around]]></category>
		<category><![CDATA[combinatorics]]></category>
		<category><![CDATA[exams]]></category>
		<category><![CDATA[immigration]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[the simpsons]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=981</guid>
		<description><![CDATA[<p>Last week we discussed an example of when a mathematical background might prove useful even in the least quantitative of liberal arts courses.  More specifically, we asked the question: if a teacher gives you a list of N questions, tells you that M will be on an exam, and you must answer K of the questions given <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2010/12/humanities2.html">Addendum to Math Gets Around: The Humanities</a></span>]]></description>
			<content:encoded><![CDATA[<p>Last week we <a href="http://www.mathgoespop.com/2010/12/humanities.html">discussed</a> an example of when a mathematical background might prove useful even in the least quantitative of liberal arts courses.  More specifically, we asked the question: if a teacher gives you a list of <em>N</em> questions, tells you that <em>M</em> will be on an exam, and you must answer <em>K</em> of the questions given on the exam, what&#8217;s the minimum number of questions you should prepare to guarantee that you will be able to answer <em>K</em> of the questions on the exam?  (Answer: <em>N + K &#8211; M.</em>)<em> </em>We also looked at the question probabilistically &#8211; namely, we saw that of the questions appearing on the exam, the number that you&#8217;ve prepared for follows a <a href="http://en.wikipedia.org/wiki/Hypergeometric_distribution">hypergeometric distribution</a>.</p>
<p>As a concrete example I considered the case <em>N</em> = 6, <em>M</em> = 5, <em>K = </em>3 &#8211; in this case, the minimum number of questions you should prepare to guarantee that you can answer 3 of 5 problems on the exam is 4, and we saw that if you only prepare 3 questions, you have a 50% chance of those 3 questions appearing on the list of 5.</p>
<p>Late last week, however, I was made aware of another example, one for which the probabilities might prove more interesting (since there are more cases to consider).  Specifically, let us consider the case of a person studying to become a U.S. citizen.  As part of this process, one must submit to an interview in which one is asked 10 questions, and must answer 6 of those 10 questions correctly.  However, the potential list of questions is made available to people beforehand; there are 100 questions from which the 10 questions can be drawn.  In other words, we have <em>N</em> = 100, <em>M</em> = 10, and <em>K</em> = 6.</p>
<p>In this case, to guarantee that you will be able to answer 6 of the 10 questions presented, our analysis from last time tells you that you should prepare 100 + 6 &#8211; 10 = 96 of the questions.  Indeed, this makes sense, since the worst that can happen is that the 4 questions you don&#8217;t prepare happen to be precisely 4 of the 10 questions you are asked in the interview.  This also reflects the fact that the closer <em>M</em> is to <em>K</em>, the more questions the test taker will have to prepare (note that if <em>M</em> were closer to <em>N</em>, say <em>M</em> = 90, the test taker would only have to prepare 16 questions).</p>
<p>Still, preparing 96 of the questions may seem like a little much, especially since only 10 questions will come up in the interview.  So, let&#8217;s see what happens if someone prepares for fewer than 96 questions.  Obviously one should know how to answer at least 6 of the questions, but what about values between 6 and 96?</p>
<p>Here is a graph showing the probability that one will pass the interview given that one has learned the answer to <em>n</em> questions, for some <em>n</em> between 6 and 96.<a href="http://www.mathgoespop.com/wp-content/uploads/2010/12/Picture-11.png"><img class="aligncenter size-full wp-image-985" title="CitizenGraph" src="http://www.mathgoespop.com/wp-content/uploads/2010/12/Picture-11.png" alt="" width="600" height="383" /></a>This graph tells you that, for example, even if one only had time to learn the answers to 73 out of the 100 questions, one&#8217;s chances of passing the exam would still be over 90%.  Those are pretty good odds, for only learning the answers to roughly three quarters of the questions.  On the other hand, one needs to learn the answers to 37 questions before one&#8217;s odds of passing rise above 10%, so it&#8217;s certainly not likely that someone will pass by learning the answers to only a handful of questions (which is probably what the government intends).</p>
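<p>The values behind a graph like this can be recomputed from the hypergeometric formula.  Here is a minimal Python sketch; the function name <code>pass_probability</code> and its default arguments are my own choices, and it assumes the 10 interview questions are drawn uniformly at random from the 100.</p>
<pre>
```python
from math import comb

def pass_probability(n, N=100, M=10, K=6):
    """P(at least K of the M questions asked are among the n you prepared)."""
    favorable = sum(
        comb(n, k) * comb(N - n, M - k)      # k prepared questions, M - k unprepared
        for k in range(K, min(M, n) + 1)
    )
    return favorable / comb(N, M)

for n in (6, 37, 50, 73, 96):
    print(n, round(pass_probability(n), 4))
```
</pre>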
<div id="attachment_988" class="wp-caption aligncenter" style="width: 411px"><a href="http://en.wikipedia.org/wiki/Much_Apu_About_Nothing"><img class="size-full wp-image-988" title="Picture 2" src="http://www.mathgoespop.com/wp-content/uploads/2010/12/Picture-2.png" alt="" width="401" height="315" /></a><p class="wp-caption-text">If only Apu had known of these findings, perhaps he could have saved himself some trouble.</p></div>
<p>Anyway, I just wanted to highlight another example where these ideas apply.  If you can think of any others, let me know!  Also, if you are interested in the content of the 100 questions that can be asked of our future citizens, you can find the full list (along with acceptable answers) <a href="http://www.immihelp.com/citizenship/naturalization-civics-test-questions.html">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2010/12/humanities2.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Math Gets Around: The Humanities</title>
		<link>http://www.mathgoespop.com/2010/12/humanities.html</link>
		<comments>http://www.mathgoespop.com/2010/12/humanities.html#comments</comments>
		<pubDate>Mon, 06 Dec 2010 16:00:42 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math Gets Around]]></category>
		<category><![CDATA[exams]]></category>
		<category><![CDATA[humanities]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=874</guid>
		<description><![CDATA[<p>Unless you&#8217;re one of those suckers who goes to a school that administers final exams after the holidays (like I was), the few weeks after Thanksgiving can be quite a stressful time for students.  Between exams, final papers, and working out holiday travel plans, it can be easy to get overwhelmed.  For students with a quantitative <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2010/12/humanities.html">Math Gets Around: The Humanities</a></span>]]></description>
			<content:encoded><![CDATA[<p>Unless you&#8217;re one of those suckers who goes to a school that administers final exams after the holidays (like I was), the few weeks after Thanksgiving can be quite a stressful time for students.  Between exams, final papers, and working out holiday travel plans, it can be easy to get overwhelmed.  For students with a quantitative bent, the days are undoubtedly spent in large part trying to memorize formulas or theorems, or on refining their understanding of certain problem-solving techniques that have been covered in their courses.</p>
<p>If your interests are more in line with the humanities, you may think that you are safe from the pull of mathematics.  There are occasions, though, when a working knowledge of mathematics can help even in a liberal arts course.</p>
<div id="attachment_960" class="wp-caption aligncenter" style="width: 410px"><a href="http://www.imdb.com/media/rm2371983360/tt0083929f"><img class="size-full wp-image-960" title="fasttimes" src="http://www.mathgoespop.com/wp-content/uploads/2010/12/Picture-1.png" alt="" width="400" height="317" /></a><p class="wp-caption-text">Spicoli certainly could&#39;ve benefitted from a stronger math background.</p></div>
<p>Consider the following example.  Suppose you&#8217;re enrolled in a course for which the final exam will have a large essay component.  To help you study, your teacher gives you a list of <em>N</em> potential essay questions.  Moreover, she tells you that on the exam, some smaller number <em>M</em> of those exact questions will appear, and of those that do appear, you must select <em>K</em> to answer on the exam.  The question then becomes: what&#8217;s the minimum number of essay questions that you should prepare?</p>
<p>If you have a humanities background, and these capital letters scare you off, let&#8217;s consider a particular example &#8211; say, <em>N</em> = 6, <em>M</em> = 5, and <em>K</em> = 3.  In other words, your teacher gives you 6 questions, you know that 5 of the 6 will end up on the exam, and out of those 5 you&#8217;ll have to answer 3.  How many essay questions must you prepare to guarantee that you&#8217;ll be able to answer 3 of the 5 questions on the exam?  Obviously, preparing for all 6 essays is overkill, since you know that only 5 questions will be on the exam.  But preparing for 5 questions is overkill too, since the worst thing that can happen is that one of the questions you prepared is omitted from the list on the exam, in which case you will still have prepared for 5 &#8211; 1 = 4 of the available essays, one essay too many.</p>
<p>Following this reasoning, we see that the smallest number of essays you can get away with preparing while still guaranteeing that you can complete the exam is 4.  If you prepare 4 essays, then in particular you DON&#8217;T prepare 2, and in a worst-case scenario, the 2 that you don&#8217;t prepare will be on the list of 5 that show up on the exam.  Since you can&#8217;t respond to those two, you must respond to all three of the remaining questions, which you will be able to do, since all three are among the four you prepared.  Note that this argument falls apart if you only prepare 3 questions, since in that case the worst scenario is that the three you didn&#8217;t pick end up on the exam, leaving you with only 5 &#8211; 3 = 2 questions you can answer.</p>
<div id="attachment_962" class="wp-caption aligncenter" style="width: 221px"><a href="http://www.imdb.com/title/tt0105958/"><img class="size-full wp-image-962" title="feeney" src="http://www.mathgoespop.com/wp-content/uploads/2010/12/Picture-10.png" alt="" width="211" height="301" /></a><p class="wp-caption-text">Not even Mr. Feeney is immune to the long arm of mathematical applicability.</p></div>
<p>Let&#8217;s now return to the general case.  The same argument applies.  Suppose you are given <em>N</em> questions, <em>M</em> of which you know will be on an exam, out of which you can choose <em>K</em> to answer.  This means that you are free to ignore <em>M</em> &#8211; <em>K</em> of the questions on the exam.  In particular, the safest thing to do is to prepare for all but <em>M</em> &#8211; <em>K</em> of the questions, since then, in the worst case scenario, the <em>M</em> &#8211; <em>K</em> questions you did not prepare will all appear on the exam, and they will be precisely the <em>M</em> &#8211; <em>K</em> questions you are free to ignore.  In other words, the safest strategy would be to prepare <em>N</em> &#8211; (<em>M &#8211; K</em>) = <em>N + K &#8211; M</em> of the questions.  Note that this agrees with our previous example; if <em>N</em> = 6, <em>M</em> = 5, and <em>K</em> = 3, then <em>N + K &#8211; M </em>= 6 + 3 &#8211; 5 = 4.</p>
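<p>For small values, the worst-case argument can be checked by brute force.  The Python sketch below tries every possible exam; the function names are my own, and it is meant only as a sanity check for small <em>N</em>, since the number of possible exams grows quickly.</p>
<pre>
```python
from itertools import combinations

def worst_case_answerable(N, M, n):
    """Fewest prepared questions that can show up on an exam of M out of N.

    By symmetry it does not matter which n questions are prepared,
    so we take questions 0..n-1 and try every possible exam.
    """
    prepared = set(range(n))
    return min(len(prepared.intersection(exam)) for exam in combinations(range(N), M))

def min_to_guarantee(N, M, K):
    """Smallest n for which every possible exam contains at least K prepared questions."""
    return next(n for n in range(N + 1) if worst_case_answerable(N, M, n) >= K)

print(min_to_guarantee(6, 5, 3))   # compare with N + K - M
```
</pre>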
<p>This also gives you an idea of what to hope for if you are a student.  Note that the larger <em>M</em> is relative to <em>K</em>, the better off a student will be.  Meanwhile, the smaller <em>M</em> is relative to <em>K</em>, the more work a student will have to do in order to be fully prepared for the exam.  If the teacher gives out 10 questions and says that 2 will be on the exam and that both must be answered, this is a much worse situation for the student than if the teacher gives out 10 questions, says that all 10 will be on the exam, and the student will have to choose two to discuss (in the first case, this strategy says the student must prepare all 10 questions, while in the second case the student must prepare only 2).</p>
<p>Of course, not all students will follow this strategy.  Notice that if a student is lucky, he may get away with preparing only <em>K</em> essays, since in the best case scenario the <em>K</em> essays that the student prepares will all be among the <em>M</em> essays that are on the exam.  So no matter what, a student should prepare somewhere between <em>K</em> and <em>N + K &#8211; M</em> essays (in our first example, this equates to either 3 or 4 essays).  In the event that <em>N + K &#8211; M </em>is much larger than <em>K</em>, though, it may be tempting to prepare fewer essays and simply hope that luck is on one&#8217;s side.</p>
<p>Fortunately, mathematics can help us in this case as well (watch out, though; the math required from here on out is more substantial).  For we can ask the question &#8220;If we prepare only <em>n</em> essays for some <em>n</em> between <em>K</em> and <em>N + K &#8211; M</em>, what is the probability that we will be able to answer the required <em>K</em> essay questions on the exam?&#8221;  In fact, these probabilities are well understood.  Indeed, if we let <em>X</em> denote the number of questions selected for the exam that you&#8217;ve prepared for, then <em>X</em> follows a <a href="http://en.wikipedia.org/wiki/Hypergeometric_distribution">hypergeometric distribution</a>.  This tells us that the probability that the number of questions on the exam that you&#8217;ve prepared for is equal to <em>k</em> is, using the notation introduced here,</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B%5Cleft%20%28%5Cbegin%7Bmatrix%7Dn%5C%5Ck%5Cend%7Bmatrix%7D%20%5Cright%20%29%20%5Cleft%20%28%5Cbegin%7Bmatrix%7DN-n%5C%5CM-k%5Cend%7Bmatrix%7D%20%5Cright%20%29%7D%7B%5Cleft%20%28%5Cbegin%7Bmatrix%7DN%5C%5CM%5Cend%7Bmatrix%7D%20%5Cright%20%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{\left (\begin{matrix}n\\k\end{matrix} \right ) \left (\begin{matrix}N-n\\M-k\end{matrix} \right )}{\left (\begin{matrix}N\\M\end{matrix} \right )}' title='\frac{\left (\begin{matrix}n\\k\end{matrix} \right ) \left (\begin{matrix}N-n\\M-k\end{matrix} \right )}{\left (\begin{matrix}N\\M\end{matrix} \right )}' class='latex' /></p>
<p style="text-align: left;">(for help with the notation, see the link above).  In particular, if you want to compute the probability that <em>X</em> is at least <em>K</em>, (in other words, that the number of questions you prepared that are on the exam is at least the minimum necessary for you to complete the exam successfully), this can be found by adding up these probabilities from <em>k</em> = <em>K</em> to <em>k = </em><img src='http://s.wordpress.com/latex.php?latex=%5Cmin%5Cleft%20%5C%7B%20M%2Cn%20%5Cright%20%5C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\min\left \{ M,n \right \}' title='\min\left \{ M,n \right \}' class='latex' />.  Thus, we have</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28X%5Cgeq%20K%29%20%3D%20%5Cleft%20%28%5Cbegin%7Bmatrix%7DN%5C%5CM%5Cend%7Bmatrix%7D%20%5Cright%20%29%5E%7B-1%7D%20%5Csum_%7Bk%3DK%7D%5E%7B%20%5Cmin%5Cleft%20%5C%7B%20M%2Cn%20%5Cright%20%5C%7D%7D%20%5Cleft%20%28%5Cbegin%7Bmatrix%7Dn%5C%5Ck%5Cend%7Bmatrix%7D%20%5Cright%20%29%20%5Cleft%20%28%5Cbegin%7Bmatrix%7DN-n%5C%5CM-k%5Cend%7Bmatrix%7D%20%5Cright%20%29.&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X\geq K) = \left (\begin{matrix}N\\M\end{matrix} \right )^{-1} \sum_{k=K}^{ \min\left \{ M,n \right \}} \left (\begin{matrix}n\\k\end{matrix} \right ) \left (\begin{matrix}N-n\\M-k\end{matrix} \right ).' title='P(X\geq K) = \left (\begin{matrix}N\\M\end{matrix} \right )^{-1} \sum_{k=K}^{ \min\left \{ M,n \right \}} \left (\begin{matrix}n\\k\end{matrix} \right ) \left (\begin{matrix}N-n\\M-k\end{matrix} \right ).' class='latex' /></p>
<p style="text-align: left;">We can check this formula against our example above.  When <em>N</em> = 6, <em>M</em> = 5, and <em>K</em> = 3, one must prepare either 3 or 4 essays (i.e. <em>n</em> = 3 or <em>n</em> = 4).  When <em>n</em> = 4, the equation yields <img src='http://s.wordpress.com/latex.php?latex=P%28X%20%5Cgeq%203%29%20%3D%201&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X \geq 3) = 1' title='P(X \geq 3) = 1' class='latex' />, i.e. there is a 100% chance that you&#8217;ll be able to complete the exam successfully.  Of course, this makes sense given our discussion above.  If you only prepare 3 essays, however, then <img src='http://s.wordpress.com/latex.php?latex=P%28X%20%5Cgeq%203%29%20%3D%201%2F2&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(X \geq 3) = 1/2' title='P(X \geq 3) = 1/2' class='latex' />, so there&#8217;s only a 50% chance that you&#8217;ll be able to answer three of the five essay questions on the exam.</p>
<p style="text-align: left;">One can discover other things from this formula as well.  For example, if you are a lazy student, and only prepare the minimum number of essays (that is, <em>K</em> of them), then <em>n = K</em>, and the probability that <em>X</em> is at least <em>K </em>becomes the probability that <em>X </em>is <em>K</em>, which simplifies to <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B%28N-K%29%21M%21%7D%7BN%21%28M-K%29%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{(N-K)!M!}{N!(M-K)!}' title='\frac{(N-K)!M!}{N!(M-K)!}' class='latex' />.  Note that this is 1 if <em>N</em> = <em>M</em>, which is the case most advantageous to the student; in other words, in this case, you are guaranteed to be able to answer all the questions by preparing the minimum number of questions.  However, in the case least advantageous to the student, where <em>M = K</em>, the probability becomes <img src='http://s.wordpress.com/latex.php?latex=%5Cleft%20%28%5Cbegin%7Bmatrix%7DN%5C%5CK%5Cend%7Bmatrix%7D%20%5Cright%20%29%5E%7B-1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\left (\begin{matrix}N\\K\end{matrix} \right )^{-1}' title='\left (\begin{matrix}N\\K\end{matrix} \right )^{-1}' class='latex' />, which can be 1 (when <em>K = N</em>), but can also be much smaller.  For example, if your teacher gives you 10 questions and tells you 5 will be on the exam, of which you must answer all 5, the odds that you will be able to complete the exam by only preparing 5 questions are only 1 in 252!</p>
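<p style="text-align: left;">Here is a short Python sketch of the formula for <em>P</em>(<em>X</em> &#8805; <em>K</em>), using exact fractions so that the special cases above come out as clean numbers.  The function name <code>p_at_least_K</code> is my own, and the argument order simply mirrors the notation in this post.</p>
<pre>
```python
from fractions import Fraction
from math import comb

def p_at_least_K(N, M, K, n):
    """P(X >= K) for X hypergeometric: n prepared out of N, M drawn for the exam."""
    hits = sum(comb(n, k) * comb(N - n, M - k) for k in range(K, min(M, n) + 1))
    return Fraction(hits, comb(N, M))

print(p_at_least_K(6, 5, 3, 4))    # the worked example with n = 4
print(p_at_least_K(6, 5, 3, 3))    # the worked example with n = 3
print(p_at_least_K(10, 5, 5, 5))   # M = K = 5 out of N = 10, preparing only 5
```
</pre>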
<p style="text-align: left;">If this is too much for you, then stick to the safest strategy: if you are given <em>N</em> potential essay questions, and are told that you must choose <em>K</em> of them from a list of <em>M </em>during the exam, just prepare <em>N + K &#8211; M</em> of the essays.  If you feel like living dangerously, though, the point is that mathematics can help you to see how much risk you are taking on in the process.</p>
<p style="text-align: left;">(Hat tip to my betrothed for posing this question.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2010/12/humanities.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stand Up to Questionable Odds</title>
		<link>http://www.mathgoespop.com/2010/09/standup.html</link>
		<comments>http://www.mathgoespop.com/2010/09/standup.html#comments</comments>
		<pubDate>Wed, 15 Sep 2010 15:00:31 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math in the News]]></category>
		<category><![CDATA[Math on TV]]></category>
		<category><![CDATA[Sports]]></category>
		<category><![CDATA[bowling]]></category>
		<category><![CDATA[cancer]]></category>
		<category><![CDATA[jon stewart]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=717</guid>
		<description><![CDATA[<p style="text-align: left;">If you went to the movies in Los Angeles this summer, you may have seen the following ad from Stand Up to Cancer, a charitable program whose telethon aired last Friday night.  A clear homage to MasterCard&#8217;s long-running Priceless campaign, this ad swaps out prices for odds, ending with the sobering fact that 1 <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2010/09/standup.html">Stand Up to Questionable Odds</a></span>]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;">If you went to the movies in Los Angeles this summer, you may have seen the following ad from <a href="https://www.standup2cancer.org/Default.aspx">Stand Up to Cancer</a>, a charitable program whose telethon aired last Friday night.  A clear homage to <a href="http://en.wikipedia.org/wiki/MasterCard">MasterCard</a>&#8217;s long-running Priceless campaign, this ad swaps out prices for odds, ending with the sobering fact that 1 in 2 men and 1 in 3 women will be diagnosed with some type of cancer in their lifetime.</p>
<p style="text-align: center;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/rwC87ZKF1dQ?fs=1&amp;hl=en_US" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="640" height="385" src="http://www.youtube.com/v/rwC87ZKF1dQ?fs=1&amp;hl=en_US" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p style="text-align: left;">Presumably, those cancer odds are taken from the American Cancer Society, which has the relevant stats posted <a href="http://www.cancer.org/Cancer/CancerBasics/lifetime-probability-of-developing-or-dying-from-cancer">here</a>.  When it comes to some of the other claims in the ad, though, I couldn&#8217;t help but be skeptical.</p>
<p style="text-align: left;">Take the bowling claim, for instance.  This ad would have you believe that your odds of bowling a perfect game are 1 in 11,500.  This seems quite high, even when I consider the fact that I am not a bowling master.</p>
<p style="text-align: left;">Let&#8217;s try to reverse-engineer this statistic.  To score a perfect game in bowling, one must bowl 12 strikes in a row.  Let us suppose that your probability of bowling a strike on any given roll is some number <em>p</em>.  Furthermore, let&#8217;s suppose that your performance on any roll is independent of your performance on any other roll, so that you have a probability <em>p</em> of bowling a strike each time it&#8217;s your turn.  Of course, whether or not these events are independent is up for debate.  On the one hand, bowling many strikes in a row may make you more anxious about keeping your streak going, which may in turn decrease your probability of another strike; but on the other hand, if you are an adrenaline junkie who thrives in the limelight that only a bowling alley can provide, perhaps such a chain would make it more likely for your streak to continue.  In any event, these are questions better suited to a psychologist than a mathematician, so for simplicity let us ignore them here.</p>
<p style="text-align: left;">If you have a probability <em>p</em> of bowling a strike, and a perfect game requires 12 strikes, then the probability you will score a perfect game is the product of 12 copies of <em>p</em> (one for each strike), or <em>p</em><sup>12</sup>.  If the above ad is to be believed, this probability must equal 1/11500.  In other words,</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=p%5E%7B12%7D%20%3D%20%5Cfrac%7B1%7D%7B11500%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='p^{12} = \frac{1}{11500}.' title='p^{12} = \frac{1}{11500}.' class='latex' /></p>
<p style="text-align: left;">Taking the twelfth root of each side, we can then conclude that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=p%20%3D%20%5Csqrt%5B12%5D%7B%5Cfrac%7B1%7D%7B11500%7D%7D%20%5Capprox%20.4588.&#038;bg=T&#038;fg=000000&#038;s=0' alt='p = \sqrt[12]{\frac{1}{11500}} \approx .4588.' title='p = \sqrt[12]{\frac{1}{11500}} \approx .4588.' class='latex' /></p>
<p style="text-align: left;">In other words, your odds of bowling a perfect game are 1 in 11,500 if and only if the probability that you&#8217;ll bowl a strike is around 45.88%.  This seems like an extremely generous probability to give to the population at large.  After all, who among you or your circle of friends bowls a strike, on average, every other frame?  Perhaps I bowl exclusively with people who are not very good (myself included), but I would think a fairer probability for the entire population would be closer to 20% or 30% (maybe even this is too generous).</p>
<p style="text-align: left;">What do these two alternatives yield for the odds of bowling a perfect game?  Well, if <em>p</em> = .3, then <em>p</em><sup>12</sup> is approximately 1 in 1,881,676; for <em>p</em> = .2, the odds plummet to 1 in 244,140,625.  Both of these are significantly lower than the odds cited in the ad (roughly 164 and 21,230 times lower, respectively).</p>
<p style="text-align: left;">It may be that the odds of witnessing a perfect game are around 1 in 11,500.  For example, when you go to a bowling alley, there may be experienced players practicing.  Moreover, there are many games occurring simultaneously at a bowling alley, thus increasing the odds that at least one of them will be a perfect game.  But saying &#8220;you have a 1 in 11,500 chance of seeing someone else bowl a perfect game&#8221; doesn&#8217;t sound as sexy as &#8220;you have a 1 in 11,500 chance of bowling a perfect game,&#8221; I suppose.</p>
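<p style="text-align: left;">The arithmetic here is easy to reproduce.  The following Python sketch recovers the strike probability implied by the ad and the perfect-game odds for a few assumed values of <em>p</em>; the constant names are my own, and the whole thing rests on the independence assumption discussed above.</p>
<pre>
```python
AD_ODDS = 11_500        # the ad's claim: 1 in 11,500
STRIKES_NEEDED = 12     # strikes in a perfect game

# Strike probability implied by the ad, assuming independent rolls.
implied_p = (1 / AD_ODDS) ** (1 / STRIKES_NEEDED)
print(round(implied_p, 4))

# Odds of a perfect game for a few assumed strike probabilities,
# and how many times longer those odds are than the ad's figure.
for p in (0.2, 0.3, implied_p):
    one_in = 1 / p ** STRIKES_NEEDED
    print(p, round(one_in), round(one_in / AD_ODDS, 1))
```
</pre>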
<p style="text-align: left;">Some of the other odds are questionable as well.  For example, the National Weather Service has some data <a href="http://www.lightningsafety.noaa.gov/medical.htm">here</a> that suggests the odds of being struck by lightning in a given year are about 1 in 500,000, not too far off from the ad&#8217;s claim of 1 in 576,000.  This isn&#8217;t an apples to apples comparison, though, because the ad does not specify that these are the odds you will be struck by lightning <em>in a given year</em>.  If we take into account the average lifespan in the United States (<a href="http://www.google.com/publicdata?ds=wb-wdi&amp;met=sp_dyn_le00_in&amp;idim=country:USA&amp;dl=en&amp;hl=en&amp;q=average+lifespan+in+us">approximately</a> 78 years), then the probability of being struck by lightning (in one&#8217;s lifetime, not in one particular year) is closer to 1 &#8211; (499,999/500,000)<sup>78</sup>, which is around 1 in 6,411.  Much higher, you&#8217;ll note, than the odds of bowling a perfect game.  (Once again, of course, we are assuming that the odds of being struck by lightning don&#8217;t vary from year to year, and that the odds of being struck in one year are independent of the odds in any other year.)</p>
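<p style="text-align: left;">The lifetime figure can be reproduced with a few lines of Python.  This is only a sketch under the same assumptions stated above (a constant 1 in 500,000 chance each year, with years independent of one another), and the function name is hypothetical.</p>

```python
# Sketch only: assumes a constant annual risk and independence between years.

def lifetime_probability(annual_one_in, years):
    """Chance of at least one strike over `years` years at 1-in-`annual_one_in` per year."""
    annual = 1.0 / annual_one_in
    # Complement of "never struck in any of the years".
    return 1.0 - (1.0 - annual) ** years


p = lifetime_probability(500000, 78)
print(p)             # lifetime probability
print(round(1 / p))  # the same number expressed as "1 in N"
```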
<p style="text-align: left;">If the point of the ad is to give us an intuitive understanding of how likely it is for us to develop cancer, then it seems important to give benchmarks that are accurate and relatable.  Most people have bowled, but few people will have a good intuitive understanding of what it means to face odds that are 1 in 1.8 million (the odds of bowling a perfect game if you get a strike 30% of the time).  I think the point of the ad is understood regardless, but it&#8217;s a shame that the claims leading up to this point weren&#8217;t checked more thoroughly.  Indeed, many of the odds quoted in the ad can be found <a href="http://www.funny2.com/odds.htm">here</a>, a humor website that offers no sources for any of its statistics.</p>
<p style="text-align: left;">Perhaps Jon Stewart was in charge of fact-checking.  Given his lack of understanding about the nature of this program, this is perhaps the most reasonable explanation.</p>
<p style="text-align: center;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/txmezfuV0P4?fs=1&amp;hl=en_US" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="640" height="385" src="http://www.youtube.com/v/txmezfuV0P4?fs=1&amp;hl=en_US" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2010/09/standup.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Top Chef Mathematics</title>
		<link>http://www.mathgoespop.com/2010/07/top-chef-mathematics.html</link>
		<comments>http://www.mathgoespop.com/2010/07/top-chef-mathematics.html#comments</comments>
		<pubDate>Wed, 28 Jul 2010 23:22:55 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Math and Food]]></category>
		<category><![CDATA[Math Gets Around]]></category>
		<category><![CDATA[Math on TV]]></category>
		<category><![CDATA[combinatorics]]></category>
		<category><![CDATA[Pigeonhole Principle]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[stirling's formula]]></category>
		<category><![CDATA[top chef]]></category>

		<guid isPermaLink="false">http://www.mathgoespop.com/?p=520</guid>
		<description><![CDATA[<p>If you like food, Washington DC, hubris, or reality television, then chances are you are a fan of Bravo&#8217;s cooking competition Top Chef.  Every year the show takes a group of aspiring chefs, places them in a house in a new city, and throws weekly challenges their way.  Following the Survivor template, every week one chef <span style="color:#777"> . . . &#8594; Read More: <a href="http://www.mathgoespop.com/2010/07/top-chef-mathematics.html">Top Chef Mathematics</a></span>]]></description>
			<content:encoded><![CDATA[<p>If you like food, Washington DC, hubris, or reality television, then chances are you are a fan of Bravo&#8217;s cooking competition <a href="http://en.wikipedia.org/wiki/Top_Chef">Top Chef</a>.  Every year the show takes a group of aspiring chefs, places them in a house in a new city, and throws weekly challenges their way.  Following the Survivor template, every week one chef is voted off, and at the end someone is crowned Top Chef (and given a large check).  This season, the action takes place in our nation&#8217;s capital.</p>
<p style="text-align: center;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="512" height="288" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="src" value="http://www.hulu.com/embed/NhByDp2e69Ld2PuJ5T1qQw" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="512" height="288" src="http://www.hulu.com/embed/NhByDp2e69Ld2PuJ5T1qQw" allowfullscreen="true"></embed></object></p>
<p style="text-align: left;">Now, a show such as this might seem to have very little to do with mathematics.  But look, and ye shall find.  In the second episode of this past season, the chefs were paired up for one of the challenges.  There were 16 chefs at the time, combining to make 8 pairs.  The pairing was determined by drawing knives: 16 knives were presented in a knife block, and each had a number on it from 1 to 8.  The number was printed on the blade, so each chef would walk to the block, draw a knife, and read the number.  The knives were not replaced afterwards.  Pairs were formed by people who drew the same number.</p>
<div id="attachment_558" class="wp-caption aligncenter" style="width: 542px"><a href="http://www.mathgoespop.com/wp-content/uploads/2010/07/Picture-14.png"><img class="size-full wp-image-558" title="topchef" src="http://www.mathgoespop.com/wp-content/uploads/2010/07/Picture-14.png" alt="" width="532" height="375" /></a><p class="wp-caption-text">This dude loves the number 3.</p></div>
<p style="text-align: left;">In this particular episode, the first six numbers drawn were 2, 1, 3, 6, 7, and 7.  In particular, the first pair was formed on the 6th draw.  This leads to a natural question: how long would you expect it to take before the first pair is formed?  Six draws seemed a bit long to me (I would have expected the first pair to have been formed sooner), so I immediately set about trying to understand the answer to this question.</p>
<p style="text-align: left;">To ease ourselves into it, let&#8217;s simplify things.  Instead of 8 pairs, suppose there were only 3.  And instead of knives, which are dangerous and pointy, let&#8217;s suppose people were choosing balls from a bag.  Rather than differentiating the balls by writing numbers on them, let&#8217;s differentiate them by color.  So suppose you have a bag with 3 pairs of balls: one pair red, one pair green, and one pair blue.</p>
<p style="text-align: left;">The game is this: you draw a ball from a bag and put it aside.  You keep doing this until you have drawn a pair.  The question is how long it will take before the first pair is drawn.<a href="http://www.mathgoespop.com/wp-content/uploads/2010/07/bag31.jpg"><img class="aligncenter size-full wp-image-561" title="bag3" src="http://www.mathgoespop.com/wp-content/uploads/2010/07/bag31.jpg" alt="" width="300" height="200" /></a></p>
<p style="text-align: left;">Right away we see that you will get a pair some time between your second and your fourth draw.  Obviously you can&#8217;t get a pair after only drawing one ball, so you need a minimum of two draws.  On the other hand, if you don&#8217;t have a pair after drawing three, you must have one ball of each color, which means your fourth draw MUST give you a pair of some color.</p>
<p style="text-align: left;">Given this observation, we can now start to calculate probabilities.  What is the probability that you will have a pair after two draws?  Well, this happens precisely when your first and second draws are the same color.  The probability of this happening is equal to 1/5, since there is no restriction on the first ball you draw, but then there is a 1 in 5 chance that the second one you draw will be the other ball with the same color.</p>
<div id="attachment_562" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.mathgoespop.com/wp-content/uploads/2010/07/bag3one.jpg"><img class="size-full wp-image-562" title="bag3one" src="http://www.mathgoespop.com/wp-content/uploads/2010/07/bag3one.jpg" alt="" width="300" height="200" /></a><p class="wp-caption-text">You have a 1 in 5 chance of picking the second green ball after picking the first one, for example.</p></div>
<p style="text-align: left;">What about the probability that you&#8217;ll have a pair after exactly three draws?  In order for this to happen, your second draw must be a different color than the first, and your third draw must be the same color as either your first or second draw.  Of the 5 balls remaining after your first draw, 4 will have a different color from the first, meaning that the probability of drawing a second ball which is a different color than the first is 4/5.  Similarly, the probability of drawing a third ball which is the same color as either the first or the second ball is 1/2 (see the picture below).  Thus, by the laws of conditional probability, the probability that your first pair comes on the third draw is 4/5 x 1/2 = 2/5.</p>
<div id="attachment_563" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.mathgoespop.com/wp-content/uploads/2010/07/bag3two.jpg"><img class="size-full wp-image-563" title="bag3two" src="http://www.mathgoespop.com/wp-content/uploads/2010/07/bag3two.jpg" alt="" width="300" height="200" /></a><p class="wp-caption-text">You have a 2 in 4 (i.e. 1 in 2) chance of pulling a blue or green ball given that the results of your first two draws were blue and green, for example.</p></div>
<p style="text-align: left;">The same argument works when calculating the odds that the pair will come on the fourth draw.  There is no restriction on the first draw, there is a 4/5 chance that your second draw will be a different color from the first, there is a 1/2 chance that the third draw will be a different color from both the first and the second, and there is then a 100% chance that your fourth draw will be the same color as one of your earlier draws.  This again gives a probability of 2/5.  We see that the probabilities add up to one, as they should.</p>
<p style="text-align: left;">Given these probabilities we can also calculate the <a href="http://en.wikipedia.org/wiki/Expected_value">expected value</a>: on average, how many draws will you need before you get a pair?  Since the probability of two draws is 1/5, the probability of three draws is 2/5, and the probability of four draws is 2/5, we see that the average is</p>
<p style="text-align: center;">2 x 1/5 + 3 x 2/5 + 4 x 2/5 = 16/5 = 3.2.</p>
<p style="text-align: left;">In other words, on average you will need 3.2 draws before you come up with a pair.</p>
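<p style="text-align: left;">If you would rather not trust the argument above, the three-color case is small enough to check by brute force.  The sketch below is an illustration, not part of the original calculation: it walks through every one of the 720 orderings of the six balls and records the draw on which the first pair appears.  The function name and the color labels are my own choices.</p>

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def first_pair_distribution(colors):
    """Exact distribution of the draw that completes the first pair, by enumeration."""
    # Two balls of each color; every ordering of the balls is equally likely.
    balls = [color for color in colors for _ in range(2)]
    counts = Counter()
    total = 0
    for order in permutations(range(len(balls))):
        seen = set()
        for draw, index in enumerate(order, start=1):
            color = balls[index]
            if color in seen:
                # This draw matches an earlier color, so the first pair is complete.
                counts[draw] += 1
                break
            seen.add(color)
        total += 1
    return {draw: Fraction(n, total) for draw, n in sorted(counts.items())}


dist = first_pair_distribution(["red", "green", "blue"])
expected = sum(draw * prob for draw, prob in dist.items())
print(dist)
print(expected)
```

<p style="text-align: left;">Enumeration like this is only practical for a handful of colors, since the number of orderings grows factorially.</p>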
<p style="text-align: left;">It&#8217;s more interesting, of course, to deal with <em>c</em> different colors, rather than just 3.  We can still perform this analysis, and try to find probabilities and expectations.  Suppose we have <em>c</em> pairs of balls, each pair of a different color.  We draw the balls without replacement from a bag until we find a pair of the same color, then we stop.  We can define a <a href="http://en.wikipedia.org/wiki/Random_variable">random variable</a> <img src='http://s.wordpress.com/latex.php?latex=Y_c&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_c' title='Y_c' class='latex' /> to be the draw on which we complete our first pair. For example, in the case <em>c</em> = 3 above, we saw that <img src='http://s.wordpress.com/latex.php?latex=Y_3%20%3D%202&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_3 = 2' title='Y_3 = 2' class='latex' /> with probability 1/5, <img src='http://s.wordpress.com/latex.php?latex=Y_3%20%3D%203&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_3 = 3' title='Y_3 = 3' class='latex' /> with probability 2/5, and <img src='http://s.wordpress.com/latex.php?latex=Y_3%20%3D%204&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_3 = 4' title='Y_3 = 4' class='latex' /> with probability 2/5.</p>
<p style="text-align: left;">As before, notice that <img src='http://s.wordpress.com/latex.php?latex=Y_c&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_c' title='Y_c' class='latex' /> must take a value between 2 and <em>c</em> + 1.  This is because we can&#8217;t draw a pair before our 2nd draw, and after <em>c</em> draws, the worst case scenario is for us to have each ball of a different color.  Since we have exhausted all color possibilities, the <em>c</em> + 1st draw must give us a pair (in essence we are applying the <a href="http://en.wikipedia.org/wiki/Pigeonhole_principle">Pigeonhole Principle</a>).  So, to describe the behavior of <img src='http://s.wordpress.com/latex.php?latex=Y_c&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_c' title='Y_c' class='latex' /> we need to calculate the probability that <img src='http://s.wordpress.com/latex.php?latex=Y_c%20%3D%20k&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_c = k' title='Y_c = k' class='latex' /> for <em>k</em> between 2 and <em>c</em> + 1.</p>
<p style="text-align: left;">The same sort of argument as in the simple case <em>c</em> = 3 works here.  Suppose you want to calculate <img src='http://s.wordpress.com/latex.php?latex=P%28Y_c%20%3D%20k%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(Y_c = k)' title='P(Y_c = k)' class='latex' />.  In order to find your first pair on the <em>k</em>th draw, you need to NOT draw a pair on your 2nd, 3rd, 4th, &#8230;, or <em>k</em> &#8211; 1st draw, and then have the color on the <em>k</em>th draw match one of the colors you have already drawn.  Since there are a total of 2<em>c</em> balls in the bag to begin with, we see that the odds of not drawing a pair on the 2nd draw is <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B2c-2%7D%7B2c-1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{2c-2}{2c-1}' title='\frac{2c-2}{2c-1}' class='latex' />, the odds of not drawing a pair on the 3rd draw is <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B2c-4%7D%7B2c-2%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{2c-4}{2c-2}' title='\frac{2c-4}{2c-2}' class='latex' /> (since there are 2<em>c</em> &#8211; 2 balls remaining, and you want to avoid 2 that are colors you&#8217;ve already drawn, leaving you with 2<em>c</em> &#8211; 2 &#8211; 2 = 2<em>c</em> &#8211; 4 options), and so on, so that the odds of not getting a pair on the <em>k &#8211; </em>1st draw is <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B2c-2%28k-2%29%7D%7B2c-%28k-2%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{2c-2(k-2)}{2c-(k-2)}' title='\frac{2c-2(k-2)}{2c-(k-2)}' class='latex' />.  Meanwhile, in order to draw a pair on your <em>k</em>th draw, you must pull one of the <em>k</em> &#8211; 1 colors that have already been pulled.  
Since there are 2<em>c</em> &#8211; k + 1 balls remaining, the probability that this happens is <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7Bk-1%7D%7B2c-k%2B1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{k-1}{2c-k+1}' title='\frac{k-1}{2c-k+1}' class='latex' />.</p>
<p style="text-align: left;">Combining these, we see that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28Y_c%20%3D%20k%29%20%3D%20%5Cfrac%7Bk-1%7D%7B2c-k%2B1%7D%5Cprod_%7Bj%3D1%7D%5E%7Bk-2%7D%5Cfrac%7B2c-2j%7D%7B2c-j%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(Y_c = k) = \frac{k-1}{2c-k+1}\prod_{j=1}^{k-2}\frac{2c-2j}{2c-j}' title='P(Y_c = k) = \frac{k-1}{2c-k+1}\prod_{j=1}^{k-2}\frac{2c-2j}{2c-j}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%20%5Cfrac%7Bk-1%7D%7B2c-k%2B1%7D2%5E%7Bk-2%7D%5Cprod_%7Bj%3D1%7D%5E%7Bk-2%7D%5Cfrac%7Bc-j%7D%7B2c-j%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='= \frac{k-1}{2c-k+1}2^{k-2}\prod_{j=1}^{k-2}\frac{c-j}{2c-j}' title='= \frac{k-1}{2c-k+1}2^{k-2}\prod_{j=1}^{k-2}\frac{c-j}{2c-j}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%202%5E%7Bk-2%7D%5Cfrac%7Bk-1%7D%7B2c-k%2B1%7D%5Cfrac%7B%28c-1%29%21%7D%7B%282c-1%29%21%7D%5Cfrac%7B%282c-k%2B1%29%21%7D%7B%28c-k%2B1%29%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='= 2^{k-2}\frac{k-1}{2c-k+1}\frac{(c-1)!}{(2c-1)!}\frac{(2c-k+1)!}{(c-k+1)!}' title='= 2^{k-2}\frac{k-1}{2c-k+1}\frac{(c-1)!}{(2c-1)!}\frac{(2c-k+1)!}{(c-k+1)!}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%20%5Cfrac%7B2%5E%7Bk-2%7D%5Cleft%28%5Cbegin%7Barray%7D%7Bc%7Dc-1%5C%5Ck-2%5Cend%7Barray%7D%5Cright%29%7D%7B%5Cleft%28%5Cbegin%7Barray%7D%7Bc%7D2c-1%5C%5Ck-1%5Cend%7Barray%7D%5Cright%29%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='= \frac{2^{k-2}\left(\begin{array}{c}c-1\\k-2\end{array}\right)}{\left(\begin{array}{c}2c-1\\k-1\end{array}\right)}.' title='= \frac{2^{k-2}\left(\begin{array}{c}c-1\\k-2\end{array}\right)}{\left(\begin{array}{c}2c-1\\k-1\end{array}\right)}.' class='latex' /></p>
<p>(Recall that the binomial coefficient <img src='http://s.wordpress.com/latex.php?latex=%5Cleft%28%5Cbegin%7Barray%7D%7Bc%7Dn%5C%5Ck%5Cend%7Barray%7D%5Cright%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\left(\begin{array}{c}n\\k\end{array}\right)' title='\left(\begin{array}{c}n\\k\end{array}\right)' class='latex' /> is given by <img src='http://s.wordpress.com/latex.php?latex=%5Cleft%28%5Cbegin%7Barray%7D%7Bc%7Dn%5C%5Ck%5Cend%7Barray%7D%5Cright%29%20%3D%20%5Cfrac%7Bn%21%7D%7Bk%21%28n-k%29%21%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\left(\begin{array}{c}n\\k\end{array}\right) = \frac{n!}{k!(n-k)!}' title='\left(\begin{array}{c}n\\k\end{array}\right) = \frac{n!}{k!(n-k)!}' class='latex' />, and as usual n! = n x (n-1) x &#8230; 3 x 2 x 1 is the product of all integers from 1 to n.) With this formula, we can now see how likely it was for the first pairing on Top Chef to have occurred on or after the 6th draw.  In this case there are 8 pairs, so <em>c</em> = 8, and we see that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P%28Y_8%20%5Cgeq%206%29%20%3D%20%5Csum_%7Bk%3D6%7D%5E%7B9%7DP%28Y_8%20%3D%20k%29%20%3D%20%5Csum_%7Bk%3D6%7D%5E%7B9%7D%20%5Cfrac%7B2%5E%7Bk-2%7D%5Cleft%28%5Cbegin%7Barray%7D%7Bc%7D7%5C%5Ck-2%5Cend%7Barray%7D%5Cright%29%7D%7B%5Cleft%28%5Cbegin%7Barray%7D%7Bc%7D15%5C%5Ck-1%5Cend%7Barray%7D%5Cright%29%7D%2C&#038;bg=T&#038;fg=000000&#038;s=0' alt='P(Y_8 \geq 6) = \sum_{k=6}^{9}P(Y_8 = k) = \sum_{k=6}^{9} \frac{2^{k-2}\left(\begin{array}{c}7\\k-2\end{array}\right)}{\left(\begin{array}{c}15\\k-1\end{array}\right)},' title='P(Y_8 \geq 6) = \sum_{k=6}^{9}P(Y_8 = k) = \sum_{k=6}^{9} \frac{2^{k-2}\left(\begin{array}{c}7\\k-2\end{array}\right)}{\left(\begin{array}{c}15\\k-1\end{array}\right)},' class='latex' /></p>
<p style="text-align: left;">which comes out to 16/39, or roughly 41.03%.</p>
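<p style="text-align: left;">Here is a short Python sketch of that sum, using exact fractions so that nothing is lost to rounding.  It simply evaluates the binomial-coefficient formula above; the function name is hypothetical, and <code>math.comb</code> assumes Python 3.8 or later.</p>

```python
from fractions import Fraction
from math import comb


def first_pair_pmf(c, k):
    """P(Y_c = k) from the closed form 2^(k-2) * C(c-1, k-2) / C(2c-1, k-1)."""
    return Fraction(2 ** (k - 2) * comb(c - 1, k - 2), comb(2 * c - 1, k - 1))


# Probability that the first pair among 8 pairs arrives on draw 6, 7, 8, or 9.
tail = sum(first_pair_pmf(8, k) for k in range(6, 10))
print(tail, float(tail))

# Sanity check: the probabilities over k = 2, ..., 9 should add up to one.
print(sum(first_pair_pmf(8, k) for k in range(2, 10)))
```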
<p style="text-align: left;">Of course, there&#8217;s still the question of expectation: approximately how large do we expect <img src='http://s.wordpress.com/latex.php?latex=Y_c&#038;bg=T&#038;fg=000000&#038;s=0' alt='Y_c' title='Y_c' class='latex' /> to be (remember we saw that <img src='http://s.wordpress.com/latex.php?latex=E%28Y_3%29%20%3D%2016%2F5&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_3) = 16/5' title='E(Y_3) = 16/5' class='latex' />)?  I&#8217;ll spare you the details, but one can show that for general <em>c</em>, the expected value is given by</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=E%28Y_c%29%20%3D%20%5Cfrac%7B2%5E%7B2c%7D%28c%21%29%5E2%7D%7B%282c%29%21%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_c) = \frac{2^{2c}(c!)^2}{(2c)!}.' title='E(Y_c) = \frac{2^{2c}(c!)^2}{(2c)!}.' class='latex' /></p>
<p style="text-align: left;">In particular, in the case <em>c</em> = 8 from Top Chef, we find that <img src='http://s.wordpress.com/latex.php?latex=E%28Y_8%29%20%3D%20%5Cfrac%7B32768%7D%7B6435%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_8) = \frac{32768}{6435}' title='E(Y_8) = \frac{32768}{6435}' class='latex' />, which is approximately 5.09.  So on average, for <em>c</em> = 8 we expect to find a pair after a little more than 5 draws.</p>
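<p style="text-align: left;">Since I skipped the derivation, a numerical check seems only fair.  The sketch below computes the expectation two ways: by summing <em>k</em> times the probability formula from earlier, and by plugging into the closed form.  It is a spot check for particular values of <em>c</em>, not a proof, and the function names are made up for illustration.</p>

```python
from fractions import Fraction
from math import comb, factorial


def expected_first_pair(c):
    """E(Y_c) summed directly from P(Y_c = k) for k = 2, ..., c + 1."""
    return sum(
        k * Fraction(2 ** (k - 2) * comb(c - 1, k - 2), comb(2 * c - 1, k - 1))
        for k in range(2, c + 2)
    )


def expected_closed_form(c):
    """The closed form 2^(2c) * (c!)^2 / (2c)!."""
    return Fraction(2 ** (2 * c) * factorial(c) ** 2, factorial(2 * c))


for c in (3, 8):
    direct = expected_first_pair(c)
    closed = expected_closed_form(c)
    print(c, direct, closed, float(closed))
```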
<p style="text-align: left;">For the more advanced reader, one final question: what happens to the expected value as <em>c</em> grows large?  As it turns out, we can write the expected value in a very nice form in terms of the <a href="http://en.wikipedia.org/wiki/Gamma_function">Gamma function</a> (which one can think of as a generalization of the factorial to the entire real line).  Using the doubling formula <img src='http://s.wordpress.com/latex.php?latex=%5CGamma%28z%29%5CGamma%28z%2B1%2F2%29%20%3D%202%5E%7B1-2z%7D%5Csqrt%7B%5Cpi%7D%5CGamma%282z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\Gamma(z)\Gamma(z+1/2) = 2^{1-2z}\sqrt{\pi}\Gamma(2z)' title='\Gamma(z)\Gamma(z+1/2) = 2^{1-2z}\sqrt{\pi}\Gamma(2z)' class='latex' />, the interested reader can show that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=E%28Y_c%29%20%3D%20%5Csqrt%7B%5Cpi%7D%5Cfrac%7B%5CGamma%28c%2B1%29%7D%7B%5CGamma%28c%2B1%2F2%29%7D.&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_c) = \sqrt{\pi}\frac{\Gamma(c+1)}{\Gamma(c+1/2)}.' title='E(Y_c) = \sqrt{\pi}\frac{\Gamma(c+1)}{\Gamma(c+1/2)}.' class='latex' /></p>
<p style="text-align: left;">If one then uses <a href="http://en.wikipedia.org/wiki/Stirling%27s_approximation">Stirling&#8217;s formula</a> to approximate the Gamma function, it follows that</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=E%28Y_c%29%20%5Capprox%20%5Csqrt%7Bc%5Cpi%7D%2C&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_c) \approx \sqrt{c\pi},' title='E(Y_c) \approx \sqrt{c\pi},' class='latex' /></p>
<p style="text-align: left;">in other words <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7BE%28Y_c%29%7D%7B%5Csqrt%7Bc%7D%7D%20%5Crightarrow%20%5Csqrt%7B%5Cpi%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{E(Y_c)}{\sqrt{c}} \rightarrow \sqrt{\pi}' title='\frac{E(Y_c)}{\sqrt{c}} \rightarrow \sqrt{\pi}' class='latex' /> as <img src='http://s.wordpress.com/latex.php?latex=c%20%5Crightarrow%20%5Cinfty.&#038;bg=T&#038;fg=000000&#038;s=0' alt='c \rightarrow \infty.' title='c \rightarrow \infty.' class='latex' />  What a wonderful asymptotic!  This tells us that the number of draws we will need from a bag of <em>c</em> pairs before obtaining our first pair grows like the square root of <em>c</em> times a factor of <img src='http://s.wordpress.com/latex.php?latex=%5Csqrt%7B%5Cpi%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\sqrt{\pi}' title='\sqrt{\pi}' class='latex' />.  We can compare the estimate given here for <img src='http://s.wordpress.com/latex.php?latex=E%28Y_8%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_8)' title='E(Y_8)' class='latex' /> with the exact value computed above &#8211; in doing so, we find that <img src='http://s.wordpress.com/latex.php?latex=E%28Y_8%29%20%5Capprox%20%5Csqrt%7B8%5Cpi%7D%20%5Capprox%205.01&#038;bg=T&#038;fg=000000&#038;s=0' alt='E(Y_8) \approx \sqrt{8\pi} \approx 5.01' title='E(Y_8) \approx \sqrt{8\pi} \approx 5.01' class='latex' />.  So indeed, the approximation is fairly close to the true value (and the approximation will only get better as <em>c</em> grows).</p>
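<p style="text-align: left;">To watch the approximation improve, one can evaluate the Gamma-function form numerically and compare it with the square-root estimate.  This is a rough sketch: it works with <code>math.lgamma</code> (the logarithm of the Gamma function) so that large values of <em>c</em> do not overflow, and the function name and the sample values of <em>c</em> are arbitrary choices of mine.</p>

```python
from math import exp, lgamma, pi, sqrt


def expected_gamma_form(c):
    """sqrt(pi) * Gamma(c + 1) / Gamma(c + 1/2), computed via log-gamma."""
    return sqrt(pi) * exp(lgamma(c + 1) - lgamma(c + 0.5))


for c in (3, 8, 100, 10000):
    exact = expected_gamma_form(c)
    approx = sqrt(c * pi)
    # The last column is the ratio exact / approx, which should drift toward 1.
    print(c, round(exact, 4), round(approx, 4), round(exact / approx, 6))
```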
<p style="text-align: left;">There are many related questions one could ask.  For example, what if instead of pairs, we look at collections of triplets, or quadruplets?  What if we consider formation of the 2nd pair or 3rd pair instead of only considering the 1st pair?  What if we allow for different numbers of balls of each color (e.g. 2 red balls and 3 green balls)?  But I&#8217;ve already gone on too long, so I will leave these questions for another time.  I don&#8217;t know if these questions go by a certain name or not &#8211; I couldn&#8217;t find this particular problem anywhere.  If anyone knows of a paper or book where these problems are discussed, I would be much obliged.</p>
<p style="text-align: left;">In the meantime, I will close with a picture of <a href="http://en.wikipedia.org/wiki/Tom_Colicchio">Tom Colicchio</a> looking like a badass.  Clearly the worlds of chefs and rock stars have collided &#8211; will mathematicians be next?</p>
<p style="text-align: left;"><a href="http://www.mathgoespop.com/wp-content/uploads/2010/07/Picture-15.png"><img class="aligncenter size-full wp-image-598" title="tc" src="http://www.mathgoespop.com/wp-content/uploads/2010/07/Picture-15.png" alt="" width="375" height="310" /></a></p>
<p style="text-align: left;">(Kudos to <a href="http://rtm.wustl.edu/index.html">Dr. Moore</a> for some helpful commentary.)</p>
<p style="text-align: center;">&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathgoespop.com/2010/07/top-chef-mathematics.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

