Unless you’re one of those suckers who go to a school that administers final exams after the holidays (like I was), the few weeks after Thanksgiving can be quite a stressful time for students. Between exams, final papers, and working out holiday travel plans, it can be easy to get overwhelmed. For students with a quantitative bent, the days are undoubtedly spent in large part memorizing formulas or theorems, or refining their understanding of certain problem-solving techniques that have been covered in their courses.

If your interests are more in line with the humanities, you may think that you are safe from the pull of mathematics. There are occasions, though, when a working knowledge of mathematics can help even in a liberal arts course.

Consider the following example. Suppose you’re enrolled in a course for which the final exam will have a large essay component. To help you study, your teacher gives you a list of *N* potential essay questions. Moreover, she tells you that on the exam, some smaller number *M* of those exact questions will appear, and of those that do appear, you must select *K* to answer on the exam. The question then becomes: what’s the minimum number of essay questions that you should prepare?

If you have a humanities background, and these capital letters scare you off, let’s consider a particular example – say, *N* = 6, *M* = 5, and *K* = 3. In other words, your teacher gives you 6 questions, you know that 5 of the 6 will end up on the exam, and out of those 5 you’ll have to answer 3. How many essay questions must you prepare to guarantee that you’ll be able to answer 3 of the 5 questions on the exam? Obviously, preparing for all 6 essays is overkill, since you know that only 5 questions will be on the exam. But preparing for 5 questions is overkill too, since the worst thing that can happen is that one of the questions you prepared is omitted from the list on the exam, in which case you will still have prepared for 5 – 1 = 4 of the available essays, one essay too many.

Following this reasoning, we see that the smallest number of essays you can get away with preparing while still guaranteeing that you can complete the exam is 4. If you prepare 4 essays, then in particular you DON’T prepare 2, and in a worst-case scenario, both of those 2 will be on the list of 5 that show up on the exam. Since you can’t respond to those two, you must respond to all three of the remaining questions, which you will be able to do, since all three are among the four you prepared. Note that this argument falls apart if you only prepare 3 questions, since in that case the worst-case scenario is that the three you didn’t prepare all end up on the exam, leaving you with only 5 – 3 = 2 questions you can answer.

Let’s now return to the general case. The same argument applies. Suppose you are given *N* questions, *M* of which you know will be on an exam, out of which you can choose *K* to answer. This means that you are free to ignore *M* – *K* of the questions on the exam. In particular, the safe thing to do is to prepare for all but *M* – *K* of the questions, since then, in the worst-case scenario, all *M* – *K* of the questions you did not prepare will appear on the exam, and those are precisely the *M* – *K* questions you are free to ignore. In other words, the safest strategy would be to prepare *N* – (*M* – *K*) = *N* + *K* – *M* of the questions; preparing any fewer leaves open the possibility that more than *M* – *K* unprepared questions show up. Note that this agrees with our previous example; if *N* = 6, *M* = 5, and *K* = 3, then *N* + *K* – *M* = 6 + 3 – 5 = 4.
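If you would rather trust a computer than an argument, the claim can be checked by brute force. The following is a minimal Python sketch, not anything from the original discussion: the function names are my own, and it assumes 1 ≤ *K* ≤ *M* ≤ *N*. It tries every possible exam and finds the smallest number of prepared essays that always leaves *K* answerable questions.

```python
from itertools import combinations


def safe_number(N, M, K):
    """Essays to prepare so that K answerable questions are guaranteed."""
    return N + K - M


def guaranteed(N, M, K, n):
    """Brute force: does preparing n essays work for every possible exam?"""
    # By symmetry it does not matter which n essays we prepare,
    # so we simply take the first n.
    prepared = set(range(n))
    return all(
        len(prepared & set(exam)) >= K
        for exam in combinations(range(N), M)
    )


def smallest_safe_by_brute_force(N, M, K):
    """Smallest n for which every possible exam leaves K answerable questions."""
    return next(n for n in range(N + 1) if guaranteed(N, M, K, n))


print(safe_number(6, 5, 3))                   # formula
print(smallest_safe_by_brute_force(6, 5, 3))  # exhaustive check
```

For the running example both lines print 4. The brute-force version is only practical for small *N*, since the number of possible exams grows quickly.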

This also gives you an idea of what to hope for if you are a student. Note that the larger *M* is relative to *K*, the better off a student will be. Meanwhile, the smaller *M* is relative to *K*, the more work a student will have to do in order to be fully prepared for the exam. If the teacher gives out 10 questions and says that 2 will be on the exam, both of which must be answered, this is a much worse situation for the student than if the teacher gives out 10 questions, says that all 10 will be on the exam, and the student will have to choose two to discuss (in the first case, *N* + *K* – *M* = 10 + 2 – 2 = 10, so the student must prepare all 10 questions, while in the second case *N* + *K* – *M* = 10 + 2 – 10 = 2, so the student must prepare only 2).

Of course, not all students will follow this strategy. Notice that if a student is lucky, he may get away with preparing only *K* essays, since in the best-case scenario the *K* essays that the student prepares will all be among the *M* essays that are on the exam. So no matter what, a student should prepare somewhere between *K* and *N* + *K* – *M* essays (in our first example, this equates to either 3 or 4 essays). In the event that *N* + *K* – *M* is much larger than *K*, though, it may be tempting to prepare fewer essays and simply hope that luck is on one’s side.

Fortunately, mathematics can help us in this case as well (watch out, though; the math required from here on out is more substantial). For we can ask the question “If we prepare only *n* essays for some *n* between *K* and *N* + *K* – *M*, what is the probability that we will be able to answer *K* of the essay questions posed on the exam?” In fact, these probabilities are well understood. Indeed, if we let *X* denote the number of questions selected for the exam that you’ve prepared for, then *X* follows a hypergeometric distribution. This tells us that the probability that the number of questions on the exam that you’ve prepared for is equal to *k* is, using the notation introduced here,

$$P(X = k) = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}$$

(for help with the notation, see the link above; $\binom{a}{b}$ counts the number of ways to choose *b* items from *a*). In particular, if you want to compute the probability that *X* is at least *K* (in other words, that the number of questions you prepared that are on the exam is at least the minimum necessary for you to complete the exam successfully), this can be found by adding up these probabilities from *k* = *K* to *k* = min(*n*, *M*). Thus, we have

$$P(X \geq K) = \sum_{k=K}^{\min(n, M)} \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}.$$

We can check this formula against our example above. When *N* = 6, *M* = 5, and *K* = 3, one must prepare either 3 or 4 essays (i.e. *n* = 3 or *n* = 4). When *n* = 4, the equation yields $P(X \geq 3) = \frac{\binom{5}{3}\binom{1}{1} + \binom{5}{4}\binom{1}{0}}{\binom{6}{4}} = \frac{10 + 5}{15} = 1$, i.e. there is a 100% chance that you’ll be able to complete the exam successfully. Of course, this makes sense given our discussion above. If you only prepare 3 essays, however, then $P(X \geq 3) = \frac{\binom{5}{3}\binom{1}{0}}{\binom{6}{3}} = \frac{10}{20} = \frac{1}{2}$, so there’s only a 50% chance that you’ll be able to answer three of the five essay questions on the exam.
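The same check can be done in a few lines of Python. This is a minimal sketch assuming the standard hypergeometric probability mass function; the function name `prob_can_finish` is my own invention, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction
from math import comb


def prob_can_finish(N, M, K, n):
    """P(X >= K), where X counts the prepared essays that appear on the exam.

    N: questions handed out, M: questions on the exam,
    K: questions you must answer, n: essays you prepared.
    """
    favorable = sum(
        comb(M, k) * comb(N - M, n - k)  # exactly k prepared essays on the exam
        for k in range(K, min(n, M) + 1)
    )
    return Fraction(favorable, comb(N, n))


print(prob_can_finish(6, 5, 3, 4))  # prepare 4 essays
print(prob_can_finish(6, 5, 3, 3))  # prepare 3 essays
```

The two printed values are 1 and 1/2, matching the hand computation.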

One can discover other things from this formula as well. For example, if you are a lazy student, and only prepare the minimum number of essays (that is, *K* of them), then *n* = *K*, and the probability that *X* is at least *K* becomes the probability that *X* is exactly *K*, which simplifies to $\binom{M}{K} / \binom{N}{K}$. Note that this is 1 if *N* = *M*, which is the case most advantageous to the student; in other words, in this case, you are guaranteed to be able to answer all the questions by preparing the minimum number of questions. However, in the case least advantageous to the student, where *M* = *K*, the probability becomes $1 / \binom{N}{K}$, which is 1 when *K* = *N* but can otherwise be much smaller. For example, if your teacher gives you 10 questions and tells you 5 will be on the exam, of which you must answer all 5, the chances that you will be able to complete the exam by only preparing 5 questions are only 1 in 252!
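To see how quickly the risk shrinks as you prepare more essays, one can tabulate the probability for every *n* between *K* and *N* + *K* – *M*. Again, this is only a sketch under the hypergeometric assumption; `prob_can_finish` and `risk_table` are hypothetical names of my own choosing.

```python
from fractions import Fraction
from math import comb


def prob_can_finish(N, M, K, n):
    """P(X >= K) when n of the N essays are prepared and M appear on the exam."""
    favorable = sum(
        comb(M, k) * comb(N - M, n - k)
        for k in range(K, min(n, M) + 1)
    )
    return Fraction(favorable, comb(N, n))


def risk_table(N, M, K):
    """Map each n from K up to the safe number N + K - M to P(X >= K)."""
    return {n: prob_can_finish(N, M, K, n) for n in range(K, N + K - M + 1)}


# The 10-question example: 5 on the exam, all 5 must be answered.
for n, p in risk_table(10, 5, 5).items():
    print(f"prepare {n:2d} essays: {p} (about {float(p):.1%})")
```

In the 10-question example the probability climbs from 1/252 when you prepare 5 essays to only 1/2 when you prepare 9, and reaches certainty only when you prepare all 10.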

If this is too much for you, then stick to the safest strategy: if you are given *N* potential essay questions, and are told that you must choose *K* of them from a list of *M* during the exam, just prepare *N* + *K* – *M* of the essays. If you feel like living dangerously, though, the point is that mathematics can help you to see how much risk you are taking on in the process.

(Hat tip to my betrothed for posing this question.)