Predicting Matriculation

Here (pdf) is the academic paper which came out of Peter Nurnberg’s ’09 thesis.

This paper provides an econometric analysis of the matriculation decisions made by students accepted to Williams College, one of the nation’s most highly selective colleges and universities. Using data for the Williams classes of 2008 through 2012 to estimate a yield model, we find that—conditional on the student applying to and being accepted by Williams—applicant quality as measured by standardized tests, high school GPA and the like, the net price a particular student faces (the sticker price minus institutional financial aid), the applicant’s race and geographic origin, plus the student’s artistic, athletic and academic interests, are strong predictors of whether or not the student will matriculate.

1) Kudos to Nurnberg for doing some excellent work. All thesis students should aspire to publish their work in an academic journal. Kudos also to Nurnberg’s advisors: Morton Schapiro and David Zimmerman.

2) Brickbats to Nurnberg (or should it really be to Schapiro and Zimmerman?) for not making the full text of Nurnberg’s thesis available online. (Prior discussion here.)

3) Want your economics and statistics thesis to be equally successful? Then write about Williams. Professor Steven Miller is eager to supervise thesis students (in math/stat) who want to analyze Williams data.

4) Should I spend a week or two going through the details of this paper? Reader requests are always welcome!


Academic Rating Details

I need to do a post in which I bring together everything we know about Williams admissions. Alas, no time today! But I can share this pdf with some details about the College’s Academic Rating system. See here for previous discussion. Comments:

1) The key point is that, if you are not an AR 1 or 2, Williams automatically rejects you unless you are in one of the special categories, and those special categories do not include “Wrote an amazing essay” or “Best editor of our high school paper in a decade.” There are plenty of such applicants with AR 2, many of whom Williams will also reject. So, if you are AR 3 or below, you are toast.

2) The single biggest exception category is the 65 or so athletic tips. Note that this is not the same thing as being a great high school athlete. You could be a national champion in something like gymnastics or ski jumping and Williams wouldn’t (really) care because Williams does not compete in gymnastics. To be a “tip,” a Williams coach must tell Admissions that she wants you.

3) The second biggest category is racial affirmative action, mainly black/Hispanic. Actually, it could be that this category is even bigger than athletic tips, but I am feeling PC today. It is unclear if Williams, like other elite schools, discriminates against Asian American applicants.

4) The third category, much smaller (I think) than athletics/race, is wealth. Williams does some non-trivial affirmative action for poor students (and/or students whose parents did not attend college) and for extremely rich students (whose parents have given or might be expected to make million dollar donations to the College).

5) I need a good shorthand description for these three categories: race/wealth/athletics. Suggestions? Beyond them, there are very few students who are admitted with AR 3 or below. (At least, that is my understanding. Contrary opinions welcome.)

6) Looking closely at the descriptions, it is obvious that some measures are more objective than others. Who can agree on the difference between an “exceptional” essay versus one that is merely “outstanding?” Given that, I would wager that the harder numbers — above 1450 math/verbal SAT, 33 or above ACT, 4’s and 5’s on AP exams — matter most.

7) Always keep in mind that high school quality is very important. Being in the 90th percentile of your class (that is, at the bottom of the top 10%) at Andover or Milton or Stuyvesant is better than being the valedictorian at more than half the high schools in the US.

8) To be honest, I can’t recall the source for this pdf. Probably somehow related to Peter Nurnberg’s ’09 thesis. Sorry! Does anyone recognize it?


Follow Up to Williams Thesis Use Dispute

I was going to title this post “Why My Critics are Clueless, As Usual,” but I am aiming for higher standards in 2011. Recall our lengthy dispute about the Academic Rating system at Williams. Sam, Derek and Rory demonstrated, to varying degrees, a sad inability to understand both Williams’ own policies and the broader ethos of academia. Only go below if you want the details.

First, we disputed my use of Peter Nurnberg’s senior thesis. I claimed that my usage was appropriate. My critics claimed that it was against Williams policies. (Some of them were also confused on related topics.) As most readers with a clue would have known — concepts like fair use are not that complex — I was correct. See the update at the bottom of the post.

This post has been slightly edited after conversations with Sylvia Brown, Williams Archivist. As a result, the comment thread below will not make much sense. Sorry! In its current form, the post is consistent with Williams policies with regard to the use of senior theses.

Now, it would be one thing if my conversations with Sylvia had forced me to make changes in the original post, but not a single fact from Nurnberg’s thesis has been removed. Every detail, whether about the exact scores needed for the different academic ratings or about the precise (and formerly secret) procedures used by the admissions office, is still in the current version, officially approved by the relevant authority at Williams. (I did clean up some aspects of the post to remove some confusions, as evidenced in the comment thread, about some side issues.)

So, anyone who asserted that my use of Nurnberg’s thesis was against Williams policy is wrong. Just ask Sylvia Brown!

Second, was my usage consistent with the broader ethos/standards of academia? This is, of course, different from my adherence to Williams’ policies. Perhaps Williams is an outlier. The claim, by various critics, that my actions were outside the bounds of normal academic standards was even more annoying (to me) than disputes about Williams policies. After all, there was a (very small) chance that I was wrong about Williams, but I know approximately as much about current academic norms as most readers.

But you don’t need to believe just me! Let’s consult some other academics.

Professor Andrew Gelman:

I don’t know anything about Williams policy but I have little sympathy for someone trying to restrict the discussion of a thesis on a blog! A thesis is public material and it would seem best for all concerned for any research to be accessible and discussed. I mean, sure, it wouldn’t be right to scan and post entire chapters without permission, but it doesn’t sound like you’re planning on doing that. The bit about “you may not copy or distribute any content without the permission . . .”–that just sounds ridiculous.

Also, I’m not sure how relevant it is whether the blog is commercial or academic. There’s some sort of continuous range, right? On one extreme is this blog right here. It’s non-commercial (we’ve in fact turned down requests to advertise) and it’s academic–actually hosted on a Columbia University computer. But what if we were not academic (if, for example, I worked at a company and hosted it on a server at home) or commercial (as with the many blogs that run a few ads). Or what if it were commercial and non-academic? For example, what if Slate magazine or the New York Times wanted to report some content from this undergraduate thesis? They wouldn’t need permission, right? At least, I don’t see why they shouldn’t be allowed to go to the library, read the thesis, and report what they find. (I’m not speaking of legalities here, just what seems reasonable to me.)

Professor Radford Neal:

Perhaps undergraduate theses ought not to be quoted, since the author may not want their immature thoughts to be widely known. But in this case, such quotes are explicitly allowed in academic publications. Is it really less embarrassing for your silly undergrad thoughts to be quoted in Science or Nature rather than in a blog post?

Either a thesis is out there, forming a part of the intellectual atmosphere, or it isn’t. There is no half-way status.

Read the whole discussion for details. The point here is not that all academics agree with me. They don’t! The point is that some academics agree with me and some disagree with me.

The most subtle argument against my use was raised, perhaps unsurprisingly, by Will Slack ’11. He argues that I lack the “moral standing to re-publish” information from Peter’s thesis. Before I grapple with this position, perhaps Will (and others) can flesh it out a bit. For example, does everyone lack this moral standing, or just me? Consider a Williams student writing a Record article about admissions policy at Williams. Does she have the “moral standing” to read Peter’s thesis and use some of the facts in her article? Does it matter if her article is a news story or an op-ed? What if, instead of writing this article for the Record, she wrote it on WSO, or even on EphBlog?

Unless Will (or others) can come up with some plausible grounds for distinguishing among these cases, I would recommend a different rule: Unless library policies specifically prohibit it, anyone writing anywhere may (accurately!) report the contents of anything in the Williams library.

Do readers disagree?


Academic Rating at Williams

This post summarizes what we know about the Academic Rating used by the Williams Admissions Office as the single most important input to their decision-making process. (Previous posts that have touched on this topic are here, here and here.)

“Students Choosing Colleges: Understanding the Matriculation Decision at a Highly Selective Private Institution” (pdf) by Peter Nurnberg ’09, Morton Schapiro and David Zimmerman provides the inside scoop on the definition of academic rating.

While the academic reader ratings are somewhat subjective, they are strongly influenced by the following guidelines.

  • Academic 1: at top or close to top of HS class / A record / exceptional academic program / 1520 – 1600 composite SAT I score;
  • Academic 2: top 5% of HS class / mostly A record / extremely demanding academic program / 1450 – 1520 composite SAT I score;
  • Academic 3: top 10% of HS class / many A grades / very demanding academic program / 1390 – 1450 composite SAT I score;
  • Academic 4: top 15% of HS class / A – B record / very demanding academic program / 1310 – 1400 composite SAT I score;
  • Academic 5: top 20% of HS class / B record / demanding academic program / 1260 – 1320 composite SAT I score;
  • Academic 6: top 20% of HS class / B record / average academic program / 1210 – 1280 composite SAT I score;
  • Academic 7: top 25% of HS class / mostly B record / less than demanding program / 1140 – 1220 composite SAT I score;
  • Academic 8: top 33% of HS class / mostly B record or below / concern about academic program / 1000 – 1180 composite SAT I score;
  • Academic 9: everyone else.
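Note that the SAT bands in these guidelines overlap at the edges (1450 appears under both Academic 2 and Academic 3, for example). A minimal sketch of a lookup, assuming SAT score is the only input and ignoring class rank, grades and program rigor entirely; the function name `candidate_ratings` is my own invention, not anything Williams uses:

```python
# SAT composite bands copied from the guidelines above.
# Academic 9 ("everyone else") has no band, so it is the fallback.
SAT_BANDS = {
    1: (1520, 1600),
    2: (1450, 1520),
    3: (1390, 1450),
    4: (1310, 1400),
    5: (1260, 1320),
    6: (1210, 1280),
    7: (1140, 1220),
    8: (1000, 1180),
}


def candidate_ratings(sat_composite):
    """Return every academic rating whose SAT band contains the score.

    Hypothetical helper: the real rating also weighs rank, grades and
    program rigor, none of which is modeled here.
    """
    matches = [
        rating
        for rating, (low, high) in SAT_BANDS.items()
        if low <= sat_composite <= high
    ]
    return matches or [9]


print(candidate_ratings(1450))  # → [2, 3], since the bands share an endpoint
print(candidate_ratings(900))   # → [9]
```

The overlaps are presumably where the "somewhat subjective" part of the reader rating comes in, which is why the sketch returns a list rather than a single number.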

(See here and here for previous discussion of this article, which is largely based on Nurnberg’s senior thesis.)

Never before has Williams revealed the exact details of the Academic Rating, abbreviated as AR by the cognoscenti. Nurnberg’s thesis, which College policy prevents me from directly quoting, includes a copy of “Class of 2009 Folder Reading Guide, Academic Ratings,” a Williams College document. Here (pdf) are some details.


      verbal   math     composite  SAT II   ACT    AP
AR 1: 770-800  750-800  1520-1600  750-800  35-36  mostly 5s
AR 2: 730-770  720-750  1450-1520  720-770  33-34  4s and 5s
AR 3: 700-730  690-720  1390-1450  690-730  32-33  4s


1) AR takes account of high school quality. It is common for a student in just the top 10% of an elite high school like Bronx Science or Andover to be an AR 1, assuming she has the test scores to match.

2) According to Nurnberg’s thesis, each file is rated by two admissions officers. If their ratings differ by no more than 1 point, the ratings are averaged. If they differ by 2 or more, there is an adjudication process. Any applicant with a score of 1.5 or 2.5 gets a third reader whose score is then incorporated in the average.

3) Incorporating essay strength and teacher recommendations into the overall AR is hard. Reasonable admissions officers will often disagree about whether a given essay is “exceptional” or merely “outstanding.”

4) The vast majority of applicants come from high schools that send multiple applications to Williams, either this year or in past years. So, many judgments are made by comparing students within the same high school against each other. This allows the Admissions Office to compare students’ high school grades even at schools that do not officially rank their students.
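The two-reader procedure in item 2 can be sketched in a few lines. This is only my reading of the description: the function name and status strings are hypothetical, the adjudication step is left as a flag because its details are not given, and treating "incorporated in the average" as a simple three-way mean is an assumption.

```python
def combine_ratings(first, second, third=None):
    """Combine reader ratings as described in item 2 above.

    Returns (rating, status). The rating is None while the file still
    needs adjudication or a third reader.
    """
    # Readers two or more points apart go to adjudication (details unknown).
    if abs(first - second) >= 2:
        return None, "adjudicate"

    average = (first + second) / 2

    # Borderline averages of 1.5 or 2.5 trigger a third reader.
    if average in (1.5, 2.5):
        if third is None:
            return None, "needs third reader"
        # Assumption: the third score enters as a simple three-way mean.
        return (first + second + third) / 3, "averaged with third reader"

    return average, "averaged"


print(combine_ratings(2, 2))     # → (2.0, 'averaged')
print(combine_ratings(1, 3))     # → (None, 'adjudicate')
print(combine_ratings(2, 3))     # → (None, 'needs third reader')
```

Notice that only the 1.5 and 2.5 cases get the extra reader; a split of 3 and 4 is simply averaged to 3.5, which fits the claim that the line that matters is the one between AR 2 and AR 3.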

How much does Academic Rating matter? Consider the Alumni Review’s own overview (pdf).

The full-time admission staffers, plus a handful of helpers like Phil Smith ’55 (Nesbitt’s predecessor as director), pore over the folders. Two readers examine each folder independently, without seeing each other’s comments, and assess them in three major ways. Each applicant gets an academic rating from 1 to 9 that focuses heavily on his or her high school grades, standardized test scores, the rigor of his or her academic program within the context of the school setting and the strength of teacher recommendations.

If the first and second readers’ academic ratings differ by more than a point, they put their heads together to try to reach a consensus rating. In general, all applicants with a combined academic rating of 3 or higher are rejected at this point, unless the first and second readers have identified one or more “attributes” that warrant additional consideration.

The reason that the admissions office assigns a third reader to applicants rated 2.5 is that there is a huge difference between a 2 and a 3. If you are a typical 3 (1400 SAT, 700 achievement tests, 4s on your AP exams), you will be rejected unless you have a specific attribute or hook.

And this is where Williams, and other elite schools, can become a bit dishonest. They sometimes like to pretend that it is common for such hooks to include things like “being a great kid” or “significant concern for the environment” or “wonderful musician.” This is not a lie insofar as, every once in a while, such a hook might make a difference. But 90% of the time or more, the students who are accepted to Williams without being an AR 1 or 2 fall into just a few special buckets: recruited athlete (Williams coach knows your name and wants you), under-represented minority (checked the black and/or Hispanic box on the Common Application), socio-economic (neither parent has a BA and you check the financial aid box) or legacy (parent or grand-parent went to Williams). Other attributes (development, employee child, local resident) can matter too, but athlete/URM/socio-ec/legacy are by far the four largest drivers of admissions for applicants with an academic rating below 2.

Contrary opinions and questions are welcome!

UPDATE: This post has been slightly edited after conversations with Sylvia Brown, Williams Archivist. As a result, the comment thread below will not make much sense. Sorry! In its current form, the post is consistent with Williams policies with regard to the use of senior theses. Later discussion here and here.


Nurnberg Thesis Commentary

Hello EphBlog community… I am a recently graduated Math/Economics major and wanted to offer my summary and perspective on Peter Nurnberg’s excellent thesis, a good portion of which was just submitted as a working paper at the NBER. I was relatively close to the process at the time of his writing. I offer a summary of the key points of the NBER paper, and some (not-so) brief commentary.

In short, the main questions addressed in the authors’ research are:
1) What drives a student to apply at a certain college?
2) What drives a school to accept a given student?
3) Given that a student has a choice among multiple colleges, what drives the student to choose one of them?

Note that these are all very different questions, and while Nurnberg addressed all of them in his thesis, the NBER working paper focuses only on the third. The third question matters to colleges because managing yield is extremely important: while downside error can be rectified by letting in a few more students from the waitlist, upside error is not easily corrected. If colleges are able to use this research to better plan their acceptee pools, the college application process becomes more efficient for all.

The data set consists of regular decision acceptees for the Williams classes of 2008 through 2012, with the dependent variable being the discrete choice of matriculation at Williams vs. matriculation at another institution. From analysis of these data, the authors are able to develop a yield model that can be applied to an acceptee pool with reasonable accuracy.

So what makes college selection interesting from an individual perspective?

For a student, attending college is both an investment (leading to stronger future job prospects, better networking) and a consumption product (having enjoyable classes, a wide range of extracurricular activities, strong sports teams). Each prospective matriculant seeks to optimize both of these, subject to a set of constraints (costs, desired distance from home, etc.) as well as their own individual preferences. The student applies this to their “choice set,” i.e., the list of colleges that the student has been accepted to, and determines which college best satisfies those preferences (see Figure 1 in the paper for a good graphical representation of this decision).
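To make the choice-set idea concrete, here is a toy sketch of a student picking from the colleges that admitted them. Everything in it is hypothetical: the college names, the scores, the single budget constraint and the weighted-sum utility are all made up for illustration and are not taken from the paper.

```python
def choose_college(choice_set, budget, invest_weight=0.5):
    """Pick the affordable college with the highest weighted utility.

    invest_weight is a hypothetical preference parameter: the share of
    weight the student puts on investment value versus consumption value.
    """
    # Constraint: drop colleges whose net price exceeds the budget.
    affordable = [c for c in choice_set if c["net_price"] <= budget]
    if not affordable:
        return None

    def utility(college):
        return (invest_weight * college["investment"]
                + (1 - invest_weight) * college["consumption"])

    return max(affordable, key=utility)["name"]


# Made-up choice set: the list of colleges that accepted the student.
choice_set = [
    {"name": "College A", "net_price": 30000, "investment": 0.9, "consumption": 0.6},
    {"name": "College B", "net_price": 20000, "investment": 0.7, "consumption": 0.9},
    {"name": "College C", "net_price": 50000, "investment": 1.0, "consumption": 1.0},
]

print(choose_college(choice_set, budget=35000))                     # → College B
print(choose_college(choice_set, budget=35000, invest_weight=0.9))  # → College A
```

Two students with the same choice set can land in different places simply because they weigh investment and consumption differently, which is exactly the kind of preference the researchers cannot observe directly.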

A difficulty with this research is the limited data available for each applicant. Although it may be plausible that a student with a passion for Appalachian Trail hiking would be more prone to attend Williams based on its proximity to the trail, these sorts of idiosyncratic preferences are not obvious from a student’s application. Instead, we are limited to the available variables: demographic data, financial aid status, location, whether the student visited campus, academic and non-academic admissions ratings (a quick and dirty proxy used by the admissions office for how strong an applicant is in academics and extracurriculars/other intangibles, respectively), and a few others.

When the authors plugged all these variables into the model, some of the variables had strong statistical significance. I highlight a few interesting results below:

Unsurprisingly, candidates with worse academic and nonacademic ratings were more likely to matriculate than stronger candidates, plausibly because stronger students have access to better choice sets. Additionally, despite our numerous museums, students with a strong studio art background were less likely to choose Williams (perhaps they had thought Williams was in Williamsburg, Brooklyn during their application process). Strong athletes (NB: not controlled for whether they have been ‘tipped’), on the other hand, were more likely to attend Williams, perhaps due to its athlete-friendly reputation and competitive sports teams.

However, all non-international minorities studied (Black, Hispanic, Asian, Native American) were much less likely to choose Williams, and according to Peter, these variables were “among the strongest predictors of [non-]matriculation.” While I don’t have a plausible explanation for this choice, it seems that minority outreach efforts still have a long way to go in making Williams a more welcoming place for underrepresented minorities.

Finally, while students who marked an interest in research science were somewhat less likely to come to Williams (perhaps spurning us for a large state school or an MIT/Caltech), students who marked their academic interest as “no idea” were much more likely to come to Williams, perhaps the most encouraging result of them all!

The upshot of the research is that the authors (and potentially the admissions office as well) are now able to run each candidate through the model and come up with a number that represents the percentage probability that the candidate will matriculate at Williams, potentially resulting in more efficient yield management and thus a more efficient application acceptance process.
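For readers who want to see what "running each candidate through the model" might look like, here is a minimal sketch. I am assuming a logistic form, which this post does not specify, and every coefficient below is invented for illustration; only the signs loosely follow the directions described above (weaker ratings, athletes and undecided students more likely to enroll, studio artists less so).

```python
import math

# Hypothetical coefficients, NOT the paper's estimates.
COEFFICIENTS = {
    "intercept": -0.5,
    "academic_rating": 0.3,   # larger number = weaker rating = more likely to enroll
    "studio_art": -0.4,
    "athlete": 0.5,
    "undecided_major": 0.6,
}


def matriculation_probability(applicant):
    """Logistic probability that an admitted applicant enrolls.

    applicant is a dict of feature values; missing features count as 0.
    """
    score = COEFFICIENTS["intercept"]
    for name, weight in COEFFICIENTS.items():
        if name != "intercept":
            score += weight * applicant.get(name, 0)
    return 1 / (1 + math.exp(-score))


def expected_class_size(admitted):
    """Expected enrollment is the sum of the individual probabilities."""
    return sum(matriculation_probability(a) for a in admitted)


# Made-up admitted pool of three applicants.
pool = [
    {"academic_rating": 1},
    {"academic_rating": 2, "athlete": 1},
    {"academic_rating": 2, "undecided_major": 1},
]
print(round(expected_class_size(pool), 2))
```

The yield-management use is the last function: sum the predicted probabilities over a proposed acceptee pool and compare the total with the target class size before the letters go out.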

While the model certainly isn’t perfect and much data are unavailable, this was nevertheless fantastic work by Peter, with credit to President Schapiro and Professor Zimmerman, as well as the admissions office for sharing the data. I highly recommend reading the paper, and would be happy to answer any more detailed questions about the statistical modeling in the comments.


Peter Nurnberg ’09 turns his thesis into a working paper!

Originally posted by Will Slack ’11 on WSO.

Coauthors: Morton Schapiro, David Zimmerman


The college choice process can be reduced to three questions:

1) Where does a student apply?

2) Which schools accept the students?

3) Which offer of admission does the student accept?

This paper addresses question three. Specifically, we offer an econometric analysis of the matriculation decisions made by students accepted to Williams College, one of the nation’s most highly selective colleges and universities. We use data for the Williams classes of 2008 through 2012 to estimate a yield model. We find that—conditional on the student applying to and being accepted by Williams—applicant quality as measured by standardized tests, high school GPA and the like, the net price a particular student faces (the sticker price minus institutional financial aid), the applicant’s race and geographic origin, plus the student’s artistic, athletic and academic interests, are strong predictors of whether or not the student will matriculate.

Full paper here or here (for those without NBER access).


