Hello EphBlog community. I am a recently graduated Math/Economics major, and I wanted to offer my perspective on Peter Nurnberg’s excellent thesis, a good portion of which was just submitted as an NBER working paper. I was relatively close to the process while he was writing it. Below is a summary of the key points of the NBER paper, followed by some (not-so) brief commentary.

In short, the main questions addressed in the authors’ research are:
1) What drives a student to apply to a given college?
2) What drives a school to accept a given student?
3) Given that a student has been accepted by multiple colleges, what drives the choice among them?

Note that these are very different questions. While Peter addressed all of them in his thesis, the NBER working paper focuses only on the third. That question matters to colleges because managing yield is critical: downside error (too few matriculants) can be rectified by letting in a few more students from the waitlist, but upside error (too many) is not easily corrected. If colleges are able to use this research to better plan their acceptee pools, the college application process becomes more efficient for all.

The data set consists of regular decision acceptees from 2008-2012, and the dependent variable is the discrete choice between matriculating at Williams and matriculating at another institution. From their analysis of this data, the authors develop a yield model that can be applied to an acceptee pool with reasonable accuracy.
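To make the setup concrete, here is a minimal sketch of a binary choice model in Python. This summary does not go into which estimator the authors used, so the logit form is my assumption, and the eight data points (a single "visited campus" flag and a matriculation outcome) are invented purely for illustration.

```python
import math

def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(rows, outcomes, steps=2000, lr=0.5):
    """Fit an intercept plus one coefficient per feature by
    gradient ascent on the mean log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, outcomes):
            features = [1.0] + list(x)
            p = sigmoid(sum(w * f for w, f in zip(weights, features)))
            for j, f in enumerate(features):
                grad[j] += (y - p) * f
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def predict(weights, x):
    """Predicted probability of matriculating for one acceptee."""
    features = [1.0] + list(x)
    return sigmoid(sum(w * f for w, f in zip(weights, features)))

# Invented data: feature = visited campus (1/0), outcome 1 = matriculated.
visited = [[1], [1], [1], [1], [0], [0], [0], [0]]
matriculated = [1, 1, 1, 0, 1, 0, 0, 0]

weights = fit_logit(visited, matriculated)
p_visited = predict(weights, [1])
p_not_visited = predict(weights, [0])
print(round(p_visited, 2), round(p_not_visited, 2))
```

With a single yes/no feature the fitted probabilities simply recover the group proportions in the toy data (3 of 4 visitors matriculate, 1 of 4 non-visitors). The real model has many more variables, which is where a regression earns its keep.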

So what makes college selection interesting from an individual perspective?

For a student, attending college is both an investment (stronger future job prospects, better networking) and a consumption good (enjoyable classes, a wide range of extracurricular activities, strong sports teams). Each prospective matriculant seeks to optimize both, subject to a set of constraints (cost, desired distance from home, etc.) and their own individual preferences. The student applies this to their “choice set,” i.e., the list of colleges that have accepted them, and picks the college that best satisfies those preferences (see Figure 1 in the paper for a good graphical representation of this decision).
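The decision described above can be sketched as a constrained maximization over the choice set. Everything in this example is hypothetical: the scores, the weights, the linear utility function, the budget constraint, and the two comparison schools are my own inventions to illustrate the idea, not anything taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class College:
    name: str
    investment: float       # hypothetical 0-10 score for career payoff
    consumption: float      # hypothetical 0-10 score for day-to-day enjoyment
    net_cost: float         # dollars per year after aid
    miles_from_home: float

@dataclass
class Preferences:
    weight_investment: float
    weight_consumption: float
    distance_penalty: float  # utility lost per mile from home

def utility(college, prefs):
    """Weighted sum of investment and consumption value, minus a distance cost."""
    return (prefs.weight_investment * college.investment
            + prefs.weight_consumption * college.consumption
            - prefs.distance_penalty * college.miles_from_home)

def choose(choice_set, prefs, budget):
    """Pick the highest-utility college among those the student can afford."""
    affordable = [c for c in choice_set if c.net_cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda c: utility(c, prefs))

# Invented choice set for one student.
choice_set = [
    College("Williams", 8.0, 7.0, 20000, 150),
    College("State U", 6.5, 8.0, 12000, 30),
    College("Other Private", 8.5, 7.5, 45000, 400),
]
prefs = Preferences(weight_investment=0.6, weight_consumption=0.4,
                    distance_penalty=0.001)

print(choose(choice_set, prefs, budget=30000).name)
print(choose(choice_set, prefs, budget=50000).name)
```

In this toy case the same student picks a different school once the budget constraint loosens, which is the sense in which the choice set and the constraints, not just the colleges themselves, drive the outcome.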

A difficulty with this research is the limited data available for each applicant. Although it is plausible that a student with a passion for hiking the Appalachian Trail would be more prone to attend Williams because of its proximity to the trail, these sorts of idiosyncratic preferences are not obvious from a student’s application. Instead, we are limited to the available variables: demographic data, financial aid status, location, whether the student visited campus, academic and non-academic admissions ratings (a quick and dirty proxy used by the admissions office for how strong an applicant is in academics and in extracurriculars/other intangibles, respectively), and a few others.

When the authors plugged all these variables into the model, several of them turned out to be strongly statistically significant. I highlight a few interesting results below:

Unsurprisingly, candidates with worse academic and non-academic ratings were more likely to matriculate than stronger candidates, plausibly because stronger students have access to better choice sets. Additionally, despite our numerous museums, students with a strong studio art background were less likely to choose Williams (perhaps they had thought Williams was in Williamsburg, Brooklyn during their application process). Strong athletes, on the other hand, were more likely to attend Williams (NB: this is not controlled for whether they were ‘tipped’), perhaps due to its athlete-friendly reputation and competitive sports teams.

However, all non-international minority groups studied (Black, Hispanic, Asian, and Native American) were much less likely to choose Williams, and according to Peter, these indicators were “among the strongest predictors of [non-]matriculation.” While I don’t have a plausible explanation for this pattern, it seems that minority outreach efforts still have a long way to go in making Williams a more welcoming place for underrepresented minorities.

Finally, while students who marked an interest in research science were somewhat less likely to come to Williams (perhaps spurning us for a large state school or an MIT/Caltech), students who marked their academic interest as “no idea” were much more likely to come to Williams, which is perhaps the most encouraging result of them all!

The upshot of the research is that the authors (and potentially the admissions office as well) are now able to run each candidate through the model and come up with a number that represents the probability, as a percentage, that the candidate will matriculate at Williams. This has the potential to make yield management, and thus the whole admissions process, more efficient.
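As a rough illustration of how per-candidate probabilities could feed into yield planning, the sketch below sums them into an expected class size and a measure of spread. The twelve probabilities are invented, and treating each student's decision as independent is my simplifying assumption, not something claimed in the paper.

```python
import math

def expected_enrollment(probs):
    """Expected number of matriculants: the sum of individual probabilities."""
    return sum(probs)

def enrollment_std(probs):
    """Standard deviation of the matriculant count, assuming each
    student's decision is independent of the others."""
    return math.sqrt(sum(p * (1 - p) for p in probs))

# Invented acceptee pool: predicted matriculation probability per student.
pool = [0.25] * 4 + [0.5] * 4 + [0.75] * 4

expected = expected_enrollment(pool)
spread = enrollment_std(pool)
# A cautious planning figure: expected size plus two standard deviations.
high_estimate = expected + 2 * spread

print(expected, round(spread, 2), round(high_estimate, 2))
```

Because overshooting the class is harder to fix than undershooting it, an admissions office might plan against something like the high estimate rather than the expected value alone, and use the waitlist to cover any shortfall.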

While the model certainly isn’t perfect and much data is unavailable, this was nevertheless fantastic work by Peter, with credit to President Schapiro and Professor Zimmerman, as well as to the admissions office for sharing the data. I highly recommend reading the paper, and I would be happy to answer more detailed questions about the statistical modeling in the comments.
