I know that the truly computer literate among the Eph would be able to do this in a much cooler way than I, but here is my pseudo-geek attempt to figure out all the female African-American faculty members on leave. Using R, it’s pretty easy . . .


Note that they really should use R for all the statistics and econometrics courses at Williams, but that is a rant for another day.

First I need to grab the data from the College.

> x = scan(file = "http://www.williams.edu/admin/registrar/geninfo/faculty.html", sep = "\n", what = "character")
Read 1168 items

The “>” is the R prompt. I’ll intersperse my fascinating comments amongst the code snippets. (And yes, I know, if I were cool I would do this in Perl.)

I realize that this is a hack on several levels, but all that I wanted to do was ensure that I get all the data from that web page and store all the lines as separate character strings in one vector called x.
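For anyone following along, here is a minimal sketch of the same idea using readLines(), which also returns one character string per line. The page vector is a made-up stand-in for the downloaded file (an assumption, so that the snippet runs offline); with a live connection you would hand the URL to readLines() instead.

```r
# Made-up stand-in for the faculty page (an assumption, not the real file).
page <- c("<html>",
          "<p>FACULTY 2004-2005</p>",
          "* Jane Q. Example   Professor of Examples",
          "</html>")

# readLines() returns one character string per line of input.
con <- textConnection(page)
x_demo <- readLines(con)
close(con)

length(x_demo)        # number of lines read
is.character(x_demo)  # TRUE: a plain character vector
```

Either way, the result is a plain character vector that grep() and gsub() can work on directly.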

Alas, if you are trying this at home, you can see all sorts of endlessly annoying HTML mark-up. To make things simpler, I am just going to delete this.

> x = gsub(‘

|||

|
‘, “”, x)

I realize that this renders as total gibberish in your browser. You can see the code by viewing the source for this page. I don’t know the HTML equivalent of the verbatim environment for LaTeX.
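Since the pattern above did not survive the trip through the browser, here is a rough sketch of the general technique only: one generic regular expression that strips anything that looks like a tag. Both the pattern and the sample lines are assumptions for illustration, not the original call.

```r
# Made-up lines with markup (assumption: not the actual page contents).
raw <- c("<b>* Jane Q. Example</b> Professor of Examples<br>",
         "<p>",
         "FACULTY 2004-2005")

# "<[^>]*>" matches a "<", any run of characters other than ">", then ">".
clean <- gsub("<[^>]*>", "", raw)
clean
```

Lines that held nothing but markup come back as empty strings, which fits the "" entries at the top and bottom of the vector.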

Anyway, this seems to have worked. Note that there are 1,168 items, but a lot of this data is junk. Let’s look at the top and bottom of the vector:

> head(x)
[1] “” “”
[3] “” “”
[5] “FACULTY 2004-2005 ” “

*On leave 2004-2005

> tail(x)
[1] “Steven J. Zottoli Howard B. Schow ’50 and Nan W. Schow ”
[2] ” Professor of Biology ”
[3] “B.A. (1969) Bowdoin; Ph.D. (1976) University of Massachusetts ”
[4] “”
[5] “”
[6] “”
>

Looking at the file on the web, this seems correct. Note that I think that there is some subtle formatting weirdness around Peter Wells in the original file that caused my initial attempts, using read.table, to fail.

Let’s now determine which elements in the vector have a * in them.

> length(x)
[1] 1168
> on.leave <- grep("\\*", x)
> length(on.leave)
[1] 80
> head(on.leave)
[1] 6 7 8 9 10 16
>

Note that R requires that you double up all your backslashes. (I am assuming that only the geeks are reading now, so that I don’t need to explain the usage of escape characters in regular expressions.) So, there are 80 rows with a * in them. Not all of these rows are for faculty members on leave, but most of them are.
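For the non-geeks who stuck around anyway, a small sketch of the escaping on a made-up vector (the names are placeholders, not anyone on the faculty). It also shows fixed = TRUE, an alternative that skips regular expressions altogether.

```r
demo <- c("* Jane Q. Example", "John Q. Sample", "** Pat Placeholder")

# "\\*" in R source is the two-character regex \* : a literal asterisk.
star_regex <- grep("\\*", demo)

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed.
star_fixed <- grep("*", demo, fixed = TRUE)

star_regex
star_fixed
```

Both calls pick out the first and third elements.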

> head(x[on.leave])
[1] “

*On leave 2004-2005

[2] ” * *On leave first semester

[3] ” * * *On leave second semester

[4] ” * * * *On leave calendar year (January-December 2005) ”
[5] “* Daniel P. Aalberts Associate Professor of Physics ”
[6] “* Colin C. Adams Francis Christopher Oakley Third Century ”
> tail(x[on.leave])
[1] “* Carmen Whalen Assistant Professor of History ”
[2] “* Dwight L. Whitaker Assistant Professor of Physics ”
[3] “*** Alan E. White Mark Hopkins Professor of Philosophy ”
[4] “* Heather Williams Professor of Biology ”
[5] “** Reinhard A. Wobus Edna McConnell Clark Professor of ”
[6] “*** Betty Zimmerberg Professor of Psychology ”
>

Let’s simplify things by getting rid of the first 4 junk rows.

> candidates <- x[on.leave[5:80]]
> length(candidates)
[1] 76
> head(candidates)
[1] “* Daniel P. Aalberts Associate Professor of Physics ”
[2] “* Colin C. Adams Francis Christopher Oakley Third Century ”
[3] “* Laylah Ali Assistant Professor of Art ”
[4] “** Donald deB Beaver Professor of History of Science ”
[5] “** Olga R. Beaver Professor of Mathematics ”
[6] “* Ilona D. Bell Professor of English and”
>
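Hard-coding 5:80 works for this one file, but it breaks if the legend ever grows or shrinks. A hedged sketch of a sturdier variant, on made-up data: drop the legend rows by what they say rather than by where they sit. It assumes every legend row contains the text "On leave" and no faculty row does.

```r
# Made-up rows: two legend lines and two placeholder faculty lines.
demo <- c("*On leave 2004-2005",
          "* *On leave first semester",
          "* Jane Q. Example   Professor of Examples",
          "** Pat Placeholder   Professor of Samples")

# value = TRUE returns the matching strings instead of their positions.
starred <- grep("*", demo, fixed = TRUE, value = TRUE)

# Keep only the starred rows that are not part of the legend.
candidates_demo <- starred[!grepl("On leave", starred, fixed = TRUE)]
candidates_demo
```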

So, we now have 76 faculty members on leave, including Ali (who we are almost certain is the faculty member in question). Note, however, that there are professors here from Political Science and History, two departments that have already been cleared.

> grep("History|Political Science", candidates)
[1] 4 25 28 48 51 52 57 69 71

We can also delete Math and Physics, two of the many departments without any female African American faculty. Also, I think that we know from Scott Wong that the professor is on leave now (without being clear whether the leave is for the first semester or all year), so we could get rid of the faculty who are on leave for just the second semester. At Williams, three *’s mean second semester leave only. Putting this together leaves us with just 50 names:

> candidates[- grep("History|Political Science|Mathematics|Physics|\\*\\*\\*", candidates)]
[1] “* Colin C. Adams Francis Christopher Oakley Third Century ”
[2] “* Laylah Ali Assistant Professor of Art ”
[3] “* Ilona D. Bell Professor of English and”
[4] “* Elizabeth Brainerd Associate Professor of Economics ”
[5] “* Kim B. Bruce Frederick Latimer Wells Professor of Computer ”
[6] “* Denise Kimber Buell Associate Professor of Religion ”
[7] “* Julie A. Cassiday Associate Professor of Russian ”
[8] “** Ronadh Cox Associate Professor of Geosciences ”
[9] “* Georges B. Dreyfus Professor of Religion and”
[10] “** Helga Druxes Professor of German and”
[11] “** David B. Edwards Carl W. Vogt ’58 Professor of Anthropology ”
[12] “* David Eppel Professor of Theatre ”
[13] “* Edward A. Epping Alexander D. Falck Class of 1899 Professor ”
[14] “* Soledad Fox Assistant Professor of Romance Languages ”
[15] “** Chris R.A. Geiregat Assistant Professor of Economics ”
[16] “* Louise E. Glück Margaret Bundy Scott Senior Lecturer ”
[17] “** George R. Goethals II Dennis Meenan ’54 Third Century ”
[18] “* Douglas Gollin Associate Professor of Economics ”
[19] “** Charles W. Haxthausen Faison-Pierson-Stoddard Professor ”
[20] “* Marjorie W. Hirsch Assistant Professor of Music and”
[21] “** Meredith C. Hoppin Frank M. Gagliardi Professor of ”
[22] “* G. Robert Jackall Class of 1956 Professor of Sociology ”
[23] “** Sarah (Liza) Johnson Assistant Professor of Art ”
[24] “** Saul M. Kassin Massachusetts Professor of ”
[25] “* Robert D. Kavanaugh Hales Professor of Psychology and ”
[26] “* Faruk A. Khan Assistant Professor of Economics ”
[27] “** Thomas A. Kohut Sue and Edgar Wachenheim III Professor ”
[28] “* Matthew A. Kraus Associate Professor of Classics ”
[29] “* Karen B. Kwitter Ebenezer Fitch Professor of Astronomy ”
[30] “* Michael J. Lewis Professor of Art ”
[31] “** Charles M. Lovett, Jr. Philip and Dorothy Schein Professor ”
[32] “** Daniel V. Lynch Professor of Biology ”
[33] “* Anandi Mani Assistant Professor of Economics ”
[34] “* Manuel A. Morales Assistant Professor in Biology ”
[35] “** Peter T. Murphy Associate Professor of English and ”
[36] “* Peter L. Pedroni Associate Professor of Economics ”
[37] “* Christopher L. Pye Class of 1924 Professor of English ”
[38] “* Leyla Rouhi Professor of Spanish ”
[39] “** Robert M. Savage Associate Professor of Biology ”
[40] “** Kenneth K. Savitsky Associate Professor of Psychology ”
[41] “* Stephen C. Sheppard James Phinney Baxter, 3rd, Professor ”
[42] “* Lara Shore-Sheppard Associate Professor of Economics ”
[43] “** Ari Solomon Assistant Professor of Psychology ”
[44] “** Paul R. Solomon Professor of Psychology and Fellow ”
[45] “* Stefanie Solum Assistant Professor of Art ”
[46] “** Arnard V. Swamy Associate Professor of Economics ”
[47] “* Janneke van de Stadt Assistant Professor of Russian ”
[48] “* Christopher M. Waters Hans W. Gatzke ’38 Professor of ”
[49] “* Heather Williams Professor of Biology ”
[50] “** Reinhard A. Wobus Edna McConnell Clark Professor of ”
>
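Two side notes on that filter, sketched on made-up rows (the names and the "Astrology" pattern are placeholders). First, a pattern of three escaped asterisks matches any run of three or more, so four-star rows are dropped as well. Second, negative indexing with grep() selects nothing at all when the pattern has no matches, while !grepl() behaves sensibly in that case.

```r
demo <- c("* Jane Q. Example   Professor of Examples",
          "*** Pat Placeholder  Professor of Samples",
          "**** Sam Specimen    Professor of Samples",
          "** Lee Dummy         Professor of Physics")

drop <- "Physics|\\*\\*\\*"

# grepl() gives TRUE/FALSE per element, so ! keeps the non-matches.
kept <- demo[!grepl(drop, demo)]
kept

# With a pattern that matches nothing, -grep() selects zero elements,
# whereas !grepl() keeps everything.
length(demo[-grep("Astrology", demo)])   # 0
length(demo[!grepl("Astrology", demo)])  # 4
```

Only the one-star row survives here: the three-star and four-star rows both match the asterisk pattern, and the last row matches "Physics".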

Of course, this listing is flawed in many ways. For starters, it fails to include Professor Kaye Fealing, who is on leave (both in reality and according to the Economics Department) but not according to the registrar. My scanning procedure also cut off the department names for some professors (like Adams in Mathematics), so there are some names here that do not belong. . . .

Blah, blah, blah. I see no female African American faculty members on this list, other than Ali. I think that we can be virtually certain that the target of the racial slur was Laylah Ali and the department we are looking for is Art.

But we told you that 2 days ago . . .

Here’s a challenge for anyone who got this far. Repeat this exercise in Perl, but use the College’s online PDF course catalogue for your input. This data seems to be better (at least it has Fealing correct).
