Sunday, November 7, 2010

Viewing from Afar: Analyzing Writing

The written word is a two-way street. As the author is telling us something, we learn a bit more about the person. When students write, a good teacher can find places for the student to improve their language skills. But what can a computer see?

In 2003, a simple script was released that counts how often common words appear in a piece of writing. With 80% accuracy, it then guesses whether the author is male or female. You can try it yourself. Alex Chancellor of The Guardian has an excellent article wondering what this means for equality in the written word. Paired with complexity analysis, a grammar check, slang maps, and other information, computers can probably estimate a writer's native language, their familiarity with parts of the language, and the dialects used by their English teachers.
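To make the word-counting idea concrete, here is a minimal sketch in Python. It is not the 2003 script: the word lists and weights below are invented placeholders, and the two categories are simply called "A" and "B". A real classifier would learn its weights from a large collection of labeled texts.

```python
import re
from collections import Counter

# Hypothetical weights, invented for illustration only.
# A real tool would derive these from a labeled corpus.
WEIGHTS_A = {"the": 2, "a": 1, "of": 1}
WEIGHTS_B = {"with": 2, "she": 3, "and": 1}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score(text, weights):
    """Sum weight * occurrences for every word in the weight table."""
    counts = Counter(tokenize(text))
    return sum(weight * counts[word] for word, weight in weights.items())


def guess_category(text):
    """Return the category with the higher score, or 'undecided' on a tie."""
    a = score(text, WEIGHTS_A)
    b = score(text, WEIGHTS_B)
    if a == b:
        return "undecided"
    return "A" if a > b else "B"
```

The whole method is just counting and adding, which is why it is easy to run on any text and also why its guesses should be treated as rough estimates.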

This brings me to the point I wanted to make about analysis. A computer can tell if students in Class 3A are using unusually few prepositional phrases, and with a high error rate. It can tell if students used a variety of new adjectives after reading "The Phantom Tollbooth". Were those adjectives in the book, in the vocabulary quiz given by teachers, or did the story influence students' writing style? When I phrase it this way, it sounds like an awesome idea. If I said, "a consultant in Chicago monitors, directs, and rates each teacher in Montevideo," then it becomes a bad idea. This isn't a privacy issue so much as a question of deciding the role of the teacher. We want to help teachers.
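As a rough sketch of the Class 3A example, the Python below counts prepositions per 100 words and averages that rate across a class's essays. The preposition list is a small hand-picked sample, and counting prepositions is only a stand-in for counting prepositional phrases (for instance, "to" is also counted when it marks an infinitive). Finding errors would need a real grammar checker, which this sketch does not attempt.

```python
import re

# A small, hand-picked sample of English prepositions (an assumption;
# a fuller list or a part-of-speech tagger would be more accurate).
PREPOSITIONS = {
    "in", "on", "at", "by", "with", "under", "over",
    "through", "between", "from", "to", "of",
}


def preposition_rate(text):
    """Return prepositions per 100 words, or 0.0 for empty text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in PREPOSITIONS)
    return 100.0 * hits / len(words)


def class_average(essays):
    """Average the preposition rate over a list of essay strings."""
    rates = [preposition_rate(essay) for essay in essays]
    return sum(rates) / len(rates) if rates else 0.0
```

Comparing `class_average` for one class against the same figure for other classes is what would flag an "unusually low" rate. The number only means something relative to a baseline.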

We could also use SocialHistory.js, an ingenious script which takes "Share on Facebook" and "Tweet this" buttons and hides them from people who don't use those websites. It didn't take long before a male-or-female test appeared based on 10,000 possible websites, and you can try it here (works in Firefox, IE, and Browse). With some editing, this could be used by educators, too. Suppose I assign a paper, then see how many students read the topic's page on English Wikipedia, versus Spanish or Simple English Wikipedia. We could find out whether students Google the books they read in class, how often teachers visit the Plan Ceibal website, and how many pages students viewed in the Chemistry book in Browse. There are privacy issues, but responsible researchers should use it to get better diagnostics of what schools are doing with their technology.

Suppose we had an activity for students to write and share short mystery stories. We could then see which resources help students make fewer errors and use more adjectives. It sounds like a research goldmine to me.
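One way to ask where students' new adjectives came from is with plain set operations. The Python sketch below assumes the adjectives have already been picked out of each text (for example by a part-of-speech tagger, not shown here). The function name and its four inputs are hypothetical, chosen for this illustration.

```python
def classify_new_words(before, after, book_words, quiz_words):
    """Group words that appear only in the later writing by likely source.

    All four arguments are iterables of lowercase words. Returns a dict
    of sets keyed by 'book_only', 'quiz_only', 'both', and 'neither'.
    """
    book = set(book_words)
    quiz = set(quiz_words)
    # Words the student used afterwards but not before.
    new_words = set(after) - set(before)
    return {
        "book_only": new_words & (book - quiz),
        "quiz_only": new_words & (quiz - book),
        "both": new_words & book & quiz,
        "neither": new_words - book - quiz,
    }


# Example with made-up word lists.
result = classify_new_words(
    before=["big", "nice"],
    after=["big", "doleful", "splendid", "eerie"],
    book_words=["doleful", "splendid"],
    quiz_words=["splendid", "vast"],
)
```

In this made-up example, "doleful" lands in `book_only`, "splendid" in `both`, and "eerie" in `neither`. Words in the last group are the interesting ones, since they hint that the story changed how a student writes, not only which words were taught.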