Wednesday 22 November 2006

Computer says Male

This is odd. Researchers reckon they can tell an author's sex from their prose (link to PDF). Any sort of prose—fiction, non-fiction, greetings cards. (No, I made up the last one.) They do it by examining the number and type of pronouns and noun modifiers. Apparently, women use more of the former and men more of the latter.
Pronouns send the message that the identity of the "thing" involved is known to the reader, while specifiers provide information about "things" that the writer assumes the reader does not know.

So it's the difference between discussion and instruction.

They examined the relative frequencies of more than 1000 selected features in fiction and non-fiction pieces by male and female authors. The features were specific words, pairs of words and three-word phrases (for example, a preposition – article – noun). Some fancy algorithm then narrowed these down to fewer than 50 useful key features.

Male writing tends to be top-heavy with articles and quantities; female writing with pronouns.
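
For the curious, here is a rough sketch of what "relative frequency" means in practice. It is emphatically not the researchers' algorithm: the word lists and the function name `relative_frequencies` are my own assumptions, picked for illustration, and the real feature set is far larger. It just counts what share of a passage's words are pronouns and what share are articles.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; not the study's feature set.
PRONOUNS = {
    "i", "me", "my", "you", "your", "he", "him", "his", "she", "her",
    "it", "its", "we", "us", "our", "they", "them", "their",
}
ARTICLES = {"a", "an", "the"}


def relative_frequencies(text):
    """Return the share of words in `text` that are pronouns and articles."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    # Contractions such as "I'm" stay whole and so are not counted as pronouns.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {"pronouns": 0.0, "articles": 0.0}
    counts = Counter(words)
    pronoun_count = sum(counts[w] for w in PRONOUNS)
    article_count = sum(counts[w] for w in ARTICLES)
    return {
        "pronouns": pronoun_count / total,
        "articles": article_count / total,
    }


if __name__ == "__main__":
    print(relative_frequencies("She gave it to the dog"))
```

A real classifier would feed many such frequencies into a trained model rather than eyeballing two numbers, but the counting step looks something like this.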

Well, that's not quite the whole story. The pronoun thing is complicated. Obviously everyone uses them. Male authors employ a lot of plural (we, us, them) and male pronouns. Women employ second person (you) and female pronouns. Well, what a surprise. Men use 'he' and women use 'she'. Maybe—and I'm going out on a bit of a limb here—it's something to do with subject matter rather than the writing style.

You can test their claims at this site, which uses the same algorithm to determine sex. Give it a go. Eighty per cent accuracy? When I plugged in one of my blog posts the site confidently misidentified me as male. It came up with the same conclusion for a passage of non-fiction. And even with a big slab of my novel, which features a female protagonist, it still insisted that I'm male. It's my own fault. I shouldn't have used the definite article so many times. Not to mention my profligacy with 'to' and 'as'. I'm going to have to include more gerberas.