Saturday, January 5, 2013

SHRDLU is wrong!

It turns out it's not ETAOIN SHRDLU after all!

Peter Norvig reveals the New Truth:

there is a standard order of frequency used by typesetters, ETAOIN SHRDLU, that is slightly violated here: L, R, and C have all moved up one rank, giving us the less mnemonic ETAOIN SRHLDCU.
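For anyone who wants to try this on their own text, here is a minimal sketch of how such a ranking can be computed. It is not Norvig's code; the function name `letter_order` and the sample sentence are made up for illustration, and it only counts the plain letters A to Z, ignoring everything else.

```python
from collections import Counter


def letter_order(text: str) -> str:
    """Return the letters A-Z that occur in `text`, most frequent first."""
    # Count only ASCII letters, folding lower case into upper case.
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    # Sort by descending count; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "".join(letter for letter, _ in ranked)


# Hypothetical sample; a real comparison needs a far larger corpus.
sample = "the quick brown fox jumps over the lazy dog and then the other dog"
print(letter_order(sample))
```

On a tiny sample like this the order will look nothing like ETAOIN SHRDLU; the ranking only settles down once the text is very large, which is presumably why the choice of corpus matters so much.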

Given all the powerful tools that have come along in the last fifty years, it's actually amazing what a good job Mayzner did with the tools of his time.

What's a bit hard to tell is how the Google language corpus differs from Mayzner's "variety of sources, e.g., newspapers, magazines, books, etc."

Linguists of the world, enjoy!

1 comment:

  1. This article appeared in my rss feed at roughly the same time as your link, and seems like a funny complement: http://www.datagenetics.com/blog/april12012/index.html
