Wednesday, December 2nd, 2009

This relates to something I’ll blog about at greater length in the next few days, but I thought I’d ask the question first in case anyone reading knows the answer.

I’m going to try to approximate an English-language sentence in the simplest way possible. The method is to pick a number, N, and build each word as a random string of at most N letters.

  • If N = 2, a sentence might look like this: d fo mh j e l tx df d
  • If N = 5, a sentence might look like this: gh e kj jegns tyu dfa o wdu tah ttauo kk
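
For anyone who wants to play along, here is a minimal sketch of that set-up in Python. It rests on two assumptions that the description above doesn't pin down: each word's length is drawn uniformly from 1 to N, and each letter is drawn uniformly from a to z. The names random_word and random_sentence are made up for illustration.

```python
import random
import string


def random_word(n, rng=random):
    """Return a random lowercase string of between 1 and n letters."""
    # Assumption: every length from 1 to n is equally likely.
    length = rng.randint(1, n)
    # Assumption: every letter a-z is equally likely.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_sentence(n, word_count=10, rng=random):
    """Join word_count random words of at most n letters with spaces."""
    return " ".join(random_word(n, rng) for _ in range(word_count))


if __name__ == "__main__":
    print(random_sentence(2))
    print(random_sentence(5))
```

Passing a seeded random.Random instance as rng makes the output repeatable. Note that drawing uniformly from all strings of at most N letters, rather than drawing the length first, would give a very different picture, since nearly all such strings have exactly N letters.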

So here’s my question:

If I want to approximate the distribution of word lengths in the English language, which value of N should I choose?

I know it won’t be a very close approximation, but it’s very quick and easy to generate the words using this set-up.
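
One way to attack the question empirically, sketched in Python under some loud assumptions: random word lengths are uniform on 1 to N (so each length has probability 1/N), "closest" means smallest total variation distance, and the English side comes from whatever sample text you feed in. The function names are hypothetical, and the tokeniser is deliberately crude.

```python
import re
from collections import Counter


def word_length_distribution(text):
    """Map each word length found in text to its relative frequency."""
    # Crude tokeniser: runs of ASCII letters, so "don't" counts as "don" and "t".
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {length: count / total for length, count in counts.items()}


def total_variation(empirical, n):
    """Total variation distance between empirical and uniform on 1..n."""
    lengths = set(empirical) | set(range(1, n + 1))
    return 0.5 * sum(
        abs(empirical.get(k, 0.0) - (1.0 / n if k <= n else 0.0))
        for k in lengths
    )


def best_n(text, max_n=20):
    """Return the N in 1..max_n whose uniform model is closest to the text."""
    empirical = word_length_distribution(text)
    return min(range(1, max_n + 1), key=lambda n: total_variation(empirical, n))
```

Calling best_n on a decent chunk of English prose gives a candidate N. Under the same uniform assumption the mean random word length is (N + 1) / 2, so matching just the average length of the sample is an even quicker alternative.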


Monday, April 27th, 2009

A freelancing website I’ve just signed up to lets you list yourself as a native English speaker, a speaker of English as a second language, or both.
Which makes me think of Geordies.