Wednesday, 22 August 2007


Ah, I misunderstood what was meant by middle here, thank you for the clarification. However, it would also be interesting to know if there is any significant difference in the distribution of words, as a function of initial letter, between different eras. Do you know of any work in this area? DP

But, Dark Puss, that's not the point - the middle could be in h or in r, but we know that in the best modern dictionaries, it actually is about late l or early m. Therefore, the fact that early dictionaries mid out much earlier is of interest.

All without worrying, of course, whether the middle is measured in terms of pages, headwords, subwords etc.!!

Hmm, why would I expect words to be evenly balanced about the "middle"? I'd like to do a bootstrap analysis (a non-parametric method that repeatedly resamples the observed data with replacement) on the underlying data to see what the median and the standard error on the median are without any underlying assumptions of symmetry or normality. Anyone know where I could get the appropriate data sets?

Dark Puss (as if I didn't have enough REAL work to do!)
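
For anyone curious, here is a rough sketch of what such a bootstrap might look like in Python. It assumes each headword is reduced to the alphabet position of its initial letter (a = 1 ... z = 26). The function name bootstrap_median and the letter counts are invented purely for illustration; they are not real dictionary data, so the numbers it produces mean nothing until actual counts are substituted.

```python
import random
import statistics
import string


def bootstrap_median(sample, n_resamples=2000, seed=0):
    """Return (sample median, bootstrap standard error of the median).

    Hypothetical helper: resamples the data with replacement and takes the
    standard deviation of the resampled medians as the standard error.
    No symmetry or normality is assumed.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    n = len(sample)
    medians = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        medians.append(statistics.median(resample))
    return statistics.median(sample), statistics.stdev(medians)


# Made-up headword counts per initial letter (illustration only, NOT real data).
toy_counts = {"a": 60, "b": 55, "c": 90, "d": 55, "e": 40, "h": 35,
              "l": 30, "m": 50, "p": 80, "r": 45, "s": 110, "t": 55, "w": 25}

# Expand the counts into one alphabet position per headword.
positions = [
    string.ascii_lowercase.index(letter) + 1
    for letter, count in toy_counts.items()
    for _ in range(count)
]

median_pos, std_err = bootstrap_median(positions)
median_letter = string.ascii_lowercase[int(median_pos) - 1]
print(f"median position {median_pos} (letter {median_letter}), "
      f"bootstrap standard error {std_err:.2f}")
```

Running the same thing on counts from an early dictionary and from a modern one would give two medians with error estimates to compare, whether the counts are of pages, headwords or subwords.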

Sounds fascinating but I admire you for reading it all -- sounds "good but tough" as Huck Finn (I think) said.

Have you read "Caught in the Web of Words" by K. M. Elisabeth Murray? If not, do!

