Rick Jelliffe wrote:
> From: "Jonathan Borden" <email@example.com>
>> Actually this _is_ the original point, isn't it?
> No, the original point is the use-case where there is no dispute about
> what characters are used. The user can toss coins to decide, we
> have no more interest in their particular ranges than we care about
> their particular content models.
By that interpretation, we don't allow alternate spellings of the words
"facade" and "cooperate". End of story. John was making the point that such
definitions are not a reliable way to determine what language a piece of
text is written in. I am agreeing.
> All the rest of the discussion has been a red herring, and I have seen
> that fish before. Basically it comes down to a denial of the use case.
I changed the topic on purpose. If the use case is to restrict a character
string to a specific set of characters, terrific. It's not particularly
interesting and doesn't deserve much more comment.
The issue of detection of human language, on the other hand, is one that
interests me. So my 'original point' was not pointed at the same location in
the thread as your 'original point' ... that's the way these email threads
go, eh? No surprise.