Re: [xml-dev] RE: Encoding charset of HTTP Basic Authentication
- From: Peter Flynn <peter@silmaril.ie>
- To: xml-dev@lists.xml.org
- Date: Fri, 03 Feb 2012 22:20:30 +0000
On 03/02/12 12:53, Tei wrote:
> He!... mediocrity is not always bad.
>
> Say... databases. Almost all web applications are built using
> relational databases. Not all applications need a database. But web
> applications are shoehorned into using relational databases anyway.
Tell me about it. I worked with some Oracle guys. They're really good,
and seriously know their stuff, but ask them to add 2+2 and they'll
define a table with two variables, write a screen to capture two values
and stuff them into the table, then write a stored procedure to retrieve
the values, add them, and format a report to print the result.
They are perfectly aware of other ways of doing this, and it's not as if
they don't have access to tools other than hammers, but I have seen one
of them populate an HTML select element of five options by pulling the
data from a specially-written table, for data which won't change in my
lifetime (Mr/Ms/Mrs/Miss/Dr).
So how do we extend this to the philosophical choices faced with XML,
HTTP, or anything else? First off, it's scale. Hard-coding a web-form
visitor's choice of title isn't a big deal, and if you need to add Rev,
or if you get assimilated by the BBC and need to add Lord and Lady,
that's not a big deal either; nor is it a big deal doing it in the
database, modulo some small penalty in retrieval and dependency.
Convincing programmers to soft-code something large and mutable isn't
hard either, like the set of ISO 3166 country codes, or the ISO 639 language
codes (or whatever the current suite is, I haven't looked at them for
years). They can see the need for it, so the second clue is perception.
We got 2-letter country TLDs when we could have used the 3-letter set
and stayed the same length as .gov, .mil, .edu, .org, and .com. Doesn't
matter a whole lot unless we end up with 26×26 nation states on the
planet. But 2-letter language codes don't scale: there are way more than
676 languages on this earth.
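To make the arithmetic concrete, here is a minimal Python sketch that counts how many distinct codes a 2-letter and a 3-letter scheme can hold over the 26 unaccented letters. The variable names are purely illustrative.

```python
import string

# The 26 unaccented lowercase letters used for country and language codes.
letters = string.ascii_lowercase

# Every position can hold any letter, so capacity is 26 ** length.
two_letter_capacity = len(letters) ** 2
three_letter_capacity = len(letters) ** 3

print("2-letter codes available:", two_letter_capacity)
print("3-letter codes available:", three_letter_capacity)
```

A 2-letter scheme tops out at 676 codes, while a 3-letter one has room for 17,576, which is the headroom the 3-letter parts of ISO 639 rely on.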
So eventually you get down to the third factor: judgement, which is
based on knowledge and a shedload of other things. Programmer X speaks
only Klingon, and doesn't really know much about other languages except
for the existence of a few of them, high and far off; and certainly
doesn't really care that anyone speaking Sindarin would ever have need
to access the Internet, let alone in their own language. That's a harsh
view, and mercifully rarer nowadays. Soften it a little and consider IBM
(I believe: Len? Michael?) who were building a precursor to what would
eventually become the foundations of Latin-1. Right down somewhere near
the bottom right-hand corner came the ÿ (yuml) character, which is used
in French, and mostly in the names of some towns, but so rarely that
even some French people are unaware of it, as I discovered when I asked
some French LaTeX typesetters. The ŵ character (wcirc), which is used
daily by 3 million Welsh speakers, didn't appear to get a look in until
Latin-8 (ISO 8859-14).
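Anyone who wants to check which 8-bit charsets can carry these two characters can do so with Python's standard codecs. This is a minimal sketch: the helper name `fits` is made up for illustration, and the codec names are the ones Python ships for ISO 8859-1, -2 and -14.

```python
def fits(ch, charset):
    """Return True if the character can be encoded in the given charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

Y_UML = "\u00ff"   # ÿ, LATIN SMALL LETTER Y WITH DIAERESIS
W_CIRC = "\u0175"  # ŵ, LATIN SMALL LETTER W WITH CIRCUMFLEX

for name, ch in (("yuml", Y_UML), ("wcirc", W_CIRC)):
    for charset in ("latin-1", "iso8859_2", "iso8859_14", "utf-8"):
        print(f"{name} in {charset}: {fits(ch, charset)}")
```

Under these codecs ÿ fits in Latin-1 but ŵ does not. Of the 8-bit sets tried here, only ISO 8859-14 (Latin-8) accepts ŵ, and UTF-8 takes both.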
So what's close to and up front tends to have more effect on what we
decide -- when nothing else intervenes -- than what is high and far off.
That's why it's good to have people who are able to take the longer
view, with deeper knowledge, who can warn when things are likely to
break if they don't take certain factors into account. They won't always
be heeded, as Michael says, and we'll always be able to look back with
regret on things we should have done differently, but I think it *is*
improving a little: I see projects now which *do* take things into
account that 20 years ago would have been (were) laughed out of the water.
So how much attention should we pay to charsets? It turns out, *lots*.
But like my Oracle programmers, we have a bunch of knowledge on one
side, and a bunch of tools on the other, and they all work, many even in
the whole of Unicode, but we can't envisage the scale of demand for
building that capability into every aspect of what we write, so we stick
with what we know works. It's wrong, it will break, perhaps
disastrously, and it will upset and affect lots of people, but right
there and then it would have taken an order of magnitude longer to do it
right.
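Since the thread is about HTTP Basic Authentication, here is a hedged Python sketch of how "what we know works" can break. It assumes the usual Basic scheme of base64 over "user:password". The function name `basic_token`, the user name and the passwords are invented for illustration, and this is not a statement of what any particular client or server does.

```python
import base64

def basic_token(user, password, charset):
    """Build the base64 credential for an Authorization: Basic header."""
    raw = f"{user}:{password}".encode(charset)
    return base64.b64encode(raw).decode("ascii")

# A password containing ÿ (U+00FF), as in the town name L'Haÿ-les-Roses.
pw_french = "L'Ha\u00ff-les-Roses"
latin1_token = basic_token("peter", pw_french, "latin-1")
utf8_token = basic_token("peter", pw_french, "utf-8")
print("same token in both charsets:", latin1_token == utf8_token)

# A password containing ŵ (U+0175), as in the Welsh word for water.
pw_welsh = "d\u0175r"
try:
    basic_token("peter", pw_welsh, "latin-1")
    welsh_fits_latin1 = True
except UnicodeEncodeError:
    welsh_fits_latin1 = False
print("Welsh password encodable in latin-1:", welsh_fits_latin1)
```

The same password gives two different tokens depending on the charset each side assumes, and a Latin-1-only client cannot send the Welsh one at all.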
Perhaps we can learn from history for once :-)
///Peter