Pascarella, Randy wrote:
> I confess I'm new to XML, so please forgive my newbie question
> in advance. Take the following XML snippets for reference. The first
> snippet describes some streets in a city, and let's say this file was
> generated by some guy, Tom.
>
> Tom's city description:
> <city>
> <name>Chicago</name>
> ...
> </city>
>
> Now Bob describes the same city with the same schema, but the content
> differs (the street names are followed by "Street"):
> <city>
> <name>Chicago</name>
> ...
> </city>
> The point is that the two files describe the same thing from the same
> schema in different ways via the values of the elements. So if I, being
> an intercessor, want to link the two together in a common way, how would
> I do this?
In general, this is a really difficult problem. In fact, there is no
general solution. You can only hope that in your chosen domain there is
a "reasonable" solution that works often enough.
Take your illustration, for example. I have actually had to do
something quite similar, where one data set used FIPS codes and another
(ours) used city-county-state names. The usual way to deal with
it is to create a lookup table, whereby the one set can be converted to
the other or to a canonical form.
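
As a rough illustration of the lookup-table idea applied to street names, here is a minimal Python sketch. The suffix table and the function name canonical_street are my own assumptions for the example, not anything from Tom's or Bob's files; a real table would have to be built from the data you actually see.

```python
import re

# Hypothetical lookup table: every spelling of a suffix that either
# author might use, mapped to one canonical abbreviation.
SUFFIXES = {
    "street": "St", "st": "St",
    "place": "Pl", "pl": "Pl",
    "avenue": "Ave", "ave": "Ave",
}

def canonical_street(name):
    """Convert a street name to a canonical form via the lookup table."""
    # Drop periods and commas, then split on whitespace.
    words = re.sub(r"[.,]", "", name).split()
    if not words:
        return ""
    last = words[-1].lower()
    if last in SUFFIXES:
        words[-1] = SUFFIXES[last]
    return " ".join(words)

# Two spellings of the same street compare equal after conversion.
print(canonical_street("54th Street") == canonical_street("54th St."))
```

Converting both files to the canonical form before comparing them leaves the choice of readable label open: the table decides it, not Tom or Bob.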
In your example, of course you can use the zip code - as long as you
know that both Tom and Bob are going to be using it. Now, what readable
labels do you want to use? Tom's, Bob's, or another one? That's one
of the tricky parts.
Of course, what I just wrote assumes that within any one zip code there
is not more than one street with the same abbreviated name. To stick
with Chicago, on the South Side there are, for example, both a 54th St.
and a 54th Pl. (I used to live on it), about half a block apart. If Bob
carelessly omitted the St. or Pl. suffix, you could still confuse the two.
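
One way to find out in advance where an omitted suffix would bite is to scan the data for streets that share a zip code and a base name but differ in suffix. The sketch below assumes the records are already available as (zip code, street name) pairs; the suffix table, the function names, and the sample zip code are made up for illustration.

```python
from collections import defaultdict

# Hypothetical suffix table, limited to the two suffixes in the example.
SUFFIXES = {"st": "St", "street": "St", "pl": "Pl", "place": "Pl"}

def split_suffix(street):
    """Split 'NAME SUFFIX' into (NAME, canonical suffix or '')."""
    words = street.replace(".", "").split()
    if len(words) > 1 and words[-1].lower() in SUFFIXES:
        return " ".join(words[:-1]), SUFFIXES[words[-1].lower()]
    return " ".join(words), ""

def ambiguous_streets(records):
    """Return the (zip, base name) keys that carry more than one suffix."""
    seen = defaultdict(set)
    for zip_code, street in records:
        base, suffix = split_suffix(street)
        seen[(zip_code, base.lower())].add(suffix)
    return {key: sorted(s) for key, s in seen.items() if len(s) > 1}

# Illustrative data only; the zip code is a placeholder.
records = [
    ("60615", "54th St."),
    ("60615", "54th Street"),
    ("60615", "54th Pl."),
    ("60615", "Woodlawn Ave"),
]
print(ambiguous_streets(records))
```

Any key this reports is one where the zip code plus the bare name is not enough to identify the street, so the suffix cannot safely be dropped.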
So you have to learn whether Bob and Tom are consistent and what rules
they use, or get them to adopt a convention you all agree on (or that
you force down their throats). Otherwise you cannot be sure that you
will always get it right.
Of course, this is not specific to XML.
Cheers,
Tom P