OASIS Mailing List Archives


   Re: [xml-dev] Re: Are the data users happy? Why not?


John, the question is not "can I validate some string against *some
criterion* because I have a tool that will?". The question is "can I
validate some string with respect to the use I intend to make of it?".

John Cowan <jcowan@reutershealth.com> writes:

> I don't care where my customers are, I just care that they can enter their
> postal addresses without being blocked by validation.  Thus I require a
> postal code if the country code is US or CA, because I know how to validate
> those postal codes; otherwise, the postal code is optional.  Ditto for the
> state/province field.

But how do you know it's *their* post code? Something tells me you are
interested in the singleton set of *the user's* post code, not the
larger set of *valid* post codes. Is a valid, but irrelevant, post
code better for you than no post code at all or a bogus post code?
20502 is a valid US ZIP code. Do you want to send mail there that is
addressed to me?

<snip topic="relative merits of unconstrained strings vs. enumerated types"/>

> > wonder how many sites have users in Afghanistan?
> Probably not.  But why should I refuse to take their money?

Whose money you take is up to you. But after you take their money, you
want to know where they are, don't you (although I'm not sure why)?

Presumably, you give your users a drop-down list and not a text field
to reduce data-entry errors. As the list of enumerations grows longer,
it becomes a greater potential source of errors than an unconstrained
string because users can't be bothered to read through its entire
length. The difference is that you assumed that, since it is a member
of an enumeration (of countries or zips or streets), it is valid *for
your purposes*. The real validation of an address is sending mail to
it. When your letter returns with no-such-person stamped in Dari, are
you better off than with a no-such-country stamp in US English?

Validation, whether regexp- or set-based, only catches so many errors,
while giving you the illusion that your data are somehow
correct. There are what -- about 32,000 US zips[1]? There are 100,000
5-digit integers. There is 1 zip where your customer lives.
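
To make the counting argument concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: KNOWN_ZIPS is a three-entry stand-in for the real list of assigned ZIP codes, and the function names are hypothetical. It only shows that a regexp check and a set check can both accept a code that is not the customer's.

```python
import re

# Hypothetical stand-in for the full list of assigned US ZIP codes;
# a real list would hold tens of thousands of entries.
KNOWN_ZIPS = {"20502", "10001", "94105"}

# Regexp-based check: any string of exactly five ASCII digits passes.
ZIP_PATTERN = re.compile(r"[0-9]{5}")


def regexp_valid(code: str) -> bool:
    """True if the string merely looks like a ZIP code."""
    return ZIP_PATTERN.fullmatch(code) is not None


def set_valid(code: str) -> bool:
    """True if the string is a member of the known-ZIP enumeration."""
    return code in KNOWN_ZIPS


def is_users_zip(entered: str, actual: str) -> bool:
    """The check that matters, which neither validator can perform."""
    return entered == actual


# Every 5-digit string, from 00000 to 99999.
all_codes = [f"{n:05d}" for n in range(100_000)]
regexp_accepted = sum(regexp_valid(c) for c in all_codes)
set_accepted = sum(set_valid(c) for c in all_codes)

# A customer who lives in 94105 but types 20502 passes both validators.
entered, actual = "20502", "94105"
passes_both = regexp_valid(entered) and set_valid(entered)
correct = is_users_zip(entered, actual)
```

The regexp accepts all 100,000 candidates and the set accepts only its members, but the last comparison needs the one piece of information the form does not have: where the customer actually lives.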


[1] http://www.census.gov/geo/www/gazetteer/places2k.html



Copyright 2001 XML.org. This site is hosted by OASIS