RE: Regular expression for URI matching

From: Nicolas LEHUEN <nicolas.lehuen@ubicco.com>
To: "'Bisaga, Gary'" <Gary.Bisaga@Equidity.com>,"'xml-dev@lists.xml.org'" <xml-dev@lists.xml.org>
Date: Fri, 24 Aug 2001 16:43:49 +0200
I am not defending this way of developing in any case.

I just notice that during the period when you didn't have any validation
tools for HTML, developers were validating their HTML pages against Netscape
and IE. If it rendered correctly, the HTML code was OK. Problem was, both
browsers were quite forgiving in order to be able to display the widest
range of pages, even badly written ones, so that they would not be dropped in
favor of the other.

So people were in fact validating their work against forgiving programs. No
one could expect a good result from such a policy, but hey, there was no
correct validation tool available! It's all the sadder because a great
deal of validation could have been performed using the HTML DTDs that were
available since the very beginning (IIRC).

As mentioned on this list, some parts of XML compliance are so arcane (just
have a look at the URI regexp!) that you can't expect people to follow the
spec without a 100% compliant tool. The problem is that content had to be
produced before 100% compliant validators were (are?) available... That
means a lot of refactoring that is sometimes a bit difficult to accept,
especially when the validation errors are about URIs that are arbitrary
anyway.

When you're a product manager, you then begin to worry about the success of
your product if it is too picky about certain specification features...
Hence the possibility to switch them off, which means more and more invalid
documents will be produced.
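
To make the "switch them off" idea concrete, here is a rough Python sketch
built around the regexp quoted further down this thread. It is only an
illustration, not how XML Spy does it: the function name check_any_uri and
the strict flag are made up for this example, and I am assuming the doubled
backslashes in the posted pattern are string-literal escaping, so each pair
is read as a single regex backslash.

```python
import re

# Character class shared by the main part and the fragment part of the
# posted pattern. Assumption: "\\." and "\\-" in the posting stand for the
# regex escapes "\." and "\-".
URI_CHARS = r"[0-9a-zA-Z;/?:@&=+$\.\-_!~*'()%]"

# Optional scheme, up to two slashes, one or more allowed characters,
# then an optional "#fragment". Both halves are optional, as posted.
ANY_URI = re.compile(
    r"(([a-zA-Z][0-9a-zA-Z+\-\.]*:)?/{0,2}" + URI_CHARS + r"+)?"
    r"(#" + URI_CHARS + r"+)?"
)


def check_any_uri(value: str, strict: bool = True) -> bool:
    """Hypothetical helper: accept or reject an anyURI value.

    With strict=False the check is switched off and everything passes,
    which is the product-manager-friendly mode discussed above.
    """
    if not strict:
        return True
    return ANY_URI.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("http://www.w3.org/TR/xmlschema-2/#anyURI",
                      "http://example.com/a b"):
        print(candidate, check_any_uri(candidate),
              check_any_uri(candidate, strict=False))
```

In strict mode the value containing a space is rejected; with the switch off
it passes, so the invalid document goes out unnoticed.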

Regards,
Nicolas

>-----Original Message-----
>From: Bisaga, Gary [mailto:Gary.Bisaga@Equidity.com]
>Sent: Friday, August 24, 2001 15:45
>To: 'xml-dev@lists.xml.org'
>Subject: RE: Regular expression for URI matching
>
>
>I find it difficult to believe that all developers, or even most developers,
>feel this way. Or, should I say, I find it difficult to believe that most
>developers would feel this way if their technical leaders took some effort
>to help them understand.
>
>I'm reminded of the Netscape vs. IE question (and why I'm glad they both
>exist). When code goes out to the users, I want them to have the most
>forgiving environment; when code is being developed, I want the least
>forgiving. In our HTML development group, we've successfully convinced
>people to use Netscape for their primary development - if you mess up your
>tables enough, it doesn't display. But the Java developers all use IE; there
>you want to be able to see the HTML even if it's slightly messed up for now.
>
>I can see the same thing with this kind of tool - the developers of that
>particular type of content use it in strict mode while other developers use
>it in lenient mode.
>
><>< gary
>
>-----Original Message-----
>From: Nicolas LEHUEN [mailto:nicolas.lehuen@ubicco.com]
>Sent: Friday, August 24, 2001 3:36 AM
>To: 'Michael Brennan'; 'xml-dev@lists.xml.org'
>Subject: RE: Regular expression for URI matching
>
>
>The problem is when you have a developer that produces malformed content,
>then compares development tool A and development tool B to decide which one
>he'll buy. Development tool A, being more strict, rejects the developer's
>content. Development tool B does not. So the developer buys the B product.
>This is a phenomenon that keeps marketers and product managers awake at
>night.
>
>Maybe the solution is to educate the developer himself ?
>
>Regards,
>Nicolas
>
>>-----Original Message-----
>>From: Michael Brennan [mailto:Michael_Brennan@allegis.com]
>>Sent: Thursday, August 23, 2001 21:15
>>To: xml-dev@lists.xml.org
>>Subject: RE: Regular expression for URI matching
>>
>>
>>Thanks for passing this along (although that regular expression makes my
>>brain hurt ;)).
>>
>>It's too bad, though, that Altova is completely removing it. I understand
>>the reasoning. We've all heard the admonition: be strict in what you create,
>>be forgiving in what you accept. Unfortunately, the overwhelming majority of
>>developers follow the path of least resistance. Forgiving web browsers are
>>one reason there is so much buggy, malformed content on the web. The typical
>>web developer writes a web page, brings it up in the browser, and if it
>>displays, they are done. If web browsers were more strict, developers would
>>produce more conformant content.
>>
>>Maybe Altova should just add an optional feature that lets a user explicitly
>>disable the URI checking. That way, at least, they could accommodate their
>>customers without inadvertently leading naive developers down the path of
>>bad practice.
>>
>>> -----Original Message-----
>>> From: John Cowan [mailto:cowan@mercury.ccil.org]
>>> Sent: Wednesday, August 22, 2001 7:15 PM
>>> To: xml-dev@lists.xml.org
>>> Subject: Regular expression for URI matching
>>> 
>>> 
>>> Alexander Falk of Altova, the XML Spy people, posted the following to
>>> an internal W3C mailing list.  With his permission, I am reposting it
>>> here so that it will be archived.  Anyone may use it, but this
>>> information is provided "as-is" with no warranties whatsoever regarding
>>> the correctness of the information.
>>> 
>>> ----- Forwarded message from Alexander Falk -----
>>> 
>>> This is the Regular Expression (RE) we originally used for the anyURI
>>> datatype within our XML Spy product up until 4.0b2:
>>> 
>>> (([a-zA-Z][0-9a-zA-Z+\\-\\.]*:)?/{0,2}[0-9a-zA-Z;/?:@&=+$\\.\\-_!~*'()%]+)?(#[0-9a-zA-Z;/?:@&=+$\\.\\-_!~*'()%]+)?
>>> 
>>> It was constructed according to the BNF grammar given in RFC 2396
>>> (http://www.ietf.org/rfc/rfc2396.txt) and we used this RE to validate
>>> elements and attributes whose datatype was anyURI.
>>> 
>>> However, we've found that (a) many customers actually use illegal URIs in
>>> their documents happily, (b) XML Schema Part 2
>>> (http://www.w3.org/TR/xmlschema-2/#anyURI) doesn't require any validation of
>>> the contents of the anyURI datatype, and (c) most customers don't want us to
>>> validate more strictly than what other processors are doing.
>>> 
>>> Therefore, we are currently eliminating the anyURI checking [...]
>>
>>-----------------------------------------------------------------
>>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>>initiative of OASIS <http://www.oasis-open.org>
>>
>>The list archives are at http://lists.xml.org/archives/xml-dev/
>>
>>To subscribe or unsubscribe from this elist use the subscription
>>manager: <http://lists.xml.org/ob/adm.pl>
>>