Tan,
TAN Kuan Hui wrote:
>>There was someone with that need writing to the list a few days ago. It
>>seems entirely legitimate to me to apply different schemas to the same
>>document at different stages of a workflow, or for senders of documents to
>>apply stronger validation criteria than recipients of the same documents.
>>
>>Michael Kay
>>http://www.saxonica.com/
>>
>>
>>
>
>Shouldn't it be the case that the validation process necessitates
>a 2-stage parsing ? What I mean is that XSD can only do a lexical
>validation, a second follow-up stage that validates against the
>application semantics is required.
>
>
>
I am not sure what you mean, but I disagree with the term "lexical
validation".
When writing a compiler, one often separates lexical syntax and
context-free syntax (lexical syntax being the answer to "what are the
tokens of the language"). Lexical syntax is like defining what is a
word, in order to talk about what is a valid sentence.
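
To make the distinction concrete, here is a minimal sketch for a toy language of sums over numbers. The grammar and the names `tokenize`, `parse_expr`, `parse_term` and `is_sentence` are invented for this illustration; they are not taken from any particular compiler.

```python
import re

# Lexical syntax: what counts as a token ("what is a word").
TOKEN = re.compile(r"\s*(\d+|[()+])")

def tokenize(text):
    """Split text into tokens, rejecting characters that form no token."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"lexical error at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

# Context-free syntax: which token sequences form a valid sentence.
#   expr ::= term ('+' term)*
#   term ::= NUMBER | '(' expr ')'
def parse_expr(tokens, i=0):
    """Parse an expr starting at index i; return the index after it."""
    i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] == "+":
        i = parse_term(tokens, i + 1)
    return i

def parse_term(tokens, i):
    """Parse a term starting at index i; return the index after it."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[i].isdigit():
        return i + 1
    if tokens[i] == "(":
        i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return i + 1
    raise SyntaxError(f"unexpected token {tokens[i]!r}")

def is_sentence(text):
    """True if text tokenizes and the whole token list is one expr."""
    tokens = tokenize(text)
    try:
        return parse_expr(tokens) == len(tokens)
    except SyntaxError:
        return False
```

In this sketch "1 $ 2" already fails in the tokenizer, because "$" is not a word of the language, whereas "1 + + 2" consists only of valid words and is rejected by the grammar.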
Maybe you mean "syntax validation": that validating a document against
an XML Schema definition checks merely some structural conditions.
Again I disagree to some extent: an XML Schema definition can reflect a
model of application data that goes well beyond syntax. If this were not
so, one could map any schema directly onto some objects. But Java's type
system (for representing pure data, no methods) is much weaker than XML
Schema, even weaker than DTDs.
OTOH, every type system (or "validation system") has its limits, and I
think Mike and others have made some points earlier about multiple
validation, including multiple validation with different validation systems.
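
As a rough sketch of what such multiple validation could look like, the following applies a structural check first and an optional application rule second. Everything here is an assumption made for illustration: the `<order>` document shape, the `quantity` attribute and the function names are hypothetical, and the hand-written structural check merely stands in for a real schema validator.

```python
import re
import xml.etree.ElementTree as ET

def check_structure(root):
    """Stage 1: structural conditions, roughly what a schema would express."""
    errors = []
    if root.tag != "order":
        errors.append("root element must be <order>")
    items = root.findall("item")
    if not items:
        errors.append("order needs at least one <item>")
    for item in items:
        qty = item.get("quantity", "")
        if not re.fullmatch(r"[0-9]+", qty):
            errors.append(f"quantity {qty!r} is not a non-negative integer")
    return errors

def check_business_rules(root, max_quantity):
    """Stage 2: an application rule layered on top of stage 1."""
    return [
        f"quantity {item.get('quantity')} exceeds limit {max_quantity}"
        for item in root.findall("item")
        if int(item.get("quantity")) > max_quantity
    ]

def validate(xml_text, max_quantity=None):
    """Run stage 1 always; run stage 2 only if a limit is given."""
    root = ET.fromstring(xml_text)
    errors = check_structure(root)
    if not errors and max_quantity is not None:
        errors = check_business_rules(root, max_quantity)
    return errors

doc = '<order><item quantity="500"/></order>'
recipient_errors = validate(doc)                  # structural check only
sender_errors = validate(doc, max_quantity=100)   # stricter criteria
```

The same document passes the recipient's check and fails the sender's stricter one, which is the situation Mike describes above.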
<snip/>
>type info needs to be extracted from the validation, the XSD
>validation stage can be turned off to improve throughput
>once the system is stable. I know the latter suggestion
>may be controversial to some, but it is an option
>if the logical validation is not mangled into the lexical
>space.
>
>
You seem to think validation makes things slow a priori.
This is not true: in a statically typed language, you can use type
information to optimize the representation in memory.
If you cannot be sure that incoming data is valid, then you cannot use
the optimized representation and have to deal with a bloated, generic one.
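
A small sketch of the idea follows. Python is not statically typed, so its standard `array` module stands in here for the packed layout a typed language could choose; the sample values and the helper `validate_ints` are made up for this example.

```python
from array import array

raw = ["17", "42", "7", "1000"]   # hypothetical element contents

# Without validation: keep generic strings and convert on every use.
generic = list(raw)

def validate_ints(values):
    """Accept only integers that fit a signed 64-bit field."""
    checked = [int(v) for v in values]   # raises ValueError if not an integer
    if any(not -2**63 <= n < 2**63 for n in checked):
        raise ValueError("value out of 64-bit range")
    return checked

# After validation: a packed, fixed-width representation is safe to use.
packed = array("q", validate_ints(raw))

total_generic = sum(int(v) for v in generic)   # parses each string again
total_packed = sum(packed)                     # values are already integers
```

Each entry of `packed` occupies `packed.itemsize` bytes, while `generic` holds references to full string objects that have to be parsed again whenever a number is needed.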
cheers,
Burak Emir
http://lamp.epfl.ch/~buraq