/ "Jeff Rafter" <lists@jeffrafter.com> was heard to say:
| Define problem. If the feature is not supported or not recognized the
| appropriate SAXException should be raised during the call to getFeature.
> Yep.
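As a rough illustration of the getFeature behaviour agreed on above, here is a minimal sketch. The feature URI is made up for the example (it is not part of any proposal), and FeatureProbe and probe() are hypothetical names; only XMLReader.getFeature and the two exception classes are real SAX API.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class FeatureProbe {
    // Hypothetical feature URI, used only for illustration.
    static final String FEATURE =
        "http://example.org/sax/features/normalization-check";

    /** Returns the feature state as text, or the reason it could not be read. */
    static String probe(XMLReader reader, String feature) {
        try {
            return String.valueOf(reader.getFeature(feature));
        } catch (SAXNotRecognizedException e) {
            // The reader has never heard of this feature name.
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            // The reader knows the name but cannot report its state right now.
            return "not supported";
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(probe(reader, FEATURE));
    }
}
```

An application would run a probe like this once, before parsing, and fall back to its own check if the feature is unavailable.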
| Unfortunately, in this proposal there is nothing per se about what to do
| when the document is or is not fully normalized. Perhaps Locator2 needs
> My thinking was that if the document wasn't normalized, an error would
> be raised. If the parse finishes without raising an error, it's
> normalized.
| Additionally, I would question whether or not a feature should be provided
| for normalization transcoding or if that was beyond the scope of SAX. I
> I think that's application-level functionality that belongs in a layer
> above SAX.
I am not sure I am comfortable elevating this to error status. At best, I would
think this would be a warning() because it is a purely process-oriented
problem-- i.e., you simply can't do a byte-for-byte Unicode comparison--
this isn't an error, you should simply be warned about the inability to
perform the test.
Secondly, this is something which in a non-normalized document will possibly
occur a lot. This would create a lot of noise in an error listing-- and
likewise in a combined listing of errors and warnings, which many editors
display. Additionally, the repetition of the non-normalized warning/error means
creating a new SAXParseException *every* time. This goes back to a
conversation Karl Waclawek and I have been having off-list about the
performance penalties involved in the current design-- but that is probably
a discussion for another time...
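To make the alternative concrete, here is a rough sketch of a handler that keeps a single flag and a counter instead of raising one SAXParseException per occurrence. It is only an approximation under stated assumptions: it checks NFC with java.text.Normalizer on each characters() chunk, which is weaker than "fully normalized" in the XML 1.1 sense and can miss a sequence split across two chunks. The class name and the isFullyNormalized() method are hypothetical, not an existing SAX API.

```java
import java.io.StringReader;
import java.nio.CharBuffer;
import java.text.Normalizer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NormalizationFlagHandler extends DefaultHandler {
    // Number of characters() chunks that were not in NFC.
    private int nonNormalizedChunks = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        // Wrap the slice without copying; Normalizer accepts any CharSequence.
        CharSequence text = CharBuffer.wrap(ch, start, length);
        if (!Normalizer.isNormalized(text, Normalizer.Form.NFC)) {
            nonNormalizedChunks++; // count it, but allocate no exception
        }
    }

    /** Hypothetical single check, queried once after the parse. */
    boolean isFullyNormalized() {
        return nonNormalizedChunks == 0;
    }

    int nonNormalizedCount() {
        return nonNormalizedChunks;
    }

    /** Parses the given XML text and returns the handler with its result. */
    static NormalizationFlagHandler check(String xml) throws Exception {
        NormalizationFlagHandler handler = new NormalizationFlagHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }
}
```

The sketch only looks at character data; element names, attribute names and attribute values would need the same treatment in startElement.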
I think we need to have a defined scenario/use case for how the information
will actually be used before it is finalized. e.g., when such a
warning/error would occur, what an ideal line number and column position
would be, what the order of events would be when the processor encountered
such a warning. Should the warning be fired for every instance of
non-full-normalization? If not, what is the benefit to using the
ErrorHandler at all instead of a single check for isFullyNormalized? When
multiple errors/warnings occur (e.g. one for each occurrence of
non-full-normalization), there is great ambiguity:
e.g. This is somewhat clear:
[startElement]
[startElement]
[warning non-normalized text]
[characters]
while this is not:
[warning non-normalized text]
[startElement]
When the element name is non-normalized? When the attribute name is
non-normalized? When the attribute content is non-normalized?
What about:
[warning non-normalized text]
[startPrefixMapping]
[startElement]
Again, it is difficult to determine where exactly the warning should be
applied. Does it matter?
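One way to examine the ordering questions above is to record the callback sequence a parser actually produces. The sketch below is a hypothetical recorder (EventOrderRecorder and record() are names invented for the example). It logs startPrefixMapping, startElement, characters and warning in arrival order, together with the line and column the parser attaches to a warning. A stock parser is not assumed to fire any normalization warning; the recorder only shows where one would land if it did.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class EventOrderRecorder extends DefaultHandler {
    // Callback names in the order the parser delivered them.
    final List<String> events = new ArrayList<>();

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        events.add("startPrefixMapping " + prefix);
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        events.add("startElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("characters");
    }

    @Override
    public void warning(SAXParseException e) {
        // Record the position the parser reports for the warning.
        events.add("warning line " + e.getLineNumber()
                   + " col " + e.getColumnNumber());
    }

    /** Parses the XML text namespace-aware and returns the recorded events. */
    static List<String> record(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed for startPrefixMapping
        EventOrderRecorder recorder = new EventOrderRecorder();
        factory.newSAXParser()
            .parse(new InputSource(new StringReader(xml)), recorder);
        return recorder.events;
    }
}
```

Because startPrefixMapping arrives before the startElement it belongs to, a warning fired before both gives the application no direct way to tell whether it refers to the prefix, the element name or an attribute.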
Best Regards,
Jeff Rafter