Tim Bray <tbray@textuality.com> wrote:
| I usually encounter breakage in a completely different area, which is
| low-level character handling. Unless you're in a very tightly
| controlled environment you can't generate XML with printf() at all
| because one of the strings behind a %s might contain a < or & or
| something that is not kosher per the character encoding you think you're
| generating.
True. Did I say I was a fan of printf? ;-)
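To make the failure concrete, here is a minimal sketch in Python rather than C. The helper names `naive_element` and `is_well_formed` are made up for illustration; the point is only that %s-style templating produces a broken document as soon as the substituted string contains a markup character.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape


def naive_element(name, text):
    # printf-style generation: the text is pasted in with no escaping at all.
    return "<%s>%s</%s>" % (name, text, name)


def is_well_formed(doc):
    # Hypothetical helper: True if a real XML parser accepts the string.
    try:
        xml.dom.minidom.parseString(doc)
        return True
    except ExpatError:
        return False


user_text = "AT&T <rocks>"
broken = naive_element("title", user_text)         # raw & and < leak into markup
fixed = naive_element("title", escape(user_text))  # escaped exactly once

print(is_well_formed(broken), is_well_formed(fixed))
```

This covers only the markup characters; the character-encoding half of the complaint is a separate problem that escaping alone does not solve.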
| And it's not as simple as just having an escape-xml-string either,
Well, yes and no...
| because lots of times you get input (i.e. from users who are too smart,
| or from some database field that got serialized out of other XML) that
| already has &amp;amp; and &amp;#x27c; and so on in it, so you need to think
| through carefully where & how you do the escaping.
Only if you've taken it upon yourself to dwim with user data, I would
think. If the issue is somebody being too smart and pseudo- or semi- or
quasi-escaping data "in advance", he gets hosed for his pains. I'd
suspect database systems that try to be too "smart" or "helpful" are the
biggest offenders here.
Storing "serialized XML text" in a database field is brain damage.
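A small sketch of what "gets hosed" means in practice, assuming the serializer treats every input as raw text and escapes it exactly once (the variable names are illustrative only):

```python
from xml.sax.saxutils import escape, unescape

raw = "Q&A"              # plain text, as the serializer expects
pre_escaped = "Q&amp;A"  # somebody escaped it "in advance"

# Raw text is escaped once and reads back as the same text.
print(escape(raw))

# Pre-escaped text is treated as raw too, so its & is escaped again.
double = escape(pre_escaped)
print(double)

# A reader of the document gets back the literal characters that were
# handed in, entity reference and all. Nothing is lost, nothing is guessed.
print(unescape(double) == pre_escaped)
```

The round trip is exact in both cases, which is the argument for not guessing: the one rule "input is raw text" is always reversible, whereas trying to detect already-escaped input is not.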
| These things are about 8 times as much work as synthesizing the tree in
| the applications I write.
My general approach is to have a separate print module, implemented by a
Visitor pattern. This is the only module that is even aware of XML
syntax. It takes no prisoners: either give it "raw" text or you lose.
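A rough sketch of that arrangement, in Python for brevity. The class names (`Text`, `Element`, `XmlPrinter`) are invented for this illustration and are not the actual module; name validation, namespaces, and encoding handling are left out.

```python
from xml.sax.saxutils import escape, quoteattr


class Text:
    """Tree node holding raw, unescaped character data."""
    def __init__(self, data):
        self.data = data

    def accept(self, visitor):
        return visitor.visit_text(self)


class Element:
    """Tree node with a name, raw attribute values, and child nodes."""
    def __init__(self, name, attrs=None, children=None):
        self.name = name
        self.attrs = attrs or {}
        self.children = children or []

    def accept(self, visitor):
        return visitor.visit_element(self)


class XmlPrinter:
    """The only class that knows XML syntax; all escaping happens here."""
    def visit_text(self, node):
        return escape(node.data)

    def visit_element(self, node):
        # quoteattr escapes the value and picks a safe quote character.
        attrs = "".join(" %s=%s" % (key, quoteattr(value))
                        for key, value in node.attrs.items())
        body = "".join(child.accept(self) for child in node.children)
        return "<%s%s>%s</%s>" % (node.name, attrs, body, node.name)


doc = Element("note", {"title": 'He said "hi" & left'},
              [Text("1 < 2 & 3 > 2")])
output = doc.accept(XmlPrinter())
print(output)
```

Because the tree nodes never hold escaped text, there is exactly one place where escaping can go wrong, and swapping in a different printer (say, one for plain text) needs no change to the rest of the application.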