RE: Traffic Analysis and Namespace Dereferencing
- From: Miles Sabin <MSabin@interx.com>
- To: xml-dev@lists.xml.org
- Date: Tue, 02 Jan 2001 18:33:43 +0000
David Megginson wrote,
> Miles Sabin writes:
> > It's worth bearing in mind that this also applies to the
> > dereferencing of DTD external subsets.
>
> Absolutely correct -- that's why XML documents for production-
> side systems should not include DOCTYPE statements. DTDs and
> XML Schemas belong mainly on the authoring side (both as
> templates for input tools and for debugging).
Hmm ...
There are already many production-side systems which validate,
and I'm sure there'll be many more in the future. Where the input
docs can't be assumed up front to be valid and where DTDs/
schemas are cached locally, this doesn't seem like such a crime.
Nevertheless, I've run into a surprising (to me) number of
people who ought to know better, but who seem to be only very
dimly aware of the comms implications of not caching locally,
and I suspect some of them are going to get their fingers burnt
... when their customers complain because they can't run their
apps on network-disconnected machines, or because they think the
app has embedded spyware, etc. etc. ...
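To be concrete about what caching locally can look like: below
is a minimal SAX2 EntityResolver sketch that serves well-known
DTDs out of a local directory and only falls back to the network
on a cache miss. The class name, the cache directory and the
hash-derived file names are illustrative only; a real system
would use a proper catalog.

  import java.io.File;
  import java.io.FileReader;
  import java.io.IOException;

  import org.xml.sax.EntityResolver;
  import org.xml.sax.InputSource;

  // Serves cached DTDs from a local directory; returning null
  // tells the parser to dereference the system id as usual.
  public class CachingResolver implements EntityResolver
  {
      private final File cacheDir;

      public CachingResolver(File cacheDir)
      {
          this.cacheDir = cacheDir;
      }

      public InputSource resolveEntity(String publicId,
                                       String systemId)
          throws IOException
      {
          if (systemId == null)
              return null;

          // Hypothetical naming scheme: one file per system id.
          File cached =
              new File(cacheDir,
                  Integer.toHexString(systemId.hashCode())
                      + ".dtd");

          if (cached.exists())
          {
              InputSource src =
                  new InputSource(new FileReader(cached));
              src.setSystemId(systemId);
              return src;
          }

          return null; // cache miss: let the parser fetch it
      }
  }

Wire it in with,

  XMLReader reader = XMLReaderFactory.createXMLReader();
  reader.setEntityResolver(
      new CachingResolver(new File("/var/cache/dtds")));

and the parser need never touch w3.org for anything that's
already in the cache.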
> > I can't help worrying that unintentional DoS might turn out
> > to be a major problem in the not too distant future ... the
> > W3C's servers host an awful lot of critical DTDs, and an awful
> > lot of generic XML processors don't cache external subsets or
> > use caching HTTP proxies by default. So what would happen if
> > w3.org collapsed under the strain of a couple of hundred
> > thousand XML editors all starting up at once?
>
> People will find ways to route around the damage. The only
> question is whether people will blame bad design practices or
> XML itself.
Customers blame vendors, and vendors try to pass the buck. I
fully expect to see attempts to blame outages on the W3C for
having 'irresponsibly' inadequate servers, or on XML itself.
But popping the stack a bit ... does this problem (even if it's
actually due to poor practice) suggest another angle on
related-resource discovery? Suppose we took a vaguely DNS-like
distributed, replicated database approach? And suppose we took
a few leaves out of DNSSEC's book to combat spoofing and
information leakage?
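To make the shape of that concrete, something like the sketch
below. Everything in it is hypothetical -- the mirror list, the
detached '.sig' convention and the out-of-band key distribution
are placeholders rather than a proposed wire format -- but it
shows the DNSSEC-ish division of labour: any replica can serve
the bits, and it's the signature, not the server, that you
trust.

  import java.io.ByteArrayOutputStream;
  import java.io.InputStream;
  import java.net.URL;
  import java.security.PublicKey;
  import java.security.Signature;

  // Try each replica in turn and accept the first copy whose
  // detached signature verifies against a key we already trust.
  public class ReplicatedFetcher
  {
      private final String[] mirrors;  // eg. "http://a.example/"
      private final PublicKey trusted; // obtained out of band

      public ReplicatedFetcher(String[] mirrors,
                               PublicKey trusted)
      {
          this.mirrors = mirrors;
          this.trusted = trusted;
      }

      public byte[] fetch(String path) throws Exception
      {
          for (int i = 0; i < mirrors.length; i++)
          {
              try
              {
                  byte[] body = read(new URL(mirrors[i] + path));
                  byte[] sig =
                      read(new URL(mirrors[i] + path + ".sig"));

                  Signature v =
                      Signature.getInstance("SHA1withDSA");
                  v.initVerify(trusted);
                  v.update(body);

                  if (v.verify(sig))
                      return body; // good signature: done
              }
              catch (Exception e)
              {
                  // unreachable or lying replica: try the next
              }
          }

          throw new Exception("no verifiable copy of " + path);
      }

      private static byte[] read(URL url) throws Exception
      {
          InputStream in = url.openStream();
          ByteArrayOutputStream out =
              new ByteArrayOutputStream();
          byte[] buf = new byte[4096];
          int n;
          while ((n = in.read(buf)) != -1)
              out.write(buf, 0, n);
          in.close();
          return out.toByteArray();
      }
  }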
Cheers,
Miles
--
Miles Sabin                       InterX
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@interx.com                 http://www.interx.com/