Information traversing the net is generated and consumed both by humans (carbon-based agents) and by software (silicon-based agents) - I guess one could also say wetware and software. Therefore design decisions about things like XML Schema need to take both into account, as well as the fact that information passes over the net in real time among organizations. I know people who'd really like to allow ambiguous content models, because they're easier for humans to write, but they cause problems for our silicon-based friends.
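To make the ambiguity point concrete, here's a minimal Python sketch (the content-model encoding and function name are my own, purely illustrative, not any real schema API). In the model (a, b) | (a, c), a streaming one-pass parser must commit to a branch as soon as it sees the first child element, but both branches start with "a", so it can't; the left-factored equivalent a, (b | c) has no such collision.

```python
# Toy content model as a list of alternatives; each alternative is a
# sequence of child-element names.  Purely illustrative, not a real API.

def is_deterministic(model):
    """True if a one-pass parser can always commit to a branch from the
    first child element alone (no two alternatives share a first name)."""
    firsts = [alt[0] for alt in model]
    return len(firsts) == len(set(firsts))

ambiguous = [("a", "b"), ("a", "c")]   # (a, b) | (a, c)
factored  = [("a", ("b", "c"))]        # a, (b | c) - left-factored

print(is_deterministic(ambiguous))  # False - can't commit on 'a'
print(is_deterministic(factored))   # True  - single branch to enter
```

This first-symbol check is roughly what the XML 1.0 "deterministic content model" rule demands, which is why ambiguous models, pleasant as they are to write, get rejected.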
The point I made about Murata Makoto's brilliant work with forest automata as an expanded model for markup is that to gain its full expressive power one needs to accept (in the worst case) bottom-up parsing (which eliminates stream-based applications) and either exponential-time preprocessing (merely processing an unknown schema may be prohibitively expensive) or exponential-time validation (there are some schemas for which validation is effectively impossible). Given that I can send you something which may take exponential time to process, schemas or instances can be used for denial-of-service attacks. I think he has the right model, but a schema language for XML on the Web can't use its full power, although it may have many applications within a single enterprise, or used in an offline manner. Note that Relax NG doesn't use the full power either. For more information on Murata-san's work, check out the Cover Pages - there are several articles, and it's a big topic.
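The exponential-preprocessing worry isn't hypothetical. The sketch below is not Murata-san's construction, just the classic automata-theory illustration of the same phenomenon: determinizing an (n+1)-state NFA for the pattern (a|b)* a (a|b)^(n-1) - "the n-th symbol from the end is a" - yields a DFA with 2^n states, so compiling a schema into a one-pass streaming matcher can blow up exponentially. All names here are mine.

```python
def nfa_for(n):
    """NFA for (a|b)* a (a|b)^(n-1): n+1 states, accepting at state n.
    State 0 loops on both symbols and guesses 'a' to enter the tail."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta

def dfa_size(n):
    """Subset construction: count the reachable DFA states."""
    delta = nfa_for(n)
    start = frozenset({0})
    seen = {start}
    stack = [start]
    while stack:
        state_set = stack.pop()
        for sym in "ab":
            succ = frozenset(q for s in state_set
                               for q in delta.get((s, sym), ()))
            if succ and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return len(seen)

for n in (2, 4, 8):
    print(n, dfa_size(n))   # 4, 16, 256: doubles with every added state
```

An attacker who can hand you an unknown schema with this shape in it gets the denial-of-service attack for free - the blowup happens before you've validated a single instance.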
On the last point, there are various formal measures of complexity in computer science, such as big-O notation and Kolmogorov complexity. Psychologists have obviously also been spending time trying to determine complexity measures for people. It would be really great if there were some way to tie them together. That way, people like me, involved in creating standards for both (XML Schema and Sox in my case), would have something a little more concrete than intuition to guide us (well, other people would - mine, of course, is infallible).
> -----Original Message-----
> From: Tom Bradford [mailto:firstname.lastname@example.org]
> Sent: Tuesday, October 09, 2001 11:39 AM
> To: Fuchs, Matthew
> Cc: 'Bullard, Claude L (Len)'; email@example.com
> Subject: Re: [xml-dev] Adam Bosworth on XML and W3C
> "Fuchs, Matthew" wrote:
> > I think both are relevant - schema is consumed by both carbon and silicon
> > agents. For two markup related examples - ambiguous content models and
> > Murata Makoto's work with hedge automata. There seems to be a human
> > (carbon-based agent) predilection for ambiguous content models because they
> > can be easier to write, but they can cause problems for the behavior of
> > programs (silicon-based agents). Likewise, Makoto's work is very elegant,
> > and implementation may not be so hard (low Kolmogorov complexity), but there
> > are cases requiring exponential processing time, which is why I was against
> > using them directly in Schema (when I heard "exponential" I thought
> > "denial-of-service attack").
> >
> > It would be awesome if there were some way to relate the formal complexity
> > measures with psychological complexity. Do you know of any sources?
> Uhhh... Again in English, please? Thanks.
> Tom Bradford
> The dbXML Project
> Open Source Native XML Database