- From: "Neil Bradley" <firstname.lastname@example.org>
- To: email@example.com
- Date: Sat, 16 Aug 1997 20:43:42 +0000
Peter Murray-Rust wrote:
> If a set of rules *does* emerge, then how can we generally inform an application
> that it should take them as DEFAULT? I assume this is through a PI:
I was hoping that relevant applications (mainly browsers and
typesetting systems) would ALWAYS apply whatever rules are finally
agreed, except where preserved content (or some other set of
rules) is explicitly requested.
> I agree with Liam - I didn't understand 'blockness'. I also think that whatever
> is done here has to be independent of stylesheets and DTDs. The average hacker
> like me simply won't understand the subtleties.
I am merely trying to distinguish in-line elements from other
elements. An in-line element implies no line-breaks above or below
it. A 'Block' element therefore DOES imply such a break. I do not use
the terms element content and mixed content here, because they are not
quite the same thing. As I have said before, a Para element is a 'block'
element, and has mixed content, but an Emph element is an 'in-line'
element, yet also has mixed content. All style sheets, including
CSS, understand the concept of in-line and block elements. Any
whitespace surrounding a block element MUST be irrelevant.
Liam raised the issue of a half-way element type, such as a header
which implies a line-break before it, but not after, so that
following text will appear on the same line. This one is tricky.
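To make the block/in-line distinction concrete, here is a minimal sketch of an application discarding whitespace around block elements. It is only an illustration: the set of block element names, the regex tokeniser, and the function names are all assumptions, not part of any agreed rule set, and the half-way case Liam raised is not handled.

```python
import re

# Hypothetical classification: which element names count as 'block'
# is an assumption made for this example only.
BLOCK_ELEMENTS = {"Chapter", "Para"}

# Splits markup into tag tokens and text tokens (naive, no comments/CDATA).
TOKEN = re.compile(r"(<[^>]+>)")


def tag_name(token):
    """Return the element name of a start or end tag, or None for text."""
    match = re.match(r"</?\s*([A-Za-z][\w.-]*)", token)
    return match.group(1) if match else None


def is_block_tag(token):
    return tag_name(token) in BLOCK_ELEMENTS


def strip_around_blocks(markup):
    """Drop whitespace that directly adjoins a block element's tags."""
    tokens = [t for t in TOKEN.split(markup) if t != ""]
    out = []
    for i, tok in enumerate(tokens):
        if tok.startswith("<"):
            out.append(tok)
            continue
        if i > 0 and is_block_tag(tokens[i - 1]):
            tok = tok.lstrip()   # whitespace after a block tag is irrelevant
        if i + 1 < len(tokens) and is_block_tag(tokens[i + 1]):
            tok = tok.rstrip()   # whitespace before a block tag is irrelevant
        if tok:
            out.append(tok)
    return "".join(out)
```

Whitespace next to the in-line Emph tags is left alone, because there it separates words; only whitespace touching the Chapter and Para tags is discarded.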
> I would assume that this processing takes place in the application, not the
> parser. How/whether comments are passed to the application is part of the
> parser API. I assume that at this stage the comment is recognised as a single
> chunk which can be deleted with/out surrounding whitespace as required.
As I say at the top of the rules, ALL these rules are applied by the
application, not the XML processor.
> This one is tough. Please criticise my current view :-). SGML documents seem
> to use markup as structure in some places (e.g. OL/LI in HTML) or
> event streams (e.g. EM, B in HTML). Authors/readers expect different processing
> modes from these types. The example above is best treated as structuring
> markup (P) containing an event stream (#PCDATA|EM)* [sorry for abbreviations].
> So we have to indicate to the processor that P is structuring and that
> whitespace after <P> or before </P> is irrelevant, and that its content is an
> event stream where all whitespace is normalised to a single space (cf HTML.)
> Therefore can we have something like this:
> <?XML-SPACE STRUCTURE="YES"?>
> <?XML-SPACE EVENT="YES"?>
> This is<Emphasis>very</Emphasis>strange.
> <?XML-SPACE STRUCTURE="YES"?>
I think that, ultimately, some combinations of markup will always
break whatever rules we come up with. We must ensure that only
obscure, non-intuitive combinations do this, then just shout from
the rooftops that these combinations are not to be used.
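For reference, the 'event stream' treatment Peter describes (all whitespace normalised to a single space, with leading and trailing whitespace inside the structuring element dropped) might be sketched as follows. The function name is hypothetical, and the sketch assumes the application already holds the content of one structuring element as a string.

```python
import re


def normalise_event_stream(content):
    """Collapse each run of XML whitespace to one space and trim the ends.

    Assumes `content` is the content of a single structuring element,
    such as the P in the quoted example.
    """
    collapsed = re.sub(r"[ \t\r\n]+", " ", content)
    return collapsed.strip(" ")
```

Note that normalisation can only reduce whitespace that is present. In the quoted example, "This is<Emphasis>very</Emphasis>strange." has no space around the Emphasis tags, so the words would still run together.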
> > > RULE 4. A remaining line-end code is converted into a space, except when it is
> > > preceded by a normal (hard) hyphen, or by a soft hyphen ('°'),
> > > in which case it is removed (a soft hyphen is also then removed).
> > > ---
> I have to argue against this :-(. A hyphen is indistinguishable from a minus
> to lots of people. There are also many cases where people may wish to end
> a line with a minus:
> Since we are normalising whitespace, then lines can always be arranged so that
> hyphens are unnecessary.
My concern was to address existing text files, where hyphens are
often used in this way. Maybe I am over-estimating this problem.
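A minimal sketch of RULE 4 as quoted above may help show both the intent and Peter's objection. It assumes the soft hyphen is U+00AD and reads the rule literally: a hard hyphen is kept and only the line-end is removed. The function name is hypothetical.

```python
import re

# Assumption: the soft hyphen referred to in RULE 4 is U+00AD.
SOFT_HYPHEN = "\u00ad"


def apply_rule_4(text):
    """Convert remaining line-ends to spaces, with the hyphen exceptions."""
    # Treat CR LF and bare CR as a single line-end code.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Soft hyphen before a line-end: remove both.
    text = re.sub(SOFT_HYPHEN + r"\n", "", text)
    # Hard hyphen before a line-end: remove the line-end, keep the hyphen.
    text = re.sub(r"-\n", "-", text)
    # Any other line-end becomes a space.
    return text.replace("\n", " ")
```

With this reading, "well-" followed by a line-end and "known" is joined as "well-known", which is the existing-text-file case. A line ending in a minus, such as "x = a -" followed by "b", becomes "x = a -b", which is the loss of spacing Peter is worried about.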
Neil Bradley - Author of The Concise SGML Companion.
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to firstname.lastname@example.org the following message;
List coordinator, Henry Rzepa (email@example.com)