Re: [xml-dev] SGML DTDs for HTML 5.1

Yes thanks.

I had a conversation about this via PM over the weekend with Rick (who has
actually managed to get ISO 8879 changed in the past), and I believe I can say
as much as that he thinks there would need to be demonstrable community
interest and a business case before pursuing this further.

FWIW here are a couple points that I'd see in need of addressing or at least
tangentially related to parsing (a rational subset of) modern HTML using SGML:

1a. a proviso to allow unquoted attributes as long as these
don't contain spaces, even if not declared NAME/NMTOKEN/NUMBER etc. or
enumerated attributes (current SGML parsers will treat this as a
recoverable error, but will chug along just fine)

1b. a proviso to allow for empty string literals as attribute values when
these require name tokens otherwise

2. a new parsing mode for elements having CDATA declared content (namely
that these can be terminated by a "matching" end-element tag, not just
a "delimiter-in-context") to cater for HTML's long-standing script data parsing
issue (but this issue already existed in 1999 and wasn't addressed, so it might
be better to leave it as it is)

3. an enhancement to allow multi-code point predefined entities as discussed

4. naming rules (once again) for elements and other name tokens;
specifically, ID tokens are allowed to be anything in HTML5
(even eg. quoting characters, hashmarks etc. but there are very good
reasons why these liberties aren't actually recommendable, eg. conflict
with CSS selector syntax etc.), whereas element names and
other generic identifiers are nominally constrained to be in IRV/ASCII,
but depending on whether you allow custom and foreign elements,
receive their definition from XML 1.0 Fifth Ed.; since HTML wants to
treat elements and other name tokens as case-insensitive,
HTML's naming rules are really hopelessly foobar'd anyway, so
another way to deal with it is just to ignore what the spec text says,
and always use case-sensitive markup, which is what's being done
in practice; since you can express exactly that with an SGML declaration,
there's no actual change needed here; only have to come up with
an actual subset of characters to allow in eg. identifiers

5. distinguished DTD notation public identifiers for XML Schema
and/or RNG, which would be relevant for future HTML revisions,
because eg. SVG2 is going to use Relax NG as schema language.
DTD notations are a relatively obscure feature of Annex K (that
I've personally never seen used in practice) allowing DTD-analogous
declarations to come from other sources such as XML Schema or
RNG. This needs addressing since it overlaps with ISO DSDL-9
which is about representing XML namespaces and XSD data types in
DTDs (though XML-DTDs only, supposedly). Also, Annex K allows
for a data notation syntax for attributes which needs realignment.

6. Annex K clarification wrt. #ALL and #IMPLICIT as related
to data attributes (attributes of notations)

7. (just maybe) a formalisation of SGML's ambiguity criteria
and tag inference model (SGML's model for
tag inference is pretty well understood eg. see the seminal works
by Brüggemann-Klein/Wood et al. on Glushkov automata; some
of the theory of operation wrt. optimizations of allgroup-connectors
isn't accessible, however); the formalisation would be a basis to define
a set of quasi-standard additional tag inference/recovery actions
performed by SP (and sgmljs)
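
As a rough illustration of point 2, here is a small Python sketch of the two
ways CDATA declared content could be terminated. It is a simplification under
stated assumptions, not a description of how SP or any HTML parser is
implemented: the "delimiter-in-context" rule is approximated as "</" followed
by an ASCII letter, and the "matching" rule as "</" plus the element name
followed by whitespace, "/" or ">". The function names are made up for this
example.

```python
import re

# Assumption: ETAGO ("</") is recognized in CDATA content when it is
# followed by a name start character, approximated here as an ASCII letter.
SGML_ETAGO = re.compile(r"</[A-Za-z]")


def cdata_end_sgml(text: str) -> int:
    """Offset where the delimiter-in-context rule ends the content, or -1."""
    m = SGML_ETAGO.search(text)
    return m.start() if m else -1


def cdata_end_matching(text: str, name: str) -> int:
    """Offset of the first end tag whose name matches `name`, or -1.

    Simplified: ignores the escaped-comment states of HTML script parsing.
    """
    pattern = re.compile(
        r"</" + re.escape(name) + r"[ \t\n\f\r/>]", re.IGNORECASE
    )
    m = pattern.search(text)
    return m.start() if m else -1


source = 'document.write("<b>hi</b>");</script>'
# The first rule stops at "</b>" inside the string literal,
# the second only at "</script>".
print(cdata_end_sgml(source), cdata_end_matching(source, "script"))
```

With the first rule, the script text is cut off inside the JavaScript string
literal; with the second, it runs up to the actual end tag.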

None of these things is actually critical; if anything, the first two issues
are the most annoying.
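
To make point 1a a bit more concrete, the following Python sketch compares
two rules for unquoted attribute values. It assumes the name characters of
the HTML 4 SGML declaration (letters, digits, ".", "-", "_", ":") for the
SGML side, and the HTML5 unquoted attribute value syntax (non-empty, no ASCII
whitespace, none of " ' = < > `) for the HTML side. The function names are
hypothetical and the check is an illustration only, not a parser.

```python
import string

# Assumption: name characters as in the HTML 4 SGML declaration.
SGML_NAME_CHARS = set(string.ascii_letters + string.digits + ".-_:")

# Characters that HTML5 does not allow in an unquoted attribute value.
HTML5_FORBIDDEN = set(" \t\n\f\r\"'=<>`")


def unquoted_ok_sgml(value: str) -> bool:
    """True if the value consists solely of name characters."""
    return value != "" and set(value) <= SGML_NAME_CHARS


def unquoted_ok_html5(value: str) -> bool:
    """True if the value is non-empty and has no forbidden character."""
    return value != "" and not (set(value) & HTML5_FORBIDDEN)


for v in ["submit", "/img/logo.png", "a b"]:
    print(repr(v), unquoted_ok_sgml(v), unquoted_ok_html5(v))
```

A value such as /img/logo.png passes the HTML5 check but not the name
character check, which is the gap a proviso along the lines of 1a would close.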

The first two points could get activated by using a distinguished
added requirements public identifier (SEEALSO "... HTML5 public id ...");
point 3, though, needs a new minimum data/minimum literal as it changes
the SGML declaration syntax.
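
For anyone wanting to see which predefined entities point 3 is about, the
HTML5 named character references that expand to more than one code point can
be listed with Python's standard library (html.entities.html5). This is only
a quick way to inspect the table; it says nothing about how such entities
would be declared in SGML.

```python
from html.entities import html5

# Keep only the ";"-terminated names whose replacement text
# consists of more than one code point.
multi = {
    name: value
    for name, value in html5.items()
    if name.endswith(";") and len(value) > 1
}

for name in sorted(multi):
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in multi[name])
    print(f"&{name} -> {code_points}")
```

One entry in that list is &NotEqualTilde;, which maps to U+2242 followed by
the combining character U+0338.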

Regards,
Marcus

On Mon, Nov 21, 2016 at 11:06 AM, Flynn, Peter <pflynn@ucc.ie> wrote:
> On 18/11/16 20:25, u123724 wrote:
> [...]
>> Would you know anyone to approach for this?
>
> I think the formal approach is to contact your country's standards
> body's representative to JTC1. A link to the standards bodies in
> participating countries is here:
>
> http://www.iso.org/iso/home/standards_development/list_of_iso_technical_committees/iso_technical_committee_participation.htm?commid=45020
>
> ///Peter

