Re: [xml-dev] Generic XML Tag Closer </> (GXTC)
- From: Rick Marshall <rjm@xxxxxxxxxxx>
- Date: Mon, 28 Aug 2006 07:17:48 +1000
from a brief survey of your blog on this matter i can only say that, like
so many other "critics" of xml, your comments are really only meaningful
if you ignore a significant part of the problem space xml is solving.
having said that - a few more comments:
it probably would have been better for css to be an xml vocabulary from
the start - but css prohibits nested "tags" so i guess it was deemed
unnecessary - and that's true for some other syntaxes.
</> terminators (or "full stop syntax" as i like to think of it) are
more readable in simple examples, but i can assure you that in large machine
generated documents, such as some of the edi stuff we do, they would be
incredibly unreadable. my own syntax for unibase (which does use a full
stop syntax) can get very unreadable and difficult to debug on large
programs. it's a shame html and/or xml weren't known at the time i
designed the syntax.
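
to make the debugging point concrete, here is a rough sketch in python of a nesting checker that accepts both named end tags and the anonymous </> closer. it is purely illustrative: the name first_error and the sample documents are made up, and it ignores attributes, comments and empty-element tags, so it is not an xml parser.

```python
import re

# Matches <name>, </name> and the anonymous closer </>; attributes,
# comments and empty-element tags are deliberately ignored in this sketch.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)?>")


def first_error(doc):
    """Return (offset, message) for the first nesting error, or None."""
    stack = []  # names of the elements that are currently open
    for m in TAG.finditer(doc):
        closing, name = m.group(1) == "/", m.group(2)
        if not closing:
            if name is None:
                return m.start(), "empty start tag"
            stack.append(name)
        elif not stack:
            return m.start(), "close with nothing open"
        else:
            opened = stack.pop()
            # A named close can be checked against the open element;
            # </> has no name, so it matches whatever happens to be open.
            if name is not None and name != opened:
                return m.start(), f"expected </{opened}>, found </{name}>"
    if stack:
        return len(doc), f"unclosed <{stack[-1]}> at end of document"
    return None


# The same mistake (a missing close for <b>) in both styles.
named = "<doc><p><b>x</p>" + "<p>y</p>" * 3 + "</doc>"
anon = "<doc><p><b>x</>" + "<p>y</>" * 3 + "</>"
print(first_error(named))
print(first_error(anon))
```

with named end tags the checker stops at the first </p>, right next to the mistake. with </> everything keeps matching until the input runs out, and the report blames <doc> rather than the <b> that was actually left open.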
all of this would be a lot easier if every document was checked against
a dtd or xsd - essential for our work and a whole other topic.
rick
juanrgonzaleza@canonicalscience.com wrote:
> Rick Marshall said:
>
>> juanrgonzaleza@canonicalscience.com wrote:
>>
>>
>>> Rick Marshall said:
>>>
>>>
>>>
>>>> my 5c
>>>>
>>>> </> is a syntax element and as long as something else understands the
>>>> semantics - it will do fine
>>>>
>>>> however...
>>>>
>>>> </tag> is semantic which means the parser/processor does not need
>>>> external information to make a decision about the correctness and
>>>> completeness of the information.
>>>>
>>>>
>>>>
>>> <tag1>content1<tag2>content2</tag2></tag1>
>>>
>>> Having parsed <tag1> and <tag2>, the parser finds the "</" and *expects* a
>>> "tag2" because XML must be consistent. The same happens when it finds the </tag1>.
>>>
>>> <tag1>content1<tag2>content2</></>
>>>
>>> Having parsed <tag1> and <tag2>, the parser finds the "</" and
>>> knows/assumes it is closing the open tag2 because XML must be consistent. The
>>> same happens when it finds the last </>.
>>>
>>>
>>>
>> assuming is not the same as knowing.
>>
>
> Sure, see below.
>
>
>> i mean let's have some real fun - like in html and allow </> to close off
>> all unclosed tags. saves a few more keystrokes.
>>
>> <tag1>...<tag2>...</>
>>
>> now that's even more compact and it's sort of what html does - but we
>> all know how often it gets it wrong.
>>
>> you see the problem is that when the parser comes across </> it has to
>> ASSUME that it is closing the last declared tag. however if you
>> accidentally left out a close tag then it's wrong and it will take
>> counting opening and closing tags right to the end of the document to
>> know that the document is valid.
>>
>> bye bye sax
>>
>> rick
>>
>
> Right! The parser assumes that the empty end tag closes the last open tag. If the
> doc is syntactically correct (i.e. tags, parentheses, curly braces...
> matching) then the parser knows which tag each one closes. If some
> start or end tag is missing, then you get an error. Is that an advantage
> of full end tags over short forms such as </>? Again, no.
>
> I repeat that I discussed this a bit in
>
> [http://canonicalscience.blogspot.com/2006/04/canonml-markup-language-beyond-tex-xml.html]
>
> The doc is a bit outdated by recent improvements, but there I review Paul
> Prescod's arguments in favor of full end tags, and that part is still valid. He
> compared XML syntax and S-expressions, but the same applies to the special
> XML syntax discussed here because </> plays the role of the ")" of LISP.
>
> Well, he omitted a </footnote> and showed that the XML parser does not need
> to parse the entire doc. Three comments:
>
> - He used an XML-oriented example. If the </para> after the
> </footnote> is missing instead, then the XML parser needs to parse the entire doc
> (except the root) before finding the error. Therefore the advantage of the
> full end tag is lost.
>
> - Unlike with XML, it is trivial to run a pre-parsing step that verifies
> syntactic correctness, simply by counting the number of "{" and "}" (TeX, C...) or
> the number of "(" and ")" (LISP, Scheme...) or "[" and "]" (CanonML) in the
> full doc.
>
> - It is also simpler to type [] and then type the content inside than to type
> <my-tag></my-tag>. In my own experience the number of errors when typing XML
> is more than double. This can be illustrated as follows:
>
> Case "[]": Errors possible? Omission of some tag.
>
> Case <tag></tag>: Errors possible? i) omission of some tag ii) incorrect
> writing of a tag, e.g. <tag><tag> , <tag></tah>, <tag><7tag>...
>
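
for reference, the counting pre-pass quoted above might look something like the following in python. this is only a sketch under assumptions: the name bracket_balance and the sample strings are made up, the input is not real canonml, and brackets inside literal or escaped text are not handled. it tracks a running depth rather than two separate totals, so a close that arrives before its open is caught as well.

```python
def bracket_balance(text, open_ch="[", close_ch="]"):
    """Return (balanced, offset) for one pair of delimiters.

    offset is None when balanced; otherwise it is the position where
    the imbalance becomes visible to a single left-to-right pass.
    """
    depth = 0
    for i, ch in enumerate(text):
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:
                # A close with nothing open is visible immediately.
                return False, i
    if depth:
        # Something was left open; that is only known at the very end.
        return False, len(text)
    return True, None


print(bracket_balance("[a [b] c]"))
print(bracket_balance("[a [b c]"))
print(bracket_balance("a] [b"))
```

a missing close shows up only once the whole input has been read, and the count says that something is missing without saying which open bracket it belonged to.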
> This is also noted in
>
> [http://www-128.ibm.com/developerworks/xml/library/x-syntax.html]
>
> <blockquote>
> The extra typing required to open and close tags and escape special
> characters not only wastes time, but introduces more possibility for
> error.
> </blockquote>
>
> I would also remark that there are ways to improve readability of () {} []
> cases over the XML syntax and this also applies to the <tag></> case.
>
> To summarize:
>
> Advantages of XML standard syntax:
>
> 1) Better readability in some special cases
>
> 2) In some special cases, the parser can find errors without parsing the
> full document. This is a real advantage only for large documents, and only if the
> error is at the very beginning. One can assume 1/4 of errors fall in the 4th (last)
> quarter of a doc and 1/4 in the 1st. Therefore, for each case where XML has a real
> advantage, you can find a case where it does not.
>
> Disadvantages of XML standard syntax:
>
> 1) It is easy to generate more errors.
>
> 2) To avoid the above point you need a special editor.
>
> 3) It is more verbose and less efficient for parsing.
>
> 4) It cannot deal with all cases, such as those addressed by ConciseXML (and others).
>
> 5) The (small) advantage of full end tags is lost when generalizing the
> system to non-hierarchies, one of the unsolved problems in XML. For instance,
> the GODDAG approach:
>
> <a/ ... <b/ ... /a> ... /b>
>
>
> Advantages of special syntaxes:
>
> I mean, SXML (), CanonML [], ConciseXML <tag></>, ..., XUL-C...
>
> 1) Very good readability, especially with long tag names and/or
> namespaces, or when markup length is several orders of magnitude that of
> the content.
>
> In my experience, XSLT becomes more readable when full end tags are omitted and
> the code is indented as shown here. Others think the same:
>
> <blockquote>
> XSLT is often considered to be too verbose. As stylesheet code grows, it
> tends to be unreadable.
> </blockquote>
>
> [http://www.xml.com/lpt/a/1226]
>
> 2) Lightweight parsers, especially with SXML and CanonML.
>
> 3) Less verbosity. And this _is_ a point with very large datuments,
> which is why some guides on the famous element vs. attribute dilemma recommend
> using attributes when size is an issue.
>
> 4) No need for special options such as <tag></tag> vs. <tag/>. Therefore
> the small advantage of simple XML approaches is lost with special
> syntaxes.
>
> Disadvantages of special syntaxes:
>
> 1) In some cases it is more difficult to find syntax errors.
>
> 2) In some cases and for some people the XML end tags increase readability.
>
> 3) If end tags are left as an option, then parsers become more complex than
> when parsing only XML.
>
> 4) Cannot deal with non-hierarchies in a direct way.
>
>
> Everyone can decide for her/himself. In some cases XML is good; in
> others you need alternative syntaxes or the </> option... There is no
> absolute answer even if some XML folks desire one.
>
>
> Juan R.
>
> Center for CANONICAL |SCIENCE)