   RE: xml spec 1.0 validity constraint for ID/IDREF

  • From: "gopi" <gopi@aztecsoft.com>
  • To: "Peter Murray-Rust" <peter@ursus.demon.co.uk>, "Xml-Dev" <xml-dev@xml.org>
  • Date: Wed, 29 Mar 2000 21:04:23 +0530

>> Because, att1 value has
>>to start with a letter, not a digit!!!.  Why is this so?
>>	If you look at the line, "A name must not appear more than once in an XML
>>document as a value of this type; i.e., ID values must uniquely identify
>>elements which bear them".  If I am not wrong, it only wants to make sure
>>that the ID value should be unique in the xml document.  But, why should
>>there be restriction like it "has" to start from letter or underscore, why
>>can't it be one of 0-9 ?
>The "metareason" is that that is how it is in SGML. Unlike you, I have had
>the experience of living through the XML specification and there were a
>number of things that were required to be compatible with SGML.

	Hmm... I doubt this. Have a look at
http://www.w3.org/TR/NOTE-sgml-xml-971215.  I don't know much about SGML,
but from this document it seems to me that XML is not just a subset of SGML;
it also puts some "restrictions" on SGML documents.  So an arbitrary SGML
document cannot be converted to XML automatically, because as it stands it
is not a well-formed (or valid) XML document.  XML therefore still differs
from SGML in certain ways.  Even though SGML is the base for the XML spec,
nothing guarantees that "every" SGML document is also an XML document (if
you parse it with an XML processor).  (If what I have written is wrong,
please correct me before somebody else misinterprets it.)
	So why should we still stick to this "buggy" definition of ID/IDREF?
An SGML document cannot be processed with an XML processor anyway.  If we
change the definition of ID/IDREF, we are not making any major change in

>>Otherwise the SGML community would not have used XML, and more importantly
>>the very
>>dedicated group of SGML experts would not have given their time (very
>>amounts) to XML and so it wouldn't exist. *Why* SGML decided on that I
>>no idea.
But now the industry hardly cares what happened with SGML.  It is concerned
with XML.
People would say "forget about SGML and let us fix XML's problems".
>It is illegal in SGML to define enumerated attributes which share a value.
><!ATTLIST FOO bar (yes|no) #IMPLIED>
><!ATTLIST FOO plugh (yes|no) #IMPLIED>
>is an error in SGML. Why? I believe because documents can be minimised to:
><FOO yes>
>(no =, no quotes, no attribute name) and this was to save typing. So maybe
>the NAME restrictions stem from a similar reason.
	Now I get it.  So what I said is correct: not every SGML document is a
"well-formed" XML document, and many SGML features are invalid in XML.
So let's do the same thing with ID/IDREF too.
>>		If I change the value to
>>			<root att1="a10" att2="234"/> , it works fine!!!!   It looks stupid for
>>me.  XML is not allowing to "represent" my original data as it is, it
>>expects me to do some manipulations with my data :-(.
>Yes, but att2 is not an SGML ID.
	I included att2 only as one more attribute in the example.  It has nothing
to do with the point I am making.
>>	I will just take two examples where this can crib lot of data.
>>	example 1:
>>		Suppose I have data in a database table and want to convert it into XML
>>document and convert the table schema to equivalent DTD.
>>Then, the primary key column (assuming there is only column as primary
>>in table will be equal to ID in DTD.  If my primary key in database proves
>>to be integer or float , what should I do? I can't make corresponding
>>attribute as ID or I should prefix every value for that column with some
>>character 'x'.  Why should I do this?
>The answer is that SGML/XML never intended that the ID was mapped onto
>database keys. It is assumed that the "application" has some means of
>making SGML IDs unique within a document.
	Yes, it assumes "uniqueness", but why should it put a restriction on the
value of that unique ID?  The application will take care of uniqueness
whether the value is an integer, a float or a string.
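
The prefixing workaround mentioned in the quoted example can be sketched as follows. This is only an illustration: the helper names key_to_id and id_to_key and the prefix "x" are made up for this sketch, and NAME_RE is a simplified, ASCII-only approximation of the XML 1.0 Name production (the real production also admits many non-ASCII letters).

```python
import re

# Simplified, ASCII-only approximation of the XML 1.0 Name production
# (assumption: the real production also allows many non-ASCII letters).
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*\Z")


def key_to_id(key, prefix="x"):
    """Turn a database key such as 10 or 3.5 into a string usable as an ID."""
    candidate = prefix + str(key)
    if not NAME_RE.match(candidate):
        # For example a key containing a space cannot be fixed by a prefix.
        raise ValueError("cannot turn %r into a Name" % (key,))
    return candidate


def id_to_key(id_value, prefix="x"):
    """Strip the prefix again; the caller converts back to int or float."""
    if not id_value.startswith(prefix):
        raise ValueError("%r does not carry the prefix %r" % (id_value, prefix))
    return id_value[len(prefix):]


if __name__ == "__main__":
    print(key_to_id(10))     # integer primary key
    print(key_to_id(3.5))    # float primary key
    print(id_to_key("x10"))  # back to the original key text
```

Always adding the prefix, even to keys that already start with a letter, keeps the mapping reversible, which is exactly the extra manipulation of the data being objected to here.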
>All that XML V1.0 guarantees is
>that if you try to validate an XML document which contains attributes of
>type ID which have non-unique IDs it is entitled to throw an error of some
	No, the XML processor is not doing just this; it is also doing that "extra
thing" of checking whether the first character of the value is a letter or
not :-(, since that is what the XML spec specifies.
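
To make the two separate checks concrete, here is a rough sketch of what a validating processor has to do for attributes of type ID: check that each value is a Name, and check that no value occurs twice. The function name check_ids is hypothetical, the sample value "10" is only an illustration, and NAME_RE is again a simplified ASCII-only stand-in for the Name production, not the spec's full definition.

```python
import re

# Simplified, ASCII-only approximation of the XML 1.0 Name production
# (assumption: the real production also allows many non-ASCII letters).
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*\Z")


def check_ids(values):
    """Return (value, problem) pairs for the ID values of one document."""
    problems = []
    seen = set()
    for value in values:
        if not NAME_RE.match(value):
            # The "extra thing": the value must match the Name production.
            problems.append((value, "not a Name"))
        elif value in seen:
            # The uniqueness constraint on ID values.
            problems.append((value, "duplicate"))
        seen.add(value)
    return problems


if __name__ == "__main__":
    # "10" fails the Name check; the second "a10" fails uniqueness.
    print(check_ids(["a10", "10", "a10"]))
```

Dropping the first branch would leave only the uniqueness check, which is the behaviour being argued for in this thread.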
>>		ID should only make sure that value is "unique", but shouldn't put any
>>restriction on the "value" of that corresponding attribute.  The one who
>>writes the xml document has to make sure that, he is writing unique values
>>to that attribute.  Whether he writes an integer, float or string
>>really be the constraint.  The XML processor "should" not try to put this
>>restriction that you got to start the value with letter or underscore or
>This is the law. You have to obey.
	I don't agree with you on this.  If it were hurting only my application,
yes, I would have obeyed.  But it is a major problem, so I would say "the
law can be refined" :-)
>It is often not realised that XML V1.0 is a specification of document
>syntax - NOT a set of instructions on how that document should be processed
>(other than by a parser). It is because the spec is so insistent on not
>giving any guidance on implementation that I set up XML-DEV and shouted for
>things like SAX. Unique IDs are another things that we really need some
>common tools for. I would like to have an editor (yes, an editor) which
>among other things filled in unique IDs for me. I shall keep shouting.
	Instead of adding one more layer of implementation, it would be better to
change the definition in the XML spec itself.  I am sure the W3C has
responded positively to XML-DEV activities.  They won't be so rigid as to
make us add one more layer of specification or implementation on top of the
XML spec.
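
For reference, the kind of tool layer under discussion (something that fills in unique IDs) could be as small as the sketch below. It rests on assumptions: the function name fill_ids is hypothetical, the ID attribute is assumed to be called "id" (ElementTree does not read the DTD, so it cannot know which attribute is declared as type ID), and generated values use the made-up prefix "id" so that they start with a letter.

```python
import xml.etree.ElementTree as ET


def fill_ids(root, attr="id", prefix="id"):
    """Give every element without an `attr` attribute a fresh, unique value."""
    # Collect the values already present so generated ones never collide.
    used = {el.get(attr) for el in root.iter() if el.get(attr) is not None}
    counter = 0
    for el in root.iter():
        if el.get(attr) is None:
            counter += 1
            while "%s%d" % (prefix, counter) in used:
                counter += 1
            new_id = "%s%d" % (prefix, counter)
            el.set(attr, new_id)
            used.add(new_id)
    return root


if __name__ == "__main__":
    doc = ET.fromstring('<root><a id="id1"/><b/><c/></root>')
    fill_ids(doc)
    print(ET.tostring(doc, encoding="unicode"))
```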

>No, it isn't, is it! There is still a lot of generic tool development to be
	I would say the W3C will understand this problem and make this development
unnecessary.

>>  Then I would prefer to throw away
>>XML as intermediate format itself and go away with my original data format
>>itself :-)
>It is a good example of the fact that XML is not a magic bullet which
>solves all your problems. But it does help to identify them and XML-DEV
>helps to encourage people to solve them.
	But XML "has" to become that magic bullet, since it has all the qualities
to become one; it just needs some refinement.  It has already made so much
news in the industry that everyone is trying to solve their problems with
XML, and it is not possible to stop them now.  XML has to live up to their
expectations in order to last.
>>	This is my opinion, would anyone give me convincing answer for this?
>>		OR
>>	Can I say this is a "BUG" in xml specification 1.0?
>You might say that if the current XML/W3C/XML-DEV community were developing
>XML *now*, they might tackle IDs and NAMEs differently. But IDs were/are
>used in SGML and presumably are not seen as bugs.
	I would say SGML was not intended for this kind of data, but XML "has"
become the standard for all kinds of data.  So XML needs some "refining" to
keep solving the problems that occur with such data.

Gopinath M.R.
Software Engineer,
Aztec software and technology services (P) Ltd (www.aztec.soft.net )
Bangalore -560078
Ph : 91-080-5522892/93, 5532036,
5533725, 5533649
email : gopi@aztecsoft.com
