- From: "Norbert H. Mikula" <firstname.lastname@example.org>
- To: email@example.com
- Date: Mon, 10 Mar 1997 15:12:16 -0800
> Most programming languages talk explicitly about tokenisation,
> or tokenization if you prefer :-), and in doing so explain how
> the sequence of tokens that a compiler (say) sees is derived from
> an input stream. Usually, comments are stripped at this stage,
> and in languages such as C or SGML that have (in effect) macros,
> the macros are expanded at input time.
I don't think that C and SGML/XML use, or rather can use, the
same principle of includes/macros.
C uses a pre-processor that resolves includes; the actual
compiler then starts without having to worry about includes
anymore. (To my understanding of things.)
For practical reasons, at least for XML processors in online
browsers, I think we don't want to first do all the includes and
only then do the parsing, keeping all that stuff in memory while
we do so.
Furthermore, I see problems arising if we have the following scenario:
<!ENTITY % UnixSpecifics SYSTEM "http....">
<!ENTITY % DosSpecifics SYSTEM "http....">
<!ENTITY % Unix 'INCLUDE'>
<!ENTITY % Dos 'IGNORE'>
That is too much to do for a pre-processor, I guess: it can, or
at least should, include the appropriate external
entity only after it has parsed and resolved the content
of %Dos and %Unix.
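Presumably the declarations above would then drive conditional marked
sections in the DTD. The marked-section usage below is my sketch of the
intended scenario (entity names taken from the example; URLs elided as
in the original):

```xml
<!-- Declarations as in the scenario above -->
<!ENTITY % UnixSpecifics SYSTEM "http....">
<!ENTITY % DosSpecifics SYSTEM "http....">
<!ENTITY % Unix 'INCLUDE'>
<!ENTITY % Dos 'IGNORE'>

<!-- The processor can only decide which external entity to pull in
     after it has parsed %Unix; and %Dos; down to INCLUDE/IGNORE -->
<![%Unix;[ %UnixSpecifics; ]]>
<![%Dos;[ %DosSpecifics; ]]>
```

A pre-processor that blindly fetched both SYSTEM entities up front would
do wasted work; one that fetched neither could not expand the marked
sections. Either way, it cannot avoid parsing the declarations first.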
I am not sure whether I have addressed what you had in mind,
but I do believe that XML is too smart for a pre-processor,
so we need other ways to look at PE (parameter entity) resolution.
Norbert H. Mikula
= SGML, DSSSL, Intra- & Internet, AI, Java
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to firstname.lastname@example.org the following message;
List coordinator, Henry Rzepa (email@example.com)