   Refreshed genx.h, plus some plans


I just refreshed http://www.tbray.org/ongoing/genx/genx.h.

Most of the changes are obvious and reflect the discussion here, but I 
probably missed something so another pair of eyes would be welcome.

Perhaps most interesting, I decided that what the genxText method 
really needed was polymorphism, but this is C, so now we have

int genxText(genxWriter w, const utf8Byte * start);
int genxCountedText(genxWriter w, const utf8Byte * start, int byteCount);
int genxBoundedText(genxWriter w, const utf8Byte * start, utf8Byte * end);

The first case is null-terminated.  This is going to add maybe ten 
lines of (easy) code to the implementation and it gives everyone what 
they want, and it will be faster too.
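A minimal sketch of how the three entry points might share one implementation. Everything beyond the three function names is an assumption made for illustration: the sketchWriter buffer stands in for the real writer, the status codes are invented, the end pointer is declared const (unlike the declaration above) so the wrappers need no casts, and only three markup characters are escaped, with no UTF-8 validation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char utf8Byte;

/* Hypothetical stand-in for the real writer: collects output in a buffer. */
typedef struct {
    char   buf[256];
    size_t len;
} sketchWriter;
typedef sketchWriter *genxWriter;

enum { SKETCH_OK = 0, SKETCH_OVERFLOW = 1 };  /* assumed status codes */

/* Core routine: write the bytes in [start, end), escaping <, > and &. */
int genxBoundedText(genxWriter w, const utf8Byte *start, const utf8Byte *end)
{
    assert(w != NULL && start != NULL && end >= start);
    for (const utf8Byte *p = start; p < end; p++) {
        const char *out;
        char single[2] = { (char)*p, '\0' };
        switch (*p) {
        case '<': out = "&lt;";  break;
        case '>': out = "&gt;";  break;
        case '&': out = "&amp;"; break;
        default:  out = single;  break;
        }
        size_t n = strlen(out);
        if (w->len + n >= sizeof w->buf)   /* keep room for the NUL */
            return SKETCH_OVERFLOW;
        memcpy(w->buf + w->len, out, n);
        w->len += n;
        w->buf[w->len] = '\0';
    }
    return SKETCH_OK;
}

/* Counted form: the caller supplies the length in bytes. */
int genxCountedText(genxWriter w, const utf8Byte *start, int byteCount)
{
    assert(byteCount >= 0);
    return genxBoundedText(w, start, start + byteCount);
}

/* Null-terminated form: find the end, then delegate. */
int genxText(genxWriter w, const utf8Byte *start)
{
    return genxBoundedText(w, start, start + strlen((const char *)start));
}
```

With this layout the two wrappers cost a couple of lines each, and a caller that already knows the length avoids a scan for the terminator.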

I'll do the I/O abstraction as suggested, but I'm not going to 
pre-design it, I'm going to write the code first so I understand what a 
reasonable balance is between the kind of I/O primitive the code needs 
and what you can ask a caller to provide.

I think I've decided that the namespace handling is wrong.  So I 
propose adding a new call

int genxDeclareNamespace(genxWriter w, utf8Byte * uri, utf8Byte * prefix);

If prefix == NULL, genx will generate one.  An empty prefix
(prefix == "") is not allowed.

Then the calls with namespaces lose the separate prefix argument, and 
genx fills in the declarations as appropriate.  This feels more like 
the way that people actually think about writing and reading XML docs.  
Does anyone see a reason not to do this?
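A rough sketch of the proposed call, under several assumptions that are not part of the proposal: the writer is a hypothetical fixed-size table, the status codes are invented, generated prefixes take the form g1, g2, ..., the pointer arguments are const, and over-long strings are silently truncated. The sketchPrefixFor helper is also hypothetical; it shows how later calls could look up the prefix from the URI alone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned char utf8Byte;

enum { NS_OK = 0, NS_BAD_PREFIX = 1, NS_TABLE_FULL = 2 };  /* assumed codes */
enum { MAX_NS = 8, MAX_NAME = 64 };

typedef struct {
    char uri[MAX_NAME];
    char prefix[MAX_NAME];
} nsEntry;

/* Hypothetical writer state: just the namespace table. */
typedef struct {
    nsEntry table[MAX_NS];
    int     count;
    int     nextGenerated;   /* counter behind generated prefixes */
} sketchWriter;
typedef sketchWriter *genxWriter;

int genxDeclareNamespace(genxWriter w, const utf8Byte *uri,
                         const utf8Byte *prefix)
{
    assert(w != NULL && uri != NULL);
    if (prefix != NULL && prefix[0] == '\0')
        return NS_BAD_PREFIX;            /* prefix == "" is not allowed */
    if (w->count == MAX_NS)
        return NS_TABLE_FULL;

    nsEntry *e = &w->table[w->count];
    /* snprintf truncates over-long input; a real version would report it */
    snprintf(e->uri, sizeof e->uri, "%s", (const char *)uri);
    if (prefix == NULL)                  /* NULL: generate a prefix */
        snprintf(e->prefix, sizeof e->prefix, "g%d", ++w->nextGenerated);
    else
        snprintf(e->prefix, sizeof e->prefix, "%s", (const char *)prefix);
    w->count++;
    return NS_OK;
}

/* Hypothetical helper: find the prefix declared for a URI, or NULL. */
const char *sketchPrefixFor(genxWriter w, const char *uri)
{
    for (int i = 0; i < w->count; i++)
        if (strcmp(w->table[i].uri, uri) == 0)
            return w->table[i].prefix;
    return NULL;
}
```

Because the prefix is recorded against the URI once, the element and attribute calls need only the URI, which is the point of dropping their separate prefix argument.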

NEW: Making it fast

It dawned on me that with all the checking and so on, genx might not be 
as fast as it could possibly be.  The way to fix this is obvious, but 
it might amount to premature optimization.  The idea is that you 
predeclare your elements and attributes and get handles to them, so 
that they only need to be namechecked and sorted once.  Something like:

genxElement genxDeclareElement(genxWriter w, utf8Byte * namespaceURI, utf8Byte * type);
genxAttribute genxDeclareAttribute( ... same args ... );

Then you have

int genxFastStartElement(genxWriter w, genxElement element);
int genxFastAttribute(genxWriter w, genxAttribute attribute, utf8Byte * value);
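The handle idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the eventual implementation: sketchWriter, sketchElement, the status codes and the nameChecks counter are invented, the name check is ASCII-only (real XML names allow far more), the namespace argument is ignored, and attributes are left out, so the start tag is closed immediately.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char utf8Byte;

enum { FAST_OK = 0, FAST_OVERFLOW = 1 };   /* assumed status codes */

typedef struct { char name[64]; } sketchElement;  /* checked once, reused */
typedef sketchElement *genxElement;

/* Hypothetical writer: output buffer plus a table of declared elements. */
typedef struct {
    char          buf[256];
    size_t        len;
    sketchElement elements[8];
    int           elementCount;
    int           nameChecks;   /* counts how often a name was validated */
} sketchWriter;
typedef sketchWriter *genxWriter;

/* Simplified, ASCII-only stand-in for the XML Name check. */
static int sketchIsName(const char *s)
{
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return 0;
    for (s++; *s != '\0'; s++)
        if (!(isalnum((unsigned char)*s) || *s == '_' || *s == '-' || *s == '.'))
            return 0;
    return 1;
}

/* Validate the name once and hand back a handle; NULL on failure. */
genxElement genxDeclareElement(genxWriter w, const utf8Byte *namespaceURI,
                               const utf8Byte *type)
{
    (void)namespaceURI;   /* namespaces are left out of this sketch */
    const char *name = (const char *)type;
    w->nameChecks++;
    if (!sketchIsName(name)
        || strlen(name) >= sizeof w->elements[0].name
        || w->elementCount == 8)
        return NULL;
    genxElement e = &w->elements[w->elementCount++];
    strcpy(e->name, name);
    return e;
}

/* Fast path: no name checking, just copy the already-validated name. */
int genxFastStartElement(genxWriter w, genxElement element)
{
    size_t n = strlen(element->name);
    if (w->len + n + 2 >= sizeof w->buf)   /* '<', name, '>', then NUL */
        return FAST_OVERFLOW;
    w->buf[w->len++] = '<';
    memcpy(w->buf + w->len, element->name, n);
    w->len += n;
    w->buf[w->len++] = '>';
    w->buf[w->len] = '\0';
    return FAST_OK;
}
```

The cost of validation is paid per declaration rather than per start tag, so a document that repeats the same element thousands of times checks its name only once.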

I think that with a little bit of care in the code, this should 
generate guaranteed-WF canonical XML at speeds close enough to the most 
deranged pedal-to-the-metal custom C code to vanish in the static of 
any conceivable application.  Would this be premature optimization?

-Tim
