   RE: XML pretty printing in HTML

  • From: Matt Sergeant <matt.sergeant@bbc.co.uk>
  • To: "'Warren Hedley'" <w.hedley@auckland.ac.nz>, xml-dev@ic.ac.uk
  • Date: Tue, 6 Jul 1999 10:52:20 +0100

> -----Original Message-----
> From: Warren Hedley [mailto:w.hedley@auckland.ac.nz]

> I'm on the lookout for a tool that will allow me to pretty print
> XML files in such a manner that they can be pasted into an HTML
> page. For example, I would like to transform
> 
> <element1 name="foo">
>   <element2 name="bar"
>             longName="barNone"/>
> </element1>
> 
> to (hope the line wrapping doesn't make this too confusing)
> 
> <pre style="font:12pt monospace">
> &lt;<font color="#ff0000">element1</font> <font 
> color="#00ff00">name</font>=&quot;<font
> color="#0000ff">bar</font>&quot;&gt;
> ...
> 
> You get the idea. All attributes are coloured red, attribute values blue,
> element names green, CDATA yellow, etc. (Yuck, that looks hideous).
> Whitespace is preserved, so that "longName" is under "name" in element2.
>
> Preferably, all of this without having to do a bunch of coding in XSL or
> otherwise. Surely, someone must have already done this - it would be a
> pretty simple PERL script given a well-formed file. I had a look on
> xmlsoftware.com but couldn't find anything.

OK. Here's an attempt:

use strict;
use warnings;
use XML::Parser;

# Escape text so it can be embedded safely in HTML.
sub esc {
	my $s = shift;
	$s =~ s/&/&amp;/g;
	$s =~ s/</&lt;/g;
	$s =~ s/>/&gt;/g;
	$s =~ s/"/&quot;/g;
	return $s;
}

print XML::Parser->new(Handlers => {
	# Handlers using closures, except Start 'cos it's more complex.
	# $_[0] is the expat object where we store the HTML output
	Init => sub { $_[0]->{html} = '<pre style="font:12pt monospace">' },
	# Close the <pre>; parsefile() returns whatever Final returns
	Final => sub { return $_[0]->{html} . '</pre>' },
	Start => \&start,
	End => sub { $_[0]->{html} .= '&lt;/<font color="green">' . $_[1] .
		'</font>&gt;' },
	# The parser hands us decoded text, so escape it again for HTML
	Char => sub { $_[0]->{html} .= '<b>' . esc($_[1]) . '</b>' },
	CdataStart => sub { $_[0]->{html} .= '<font color="yellow">' },
	CdataEnd => sub { $_[0]->{html} .= '</font>' },
	}
)->parsefile($ARGV[0]);

sub start {
	my $expat = shift;
	my $element = shift;

	$expat->{html} .= '&lt;<font color="green">' . $element . '</font>';
	# The rest of @_ is name/value pairs in document order;
	# copying them into a hash would lose that order.
	while (@_) {
		my ($name, $value) = splice(@_, 0, 2);
		$expat->{html} .= ' <font color="red">' . $name .
			'</font>=&quot;<font color="blue">' .
			esc($value) .
			'</font>&quot;';
	}
	$expat->{html} .= '&gt;';
}

I've written it so that it builds up the HTML in a string instead of printing
it as it goes, so you can embed it in some other application (presumably you
want to print out something as well as just the XML - like body and html
tags), but you could easily modify it. If using it in another app, change
"print XML::Parser..." to "my $html = XML::Parser..." and do something with
$html.

Have fun.

BTW: An XML parser doesn't report the whitespace between attributes, so the
layout inside a start tag ("longName" lined up under "name") is lost. I guess
you could dump what I've done here for a large regexp system if you really
need that feature.
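
For anyone who does need the whitespace kept, here is a rough sketch of what such a regexp system might look like. It is not part of the script above: the sub names (esc, colour, colour_attr, colour_tag, xml_to_html) are made up for illustration, and it assumes well-formed input with no internal DTD subset. It works on the raw source text, so whitespace inside tags survives, and entity references are shown as written rather than decoded. The colours follow the handlers above.

```perl
use strict;
use warnings;

# Escape text so it can be embedded safely in HTML.
sub esc {
    my $s = shift;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Wrap escaped text in a <font> tag of the given colour.
sub colour {
    my ($colour, $text) = @_;
    return qq{<font color="$colour">} . esc($text) . '</font>';
}

# Colour one attribute: name red, value blue, quotes and "=" left plain.
sub colour_attr {
    my ($name, $equals, $quote, $value) = @_;
    return colour('red', $name) . $equals . esc($quote)
         . colour('blue', $value) . esc($quote);
}

# Colour a start, end, or empty-element tag, keeping its whitespace as is.
sub colour_tag {
    my $tag = shift;
    my ($open, $name, $rest, $close) =
        $tag =~ m{^(</?)([^\s/>]+)(.*?)(/?>)$}s
        or return esc($tag);    # not a tag we understand: just escape it
    $rest =~ s{([^\s=]+)(\s*=\s*)(["'])(.*?)\3}{colour_attr($1, $2, $3, $4)}gse;
    return esc($open) . colour('green', $name) . $rest . esc($close);
}

# Split the document into markup and text, and colour each piece.
sub xml_to_html {
    my $xml = shift;
    # Alternatives in order: CDATA section, comment, any other <...> markup
    # (quoted attribute values may contain ">").
    my @tokens = split
        m{(<!\[CDATA\[.*?\]\]>|<!--.*?-->|<(?:[^>"']|"[^"]*"|'[^']*')*>)}s,
        $xml;
    my $html = '<pre style="font:12pt monospace">';
    foreach my $token (@tokens) {
        next if $token eq '';
        if    ($token =~ /^<!\[CDATA\[/) { $html .= colour('yellow', $token) }
        elsif ($token =~ /^<[!?]/)       { $html .= esc($token) }
        elsif ($token =~ /^</)           { $html .= colour_tag($token) }
        else                             { $html .= '<b>' . esc($token) . '</b>' }
    }
    return $html . '</pre>';
}

# Demo on the sample from the original question.
my $sample = <<'XML';
<element1 name="foo">
  <element2 name="bar"
            longName="barNone"/>
</element1>
XML

print xml_to_html($sample), "\n";
```

To run it on a file instead, slurp the file into a string and pass that to xml_to_html. Since no parser is involved, nothing checks that the input is well-formed.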

Matt.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)






 
