Re: [xml-dev] sets of parsing rules
- From: Rick Marshall <rjm@zenucom.com>
- To: "Nathan Young -X (natyoung - Artizen at Cisco)" <natyoung@cisco.com>
- Date: Thu, 08 Feb 2007 10:21:40 +1100
Linux? (bash, Tcl, Perl, etc.)
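
One way to read the suggestion above is to treat every stage as a plain text-in, text-out filter, so that regexp steps, external commands, and XML steps can all sit in the same chain, with one chain per page type. The following is only a minimal sketch of that idea in Python rather than shell; the names `regex_step`, `command_step`, `xml_step`, `PIPELINES`, and the page types "legacy" and "xhtml" are hypothetical and do not come from the original message.

```python
import re
import subprocess
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Every step has the same shape: take the page as text, return the page as text.
Step = Callable[[str], str]


def regex_step(pattern: str, repl: str) -> Step:
    """Non-XML step: a regular-expression substitution on the raw text."""
    compiled = re.compile(pattern)
    return lambda text: compiled.sub(repl, text)


def command_step(args: List[str]) -> Step:
    """Non-XML step: pipe the text through an external filter program.

    Any command that reads stdin and writes stdout would fit here
    (a tidying tool, a Perl script, and so on).
    """
    def step(text: str) -> str:
        result = subprocess.run(
            args, input=text, capture_output=True, text=True, check=True
        )
        return result.stdout
    return step


def xml_step(transform: Callable[[ET.Element], None]) -> Step:
    """XML step: parse, modify the tree in place, serialize back to text."""
    def step(text: str) -> str:
        root = ET.fromstring(text)  # raises ParseError if not well-formed
        transform(root)
        return ET.tostring(root, encoding="unicode")
    return step


def rename_b_to_strong(root: ET.Element) -> None:
    """Illustrative tree edit: turn every <b> element into <strong>."""
    for element in list(root.iter("b")):
        element.tag = "strong"


# One ordered list of steps per page type (hypothetical page types).
PIPELINES: Dict[str, List[Step]] = {
    "legacy": [
        regex_step(r"<br>", "<br/>"),  # close void elements so the page parses
        xml_step(rename_b_to_strong),
    ],
    "xhtml": [
        xml_step(rename_b_to_strong),
    ],
}


def run_pipeline(page_type: str, text: str) -> str:
    """Run the page through the steps registered for its type, in order."""
    for step in PIPELINES[page_type]:
        text = step(text)
    return text
```

Supporting a new page type then means adding one entry to `PIPELINES`, and the rest of the application only ever calls `run_pipeline`. An XSLT stage could be wrapped in the same way with `command_step`, assuming a command-line XSLT processor that reads stdin and writes stdout is installed.
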
Nathan Young -X (natyoung - Artizen at Cisco) wrote:
> Hi.
>
> I have seen parts of this question addressed, but I think it's worth
> asking the whole question anyway. I'm sure others have run into this
> problem, but I haven't been able to dig up any best practices in my
> searching so far. I may just need to search with the right terminology,
> in which case this should be an easy one for someone who already
> knows...
>
> I have an application that parses a large number of HTML pages. A few
> of them are well-formed XHTML, but that's the exception rather than the
> rule. By grabbing pages, manipulating them a bit (regexps have been
> sufficient here so far), then tidying them I can get them to a state
> where they are parsable XML. From there I can use XSL to get them the
> rest of the way (although I have a process that allows me to run regexps
> here too, supplementing XSLT 1.0).
>
> The wrinkle is that I have several kinds of pages, each one requiring a
> distinct set of steps in order to parse it. I'm starting down the road
> of modularizing the transforms so that I can handle more page types over
> time in a way that's transparent to the rest of my application.
>
> I've been exposed to XML-only pipelines. Are there pipeline tools that
> allow for non-XML steps?
>
> ------------>Nathan
>
> Nathan Young
> Cisco.com->Interface Development
> A: ncy1717
> E: natyoung@cisco.com