OASIS Mailing List Archives

   Re: [xml-dev] RE: Why is there little usage of XML on the "visibleWeb"?


On Thu, 2006-07-20 at 10:34 -0700, Guillaume Lebleu wrote:
> Roger,
> I think there is not a lot of XML on the visible web because:
>       * HTML = good guarantee against mass-copyright infringement: It
>         makes it harder for 3rd parties to automatically extract and
>         re-purpose data/content without the authorization of the
>         publisher (ex. index it so that a search engine could provide
>         the ability to perform rich queries). 
>       * Because of this, today the publisher has an internal XML that
>         is published in HTML ("encrypted" format) to the public, and
>         in XML to distributors, or a mix  of XML (RSS) focusing on
>         metadata and HTML.
>       * Writing XML+XSL requires skills that are not as widely
>         available as writing HTML.
> Basically, it's an all-or-nothing approach to me.

Depends, I guess, on what you think XML is actually to be used for. What
is its purpose?

I used to think the web was a bit stupid, but over time I've come to see
how good it is in representing language information.

Some people use xml as config files. That's ok.

> But, why are the two approaches you give incompatible? why can't we
> have what I would call "Semantic Tagging" of HTML:
> <html:HTML>
>     <html:body>
>         <html:ul>
>             <ia:grocerylist>
>             <html:li><ia:fruit>Orange</ia:fruit></html:li>
>             <html:li><ia:meat>Chicken</ia:meat></html:li>
>             <html:li><ia:vegetable>Corn</ia:vegetable></html:li>
>             </ia:grocerylist>
>         </html:ul>
>     </html:body>
> </html:HTML>

The way I do it is like this...

 <Product Information>
  <Product Item> Name&="Orange" List_Price$=0.80 </Product Item>
  <Product Item> Name&="Meat" List_Price$=8.90 Unit&="kg"</Product Item>
  <Product Item> 
    Name&="Vegetable" List_Price$=2.90 Unit&="kg"
  </Product Item>
 </Product Information>
Anyway, that's how I would do it. Couldn't resist adding in prices.

> I think this would not be too steep a learning curve for people
> knowing HTML only to learn it, and choose which data they want to
> "tag" for others to extract/process,

Depends what the application actually is. There just comes a point where
you get to information overload and need to resort to automated tools to
handle it all.

> Also, I'm not an XML parsing expert, but I believe this may require a
> couple changes on how some XML technologies work, but I imagine it
> would not be a revolution either.

That's what I say also but nobody listens :-) Just turn up the music
like I do or get used to being ignored.

But seriously, sometimes you have to do things yourself, otherwise you
are at the mercy of others.

> That way, an HTML browser can keep only the content enclosed in the
> namespace it understands and display it, while upon selection by the
> user an "ia" compatible browser plugin could extract the data and
> import it, or a completely separate application could aggregate all the
> grocery lists in the world in one searchable database...

Oh yes, I like it....
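
The extraction step described in the quote could look roughly like this
in Python. This is only a sketch: the namespace URI for "ia" is made up
(the snippet above never declares one), and the helper name
extract_items is hypothetical. It keeps only leaf elements in the "ia"
namespace and ignores the HTML wrapper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the quoted snippet does not declare any.
HTML_NS = "http://www.w3.org/1999/xhtml"
IA_NS = "http://example.org/ia"  # hypothetical

DOC = f"""<html:HTML xmlns:html="{HTML_NS}" xmlns:ia="{IA_NS}">
  <html:body>
    <html:ul>
      <ia:grocerylist>
        <html:li><ia:fruit>Orange</ia:fruit></html:li>
        <html:li><ia:meat>Chicken</ia:meat></html:li>
        <html:li><ia:vegetable>Corn</ia:vegetable></html:li>
      </ia:grocerylist>
    </html:ul>
  </html:body>
</html:HTML>"""


def extract_items(xml_text):
    """Return (category, text) pairs for leaf elements in the ia namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + IA_NS + "}"
    items = []
    for elem in root.iter():
        # Skip containers such as ia:grocerylist; keep only tagged data.
        if elem.tag.startswith(prefix) and len(elem) == 0:
            category = elem.tag[len(prefix):]
            items.append((category, (elem.text or "").strip()))
    return items


if __name__ == "__main__":
    for category, name in extract_items(DOC):
        print(category, name)
```

An HTML-only consumer would do the opposite and drop everything outside
the HTML namespace.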

You need to convert all the XML into something that will go into a
database. That would be SQL...

  insert into product_items (product_name, list_price)
  values ('Orange', 0.80);

  insert into product_items (product_name, list_price, unit)
  values ('Meat', 8.90, 'kg');

  insert into product_items (product_name, list_price, unit)
  values ('Vegetable', 2.90, 'kg');

So it's really a transformation issue.
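
A minimal sketch of that transformation in Python, using the standard
sqlite3 module as a stand-in for whatever database you prefer. The
well-formed XML below, with its element and attribute names, is an
assumption made for the example and not the format shown earlier; the
function names are hypothetical too.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed well-formed rendering of the product list (names are hypothetical).
PRODUCTS_XML = """
<ProductInformation>
  <ProductItem name="Orange" list_price="0.80"/>
  <ProductItem name="Meat" list_price="8.90" unit="kg"/>
  <ProductItem name="Vegetable" list_price="2.90" unit="kg"/>
</ProductInformation>
"""


def xml_to_rows(xml_text):
    """Turn each ProductItem into a (name, price, unit) tuple; unit may be None."""
    root = ET.fromstring(xml_text)
    return [
        (item.get("name"), float(item.get("list_price")), item.get("unit"))
        for item in root.iter("ProductItem")
    ]


def load(conn, rows):
    """Create the table if needed and insert the rows with bound parameters."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product_items "
        "(product_name TEXT, list_price REAL, unit TEXT)"
    )
    conn.executemany(
        "INSERT INTO product_items (product_name, list_price, unit) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, xml_to_rows(PRODUCTS_XML))
```

Bound parameters (the question marks) avoid building SQL strings by hand,
which is where quoting mistakes usually creep in.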

Your database format could be MS Access, MySQL, DB2, whatever. In years
gone by a database was expensive. Now you can get MySQL + Linux + an
old computer for $25. It's not an issue of cost. Or, alternatively, just
use an MDB file on the C drive. All new Windows XP PCs come with the
database drivers installed, so it's not a big issue.

Traditional designs do the XML import into a production table. But I say
just load it into one big table outside the production system and do all
the searches on that.


 select * from product_items
 where (product_name = 'Meat') and (list_price < 11);

and stream the price updates into the database...
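
Roughly, the search and the price stream could be sketched like this in
Python with sqlite3 standing in for the real database. The function
names and the shape of the update feed, (product_name, new_price) pairs,
are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One big table kept outside the production system.
conn.execute(
    "CREATE TABLE product_items "
    "(product_name TEXT PRIMARY KEY, list_price REAL, unit TEXT)"
)
conn.executemany(
    "INSERT INTO product_items (product_name, list_price, unit) VALUES (?, ?, ?)",
    [("Orange", 0.80, None), ("Meat", 8.90, "kg"), ("Vegetable", 2.90, "kg")],
)


def find_below(conn, name, max_price):
    """Return rows for a product whose list price is under max_price."""
    return conn.execute(
        "SELECT product_name, list_price, unit FROM product_items "
        "WHERE product_name = ? AND list_price < ?",
        (name, max_price),
    ).fetchall()


def apply_price_updates(conn, updates):
    """Apply an iterable of (product_name, new_price) pairs to the table."""
    conn.executemany(
        "UPDATE product_items SET list_price = ? WHERE product_name = ?",
        ((price, name) for name, price in updates),
    )
    conn.commit()


if __name__ == "__main__":
    print(find_below(conn, "Meat", 11))
    apply_price_updates(conn, [("Meat", 12.50)])
    print(find_below(conn, "Meat", 11))
```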

> Well, ok, a grocery list may not be the best example, 

I love that example.

One day when I get organised I think I will jump on an aeroplane and go
to another island. I'm quite keen to head to Europe at the end of the
year and see what is happening there.

We do all this stuff and are pretty close to making an alpha release.
Just join the list.


David Lyon
Project Manager
PreisShare OpenSource Project




Copyright 2001 XML.org. This site is hosted by OASIS