- From: "Adam M. Donahue" <adam@cyber-guru.com>
- To: xml-dev@ic.ac.uk
- Date: Sun, 7 Jun 1998 19:38:59 -0400
Hi all,
This is my first posting to xml-dev, as I'm still getting comfortable
with the XML specification. However, as an exercise I have begun
putting together a first XML DTD (which I hope to post soon), one
I'm tentatively calling the "Site Map Definition Language," or SMDL.
(The name will most likely change.)
There are currently hundreds of thousands (if not millions) of grouped
collections of documents on the WWW, which we often refer to
individually as "sites." We can use XML to define the various types
of information out there in a uniform language. We can further
classify these different types of documents into groups, the most
common of which right now is the idea of the Web site. Right now,
the structure of any given site is usually laid out in a tree of
documents. However, we have not yet seen a uniform way of
expressing--in a single document--the contents of this tree. This
has resulted in non-standard "site maps," which vary greatly
between different locations on the Web. Compounding this is the
fact that these maps, though often friendly to the user surfing the
Web, are highly machine-unreadable (which is, of course, a general
problem with HTML). That is, automated web robots dispatched by
the major search engines have no easy way of gaining quick
access to the layout of an individual site. These engines must then
resort to recursively searching sites for linked documents. This
poses a problem for the web content provider, who may have
robots accessing and cataloging pages not meant to be cataloged;
it also poses a problem for the web robots themselves, which have
the time-consuming and bandwidth-hogging task of requesting
several pages in the hope of keeping a database up-to-date.
SMDL is a work in progress intended to solve the above problems.
With it, I hope to define a uniform language which web content
providers can use to offer both user agents and robots access to the
full structure of a site--including information about the tree-like
layout of the site; the frequency of updates to a particular resource;
whether content is dynamic or not (right now, with HTML files, for
instance, a web robot cannot necessarily know whether a page is
server-parsed); and other information. Obviously there are a lot of
possibilities. This is why I'm coming to the group a bit early to get
feedback. What would you include in such a language? Also, is this
a worthwhile endeavor?
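For concreteness, here is a rough sketch of the kind of thing I have in
mind. All of the element and attribute names here (sitemap, resource,
updatefreq, dynamic, index) are illustrative guesses only, not my
actual proposal:

```xml
<?xml version="1.0"?>
<!DOCTYPE sitemap [
  <!-- A site map is a tree of resources; a resource may contain
       child resources, mirroring the layout of the site itself. -->
  <!ELEMENT sitemap  (resource+)>
  <!ELEMENT resource (title?, resource*)>
  <!ELEMENT title    (#PCDATA)>
  <!ATTLIST resource
      href       CDATA                               #REQUIRED
      updatefreq (hourly|daily|weekly|monthly|never) "never"
      dynamic    (yes|no)                            "no"
      index      (yes|no)                            "yes">
]>
<sitemap>
  <resource href="/" updatefreq="weekly">
    <title>Home</title>
    <resource href="/news/" updatefreq="daily" dynamic="yes">
      <title>News</title>
    </resource>
    <!-- A dynamic page the site owner does not want cataloged. -->
    <resource href="/cgi-bin/search" dynamic="yes" index="no"/>
  </resource>
</sitemap>
```

With a single document like this, a robot could learn the whole tree in
one request, skip resources marked index="no", and revisit frequently
updated pages more often than static ones.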
My proposal right now is very small, and no doubt missing some
important elements. (Again, I will post it very soon; I want to make
sure I've eliminated obvious DTD errors first.) It's mainly an
exercise as an early XML application. So don't be afraid to say it's
unnecessary (I doubt you would anyhow) or anything else. I
appreciate any feedback at all.
Thanks in advance. I look forward to further participation in this
group, especially with the somewhat exciting XSchema
specification.
Adam
mailto:adam@cyber-guru.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)