Re: [xml-dev] Where is XML going

From: Dimitre Novatchev <dnovatchev@gmail.com>
To: Kurt Cagle <kurt.cagle@gmail.com>
Date: Sun, 5 Dec 2010 09:12:17 -0800
A nice description, Kurt.

As nice as it is, it assumes that the majority of content providers
wouldn't mind having their content changed and combined in any
possible way.

This is not realistic, given the effort and expense put into creating
that content.

Also, you'd expect the majority of today's screen-scraping
"specialists" to excel significantly at acquiring new knowledge and
competence -- an amazing, revolutionary, miraculous leap that would
be ... well, ... a miracle.


-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play


On Sun, Dec 5, 2010 at 8:30 AM, Kurt Cagle <kurt.cagle@gmail.com> wrote:
> David,
> CSS by itself is insufficient for presentation for all but the most well
> structured of documents. HTML works well with CSS because HTML has a
> structural ordering that works well with the CSS model (as to be expected,
> given that CSS evolved in response to HTML). Some years ago I wrote a couple
> of chapters for a book on XML and CSS, showing how you can apply CSS to XML,
> but as I proved back even then, very, very few XML schemas are in fact even
> remotely appropriate for direct presentation with CSS.
> What about XSLT? Here things get a little more interesting. You can bind an
> XSLT to an XML document to perform client-side rendering. The most obvious
> mechanism is through the use of the ?xml-stylesheet PI. Unfortunately, once
> past this, AJAX-based transformations become somewhat trickier, because while
> there may be support for an XSLT transformation API within a browser, such
> support is far from uniform.
> Personally, I'd like to see an inline <transform> tag in HTML5:
> <transform
>     stylesheet="xs:anyURI"
>     data="xs:anyURI"
>     type="mime-type"
>     refresh="timeInterval"
>     asynch="xs:boolean"
>     media="xs:NMTokens"
>     id="xs:ID">
>       Default Internal Content
> </transform>
> This would be a display tag that would load the data either from a server or
> from a block of XML in the client, then would apply the associated
> stylesheet to that data in order to provide output that would replace the
> current child content. The mime-type would be the transformation language -
> application/xslt+xml, text/javascript, application/xquery+xml. The @refresh
> attribute would be the time in seconds between refreshes, with a default of
> 0 (no refresh). If @stylesheet is not included, then this acts as an inline
> xinclude statement. Initially, before any content is loaded, the Default
> Internal Content will be displayed. The media attribute would indicate the
> context in which the transformation would be invoked, with
> NMTokens including "web","print","mobile","tv","braille" and so forth.
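
A minimal sketch of how the proposed media attribute might be matched,
not part of Kurt's proposal. It assumes the attribute holds a
whitespace-separated token list and that a transform without a media
attribute applies in every context; both are assumptions. The name
selectTransforms and the plain objects standing in for elements are
hypothetical.

```javascript
// Hypothetical helper: pick the transforms that apply in a given context.
// Each transform is a plain object standing in for a <transform> element.
function selectTransforms(transforms, context) {
  return transforms.filter(function (t) {
    // Assumption: no media attribute means "applies everywhere".
    if (!t.media) {
      return true;
    }
    // Assumption: media is a whitespace-separated list of tokens.
    return t.media.trim().split(/\s+/).includes(context);
  });
}

const declared = [
  { stylesheet: "atom-display-web.xsl", media: "web print" },
  { stylesheet: "atom-display-mobile.xsl", media: "mobile" },
  { stylesheet: "atom-display-any.xsl" }
];

const forMobile = selectTransforms(declared, "mobile").map(function (t) {
  return t.stylesheet;
});
console.log(forMobile);
```
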
> In terms of APIs, they'd be about what you would expect:
> Transform:refresh() will perform an automatic refresh of the transformation
> and reset the refresh interval.
> Transform:clear() will clear the transformation and restore the default
> internal content.
> Transform:childNodes() will return the nodes resulting from the
> transformation.
> Transform:stylesheet() will return the current stylesheet document (not the
> URL) as a node.
> Transform:data() will return the current data as it was prior to
> transformation.
> Transform:asynch is a boolean property indicating whether the resource is
> called synchronously or asynchronously (the default).
> It would also have three events:
> Transform::ontransform - invoked prior to the transformation being called.
> This provides a way to hook a javascript function to the transform, passing
> the appropriate object in the $evt variable.
> Transform::ontransformcomplete - invoked after a successful call to the
> transformation.
> Transform::ontransformerror - invoked when a transformation fails. The
> result is an $error object containing the relevant message data.
> To change the working data or the transformation, simply change the
> appropriate attributes.
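
A rough model of the lifecycle described above, written in plain
JavaScript so it runs without a browser. This is a sketch under stated
assumptions, not a real or proposed browser API: the class name
TransformSketch, the fire() helper, and the choice to fall back to the
default content on any failed refresh are all hypothetical. A plain
function stands in for the stylesheet.

```javascript
// Hypothetical model of the proposed <transform> lifecycle.
class TransformSketch {
  constructor(defaultContent, transformFn) {
    this.defaultContent = defaultContent;
    this.transformFn = transformFn;   // stands in for the stylesheet
    this.content = defaultContent;    // what would currently be displayed
    this.handlers = {
      ontransform: [],
      ontransformcomplete: [],
      ontransformerror: []
    };
  }

  addEventListener(name, handler) {
    if (!Object.prototype.hasOwnProperty.call(this.handlers, name)) {
      throw new Error("Unknown event: " + name);
    }
    this.handlers[name].push(handler);
  }

  fire(name, payload) {
    this.handlers[name].forEach(function (handler) {
      handler(payload);
    });
  }

  // Run the transformation against new data and report the outcome.
  refresh(data) {
    this.fire("ontransform", { data: data });
    try {
      this.content = this.transformFn(data);
      this.fire("ontransformcomplete", { content: this.content });
    } catch (err) {
      // Assumption: a failed transform shows the default content again.
      this.content = this.defaultContent;
      this.fire("ontransformerror", { message: err.message });
    }
  }

  // Drop the transformed output and restore the default content.
  clear() {
    this.content = this.defaultContent;
  }
}

const log = [];
const news = new TransformSketch("Loading News Feed", function (items) {
  if (!Array.isArray(items)) {
    throw new Error("expected an array of items");
  }
  return "<ul>" + items.map(function (item) {
    return "<li>" + item.title + "</li>";
  }).join("") + "</ul>";
});
news.addEventListener("ontransformcomplete", function () {
  log.push("complete");
});
news.addEventListener("ontransformerror", function (evt) {
  log.push("error: " + evt.message);
});

news.refresh([{ title: "First item" }]);
console.log(news.content);
news.refresh(null);
console.log(news.content);
console.log(log);
```
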
> The benefit to this is simple - it provides a way to invoke multiple
> stylesheets in a platform independent manner. It makes simple AJAX calls a
> no brainer. It places no explicit requirement upon vendors to support a
> given transformation language - if a language isn't supported, then an error
> is thrown and the default content continues to be displayed. It provides a
> simple way for a print system to generate the full rendering of a page
> without having to get a full snapshot of the more complicated aspects of the
> DOM, and with the media attribute it means that you can display certain
> transforms in specific contexts. For instance, the following set of
> transforms:
> <transform data="myNewsFeed.atom.xml" stylesheet="atom-display-web.xsl"
> type="application/xslt+xml" media="web" refresh="10s">Loading News
> Feed</transform>
> <transform data="myNewsFeed.atom.xml" stylesheet="atom-display-mobile.xsl"
> type="application/xslt+xml" media="mobile" refresh="10s">Loading News
> Feed</transform>
> would generate a news display from an atom feed, but while the first would
> be tailored for web display, the second would display for mobile devices
> exclusively. If your source was in json and your transformation was in an
> inline script, then the transform would look something like:
> <script type="text/javascript">
> function showNews($evt){
>     var $data = $evt.data();
>     var $buf = [];
>     // Build one list item per feed entry.
>     $data.forEach(function($item){
>         $buf.push("<li><b><a href='"+$item.link+"'>"+$item.title+"</a></b>"+
>             "<div class='desc'>"+$item.body+"</div></li>");
>     });
>     var $content = "<ul>"+$buf.join("")+"</ul>";
>     return $content;
> }
> window.onload = function(){
>    var $news = document.getElementById("news");
>    // Pass the function itself, not its name as a string.
>    $news.addEventListener("ontransform", showNews);
> };
> </script>
> <transform data="myNewsFeed.js" type="application/javascript" refresh="10s"
> id="news"/>
> Or you could just put the invocation directly on the element:
> <transform data="myNewsFeed.js" type="application/javascript" refresh="10s"
> id="news" ontransform="showNews(this)">Loading News Feed</transform>
> Finally, if the script was external, you could even go one step farther and
> say:
> <transform data="myNewsFeed.js" stylesheet="newsFeedScripts.js"
> type="application/javascript" refresh="10s" id="news"
> ontransform="showNews(this)">Loading News Feed</transform>
> The point of all of this is that such an addition to HTML5 would be a boon
> to both JSON and XML, because it moves a lot of the framework scripting out
> of the normal flow of the HTML. It also satisfies the shift towards
> client-side tech while still working within the mandate that David notes -
> the server becomes less responsible for presentation, though it still may be
> necessary to transform the content server side into a simplified format (for
> instance, converting data from a query against a prescriptions database into
> a genericode or JSON representation that could then be consumed more readily
> by an XSLT or Javascript converter). The client is still responsible for
> presentation, but the server has to help by making the data more readily
> consumable.
> Kurt Cagle
> XML Architect
> Lockheed / US National Archives ERA Project
>
>
>
> On Sun, Dec 5, 2010 at 8:59 AM, David Lee <dlee@calldei.com> wrote:
>>
>> I must be working with different 'kinds' of documents than you (Ben).
>> The space  I work in primarily is the Clinical Information area.   These
>> have a wide variety of XML data (often from non-XML sources)
>> which vary dramatically in their 'human readability'.   Some are completely
>> obtuse, through to some that make reasonable sense if you stripped out the
>> tags.
>> And everywhere in between.
>> But *NONE* of them could be reasonably presented  by applying CSS type
>> technology  as-is, at least to the specifications of our product designers.
>> There's a lot of reasons why, but it's not just the complex ones.    The
>> *simple* ones have problems.
>> A trivial example is re-ordering.   Text needs to be rendered in a
>> different order than document order (e.g. moved around).
>> Text injection is another thing.  Often there are references by ID values
>> to 'outside data'.  (Could be other XML data, could be a DB, could be
>> calculated.)
>> This data needs to be extracted and put in, for example:
>>
>>
>> Simple (made up but representative) example
>> From:
>>   Take   <dose amount="10" unit="mg" freq="daily"/> of  <med id="12345"/>
>> to
>>   Take 10mg of <a href="/drugs/12345">aspirin</a> daily.
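
A minimal sketch of the rewrite in David's example, to show why it needs
both reordering and an out-of-band lookup. Plain objects stand in for the
parsed <dose> and <med> elements, and the drugNames table stands in for
the external data source; these names, along with renderInstruction and
escapeHtml, are hypothetical and not from the thread.

```javascript
// Hypothetical out-of-band data: med id -> display name.
const drugNames = { "12345": "aspirin" };

// Escape text before placing it into HTML output.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// dose and med mirror the attributes of the <dose> and <med> elements.
function renderInstruction(dose, med, lookup) {
  const name = lookup[med.id];
  if (name === undefined) {
    throw new Error("No drug name known for id " + med.id);
  }
  // Note the reordering: freq comes from <dose> but is emitted after the drug.
  return "Take " + escapeHtml(dose.amount) + escapeHtml(dose.unit) +
    ' of <a href="/drugs/' + encodeURIComponent(med.id) + '">' +
    escapeHtml(name) + "</a> " + escapeHtml(dose.freq) + ".";
}

const html = renderInstruction(
  { amount: "10", unit: "mg", freq: "daily" },
  { id: "12345" },
  drugNames
);
console.log(html);
```

Neither the reordering of freq nor the id-to-name lookup can be
expressed as styling on the source elements, which is the point David
goes on to make.
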
>>
>> The important point here is that while the documents are *human readable*
>> (ignoring the ones that are not),
>> they are *not presentable* without domain knowledge.   And possibly
>> without access to 'out of band' data.
>>
>>
>> This (and similar issues) has led to my assumptions/conclusion ... that
>> simply extending the presentation paradigm will *not* solve the problem of
>> "XML on the Web".    The existing toolsets are NOT capable of even simple
>> transformations of XML to presentation, even with some enhancements, and
>> even given "human readable" XML documents.
>>
>> Then there are the 'not so human readable' documents that are still
>> "Documents", say the XML form of a Word document.
>>
>> So I have concluded myself that *somewhere* some heavy lifting needs to be
>> done to expose this dark web into presentation.
>> And it can't be done with existing CSS type technology.
>> And it's difficult to do *generically*; each document type needs different
>> rules to transform it.
>> And that engine and rules have to live somewhere.
>>
>> Now onto phase 2
>> It has been suggested that the Client is the place to do the heavy
>> lifting.
>> I really do like the idea of embedding <stylesheet type="javascript"> tags
>> ... I think that could go a LONG way.
>> But I'm concerned it's not the solve-all solution.
>> Why ?  Here's where I suspect we definitely come from different spaces, I
>> work primarily in the mobile space.
>> Which tends to have me focus a LOT on things like download speeds,
>> latency, and client processing power.   But today's mobile is tomorrow's
>> desktop.
>> And networking issues are similar even today.
>> Now I know that mobile devices and cell networking are improving at a
>> remarkable pace, but it's also true that desktops are diminishing and
>> turning into mobile devices.   Did you know there are actually more mobile
>> browsers in the wild than desktop browsers today?  This trend will
>> continue.
>> I've watched for decades as the pendulum swings back and forth between
>> client heavy vs. server heavy architectures.
>>
>> So given all that, my conclusions are that in fact
>> 1)  the client is *not* an infinite untapped resource of CPU power.
>> Rendering HTML strains most clients (desktop & mobile) to the extreme
>> already.
>>
>> 2) The complexity of the transformation is beyond current built-in
>> client/browser stack capability and may require external data.
>>
>> 3) Transformations need to change dramatically based on document and
>> document type, and the rules for those need to change.
>>
>> This means that the *code* to process documents to presentation may well
>> exceed the document size.
>> And the ancillary data the code needs to do the transformation may not
>> reside currently on the client (so has to be fetched).
>> And *that* data may be huge, so it needs some efficient service to query
>> it, and if there are lots of those requests,
>> then *latency* is a huge problem.
>> Some of this can be solved by special purpose apps downloaded ahead of
>> time (what most Android and iPhone apps really are under the hood).
>> But to solve it 'in the wild' you can't  be asking people to download a
>> reader app for every web site & document type. (or can you ?)
>>
>> These are a lot of issues, and I agree they don’t necessarily build on
>> each other logically to form an absolute proof,
>> but in my mind they *weigh* on each other and *add up* to a reasonable
>> conclusion.
>>
>> That it is difficult, perhaps untenable, to expect simple enhancements to
>> the client stack to magically make it capable of rendering the dark web of
>> XML documents all on the client, in a presentable and efficient way.   Thus
>> I still hold that for now (5 years? past that my imagination is feeble) that
>> the server is the appropriate place to do the transformations to at least a
>> form that is *closer* to  presentation structure.
>> Maybe not spit out the full HTML, but at least spit out something which is
>> *easily and efficiently* translated to HTML on the client.
>>
>>
>>
>>
>>
>> ----------------------------------------
>> David A. Lee
>> dlee@calldei.com
>> http://www.xmlsh.org
>>
>>
>> -----Original Message-----
>> From: Ben Trafford [mailto:ben@prodigal.ca]
>> Sent: Saturday, December 04, 2010 10:33 PM
>> To: David Lee
>> Cc: Peter Hunsberger; Michael Kay; xml-dev@lists.xml.org
>> Subject: Re: [xml-dev] Where is XML going
>>
>>
>>
>> David,
>>
>> I think we're coming from alternate points of view. You seem to be
>> approaching XML as human-incomprehensible data (the app developer
>> viewpoint) -- I'm approaching it as a human-comprehensible document.
>>
>> There are numerous examples of both, but it's become increasingly common
>> for people to ignore the vast, unimaginable quantity of XML documents
>> that exist as human-comprehensible data. An example would be the
>> plethora of data that exists in aviation repair manuals -- literally,
>> hundreds of millions of pages worth of pure document.
>>
>> There will always be room for server-based transformations et al., but
>> that space is very well addressed by existing technologies. What is
>> extremely poorly addressed is the document to end user space, and
>> -that's- what needs to be fixed, in my opinion.
>>
>> --->Ben
>>
>> On Sat, 2010-12-04 at 22:15 -0500, David Lee wrote:
>> > Touché
>> > Good argument
>> > But how does the browser know what the data means well enough to present
>> > it?
>> > I feel there is a difference of opinion re  separation of concerns that
>> > is a fundamental rift in agreement in the community
>> >
>> >
>> > Sent from my iPad (excuse the terseness)
>> > David A Lee
>> > dlee@calldei.com
>> >
>> >
>> > On Dec 4, 2010, at 10:03 PM, Peter Hunsberger
>> > <peter.hunsberger@gmail.com> wrote:
>> >
>> > >
>> > > On Sat, Dec 4, 2010 at 7:03 PM, David Lee <dlee@calldei.com> wrote:
>> > >>
>> > >> In my opinion the server is 'closer to the data' than the browser.
>> > >>  It has more chance of knowing about the meaning of the data than the
>> > >> browser.
>> > >
>> > > So?  The browser is closer to the user, it has more chance of knowing
>> > > about the presentation requirements than the server.
>> > >
>> > > --
>> > > Peter Hunsberger
>> >
>> > _______________________________________________________________________
>> >
>> > XML-DEV is a publicly archived, unmoderated list hosted by OASIS
>> > to support XML implementation and development. To minimize
>> > spam in the archives, you must subscribe before posting.
>> >
>> > [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
>> > Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
>> > subscribe: xml-dev-subscribe@lists.xml.org
>> > List archive: http://lists.xml.org/archives/xml-dev/
>> > List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>> >
>>
>>
>>
>>
>
>