Re: The problems and the future of the web and a formal internet technology proposal

From: Raphaël Hendricks <rhendricks@netcmail.com>
To: Phillip Hallam-Baker <phill@hallambaker.com>
Date: Mon, 4 Jan 2021 16:42:15 -0500

Dear Mr. Phillip Hallam-Baker, having read your reply, I get the impression that you are a hurried person. Had you taken the time to read the full message which I sent, you would have realized that I am trying to bring up a serious problem which came up with the last decade of web evolution (and, to a lesser degree, with the second half of the previous one). Here is a short description of the points covered in the message.

The problem is that the W3C, once a great organization, has turned into an ugly three-headed monster: one head is the semantic web / XML / RDF people, the second head is the WHATWG people trying to turn the web into a remote software execution framework (the "webapp" approach), and the third and final head is the copyright industry. The World Wide Web has suffered accordingly. Two of the heads, the WHATWG people and the copyright industry, tried to turn the W3C and the W3 away from what the first head stood for: a consortium developing a platform to share hyperlinked documents, developing the web around the principles of openness, the semantic web, XML, XPath and so on. The current web technology meets neither of the two uses properly. There should be two different platforms for two different uses, not one general purpose monster. Trying to turn HTML/CSS/XHTML/XML into an application writing platform was trying to turn a squirrel into a dinosaur. The current platform is no longer a proper squirrel, nor did it turn into a proper dinosaur; it is some sort of ugly chimera. I am suggesting to have a first consortium developing a first platform, a reborn web centered around openness, the semantic web concept, XML, XPath and so on, which is what the web was to become before the WHATWG people were let into the W3C effort. I am also suggesting to have a second, distinct consortium and platform tasked with developing a remote software execution platform, which would not be based on XML/XPath or HTML, instead of the turning-a-squirrel-into-a-dinosaur approach. The first platform would be open and free of DRM while the second would be free to include such garbage. I also explain why I think that it should be done now and that the second platform should tightly integrate the concept of edge computing, which will soon become a big thing.

I also wish to add that some people, including me, have an analytical approach where the analysis and reasoning come first and ultimately lead to a conclusion, while other people have a synthetic approach where the suggested point comes first and is justified afterwards through arguments. These are different styles of writing which are both fine, and I normally use the first one. First, I go over the historical context so as to make sure that every reader will be familiar with all the crucial elements; second, I describe in detail **what** the two different needs are which the current web technology tries but fails to answer; third, I explain **how** the current web technologies fail to meet the needs; fourth, I explain that the attempts to keep some sensible direction to the web evolution have failed; fifth, I explain why it would make sense to implement the change to the general approach now, the link with the upcoming wide deployment of edge computing, and what the governance structure for the two new consortiums could look like; and sixth, I describe what the two new platforms could (but do not have to) look like.

Finally, I wish to remind everyone on the destination list, including Mr. Phillip Hallam-Baker, that the TO: address list of the message contained many people (including those receiving the content through the mailing lists) in very different situations. Some people are specialists highly knowledgeable about the whole thing and would be fine with a small message which goes straight to the point, and I understand that they may find a lot of the content useless and tiring as they are already familiar with most of it. Some people on the list are familiar with some parts of the content, while other people are familiar with other parts, and I need to make sure that everyone on the list may read the parts with which they are not yet familiar, even if this means that other people will find that some or most sections cover content which is well-known and boring to them.

I am sure that everyone will understand the situation.

May XML live on until the end of time.
Raphaël Hendricks

On 21-01-04 at 11:50, Phillip Hallam-Baker wrote:

If you are going to propose a new communications media, you would do well to first learn how to communicate.

I have no idea what problem you are trying to solve having read the first three screenfuls. And I have a really big screen.

Changing an infrastructure once deployed is near to impossible. Nobody is going to switch from email or the Web to a new platform that meets the same needs. The only way that an established technology like fax disappears is if a new technology appears that meets the needs of a community that is not served by legacy systems which is capable of being extended to replace the legacy system over time.

Even then, it took over 20 years for MIME to replace fax and even now there are holdouts.

On Mon, Jan 4, 2021 at 10:22 AM Raphaël Hendricks <rhendricks@netcmail.com> wrote:
The last decade of change in internet technologies has brought
significant change and, with it, some new unaddressed issues. In 2005,
there was a schism between two groups over their ideas for the
future of one specific internet technology, namely the web. I am
addressing this message to several concerned groups and individuals
about some serious problems which have plagued the web as it is. In
this message, I am going over the problems and making a formal
proposal to replace the World Wide Web in its current form with two
new technological platforms as well as replacing the World Wide Web
consortium with two new consortiums, one for each new technological
platform. This message will be of particular interest to XML people,
Semantic Web people and anti-DRM people, but also, to some degree, to
privacy advocates and accessibility advocates, as all these points are
addressed. The message will also be of interest to the IETF, since I am
suggesting some changes in the administrative structure of some
internet technologies. I encourage everyone reading this message to
circulate it as widely as possible; the more people read it, the better.

Message for Sir Timothy Berners-Lee: your ideas for basing the future
of the web on XML, XPath, properly structured data and documents, the
semantic web, and the openness principle were really great; it is a
shame that such a vision never came to be implemented. I am reviving
your ideas in this proposal, for one of the two platforms,
because I believe that they are worth another consideration.

On one hand, there were the usual W3C working groups preparing the
furthering of the traditional web principles, namely accessibility,
ease of indexing and referencing, separation of content, meaning and
presentation, making sure that any content is not limited to a
specific use by allowing a proper structure for the content, allowing
easy archiving, implementing mandatory validation, and so on. All
this was being done through the work on the XHTML 2.0 draft, RDF/RDFa,
the semantic web in general (which would allow better indexing and
content reuse and which would be mostly based on the previously
mentioned languages), XSLT, XForms, XML in general, XInclude, and so
on. They were working to reduce the need for ECMAScript and, more
importantly, Javascript (partly through technologies such as XForms,
XPath and XML Events), which is a good thing as scripting works
against the previously stated goals and should only be used as a last
resort (on top of all, Javascript is a proprietary language (owned by
Mozilla), which makes it even worse a choice, while ECMAScript, which
is only a subset of Javascript, is, at least, a standard (ECMA262),
which makes it a lesser evil). Work was being done to advance
development of aural style sheets as they had rightfully understood
that content may not exclusively be used by generating its visual
rendering. Those working groups were also trying to make a clean break
with the tag-soup era. In such a vision, there was, of course, no place
for DRM as it goes against the openness which the above stated goals
try to achieve. It is well known that information wants to be
free. The attempt to switch to XML and the semantic web can be seen
as an attempt to go back to the original vision that Sir Timothy
had when he chose to connect his web technology to the wide
internet in 1991 (making information freely available (think free
speech, not free beer) and easily usable for any purpose without use
restrictions). The tag-soup era that preceded can be seen as a move
away from that vision.

On the other hand, there were some entities which were opposed to the
switch to XML and the semantic web. They stated that those approaches
were too document-centric and that they wanted to create a technology
more adapted to webshops, forums and so on, they wanted to better
support interactivity than what was proposed with the switch to XML
and the semantic web, they wanted to support client-side programming
(as opposed to the move away from scripting), they wanted to support
client-side dynamically updated content and dynamic capabilities (as
opposed to the server-side dynamic capability with static-only client
side, except for form-validation through XForms, XML to XML
conversion through XSLT and timing/events through XML Events and
SMIL animation, or where client side dynamic capabilities are
declarative only, with no imperative support). They were also
concerned with the fact that the strict well-formedness and
validation requirements of the XML technology made integration with
several server-side programming languages problematic to say the
least. Everyone has seen that XHTML webpage which was valid 9 times
out of 10 and invalid 1 time out of 10 because its dynamic content
would at times make the page invalid or ill-formed.
This was due to most of those server-side languages being based
either on the approach of inserting program code within a to-be-
enriched incomplete page, which makes a page whose validity cannot
be wholly verified, or around the concept of generating the page
source by generating the markup text (as a sequence of characters), in
which case only examples of the script execution result can have
their validity tested but validity cannot be guaranteed, this
instead of generating the page structure (through DOM core methods or
using XPath-based technologies). PHP, the most popular server-side
programming language at the time when the W3C was trying to switch
the web to XML, was particularly affected by this, and
languages such as Perl, Python and ASP were not in a much better
state. Now, before someone points it out, yes, there have been cases
where careful programmers were able to use languages such as PHP to
implement some proper server-side dynamic capabilities by having PHP
files with no XML content and, for each page, a small user-
modifiable XML config file serving to map variables to XML attribute
values or tag content, one or more XML templates, an XSLT module to
allow calling an XSLT interpreter from the main language and an XSLT
transform sheet, applied to the template and used to generate the
well-formed, perfectly valid XHTML page. These cases were the
exception rather than the norm, while pages which were usually valid
but without a guarantee that the dynamic features would not break the
validity (which would occur occasionally) were the norm.
There were server-side technologies with proper XML support and
integration (such as, for example, for advanced capabilities: JSP/
Servlets (and debatably Ruby Electric XML), for intermediate
capabilities: server-side javascript, for rudimentary capabilities,
XInclude) but many server-side programmers didn't want to switch to
those languages; the resistance against switching the web to
XML certainly comes partly, but not only, from there. The entities
which were opposed to the switch to XML and the semantic web did not
want the WWW to remain a web of documents interconnected with
hypertext links (and extended through form validation through
XForms, XML to XML conversion through XSLT and timing/events through
XML Events and SMIL animation), this being the too-document-centric
criticism. They wanted to turn the web into a technology to run
software in the browser. These entities were mostly Google and
several major browser makers and they went on to establish the WHATWG
to create the HTML5 specification, promote the XML-RPC format and
create the JSON and JSON-rpc formats, all of which would serve as a
basis to implement that change.
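
To illustrate the difference described above between generating the
markup text as a sequence of characters and generating the page
structure, here is a minimal sketch in Python using the standard
xml.etree.ElementTree module. The function names and the sample input
are hypothetical, and the sketch only checks well-formedness, not
validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def page_by_text(title, comment):
    # Markup assembled as a sequence of characters: nothing guarantees
    # that the result is well-formed once dynamic content is inserted.
    return ("<html><head><title>" + title + "</title></head>"
            "<body><p>" + comment + "</p></body></html>")


def page_by_structure(title, comment):
    # The page is built as a tree; the serializer escapes text content,
    # so the output is well-formed whatever the dynamic content is.
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "p").text = comment
    return ET.tostring(html, encoding="unicode")


def is_well_formed(source):
    # Well-formedness check only; validity is out of scope here.
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False
```

With a comment such as "Tom & Jerry <b>rock", the text-based page
stops being well-formed, while the structure-based page remains so
because the special characters are escaped on serialization.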

The WHATWG idea of turning the web into a platform to run software in
the browser was a stupid and extremely bad idea. It however needs to
be stated that there are legitimate reasons to wish to run software
operating in a client-server split model, that is not the problem.
The problem is that bastardizing the WWW and the technologies
supporting it is not the proper way to answer this legitimate need.
The proper way to meet this need is to create a dedicated
standardized platform to run client software connecting to server
software over the internet, but with the said platform being separate
from the web. There is a need for a technology to make openly
available hyperlinked documents, with the documents using XML markup
to indicate their structure and using RDF or RDFa to indicate
semantic information, and using SMIL animation / XML Events to generate
declarative animations, and using XForms to validate user-input data
against XPath expressions or an XML Schema before using the said data
to further the content available as hyperlinked documents. There is
also a need for a technology to run client-side software that
connects to remote software running on the server; this type of
technology, with client software connecting to software running on a
server, is the right way to handle transactional operations such as
banking and stock buying as well as online shopping; it is also well
suited for highly interactive applications, such as playing networked
computer games. There should be two different platforms for two
different uses, not one general purpose monster.

In cases such as real-time stock buying, having a remotely
executable application should not be seen as being incompatible with
having a real-time XML feed with the raw data, with a basic
stylesheet which allows easier reading for those who prefer to read
the raw data; it should be the objective to always make the raw data
available where possible in parallel with having the data accessible
through the client-side application. For example, one may wish to
access the raw standard bus schedules of a city or the raw realtime
bus data feed (with a basic stylesheet to ease reading) rather than
have to go through the official bus operator supplied online
interface and should not be prohibited from doing so; the said raw
data could then also be used for purposes other than online reading
(e.g. statistical analysis).

As stated above, there is a need for a technology to make openly
available hyperlinked documents, with the documents using XML markup
to indicate their structure and using RDF or RDFa to indicate
semantic information, and using SMIL animation / XML Events to generate
declarative animations, and using XForms to validate user-input data
against XPath expressions or an XML Schema before using the said data
to further the content available as hyperlinked documents; this
allows easy indexing, using RDF/RDFa to derive meaning as opposed to
having to extract the information from human readable documents
containing no extra annotation for computers; this would allow many
search engines to be easily implemented, bringing competition to the
field as opposed to having an oligopoly made up of a few companies
which have access to the advanced technology to extract data from
human readable content without machine annotations, technology which
nonetheless still produces search results inferior to those of a
simpler search engine indexing only semantic-information-containing
documents. Having a unified URL+URN=URI framework is important in
information science and archival science; abolishing the URI approach to
go back to old-style URLs, as was done with HTML5 (they even re-
introduced the URLs starting with the javascript: pseudo-protocol
designation, which is beyond the limits of decency), is plain stupid.
It is essential that formatting and presentation information not be
used to convey the structure of the document; using said presentation
information to convey structure limits the use of the data inside
the document to that originally envisioned by the author, and the data
then cannot easily be used otherwise. A user may wish to turn off
stylesheets for various reasons (for example because the user has
poor vision) and the document structure should still be easy to grasp
when seen with the user-defined stylesheet inside the browser; this
is not possible if the document content is not properly structured.
Not surprisingly, XHTML documents tend to be much better structured
than HTML documents as, in XML, the structure is everything. It is
also baffling that none of the major websites uses aural stylesheets
and that most browsers don't support them even though page content is
meant to be independent of the rendering method and having an audio
rendering of a page is a legitimate use of the content; aural
stylesheets should be used and supported as standard just as widely
as visual stylesheets. Moreover, having content not tied to an
expected use allows unexpected new uses to come up later. For
example, if various sellers put up catalogs of their products online
and encode the data about the products properly in a generic
fashion, even if the catalogs were meant to be read by humans, they
can easily be used by price comparison tools or by
statistical tools to study the historic evolution of prices.
If the catalog documents put online are only human-readable, with too-
poor-a-structure to be easily analysed by computers, such extra uses
cannot come up. Having documents which are only human-readable,
with a structure insufficient to be analysed by computers, has a DRM-
lite effect by restricting the uses of the said documents. It is
therefore obvious that having content not being made for a specific
use is highly beneficial. Such a technology would likely be deployed
for uses such as publishing documentation, government websites,
university/faculty/department/professor websites, company product
documentations, blogs, amateur websites, digital document archives
(using XForms for searching) and so on. It could also contain content
which is available against a payment but where, after supplying the
payment, the available content is non-usage-restricted. It would not
be used in cases when the content maker wants to restrict the usage
of the content but this is a good thing as such usage-restricted
content has no place on such a platform.
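
As an illustration of the catalog example above, here is a minimal
sketch of a price comparison tool reading two structured catalogs. The
element names (catalog, product, price), the seller names and the
prices are all made up for the example and do not come from any
existing standard; the sketch also assumes every price is expressed in
the same currency.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Two hypothetical seller catalogs sharing one generic structure.
CATALOG_A = """<catalog seller="Seller A">
  <product sku="widget-1"><name>Widget</name><price>12.50</price></product>
  <product sku="gadget-7"><name>Gadget</name><price>30.00</price></product>
</catalog>"""

CATALOG_B = """<catalog seller="Seller B">
  <product sku="widget-1"><name>Widget</name><price>11.00</price></product>
  <product sku="gadget-7"><name>Gadget</name><price>31.25</price></product>
</catalog>"""


def cheapest_offers(catalog_sources):
    # Map each product identifier to the (seller, price) pair with the
    # lowest price found across all the catalogs.
    best = {}
    for source in catalog_sources:
        root = ET.fromstring(source)
        seller = root.get("seller")
        for product in root.findall("product"):
            sku = product.get("sku")
            price = Decimal(product.findtext("price"))
            if sku not in best or price < best[sku][1]:
                best[sku] = (seller, price)
    return best
```

Calling cheapest_offers([CATALOG_A, CATALOG_B]) requires no knowledge
of how either seller chose to present its catalog visually, which is
the point: the structure alone is enough for a use the sellers may
never have planned for.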

A technology to run client-side software connecting to server-side
software over the internet supplying a standard method for the said
software to be transferred to, as well as deployed and run on the
client is highly useful. Such a technology should not use a page-
based content structure. Making pages out of content only makes sense
if the content is constant and indexable, which is obviously not the
case for dynamically generated or dynamically updated content. Such a
technology should not use ECMAScript or Javascript as its language
since it is inappropriate for the use, Javascript was originally
created as a scripting language, not a programming language, it was
meant to add a bit of dynamic capabilities to web pages, not to write
software (hence its reputation as a hackish language among those
who use it to write real software; javascript is a good scripting
language but a poor programming language); on a platform allowing
client-side software to connect to server-side software, the client-
side software should be written using a proper programming language.
Cases such as real-time stock prices and real time bus/tramway
locations are good examples of time-varying content where client-side
dynamic capabilities make sense but putting the content in permanent
static pages doesn't. Cases such as inventory management, online
banking and product purchasing are good examples of highly-
transactional content where client-side dynamic capabilities are
useful but putting the content in pages for indexing or not-yet-
invented future uses has limited use. Cases such as running networked
computer games or running non-networked computer games are good
examples of highly interactive content where client-side adequate
software capabilities with hardware acceleration accessible by the
client-side software is useful but trying to structure the content
into pages makes no sense and the content cannot truly be indexed.
All these examples show that there are cases where a technology
distinct from the one to create a web of hyperlinked, structured and
semantically-encoded documents is highly needed and that such a need
doesn't nullify the one for the previously stated web of hyperlinked,
structured and semantically-encoded documents, since the use-cases
for each are highly different.

The HTML5/javascript/JSON-rpc approach is a real problem. It meets
neither of the two needs properly. For the structured and, perhaps,
semantically encoded, hyperlinked document need, the HTML5/javascript/
JSON-rpc approach is a disaster. The rpc capabilities make for dynamic
content that is impossible to index. JSON data doesn't allow content
indexing and reuse as easily as XML data. The reliance on javascript makes the
documents harder to analyze and index, and ties the content to a
single use, which is online viewing (often visual only, without even
supporting audio rendering). Cookies, while not strictly part of
HTML5 (they are part of HTTP/HTTPS), are nonetheless a feature coming
from the tag-soup era which should be phased out; instead of that,
HTML5 introduced an enhanced version of the concept in the form of
webstorage. Cookies are almost exclusively used for three purposes,
namely sessions, inter-website tracking and intra-website analytics.
Session cookies are an unnecessary and absurd mechanism to implement
sessions, less safe and less sensible than the built-in HTTP
session mechanism, which should be used instead. Intra-website
analytics may be a privacy issue; webmasters requiring intra-website
analytics should either limit themselves to statistics which do not
require users to be traced or be upfront about the fact that users
must open a session, use the HTTP session mechanism for this and be
clear that their usage data will be logged and analyzed. Doing
otherwise is dishonest and lacks transparency; in a way, it is
requiring users to create an identifier without being open about it
and making them open sessions through the back door. Inter-website
tracking cookies are a privacy breach and represent a practice which
needs to be abolished. The few other cases where cookies are used,
namely where keeping variables from one page to the next is
necessary, can easily be handled by using address rewriting instead
of cookies. Webstorage brings the same problems as cookies but to a
greater extent. The DRM inclusion in HTML5 is the antithesis of the
openness goal. It intentionally restricts usage when the maximum
openness goal requires implementing technology to facilitate ease of
access and facilitate new and unexpected usage of the content,
removing as much technological-limitation-derived usage restriction
as possible. DRM also makes archival problematic. While not part of
HTML5, the HTTP/2 protocol often used to transfer the said content,
breaks the protocol layering principle and is a problem because of
this. The WHATWG even had the indecency to reintroduce an element
coming from the tag-soup era and having never been part of any
standard HTML or XHTML version, namely the embed tag. From all these
elements, it is obvious that the HTTP/2/HTML5/JSON/javascript/JSON-
rpc/ based web infrastructure is ill-suited to answer the need for an
open and privacy respecting web of structured and, perhaps,
semantically encoded, hyperlinked documents described previously. If a
news/blog/text+image website cannot be read without enabling
javascript, it fails; unfortunately, such cases have become common,
as people create "webapps" which need to be "run" just to access a
page of content. This is the kind of situation brought by HTML5/JSON/
javascript/JSON-rpc. The interest in XML did not die when the W3C
decided to base the future of the web on HTML5, the companies and
other entities which were previously behind the XML effort switched
most of their XML efforts to the OASIS group, which is concerned with
the use of XML for document-centric/data-centric uses (but not
published on the web); this is proof that interest in the XML
structured/data-centric approach continued after the W3C decision.
For the other need, that of running software on the client-side which
connects to software running on the server (which is what "web
applications" are trying to do), HTML5/JSON/javascript is just as
badly suited to the need. Trying to turn HTML/CSS/XHTML/XML into an
application writing platform was trying to turn a squirrel into a
dinosaur. As stated previously, a page-based structure is not adapted
to this use, yet HTML5 retains the page approach. HTML5/JSON/
javascript/JSON-rpc is no longer a proper squirrel, nor did it turn
into a proper dinosaur; it is some sort of ugly chimera.
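
Coming back to the remark in this paragraph that variables kept from
one page to the next can be handled by address rewriting instead of
cookies, here is a minimal sketch using Python's standard urllib.parse
module. The function names, the example.org address and the lang
variable are hypothetical and only serve the illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rewrite_link(url, state):
    # Carry page-to-page variables in the query string of each link
    # rather than in a cookie stored on the client.
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(state)
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_state(url, keys):
    # Recover the carried variables from the requested address.
    query = dict(parse_qsl(urlsplit(url).query))
    return {key: query[key] for key in keys if key in query}
```

For instance, rewrite_link("https://example.org/catalog?page=2",
{"lang": "fr"}) yields an address whose query string carries both
page and lang, so the state is visible in the address itself and
nothing is stored on the user's machine.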

When creating the new HTML working group tasked with developing
HTML5, Sir Timothy Berners-Lee said that:
> Unlike the previous one, this one will be chartered to do
> incremental improvements to HTML, as also in parallel xHTML.
No real effort has in fact been put into XHTML in the last 12 years;
while there is a theoretical XHTML5, it only exists on paper.
About forms he said that:
> The plan is, informed by Webforms, to extend HTML forms. At
> the same time, there is a work item to look at how HTML forms
> (existing and extended) can be thought of as XForm
> equivalents, to allow an easy escalation path. A goal would
> be to have an HTML forms language which is a superset of the
> existing HTML language, and a subset of a XForms language
> wit [sic] added HTML compatibility.
None of this ever happened; in fact, the XForms 2.0 specification
was never completed, and the most recent draft is from 2010. The attempt
to use a new HTML working group to gradually move the standard to a
point where the switch to XML would be easier never worked. The
WHATWG acted as if they were the only ones in charge, refusing to
address concerns raised by the other W3C people, which brought
long fights where the WHATWG would continue fighting until the other
side would give up; they acted as though the only role for the W3C
was to approve what the WHATWG was doing, giving it legitimacy, and
they have sadly succeeded; Ian Hickson is particularly guilty in this
case.

Once it has been established that there is a need to put an end to
the web as it is and create two succeeding platforms, one, a new and
reborn web based on XML/XPath/RDF/RDFa for data structurally and,
eventually, semantically encoded and the other one, a remote
application execution platform based on some appropriate programming
technology, one may wonder why the change should be done now. Some,
especially those whose web development falls in the second use, that
of a remote application execution platform and probably not those
whose web production falls in the first use, may think that the
current web is usable enough for the said purpose. To address this,
one must first remember that moving away from XML/XPath also means
moving away from the structural and semantic encoding
technology, which is problematic for the first of the two needs, that
of the said structural and semantic encoding technology;
second, for the remote application execution need, the ill-suited
nature of the current web, with the amount of needed workarounds it
brings, means a huge amount of annually wasted manpower worldwide. The
number of man-hours lost worldwide each year
due to the issue of trying to get the web technologies to do
something for which they are not suited is probably in the millions.
The other big reason for doing the change now is that the underlying
internet infrastructure is about to change, and it would be best to
design the remote application execution platform around the new
upcoming infrastructure. The said infrastructure change is the soon-to-be
widespread deployment of edge-computing. The deployment of edge-
computing will shift the approach from a client/server operation to a
client / edge-server / remote-server operation; the adoption of 4.5G
and 5G telephony will popularize the use of edge computing, which is
expected to play a significant role on those networks. Of course,
some more workarounds could probably be found to operate the current
web platform with the newly running edge-servers; however, the proper
solution would be a technological redesign, it really would make
sense to design the remote application execution platform for this
model (client / edge-server / remote-server) from the beginning.
There should be a standardized method for a client terminal to send a
request to the closest edge-server with the identity of the service
available from a remote-server; the edge-server
would then download its portion of the software (the part to be run on
the edge-server), as well as that of the client, through a
standardized mechanism. The standardized technology should also
define how the edge-server is to send to the client its portion of
the software (the part to be run on the client) coming from the
remote-server, and finally it should define the format of the software
code. This would make the remote application execution platform adapted
to the presence of edge-computing. For the other platform, that
of a new and reborn web based on XML/XPath/RDF/RDFa for data
structurally and eventually semantically encoded, there is no use for
edge-servers, it is best to have the client computer contact directly
the remote and sole server with the said remote and sole server doing
most of the processing and simply serving the data to the client
computer, with the only processing done on the client being for XSLT/
XPath/XForms/SMIL-animation/XML-events. The current World Wide Web
consortium should be replaced by two new groups, one to develop each
of the succeeding platforms. The current World Wide Web Consortium, once
a great organization, has turned into an ugly three-headed monster: one head
is the semantic web / XML / RDF people, the second head is the WHATWG
people trying to turn the web into a remote application execution
framework, and the third and final head is the copyright industry. The
first new consortium, developing the structured and semantic web,
based on XML / XPath / RDF / RDFa, should be a joint IETF/OASIS
consortium, since the IETF is generally committed to the openness of
technologies and OASIS is where most work around XML has happened
since XML started leaving the web; this would help ensure good
integration with other XML technologies, and the proximity with many
XML people (from OASIS) would help jumpstart the XML uptake. Of
course, Sir Timothy Berners-Lee would be the proper chair for the
consortium. The second consortium concerned with creating a remote
application execution platform should be a joint consortium between
the IETF and a second group, the WHATWG being an option but the
Khronos group being preferable (the platform function would fit well
in its "connecting software to silicon" mandate), even the Object
Management Group could be chosen. The following two paragraphs
contain formal proposals for the subsequent two platforms which would
badly need to succeed to the current WWW.

In creating the platform for the structured and semantic web, based
on XML / XPath / RDF / RDFa, there are at least two sensible choices
as a basis for the central language for the platform (meant to
finally replace HTML). One is to resurrect the XHTML2.0 working
draft since, after all, the people behind it are competent and did
good work, and XHTML2 was done with the very same objective which
would be pursued by this platform. The second obvious sensible choice
is to break down some XML-based standards from OASIS, such as
DocBook, into modules, taking all the modules which cover needed
functionalities and completing the language with new modules. This
would allow high integration with the other XML standards developed
at OASIS (the main XML development group at the time). Of course, a
fully new language could also be designed, but it would provide
neither of the advantages of the previous two approaches. It is
probably best not to call this new XML-based language XHTML: when
people see the letter sequence HTML (regardless of the leading X),
they expect the language to be compatible with HTML4 and will fight
anything which isn't (for the XHTML2.0 draft, it was likely a mistake
not to have changed the name). Of course, the new consortium
developing the platform should take up the development of the core
XML, XPath, RDF/RDFa (and even XForms and XInclude) languages. Since
so many people seem to be
obsessed with support for scripting/programming, it is probably a
good idea to develop a complementary fully declarative scripting
language, based on XPath, XML Events, and SMIL Animation, with some
extra XML markup, to allow scripting which is fully XML/XPath based
and avoid seeing external languages being grafted on top of the
platform, as has happened to HTML with JavaScript. It is important
that there be no cookies/webstorage; there should however be the
option to use the protocol session mechanism, either that of HTTP or
that of a new XML-based protocol (see the subsequent part of this
paragraph). It would also be beneficial to define standardized
styling mechanisms, both for visual stylesheets and aural
stylesheets. The visual styling mechanism should get rid of the GUI
building components available in CSS3. For the aural stylesheet
mechanism, one option is to use the already existing, but never
implemented, W3C aural stylesheets; this would have the inconvenience
of not being XML/XPath based. A better approach, however, would be to
start with an XML reformulation of the W3C aural stylesheets and
replace the current selectors with XPath based selectors; this would
give a language using XML for the styling definitions and XPath for
the selectors. For the visual stylesheets, there are several possible
options. One is to use CSS3 with the GUI building parts removed; it
would however have the inconvenience of not being fully XML/XPath
based. Another option is to combine an XML reformulation of the CSS3
styling definitions with XPath selectors; this would supply a fully
XML/XPath solution. Another option is to use XSLT/XSL-FO, and yet
another would be to use XSLT to generate SVG data for rendering
(there is unfortunately some overlap in capabilities between SVG and
XSL-FO, even though XML languages should ideally import modules from
other languages instead of reimplementing functionalities). Of
course, a fully new XML and/or XPath styling language can be created;
however, the previous approaches would allow better integration with
existing technologies than a new language would. The visual styling
mechanism should mandate the definition of two styling types,
paper-like and video-like, chosen automatically based on a
browser-setting parameter; forcing all users to have a paper-like
rendering, as is currently done on the web, is not the best option. A
video-like style uses pale text on a dark or coloured background with
characters based on thick lines, with no serifs or limited serifs
(terminal-font-like); it may optionally use character outlines for
readability on any image backgrounds, and it can be easier to read
for some users with impaired sight as well as be more suited to some
display types. A paper-like style uses dark text on a pale background
with characters of varying width and varying serif styles. Most
webpages and current GUIs can be described as paper-like, while some
vintage GUIs as well as most CLIs can be described as video-like.
Mandating both in a
visual stylesheet would allow the user to choose which to use through
a simple browser setting. It is also worth considering whether the
consortium should develop, for the platform, a new XML-based protocol
to replace HTTP(S), especially the SPDY-based HTTP/2, which is
problematic. One possibility would be to have SOAP-over-TCP,
similarly to the way that OASIS developed SOAP-over-UDP; again, it
would be a good idea not to call this HTTP(S), to avoid raising false
expectations. Combining SOAP-over-TCP with an XML session mechanism
and XHTML2 / RDFa with XForms / XSLT / XSL-FO or SVG / XML Events and
SMIL Animation would finally allow having only XML/XPath for the
whole stack, with no other technologies. Of the two platforms, the
one for hyperlinked structured documents, possibly semantically
encoded and based on XML/XPath, is the one to keep the name "the Web"
or the "World Wide Web", since it would implement what the Web was
meant to be. The other should be called something else, for example
Online Service Platform (OSP).
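
The idea of combining XML styling definitions with XPath selectors,
with both a paper-like and a video-like styling type being mandatory,
can be illustrated with a minimal sketch. Everything in it is an
assumption made for illustration only: the stylesheet vocabulary
(stylesheet, style, rule, set), the property names and the function
resolve_styles are hypothetical and are not taken from any existing
specification. The sketch uses Python's standard ElementTree module,
which only supports a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical stylesheet vocabulary, invented for this sketch:
# <stylesheet>, <style type="...">, <rule select="XPath"> and
# <set property="..." value="..."/>.
STYLESHEET = """
<stylesheet>
  <style type="paper-like">
    <rule select=".//title">
      <set property="color" value="black"/>
      <set property="background" value="white"/>
    </rule>
    <rule select=".//para[@role='note']">
      <set property="font-style" value="italic"/>
    </rule>
  </style>
  <style type="video-like">
    <rule select=".//title">
      <set property="color" value="white"/>
      <set property="background" value="black"/>
    </rule>
    <rule select=".//para[@role='note']">
      <set property="font-weight" value="bold"/>
    </rule>
  </style>
</stylesheet>
"""

DOCUMENT = """
<doc>
  <title>Example</title>
  <para>Body text.</para>
  <para role="note">A side note.</para>
</doc>
"""

REQUIRED_TYPES = ("paper-like", "video-like")


def resolve_styles(document_xml, stylesheet_xml, styling_type):
    """Return (tag, properties) pairs for styled elements, in document order."""
    document = ET.fromstring(document_xml)
    sheet = ET.fromstring(stylesheet_xml)

    # Both styling types must be defined by every stylesheet.
    defined = {style.get("type") for style in sheet.findall("style")}
    missing = [name for name in REQUIRED_TYPES if name not in defined]
    if missing:
        raise ValueError(f"stylesheet lacks styling type(s): {missing}")
    if styling_type not in REQUIRED_TYPES:
        raise ValueError(f"unknown styling type: {styling_type}")

    chosen = sheet.find(f"style[@type='{styling_type}']")
    properties = {}
    for rule in chosen.findall("rule"):
        # The selector is an XPath expression evaluated on the document.
        for element in document.findall(rule.get("select")):
            target = properties.setdefault(element, {})
            for setting in rule.findall("set"):
                target[setting.get("property")] = setting.get("value")

    return [(el.tag, properties[el]) for el in document.iter()
            if el in properties]


# The styling type would come from the browser setting chosen by the user.
for tag, props in resolve_styles(DOCUMENT, STYLESHEET, "video-like"):
    print(tag, props)
```

A real styling language would of course need full XPath support,
cascading rules and a rendering model; the sketch only shows that the
selection step needs nothing beyond XML and XPath.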

The other platform, to implement remote software execution, should be
integrated with edge computing. There should be a standardized method
for the client to contact the edge-server and indicate which service
is to be accessed from a remote-server. The standard should specify
the mechanism to download the edge-server code and client code from
the remote-server to the edge-server (and mechanisms for caching both
on the edge-server when possible) and the mechanism to download the
client code, previously received from the remote-server, from the
edge-server to the client. The latter download should work on a
per-module basis instead of all at once; the same module method could
also be used to transmit, as modules, data to be used by the software
running on the client. There should be a mechanism for the client to
request that the edge-server open, on its behalf, a session on the
remote-server, transmitting its identity to the remote-server. This
would allow a given client operator to have a permanent account on
the remote-server, even though the operator may use differing clients
connecting to different edge-servers over time. Since edge-servers
are being used, this is where the bulk of the processing workload
should lie. The client processing workload should be mostly limited
to rendering and handling user interaction, with a few extras here
and there. This reduces power usage at the client point (useful for
portable devices); it also reduces the needed processing power at the
client point, except for audio/graphics rendering circuits (which
reduces the manufacturing cost of the client devices). The
remote-server processing workload should be limited to that which
cannot be done on the edge-server, such as storing user data between
sessions, serving the edge-server and client code to the
edge-servers, processing that which only needs to be calculated once
before being sent to all the connected edge-servers, and relaying
data between the connected edge-servers; pushing most of the load
onto the edge-server allows lower latency operation for the client.
There should also be
a mechanism for the client to indicate its device class to the
edge-server, and the information should be available to the software
downloaded from the remote-server and running on the edge-server;
sensible device classes should include at least the following:
small_touchscreen, big_touchscreen, pointer_based,
remotecontrol_based and maybe others. There should be two more flags
accompanying the device class. The first is the presence or absence
of a keyboard; this would allow the software to modify its interface
when no keyboard is available, so as to reduce the need for the (hard
to use) onscreen keyboard to the minimum, and to increase the
reliance on the keyboard when a physical keyboard is available. The
second is the presence or absence of a joystick/joypad; some game
software may require a device class other than small_touchscreen, as
well as either a pointer_based device class or another class
accompanied by a keyboard or a joystick/joypad, to be playable, and
may need to check for this. When the client connects to the
edge-server, there should be a method for the client to transmit to
the edge-server a parameter indicating its preferred colour scheme
and text style (a parameter made available to and used by the
software coming from remote-servers and running on the edge-server
and client), with two options being available, video-like and
paper-like; forcing all users to have a paper-like rendering, as is
currently done on the web, is not the best option. A video-like style
uses pale text on a dark or coloured background with characters based
on thick lines, with no serifs or limited serifs
(terminal-font-like); it may optionally use character outlines for
readability on any image backgrounds, and it can be easier to read
for some users with impaired sight as well as be more suited to some
display types. A paper-like style uses dark text on a pale background
with characters of varying width and varying serif styles. Most
operators of remotecontrol_based device class clients would likely
opt for the video-like mode and operators of other device classes
would use either, but users may have varying reasons, as stated
previously, to choose either mode.
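
The information which the client announces to the edge-server
(device class, keyboard flag, joystick/joypad flag and preferred
style) can be sketched as a small data structure. This is only a
minimal illustration under stated assumptions: the names
ClientProfile, encode_profile, decode_profile and game_is_playable,
as well as the "key=value;..." text form, are hypothetical and not
part of any existing standard; only the device class and style names
come from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceClass(Enum):
    SMALL_TOUCHSCREEN = "small_touchscreen"
    BIG_TOUCHSCREEN = "big_touchscreen"
    POINTER_BASED = "pointer_based"
    REMOTECONTROL_BASED = "remotecontrol_based"


class DisplayStyle(Enum):
    VIDEO_LIKE = "video-like"
    PAPER_LIKE = "paper-like"


@dataclass(frozen=True)
class ClientProfile:
    device_class: DeviceClass
    has_keyboard: bool
    has_joystick: bool
    display_style: DisplayStyle


def encode_profile(profile):
    """Serialize the profile to a hypothetical 'key=value;...' text form."""
    fields = {
        "class": profile.device_class.value,
        "keyboard": "1" if profile.has_keyboard else "0",
        "joystick": "1" if profile.has_joystick else "0",
        "style": profile.display_style.value,
    }
    return ";".join(f"{key}={value}" for key, value in fields.items())


def decode_profile(text):
    """Parse the text form back into a profile (edge-server side)."""
    fields = dict(item.split("=", 1) for item in text.split(";"))
    return ClientProfile(
        device_class=DeviceClass(fields["class"]),
        has_keyboard=fields["keyboard"] == "1",
        has_joystick=fields["joystick"] == "1",
        display_style=DisplayStyle(fields["style"]),
    )


def game_is_playable(profile):
    """Example check for the game requirement described above."""
    if profile.device_class is DeviceClass.SMALL_TOUCHSCREEN:
        return False
    return (profile.device_class is DeviceClass.POINTER_BASED
            or profile.has_keyboard
            or profile.has_joystick)


# A television-like client with a joypad and a video-like preference.
tv = ClientProfile(DeviceClass.REMOTECONTROL_BASED, False, True,
                   DisplayStyle.VIDEO_LIKE)
print(encode_profile(tv))
print(game_is_playable(decode_profile(encode_profile(tv))))
```

Keeping the device class separate from the two flags lets new device
classes be added later without changing the meaning of the flags.
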
Edge-server operators could maintain a list of problematic
remote-server operators, used as a blacklist, which can help prevent
client operators from being defrauded by unknowingly connecting to
fraudulent online services. The platform should include a
standardized payment mechanism used for services operated on a
commercial basis. When a client operator opens a transaction
requiring a payment with a remote-server operator, there should be a
mechanism for the edge-server operator to bill the client; the
edge-server operator could then act as an escrow and wait until the
service has been supplied in a satisfactory manner to transfer the
funds to the service provider / remote-server operator. This will
push the service providers / remote-server operators and client
operators to behave properly, as opposed to the Wild West that is the
current state of online business. For cases where client devices need
to have managed or limited payment initiating capabilities, such as
in internet cafés (where a client operator would first need to pay
the café employee to make funds available before initiating a
payment), there should be a mechanism for one client device to manage
the payment initiating capabilities of other client devices. When
remote-servers announce the services which are available from them,
there should be a mechanism to indicate whether the service is fully
free, fully paid or partly free and partly paid; it should also
indicate whether the free part has advertising or not and whether the
paid part has advertising or not.
The mechanism should also allow indicating to which standardized
category the service belongs; there should be at least the following
six categories (and maybe others): online-shopping, for-profit
transactional accounts (banking, commercial utilities, etc.),
non-commercial accounts (accounts at municipalities, provinces,
countries, NGOs), non-commercial media, commercial media and other.
This would allow easy classifying and finding of the services. The
standard should specify some programming languages for execution on
the edge-servers and clients. There should be several interpreted
programming languages supported as standard, and an intermediate
representation language to support software written in other
languages and compiled to the intermediate representation (this will
avoid having one or more of the interpreted programming languages
serving as a de facto intermediate language, which is inefficient).
For maximum portability and interoperability, the intermediate
representation language should be text based rather than binary. It
should be endian-neutral: program addressing should use labels
instead of hard addresses, data addressing should use named variables
instead of hard addresses, strings should be supported as a native
type, and numerical values (signed/unsigned ints/floats of varying
lengths) should be specified in hexadecimal encoded big-endian format
(the most readable) and converted to the native binary format of the
appropriate endianness by the back-end compiler, which runs on the
client and generates the binary which is to run on the client-hosted
virtual machine. The intermediate representation language should
ideally be statically and strongly typed, with type handling left to
the front-end compiler (the feasibility of handling type definitions
and conversions in a compiler has been shown with the Nim and Crystal
compilers). On the other hand, the intermediate representation
language should be garbage collected, as leaving the memory
management to the software developer or the front-end compiler risks
corrupting the memory of the client device (unless there is a
guarantee that the client has memory protection); the client should
handle memory through compile-time garbage collection (in the
back-end compiler producing the binary code for the virtual machine)
or run-time garbage collection (inside the virtual machine). There
should also be
a mechanism for loading shaders on the client GPU from within the
intermediate representation language, the best shader format probably
being a slightly modified SPIR-V assembly, which would be
endian-neutral, again by handling code addressing through labels
instead of real addresses, by handling data addressing through named
variables instead of real addresses, and by having numerical values
written in big-endian format (the most readable) in the transferred
code and converted to the final endianness by the client device
before assembling the shader. For the interpreted languages,
endian-neutrality can be handled first by choosing languages which do
not allow direct manipulation of addresses, which is the case of most
sufficiently high-level languages, and second by again having
numerical values written in big-endian format (the most readable) in
the transferred code and converted to the final endianness by the
client device before interpreting the software code. The interpreted
programming languages should have the choice to load the same type of
GPU shader as the intermediate representation language or use a
mid-level library. There should be hardware-accelerated OpenMAX DL
(with a generic DCT/IDCT extension not tied to a particular use,
unlike the current versions which are for JPEG, MPEG4 AVC and MPEG4
SP only) / OpenSL ES, Vulkan and OpenVG available. As for the choice
of
interpreted programming languages, the following list might be a
sensible choice: Python, which has become the high-level interpreted
programming language of choice in the unix-like OSes community; Ruby,
which positions itself as the competitor to Python and is used by
those who wish to avoid Python; ISLisp, as those programmers who do
not identify with the unix culture are often adepts of Lisp, and
ISLisp is lightweight (and as such better suited to this use case)
and consists of the common subset of the major Lisp variants; and
finally the language Mercury, as it would put a purely declarative
language on the list as an alternative to the imperative or hybrid
languages, as it allows the use of three declarative programming
paradigms (logic, functional and the declarative sub-variant of
object-oriented) and as, unlike most purely declarative languages, it
has a bit of uptake in the industry and outside of the academic
world; of course other languages can be chosen. It may make sense to
standardize the use of the same interpreted programming languages and
intermediate representation language for the software running on the
edge-servers as for that running on the clients, as it would ease the
development process. Big companies may use remote-servers,
edge-servers (one per site) and on-site clients as an alternative to
networked desktop computers, or they can let telecommuters use their
own client and associated edge-server to connect to the company
remote-server to work on it; this approach of edge-server and client
has the potential to replace part of the desktop market. As a last
point, there is the case of DRM. DRM is fundamentally wrong and
constitutes a stupid and useless idea; however, if the copyright
industry is going to force it on a platform somewhere, it should
rather be on this platform than elsewhere. It should definitely not
be allowed on the other platform described previously (the one for
hyperlinked, structured and semantically encoded documents), where
openness is paramount. While trying to protect the "intellectual
property" of the copyright industry, DRM, when implemented on
user-owned client devices, violates the physical property rights of
the user. Client devices can come in two forms, user-owned and
user-rented. When the device is owned by the user, the user should be
in control of the device; when it is user-rented, the owner renting
out the device defines the device-use limitations, so having DRM on
user-rented devices is a lesser evil. It could be decided that DRM on
user-owned devices is prohibited while still allowing DRM on
user-rented devices (the copyright industry would be free to make
some content only available on rented devices if they really want
to). While this is not a technical decision but a legal one, and as
such out of the scope of the people reading this, the various
entities involved with the process can make it their official
position that DRM on user-owned devices should be prohibited and, as
such, help push for this legal concept. A broader deployment of
rented client devices would probably resonate well with the public in
this day and age. This is the time of XYZ-as-a-service all over the
place, so having "services access as a service" would be bringing the
concept to its ultimate level.
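
Coming back to the intermediate representation proposed earlier in
this paragraph, the handling of numerical values written as
big-endian hexadecimal text can be sketched as follows. This is a
minimal illustration only: the type names (u8, i16, f32 and so on)
and the functions decode_literal and to_native_bytes are hypothetical
and are not taken from any existing intermediate representation.

```python
import struct
import sys

# Hypothetical type names mapped to struct format characters.
_FORMATS = {
    "u8": "B", "i8": "b",
    "u16": "H", "i16": "h",
    "u32": "I", "i32": "i",
    "u64": "Q", "i64": "q",
    "f32": "f", "f64": "d",
}


def _parse_hex(type_name, hex_text):
    """Turn big-endian hexadecimal text into bytes, checking the width."""
    raw = bytes.fromhex(hex_text)
    expected = struct.calcsize(">" + _FORMATS[type_name])
    if len(raw) != expected:
        raise ValueError(
            f"{type_name} needs {expected} bytes, got {len(raw)}")
    return raw


def decode_literal(type_name, hex_text):
    """Return the numerical value of a big-endian hexadecimal literal."""
    raw = _parse_hex(type_name, hex_text)
    (value,) = struct.unpack(">" + _FORMATS[type_name], raw)
    return value


def to_native_bytes(type_name, hex_text, byteorder=sys.byteorder):
    """Return the bytes as the back-end compiler would store them."""
    raw = _parse_hex(type_name, hex_text)
    # The text form is always big-endian; little-endian clients
    # simply reverse the byte order of each value.
    return raw if byteorder == "big" else raw[::-1]


print(decode_literal("u32", "0000002A"))
print(to_native_bytes("u32", "0000002A", "little"))
```

Since the conversion happens on the client, the transferred code
stays identical for every client regardless of its endianness.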

May XML live on till the end of time.

Raphaël Hendricks
