OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.


Help: OASIS Mailing Lists Help | MarkMail Help

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]
Re: [xml-dev] Infinity

One thing accomplished by the statement about the finite length of lexical 
representations is that it guarantees that the space of lexical representations
is a regular language and can be recognized by a finite state automaton.

It is possible to define success conditions for infinite strings read by
finite state automata (success can be that from some point in the
string forward, every state reached is an accept state; success can be
that at any point in the string, a success state will be reached by the
addition of zero or more characters; these turn out to have analogues
in temporal logic).  But regular languages as usually defined today do not 
include infinite strings.  

The work by Kleene that defined regular sets and other work in the 
fifties and sixties contemplates infinite sequences, if I remember correctly, 
but infinite sequences dropped out of focus before too long.  There is 
a beautiful book by Paul Postal and Terry Langendoen called “The Vastness 
of Natural Languages” that attacks the assumption that all sentences of 
natural languages have finite length; they observe correctly that the 
machinery of the Chomsky hierarchy of grammars and recognizers 
won’t work for such strings, but they oppose Chomsky’s claim that 
languages including infinite-length sentences are undescribable.

The desire to ensure that the lexical spaces of XSD’s simple types are
recognizable by FSAs (or in some cases context-free grammars) was a 
conscious factor for at least some WG members (me); whether 
others supported the wording for that reason or for some other 
reasons, I do not know.

Michael Kay asked what would have to change in his system if the
restriction to finite-length strings were removed.  I think that if the
restriction were removed he would need either to decide how to 
determine whether the success conditions for infinite strings were
met, which would presuppose a way of accepting infinite strings as input,
or else a way to raise an out-of-resources error when presented with
any infinite string.  

I would not like to have to write the test cases for checking that
a conforming processor either accepts a string consisting of an
infinite number of occurrences of the digit ‘0’ followed by the digit
‘5’, or else raises an out-of-resources error, and that the processor 
rejects a string consisting of the letter ‘B’ preceded by an infinite 
number of occurrences of the digit ‘0’.  

I’m kind of tired at the moment, so perhaps I’m overlooking something,
but I cannot think of any infinite length strings that can be plausibly
interpreted as denoting integers unless they are semi-infinite
and consist of a finite sequence of decimal digits preceded by
an infinite number of zeroes.  Even such a generous interpretation
won’t allow use to accept a sequence that has no ending
point (decimal point); such sequences won’t denote any integer
no matter what we do.  And I think the conventional definition of
decimal notation covers finite strings only.

I think when ERH says the restrict of the lexical representations to
finite-length strings is mathematically unnecessary, he is assuming
that one can just say a string of characters is a legal representation
of a value in the value space of xsd:integer if and only if it denotes 
an integer.  He may be right, but I think that Dave Peterson led the
WG to a higher standard on this in XSD 1.1 by providing independent
definitions of the value space, the lexical representations of values,
and the mapping from lexical representation to value, and reducing
significantly the amount of hand-waving in the spec.  I am not sure
Dave gets the credit he deserves for his work on the 1.1 Datatypes

> On Mar 3, 2018, at 11:56 AM, Elliotte Rusty Harold <elharo@ibiblio.org> wrote:
> ...
> I suspect the "finite length" verbiage is there to prevent someone
> from feeding a parser an unending stream of digits. It's not
> mathematically necessary.
> On Sat, Mar 3, 2018 at 4:22 AM, Michael Kay <mike@saxonica.com> wrote:
>> A question of no practical relevance for those on the list who are more
>> mathematically inclined than I am.
>> From XSD 1.1 part 2 §3.3.13
>> (a) The ·value space· of integer is the infinite set {...,-2,-1,0,1,2,...}.
>> (b) integer has a lexical representation consisting of a finite-length
>> sequence of decimal digits (#x30-#x39) with an optional leading sign.
>> So the value space is infinite, but the lexical space is finite. Or is it?
>> Perhaps the set of finite-length strings is itself infinite?
>> Does every integer in this infinite set have a finite-length lexical
>> representation? Or are there integers in the value space that have no
>> representation in the lexical space?
>> Whenever I read this, I think, why is that adjective "finite-length" there?
>> Would I have to change my software if it were removed?
>> Michael Kay
>> Saxonica
> -- 
> Elliotte Rusty Harold
> elharo@ibiblio.org
> _______________________________________________________________________
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php

C. M. Sperberg-McQueen
Black Mesa Technologies LLC

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index]

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 1993-2007 XML.org. This site is hosted by OASIS