Re: [xml-dev] What is the meaning of a txt file? Parallels between the XML Schema hexBinary type and txt files ...

I can't tell whether this is a good-faith invitation to a discussion, or word games.

On 2021-05-12 05:39, Roger L Costello wrote:
> Hi Folks,
>
> Yesterday a colleague asked me for "data". When I asked what kind of data, she said, "txt files".
>
> That got me to thinking, "What is a txt file?" Here are some thoughts.
> …
> "Subject: What is the meaning of a txt file?"

Answer: it depends on what concept you are trying to express with the word "meaning". In other words, on what is your meaning for "meaning".

And, what are you trying to convey by the spelling "txt"?  Are you talking about text files in general? Or files which have "txt" as the extension part of their filename?  You sure sound like you are talking about text files in general. But later you will distinguish "txt" and "text file", so I'm not sure. And are you talking about file contents which conform to the conventions implied by their filename extension, or pathological cases where the file contents are at odds with the filename extension?  I can put data in JPEG format into a file with the filename extension ".PPT".  Does that make it "a PPT file"?

> The format of data in a txt file is unspecified.…
I disagree. The format of data in a text file (or a file named with extension ".txt") is specified, but the specification is specific to the application and how it uses the conventions of the text file genre. Different applications use different specifications when reading and writing text files. The term "text file", separated from a particular software's usage of text files, refers to a category or genre of data formats. There are many differences, e.g. which text encoding relates byte values in the file to characters at the information level, or what characters indicate line endings and paragraph endings. There are some similarities: the data in the file represents characters. It does not represent formatting, such as font choices or bolding or italics or justification. The data can deliver formatting by its use of space and line ending characters and simple graphics drawing using characters ("> " to indicate a quote). The characters often conform to a higher protocol (e.g. XML data, or written English language).
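
The encoding and line-ending differences mentioned above can be made concrete with a small sketch. It is Python, chosen for illustration only; the sample string and the variable names are assumptions, not anything from the thread.

```python
# The same bytes, read under two different text-file conventions.
# The sample text is made up for illustration.
data = "café\r\nnaïve\r\n".encode("utf-8")

# Convention 1: the bytes are UTF-8.
as_utf8 = data.decode("utf-8")

# Convention 2: the bytes are Latin-1. This never raises an error,
# but every non-ASCII character comes out as two different characters.
as_latin1 = data.decode("latin-1")

# Line-ending conventions differ between applications as well.
lines_universal = as_utf8.splitlines()   # treats \r\n, \n and \r as line breaks
lines_lf_only = as_utf8.split("\n")      # treats only \n as a line break

print(repr(as_utf8))
print(repr(as_latin1))
print(lines_universal)
print(lines_lf_only)
```

Both decodings succeed, and both produce characters, yet they disagree about which characters the file contains. That is the sense in which the specification lives with the application and not with the ".txt" name.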

A typical JPEG file is not by any reasonable definition a text file. It is difficult to interpret the byte contents of a JPEG file according to the conventions of the text file genre.
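
A rough sketch of the mismatch between filename extension and content, and of why JPEG bytes resist a text reading. It assumes Python, a hypothetical filename, and a hand-written JFIF-style header; the helper names are invented for this example.

```python
import os

# JPEG streams begin with the start-of-image marker FF D8,
# followed by another marker that starts with FF.
JPEG_MAGIC = b"\xff\xd8\xff"

def looks_like_jpeg(data: bytes) -> bool:
    """Guess from the leading bytes, ignoring the filename entirely."""
    return data.startswith(JPEG_MAGIC)

def decodes_as_utf8(data: bytes) -> bool:
    """Report whether the bytes are valid under one common text convention."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

filename = "holiday.PPT"                       # hypothetical name
header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"   # first bytes of a typical JFIF-style JPEG

extension = os.path.splitext(filename)[1]
print(extension, looks_like_jpeg(header), decodes_as_utf8(header))
```

The extension says one thing and the content says another. Note that a permissive decoder such as Latin-1 would still accept these bytes, which is why the claim is "difficult to interpret" and not "impossible to decode".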

So, the format of data in "a text file", without further clarification, refers to a genre of formats. It is ambiguously specified.  The format of data in "a text file saved by LibreOffice 7.1.2.2 for macOS with the UTF-8 setting" is much better specified.

> Contrast that to the format of a JPEG file which is well-known and specified by the JPEG specification.
The JPEG specification is a lot more specific and complex than the text file specification used by a typical application. Sure. But it leaves some things unspecified: how many cm wide the image will be when you display it, the exact colours your display will use to render the colour values in the image, etc. So stand back and squint: the format of a JPEG file specifies some things and leaves others unspecified. The formats in the text file genre also specify some things and leave others unspecified.
> …Likewise for …, XML, JSON, CSV, and thousands of others. But a txt file has no inherent format; if you are given an arbitrary txt file, all you know is that it holds bytes that can be rendered as character symbols.
You realise that XML, JSON, and CSV files are all text files as well?

And, "it holds bytes that can be rendered as character symbols" is a specification — vague, ambiguous, broad, but more than nothing.

> A JPEG file has, in a sense, a meaning: it is an image.
So we return to what you mean by "meaning". It sounds like it is: knowing the name of the format tells you enough about the structure of the file contents to be satisfying.
> …But what is the meaning of a txt file? It has no preordained meaning. In a sense, a txt file is meaningless. A txt file is the most primitive of text files.
This sounds to me like a cavil about the meaning of "meaning". And maybe a cavil about the distinction between "txt file" and "text files".


> …Txt files are useful in those cases where you have a grab bag of data that you want stored and possibly transported.…
They are also pretty useful when you want to contain text in a human-readable written language, and/or when you want text in a software-readable form.
> … txt files and the XSD hexBinary type… Neither can be interpreted without additional information.
I think it's a true statement about most information, that it cannot be interpreted without additional information. Everything exists in a context. That JPEG file cannot be interpreted as an image of a "flower" or a "dog" without additional information.
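
The same point can be illustrated for hexBinary with a short Python sketch. The hex string is an invented sample, and the three readings shown are just a few of many possible ones.

```python
# One hexBinary-style value, three readings. Which one is "right"
# depends on information that is not in the value itself.
hex_value = "48656C6C6F"   # invented sample

raw = bytes.fromhex(hex_value)

as_text = raw.decode("ascii")            # read as ASCII characters
as_int = int.from_bytes(raw, "big")      # read as one big-endian unsigned integer
as_byte_values = list(raw)               # read as five separate small numbers

print(as_text)
print(as_int)
print(as_byte_values)
```

Nothing in the five bytes says which reading was intended; the surrounding schema, application, or convention supplies that.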


> …I welcome your comments.
I stand ready to defend the honour of the humble but powerful text file format genre against the charges of "unspecified" and "meaningless".

Best regards,
      —Jim DeLaHunt

--
.   --Jim DeLaHunt, jdlh@jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant



Copyright 1993-2007 XML.org. This site is hosted by OASIS