Re: [xml-dev] What is the meaning of a txt file? Parallels between the XML Schema hexBinary type and txt files ...
- From: Jim DeLaHunt <list+xml-dev@jdlh.com>
- To: xml-dev@lists.xml.org
- Date: Thu, 13 May 2021 13:19:33 -0700
I can't tell whether this is a good-faith invitation to a discussion, or
word games.
On 2021-05-12 05:39, Roger L Costello wrote:
> Hi Folks,
> Yesterday a colleague asked me for "data". When I asked what kind of data, she said, "txt files".
> That got me to thinking, "What is a txt file?" Here are some thoughts.
> …
> "Subject: What is the meaning of a txt file?"
Answer: it depends on what concept you are trying to express with the
word "meaning". In other words, on what is your meaning for "meaning".
And, what are you trying to convey by the spelling "txt"? Are you
talking about text files in general? Or files which have "txt" as the
extension part of their filename? You sure sound like you are talking
about text files in general. But later you will distinguish "txt" and
"text file", so I'm not sure. And are you talking about file contents
which conform to the conventions implied by their filename extension, or
pathological cases where the file contents are at odds with the filename
extension? I can put data in JPEG format into a file with the filename
extension ".PPT". Does that make it "a PPT file"?
> The format of data in a txt file is unspecified.…
I disagree. The format of data in a text file (or a file named with
extension ".txt") is specified, but the specification is specific to the
application and how it uses the conventions of the text file genre.
Different applications use different specifications when reading and
writing text files. The term "text file", separated from a particular
software's usage of text files, refers to a category or genre of data
formats. There are many differences, e.g. which text encoding relates
byte values in the file to characters at the information level, or what
characters indicate line endings and paragraph endings. There are some
similarities: the data in the file represents characters. It does not
represent formatting, such as font choices or bolding or italics or
justification. The data can still suggest formatting through its use of
spaces, line-ending characters, and simple graphics drawn with characters
(e.g. "> " to indicate a quote). The characters often conform to a
higher-level protocol (e.g. XML data, or written English language).
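As a small illustration of how the encoding and line-ending conventions change what the same bytes mean, consider this Python sketch. The sample bytes are invented for the example; the two encodings and the two line-splitting rules stand in for two applications with different text file specifications.

```python
# The same byte sequence, as it might sit in a ".txt" file.
raw = b"caf\xc3\xa9\r\nna\xc3\xafve\r\n"

# Convention A: the bytes are UTF-8.
as_utf8 = raw.decode("utf-8")
print(repr(as_utf8))    # 'café\r\nnaïve\r\n'

# Convention B: the bytes are ISO 8859-1 (Latin-1). Decoding succeeds,
# but every non-ASCII character comes out as two different characters.
as_latin1 = raw.decode("latin-1")
print(repr(as_latin1))  # 'cafÃ©\r\nnaÃ¯ve\r\n'

# Line endings are a convention too. splitlines() accepts CR LF as one
# line ending; splitting on LF alone leaves a stray CR on each line.
print(as_utf8.splitlines())  # ['café', 'naïve']
print(as_utf8.split("\n"))   # ['café\r', 'naïve\r', '']
```

Neither reading is wrong in the abstract. Each is correct under one application's specification and wrong under the other's.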
A typical JPEG file is not by any reasonable definition a text file. It
is difficult to interpret the byte contents of a JPEG file according to
the conventions of the text file genre.
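One way to see the difficulty, sketched in Python under the assumption that the text convention in question is UTF-8: the opening bytes of a typical JPEG file are not even valid UTF-8, so a strict decoder rejects them. The helper name decodes_as_utf8 is made up for this example.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Opening bytes of a typical JFIF-style JPEG file.
jpeg_head = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
plain_text = "A humble text file.\n".encode("utf-8")

print(decodes_as_utf8(jpeg_head))   # False
print(decodes_as_utf8(plain_text))  # True
```

A single-byte encoding such as Latin-1 will decode any byte sequence without error, so passing or failing a decoder is only weak evidence. The result would still be mostly control characters and symbols, not anything a reader would call text.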
So, the format of data in "a text file", without further clarification,
refers to a genre of formats. It is ambiguously specified. The format
of data in "a text file saved by LibreOffice 7.1.2.2 for macOS with the
UTF-8 setting" is much better specified.
> Contrast that to the format of a JPEG file which is well-known and specified by the JPEG specification.
The JPEG specification is a lot more specific and complex than the text
file specification of a typical application. Sure. But it leaves some
things unspecified: how many cm wide the image will be when you display
it, the exact colours your display will use to render the colour values
in the image, etc. So stand back and squint: the format of a JPEG file
specifies some things and leaves others unspecified. The formats of files
in the text file genre also specify some things and leave others
unspecified.
> …Likewise for …, XML, JSON, CSV, and thousands of others. But a txt file has no inherent format; if you are given an arbitrary txt file, all you know is that it holds bytes that can be rendered as character symbols.
You realise that XML, JSON, and CSV files are all text files as well?
And, "it holds bytes that can be rendered as character symbols" is a
specification — vague, ambiguous, broad, but more than nothing.
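A short Python sketch of the layering, using only the standard library and made-up sample data: each byte string is first decoded as text (UTF-8 is assumed here), and only then does a JSON, CSV, or XML parser apply the higher-level format to those characters.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Three invented payloads. Each one is, first of all, UTF-8 text.
json_bytes = b'{"name": "flower", "petals": 5}'
csv_bytes = b"name,petals\r\nflower,5\r\n"
xml_bytes = b"<plant><name>flower</name><petals>5</petals></plant>"

# Step 1: the text file layer turns bytes into characters.
json_text = json_bytes.decode("utf-8")
csv_text = csv_bytes.decode("utf-8")
xml_text = xml_bytes.decode("utf-8")

# Step 2: the higher-level protocol gives the characters a structure.
record = json.loads(json_text)
rows = list(csv.DictReader(io.StringIO(csv_text, newline="")))
root = ET.fromstring(xml_text)

print(record)                  # {'name': 'flower', 'petals': 5}
print(rows)                    # [{'name': 'flower', 'petals': '5'}]
print(root.find("name").text)  # flower
```

All three inputs would pass as "bytes that can be rendered as character symbols". The extra structure comes from the protocol layered on top of the text, not from abandoning the text file genre.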
> A JPEG file has, in a sense, a meaning: it is an image.
So we return to what you mean by "meaning". It sounds like it is: knowing
the name of the format tells you enough about the structure of the file
contents to be satisfying.
> …But what is the meaning of a txt file? It has no preordained meaning. In a sense, a txt file is meaningless. A txt file is the most primitive of text files.
This sounds to me like a cavil about the meaning of "meaning". And maybe
a cavil about the distinction between "txt file" and "text files".
> …Txt files are useful in those cases where you have a grab bag of data that you want stored and possibly transported.…
They are also pretty useful when you want to contain text in a
human-readable written language, and/or when you want text in a
software-readable form.
> … txt files and the XSD hexBinary type… Neither can be interpreted without additional information.
I think it's a true statement about most information, that it cannot be
interpreted without additional information. Everything exists in a
context. That JPEG file cannot be interpreted as an image of a "flower"
or a "dog" without additional information.
> …I welcome your comments.
I stand ready to defend the honour of the humble but powerful text file
format genre against the charges of "unspecified" and "meaningless".
Best regards,
—Jim DeLaHunt
--
. --Jim DeLaHunt, jdlh@jdlh.com http://blog.jdlh.com/ (http://jdlh.com/)
multilingual websites consultant