Re: [xml-dev] Formatless files

On 09/08/2022 21:01, Roger L Costello wrote:
> A file has no inherent format.
>
> The format of a file is determined by the programs that use it.
>
> Since file types are not determined by the file system, the "kernel"
> can't tell you the type of file: it doesn't know.

Yet the Unix 'file' [1] command has been doing a pretty good job of it
since 1973. [2]

The 'file' command firstly uses filesystem tests to determine if a file
is empty or is a special file, such as a socket or a symbolic link.
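
As a rough illustration of that first step, here is a minimal Python sketch of a filesystem test. It is not how 'file' itself is implemented; the function name filesystem_test and the returned labels are hypothetical, and only a few special file types are checked.

```python
import os
import stat
from typing import Optional


def filesystem_test(path: str) -> Optional[str]:
    """Return a label if the file system alone settles the question, else None.

    Hypothetical helper: the labels are illustrative, not the exact
    strings that 'file' prints.
    """
    # lstat() does not follow symbolic links, so a link is seen as a link.
    st = os.lstat(path)
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    # A non-empty regular file: the contents have to be examined next.
    return None
```

A result of None means the file system had nothing decisive to say, which is the point at which the content-based tests take over.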

It secondly uses 'magic' tests to detect the file type. The 'file'
manpage includes:

    The magic tests are used to check for files with data in particular
    fixed formats.

The magic tests use "magic patterns" from a 'magic' file. [3]

The 'magic' file on my Linux system includes 11 patterns that start with
'<?xml'. They are mostly followed by other tests to try to determine
the type of XML, e.g.:

    0       string          \<?xml\ version="
    >15     string          >\0
    >>19    search/4096     \<svg           SVG Scalable Vector Graphics image

There's even a test for '<?XML' that will be reported as 'broken XML document'.
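
To show how those patterns read, here is a minimal Python sketch that approximates the three quoted lines plus the '<?XML' test. It is a simplification under my own assumptions, not the libmagic algorithm: the function name classify_xml and the "not matched" and "XML, type not determined" labels are hypothetical, and the search window is my reading of 'search/4096' as up to 4096 starting positions from offset 19.

```python
XML_PREFIX = b'<?xml version="'   # 15 bytes, matched at offset 0
SVG_TAG = b"<svg"
SEARCH_RANGE = 4096               # from 'search/4096'


def classify_xml(data: bytes) -> str:
    """Approximate the quoted magic patterns on an in-memory byte string."""
    # The upper-case declaration is the 'broken' case mentioned above.
    if data.startswith(b"<?XML"):
        return "broken XML document"
    # Level 0: the string at offset 0 must be '<?xml version="'.
    if not data.startswith(XML_PREFIX):
        return "not matched"               # hypothetical label
    # Level 1 ('>15 string >\0'): there must be a non-NUL byte at offset 15,
    # i.e. the version value is not empty.
    if len(data) <= 15 or data[15] == 0:
        return "not matched"               # hypothetical label
    # Level 2 ('>>19 search/4096'): look for '<svg' starting at offset 19.
    window = data[19:19 + SEARCH_RANGE + len(SVG_TAG) - 1]
    if SVG_TAG in window:
        return "SVG Scalable Vector Graphics image"
    return "XML, type not determined"      # hypothetical label


if __name__ == "__main__":
    sample = b'<?xml version="1.0"?>\n<svg xmlns="http://www.w3.org/2000/svg"/>'
    print(classify_xml(sample))
```

Each '>' in the magic file adds a level of nesting, so a deeper test is only tried once the test above it has matched; the chain of early returns mirrors that.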

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
tgraham@antenna.co.jp

[1] https://www.man7.org/linux/man-pages/man1/file.1.html
[2] https://www.man7.org/linux/man-pages/man1/file.1.html#HISTORY
[3] https://man7.org/linux/man-pages/man4/magic.4.html

