XML.org: OASIS Mailing List Archives

Re: [xml-dev] What is Data?

Costello, Roger L. wrote:
> Hi Folks,
>   
Have a look at http://dictionary.reference.com/browse/data and
http://dictionary.reference.com/browse/concept?jss=1
> Below is a definition of data, based on our recent discussions. I ask for your comments on these aspects:
>
>     1. Is the definition factually correct?
>   
I believe you need to be more specific about the level of abstraction
and the perspective you intend.
>     2. Is it general? Are there any hidden assumptions 
>        that restrict the generality of the definition?
>   
Nope, it is not.
>     3. Is it complete? Is there anything else you would 
>        add to the definition?
>   
Maybe. I think you are mixing it up with the notion of a concept.
>     4. Is it clear and easy to understand?
>   

> /Roger
>
>
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>         What is Data?
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> When you represent an entity, you've created data. 
>
> When you represent an attribute of an entity, you've created data. 
>
> When you represent a relationship of an entity, you've created data.
>
>
> EXAMPLE
>
> There is a fellow over there; here's some data about him:
>
> John Smith
> Six feet tall
> Father of Mary
>
> "John Smith" represents an entity (the fellow over there). Specifically, it represents the entity by his name.
>
> "Six feet tall" represents an attribute of the entity. 
>   
It could also be a relation: "John Smith is six feet tall".
> "Father of" represents a relationship between the entity and another entity (Mary).
>
>
> A SUCCINCT DEFINITION
>
> Data represents entities, attributes, and relationships.
>   
>
> Every piece of data can be categorized as either a representation of an entity, attribute, or relationship.
>   
That depends on the context of interpretation, which can be structured
into level of abstraction and perspective.
> Entities, attributes, and relationships are intrinsic (innate, inseparable) parts of data, just as iron and carbon are intrinsic parts of steel.
>   
Not necessarily. "fed" is data, but it says nothing about attributes or
relationships. It may be a hexadecimal number (4077 in base 10) or it
may be the word "fed". Which interpretation applies depends on the
context. Further, whether the data contains information at all also
depends on the context. There are forgotten languages that we cannot
interpret: documents survive (from Mohenjo-Daro, for example, I
believe), but we cannot read them.
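
A minimal sketch of the "fed" point, in Python purely for illustration.
The function name interpret and the two context labels are hypothetical,
not taken from any library; the same three characters give different
values depending on the context supplied by the reader:

```python
def interpret(data: str, context: str):
    """Interpret the same characters under a caller-supplied context.

    The context labels are hypothetical; they stand in for whatever
    agreement sender and receiver share about the data.
    """
    if context == "hexadecimal":
        return int(data, 16)   # read the characters as base-16 digits
    if context == "english-word":
        return data            # read the characters as a word
    raise ValueError(f"no interpretation known for context {context!r}")


print(interpret("fed", "hexadecimal"))   # 4077
print(interpret("fed", "english-word"))  # fed
```

Without a known context the function can only fail, which is roughly the
situation with an undeciphered script.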

Self-describing data is slightly different, but it too needs a context,
such as XML, to be interpretable and to carry meaningful information.
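
As a rough illustration (the element names and the unit table below are
made up for this sketch), an XML parser tells us that "6" is the content
of a height element, but the unit still has to come from a context
outside the document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are assumptions for this sketch.
doc = "<person><name>John Smith</name><height>6</height></person>"

root = ET.fromstring(doc)
raw_height = root.findtext("height")  # the parser yields only the string "6"

# Hypothetical external context: the unit is not stated in the document.
UNIT_BY_CONTEXT = {"imperial": "feet", "metric": "metres"}


def describe_height(raw: str, context: str) -> str:
    """Attach a unit to the raw value using context from outside the XML."""
    return f"{raw} {UNIT_BY_CONTEXT[context]}"


print(describe_height(raw_height, "imperial"))  # 6 feet
print(describe_height(raw_height, "metric"))    # 6 metres
```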

So entities, attributes, and relationships are not intrinsic parts of
data.
> A representation of a relationship is not more important than a representation of an entity or attribute, i.e. links are not more important than the entities they relate. 
>   
Universally, no. In specific cases, it depends. In the expression
"a+b=b+a" the relation is more important than the individual facts
within it. Clearly, we also need to take level of abstraction and
perspective into account when we interpret data. On one level the
expression is a relation, but on another level it can be a single fact
(similar to an entity).
> Relationships are not different in nature than entities or attributes. 
>   
Ah, see my previous comment.
> Relationships, entities, and attributes have equal weight.
>   

Not necessarily.
>
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>         What's Not Data?
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> The following description of a book is not data, although it contains data: 
>
>     In this groundbreaking book, evolutionary
>     biologist Jared Diamond stunningly dismantles
>     racially biased theories of human history by
>     revealing the environmental factors actually
>     responsible for history's broadest patterns.
>
> Here is some of the data:
>
> There is an entity:
>     -	book
>
> It has an attribute:
>     -	innovativeness: groundbreaking
>
> There is an entity:
>     -	evolutionary biologist
>
> It has attribute:
>     -	name: Jared Diamond
>
> It has a relationship:
>     -	this entity is the author of the book entity
>
> And so forth.
>
> This example shows that text can be mined for data. 
>
>
> ANOTHER EXAMPLE
>
> This is not data and it contains no data:
>
>     Run really fast.
>
> The sentence contains a verb followed by an adverb followed by an adjective. Verbs, adverbs, and adjectives are not data.
>
> Data are nouns.
>
>
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>         Simplification
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> Recent research suggests that there may be just two categories of data:
>     1. Entities
>     2. Relationships
>
> An attribute is merely a special case of a relationship.
>   
Yes, but that is not new from a conceptual modelling perspective.
Typically, you choose which relations are represented as attributes and
which as relationships based on pragmatics, such as the context in which
a concept is used (is it typically considered an attribute or a
relation?) or the performance of processing the item (cf. normal forms
in relational databases).
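
A small sketch of that modelling choice, using the John Smith example
from earlier in this mail. The class, field, and function names are
hypothetical, and the (subject, predicate, object) triple layout is just
one possible way to write relationships down:

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Attribute-style model: height is a field of the entity."""
    name: str
    height_ft: int


def as_triples(person: Person) -> list[tuple[str, str, object]]:
    """Relationship-style model: the same fact as a
    (subject, predicate, object) triple."""
    return [(person.name, "has a height of", person.height_ft)]


john = Person(name="John Smith", height_ft=6)

# A relationship between two entities uses the same triple form.
facts = as_triples(john) + [("John Smith", "father of", "Mary")]

for fact in facts:
    print(fact)
```

Both forms carry the same fact; which one you pick is the pragmatic
decision described above.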
>
> EXAMPLE
>
> Above we stated that these represent an entity, attribute, and relationship, respectively: 
>
> John Smith
> Six feet tall
> Father of
>
> Rather than considering "Six feet tall" as an attribute of entity "John Smith", we can consider "Six" to be an entity and there is a relationship (has a height of) between "John Smith" and "Six":
>
> John Smith has a height of Six
>
> Thus, in this example there are two entities ("John Smith" and "Six") and two relationships ("has a height of" and "Father of")
>
>
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>         Data and Datum
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> Data is the plural of datum, a singular item. In practice, however, people use data as both the singular and plural form of the word.
>   
Unfortunately, true.

/Jonas
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>
>   


-- 
Carpe Diem!
===
Jonas Mellin, Assistant Professor in Computer Science
School of Humanities and Informatics, Building E-2
University of Skövde, P.O. Box 408, SE-541 28 Skövde, Sweden
Phone: +46 500 448321, Fax: +46 500 448399
PGP Public Key: http://www.his.se/PageFiles/19377/Jonas_Mellin.asc
Email: jonas.mellin@his.se, URL: http://www.his.se/melj, 

----BEGIN GEEK CODE BLOCK----
GCS d s a+ C++ UL++ US++ P++ L++ E++ W++ N+ o K- w++ O- M V-- 
PS- PE+ Y+ PGP t+ 5 X R* tv- b++ DI+ D+ G+ y++++ e++++ h--- r+++
----END GEEK CODE BLOCK----


Copyright 1993-2007 XML.org. This site is hosted by OASIS