What is Data?
- From: "Costello, Roger L." <costello@mitre.org>
- To: "'xml-dev@lists.xml.org'" <xml-dev@lists.xml.org>
- Date: Mon, 31 Aug 2009 08:23:38 -0400
Hi Folks,
Below is a definition of data, based on our recent discussions. I ask for your comments on these aspects:
1. Is the definition factually correct?
2. Is it general? Are there any hidden assumptions
that restrict the generality of the definition?
3. Is it complete? Is there anything else you would
add to the definition?
4. Is it clear and easy to understand?
/Roger
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
What is Data?
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
When you represent an entity, you've created data.
When you represent an attribute of an entity, you've created data.
When you represent a relationship of an entity, you've created data.
EXAMPLE
There is a fellow over there; here's some data about him:
John Smith
Six feet tall
Father of Mary
"John Smith" represents an entity (the fellow over there). Specifically, it represents the entity by his name.
"Six feet tall" represents an attribute of the entity.
"Father of" represents a relationship between the entity and another entity (Mary).
A SUCCINCT DEFINITION
Data represents entities, attributes, and relationships.
Every piece of data can be categorized as a representation of an entity, an attribute, or a relationship.
Entities, attributes, and relationships are intrinsic (innate, inseparable) parts of data, just as iron and carbon are intrinsic parts of steel.
A representation of a relationship is not more important than a representation of an entity or attribute, i.e. links are not more important than the entities they relate.
Relationships are not different in nature from entities or attributes.
Relationships, entities, and attributes have equal weight.
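Here is a rough sketch of the three categories in Python, using the John Smith example. The class name Entity and its fields are my own illustrative choices, not part of the definition; it is one possible representation among many.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical structure, for illustration only.
    name: str                                          # represents the entity by its name
    attributes: dict = field(default_factory=dict)     # attribute name -> value
    relationships: list = field(default_factory=list)  # (relationship, other entity) pairs


mary = Entity("Mary")
john = Entity("John Smith")
john.attributes["height"] = "Six feet tall"      # an attribute of the entity
john.relationships.append(("Father of", mary))   # a relationship to another entity

kind, other = john.relationships[0]
print(john.name, "|", john.attributes["height"], "|", kind, other.name)
```

Note that this layout hangs attributes and relationships off the entity, which is only a storage convenience; it is not meant to suggest that entities carry more weight than the other two categories.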
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
What's Not Data?
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
The following description of a book is not data, although it contains data:
In this groundbreaking book, evolutionary
biologist Jared Diamond stunningly dismantles
racially biased theories of human history by
revealing the environmental factors actually
responsible for history's broadest patterns.
Here is some of the data:
There is an entity:
- book
It has an attribute:
- innovativeness: groundbreaking
There is an entity:
- evolutionary biologist
It has an attribute:
- name: Jared Diamond
It has a relationship:
- this entity is the author of the book entity
And so forth.
This example shows that text can be mined for data.
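As a toy illustration of mining, the sketch below pulls the same items out of the first part of the description with two hand-written regular expressions. The patterns are assumptions tailored to this one sentence; real text mining would need far more than this.

```python
import re

# First part of the book description quoted above.
blurb = ("In this groundbreaking book, evolutionary biologist Jared Diamond "
         "stunningly dismantles racially biased theories of human history")

# Hypothetical patterns, written only for this sentence.
quality = re.search(r"(\w+) book", blurb).group(1)
role, name = re.search(r"(\w+ biologist) ([A-Z]\w+ [A-Z]\w+)", blurb).groups()

mined = {
    "entities": ["book", role],
    "attributes": [("book", "innovativeness", quality), (role, "name", name)],
    "relationships": [(role, "author of", "book")],
}

for category, items in mined.items():
    print(category, items)
```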
ANOTHER EXAMPLE
This is not data and it contains no data:
Run really fast.
The sentence contains a verb followed by two adverbs ("really" intensifies "fast", which here modifies "Run"). Verbs, adverbs, and adjectives are not data.
Data are nouns.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Simplification
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Recent research suggests that there may be just two categories of data:
1. Entities
2. Relationships
An attribute is merely a special case of a relationship.
EXAMPLE
Above we stated that these represent an entity, attribute, and relationship, respectively:
John Smith
Six feet tall
Father of
Rather than considering "Six feet tall" as an attribute of entity "John Smith", we can consider "Six" to be an entity in its own right, with a relationship (has a height of) between "John Smith" and "Six":
John Smith has a height of Six
Thus, in this example there are two entities ("John Smith" and "Six") and two relationships ("has a height of" and "Father of").
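One way to picture this simplification is to store everything as (entity, relationship, entity) triples. The helper name attribute_as_relationship below is hypothetical, and the sketch assumes every attribute can be phrased as "has a ... of".

```python
def attribute_as_relationship(entity, attribute, value):
    """Recast an attribute as a relationship whose target is itself an entity."""
    return (entity, f"has a {attribute} of", value)


triples = [
    # Formerly an attribute of John Smith, now a relationship to the entity "Six".
    attribute_as_relationship("John Smith", "height", "Six"),
    # Already a relationship between two entities.
    ("John Smith", "Father of", "Mary"),
]

for subject, relationship, obj in triples:
    print(subject, relationship, obj)
```

With this layout there is no separate slot for attributes at all: each line is two entities joined by a relationship.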
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Data and Datum
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Data is the plural of datum, a singular item. In practice, however, people use data as both the singular and plural form of the word.