Memorable quotes from Balisage 2013


From: "Costello, Roger L." <costello@mitre.org>
To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date: Fri, 9 Aug 2013 21:35:14 +0000

Hi Folks,

Balisage 2013 was excellent. Below are some memorable quotes that I jotted down during the conference.

(Balisage attendees: please let me know of any mistakes in the quotes below. Also, please let me know of other memorable quotes that you jotted down.)

----------------------------
Memorable Quotes
----------------------------
1. In software development, the focus is often on trying to avoid breaking the code rather than writing the code you need.

2. "No one wants to learn your markup language." That is, users don't want to think about elements and attributes; they just want to focus on creating the content.

3. The value of open source has been oversold. Proprietary is about taking responsibility for the product (contrast with open source, where no one takes responsibility).

4. DFDL (Data Format Description Language) is an Open Grid Forum approach to getting non-XML data formats (such as JSON) into an XML format. The DFDL approach is to utilize XML Schema and add extra stuff to it. "Not sure how much the JSON folks like that." (The JSON folks hate XML Schema, so the quoted statement was said tongue-in-cheek.)

5. Infosuicide is where one deletes all traces of oneself from the Web - remove websites, email, Facebook, Twitter, etc. Example: Mark Pilgrim committed infosuicide; it is no longer possible to find him on the Web (see this article about Mark: http://www.hanselman.com/blog/410GoneThoughtsOnMarkDiveintomarkPilgrimsAndWhysInfosuicides.aspx).

6. Syntax is anything you can process, semantics is everything else.

7. 99% of humans think that a number is a decimal. We should be designing for humans, not computers. So use the decimal data type!

8. Namespaces are an enormous sledgehammer to solve a very small problem -- the collision of two names within a single XML document (the occurrence of which is highly unlikely).

9. There needs to be a shift in programming: programming must be secondary, data must be primary.

10. JavaScript in the browser is a platform you can build on and it's fast.

11. XSLT is an event-based programming language: templates are fired upon encountering a node.

12. There are currently many metadata vocabularies. However, we need to have a standard metadata vocabulary so that we know it will last.

13. People value things they pay for.

14. The cloud is an unreliable remote server that you can't always reach.

15. XML is about expressing the problem in the way that people think about the problem, whereas HTML is about expressing the problem in the way the browser thinks about things.

16. Who said something (i.e., the authority behind the statement) is more important than an estimation of the probability that the statement is correct (which is usually subjective).

17. In the semantics community they have this term "semantic lifting." Semantic lifting means that one extracts RDF triples out of a document (which is merely a conveyance mechanism), thereby giving the triples semantics. Nonsense! Symbols get their meaning from their context; symbols don't get meaning by being extracted from a document. That is analogous to saying that a string gains semantics by being made longer (which is obviously nonsense).

18. Research has shown that you will concentrate better if you remove your socks and shoes. The explanation for this is that shoes and socks keep the feet warm, thus blood flows into the feet. By removing socks and shoes, more blood flows to the brain.
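
An illustration of quote 7 (use the decimal data type). This is a minimal Python sketch, not something shown at the conference; the function names float_sum and decimal_sum are made up for the example. It contrasts binary floating point with Python's standard decimal module on a sum that most people would expect to be exactly 0.3:

```python
from decimal import Decimal


def float_sum():
    # Binary floating point cannot represent 0.1 or 0.2 exactly.
    return 0.1 + 0.2


def decimal_sum():
    # Decimal values built from strings keep the digits a human wrote.
    return Decimal("0.1") + Decimal("0.2")


print(float_sum() == 0.3)               # False
print(decimal_sum() == Decimal("0.3"))  # True
```

The decimal type trades some speed for results that match pencil-and-paper arithmetic, which is the trade-off the quote argues for.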

---------------------
Cool Websites
---------------------
1. https://medium.com/about/9e53ca408c48

"Medium" is a new place on the Internet where people share ideas and stories that are longer than 140 characters and not just for friends. It's designed for little stories that make your day better and manifestos that change the world. It's used by everyone from professional journalists to amateur cooks. It's simple, beautiful, collaborative, and it helps you find the right audience for whatever you have to say.

2. http://about.travis-ci.org/docs/user/getting-started/

Travis CI is a hosted continuous integration service for the open source community. It is integrated with GitHub. The Travis CI environment provides multiple runtimes (e.g. Node.js or PHP versions), data stores and so on. Because of this, hosting your project on travis-ci.org means you can effortlessly test your library or applications against multiple runtimes and data stores without even having all of them installed locally.

3. https://www.heroku.com/

Heroku provides you with all the tools you need to iterate quickly, and adopt the right technologies for your project. Build modern, maintainable apps and instantly extend them with functionality from hundreds of cloud services providers without worrying about infrastructure. Putting new features into production has never been easier. Set up staging and test environments that match production so you can deliver functionality without fear, and continuously make improvements.

4. http://en.wikipedia.org/wiki/Parsing_expression_grammar

In computer science, a parsing expression grammar, or PEG, is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s. Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.
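
The effect of ordered choice can be seen in a few lines of Python. This is only a sketch under assumptions: the helper names lit, choice, seq and matches are invented here, and the two toy grammars are not taken from the Wikipedia article. Read as CFGs, both grammars describe the same language {"ac", "abc"}; read as PEGs, the order of the alternatives changes what is accepted:

```python
def lit(s):
    """Parser for the literal s: returns the position after s, or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alts):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse


def seq(*parts):
    """Sequence: every part must succeed, one after the other."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def matches(parser, text):
    """True if the parser consumes the whole text."""
    return parser(text, 0) == len(text)


# S <- ("a" / "ab") "c"
first_a = seq(choice(lit("a"), lit("ab")), lit("c"))
# S <- ("ab" / "a") "c"
first_ab = seq(choice(lit("ab"), lit("a")), lit("c"))

print(matches(first_a, "abc"))   # False: "a" wins the choice, then "c" fails on "b"
print(matches(first_ab, "abc"))  # True
```

Because the choice never revisits its second alternative once the first one has succeeded, the grammar author has to order alternatives deliberately, typically longest first.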

5. http://www.elasticsearch.org/overview/

elasticsearch is a flexible and powerful open source, distributed real-time search and analytics engine for the cloud.

6. http://www.amazon.com/Pattern-Language-Buildings-Construction-Environmental/dp/0195019199/ref=sr_1_1?s=books&ie=UTF8&qid=1376077369&sr=1-1&keywords=a+pattern+language

"A Pattern Language" offers a practical language for building and planning based on natural considerations. The reader is given an overview of some 250 patterns that are the units of this language, each consisting of a design problem, discussion, illustration, and solution. By understanding recurrent design problems in our environment, readers can identify extant patterns in their own design projects and use these patterns to create a language of their own.

7. http://en.wikipedia.org/wiki/Earley_parser

In computer science, the Earley parser is an algorithm for parsing strings that belong to a given context-free language. The algorithm, named after its inventor, Jay Earley, is a chart parser that uses dynamic programming; it is mainly used for parsing in computational linguistics. It was first introduced in his dissertation (and later appeared in abbreviated, more legible form in a journal).

Earley parsers are appealing because they can parse all context-free languages, unlike LR parsers and LL parsers, which are more typically used in compilers but which can only handle restricted classes of languages. The Earley parser executes in cubic time, O(n^3), in the general case, where n is the length of the parsed string; in quadratic time, O(n^2), for unambiguous grammars; and in linear time for almost all LR(k) grammars. It performs particularly well when the rules are written left-recursively.
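
A compact Earley recognizer, as a hedged sketch rather than a reference implementation. The function name earley_recognize and the grammar encoding (a dict from nonterminal to a list of right-hand-side tuples) are assumptions made for this example, and the sketch assumes the grammar has no empty right-hand sides. It only answers yes or no; it does not build a parse tree. The sample grammar is left-recursive, which is the case the text above says Earley handles well:

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key of grammar is a terminal.
    Assumes no empty right-hand sides.
    """
    # chart[i] holds items (lhs, rhs, dot, origin) valid after i tokens.
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(i, item):
        if item not in chart[i]:
            chart[i].append(item)

    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        k = 0
        while k < len(chart[i]):  # chart[i] may grow while it is scanned
            lhs, rhs, dot, origin = chart[i][k]
            k += 1
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:
                    # Predict: expect every expansion of the nonterminal here.
                    for alt in grammar[sym]:
                        add(i, (sym, alt, 0, i))
                elif i < len(tokens) and tokens[i] == sym:
                    # Scan: the next token matches the expected terminal.
                    add(i + 1, (lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for lhs.
                for plhs, prhs, pdot, porigin in chart[origin]:
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        add(i, (plhs, prhs, pdot + 1, porigin))

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])


# Left-recursive arithmetic grammar over single-character tokens.
ARITH = {
    "S": [("S", "+", "M"), ("M",)],
    "M": [("M", "*", "T"), ("T",)],
    "T": [("1",), ("2",), ("3",), ("4",)],
}

print(earley_recognize(ARITH, "S", list("2+3*4")))  # True
print(earley_recognize(ARITH, "S", list("2+*4")))   # False
```

The three branches correspond to the usual predictor, scanner and completer steps. A recursive descent parser would loop forever on the rule S -> S + M, whereas the chart lookup in add() stops repeated predictions.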

8. http://en.wikipedia.org/wiki/CYK_parser

In computer science, the Cocke-Younger-Kasami (CYK) algorithm (alternatively called CKY) is a parsing algorithm for context-free grammars, named after its inventors, John Cocke, Daniel Younger and T. Kasami. It employs bottom-up parsing and dynamic programming.

The standard version of CYK operates only on context-free grammars given in Chomsky normal form (CNF). However, any context-free grammar may be transformed to a CNF grammar expressing the same language (Sipser 1997).

The importance of the CYK algorithm stems from its high efficiency in certain situations. Using Landau symbols, the worst case running time of CYK is Θ(n^3 · |G|), where n is the length of the parsed string and |G| is the size of the CNF grammar G. This makes it one of the most efficient parsing algorithms in terms of worst-case asymptotic complexity, although other algorithms exist with better average running time in many practical scenarios.
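
The cubic bound is easy to see in code. The following is a minimal CYK recognizer sketch; the function name cyk_recognize, the way the grammar is passed (a dict for terminal rules and a list of triples for binary rules) and the toy grammar for a^n b^n are all assumptions made for this illustration. The three nested loops over span length, start position and split point give the n^3 factor, and the inner loop over the binary rules gives the |G| factor:

```python
def cyk_recognize(unary, binary, start, tokens):
    """CYK recognizer for a grammar in Chomsky normal form.

    unary maps a terminal to the set of nonterminals A with a rule A -> terminal.
    binary is a list of triples (A, B, C), one per rule A -> B C.
    """
    n = len(tokens)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[length - 1][i] = nonterminals deriving tokens[i:i + length]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[0][i] = set(unary.get(tok, ()))
    for length in range(2, n + 1):              # span length
        for i in range(n - length + 1):         # span start
            for split in range(1, length):      # size of the left part
                left = table[split - 1][i]
                right = table[length - split - 1][i + split]
                for a, b, c in binary:          # every rule A -> B C
                    if b in left and c in right:
                        table[length - 1][i].add(a)
    return start in table[n - 1][0]


# CNF grammar for a^n b^n (n >= 1): S -> A B | A X, X -> S B, A -> a, B -> b
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")]

print(cyk_recognize(UNARY, BINARY, "S", list("aabb")))  # True
print(cyk_recognize(UNARY, BINARY, "S", list("aab")))   # False
```

Every cell of the table is filled regardless of the input, which is why the worst case and the typical case cost the same here, in contrast to chart parsers that only explore spans reachable from the start symbol.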

9. http://en.wikipedia.org/wiki/GLR_parser

A GLR parser (GLR standing for "generalized LR", where L stands for "left-to-right" and R stands for "rightmost (derivation)") is an extension of an LR parser algorithm to handle nondeterministic and ambiguous grammars. First described in a 1984 paper by Masaru Tomita, it has also been referred to as a "parallel parser". Tomita presented five stages in his original work, though in practice it is the second stage that is recognized as the GLR parser.

Though the algorithm has evolved since its original form, the principles have remained intact: Tomita's goal was to parse natural language text thoroughly and efficiently. Standard LR parsers cannot accommodate the nondeterministic and ambiguous nature of natural language, and the GLR algorithm can.

Follow-Ups:
- Re: [xml-dev] Memorable quotes from Balisage 2013
  - From: "Simon St.Laurent" <simonstl@simonstl.com>
- Re: [xml-dev] Memorable quotes from Balisage 2013
  - From: John Cowan <johnwcowan@gmail.com>
- RE: Memorable quotes from Balisage 2013
  - From: "Len Bullard" <cbullard@hiwaay.net>
