I work for a major publishing company. We store content as an XML CLOB, which we index with Oracle 8i. The application retrieves the CLOB and uses XSL to display it. My questions are as follows:
1) Is this a good strategy?
2) Should character references be stored for special characters, or just the characters themselves? (That is, should we store a character reference such as &#8212; for an em dash, or only the actual UTF-8 encoded character?)
3) How do we manage the interchange between persisting and displaying ampersands, quotes, and apostrophes (that is, displaying, searching, and storing Tom & Jerry "War and Peace" Dylan's Crossing)? Here there are issues with quotes when passing values to a DB, with filling form fields with retrieved information, and with displaying that same form information.
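To make question 2 concrete, here is a minimal Python sketch (Python and its standard XML parser are only stand-ins, not our actual stack) showing that an XML parser hands back the same character whether the stored document used a numeric character reference or the literal character. The sample titles are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same title written two ways: with a numeric character reference,
# and with the literal em dash (written here as a Python escape, \u2014).
with_reference = "<title>Tom &amp; Jerry &#8212; Dylan's Crossing</title>"
with_literal = "<title>Tom &amp; Jerry \u2014 Dylan's Crossing</title>"

# After parsing, both forms yield identical text content.
text_from_reference = ET.fromstring(with_reference).text
text_from_literal = ET.fromstring(with_literal).text
same_after_parsing = text_from_reference == text_from_literal

# The literal character is what ends up as UTF-8 bytes when serialized.
utf8_bytes = text_from_literal.encode("utf-8")
```

This only shows what a parser sees. Whether an index built over the raw CLOB text treats the two forms the same is a separate matter that this sketch does not answer.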
Let me give an example of the problems we are finding:
a) I enter the name of a title into a form field, and the name includes quotes: "War and Peace"
b) When I click submit the value is saved in the session, but in what form?
&quot;War and Peace&quot;
\"War and Peace\"
And in which form is it redisplayed in the source HTML?
c) When I click submit on the next page the data goes to the DB, but does it get sent as characters, character references, or escaped characters? How is it stored? How should I handle quote escaping overall for submissions to the DB?
d) When I go to retrieve the title, will I search for "War and Peace" or &quot;War and Peace&quot;?
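To pin down the forms in (b) through (d), here is a minimal Python sketch of one common convention, offered as an assumption rather than as what our application does today: keep the raw characters in the session and in the DB, pass values to SQL through bind parameters instead of hand-escaping quotes, and escape only at the moment HTML is written. The in-memory sqlite3 database stands in for Oracle, and the table name `titles`, the column `name`, and the `session` dict are hypothetical.

```python
import html
import sqlite3

# (a) The user types a title that includes double quotes.
raw_title = '"War and Peace"'

# (b) Keep the raw characters in the session (a plain dict stands in for it).
session = {"title": raw_title}

# Escape only when the value is written back into an HTML attribute.
form_field = '<input type="text" name="title" value="{}">'.format(
    html.escape(session["title"], quote=True)
)

# (c) Send the raw value to the DB through a bind parameter,
# so no manual backslash or quote doubling is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (name TEXT)")
conn.execute("INSERT INTO titles (name) VALUES (?)", (session["title"],))

# (d) Search with the raw characters as well, again via a bind parameter.
rows = conn.execute(
    "SELECT name FROM titles WHERE name = ?", (raw_title,)
).fetchall()
stored_title = rows[0][0]
```

Under this convention the &quot; form exists only in the generated HTML source, and the \" form never appears at all, because the driver handles quoting for bound values.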
I am looking for an overall strategy for storing and escaping at each level, so that we get the appropriate display, the appropriate data to search against, and the appropriate form for passing the data around before it goes to the DB.