I imagine this has been covered a thousand times on this list over the
years, but the search feature in the archives is broken, so I'm going to
have to ask this again... sorry.
I have lots (4k+) of magazine articles with embedded HTML stored in a database
right now.
The HTML is not uniform in quality.
I'm using Tidy to "clean it up" so I have a uniform XHTML style.
These articles also have CSS styling embedded in them, which I want to pull out.
These articles are used across our several websites. All have the same look.
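To make the CSS part concrete, here is a rough Python sketch of the kind of stripping I have in mind. It is only an illustration under some assumptions: the input is already well-formed XHTML (e.g. after Tidy), it contains no named HTML entities such as `&nbsp;`, and "embedded CSS" means `<style>` elements plus inline `style` attributes. The function name `strip_css` is made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_css(xhtml_text, attrs=("style",)):
    """Remove <style> elements and the given attributes from XHTML.

    Assumes xhtml_text is well-formed XML. Text that directly follows a
    removed <style> element (its "tail") is dropped along with it.
    """
    root = ET.fromstring(xhtml_text)

    # Remove <style> elements; snapshot the lists first because we
    # modify the tree while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) == "style":
                parent.remove(child)

    # Remove inline styling attributes from every remaining element.
    for elem in root.iter():
        for attr in attrs:
            elem.attrib.pop(attr, None)

    return ET.tostring(root, encoding="unicode")
```

Passing `attrs=("style", "class")` would also drop `class` attributes, if those count as presentation for a given article.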
1) Do I leave these in XHTML format (with or without the CSS)?
2) Do I "reduce" this to "pure" XML...
<article>
<author id="1234">
<name>Walter</name>
...
</author>
[other publishing info]
<content>
<para>
lots of text here
</para>
<sub-head>
sub-head here
</sub-head>
<para>
lots of text here
</para>
</content>
</article>
or something like this
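As a rough idea of what option 2 would involve, here is a minimal Python sketch that maps cleaned XHTML onto the element names used above. Everything in it is an assumption for illustration: the function name `xhtml_to_article`, the choice to map `p` to `para` and `h1`-`h6` to `sub-head`, and the decision to flatten inline markup (bold, links, etc.) to plain text. It also assumes the input is well-formed XML.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def xhtml_to_article(xhtml_text, author_id, author_name):
    """Convert well-formed XHTML into the <article> structure above.

    Lossy by design: only paragraphs and headings are kept, and any
    inline markup inside them is reduced to its text.
    """
    body = ET.fromstring(xhtml_text)

    article = ET.Element("article")
    author = ET.SubElement(article, "author", {"id": author_id})
    ET.SubElement(author, "name").text = author_name
    content = ET.SubElement(article, "content")

    # iter() walks the tree in document order, so output order
    # follows the order of the source article.
    for elem in body.iter():
        name = local_name(elem.tag)
        if name == "p":
            target = "para"
        elif name in HEADINGS:
            target = "sub-head"
        else:
            continue
        # Collect all nested text and normalize whitespace.
        text = " ".join("".join(elem.itertext()).split())
        if text:
            ET.SubElement(content, target).text = text

    return ET.tostring(article, encoding="unicode")
```

The flattening is the main trade-off: anything the XHTML carried beyond paragraphs and headings is gone after this step, which is part of why I'm unsure between options 1 and 2.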
I'd like to use some "publishing standard" so other organizations can gain
access to our articles without much fuss.
Anyone have any ideas/pointers/URLs/etc they can share?
Thanks
Walter
PS: Also, how would new articles be created? The writers are not XML
coders, nor should they be. How do you folks solve this?