Thanks for that, Simon.
On 4/11/2014 9:16 AM, Simon St.Laurent wrote:
(Also, at this point I think most search engines are still treating
HTML as annotated text, though I hear rumors of DOM-building.)
I had heard they were doing at least some rendering in order to deal
with the problem of invisible text spam (text rendered in white or tiny
fonts or using some other trick to make it invisible to readers, and
intended primarily to deceive the search engine). Maybe that doesn't
require a DOM, but at the very least it requires knowing how to apply
CSS to the elements.
It's been a very long time since I talked about this with anyone who
might know, but long ago the key was CSS selectors. There was some
processing of the stylesheet to look for problems. Then the search
engine watched for matches of those selectors as it read the documents.
No detailed tree building, but tracing that flagged common problems.
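
For anyone who wants something concrete, here is a rough Python sketch of that
kind of selector tracing. It is only a guess at the general shape, not a
description of what any search engine actually did. Everything in it is made
up for illustration: the SUSPICIOUS patterns, the names flagged_selectors,
HiddenTextTracer and find_hidden_text, and the restriction to bare tag, .class
and #id selectors. It scans the stylesheet for rules that look like hidden
text, then streams the document and records text inside elements matching
those selectors, keeping only a stack of flags rather than a tree.

```python
import re
from html.parser import HTMLParser

# Assumed heuristics for "invisible text"; a real list would be much longer.
SUSPICIOUS = re.compile(
    r"display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|font-size\s*:\s*[01]px"
    r"|(?<![-\w])color\s*:\s*(?:white|#fff(?:fff)?)\b",
    re.I,
)

# Elements that never get an end tag, so they must not go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


def flagged_selectors(css):
    """Return the simple selectors whose rule bodies look like hidden text."""
    flagged = set()
    for selectors, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        if SUSPICIOUS.search(body):
            flagged.update(s.strip() for s in selectors.split(","))
    return flagged


class HiddenTextTracer(HTMLParser):
    """Watch for flagged selectors while streaming; no tree is built."""

    def __init__(self, flagged):
        super().__init__()
        self.flagged = flagged
        self.stack = []  # one bool per open element: inside a flagged match?
        self.hits = []   # text found inside flagged elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attr = dict(attrs)
        names = {tag}
        names.update("." + c for c in (attr.get("class") or "").split())
        if attr.get("id"):
            names.add("#" + attr["id"])
        inherited = bool(self.stack) and self.stack[-1]
        self.stack.append(inherited or bool(names & self.flagged))

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hits.append(data.strip())


def find_hidden_text(css, html):
    tracer = HiddenTextTracer(flagged_selectors(css))
    tracer.feed(html)
    tracer.close()
    return tracer.hits


css = ".kw { color: white; font-size: 1px } p { color: black }"
html = '<p>Real content <span class="kw">cheap pills casino</span> here.</p>'
print(find_hidden_text(css, html))  # ['cheap pills casino']
```

The point of the stack of booleans is that the tracer only needs to know
whether it is currently inside a flagged element, which is far cheaper than
building a DOM and resolving the full cascade. The cost is that it misses
anything depending on combinators, specificity, or the background color.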