Re: [xml-dev] MarkMail: now archiving xml-dev
- From: Jason Hunter <jhunter@acm.org>
- To: "Edward C. Zimmermann" <edz@bsn.com>
- Date: Wed, 28 Nov 2007 15:09:42 -0800
Edward C. Zimmermann wrote:
> Quoting Jason Hunter <jhunter@acm.org>:
>
>> If you divide 60 Gigs by 4,000,000 emails that's 15k per email. That's
>> bigger than I would have guessed an average email to be, but you have to
>> take into account the full headers and the influence of the (relatively
>> few) binary attachments.
>
> Even with "full headers" I think 15k average message size (excluding
> attachments) is suspect.
Only on xml-dev could the results of "du -h" against scp'd files be
called into question. :)
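
For anyone who wants to redo the division, here is a rough sketch. The 60 and 4,000,000 figures are the ones quoted above; the variable names are made up for illustration, and since "du -h" reports in powers of 1024, both readings of "Gigs" are shown.

```python
# Figures quoted in the thread; the names are illustrative only.
ARCHIVE_GIGS = 60
MESSAGE_COUNT = 4_000_000


def average_message_bytes(gigs, count, binary_units=False):
    """Average bytes per message for an archive of the given size."""
    unit = 2**30 if binary_units else 10**9
    return gigs * unit / count


# Reading "Gigs" as 10^9 bytes gives the 15k figure from the thread.
decimal_avg = average_message_bytes(ARCHIVE_GIGS, MESSAGE_COUNT)

# Reading it as 2^30 bytes (what "du -h" reports) gives a bit over 16,000.
binary_avg = average_message_bytes(ARCHIVE_GIGS, MESSAGE_COUNT, binary_units=True)

print(round(decimal_avg), round(binary_avg))
```

Either way the average lands in the 15k to 16k range, so the choice of unit does not change the argument.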
> A chunk of email headers could-- if one is bothering
> to clean things up-- be excluded as about the path of email transmission
> and not content. In a service it's not really of interest to anyone how
> the mail arrived and got bounced around in one's own network--- and often
> we don't want to even publish such information.
On MarkMail we definitely don't need to show the world the full headers
-- but we have found several situations where having the full headers
has been useful. Example: Having full Received headers gives you
insight into when people are (unintentionally) lying with their Date headers.
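
To make the Received-versus-Date idea concrete, here is a minimal sketch using only the Python standard library. It is not MarkMail's actual code; the function name and the choice to compare against the earliest hop are assumptions made for illustration.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime
from typing import Optional


def date_skew_seconds(raw_message: str) -> Optional[float]:
    """Seconds between the Date header and the earliest Received stamp.

    A large positive or negative value hints that the sender's clock
    (and so the Date header) is off. Returns None if either header is
    missing or unparseable. Hypothetical helper, for illustration only.
    """
    msg = message_from_string(raw_message)
    date_header = msg.get("Date")
    received = msg.get_all("Received") or []
    if not date_header or not received:
        return None

    # Each relay prepends its own Received header, so the last one
    # listed is the first hop the message took.
    first_hop = received[-1]

    # The timestamp is whatever follows the final semicolon.
    stamp = first_hop.rsplit(";", 1)[-1].strip()

    try:
        claimed = parsedate_to_datetime(date_header)
        observed = parsedate_to_datetime(stamp)
        return (observed - claimed).total_seconds()
    except (TypeError, ValueError):
        # Unparseable date, or one value lacks a usable timezone.
        return None
```

A skew of a few seconds is normal transit time; a skew of hours or years suggests a misconfigured clock on the sending machine.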
> My philosophy is to try to tackle whatever representation model is thrown
> at me. Mail is a model. This way I can throw XML, mail and all kinds of
> other inputs into a big heap, search them (exploiting their structure),
> retrieve bits (exploiting their structure for unit of retrieval) and, should
> I desire, convert on the fly into other representations. With a semantic
> crosswalk one can do some really really wacky things :-)
Sounds fun. Where can I see this in action? (Sorry, I don't know your
background, so when you say "we can..." I don't know where to look.)
-jh-