Good question! It depends on the content. Most human-readable texts are already broken down into conceptual units: articles, sections, chapters, entries, etc. We try to pick one that will at least fill the screen with text, and then impose maximum size constraints based on the delivery channel's capacity. It's not just viewing that informs the choice, though; search often figures into it as well. Ideally search results are 1-1 with viewable chunks; this leads to a natural, easily grasped interface, and makes search implementation straightforward.

Sometimes texts (like novels) don't have natural breaks; in these cases search is less important, reading more so, and we just paginate according to the user's viewport size. Other texts impose their own specific chunking requirements that fight against the simple rules: enormous court documents, or dictionaries where you can search entries, senses (within an entry), or quotations (within a sense). In these cases we try to recast the problem in more familiar terms, sometimes chunking at multiple levels at once for search, but displaying using anchors or pagination within a larger chunk.

Machine-to-machine chunking, I think, is informed by a different set of considerations: transaction boundaries, channel capacity again, ability to roll back and retry, etc. Basically it's a compromise between performance (large messages will tend to be more performant, up to memory limits) and robustness (small messages make a smaller crater when they fail).

As far as human-machine goes, it does also depend to a certain extent on the software. Word can handle much larger documents than in-browser editors, and features like autosave can mitigate the failure to save a large document, but generally speaking I'd say chunk size here is similar to the human-human piece. I do sometimes end up poking around in 50MB XML documents in emacs, sometimes even changing something, and it works fine, but I don't think that's a typical use case.
I find that 100MB is pretty much the limit for that sort of thing.

-Mike

On 1/5/2012 7:14 PM, Len Bullard wrote: