Hi Tom,

> What would people recommend to help a) ensure that the sample *is* representative and b) help target/prioritise the work on the transformation?
> I'm interested in how others approach the problem - any ideas?

We resolve this problem in a couple of ways.

Firstly, we ask our clients to commit to an upfront discovery/analysis session. During this session we will have the client present a range
of samples, and in return we ask a lot of questions. The result of that session is a deep understanding of the content and its meaning, which we document for reference during the development/conversion process.

Secondly, we ask for a DTD/schema for the source format. If none is available, we build our own, as you have suggested. There are tools that can examine a set of files
and build a schema that accounts for all tagging variations encountered. The idea is that the schema forms a contract between ourselves and the client as to what we will convert. Changes to the contract (and consequent pricing variations) can then be accounted for on that basis.

Regarding prioritisation of the transformation, that is something we work out with the client. For example, they may want a phased delivery of various document types, or phased delivery
by content features (e.g. ignore tables in the first round). What best suits the client varies from job to job.

If instead you are asking about managing the prioritisation of transformation requirements, then that is requirements management and there are tools for that. For our simpler jobs that
can even be Trello or a spreadsheet. For some of our more advanced jobs, the requirements were tracked in an XML format and linked into the XSLT by @id so we could trace from stated requirements through to their actual implementation.

I hope that helps.

Regards,

// Gareth Oakes
// Chief Architect, GPSL
// www.gpsl.co
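
As a rough illustration of the schema-building idea above, a first pass over a sample set could simply inventory every element, the attributes seen on it, and the child elements seen inside it. This is only a minimal sketch using the Python standard library, not any particular schema-inference tool; it assumes the samples are well-formed XML, and the `inventory` function and the sample documents are hypothetical names made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def inventory(xml_documents):
    """Summarise which elements, attributes and child elements occur
    across a set of XML documents (each given as a string)."""
    summary = defaultdict(
        lambda: {"count": 0, "attributes": set(), "children": set()}
    )
    for text in xml_documents:
        root = ET.fromstring(text)
        # root.iter() visits the root element and all of its descendants.
        for element in root.iter():
            entry = summary[element.tag]
            entry["count"] += 1
            entry["attributes"].update(element.attrib)  # attribute names only
            entry["children"].update(child.tag for child in element)
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical sample documents, for illustration only.
    samples = [
        "<doc><title>One</title><para id='p1'>Text</para></doc>",
        "<doc><para>Text <emph>more</emph></para><table/></doc>",
    ]
    for tag, info in sorted(inventory(samples).items()):
        print(tag, info["count"], sorted(info["attributes"]), sorted(info["children"]))
```

A summary like this could help show whether a new batch of samples introduces tagging that earlier batches did not contain, and which content features (tables, for instance) are actually present before they are scheduled into a delivery phase.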