
David's solution is perfectly OK for me. It is sufficient that PG does not discourage keeping the extra information (it did until recently). The volunteers will do the rest.

An important improvement would be to be able to go easily from the text to the corresponding page scan. Just having the two separately is fine, but having them linked is better; going from image to text is easy (search), but the converse is often hard. There are of course different solutions. All require preserving the page information in some form; including page numbers in the source is just one method.

Another remark, on page scans obtained from other sources: one of these sources, the one that I mostly use, and that has been the origin of hundreds and probably thousands of PG books, is the French national library, http://gallica.bnf.fr. I have received (by email) a rather broad permission to use everything on the site to produce ebooks for DP and PG, and related sites (I have used the permission for LiberLiber and DP-EU). It might be possible to renegotiate the permission, but that might result in a restriction of the terms. But I believe that the original permission could cover giving the user the possibility of checking an individual page for comparison, though not mirroring their files once the transcription is completed; these files can very well be obtained from the origin. The French national library is not expected to die or to become unavailable; and should that happen, we have the image files.

Carlo