>it would've been best to retain the _end-line_hyphenates_ too
(or else justification is unworkable), as well as _pagebreaks_
(because one objective is to compare the text with the scans.)

I agree with you that you will have trouble with PDF unless you maintain the original source hyphens, but my understanding was that you were trying to work from the PG txt files, which do not retain the original hyphenation. Recovering the original hyphenation should in theory be possible too, but it is not something I have looked at yet. The linebreak recovery algorithm I worked on was intended to let people at DP, for example, resubmit some of the early PG works and run them through DP again if they want to. Without automatic recovery of linebreaks, one faces several days of extremely tedious work reintroducing the original linebreaks.
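To give a rough idea of what automatic linebreak recovery can look like, here is a minimal sketch. It is not the algorithm I actually worked on, just an illustration under stated assumptions: you have a clean, rewrapped text (such as a PG txt file) and a noisy OCR text that still has the original line breaks. The function name restore_linebreaks and the word-alignment approach via Python's difflib are my own choices for the example.

```python
import difflib

def restore_linebreaks(clean_text, ocr_text):
    """Re-break clean_text at the line ends found in ocr_text.

    Hypothetical helper for illustration only. Words are aligned with
    difflib; a break is copied over only when the last word of an OCR
    line matches a word in the clean text exactly.
    """
    clean_words = clean_text.split()
    ocr_lines = [ln.split() for ln in ocr_text.splitlines() if ln.strip()]
    ocr_words = [w for ln in ocr_lines for w in ln]

    # Index (within ocr_words) of the last word on each OCR line.
    line_ends = set()
    count = 0
    for ln in ocr_lines:
        count += len(ln)
        line_ends.add(count - 1)

    # Align the two word sequences; autojunk is disabled so that
    # frequent words are not ignored in long texts.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=clean_words, autojunk=False)

    # Clean-text word indexes after which a line break should be placed.
    break_after = set()
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            if block.a + k in line_ends:
                break_after.add(block.b + k)

    lines, current = [], []
    for i, word in enumerate(clean_words):
        current.append(word)
        if i in break_after:
            lines.append(" ".join(current))
            current = []
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines)


if __name__ == "__main__":
    clean = "The quick brown fox\njumps over the lazy dog"
    ocr = "The qu1ck brown\nfox jumps over\nthe lazy dog"
    print(restore_linebreaks(clean, ocr))
```

Note the limitation: if the last word on an OCR line is misread, or was hyphenated across the line end in the original, no exact match exists and that break is silently lost. That is the same reason recovering end-of-line hyphenation is a harder problem than recovering linebreaks.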

The other alternative for you is to leave healthy right margins and set your PDFs “ragged right” [*very* ragged right!].