
a concrete example might help...

here's the table of contents from "free culture" by lawrence lessig, in zen markup language format, generated automatically from a simple straightforward analysis in about one-half of a second...

even though there are 3 levels of headers, they are very clear, indicated by varying indentation (which represents, at the headers themselves, a varying number of preceding blank lines, of course.)

text-structures even more complex than the one shown in this outline can be communicated easily by the number of preceding blank lines -- _if_ the rule is followed _consistently_ -- and grokked by routines consisting of just a few lines of dirt-simple code...

by the way, just to say something "obvious" that lee probably had not considered before, one of the many ways my routines determine the headers in a digitized text is to look for a "table of contents" section -- usually toward the start of the file, and usually marked with "contents" or "table of contents" as a header -- and then examine that section quite carefully.

ends up it does a very good job of telling you what specific phrases "might be" header-lines.

and if you're cleaning up the o.c.r. of a p-book, for instance, there are usually _page-numbers_ there too, telling what _page_ each header is on. pretty handy, eh?

indeed, in the .pdf of this book, which you can download at http://www.lessig.org, you will see that the page-numbers _are_ there, and chapter 11, chimera, for instance, starts on page 177.

like i said, if you know what a header is likely to be, and on what page it is located, it's fairly easy to find. indeed, people have been using the "table of contents" for precisely that reason for several hundred years now.

this is just one of the reasons why it ain't that hard to write routines to ascertain the headers in a book.

like i said, it sounds very obvious when you hear it. but have you ever heard anyone say it here before?
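to make the table-of-contents idea concrete, here is a rough sketch in python. it is not the actual routine (that code isn't shown in this post), and everything in it is an assumption made for illustration: the function name `toc_candidates`, the two regular expressions, the rule that a page-number must be set off from the title by dot-leaders or at least two spaces, and the rule that two blank lines in a row end the contents section.

```python
import re

# Hypothetical names throughout; this is an illustration, not the post's code.

# A line that consists only of "contents" or "table of contents".
TOC_MARKER = re.compile(r"^(table of )?contents$", re.IGNORECASE)

# A title, optionally followed by dot-leaders or 2+ spaces and a page number,
# e.g. "Chapter 11: Chimera ..... 177". Requiring 2+ separator characters is
# an assumption that keeps "Chapter 11" from being read as page 11.
TOC_ENTRY = re.compile(r"^(?P<title>.*?\S)(?:[\s.]{2,}(?P<page>\d+))?$")


def toc_candidates(lines):
    """Return (title, page) pairs from the first contents section found.

    page is an int when the entry carries a page number, else None.
    """
    entries = []
    in_toc = False
    blank_run = 0
    for raw in lines:
        line = raw.strip()
        if not in_toc:
            # Skip everything until the contents marker shows up.
            if TOC_MARKER.match(line):
                in_toc = True
            continue
        if not line:
            blank_run += 1
            # Assumption: two blank lines in a row close the section.
            if entries and blank_run >= 2:
                break
            continue
        blank_run = 0
        m = TOC_ENTRY.match(line)  # always matches a non-empty stripped line
        page = m.group("page")
        entries.append((m.group("title"), int(page) if page else None))
    return entries
```

each title that comes back is a phrase that "might be" a header-line, and the page-number (when there is one) says roughly where in the text to look for it. real o.c.r. output will be messier than this sketch allows for, so treat the results as candidates to check, not as answers.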
-bowerbird

---------------------------------------------

 TABLE OF CONTENTS

 Free Culture
 Table of Contents
 License
 Publisher Page
 Library of Congress Cataloging
 Dedication
 Preface
 Introduction
 'Piracy'
     Chapter 1: Creators
     Chapter 2: "Mere Copyists"
     Chapter 3: Catalogs
     Chapter 4: "Pirates"
         Film
         Recorded Music
         Radio
         Cable TV
     Chapter 5: "Piracy"
         Piracy I
         Piracy II
 'Property'
     Chapter 6: Founders
     Chapter 7: Recorders
     Chapter 8: Transformers
     Chapter 9: Collectors
     Chapter 10: "Property"
         Why Hollywood Is Right
         Beginnings
         Law: Duration
         Law: Scope
         Law and Architecture: Reach
         Architecture and Law: Force
         Market: Concentration
         Together
 Puzzles
     Chapter 11: Chimera
     Chapter 12: Harms
         Constraining Creators
         Constraining Innovators
         Corrupting Citizens
 Balances
     Chapter 13: Eldred I
     Chapter 14: Eldred II
 Conclusion
 Afterword
     Us, Now
         Rebuilding Freedoms Previously Presumed: Examples
         Rebuilding Free Culture: One Idea
     Them, Soon
         More Formalities
         Shorter Terms
         Free Use Vs. Fair Use
         Liberate the Music -- Again
         Fire Lots of Lawyers
 Footnotes
 Hyperlinks
 Acknowledgments
 Index
 About the Author
 Jacket
 Typos Corrected
 Permissions
 The Dead-Tree Hardback Version of this Work

 zero markup language -- z.m.l. -- the future of electronic-books

---------------------------------------------

p.s. extra points for everyone who realized that -- since the lines in the table of contents section are not to be rewrapped -- that is the reason that all are prefaced with at least one leading space...
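as a rough illustration of the "number of preceding blank lines" rule, here is a minimal python sketch that turns blank-line counts into an indented outline like the listing above. it is only a sketch under stated assumptions, not the z.m.l. routines themselves: the function names, the threshold of two blank lines for a header, the "more blank lines means a higher-level header" mapping, and the 4-space indent step are all made up for the example.

```python
# Hypothetical names and thresholds; the actual z.m.l. rules are not
# spelled out in the post, so every constant here is an assumption.

def headers_by_blank_lines(lines, min_blank=2):
    """Return (blank_count, text) for each line preceded by at least
    min_blank blank lines."""
    headers = []
    blanks = 0
    for raw in lines:
        if raw.strip():
            if blanks >= min_blank:
                headers.append((blanks, raw.strip()))
            blanks = 0
        else:
            blanks += 1
    return headers


def render_outline(headers, step=4):
    """Indent each header by its level; every line gets at least one
    leading space, so a viewer knows not to rewrap it."""
    # Assumption: the largest blank-line count marks the top level.
    counts = sorted({n for n, _ in headers}, reverse=True)
    depth = {n: i for i, n in enumerate(counts)}
    return [" " * (1 + step * depth[n]) + text for n, text in headers]
```

calling `render_outline(headers_by_blank_lines(text.splitlines()))` on a file that follows the rule consistently gives one outline line per header. the sketch works only if the rule really is followed consistently; a stray extra blank line before an ordinary paragraph would be reported as a header.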