Relevant: We can't do real A/B testing, but we can easily do dev/www splitting.

(It's the testing part that's hard, not the A/B part)

On Sep 15, 2024, at 6:42 PM, Greg Newby <gbnewby@pglaf.org> wrote:

Thanks, Joyce. One response below:

On Sun, Sep 15, 2024 at 08:15:17PM +0930, Jacqueline Jeremy wrote:
Thanks for asking for feedback, and thanks to everyone who has
contributed so far.

It's a person who teaches programming,
and a precocious 14-year-old student.

It's great for them to have a project to work on. It is always good to
feel useful.

The idea is to do a monthly addition of browsing categories and
summaries for new books that have undergone human cataloging.

Can't help but think some submitters would have happily submitted a
summary when uploading their books to PG. Going forward rather than
looking back, once there is reserved space for a summary, could
submitters be given an option to include their own summary in a field
on the upload form, which would then be processed as part of the
header? Sure, it might be more work for automation to accommodate
humans, but what's the purpose if humans are made redundant?

It's absolutely possible, and fairly easy, to add an option for a
summary to the upload page. Getting it into the catalog is no
problem - it's just a "500 General Note" or something else in the
500 series.

I think I brought that idea up years ago (probably more than 10) and
there wasn't enthusiasm.

DP, in particular, seemed to frown on adding words that are not
part of the book they are digitizing.

The WWers at the time didn't want to have responsibility for
editorial oversight for submitted summaries.

In short, I like this idea and had floated it in the past. If
DP would consider this, that would be a way to get a lot of human-created
summaries.
 ~ Greg


On Sun, 15 Sept 2024 at 05:32, Greg Newby <gbnewby@pglaf.org> wrote:

Hi Joyce. Great comments! I realize not everyone can respond
to all the different lists on the To: line; however, this response
should let everyone see your comments, with my responses below:

On Sat, Sep 14, 2024 at 12:46:23PM -0500, Joyce Wilson wrote:
I looked through quite a few of the AI-generated summaries, and they seem at least superficially ok, though I think there's still room for skepticism/caution, and there's an annoying "fluffy" and "same-y" quality to the AI-generated text (the current iteration of summaries really seems to love the words "social", "society", and "societal", for instance).  Though I'm not at all thrilled about the idea, I can grudgingly see that providing AI-generated summaries could potentially be useful to users, perhaps especially for users interested in works in languages other than English, since those books are likelier to have pretty minimal metadata.


We updated the prompt so there is a little more variety in the AI
wording, but you are right that they are very similar. This was a
deliberate tradeoff: we do NOT allow a lot of latitude, since more
latitude seems more likely to lead to AI "hallucination."

My expectation is that most people visiting the PG site will not
be reading a ton of summaries. Instead, they'd just like to get
an idea about whether to read a specific book.


But I do have questions and thoughts!

What's in it for the outside programmers who are working on this, and what's the current source of funding?

Nothing, and they're not funded. It's a person who teaches programming,
and a precocious 14-year-old student.

They've committed to making the software & prompts available. There's
not much software, just a little Python code that uses the ChatGPT
API. They also explored other AI models, and prefer Claude to ChatGPT,
but Claude would cost tens of thousands of dollars for the whole PG
collection.
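[For readers curious what "a little Python code that uses the ChatGPT API" might look like: the actual code and prompt are not published in this thread, so the sketch below is purely illustrative. The function and variable names are invented; only the general shape — truncate the text, supply the catalog's author and title rather than letting the AI guess them, send one prompt per book — comes from the discussion.]

```python
# Hypothetical sketch of the summary pipeline discussed in this thread.
# The real prompt and code are not public; all names here are invented.

MAX_CHARS = 12_000  # the thread mentions truncating each book's text


def build_prompt(title: str, author: str, text: str,
                 max_chars: int = MAX_CHARS) -> str:
    """Build a summary prompt. Author and title come from the human-made
    catalog record, since the AI sometimes misidentified them from the
    text alone (as noted later in this thread)."""
    excerpt = text[:max_chars]
    return (
        f"Summarize the following book for a library landing page.\n"
        f"Title: {title}\nAuthor: {author}\n\n"
        f"Beginning of text:\n{excerpt}"
    )


# Calling the API would then look roughly like this (requires the
# openai package and an API key, so it is shown commented out):
#
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4o-mini",
#       messages=[{"role": "user",
#                  "content": build_prompt(title, author, text)}],
#   )
#   summary = resp.choices[0].message.content
```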

They already gave us the "browsing" categories I added a few weeks
ago here: https://www.gutenberg.org/ebooks/bookshelf/

The idea is to do a monthly addition of browsing categories and
summaries for new books that have undergone human cataloging.

What's the expected source of funding in the future?

They estimate ChatGPT will charge $200-300 to make summaries of the
whole collection once we stop exploring. PG will foot the bill.

What happens if ChatGPT subscriptions get considerably more expensive?  My sense is that the price of subscriptions is significantly subsidized, and likely does not reflect the actual cost of providing the service.

The trend is towards greatly decreased costs and greater capabilities,
with fewer limits.

Worst case is we're stuck with our first round of new summaries and
cannot make any more. That seems unlikely, and in fact the prompt
can be used with the free tier of access for ChatGPT and Claude.

Scaling up to 70,000+ books requires a paid account, though.

Can we see the prompt that is used to generate the summaries?  If not, why not?

There have been several. The most recent is attached, along with one
of the earlier ones.

Will AI-generated summary text be treated as keywords for the purposes of keyword searches in the catalog?

I don't think so. I discovered recently that bookshelves are not currently
searchable either (as far as I could tell).

Search improvements are high on my list of things to address, and
separate from AI summaries.

I can tell that the AI-generated summaries page has been updated since I first saw it, because I noticed a change in wording in one of the summaries.  Would AI-generated summaries be more-or-less "set" once they're on the landing page, or would all of them be generated anew each time new summaries are needed for a batch of new books?

Yes, we'll do a big batch and insert them all into the database.

They could be changed or deleted individually if needed, but I would
not expect a new batch - to replace the old batch - until the technique
improves.

One thing I expect is that the AI models will become inexpensive enough
to slurp up an entire book, rather than just the first 18K tokens
(around 12K characters), which is our current setting.

My perception is that lots of folks are eager to produce AI-generated text content.  It's much less clear to me that anyone is very eager to consume AI-generated text content.  Are we fairly sure this is something users want, or is this something we're pursuing because generative AI is (or was) a popular buzzword, and we're having FOMO?

I've wanted summaries for a very long time, and saw no practical way
to do it, at least at scale.

One related thing I've wanted for a very long time is links to Wikipedia
pages for books and authors, when they exist. The programmers are hoping
to tackle that next, and I hope we can triangulate with some software
that Eric wrote years ago for the same purpose.

I think these are desired by readers. Modern physical books have a
summary, and all the online bookstores provide one too. Readers
reasonably expect this, and it's a gap in what PG currently offers.

It's my suspicion that this has more up-side potential for PG's partners in the project than for PG itself.  That would be consistent with the AI-generated audiobooks project, where now we have the PG name attached to a collection of poor-quality audiobooks, while the partners got some nice publicity and now appear to have no interest in correcting serious errors that were reported over a year ago (chunks of omitted text, e.g.).  If PG decides the AI-generated summaries aren't working out, will it be easy to pull the plug?

I don't know what partners are expected to benefit. I'm only trying
to benefit our readers.

For these summaries, we control everything. I'm the person who does
the insertions to the catalog database (via PostgreSQL commands -
though they could be managed via the cataloger admin page also).
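[As a purely illustrative sketch of that kind of insertion: the table and column names below are invented, and Python's built-in sqlite3 stands in for the real PostgreSQL catalog database. The point is just that each summary is a single row, easy to fix or delete individually if a reader reports a problem.]

```python
import sqlite3

# Hypothetical schema; the real PG catalog schema is not shown in
# this thread. sqlite3 is used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE summaries (book_id INTEGER PRIMARY KEY, summary TEXT)"
)


def upsert_summary(conn, book_id: int, summary: str) -> None:
    """Insert or replace one book's summary. Parameterized SQL, and
    one row per book, so individual entries can be corrected or
    deleted without touching the rest of the batch."""
    conn.execute(
        "INSERT INTO summaries (book_id, summary) VALUES (?, ?) "
        "ON CONFLICT(book_id) DO UPDATE SET summary = excluded.summary",
        (book_id, summary),
    )


upsert_summary(conn, 84, "An AI-generated summary, clearly labeled as such.")
```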

I can't speak for other members of the cataloging team, but I will not ever be attempting to quality-check the AI-generated summaries.  My plan would be to ignore them for cataloging purposes, though I'm not sure how well that will work.

Human review of summaries is not part of the plan. We'll solicit reader
feedback and I can delete or fix problematic entries. I don't expect a lot,
and if they exceed what I can handle I'll look then for ways to deal
with the problems.

Note that browsing categories, mentioned above, are derived from
subject cataloging. They cannot happen until after the human catalogers
do their job.

Summaries also need to happen after human cataloging, because the
summaries pull the author + title from the catalog record. We found
the AI was making mistakes in identifying author + title sometimes,
so instead we will tell it the author & title.


My hope is that if I happened to notice that the AI-generated summary seemed to reflect something very different from my understanding of a book, that that would lead me to dig further into other sources as a check on my understanding.  I really hope I would not base catalog metadata on an AI-generated summary, but it's hard to be certain at this point.  Seeing an AI-generated summary might easily contaminate my thinking about a book, a rather disturbing possibility to me.

Since cataloging needs to happen before the AI stuff, there's no danger
of that type of pollution.

What happens if a user reports an inaccuracy in an AI-generated summary?  I don't want to have to deal with that message.

I'll deal with it.

Despite my (somewhat conflicted) lack of enthusiasm, I assume that I can't actually stop this from being implemented, so I'll make a suggestion that at least some A/B testing should be done.  Try adding the AI-generated summaries to odd-numbered books only, and in six months compare the number of actual downloads between books with and without AI-generated summaries.  I assume that PG's goal is not to have people use the AI-generated summaries as a sort of "SparkNotes" in place of reading PG's actual books (though that often seems to be the stated goal of AI-generated summarization).

That's an interesting proposal. I'm not confident that I'd be able to
commit the effort to do the analysis needed. Download counts are highly
variable already, and most books have very low download counts. The
logfiles involved are massive (gigabytes/day).

For testing, my intention was simply to mention the new summaries
on www.gutenberg.org and solicit input. If we find there are big problems,
I could remove some or all summaries as easily as I put them in.
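[For what it's worth, if per-book download totals were already aggregated somewhere, the odd/even comparison Joyce proposes wouldn't require reprocessing the raw logs. A minimal sketch, with fabricated data and invented names, using medians since download counts are heavily skewed:]

```python
from statistics import median


def compare_parity(downloads: dict[int, int]) -> tuple[float, float]:
    """Compare median downloads for odd-numbered books (which would get
    summaries under the proposal) vs even-numbered books (which would
    not). Medians resist the skew of a few very popular titles."""
    odd = [n for book_id, n in downloads.items() if book_id % 2 == 1]
    even = [n for book_id, n in downloads.items() if book_id % 2 == 0]
    return median(odd), median(even)


# Fabricated example data: {book_id: download_count}
sample = {1: 40, 2: 35, 3: 500, 4: 30, 5: 45, 6: 38}
odd_med, even_med = compare_parity(sample)  # → (45, 35)
```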

Will AI-generated summaries be seen as an adequate substitute for a human cataloging team when current members retire or are otherwise no longer available?  The other day I saw this quote from Cory Doctorow: "The AI can’t do your job, but the AI salesman can convince your boss to fire you and replace you with an AI anyway."

You probably don't know that I did my MA thesis on AI in 1988. At the
time, I read essentially all the English-language literature about
AI. I'm sure you know that my whole working career has been at the
high end of high tech.

I'm not an AI fanboy.

Consider that the systems now called AI are in the general category
of "large language models." My observation and belief is that the
purposes we're discussing above are quite well suited to today's
AI systems.

A very specific prompt is needed to avoid hallucination. Earlier in the
year, Roger and I spent time looking at another person's offerings of
AI-based interaction with PG books. We found the words from the AI to be
nicely written and compelling, and then found out that much of what was
written was hallucination - such as characters and scenes the AI reported
that were not actually in the book.

So, we're trying to ensure the summaries are usable and without
serious inaccuracies. That's why the sample exists and was shared, and
has been iterated a few times just in the past week.
 ~ Greg


I'd be very interested in seeing other folks' thoughts on the AI-generated summaries question, so I hope there will be more reply-alls.

--Joyce (cataloging team)

On Thu, Sep 12, 2024, at 12:33 PM, Greg Newby wrote:
Your input is requested: I’ve been working with some programmers to build AI-based book summaries.

The intention is for these to be added to the landing pages for books on the Project Gutenberg website. We have iterated on the prompt to the AI (GPT-4o mini) and, to me, the summaries are pretty good. On landing pages, we’ll clearly label them as AI-generated summaries. In the future, we might replace them with improved summaries.

We are only able to feed the first 12K characters or so to the AI, due to the costs of the AI model (in the future, this will improve). The summaries all have a similar structure: the title & author, some basic background including the period when the book was written, and a second paragraph that characterizes the start of the book. (For just a few stories in these examples, the summary characterizes the whole book, when the text fits within the 12K character limit.)

I’d value general feedback on this approach and the quality of the summaries.

If you have some specific book #s that you know well, and would like to see automated summaries of them, please let me know and we’ll add them to the list.

The summaries are here: https://displaysummaries-johannesseiko.replit.app/

Your input can go in this thread, or by email to me: gbnewby@pglaf.org

Thanks!

 ~ Greg

Dr. Gregory B. Newby
Chief Executive and Director
Project Gutenberg Literary Archive Foundation www.gutenberg.org
A 501(c)(3) not-for-profit organization with EIN 64-6221541
gbnewby@pglaf.org

--
errata-team mailing list -- errata-team@lists.pglaf.org
To unsubscribe send an email to errata-team-leave@lists.pglaf.org