
Joshua asked:
Anyone heard if the BBeB format is open/documented? And even better, if anyone has created an open source converter? If these things take off, it would be nice to have the ability to generate files for them from our collection. If there is an open source converter, there is a chance we could do such a thing right on the server.
My best understanding, from following the Librie list, talking with the "librie guy", and a few snippets of news releases, is that the BBeB Xylog DTD/schema/spec is still unpublished, but that Sony plans to publish the format (and maybe release it as an "open standard").

Looking at the incomplete, reverse-engineered Xylog schema used in the Librie, as well as a couple of Xylog XML documents, has revealed some interesting tidbits:

1) It's an all-in-one XML document: everything is dumped inside a single document, including images, metadata, etc. (I vaguely remember Microsoft trying to patent this idea. Anyone know?)

2) All the examples I've seen are text-encoded in UTF-16. This means either that UTF-16 is supported (along with, hopefully, UTF-8) or that it is required. This makes sense given the Japanese origin of the format where, I gather, UTF-16 is more efficient than UTF-8 when encoding Han characters and such (typically two bytes per character instead of three).

3) It does NOT use CSS. Rather, it uses its own styling scheme, which does not appear to map completely to CSS (or the mapping is very complex). The core model may not be the same as the CSS box model. It is troubling that they chose their own styling language rather than fully embracing some subset of CSS. Part of this may stem from the core layout model, which is faintly reminiscent of PDF.

4) The document structure is dirt simple. There are two types of "text blocks" supported, each sort of analogous to a <div> box. Within a text block one can have one or more <P> (paragraphs), and there is a small supported set of inline tags. As far as I can tell (I can only go by what I've seen so far), there is no support for defined structures such as tables, lists, blockquotes, or even headers. All these things have to be fitted within the text block/paragraph.
This lack of predefined structures appears to make accessibility more difficult, since there are no predefined semantics one can assign to the various structures (which could include, I suppose, sidebars and stuff), so those using text-to-speech may have to figure out what's what without any machine-recognizable cues.

Definitely, the Xylog vocabulary is not suitable for use as a "master" format for etexts. It's more of a derivative format, primarily for visual presentation purposes.

Anyway, these are my impressions from incomplete information. Once the BBeB Xylog schema is published, we'll know for sure. And it is possible that the schema used for the U.S. Sony device will be updated from the one used in the Librie.

It is sad that they ignored established standards (such as HTML, OEBPS, CSS, TEI, etc.) and decided to roll their own. And so far I don't see any innovations that make it better for representing digital publications. I see it as a step backwards. (To be fair, the motivation for developing it may have been to minimize hardware resource requirements, so in that respect it may be innovative, but I see no other advantages, not even in document conversion.)

We'll see... YMMV.

Jon
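
On point 2, the size difference between the two encodings is easy to check. The following is a minimal Python sketch. The sample strings are made up for illustration and are not taken from any BBeB or Xylog document, and `encoded_sizes` is a hypothetical helper name.

```python
# Compare the encoded size of the same text in UTF-8 and UTF-16.
# The sample strings are illustrative only, not taken from any Xylog file.
samples = {
    "japanese": "電子書籍の形式",  # 7 characters: kanji plus hiragana
    "english": "ebook format",    # 12 ASCII characters
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given text."""
    # "utf-16-le" is used so the 2-byte byte-order mark that plain
    # "utf-16" prepends does not skew the comparison.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for name, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{name}: {len(text)} chars, UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

For the Japanese sample this gives 21 bytes in UTF-8 against 14 in UTF-16, while the English sample goes the other way (12 against 24). Since XML markup itself is ASCII, the net saving for a real document would depend on the ratio of markup to text.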