
30 Jul 2009, 4:40 a.m.
But I am reading the RDF into a (file) database. That is more or less what Lucene is. What I am filtering is just what gets inserted into the database, so that its creation is faster and searches run only on the fields that interest me. Sure, it's a lot of code that will break if the format changes, but it reduced the creation step from 5 minutes or so to 40 seconds (this on a fast dual-core computer; I shudder to think what would happen if a user tried to re-index on a 1000 MHz machine). The index is about 33.5 MB and should compress to under 10 MB, probably small enough to be included in the application.
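
To make the filtering idea concrete, here is a rough sketch in Python rather than my actual Lucene code. It streams an RDF/XML-like file and keeps only the fields of interest before anything is handed to the index. The element names (`record`, `title`, `creator`), the sample data, and the helper names are all made up for illustration; the real catalog vocabulary and indexing calls would differ.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical field names; the real RDF vocabulary is not shown in this post.
WANTED_FIELDS = {"title", "creator"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_filtered_records(source, record_tag="record", wanted=WANTED_FIELDS):
    """Yield one dict per record, keeping only the wanted fields."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != record_tag:
            continue
        doc = {}
        for child in elem:
            name = local_name(child.tag)
            if name in wanted and child.text:
                doc.setdefault(name, []).append(child.text.strip())
        yield doc
        elem.clear()  # drop the parsed subtree once the record is consumed


# Tiny made-up sample standing in for the real catalog file.
SAMPLE = b"""<catalog xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title>Example Title</dc:title>
    <dc:creator>Example Author</dc:creator>
    <dc:rights>ignored</dc:rights>
  </record>
</catalog>"""

docs = list(iter_filtered_records(io.BytesIO(SAMPLE)))
```

Each resulting dict would then be turned into one index document. Everything not listed in `WANTED_FIELDS` never reaches the index, which is where the time and size savings come from.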