Unfortunately, I was not able to attend ApacheCon US 2008 in New Orleans this year, way too far away from good ol’ Germany! But after the conference I reviewed the sessions on Lucene (and Solr, Mahout, Tika) to get some inspiration. Some comments on them:
* Advanced Indexing Techniques with Apache Lucene (Michael Busch)
Detailed presentation about the indexing capabilities of Apache Lucene, with a very interesting part on how to use token payloads and POS tagging with the new TokenStream API.
* Apache Solr: Out of the Box (Chris Hostetter)
Introduction to Solr, covering installation and administration (Admin Console, Luke), querying (Facets, Highlighting), and configuration (Analyzers, Multiple Indexes, Replication).
* Introducing Mahout: Apache Machine Learning (Grant Ingersoll)
I already posted this session in my recent Breakfast Links. A nice presentation about what machine learning is and the approach Mahout takes.
* Apache Solr: Beyond the Box (Chris Hostetter)
Presentation about Solr’s history and real-world examples such as geo search.
* Content analysis for ECM with Apache Tika (Paolo Mottadelli)
Impressive and extensive presentation about Apache Tika and its Alfresco integration for content extraction.
