Meyer Information Management Blog

Upcoming Book from Grant Ingersoll, Taming Text

I just stumbled upon a new book from Grant Ingersoll and Thomas Morton called “Taming Text”. It is still being written, but you can already read it through the Manning Early Access Program (MEAP). Since I was really excited about Chapter 4 (Identifying People, Places and Things), I have already bought the MEAP eBook.

The book tries to fill the gap between theoretical, scientific books about text processing and practical software engineering books (like Lucene in Action). The recommended software (e.g. for Natural Language Processing) is mostly open source, and since Grant Ingersoll is a committer on Lucene projects, it is no surprise that Apache Solr and Mahout show up as well.

The book’s chapters:

  • Getting started taming text
  • Foundations of taming text
  • Searching with Apache Solr
  • Identifying people, places and things
  • Keyword tagging
  • Clustering text
  • Document summarization
  • UnTamed text: The Next Frontier

I have only skimmed the book so far, but I really like its balance of theory and technical detail. Chapter 4 on named entity recognition (NER), for example, explains the basics, gives concrete examples with OpenNLP, and also discusses performance.

Keep in mind that the book is still in progress, so its content may change.
