Tutorial for combining ManifoldCF and Solr for file search

NOTE: If you are interested in using ManifoldCF with Solr, you may want to look at our Datafari software, which combines Apache ManifoldCF with Solr and eases this kind of integration. The code is available on GitHub: https://github.com/francelabs/datafari

With the arrival of ManifoldCF 1.0 (now already at v2.5), the open source community is looking for tutorials on combining it with Solr 4. That’s the intent of this tutorial, which will walk you through the steps required to make it work.

First, we’ll recap the installation process of ManifoldCF (we’ll call it MCF from here on) and of Solr. Second, we’ll configure both tools so that they can interact with each other. Third, we’ll configure MCF so that it crawls a Windows file share. In this tutorial, when I mention an installation directory such as solr-4.1.0, you have to prefix it with the absolute path of that installation directory on your machine.