NOTE: this post has a French version at the bottom of this page.
Enterprise Search Europe is the largest European event dedicated to Enterprise Search. Looking at this year’s agenda, I have the feeling that open source will get particular attention. As in recent years, several case studies are dedicated to open source, but in addition, the keynote will focus on it. Charlie Hull, CEO and cofounder of Flax and an expert in open source enterprise search, will be sharing his thoughts on the future of search and the link between search and big data. Other open source tracks include a migration from Exalead to Apache Solr (the talk will be given by France Labs, yeeepieeeee), and a round table on open source implementation. You can find more details on the ESEU 2015 programme page.
NOTE: There is a French version to this tutorial, which you’ll find on the second half of this blog entry.
In this tutorial, we’ll be setting up a SolrCloud cluster on Amazon EC2.
We’ll be using Solr 5.1, the embedded Jetty, Zookeeper 3.4.6 on Debian 7 instances.
This tutorial explains step by step how to reach this objective.
We’ll be installing a set of 3 machines, with 3 shards and 2 replicas per shard, which gives us a total of 9 shards.
We’ll also be installing a Zookeeper ensemble of 3 machines.
This architecture will be flexible enough to allow for a fail-over of one or two machines, depending on whether we’re at the indexing phase or at the querying phase:
- Indexing: one machine can fail without impacting the cluster (the Zookeeper ensemble of 3 machines allows for one machine down). The updates are successfully broadcast to the machines still running.
- Querying: two machines can fail without impacting the cluster. Since each machine hosts 3 shards, a search query can still be processed without problems, the only constraint being a slower response time due to the higher load on the remaining machine.
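The fail-over reasoning above can be sketched in a few lines of Python. This is only an illustration, not part of the tutorial: the machine names `m1` to `m3` and the shard names are made up, and we assume that each of the 3 machines runs one Zookeeper node and hosts one copy of every shard.

```python
# Hypothetical names: the tutorial does not name its machines or shards.
MACHINES = ("m1", "m2", "m3")
SHARDS = ("shard1", "shard2", "shard3")

# Assumption: every machine hosts one copy of each of the 3 shards.
LAYOUT = {machine: set(SHARDS) for machine in MACHINES}


def has_zk_quorum(live):
    """A Zookeeper ensemble needs a strict majority of its nodes running."""
    return len(set(live) & set(MACHINES)) > len(MACHINES) // 2


def all_shards_reachable(live):
    """Every shard must have at least one copy on a live machine."""
    return all(any(shard in LAYOUT[m] for m in live) for shard in SHARDS)


def can_index(live):
    """Indexing needs both a Zookeeper quorum and a live copy of each shard."""
    return has_zk_quorum(live) and all_shards_reachable(live)


def can_query(live):
    """Querying only needs a live copy of each shard in this simplified model."""
    return all_shards_reachable(live)


if __name__ == "__main__":
    for live in ({"m1", "m2", "m3"}, {"m1", "m2"}, {"m1"}):
        print(sorted(live), "index:", can_index(live), "query:", can_query(live))
```

With two machines running, both checks pass; with a single machine, only the query check does, which matches the two cases listed above.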
NOTE: For English version, please look further down.
We have created a French-speaking Solr mailing list, so that developers who feel more comfortable in French than in English can discuss Solr in the language of Molière. Come and join us on the French Solr mailing list!
NOTE: French version at the bottom of this page.
We often read on the web that Elasticsearch is really cool because it is schemaless, and that Solr is not. Although Elasticsearch is cool for many reasons, we want to remind you that Solr has also been schemaless since July 2013 (Solr 4.4).
As a reminder of what schemaless means: without any manual editing of the Solr schema, Solr can recognize some data types automatically when it receives data to be indexed. Those types are: Boolean, Integer, Long, Float, Double, and Date.
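To give a feel for what this kind of type recognition involves, here is a small Python sketch. It is not Solr's implementation: the function name `guess_type`, the order of the checks and the single date format are our own assumptions, and the sketch reports every decimal number as Double instead of trying to tell Float from Double.

```python
from datetime import datetime

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def guess_type(raw):
    """Guess a field type from a raw string value (illustrative only)."""
    text = raw.strip()

    if text.lower() in ("true", "false"):
        return "Boolean"

    # Whole numbers: Integer if the value fits in 32 bits, Long otherwise.
    try:
        number = int(text)
    except ValueError:
        pass
    else:
        return "Integer" if INT32_MIN <= number <= INT32_MAX else "Long"

    # Decimal numbers: simplification, always reported as Double.
    try:
        float(text)
        return "Double"
    except ValueError:
        pass

    # Dates: assumption, only one ISO 8601 UTC layout is accepted here.
    try:
        datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
        return "Date"
    except ValueError:
        return "String"


if __name__ == "__main__":
    for value in ("true", "42", "3000000000", "3.14", "2013-07-01T00:00:00Z", "hello"):
        print(value, "->", guess_type(value))
```

Anything that matches none of the checks falls back to a plain string, which is the usual safe default for this kind of guessing.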
That’s pretty convenient for quick prototyping. Still, as for Elasticsearch, Continue reading
UPDATE: This tutorial is based on Solr 4. If you want to use Solr 5, we strongly recommend following our more recent blog entry on setting up SolrCloud 5 on Amazon EC2.
NOTE: There is a French version of this tutorial, which you’ll find in the second half of this blog entry.
In this tutorial, we’ll be installing a SolrCloud cluster on Amazon EC2.
We’ll be using Solr 4.9, Tomcat 7 and Zookeeper 3.4.6 on Debian 7 instances.
This tutorial will explain how to achieve this result.
We’ll be installing a set of 3 machines with 3 shards and 2 replicas per shard, thus creating a set of 9 shards.
We’ll also be installing a Zookeeper ensemble of 3 machines.
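As an illustration of how such a layout can be requested, the Python sketch below builds a Solr Collections API `CREATE` URL. Several things here are assumptions on our side: the base URL and the collection name are made up, we read "2 replicas per shard" as 3 copies of each shard (hence `replicationFactor=3`, which gives the 9 shards mentioned above), and we do not claim this is the exact call used later in the tutorial.

```python
from urllib.parse import urlencode


def build_create_collection_url(base_url, name, num_shards,
                                replication_factor, max_shards_per_node):
    """Build a Collections API CREATE request URL (no request is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, port and collection name.
    print(build_create_collection_url(
        "http://localhost:8080/solr", "mycollection", 3, 3, 3))
```

The function only assembles the URL, so it can be checked safely before being pasted into a browser or a curl command against a real cluster.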
NOTE: English version on top, French version below.
We have noticed that with Solr 4, there is a problem in the SolrMeter UI related to the evaluation of the cache hit ratio. Digging a bit, the problem is due to a type change between Solr 3 and Solr 4: SolrMeter expects a string, whereas Solr 4 sends back a float. More precisely, Solr 4 does that within its request handler mbean, in the cache subcategory.
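To make the mismatch concrete, here is a tiny Python sketch of a reader that accepts both forms. It is not the SolrMeter patch (SolrMeter is written in Java) and the function name `parse_hit_ratio` is ours; it only shows the idea of tolerating a string as well as a number.

```python
def parse_hit_ratio(value):
    """Return the hit ratio as a float, whether it arrives as text or as a number."""
    if isinstance(value, (int, float)):
        # Numeric form, as described for Solr 4.
        return float(value)
    # Textual form, as described for Solr 3.
    return float(str(value).strip())


if __name__ == "__main__":
    print(parse_hit_ratio("0.85"), parse_hit_ratio(0.85))
```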
We’re now using a patch available for this bug, created by Javier Mendez, see his contribution on this google group.
Still, there is no binary version of SolrMeter, hence this blog. Continue reading
Note: French version available at the second half of this blog entry.
Note: don’t hesitate to test our new open source package solution Datafari, which combines Apache ManifoldCF, Apache Solr and AjaxFranceLabs 🙂
SPIP is a well-known open source platform. We wanted to share with you how to integrate a Solr server graphically into a SPIP server. The scenario is the following: you already have a SPIP-based website, and you want a nice search functionality based on the latest Solr, to benefit from all its cool features. You have set up Solr and crawled your SPIP content, but now you want to have your Solr search in your SPIP website. This is what we present in this tutorial.
NOTE: If you are interested in using ManifoldCF with Solr, you may want to look at our Datafari software, which combines Apache ManifoldCF with Solr, so it eases this kind of integration. The code is available on GitHub: https://github.com/francelabs/datafari
Manifold CF (MCF) provides an early-binding authorization mechanism for file searches. The aim of this entry is to describe this mechanism, and then to show you the different steps needed to configure MCF and Solr to use this functionality.
MCF extracts ACLs from files at crawling-time, and injects them into Solr as specific fields for the Solr document. Continue reading
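The early-binding idea can be sketched in Python as follows. This is a simplified illustration, not MCF or Solr code: the field names `allow_tokens` and `deny_tokens`, the token values and the function `is_visible` are all hypothetical, and the rule we assume is that a document is shown when the user holds at least one allowed token and no denied token.

```python
def is_visible(doc, user_tokens):
    """Check an indexed document's stored ACL tokens against the user's tokens."""
    allowed = set(doc.get("allow_tokens", []))
    denied = set(doc.get("deny_tokens", []))
    tokens = set(user_tokens)
    # Visible only if some allowed token matches and no denied token matches.
    return bool(allowed & tokens) and not (denied & tokens)


def filter_results(docs, user_tokens):
    """Keep only the documents the user is allowed to see."""
    return [doc for doc in docs if is_visible(doc, user_tokens)]


if __name__ == "__main__":
    # Hypothetical documents, with ACL tokens stored at crawling time.
    docs = [
        {"id": "report.doc", "allow_tokens": ["group-sales"], "deny_tokens": []},
        {"id": "salary.xls", "allow_tokens": ["group-hr"], "deny_tokens": ["user-bob"]},
    ]
    print([d["id"] for d in filter_results(docs, ["group-sales", "user-bob"])])
```

Because the tokens are stored with each document at crawling time, the check at query time needs nothing more than the user's own tokens.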
NOTE: If you are interested in using ManifoldCF with Solr, you may want to look at our Datafari software, which combines Apache ManifoldCF with Solr, so it eases this kind of integration. The code is available on GitHub: https://github.com/francelabs/datafari
With the arrival of Manifold CF 1.0 (now already in v2.5), the open source community is looking for tutorials to combine it with Solr 4. That’s the intent of this tutorial, which will drive you through the different steps required to make it work.
First, we’ll recap the installation process of Manifold CF (we’ll call it MCF later on), and of Solr. Second, we’ll configure both tools so that they can interact with each other. Third, we’ll configure MCF so that it crawls a Windows file share. In this tutorial, when I specify an installation directory such as solr-4.1.0, you have to complete it with the absolute path of the installation directory. Continue reading