How to upgrade a SolrCloud cluster – Tutorial

Let’s say we have a SolrCloud cluster running Solr 4.X, and we want to upgrade it to a more recent version such as Solr 6.X. How can we do it?

Well, there are several ways to do it. The cleanest is to install the new version of Solr directly, adapt the configuration files, and reindex all the data. In production, however, that is often not acceptable.

In this tutorial, we will upgrade in two steps: from Solr 4 to Solr 5, and then from Solr 5 to Solr 6. It is not possible to upgrade directly from Solr 4 to Solr 6, i.e. across two major versions, because the index format changes and Solr can only read an index written by the immediately preceding major version.
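
The one-major-version-at-a-time rule can be sketched as a small helper that lists the hops an upgrade has to go through. This is only an illustration of the reasoning above; the function name `upgrade_path` is hypothetical and not part of Solr.

```python
def upgrade_path(current_major: int, target_major: int) -> list[tuple[int, int]]:
    """Return the (from, to) hops needed, moving one major version at a time."""
    if target_major < current_major:
        raise ValueError("downgrades are not covered by this sketch")
    # Each hop stays within the range an index can be read:
    # version N+1 can open an index written by version N.
    return [(v, v + 1) for v in range(current_major, target_major)]


print(upgrade_path(4, 6))  # [(4, 5), (5, 6)]
```

For Solr 4 to Solr 6, this gives exactly the two steps described above.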


Tutorial – Deploying SolrCloud 7 on Amazon EC2

UPDATE: This tutorial is based on Solr 7. If you want to use Solr 8, we strongly recommend following our more recent blog entry on setting up SolrCloud 8 on Amazon EC2.

In this tutorial, we will be setting up a SolrCloud cluster on Amazon EC2.
We’ll be using Solr 7.1 and ZooKeeper 3.4.10 on Debian 9 instances.
This tutorial explains step by step how to reach this objective.

We will be installing a set of 3 machines with 3 shards per server and a replication factor of 3, which gives us a total of 9 shard replicas (3 shards × 3 copies).
We will also be installing a ZooKeeper ensemble of 3 machines.
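
As a rough illustration of this layout, the sketch below builds a Solr Collections API `CREATE` request with 3 shards, a replication factor of 3, and at most 3 shard replicas per node. The host name and collection name are hypothetical, and the full tutorial may create the collection differently (for example with the `bin/solr` script); this is a sketch under those assumptions, not the tutorial's own command.

```python
from urllib.parse import urlencode


def create_collection_url(base_url: str, name: str, num_shards: int,
                          replication_factor: int, max_shards_per_node: int) -> str:
    """Build a Collections API CREATE URL (the request itself is not sent here)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host and collection name, for illustration only.
url = create_collection_url("http://solr1.example.com:8983/solr", "my_collection", 3, 3, 3)
print(url)
```

With 3 nodes, these values let Solr place one replica of every shard on each machine, which matches the 9 replicas described above.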

This architecture is flexible enough to tolerate the failure of one or two machines, depending on whether we are in the indexing phase or the querying phase:

  • Indexing: one machine can fail without impacting the cluster (a ZooKeeper ensemble of 3 machines tolerates one machine being down). Updates are still successfully broadcast to the machines that remain running.
  • Querying: two machines can fail without impacting the cluster. Since each machine hosts 3 shards, a search query can still be processed; the only constraint is a slower response time due to the higher load on the remaining machine.
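
The "one machine down" figure for indexing follows from ZooKeeper needing a strict majority of its ensemble to stay available. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def zk_tolerated_failures(ensemble_size: int) -> int:
    """Number of ZooKeeper servers that can be down while a majority remains."""
    quorum = ensemble_size // 2 + 1  # strict majority of the ensemble
    return ensemble_size - quorum


for n in (3, 5):
    print(n, zk_tolerated_failures(n))
# 3 1
# 5 2
```

An ensemble of 3 therefore tolerates a single failure; tolerating two would require 5 ZooKeeper machines.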


Tutorial – Deploying SolrCloud 6 on Amazon EC2

UPDATE: This tutorial is based on Solr 6. If you want to use Solr 8, we strongly recommend following our more recent blog entry on setting up SolrCloud 8 on Amazon EC2.

In this tutorial, we will be setting up a SolrCloud cluster on Amazon EC2.
We’ll be using Solr 6.6.0 and ZooKeeper 3.4.6 on Debian 8 instances.
This tutorial explains step by step how to reach this objective.

We will be installing a set of 3 machines with 3 shards per server and a replication factor of 3, which gives us a total of 9 shard replicas (3 shards × 3 copies).
We will also be installing a ZooKeeper ensemble of 3 machines.

This architecture is flexible enough to tolerate the failure of one or two machines, depending on whether we are in the indexing phase or the querying phase:

  • Indexing: one machine can fail without impacting the cluster (a ZooKeeper ensemble of 3 machines tolerates one machine being down). Updates are still successfully broadcast to the machines that remain running.
  • Querying: two machines can fail without impacting the cluster. Since each machine hosts 3 shards, a search query can still be processed; the only constraint is a slower response time due to the higher load on the remaining machine.
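
The querying claim can be checked with a small simulation. It assumes the layout described above, where each machine holds one replica of every shard, and it only models replica coverage (ZooKeeper availability is not modeled). The machine and shard names are hypothetical.

```python
from itertools import combinations

MACHINES = ["solr1", "solr2", "solr3"]    # hypothetical host names
SHARDS = ["shard1", "shard2", "shard3"]   # hypothetical shard names

# Assumed placement: every machine hosts one replica of every shard.
placement = {machine: set(SHARDS) for machine in MACHINES}


def queryable(failed: set[str]) -> bool:
    """True if every shard still has at least one replica on a live machine."""
    alive = [m for m in MACHINES if m not in failed]
    covered = set().union(*(placement[m] for m in alive))
    return covered == set(SHARDS)


for k in range(len(MACHINES) + 1):
    ok = all(queryable(set(combo)) for combo in combinations(MACHINES, k))
    print(f"{k} machine(s) down -> all shards queryable: {ok}")
```

Under this placement, any one or two machines can be lost while every shard stays reachable; only losing all three makes the collection unqueryable.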


Tutorial – Deploying SolrCloud 5 on Amazon EC2

UPDATE: This tutorial is based on Solr 5. If you want to use Solr 8, we strongly recommend following our more recent blog entry on setting up SolrCloud 8 on Amazon EC2.

NOTE: There is a French version of this tutorial, which you’ll find in the second half of this blog entry.

In this tutorial, we’ll be setting up a SolrCloud cluster on Amazon EC2.
We’ll be using Solr 5.1 with its embedded Jetty, and ZooKeeper 3.4.6, on Debian 7 instances.
This tutorial explains step by step how to reach this objective.

We’ll be installing a set of 3 machines, with 3 shards and 2 replicas per shard, which gives us a total of 9 shard copies (each shard has a leader plus 2 replicas).
We’ll also be installing a ZooKeeper ensemble of 3 machines.

This architecture is flexible enough to tolerate the failure of one or two machines, depending on whether we’re in the indexing phase or the querying phase:

  • Indexing: one machine can fail without impacting the cluster (a ZooKeeper ensemble of 3 machines tolerates one machine being down). Updates are still successfully broadcast to the machines that remain running.
  • Querying: two machines can fail without impacting the cluster. Since each machine hosts 3 shards, a search query can still be processed; the only constraint is a slower response time due to the higher load on the remaining machine.
