{"id":533,"date":"2020-10-02T10:42:00","date_gmt":"2020-10-02T09:42:00","guid":{"rendered":"http:\/\/www.francelabs.com\/blog\/?p=533"},"modified":"2020-12-21T15:56:35","modified_gmt":"2020-12-21T14:56:35","slug":"tutorial-deploying-solrcloud-8-on-amazon-ec2","status":"publish","type":"post","link":"https:\/\/www.francelabs.com\/blog\/tutorial-deploying-solrcloud-8-on-amazon-ec2\/","title":{"rendered":"Tutorial \u2013 Deploying Solrcloud 8 on Amazon EC2"},"content":{"rendered":"\n<p>In this tutorial, we will be setting up a Solrcloud cluster on Amazon EC2.<br> We\u2019ll be using Solr 8.6.2, Zookeeper 3.5.7 on Debian&nbsp;10 instances.<br> This tutorial explains step by step how to reach this objective.<\/p>\n\n\n\n<p>We will be installing a set of 3 machines, with 3 shards per server, which gives us a total of 9 shards. The replication factor is 3.<br>\nWe will also be installing a Zookeeper ensemble of 3 machines.<\/p>\n\n\n\n<p>This architecture will be flexible enough to allow for a fail-over of one or two machines, depending on whether we are at the indexing phase or at the querying phase:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Indexing: a machine can fail without impacting the cluster (the zookeeper ensemble of 3 machines allows for one machine down). The updates are successfully broadcasted to the machines still running.<\/li><li>Querying: two machines can fail without impacting the cluster. 
Since each machine hosts 3 shards, a search query can be processed without problems, the only constraint being a slower response time due to the higher load on the remaining machine.<\/li><\/ul>\n\n\n\n<!--more-->\n\n\n\n<p>Here is the architecture of what we want to achieve :<\/p>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2015\/05\/Solrcloud5_EC2_01_archi.png\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"251\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2015\/05\/Solrcloud5_EC2_01_archi-300x251.png\" alt=\"Solrcloud 5 archi EC2\" class=\"wp-image-328\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2015\/05\/Solrcloud5_EC2_01_archi-300x251.png 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2015\/05\/Solrcloud5_EC2_01_archi-358x300.png 358w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2015\/05\/Solrcloud5_EC2_01_archi.png 772w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/figure><\/div>\n\n\n\n<p>To achieve this, we will be using Amazon EC2 instances.<br>\nThe steps are:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Configuration of the EC2 instances<\/li><li>Installation of the software components<\/li><li>Configuration of Solr Home<\/li><li>Configuration of Zookeeper<\/li><li>Configuration of Solrcloud<\/li><\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Configuration of EC2 instances:<\/strong><\/p><\/blockquote>\n\n\n\n<p>In this tutorial, the chosen instances are of type t2.medium with the following specs :<\/p>\n\n\n\n<p>t2.medium : 2 vCPUs, 4 GB RAM<\/p>\n\n\n\n<p>This will be more than enough for our tutorial.<\/p>\n\n\n\n<p>Once connected to AWS, go to the EC2 page and create 3 instances of type t2.medium. 
Choose the Debian 10 (Buster) 64-bit image, which is available for free on the AWS Marketplace.<\/p>\n\n\n\n<p>Create or use a security key common to the 3 instances.<\/p>\n\n\n\n<p>Once started, you should have the following 3 AWS instances, named respectively solr1, solr2 and solr3 :<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"78\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster-1024x78.jpg\" alt=\"\" class=\"wp-image-546\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster-1024x78.jpg 1024w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster-300x23.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster-768x59.jpg 768w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster-1536x118.jpg 1536w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster-500x38.jpg 500w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_ec2_cluster.jpg 1699w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>You need to configure the security group associated with the instances so that they can communicate with each other, and so that we can access Jetty from outside.<\/p>\n\n\n\n<p>The rules to be added are (in addition to the SSH connection):<br>\nSolr: TCP 8983<br>\nZookeeper: TCP 2181, 2888, 3888<\/p>\n\n\n\n<p>All traffic from other machines in the same security group<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group.jpeg\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"53\" 
src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group-300x53.jpeg\" alt=\"\" class=\"wp-image-402\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group-300x53.jpeg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group-768x137.jpeg 768w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group-1024x182.jpeg 1024w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group-500x89.jpeg 500w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2017\/06\/security_group.jpeg 1686w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/figure><\/div>\n\n\n\n<p>We advise assigning \u201cElastic IP\u201d addresses to the instances, in order to connect to them using a fixed public IP address.<br>\nWithout Elastic IP addresses, the public IP address and public DNS name of an instance change every time you stop and start it, meaning you\u2019d need to reconfigure your ZK after every restart. With Elastic IPs, the public DNS name stays the same, and from inside AWS it resolves to the private IP address of the instance.<\/p>\n\n\n\n<p>Once the instances are ready, we can connect to them.<\/p>\n\n\n\n<p>For this, use either PuTTY on Windows, or the terminal on Linux\/Mac OS X, with the security key bound to the instances.<br>\nIn this tutorial, we\u2019ll be using iTerm on macOS. 
You can also try, for example, EC2Box (http:\/\/ec2box.com), which can easily send grouped commands to several EC2 instances.<\/p>\n\n\n\n<p>To establish the SSH connection with the EC2 instances, you can find some help here : http:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/AccessingInstancesLinux.html<\/p>\n\n\n\n<p>Basically, you need to change the permissions on the private key associated with your instances :<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>chmod 400 **\/path\/private\/key\/*.pem<\/code><\/pre>\n\n\n\n<p>And the SSH connection command is :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">ssh -i **\/path\/private\/key\/*.pem admin@ELASTIC_IP<\/pre>\n\n\n\n<p>So for our instances the commands will be :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">ssh -i \/Users\/olivier\/Documents\/code\/Olivkeys.pem admin@15.237.35.26\n\nssh -i \/Users\/olivier\/Documents\/code\/Olivkeys.pem admin@15.237.67.168\n\nssh -i \/Users\/olivier\/Documents\/code\/Olivkeys.pem admin@15.237.78.56<\/pre>\n\n\n\n<p>NB : if you did not allocate Elastic IPs, replace the IP with the public DNS name of each instance :&nbsp;ec2-XX-XX-XX-XX.XXX.compute.amazonaws.com<\/p>\n\n\n\n<p>Our instances are created and we have successfully connected to them, so now let\u2019s do the real work!<\/p>\n\n\n\n<p>For the rest of the tutorial, we switch to the root user :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">sudo -i<\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Installing Java and the software components:<\/strong><\/p><\/blockquote>\n\n\n\n<p>We will start by installing OpenJDK 11, then we\u2019ll download Solr and Zookeeper.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Installing Java:<\/strong><\/p><\/blockquote>\n\n\n\n<p>To quickly set up Java,&nbsp;we will install OpenJDK 11 :<\/p>\n\n\n<pre 
class=\"brush: powershell; title: ; notranslate\" title=\"\"> apt-get update<\/pre>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\"> apt-get install openjdk-11-jdk -y<\/pre>\n\n\n\n<p>Set the JAVA_HOME variable : <\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">echo 'export JAVA_HOME=\/usr\/lib\/jvm\/java-11-openjdk' &gt;&gt; \/etc\/profile<\/pre>\n\n\n\n<p>Then to take into account the changes :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">source \/etc\/profile<\/pre>\n\n\n\n<p>To check that Java is properly installed, enter :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">java -version<\/pre>\n\n\n\n<p>And you should be getting :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">openjdk version &quot;11.0.8&quot; 2020-07-14<\/pre>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1deb10u1)<\/pre>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)<\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_jdk_version.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"761\" height=\"986\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_jdk_version.jpg\" alt=\"\" class=\"wp-image-545\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_jdk_version.jpg 761w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_jdk_version-232x300.jpg 232w\" sizes=\"auto, (max-width: 761px) 100vw, 761px\" \/><\/a><\/figure>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Installing the software 
components:<\/strong><\/p><\/blockquote>\n\n\n\n<p><\/p>\n\n\n\n<p>Go to \/root :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">cd \/root<\/pre>\n\n\n\n<p>Download Solr :<br>\nhttp:\/\/lucene.apache.org\/solr\/downloads.html<\/p>\n\n\n\n<p>We choose Solr&nbsp;8.6.2 (.tgz)<\/p>\n\n\n\n<p><strong>wget <a href=\"https:\/\/downloads.apache.org\/lucene\/solr\/8.6.2\/solr-8.6.2.tgz\">https:\/\/downloads.apache.org\/lucene\/solr\/8.6.2\/solr-8.6.2.tgz<\/a><\/strong><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>Download Zookeeper :<br>\nhttp:\/\/www.apache.org\/dyn\/closer.cgi\/zookeeper\/<\/p>\n\n\n\n<p>Then choose Zookeeper 3.5.7 (tar.gz)<\/p>\n\n\n\n<p><strong>wget <a href=\"https:\/\/archive.apache.org\/dist\/zookeeper\/zookeeper-3.5.7\/apache-zookeeper-3.5.7-bin.tar.gz\">https:\/\/archive.apache.org\/dist\/zookeeper\/zookeeper-3.5.7\/apache-zookeeper-3.5.7-bin.tar.gz<\/a><\/strong><\/p>\n\n\n\n<p>Then extract these archives :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">tar xfvz solr*.tgz<\/pre>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">tar xfvz apache-zookeeper-*.tar.gz<\/pre>\n\n\n\n<p>Solr is installed as a service using the provided script :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">cd \/root\/solr-*\/bin<\/pre>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\"> .\/install_solr_service.sh \/root\/solr*.tgz -n<\/pre>\n\n\n\n<p>You can keep the default parameters of the script.<br> All the install files are in \/opt\/solr (do not modify them), the files meant to be modified are in \/var\/solr (Solr home, logs), and the primary configuration file, solr.in.sh, is in \/etc\/default.<br> By default, Solr is installed in \/opt\/solr-8.6.2, with a symbolic link \/opt\/solr pointing to it.<\/p>\n\n\n\n<p>Java is correctly installed, as well as all the software components required to configure our SolrCloud cluster.<br>\nLet\u2019s move 
on to the configuration aspects !<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Configuring Solr :<\/strong><\/p><\/blockquote>\n\n\n\n<p>Let&#8217;s modify the solr.in.sh file located in \/etc\/default by uncommenting and editing the ZK_HOST and SOLR_HOST properties :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">nano \/etc\/default\/solr.in.sh\n\nZK_HOST=\"ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:2181,ec2-15-237-67-168.eu-west-3.compute.amazonaws.com:2181,ec2-15-237-78-56.eu-west-3.compute.amazonaws.com:2181\"\nSOLR_HOST=ec2-15-237-35-26.eu-west-3.compute.amazonaws.com<\/pre>\n\n\n\n<p>(adapt the value of SOLR_HOST for each server: it must be the public DNS name of that server).<br> The RAM is also configured by default. To change these values, you need to modify SOLR_JAVA_MEM.<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">SOLR_JAVA_MEM=&quot;-Xms512m -Xmx512m&quot;<\/pre>\n\n\n\n<p>The most important parameter is ZK_HOST, where one needs to specify the addresses of our ZK ensemble. 
Solr will start automatically in SolrCloud mode when this parameter&nbsp;is&nbsp;set.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Configuring Zookeeper:<\/strong><\/p><p><\/p><\/blockquote>\n\n\n\n<p>Move the zookeeper folder into \/opt :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">mv \/root\/apache-zookeeper-*-bin \/opt\/zookeeper<\/pre>\n\n\n\n<p>Go to \/opt\/zookeeper :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">cd \/opt\/zookeeper<\/pre>\n\n\n\n<p>We will now create the configuration for a ZK ensemble made of 3 machines.<br>\nCreate a folder where Zookeeper will store its data :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">mkdir \/opt\/zookeeper\/data<\/pre>\n\n\n\n<p>Inside it, create a file called myid :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">nano \/opt\/zookeeper\/data\/myid<\/pre>\n\n\n\n<p>Change the value for each instance :<br>\nFor solrcloud1, enter 1<br>\nFor solrcloud2, enter 2<br>\nFor solrcloud3, enter 3<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"984\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid-1024x984.jpg\" alt=\"\" class=\"wp-image-544\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid-1024x984.jpg 1024w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid-300x288.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid-768x738.jpg 768w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid-312x300.jpg 312w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_myid.jpg 
1031w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>Now let&#8217;s move on to the configuration per se, by creating the zoo.cfg file in \/opt\/zookeeper\/conf :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">nano \/opt\/zookeeper\/conf\/zoo.cfg<\/pre>\n\n\n\n<p>In this file, we set the dataDir property, enable autopurge, and add the DNS names of the servers of the cluster :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"># The number of milliseconds of each tick\n tickTime=2000\n # The number of ticks that the initial\n # synchronization phase can take\n initLimit=10\n # The number of ticks that can pass between\n # sending a request and getting an acknowledgement\n syncLimit=5\n # the directory where the snapshot is stored.\n # do not use \/tmp for storage, \/tmp here is just\n # example sakes.\n <strong>dataDir=\/opt\/zookeeper\/data<\/strong>\n # the port at which the clients will connect\n clientPort=2181\n # the maximum number of client connections.\n # increase this if you need to handle more clients\n #maxClientCnxns=60\n #\n # Be sure to read the maintenance section of the\n # administrator guide before turning on autopurge.\n #\n # http:\/\/zookeeper.apache.org\/doc\/current\/zookeeperAdmin.html#sc_maintenance\n #\n # The number of snapshots to retain in dataDir\n #autopurge.snapRetainCount=3\n # Purge task interval in hours\n # Set to \"0\" to disable auto purge feature\n #autopurge.purgeInterval=1\n <strong>autopurge.snapRetainCount=3<\/strong>\n <strong>autopurge.purgeInterval=1<\/strong>\n <strong>server.1=ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:2888:3888\n server.2=ec2-15-237-67-168.eu-west-3.compute.amazonaws.com:2888:3888\n server.3=ec2-15-237-78-56.eu-west-3.compute.amazonaws.com:2888:3888 <\/strong><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><a 
href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_zoo.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"646\" height=\"988\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_zoo.jpg\" alt=\"\" class=\"wp-image-539\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_zoo.jpg 646w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_zoo-196x300.jpg 196w\" sizes=\"auto, (max-width: 646px) 100vw, 646px\" \/><\/a><\/figure>\n\n\n\n<p>Hints :<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Take care to use the public DNS name rather than the public IP address, otherwise the machines won&#8217;t be able to talk to each other.<br>\nTo be sure, check upfront that they can ping each other<\/li><li>Also check the security group of the instances. Ensure that the ports required for Zookeeper communication between instances are properly open (2181, 2888 and 3888)<\/li><\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Starting ZK and Solr:<\/strong><\/p><\/blockquote>\n\n\n\n<ol class=\"wp-block-list\"><li>Start ZK<\/li><\/ol>\n\n\n\n<p>Go to \/opt\/zookeeper\/bin<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">cd \/opt\/zookeeper\/bin<\/pre>\n\n\n\n<p>Then enter :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">bash zkServer.sh start<\/pre>\n\n\n\n<p>You should obtain :<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_start.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"662\" height=\"994\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_start.jpg\" alt=\"\" class=\"wp-image-543\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_start.jpg 662w, 
https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk_start-200x300.jpg 200w\" sizes=\"auto, (max-width: 662px) 100vw, 662px\" \/><\/a><\/figure>\n\n\n\n<p>Open the ZK logs and check that everything is fine :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">tail -f \/opt\/zookeeper\/bin\/zookeeper.*out<\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"739\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log-1024x739.jpg\" alt=\"\" class=\"wp-image-542\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log-1024x739.jpg 1024w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log-300x216.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log-768x554.jpg 768w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log-416x300.jpg 416w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_zk-start_log.jpg 1328w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<ol start=\"2\" class=\"wp-block-list\"><li>Start Solr<\/li><\/ol>\n\n\n\n<p>Warning : Zookeeper MUST be started before Solr. 
Otherwise Solr cannot start in SolrCloud mode.<\/p>\n\n\n\n<p>Start the service directly :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">service solr start<\/pre>\n\n\n\n<p>We can now connect to the web interface from any instance, and start configuring our SolrCloud cluster.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>SolrCloud configuration:<\/strong><\/p><\/blockquote>\n\n\n\n<p>To connect to the web interface of, for example, solrcloud1 :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> http:\/\/ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:8983\/solr<\/pre>\n\n\n\n<p>You should see a Cloud tab, which is a good sign !<br>\nClick on this tab and you should get\u2026 an empty screen, which is normal since no collection has been configured yet. But if you click on the Tree sub-tab and then on the live_nodes folder, you should see the 3 servers listed.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"785\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start-1024x785.jpg\" alt=\"\" class=\"wp-image-547\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start-1024x785.jpg 1024w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start-300x230.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start-768x589.jpg 768w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start-391x300.jpg 391w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/10\/solr8_cloud_first_start.jpg 1157w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>So let&#8217;s add a 
collection made of 3 shards, each with 3 copies (replication factor of 3), on our ensemble of 3 instances.<br>\nTo achieve this, let&#8217;s use the collections API of SolrCloud : https:\/\/cwiki.apache.org\/confluence\/display\/solr\/Collections+API<br>\nThe syntax looks like this :<br>\n\/admin\/collections?action=CREATE&amp;name=name&amp;numShards=number&amp;replicationFactor=number&amp;maxShardsPerNode=number&amp;createNodeSet=nodelist&amp;collection.configName=configname<\/p>\n\n\n\n<p>To do this, we first need a Solr configuration in Zookeeper (parameter collection.configName=configname).<br>\nWe&#8217;ll use the&nbsp;Solr control script, which is available in the Solr distro under&nbsp;bin, to upload our Solr configuration to ZK.<br>\nWe will send to ZK a standard configuration that is present by default in the Solr distribution :&nbsp;sample_techproducts_configs (\/opt\/solr\/server\/solr\/configsets\/sample_techproducts_configs)<\/p>\n\n\n\n<p>BEWARE ! From now on, the commands must be entered on only one instance, not simultaneously on the 3 instances anymore !<\/p>\n\n\n\n<p>For this, go to \/opt\/solr\/bin :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">cd \/opt\/solr\/bin<\/pre>\n\n\n\n<p>Then enter :<\/p>\n\n\n<pre class=\"brush: powershell; title: ; notranslate\" title=\"\">.\/solr zk upconfig -n techproducts -d ..\/server\/solr\/configsets\/sample_techproducts_configs\/<\/pre>\n\n\n\n<p>-n is the name under which the configuration will be stored in ZK, and -d is the folder containing the Solr configuration to be sent (here we use the default configuration provided with Solr). 
The ZK ensemble is read from the ZK_HOST value set in solr.in.sh; it can also be passed explicitly with -z.<\/p>\n\n\n\n<p>Hint: the sequence of the arguments in the command does matter<\/p>\n\n\n\n<p>What&#8217;s left for us is to connect to the web interface to check if the configuration is present in ZK :<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a 
href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solr_zk_config.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"722\" height=\"846\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solr_zk_config.jpg\" alt=\"\" class=\"wp-image-541\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solr_zk_config.jpg 722w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solr_zk_config-256x300.jpg 256w\" sizes=\"auto, (max-width: 722px) 100vw, 722px\" \/><\/a><\/figure>\n\n\n\n<p>We can now create the collection on our 3 instances. Let&#8217;s use the command mentioned above, which comes from the collections API :<br>\n\/admin\/collections?action=CREATE&amp;name=name&amp;numShards=number&amp;replicationFactor=number&amp;maxShardsPerNode=number&amp;createNodeSet=nodelist&amp;collection.configName=configname<\/p>\n\n\n\n<p>Let&#8217;s adapt this command to our case, and type it into our web browser :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><a href=\"http:\/\/ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:8983\/solr\/admin\/collections?action=CREATE&amp;name=francelabs&amp;numShards=3&amp;replicationFactor=3&amp;collection.configName=techproducts&amp;maxShardsPerNode=3\">http:\/\/ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:8983\/solr\/admin\/collections?action=CREATE&amp;name=francelabs&amp;numShards=3&amp;replicationFactor=3&amp;collection.configName=techproducts&amp;maxShardsPerNode=3<\/a><\/pre>\n\n\n\n<p>Some explanation about the parameters :<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:8983 : the public DNS name and port of the solrcloud1 instance; we could also have chosen solrcloud2 or solrcloud3, it doesn&#8217;t matter<\/li><li>name = francelabs : the name of our collection<\/li><li>numShards = 3 : the number of shards that will be sharing the Solr 
index<\/li><li>replicationFactor = 3 : each shard exists in 3 copies, i.e. the original plus 2 replicas (a replication factor of 1 means the shard has no other copy)<\/li><li>maxShardsPerNode = 3 : the maximum number of shard replicas per instance. Here we have 3 shards with a replication factor of 3, hence 3&#215;3 = 9 replicas in total. We have 3 machines, hence 9 \/ 3 = 3 replicas per node, so we set maxShardsPerNode to 3.<\/li><\/ul>\n\n\n\n<p>This command will take some time to complete. Once it is done, check the status of your cloud :<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> http:\/\/ec2-15-237-35-26.eu-west-3.compute.amazonaws.com:8983\/solr\/#\/~cloud<\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"660\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok-1024x660.jpg\" alt=\"\" class=\"wp-image-540\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok-1024x660.jpg 1024w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok-300x193.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok-768x495.jpg 768w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok-466x300.jpg 466w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrcloud_ok.jpg 1239w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<p>And if we want to see which files have been created (connect to any instance, here to solrcloud1) :<br>\ncd \/var\/solr\/data<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"530\" height=\"235\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_folder.jpg\" alt=\"\" 
class=\"wp-image-538\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_folder.jpg 530w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_folder-300x133.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_folder-500x222.jpg 500w\" sizes=\"auto, (max-width: 530px) 100vw, 530px\" \/><\/figure>\n\n\n\n<p>We have 3 created folders, each containing a part of the index :<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"529\" height=\"132\" src=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_data.jpg\" alt=\"\" class=\"wp-image-537\" srcset=\"https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_data.jpg 529w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_data-300x75.jpg 300w, https:\/\/www.francelabs.com\/blog\/wp-content\/uploads\/2020\/09\/solr8_solrhome_data-500x125.jpg 500w\" sizes=\"auto, (max-width: 529px) 100vw, 529px\" \/><\/figure>\n\n\n\n<p>The only folder is data, as the configuration of Solr is stored in Zookeeper.<\/p>\n\n\n\n<p>And VOILA, you now have a fully functional Solrcloud system on Amazon EC2. We hope you enjoyed this blog post!<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we will be setting up a Solrcloud cluster on Amazon EC2. We\u2019ll be using Solr 8.6.2, Zookeeper 3.5.7 on Debian&nbsp;10 instances. This tutorial explains step by step how to reach this objective. 
We will be installing a &hellip; <a href=\"https:\/\/www.francelabs.com\/blog\/tutorial-deploying-solrcloud-8-on-amazon-ec2\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-533","post","type-post","status-publish","format-standard","hentry","category-search"],"_links":{"self":[{"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/posts\/533","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/comments?post=533"}],"version-history":[{"count":43,"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/posts\/533\/revisions"}],"predecessor-version":[{"id":605,"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/posts\/533\/revisions\/605"}],"wp:attachment":[{"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/media?parent=533"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/categories?post=533"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.francelabs.com\/blog\/wp-json\/wp\/v2\/tags?post=533"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}