Active Directory

Disclaimer: This blog post is not really new; it is simply a migration of the technical content of our website. See further down for the French version.

NOTE: If you are interested in using AD with Solr, you may want to look at our Datafari software (still in alpha), which combines Apache ManifoldCF with Solr and so eases this kind of integration. The code is available on Google Code: http://code.google.com/p/datafari/

In enterprise environments, search often needs a security layer that standard web search does not require. To help with this, we are releasing a small piece of code that allows Constellio 1.2 (and probably 1.3, although we have not tested it) to connect to an Active Directory and check credentials at authentication time.

Potential security risk if you use Solr together with an internet-facing CMS

We recently stumbled upon a detailed article by Nicolas Grégoire on attacking Solr through SSRF (server-side request forgery). To summarise: if you think you are safe because Solr is hidden behind another system and only an HTTP server faces the web, you may have problems you did not think about.

While reading this article, I thought about use cases involving CMSs with user management that are accessible from the web. They are a good fit for such attacks. The good news is that Solr 4.6 fixes this vulnerability. The bad news is that you need to migrate quickly if you want to sleep well 😉

Tutorial on Authorizations for Manifold CF and Solr

NOTE: If you are interested in using ManifoldCF with Solr, you may want to look at our Datafari software, which combines Apache ManifoldCF with Solr and so eases this kind of integration. The code is available on GitHub: https://github.com/francelabs/datafari

Manifold CF (MCF) provides an early-binding authorization mechanism for file searches. The aim of this entry is to describe this mechanism, and then to show you the steps needed to configure MCF and Solr to use this functionality.

MCF extracts ACLs from files at crawl time and injects them into Solr as specific fields of the Solr document.
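
As a rough illustration of the index-time step, the sketch below builds a Solr document that carries a file's ACL as extra multi-valued fields. The field names allow_token_document and deny_token_document, the helper build_solr_document, and the sample tokens are assumptions chosen for illustration; the fields actually emitted depend on how your MCF output connection and Solr schema are configured.

```python
import json


def build_solr_document(doc_id, content, allowed, denied):
    """Return a dict that could be sent to Solr as a JSON document.

    The ACL field names are hypothetical; adapt them to your schema.
    """
    return {
        "id": doc_id,
        "content": content,
        # Multi-valued fields: one entry per user or group token.
        "allow_token_document": sorted(allowed),
        "deny_token_document": sorted(denied),
    }


# Sample values only: the path and tokens are made up for this sketch.
doc = build_solr_document(
    "file://server/share/report.doc",
    "Quarterly report",
    allowed={"user:alice", "group:finance"},
    denied={"user:bob"},
)
print(json.dumps(doc, indent=2))
```

Because the ACL travels with the document, no call to the file server is needed at search time; the trade-off is that the stored ACL is only as fresh as the last crawl.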

Activating early binding in Constellio 1.3

While waiting for Constellio V2.0, we thought you might be interested in seeing how to activate early binding in Constellio 1.3.
As a reminder, there are two ways to manage security for document search: early binding and late binding. By security management, we mean that an authenticated user of the search engine sees, in response to a search request, only the results he is actually allowed to see.
Early binding is the recommended way, as it provides the fastest response time. It consists of storing the ACL (Access Control List) of each indexed document as an additional field of the Lucene index. Thus, when someone runs a search, his username is appended to the search query, and the results are filtered on that field based on his username. The pro is that it only adds to the search the time it takes to filter on a field (a very small overhead). The con is that the document ACLs are only synchronised when the documents are recrawled and reindexed. So if you schedule a crawl every night, your indexed ACLs will only be updated every night, which can leave them up to one day out of date. Still, this is the recommended way for standard scenarios, as most enterprise needs do not require a to-the-minute update of file ACLs.
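
The query-time side of early binding can be sketched as follows: the user's identity is turned into a filter query (fq) on the ACL field and sent along with the search. The field name acl, the helper names, and the escaping rule are assumptions made for illustration; Constellio's actual field names and query handling may differ.

```python
from urllib.parse import urlencode


def escape_term(value):
    """Backslash-escape characters that are special in Lucene query syntax."""
    special = set('+-&|!(){}[]^"~*?:\\/ ')
    return "".join("\\" + ch if ch in special else ch for ch in value)


def build_secured_params(user_query, username, acl_field="acl"):
    """Return Solr request parameters restricted to what `username` may see.

    `acl_field` is a hypothetical field name holding the indexed ACL.
    """
    return {
        "q": user_query,
        # Filter query: keep only documents whose ACL field lists the user.
        "fq": f"{acl_field}:{escape_term(username)}",
    }


params = build_secured_params("annual budget", "jdoe")
print(urlencode(params))  # query string to append to the Solr select URL
```

Keeping the restriction in fq rather than merging it into q leaves relevance scoring untouched and lets Solr cache the per-user filter separately from the main query.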