Motivation and Background

picturesafe has developed enterprise solutions for customers in the areas of publishing, licensing, industry and public administration.

Some aspects of our solutions:

  • maintaining 90+ million image assets (IPTC metadata, rights management) with constant ingest and many concurrent users
  • supporting 30+ million article assets (constant ingest of 400,000 articles/day and constant deletions)
  • respecting complex rights and visibility rules (licensing)
  • managing complex hierarchical data structures (ERP) and multi-million attribute updates/hour

Our technology stack is focused on Java enterprise, and we have always relied on a relational, transactional database system to store the ‘single source of truth’. As it turned out, Oracle Database was a perfect fit, at least as long as our focus was on the ‘relational’ or ‘transactional’ part and datasets did not become too large.

As our customers grew, licensing costs became an issue for nearly every company that needed fast searches on large amounts of data. Oracle scales well, but only with additional licensing costs.

Additionally, new search concepts emerged that could not be satisfied with Oracle.

Enter Lucene. Lucene is one of the best fulltext search libraries available. We developed some systems with this technology, but aside from searching we spent a lot of time on index maintenance and scaling issues.

Enter Elasticsearch. Elasticsearch is a distributed, near real-time fulltext search system with strong analytics capabilities. It is still built on Lucene, but index maintenance, scaling and administration can be done via the Elasticsearch REST API.

However, while the Elasticsearch REST API is very good and well documented, it also exposes the complexity of the system. Most of the time there is more than one way to do things, and sometimes you have to choose the right one to be able to combine it with other requirements.

When you decide to add fulltext search to your application while keeping the relational database, you have to maintain consistency between the two, find a way to cope with transactions (or the lack of transactions in Elasticsearch) and handle large data changes that require heavy reindexing.

You may also end up needing to join search results.

Sometimes there are additional requirements to refine or tune the search result (authorisation, visibility, business needs, etc.).

Enter picturesafe-search. Directly from our battle-proven toolset, we formed an opinionated library that not only gives you a head start and fast success in implementing your solution, but also provides a welcome level of abstraction to guide you in the use of Elasticsearch features.

Give it a try!