picturesafe-search-enterprise provides methods to initialize or rebuild indices asynchronously with a single call.
If you want to create an Elasticsearch index based on given data, you may use one of the index initialization methods of the EnterpriseElasticsearchService.
The index initialization is processed asynchronously, so your code does not have to wait for it to complete. The data ingested into the index is provided by a given DocumentProvider. If you also provide an IndexInitializationListener, it will be notified of the progress.
There are two different ways of defining a DocumentProvider:
Static DocumentProvider, defined as a bean in the Spring context (will be autowired)
Dynamic DocumentProvider, passed as a parameter to the index initialization method
If you have an existing Elasticsearch index and want to sync it with your data source, or you want to change the existing field definition, you may use the rebuildIfExists parameter of the index initialization methods in EnterpriseElasticsearchService. The index rebuild is also processed asynchronously, and an IndexInitializationListener will be notified of the progress.
The existing index will still be searchable via its alias while the rebuild is in progress. Any update calls (insert, update or delete of documents) in the meantime will be queued and processed afterwards, so no changes will be lost.
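The queueing behaviour described above can be pictured with a small in-memory model. This is only a conceptual sketch: the class QueueingIndexSketch and its methods are hypothetical and do not reflect how picturesafe-search-enterprise implements the rebuild internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Conceptual model only (hypothetical class); not the library's implementation.
class QueueingIndexSketch {
    private final List<String> applied = new ArrayList<>();
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean rebuilding;

    synchronized void startRebuild() {
        rebuilding = true;
    }

    // While a rebuild runs, updates are parked instead of applied.
    synchronized void update(String operation) {
        if (rebuilding) {
            pending.add(operation);
        } else {
            applied.add(operation);
        }
    }

    // After the rebuild, queued updates are replayed in arrival order.
    synchronized void finishRebuild() {
        rebuilding = false;
        while (!pending.isEmpty()) {
            applied.add(pending.poll());
        }
    }

    // Returns a copy of the operations applied so far.
    synchronized List<String> applied() {
        return new ArrayList<>(applied);
    }
}
```

In this model, an update issued between startRebuild() and finishRebuild() only becomes visible after the rebuild has finished, and the original order of the calls is preserved.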
The DocumentProvider interface may be implemented to have your data ingested into the Elasticsearch index. Its loadDocuments method will be called asynchronously and has to pass the data in chunks to the given callback DocumentHandler.
Note: If your chunk size is too big, the REST request to Elasticsearch may become too large. On the other hand, if your chunk size is too small, the many small requests may slow down the overall index initialization or rebuild. A chunk size between 100 and 1000 is a reasonable starting point.
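The chunked delivery described above might look roughly like the following sketch. The interfaces SketchDocumentHandler and SketchDocumentProvider are simplified, hypothetical stand-ins defined here so that the example is self-contained; the real DocumentProvider and DocumentHandler signatures in picturesafe-search-enterprise may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the library's callback types; the real
// signatures in picturesafe-search-enterprise may differ.
interface SketchDocumentHandler {
    void handleDocuments(List<Map<String, Object>> chunk);
}

interface SketchDocumentProvider {
    void loadDocuments(SketchDocumentHandler handler);
}

// Delivers an in-memory list of documents in fixed-size chunks.
class ListChunkedProvider implements SketchDocumentProvider {
    private final List<Map<String, Object>> documents;
    private final int chunkSize;

    ListChunkedProvider(List<Map<String, Object>> documents, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        this.documents = documents;
        this.chunkSize = chunkSize;
    }

    @Override
    public void loadDocuments(SketchDocumentHandler handler) {
        for (int from = 0; from < documents.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, documents.size());
            // Copy the window so the handler never sees a view of our list.
            handler.handleDocuments(new ArrayList<>(documents.subList(from, to)));
        }
    }
}
```

With 250 documents and a chunk size of 100, the handler in this sketch is called three times, with 100, 100 and 50 documents. A real provider would typically read each chunk from a database or file instead of holding all documents in memory.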
For special purposes, there are some predefined DocumentProvider implementations:
Loads data from a CSV file.
Note: The first line of the CSV file will be considered as the field names.
Loads data from web pages. It will read the HTML data of a given URL and crawl the contained links which refer to the same domain.
Creates test data matching your index settings.
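To illustrate the CSV convention noted above (the first line supplies the field names), here is a minimal sketch of how rows could be turned into documents. The class CsvHeaderSketch is hypothetical and is not the library's CSV provider; it splits on plain commas and ignores quoting, which a real CSV parser would have to handle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration only (hypothetical class): splits on plain commas
// and does not handle quoted fields, escaped separators or embedded line breaks.
class CsvHeaderSketch {
    static List<Map<String, Object>> toDocuments(List<String> lines) {
        List<Map<String, Object>> documents = new ArrayList<>();
        if (lines.isEmpty()) {
            return documents;
        }
        // The first line supplies the field names for all following rows.
        String[] fieldNames = lines.get(0).split(",", -1);
        for (String line : lines.subList(1, lines.size())) {
            String[] values = line.split(",", -1);
            Map<String, Object> document = new LinkedHashMap<>();
            // Surplus columns on either side are ignored in this sketch.
            for (int i = 0; i < fieldNames.length && i < values.length; i++) {
                document.put(fieldNames[i].trim(), values[i].trim());
            }
            documents.add(document);
        }
        return documents;
    }
}
```

All values stay strings in this sketch; converting them to the types expected by the field configuration is left out.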
The IndexInitializationListener interface is a callback for the index initialization or rebuild process. The process notifies the listener of the initialization step currently being performed, the number of documents processed so far, and the total number of documents.
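A listener of this kind could be used to report progress, for example as log lines. The following sketch uses a hypothetical SketchProgressListener interface with an assumed method signature and a made-up step name; the actual IndexInitializationListener methods and step values in picturesafe-search-enterprise may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the library's listener; the real
// IndexInitializationListener signature and step names may differ.
interface SketchProgressListener {
    void onProgress(String step, long processed, long total);
}

// Collects human-readable progress lines, e.g. for logging or a status page.
class RecordingProgressListener implements SketchProgressListener {
    private final List<String> lines = new ArrayList<>();

    // Synchronized because the callback may arrive from a background thread.
    @Override
    public synchronized void onProgress(String step, long processed, long total) {
        // Guard against division by zero when the total is not yet known.
        long percent = total > 0 ? (processed * 100) / total : 0;
        lines.add(step + ": " + processed + "/" + total + " (" + percent + "%)");
    }

    // Returns a copy of the progress lines recorded so far.
    synchronized List<String> lines() {
        return new ArrayList<>(lines);
    }
}
```

Because the initialization runs asynchronously, the listener is the natural place to find out when the process has finished, rather than blocking the calling thread.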
The IndexSettings class bundles the IndexPresetConfiguration and the FieldConfiguration list. If you want to provide these settings dynamically to the index initialization or rebuild call, you can pass an IndexSettings instance as a method parameter.
The provided IndexSettings will be persisted automatically by picturesafe-search-enterprise, so search requests can rely on them. In this case you do not need to define the IndexPresetConfiguration and the FieldConfiguration list as Spring beans.
Please see the index creation and field configuration documentation for more details.
By the picturesafe-search community
Code licensed under Apache License 2.0. Documentation licensed under CC-BY-4.0.