Handbook
Distributed Operation (G7)

Introduction

ATTENTION: This documentation refers to Mindbreeze InSpire G7 generation appliances. If you are using an older version, please read the Handbook - Synchronized Operation G6.

Mindbreeze InSpire can be operated with dedicated Producer and Consumer Nodes. A distinction is also made between master and synchronized nodes.

Exactly one server is the master; it usually also acts as a producer. The other servers act as synchronized nodes, which can be producers or consumers. Configuration is carried out exclusively on the master and is then distributed to all synchronized nodes.

One or more servers serve as producers. Initial indexing and delta indexing are carried out on these nodes according to the valid configuration.

Furthermore, the Producer Servers operate all Mindbreeze indices as well as one Mindbreeze Filter Service each. The indexes are produced (indexed) on these servers and they also perform delta indexing. The Producer Nodes are therefore pure producers of indices.

The indexes created or renewed in this way are automatically distributed to the consumer nodes by copying.

The following components run on the Consumer Nodes (which can also be distributed across several producers):

all Mindbreeze indexes in read mode,
the associated sandbox processes for contextualization and authorization,
(Caching) Principal Resolution Services for Authorization, and
Client Services.

These consumer servers are responsible for answering search queries and providing client services.

Main advantages

Indexing, including delta crawl runs, does not impair search performance
No influence on search performance by:
Initial indexing during operation
Delta indexing during operation
High-frequency delta indexing (e.g. data no older than 15 minutes)
No loss of data sources during re-indexing (searching continues on the consumers with the existing indexes while re-indexing runs on the producers)
Easy migration to new product versions even if re-indexing is recommended/necessary.
Regular re-indexing can be carried out without affecting operation
Flexibility, e.g. when changing configurations
Central administration of all nodes (including app.telemetry)

Prerequisites

Additional hardware for Producer and Consumer Nodes
Load balancers are additionally required if extra failure safety is desired
TCP ports 8443 and 2222 must be accessible from all appliances to each other
Each appliance must have a unique host name that can be resolved by the DNS server
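
The network prerequisites above (ports 8443 and 2222 reachable, host names resolvable) can be checked with a small script before nodes are added. The following is a minimal sketch, not a Mindbreeze tool: the function name check_node and the host names used with it are assumptions for illustration. It only tests reachability from the machine it runs on, so it would have to be run on every appliance to cover all directions.

```python
import socket

REQUIRED_PORTS = (8443, 2222)  # ports named in the prerequisites above


def check_node(host, ports=REQUIRED_PORTS, timeout=3.0,
               resolve=socket.gethostbyname,
               connect=socket.create_connection):
    """Return the DNS result and per-port reachability for one host.

    resolve and connect are injectable so the check can be tested
    without a real network.
    """
    result = {"host": host, "resolved": None, "ports": {}}
    try:
        result["resolved"] = resolve(host)
    except OSError:
        # Name could not be resolved; the ports are not tested in that case.
        return result
    for port in ports:
        try:
            conn = connect((host, port), timeout)
            conn.close()
            result["ports"][port] = True
        except OSError:
            result["ports"][port] = False
    return result


if __name__ == "__main__":
    # Hypothetical host names; replace with the appliances of your cluster.
    for name in ("inspirenode1.myorganization.com",
                 "inspirenode2.myorganization.com"):
        print(check_node(name))
```

A node for which "resolved" is None or any port is False does not yet meet the prerequisites from the point of view of the machine that ran the check.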

Load Balancer

If you have multiple client services distributed across multiple nodes in your setup, you can use load balancers to distribute search queries across your client services to improve performance and ensure resilience.

The load balancer must be configured so that a given user always searches on the same client service. How this is achieved depends on the specific load balancer. The following list gives a few examples:

the load balancer supports session affinity out-of-the-box
requests that come from the same IP address are always forwarded to the same client service
requests with the same cookie "JSESSIONID" are always forwarded to the same client service
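
The affinity rules listed above can be illustrated with a short sketch. It assumes a hash-based mapping from the "JSESSIONID" cookie value (or, as a fallback, the client IP address) to one client service. The function name pick_client_service and the backend names are hypothetical, and a real load balancer will implement stickiness in its own way.

```python
import hashlib


def pick_client_service(backends, session_id=None, client_ip=None):
    """Map a request to one backend, stably, by JSESSIONID or client IP."""
    if not backends:
        raise ValueError("at least one client service is required")
    key = session_id or client_ip
    if key is None:
        raise ValueError("need a JSESSIONID value or a client IP")
    # A cryptographic hash gives the same result in every process,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    services = ["clientservice-a", "clientservice-b"]  # hypothetical names
    print(pick_client_service(services, session_id="0123ABCD"))
```

With this simple modulo scheme, adding or removing a client service changes the mapping for many users. Load balancers typically avoid this with sticky tables or consistent hashing.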

Configuration

Management of Nodes

In the Management Center under "Setup" -> "Nodes" you can manage the cluster of your Mindbreeze InSpire appliances (nodes). When setting up the cluster, open this configuration interface on the Mindbreeze InSpire appliance that you want to use as the master later. The Master Node is always at the top of the Node Overview (see screenshot below) and is marked with a grey background so that it can be distinguished from synchronized nodes.

The screenshot below shows the following configuration:

inspirenode2: Master; Producer for inspirenode1
inspirenode1: Consumer from inspirenode2
inspirenode3: Producer for inspirenode4
inspirenode4: Consumer from inspirenode3
inspirenode5: Acts as its own producer and consumer

Number of documents in your license

Under "Total Licensed Documents" you can view the maximum number of documents you can index under your license.

The number of documents still freely available is also displayed.

Adding Nodes

With the button "Add Node" nodes can be added to the cluster.

Configure the options listed in the table below and then click the Connect button. Options marked with an asterisk are mandatory.

Field	Description
Hostname/IP Address*	Hostname or IP address of the Mindbreeze InSpire Appliance
Username*	User name of the administrator
Password	Password of the administrator
Client ID	Optional: If you do not use the standard OAuth client for authentication, enter the OAuth client ID here (relevant for an external Keycloak).
Client Secret	Optional: If you do not use the standard OAuth client for authentication and it requires a client secret in addition to the client ID, specify the client secret here.

Changing Nodes

You have the possibility to change the source/producer (column "Source") of the node and to display or change further options (button "i", column "Action").

Changing the source

In the column "Source" you can change the source. You have the following options for synchronized nodes:

None

The node has its own index with query service, i.e. it acts as both producer and consumer.

The selected node (<node hostname>) becomes the producer for the consumer.

For a better understanding, see the screenshot in the section "Management of Nodes" above: inspirenode1.myorganization.com is a consumer of the producer inspirenode2.myorganization.com.

Producer

Cannot be selected manually. If a node is selected as the source in the dropdown list of another node, the selected node becomes the producer, and "Producer" is then displayed for that node.

Changing other options

If you click on the "i" button in the "Action" column of a node, you can view more information about that node. You will also find further setting options here.

Basic Node Information

This area displays basic information. If you click on the "Edit" button, you can change these settings. The following table describes the settings that can be changed:

Field	Description
Hostname	The host name of the appliance.
Public Hostname	Derived from Hostname.
Backend Hostname	Derived from Hostname.
Backend Service Base URL	Derived from Hostname.
Service Data Directory	Directory in which, for example, all indexes are stored by default.
Service Tempdata Directory	Temporary directory that is used for index synchronization.
Service Config Directory	Directory in which additional configurations for certain services are stored. This is the case with Retrieval Augmented Generation (RAG) Administration, for example.
Cluster Authentication Method	Specifies the authentication method between the client service and the query service. You can choose between two methods: JSON Web Token (JWT) and Open Authentication (OAuth) 2 (not recommended). Default value: JSON Web Token (JWT). JWT performs better, especially for parallel searches by many users.

Node Status

In this area, status information of the node is displayed. The following table describes the individual fields.

Field	Description
Document Limit	Maximum number of documents that can be indexed on the node. The values vary depending on your Mindbreeze InSpire license.
Indexed Document Count	Number of documents currently indexed on the node.
Number of Services	Number of services running on the node.

Product Version

This area displays product version information for the node. The following table describes the individual fields.

Field	Description
Release	Release name of the installed release
Product	Name of the installed product
Version	Version of the installed product
Trademark	Trademark information
Copyright	Copyright information

Change Master

If you have opened the settings for a non-master node, a button with the name "Propagate this node as master" will be displayed, with which you can make a node the master.

“Indexed Document” Column

This column displays the number of currently indexed documents as a percentage of the maximum number ("Licensed Documents") of the respective node.

“Licensed Documents” Column

This column displays the maximum number of documents that can be indexed in the respective node.

Notes:

A Producer Node may not exceed the number of "Total Licensed Documents".
A Consumer Node must not exceed the number of licensed documents of its associated Producer Node.
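
The two notes above can be expressed as a simple consistency check. The following sketch is illustrative only: the data layout (a dictionary per node with "limit" and "source") and the function name check_document_limits are assumptions, not a Mindbreeze API. A node without a source is treated as a producer.

```python
def check_document_limits(total_licensed, nodes):
    """Return a list of violations of the two licensing notes.

    nodes maps a host name to {"limit": int, "source": host name or None}.
    """
    problems = []
    for name, node in nodes.items():
        source = node.get("source")
        if source is None:
            # Producer (or standalone node): bounded by the total license.
            if node["limit"] > total_licensed:
                problems.append(
                    f"{name}: limit {node['limit']} exceeds "
                    f"total licensed documents {total_licensed}")
        elif node["limit"] > nodes[source]["limit"]:
            # Consumer: bounded by the limit of its producer.
            problems.append(
                f"{name}: limit {node['limit']} exceeds "
                f"producer {source} limit {nodes[source]['limit']}")
    return problems


if __name__ == "__main__":
    cluster = {
        "inspirenode2": {"limit": 1_000_000, "source": None},
        "inspirenode1": {"limit": 1_000_000, "source": "inspirenode2"},
    }
    print(check_document_limits(2_000_000, cluster))
```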

Removing Nodes

If you want to remove a node from the cluster, click on the recycle bin icon of the respective node. Please note that only nodes that are not configured as producers can be removed. If you want to remove a producer, you must first remove all associated consumers from the producer.

Management of Tasks

In the Management Center, under "Setup" -> "Tasks", you can define synchronization tasks that affect the Mindbreeze InSpire appliances (nodes) added in the "Nodes" menu. Always open the "Tasks" menu on your master node, as all settings are made on this node.

Mindbreeze InSpire distinguishes between three types of tasks:

Configuration and data synchronization
Index synchronization
app.telemetry dashboards update

The following sections describe the various tasks in detail.

Configuration and data synchronization

The "Synchronize config and data" list manages all configuration and data synchronization tasks. Unless configured otherwise (see the "Execution" option below), a task transfers files from the master to all synchronized nodes. By default, there is one configuration and data synchronization task that runs every 15 minutes. This task is marked with a yellow frame and cannot be deleted or changed (except for "Schedule" and "Enable/Disable").

Configuration synchronization involves the transfer of configuration files as well as certificates, licenses and plug-ins, for example. This ensures that all files necessary for the configuration are also available on the synchronized nodes.

During data synchronization, all files from /data/apps and /data/resources are distributed by the master to all synchronized nodes.

With the button "Add Task" a new synchronization task can be added. The following table describes the options that can be set.

Option	Description
Enable / Disable	With the slider in the upper right corner, tasks can be activated ("Task is enabled") and deactivated ("Task is disabled"). Deactivated tasks remain in the list of tasks, but are no longer executed (despite a defined schedule).
Name	Name of the task. A descriptive name should be given here in order to keep the overview.
Description	Description of the task. Useful if the name of the task is not sufficient to describe it.
Max Duration	Maximum number of seconds for which the task can run. If this time is exceeded, the task is aborted.
Schedule	Cron expressions that describe the execution times of the task. With the button "Add Schedule", any number of cron expressions can be added and deleted again. Both a graphical editor and a textual editor for cron expressions are available. With the option "Run now", a task is executed immediately and only once. The task is executed with the current settings, regardless of whether "Close" or "Save" is selected afterwards.
Execution	At the current product level, there can only be one master at a time. This may change in later product versions, but the following options are already available: "Execute on all available nodes": all masters synchronize to their nodes; "Execute exclusively on the following nodes": all selected masters synchronize to their nodes; "Do not execute on the following nodes": all unselected masters synchronize to their nodes.
Request	"Sync Config" (to synchronize the configuration) and / or "Sync Data" (to synchronize the data) can be selected

Index Synchronization

The Synchronize indices list manages all index synchronization tasks. A task always ensures that indexes are transferred from all producers to their consumers. By default, there is an index synchronization task that executes an index sync delta every 2 hours. This task is marked with a yellow frame and cannot be deleted or changed (except "Schedule" and "Enable/Disable").

In addition to the standard tasks, it is recommended to perform a full index synchronization ("Operation": "Sync Full") once a day on all producers, for example at night.

With the button "Add Task" a new synchronization task can be added. The following table describes the options that can be set.

Option	Description
Enable / Disable	With the toggle at the top right, tasks can be activated ("Task is enabled") and deactivated ("Task is disabled"). Deactivated tasks remain in the list of tasks, but are no longer executed (despite a defined schedule).
Name	Name of the task. A descriptive name should be given here in order to keep the overview.
Description	Description of the task. Useful if the name of the task is not sufficient to describe it.
Max Duration (in sec.)	Maximum number of seconds for which the task can run. If this time is exceeded, the task is aborted.
Schedule	Cron expressions that describe the execution times of the task. With the button "Add Schedule", any number of cron expressions can be added and deleted again. Both a graphical editor and a textual editor for cron expressions are available. With the option "Run now", a task is executed immediately and only once. The task is executed with the current settings, regardless of whether "Close" or "Save" is selected afterwards.
Execution	Three options are available: "Execute on all available nodes": all producers synchronize to their consumers; "Execute exclusively on the following nodes": all selected producers synchronize to their consumers; "Do not execute on the following nodes": all non-selected producers synchronize to their consumers.
Operation	Three options are available: "Sync Delta": a delta synchronization is performed, which transfers changes from the producer index to the consumer index. "Sync Delta" is recommended if you want to synchronize every 2 hours or more frequently. Up to X changed buckets are synced on each run (X is the index setting "Maximum Number of Final Buckets To Copy", default: 20). "Sync Full": a full synchronization is performed; all changed buckets are synced. "Sync Full" is recommended if the task is to be executed less frequently (e.g. every night). "Sync Forced Full": a full synchronization of all buckets is performed even if there were no changes.
Services	Three options are available: "Execute on all available services": producers synchronize all indexes to their consumers; "Execute exclusively on the following services": producers synchronize all selected indexes to their consumers; "Do not execute on the following nodes": producers synchronize all non-selected indexes to their consumers.
Maximum parallel synchronized indices	If more than one index is running on a producer, the indices are synchronized in parallel. To reduce the system load caused by synchronization, the maximum number of parallel synchronizations can be configured. A value of 0 means all indices are synchronized in parallel. A value of 1 means the indices are synchronized sequentially.
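
The meaning of the "Maximum parallel synchronized indices" values (0 = all in parallel, 1 = sequential, N = at most N at a time) can be illustrated with a thread pool. This is a sketch of the described behavior only, not the product's implementation. The names sync_indices and sync_one are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def sync_indices(indices, sync_one, max_parallel=0):
    """Run sync_one for every index with bounded parallelism.

    max_parallel == 0: all indices at once
    max_parallel == 1: one after the other
    max_parallel == N: at most N at the same time
    """
    if max_parallel < 0:
        raise ValueError("max_parallel must be 0 or greater")
    if not indices:
        return []
    if max_parallel == 0:
        workers = len(indices)
    else:
        workers = min(max_parallel, len(indices))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the order of the input list.
        return list(pool.map(sync_one, indices))


if __name__ == "__main__":
    # Hypothetical index names; the lambda stands in for a real sync call.
    print(sync_indices(["index-a", "index-b"], lambda name: f"{name}: ok",
                       max_parallel=1))
```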

app.telemetry dashboard update

A description of this task can be found in separate documentation.

Current Tasks

The "Current Tasks" list gives an overview of tasks that are currently running or were completed within the last 5 minutes. A maximum of 15 tasks is displayed.
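
The display rule described above (running tasks plus tasks finished within the last 5 minutes, capped at 15 entries) can be sketched as a filter. The task records and the function name current_tasks are hypothetical. Which 15 tasks are shown when there are more is not documented, so the sketch simply keeps the first 15 in input order.

```python
from datetime import datetime, timedelta


def current_tasks(tasks, now, window=timedelta(minutes=5), limit=15):
    """Select the tasks that would appear under "Current Tasks".

    Each task is a dict with "status" and "finished" (datetime or None).
    """
    visible = [
        task for task in tasks
        if task["status"] == "In Progress"
        or (task["finished"] is not None and now - task["finished"] <= window)
    ]
    # Assumption: keep the first `limit` entries in input order.
    return visible[:limit]


if __name__ == "__main__":
    now = datetime.now()
    sample = [
        {"name": "Sync Delta", "status": "In Progress", "finished": None},
        {"name": "Sync Config", "status": "Completed",
         "finished": now - timedelta(minutes=12)},
    ]
    print([task["name"] for task in current_tasks(sample, now)])
```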

If a task is currently running, it can be aborted with the "Cancel" button.

If a task is no longer running, it can be restarted with the "Rerun" button.

A task can have different statuses:

In Progress: The task is currently running
Completed: The task was successfully completed.
Failed: The task could not be completed successfully. In the details you can find the error description.
Canceled: The task was aborted (button "Cancel")

Task History

If you want to display more than the last 15 tasks from the "Current Tasks", you can use the "Task History". A period must first be selected in which the tasks to be displayed have taken place. The task history can then be displayed with the "Get Task History" button. The following periods can be selected:

Period	Description
Today	All the tasks that ran today
Yesterday	All the tasks that ran yesterday
Last 7 Days	All tasks that have run within the last 7 days
Last 30 Days	All tasks that have run within the last 30 days
This Month	All tasks that have been run this month
Last Month	All tasks that ran in the last month

With the button "Previous" and "Next" you can navigate to the previous or next page. Between the two buttons you can see which page you are currently on and how many pages there are in total. The dropdown box at the top right (under "Status") can be used to filter for "Completed", "Canceled", "Failed" and "In Progress" tasks. With the button "Download Data" the information of the tasks (all pages) can be downloaded as JSON file. With the button "Close" the window can be closed again.

Service Configuration in Manager UI

As already mentioned, the configuration is carried out exclusively on the master. To make changes, go to the menu "Configuration" in the Mindbreeze Management Center at the Master.

Please note that the changes only take effect after configuration synchronization on the other nodes. If you do not want to wait for the next synchronization task, you can start a synchronization manually (see Configuration and data synchronization).

The following subsections explain how to configure the different Mindbreeze InSpire services on different nodes.

Note: As a rule, only master, synchronized and producer nodes should be selected when configuring a service. All services that are to run on the consumer nodes are automatically configured on the consumer nodes when the configuration is synchronized, so they do not have to be configured explicitly. Detailed information can be found in the following subsections.

The following node configuration is used for the following examples:

inspirenode2: Master; Producer for inspirenode1
inspirenode1: Consumer from inspirenode2
inspirenode3: Producer for inspirenode4
inspirenode4: Consumer from inspirenode3
inspirenode5: Acts as its own producer and consumer

Index Service

If you want to create a new index in the tab "Indexes" (with the button "Add Index"), you can select the desired node in the dropdown box "Index Node". After the configuration has been synchronized, the Index Service also runs on all consumers of the selected node (producer). The data source (crawler), however, only runs on the selected node (producer).

If you want to change the node of an existing index, you can do this in the index in the Setup section with the dropdown box "Index Node".

Index Layout

Activate the "Advanced Settings" to configure the "Index Layout".

Setting	Description
Multi-Index Layout	Active if checked (default: checked).
Enable Conversion to Multi-Index Layout	Specifies in which case an index is converted to the multi-index layout. Options: "Auto": same behavior as "Read-Only Index"; "Read-Only Index": only active on Consumer Nodes; "Enabled": active; "Disabled": not active.
Maximum Number of Index Entries in the Multi-Index Layout	The maximum number of index entries (index versions) that are retained. If the number is exceeded, the oldest index entry is automatically deleted (default: 10).
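
The retention rule for index entries (keep at most N versions, delete the oldest first) can be illustrated as follows. This sketch only models the rule as described. The function name add_index_entry and the entry names are hypothetical, and the product's actual cleanup logic is not documented here.

```python
def add_index_entry(entries, new_entry, max_entries=10):
    """Append new_entry and drop the oldest entries beyond max_entries.

    entries is ordered from oldest to newest.
    Returns (kept_entries, removed_entries).
    """
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    combined = list(entries) + [new_entry]
    overflow = max(0, len(combined) - max_entries)
    removed = combined[:overflow]   # oldest entries are removed first
    kept = combined[overflow:]
    return kept, removed


if __name__ == "__main__":
    versions = [f"entry-{i}" for i in range(1, 11)]  # already 10 entries
    kept, removed = add_index_entry(versions, "entry-11")
    print(len(kept), removed)
```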

Index Synchronization Settings

Activate the "Advanced Settings" to configure the following options in "Index Synchronization Settings":

Setting	Description
SyncDelta Outgoing Directory	Allows setting a custom temporary directory used for outgoing synchronization operations.
Maximum Number of Final Buckets To Copy	Allows overriding the default number of buckets copied within one synchronization operation.
Enable Task History Cleanup	If active, the last task status files are deleted when the index is started. The maximum number of deleted files can be changed with the "Maximum Number of Initial Cleaned-Up Task History Entries" option and is set to 500 000 by default.
Maximum Number of Persistent Task History Entries	Allows you to specify the maximum number of persistent task history files that are stored locally. These files will not be deleted by the Task History Cleanup. Default value: 10 000.
Maximum Number of Initial Cleaned-Up Task History Entries	Allows you to configure the maximum number of Task Status files that can be deleted during the Task History Cleanup. Default value: 500 000.
Maximum Number of Synchronization Threads	Allows limiting the number of threads used for a synchronization operation.
Wait for Invertion Completed before Synchronization	If active, the index waits for the current inversion tasks before the synchronization process, so that the synchronized data is complete. (Default value: active).
Resolve Index Conflicts on Synchronization	If active, the index tries to resolve index synchronization conflicts implicitly. (Default value: active).
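
How the setting "Maximum Number of Final Buckets To Copy" interacts with the three synchronization operations can be summarized in a small sketch. It follows the descriptions in this handbook only. The bucket representation and the function name buckets_to_copy are assumptions, and the order in which "Sync Delta" picks changed buckets is not documented, so input order is used.

```python
def buckets_to_copy(operation, buckets, max_final_buckets=20):
    """Return the names of the buckets one run would copy.

    buckets is a list of (name, changed) tuples.
    """
    changed = [name for name, is_changed in buckets if is_changed]
    if operation == "Sync Delta":
        # At most max_final_buckets changed buckets per run.
        return changed[:max_final_buckets]
    if operation == "Sync Full":
        # All changed buckets.
        return changed
    if operation == "Sync Forced Full":
        # All buckets, changed or not.
        return [name for name, _ in buckets]
    raise ValueError(f"unknown operation: {operation}")


if __name__ == "__main__":
    sample = [(f"bucket-{i}", i % 2 == 0) for i in range(100)]
    for op in ("Sync Delta", "Sync Full", "Sync Forced Full"):
        print(op, len(buckets_to_copy(op, sample)))
```

With 50 changed buckets and the default of 20, a "Sync Delta" run in this model needs three runs to catch up, which is one reason a regular "Sync Full" is recommended in addition.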

Services

If you want to create a new service in the "Indices" tab, you can click the "Add Service" button and select all nodes in the "Nodes" section of the created service on which the service is to run. Here, too, no consumer node should be explicitly selected, since when the configuration is synchronized, the service automatically starts at the consumer if it is required for the search (e.g. with Principal Resolution Services).

Filter Service

If you want to create a new filter in the "Filters" tab, you can click the "Add Service" button and select all nodes in the "Nodes" section of the created service on which the service is to run. Filter Services never run on consumers because they are only used for indexing.

Client Service

If you want to create a new Client Service in the tab "Client Services" (with the button "Add Client Service"), you can select the desired node in the dropdown box "Node". After the configuration has been synchronized, the Client Service also runs on all consumers of the selected node (producer).

Troubleshooting

app.telemetry

In the app.telemetry on the master all other nodes are registered as agents. This has the advantage that all logs of all nodes are available in the app.telemetry (from the master).

To filter for an agent (= node) in a log pool, right-click on a cell in the "Agent" column and select "Add Filter".

Enter the corresponding node name by which you want to filter and confirm the filter with the button "Update".

Index synchronization always ends with status "Failed"

If your index synchronization tasks ("Sync Delta" or "Sync Full") always fail (status "Failed") on one or more indexes, an index synchronization task with the operation "Sync Forced Full" can help. Create a new "Synchronize indices" task (button "Add Task") and perform the following steps:

Show advanced options with "Show request".
Under "Operation", select the "Sync Forced Full" option.
Optional: Select the affected indexes under "Services".
Click under "Schedule" on the button "Run now".
Click on the "Close" button in the lower right corner so that the task is not saved and is only executed once.

Removing a node that is offline ("force remove")

You may want to remove a node that is offline if, for example, the affected appliance is no longer in use and can no longer be reached. To remove this node under "Nodes", click on the recycle bin icon.

You will be prompted to confirm the removal with the following dialog: "Caution: This node is synchronized! Are you sure to remove Node inspirenode5.myorganization.com?"

Since the node is not available at the moment, you have to confirm a second dialog: "Force remove Node inspirenode5.myorganization.com".

Node cannot be added if it was removed before with "force remove"

If you want to add a node removed with "force remove" again later, it is possible that the node cannot be added anymore (e.g. if the master has changed in the meantime while the removed node was offline). In this case, you must perform the following steps:

Open the "Nodes" configuration at the remote Node (not at the Master!)
A warning dialog with "Read-Only Mode" is displayed. In the dialog click on "Continue".
Open the "Node Properties" for the remote node ("i" icon)
Click on the button "Reset this node".
Open the "Nodes" configuration at the Master
You can now add the remote node again.

Changing the master when it is offline

If the Master Node is offline and will not come back online in the near future, you can select another node to become the new master. Carry out the following steps:

Open the "Nodes" configuration at the node that is to become the new master (not at the current master!).
A warning dialog with "Read-Only Mode" is displayed. In the dialog click on "Continue".
Open the "Node Properties" for the Node you want to make the new Master ("i"-symbol)
Click on the button "Propagate this node as master".

ATTENTION: If you use Persisted Resources, the Persisted Resources database will be regenerated, resulting in your changes being lost. For more information about restoring databases, see Handbook - Backup & Restore - Restoring databases.

You will also need to restart your nodes, starting with the new master:

In the Management Center, open the Services menu.
Click on the gear symbol of your node in the "Nodes" area and select "Stop".
Wait until the node is stopped.
Start the node again with the gear symbol and "Start".
Repeat these steps with all nodes, but you must always be in the Management Center of the respective node.

Appendix

Monitoring of tasks in app.telemetry

The execution of tasks to synchronize data, configuration and indices can be monitored with app.telemetry. Tasks create service check files during execution that can be used with app.telemetry health checks. The following subsections explain which configuration steps are necessary for monitoring tasks.

Creating the Service Group

Navigate to the configuration of app.telemetry by selecting "Reporting" -> "Telemetry Details" -> "Configuration" in the Mindbreeze Management Center. Then select "Services" and click on the "New Service Group" button.

Then assign a name and confirm the dialog with "OK".

Create the "Services" for each node / agent

Create a service for each master or producer node in the service group you just created.

For each created "Service" a name must be assigned and the corresponding agent must be selected.

If you have 2 producer nodes, your configuration should be similar to the screenshot below:

Creating the "Service Checks"

Depending on the role of a node, different service checks can be created for the available tasks.

Creating the "Service Checks" for the Master Node

A service check can be created at the master node to monitor data and configuration synchronization.

Select "Counter check using file system" and confirm your selection with "Next".

Select "mindbreeze.api.v3.admin.SyncConfigAndData.txt" as counter check file and switch to the tab "Check Properties".

Assign a name and enter the following values:

Check Interval [sec]*: 60 (a lower value can also be chosen)
Value Expiration Factor: 0
Warning Limits Numeric Ranges Above (≥): 1
Critical Limits Numeric Ranges Above (≥): 1

Accept the configuration of the service check with the "OK" button.

Creating the "Service Checks" for a producer node

A service check can be created at the producer node(s) to monitor the index synchronization.

Select "Counter check using file system" and confirm your selection with "Next".

Select "mindbreeze.api.v3.admin.SyncIndex.txt" as counter check file and switch to the tab "Check Properties".

Assign a name and enter the following values:

Check Interval [sec]*: 60 (a lower value can also be chosen)
Value Expiration Factor: 0
Warning Limits Numeric Ranges Above (≥): 1
Critical Limits Numeric Ranges Above (≥): 1

Accept the configuration of the service check with the "OK" button.

Checking the current status

To check the current status of the tasks, go to "Status" -> <configured service group> -> <configured service> (see screenshot below).

Update notes between image versions (18.x to 19.0)

If you already had a producer-consumer setup with image version 18.x and want to update to 19.0, this section is available to help you update to the new version.

As already mentioned, the master distributes the configuration to all synchronized nodes. Please note that the existing configurations of your other Mindbreeze InSpire appliances will be overwritten some time after they have been added to the master as synchronized nodes. If these appliances hold configurations that must not be lost, create a backup with export_managerconfig.sh beforehand:

export_managerconfig.sh <backup path>

Please note that the <backup path> directory will be overwritten if the directory already exists. The value for <backup path> is /config/mesconfig/export, for example.
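
Because an existing <backup path> directory is overwritten, it can help to generate a fresh, timestamped target directory for every export. The following sketch is only one way to do this. The helper names, the directory naming scheme and the assumption that export_managerconfig.sh is on the PATH and creates the target directory itself are not taken from the product documentation.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def fresh_backup_path(base, now=None):
    """Return a timestamped subdirectory of base that does not exist yet."""
    now = now or datetime.now()
    candidate = Path(base) / now.strftime("export-%Y%m%d-%H%M%S")
    if candidate.exists():
        # Refuse to reuse a directory so an older backup is never overwritten.
        raise FileExistsError(f"{candidate} already exists")
    return candidate


def run_export(base, script="export_managerconfig.sh"):
    """Call the export script with a fresh backup path (assumed CLI usage)."""
    target = fresh_backup_path(base)
    subprocess.run([script, str(target)], check=True)
    return target


if __name__ == "__main__":
    # Only prints the path that would be used; no export is started here.
    print(fresh_backup_path("/config/mesconfig"))
```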

The following sections describe how different use cases were configured in your previous version and which steps are necessary to configure them in the new version. To use the new features, the earlier manual adjustments must be removed.

Configuration synchronization

In your previous setup, you might have had a cron job for each consumer that synchronized the configuration:

#Example: Sync Config to Consumer. Daily. IMPORTANT: replace <target> with hostname of consumer

30 23 * * * mes /opt/mindbreeze/scripts/export-and-sync-masterconfig-to-consumer.sh <target>

You will no longer need this cron job in the future and should therefore comment it out or delete it. All you have to do in the Management Center is configure the relationships between your nodes (Master, Producer, Consumer ...), see Management of Nodes. A task is automatically created that synchronizes your configuration from the master node to all non master nodes every 15 minutes.

Synchronization of the indexes

In your previous setup, you had either a cron job that synchronized all indexes in a script, or a cron job for each index that synchronized the indexes with the mescontrol command line tool:

#Example: Sync Index with port 23100. Every hour

0 * * * * mes /opt/mindbreeze/bin/mescontrol http://localhost:23100 syncdelta

You will no longer need this cron job in the future and should therefore comment it out or delete it. A synchronization task for the indices is also automatically created here after you have defined the relationships of your nodes (Master, Producer, Consumer ...). This executes the operation "Sync Delta" by default and is executed every 2 hours.

Configuration of Services

In your previous configuration, you also specified which indexes are synchronized to which consumers:

It is recommended to select only the Producer Node in the new version, as the index service will run automatically on the Consumer Node(s). Also the option "Sync" (see screenshot above) should no longer be used.

Other service types (such as Client Service or Caching Principal Resolution Services) behave similarly. Which services run automatically on the consumer can be read in the section Service Configuration in Manager UI.

Modifying the SSH Keys Manually

Manual modification of the SSH keys is no longer necessary on G7 appliances.
In addition, access via the Management Center is no longer possible for security reasons.
