    Installation and Configuration
    Data Integration Connector

    Important Notice

    The Talend Open Studio software product was discontinued on January 31, 2024. Therefore, the Data Integration Connector can no longer be further developed in its current form and will only receive maintenance updates.

    For alternatives and questions regarding the maintenance of existing solutions, please contact Mindbreeze Support at support@mindbreeze.com.

    Creating a “Data Integration Process”

    The Data Integration Connector can be used together with Talend Open Studio to connect your own data sources. Talend Open Studio is available for download at https://www.talend.com/products/data-integration/data-integration-open-studio/.

    Older versions can be downloaded from https://www.talend.com/products/data-integration-manuals-release-notes/. Do not use milestone versions (with M1, M2, etc. at the end of the version number). These are beta versions, which are often unstable.

    Note: The Mindbreeze Java SDK supports Java version 11. It is possible that Talend Open Studio automatically installs a different Java version on your system, which can then lead to problems with the Mindbreeze Java SDK later on. Make sure that Java version 11 is installed and that the JAVA_HOME environment variable is set to the Java 11 JDK installation directory.
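
    The Java requirement above can be checked before starting Talend Open Studio. The following is a minimal sketch, not part of the Mindbreeze SDK: the class name JavaVersionCheck is hypothetical, and it assumes that "supports Java version 11" means the feature version must be exactly 11.

```java
// Sketch: report the running Java feature version and the JAVA_HOME setting.
final class JavaVersionCheck {

    // Assumption: only feature version 11 counts as supported.
    static boolean isSupportedFeatureVersion(int feature) {
        return feature == 11;
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature();    // 11 for any Java 11.x runtime
        String javaHome = System.getenv("JAVA_HOME"); // null if the variable is not set

        System.out.println("Running on Java feature version " + feature);
        System.out.println("JAVA_HOME = " + javaHome);

        if (!isSupportedFeatureVersion(feature)) {
            System.out.println("Warning: the Mindbreeze Java SDK expects Java 11.");
        }
    }
}
```

    Running the class with the same java executable that Talend Open Studio uses shows whether a different Java version was installed alongside it.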

    The Data Integration Connector contains components for Talend Open Studio, which need to be installed separately. Unpack the file components-<<VERSION>>.zip from the Data Integration Connector installation package into any folder (e.g. C:\custom-talend-components-12.03.123\). We recommend including the version number in the directory name.

    Create a new project after installing Talend Open Studio.

    Open Window -> Preferences in the Talend Open Studio menu.

    Select Talend -> Components and, in the field “User component folder”, enter the name of the folder into which you unpacked the components.

    Select Import/Export Settings and activate the option “Add classpath.jar in exported jobs”.

    You can now create a new job. Add data sources according to your requirements. More information about working with Talend Open Studio is available in the Talend Open Studio documentation.

    The target of this kind of processing chain must always be the component named "MindbreezeIndexOutput".

    Troubleshooting - Dependencies

    In recent versions of Talend Open Studio, dependencies may not be detected automatically.

    In this case, the JAR file dependencies have to be resolved manually:

    • Click the “Install…” button.
    • Click on each symbol.
    • Assign the JARs with matching names from the folder MindbreezeIndexOutput.

    Updating the Mindbreeze Components

    To achieve seamless operation, we recommend that you always update the Mindbreeze Components to the same version as the Mindbreeze InSpire version you are using. The update works similarly to the initial configuration, but requires additional steps.

    1. As explained in the section "Configuring Mindbreeze User Components", download the Mindbreeze Components from the Mindbreeze website and unpack the "components-<<VERSION>>.zip" archive, e.g. to "C:\custom-talend-components-12.03.123".

    We recommend changing the path of the User Component folder in the Talend settings after each update, or always including the version number in the path. Talend often does not recognize that the User Components have been updated and then silently continues to use the old cached versions of the components.

    2. Restart Talend before continuing. Only a restart ensures that the new version of the Mindbreeze Components is completely reloaded; the automatic update while Talend is open is not sufficient.

    Using Mindbreeze Components

    For the component to function correctly, the following fields (string type) must be defined in the data set schema:

    • key
    • title
    • extension
    • categoryClass

    The following fields are optional and can additionally be used for further processing in the Mindbreeze Index:

    • acl (list of string values) in the format: "TestUser1||GRANT"
    • date (type "Date")
    • modificationDate (type "Date")
    • content (type "String")
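
    The required fields can be checked before a row reaches the output component. The following is a minimal sketch in plain Java, independent of Talend and the Mindbreeze SDK: the class RequiredFieldCheck and the representation of a row as a Map are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: find required schema fields that are missing from a row.
final class RequiredFieldCheck {

    // The four required string fields listed above.
    static final List<String> REQUIRED = List.of("key", "title", "extension", "categoryClass");

    // Returns the names of required fields that are absent or not of type String.
    static List<String> missingFields(Map<String, Object> row) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!(row.get(name) instanceof String)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("key", "doc-1");
        row.put("title", "Example document");
        row.put("extension", "txt");
        // categoryClass is intentionally left out to show the check.
        System.out.println(missingFields(row)); // prints [categoryClass]
    }
}
```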

    Should further fields be defined in the schema, these are imported as metadata. It is also possible to define annotations in the following format:

    val1|||mes:annotated|||categoryclass=cc1|||value=v1

    In this example, "val1" becomes an annotation with the categoryClass "cc1" and the value "v1".

    All "list" type fields become lists of metadata; all other fields are automatically converted into "string" types.
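
    The ACL and annotation formats described above are plain delimited strings, so they can be assembled with ordinary string operations. The following sketch only mirrors the two examples from the text; the class IndexFieldFormats and its method names are hypothetical and not part of the Mindbreeze SDK.

```java
import java.util.List;

// Sketch: build ACL and annotation strings in the formats shown above.
final class IndexFieldFormats {

    // Builds one ACL entry such as "TestUser1||GRANT".
    static String aclEntry(String principal, String action) {
        return principal + "||" + action;
    }

    // Builds an annotation such as
    // "val1|||mes:annotated|||categoryclass=cc1|||value=v1".
    static String annotation(String text, String categoryClass, String value) {
        return String.join("|||", text, "mes:annotated",
                "categoryclass=" + categoryClass, "value=" + value);
    }

    public static void main(String[] args) {
        // The acl field is a list of string values.
        List<String> acl = List.of(aclEntry("TestUser1", "GRANT"));
        System.out.println(acl);
        System.out.println(annotation("val1", "cc1", "v1"));
    }
}
```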

    Testing and exporting the “Data Integration Job”

    Test with logging

    When your job configuration is complete, you can run it to test its functionality. In this mode, the data is not sent to an index but is output in Talend Open Studio.

    Test as pusher

    You can test the job by sending data to the Mindbreeze InSpire appliance without exporting the job first. The corresponding index has to be created beforehand.

    Then click “Save” to save the changes. Your browser displays a pop-up, which you must confirm with “OK”.

    Perform the next step in Talend Open Studio again: in the Designer, select the “MindbreezeIndexOutput” component you are using and select “Use as Pusher”.

    Configure the following settings:

    | Setting | Description |
    | --- | --- |
    | Category | The category. |
    | Category Instance | The category instance. |
    | InSpire Base URL | The URL pointing to your Mindbreeze InSpire appliance. |
    | Filter Pipeline ID | The port of the Filter Service (23400 by default). |
    | Index ID | The port of the Index Service. |
    | Node ID | The Node ID of the appliance (can also be found in the previously created index). |
    | Inspire Generation | The following options are available: Auto Discovery (the appliance generation will be detected automatically); G6 (force G6, deprecated); G7 (force G7, to make additional authentication configurable). |
    | Username | The user name that is used for indexing. If your appliance is G6, “inspireapi” must be used. If you have a G7 appliance, you can use any user who has the role “InSpire Index Writer”. Further configuration: https://help.mindbreeze.com/de/index.php?topic=doc/Konfiguration---Backend-Credentials/index.htm. |
    | Password | The password of the user, WITHOUT quotation marks. Tip: If your password contains characters that must not occur in a Java String literal, they must be escaped: https://docs.oracle.com/javase/tutorial/java/data/characters.html |
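
    The escaping mentioned in the password tip can be illustrated with a small helper. This is a sketch under assumptions: the class JavaLiteralEscaper is hypothetical, and it only covers the backslash, the double quote and the common control characters, not every escape sequence that Java defines.

```java
// Sketch: escape a raw password so it is valid inside a Java string literal.
final class JavaLiteralEscaper {

    static String escape(String raw) {
        StringBuilder out = new StringBuilder();
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\\\"); break; // backslash becomes \\
                case '"':  out.append("\\\""); break; // double quote becomes \"
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw password: pa"ss\word
        System.out.println(escape("pa\"ss\\word")); // prints pa\"ss\\word
    }
}
```

    The printed value is what would be entered in the Password field, still without surrounding quotation marks.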

    Additional Settings for G7:

    | Setting | Description |
    | --- | --- |
    | Client ID | The Client ID of the OAuth2 client in Keycloak (e.g. mindbreeze-inspire-public). |
    | Client Secret | Not necessary for “mindbreeze-inspire-public”. |
    | Use external Auth URL | Select this option if you use an external Keycloak installation. |
    | External Auth URL | The URL of the external Keycloak installation from which you can obtain the Bearer token: {base-url}/realms/{realm-name}/protocol/openid-connect/token. |
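
    To illustrate the External Auth URL pattern, the following sketch fills in the {base-url} and {realm-name} placeholders and prepares a token request without sending it. Several points are assumptions and not taken from this documentation: the class KeycloakTokenRequest, the host name, the realm name, the user, and the use of the OAuth2 password grant. The component may obtain its token differently.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Sketch: build the Keycloak token URL and a form-encoded token request.
final class KeycloakTokenRequest {

    // Fills in {base-url}/realms/{realm-name}/protocol/openid-connect/token.
    static String tokenUrl(String baseUrl, String realmName) {
        String base = baseUrl.replaceAll("/+$", ""); // drop trailing slashes
        return base + "/realms/" + realmName + "/protocol/openid-connect/token";
    }

    // Assumption: password grant with a public client (no client secret).
    static String formBody(String clientId, String username, String password) {
        return "grant_type=password"
                + "&client_id=" + enc(clientId)
                + "&username=" + enc(username)
                + "&password=" + enc(password);
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical host, realm and credentials.
        String url = tokenUrl("https://keycloak.example.com/", "example-realm");
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(
                        formBody("mindbreeze-inspire-public", "indexer", "secret")))
                .build();
        // The request is only built here, not sent.
        System.out.println(request.method() + " " + request.uri());
    }
}
```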

    Export

    If the functionality test runs smoothly, the job still needs to be exported. This is done via the context menu of the job.

    The generated ZIP file must also be unpacked.

    The "Main-Class" required for the configuration of Fabasoft Mindbreeze Enterprise can be found in the generated batch file.
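
    Locating the main class in the batch file can be sketched as follows. The layout of the launcher line is an assumption (a java command with a "-cp" option followed directly by the fully qualified class name), the sample line is hypothetical, and the class MainClassFinder is not part of any Mindbreeze or Talend tooling. Classpaths that contain spaces are not handled.

```java
// Sketch: extract the main class from a "java ... -cp <classpath> <MainClass> ..." line.
final class MainClassFinder {

    // Returns the first non-option token after "-cp <classpath>", or null if none is found.
    static String findMainClass(String commandLine) {
        String[] tokens = commandLine.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals("-cp") || tokens[i].equals("-classpath")) {
                // tokens[i + 1] is the classpath itself; the main class follows it.
                for (int j = i + 2; j < tokens.length; j++) {
                    if (!tokens[j].startsWith("-")) {
                        return tokens[j];
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical launcher line; the generated batch file may look different.
        String line = "java -Xms256M -Xmx1024M -cp .;../lib/routines.jar;myjob_0_1.jar "
                + "example_project.myjob_0_1.MyJob --context=Default %*";
        System.out.println(findMainClass(line)); // prints example_project.myjob_0_1.MyJob
    }
}
```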

    Configuration of Mindbreeze

    Select the setting “Advanced Settings”.

    Click on the “Indices” tab and then on the “Add new index” symbol to create a new index.

    Enter the index path, e.g. “C:\Index”. Adapt the Display Name of the Index Service and the related Filter Service if necessary.

    Add a new data source with the symbol “Add new custom source” at the bottom right.

    To configure the crawler, you need to enter the Job ZIP archive or the extracted job directory in "Path to Job" and the Java job class in "Main Class". Please note that Talend Job Zip archives are only supported starting with Mindbreeze InSpire G7.

    If the option “Delete Unprocessed Documents” is enabled, all unprocessed documents in the index are deleted if the crawl run was successful (the exit code of the Talend job is 0).
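
    The rule for “Delete Unprocessed Documents” can be stated as a single condition. The sketch below only restates the sentence above in code; the class DeleteUnprocessedRule is hypothetical and does not reflect how the crawler is implemented.

```java
// Sketch: unprocessed documents are deleted only if the option is enabled
// and the Talend job finished with exit code 0.
final class DeleteUnprocessedRule {

    static boolean shouldDeleteUnprocessed(boolean optionEnabled, int jobExitCode) {
        return optionEnabled && jobExitCode == 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldDeleteUnprocessed(true, 0));  // prints true
        System.out.println(shouldDeleteUnprocessed(true, 1));  // prints false
        System.out.println(shouldDeleteUnprocessed(false, 0)); // prints false
    }
}
```

    A job that ends with a non-zero exit code therefore leaves existing documents in the index untouched.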
