    Data Integration Connector

    Installation and Configuration

    Copyright ©

    Mindbreeze GmbH, A-4020 Linz, 2022.

    All rights reserved. All hardware and software names used are brand names and/or trademarks of their respective manufacturers.

    These documents are strictly confidential. The submission and presentation of these documents does not confer any rights to our software, our services and service outcomes, or any other protected rights. The dissemination, publication, or reproduction hereof is prohibited.

    For ease of readability, gender differentiation has been waived. Corresponding terms and definitions apply within the meaning and intent of the equal treatment principle for both sexes.

    Creating a “Data Integration Process”

    The Data Integration Connector can be used together with Talend Open Studio to connect your own data sources. Talend Open Studio is available for download at https://www.talend.com/products/data-integration/data-integration-open-studio/.

    Older versions can be downloaded from https://www.talend.com/products/data-integration-manuals-release-notes/. Do not use milestone versions (those with M1, M2, etc. at the end of the version number). These are beta versions and are often unstable.

    Note: The Mindbreeze Java SDK supports Java version 8. It is possible that Talend Open Studio automatically installs a different Java version on your system, which can then lead to problems with the Mindbreeze Java SDK later on. Make sure that Java version 8 is installed and that the JAVA_HOME environment variable is set to the Java 8 JDK installation directory.
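
A quick way to verify the note above is to ask the JVM itself which version it is. The following is a minimal sketch, not part of the Mindbreeze SDK; the class name JavaVersionCheck is made up for illustration. It relies on the fact that Java 8 reports its version as "1.8.0_<update>", while later releases report "9", "11.0.2" and so on.

```java
// Illustrative helper (hypothetical name, not part of the Mindbreeze SDK).
final class JavaVersionCheck {

    // Java 8 reports "1.8.0_<update>"; Java 9 and later report "9", "11.0.2", ...
    static boolean isJava8(String javaVersion) {
        return javaVersion != null
                && (javaVersion.equals("1.8") || javaVersion.startsWith("1.8."));
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        String javaHome = System.getenv("JAVA_HOME");
        System.out.println("java.version = " + version);
        System.out.println("JAVA_HOME    = " + (javaHome == null ? "(not set)" : javaHome));
        if (!isJava8(version)) {
            System.out.println("Warning: this JVM is not Java 8.");
        }
    }
}
```

Run the class with the same java executable that Talend Open Studio uses; if the warning appears, JAVA_HOME probably points to a different JDK.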

    The Data Integration Connector contains components for Talend Open Studio, which will need to be installed separately. Unpack the file components.zip from the Data Integration Connector installation package into any folder (e.g. C:\custom-talend-components).

    Create a new project after installing Talend Open Studio.

    Open Window -> Preferences in the Talend Open Studio menu.

    Select Talend -> Components and, in the field “User component folder”, enter the name of the folder into which you unpacked the components.

    Select Import/Export Settings and activate the option “Add classpath.jar in exported jobs”.

    You can now create a new job. Add data sources according to your requirements. More information about working with Talend Open Studio is available in the Talend Open Studio documentation.

    The target of this kind of processing chain must always be the component named "MindbreezeIndexOutput".

    In recent versions of Talend Open Studio, dependencies may not be detected automatically.

    In this case, the JAR file dependencies have to be resolved manually:

    • Click the “Install…” button
    • Click on each -Symbol
    • Assign the JARs with the matching names from the folder MindbreezeIndexOutput

    Furthermore, please note that for the component to function correctly, the following fields (string type) must be defined in the data set schema:

    • key
    • title
    • extension
    • categoryClass

    The following fields are optional and can be used additionally for further processing in the Mindbreeze Index:

    • acl (list of string values) in the format: "TestUser1||GRANT"
    • date (type "Date")
    • modificationDate (type "Date")
    • content (type "String")
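
The field requirements listed above can be checked before a row reaches the MindbreezeIndexOutput component. The following is a minimal sketch under stated assumptions: a row is modelled as a plain Map, and the class and method names are hypothetical and not part of Talend or the Mindbreeze SDK. Only the field names and the ACL format "TestUser1||GRANT" come from this document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative helper (hypothetical names); a row is modelled as a simple Map.
final class RowSchemaCheck {

    // The four mandatory string fields named in this document.
    static final List<String> REQUIRED_FIELDS =
            Arrays.asList("key", "title", "extension", "categoryClass");

    // Returns the required fields that are missing or are not strings.
    static List<String> missingRequiredFields(Map<String, Object> row) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED_FIELDS) {
            if (!(row.get(field) instanceof String)) {
                missing.add(field);
            }
        }
        return missing;
    }

    // Builds one ACL entry in the documented form, e.g. "TestUser1||GRANT".
    static String aclEntry(String principal, String access) {
        return principal + "||" + access;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("key", "doc-1");
        row.put("title", "Example document");
        System.out.println("Missing: " + missingRequiredFields(row));
        System.out.println("ACL entry: " + aclEntry("TestUser1", "GRANT"));
    }
}
```

Only the access value "GRANT" appears in this document; whether other values are accepted is not stated here.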

    Should further fields be defined in the schema, these are imported as metadata. It is also possible to define annotations in the following format:

    val1|||mes:annotated|||categoryclass=cc1|||value=v1

    In this example, "val1" becomes an annotation with the categoryClass "cc1" and the value "v1".

    All "list" type fields become lists of metadata; all other fields are automatically converted into "string" types.
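
The annotation format described above can be assembled with a small helper, for example in a Talend routine. This is a sketch only: the class and method names are hypothetical, and it assumes that the input values do not themselves contain the separator "|||", since this document does not describe any escaping.

```java
// Illustrative helper (hypothetical names) for the annotation format shown above.
final class AnnotationValue {

    static final String SEPARATOR = "|||";

    // Builds "<text>|||mes:annotated|||categoryclass=<cc>|||value=<value>".
    // Assumption: none of the arguments contains "|||".
    static String annotated(String text, String categoryClass, String value) {
        return text + SEPARATOR + "mes:annotated"
                + SEPARATOR + "categoryclass=" + categoryClass
                + SEPARATOR + "value=" + value;
    }

    public static void main(String[] args) {
        // Prints: val1|||mes:annotated|||categoryclass=cc1|||value=v1
        System.out.println(annotated("val1", "cc1", "v1"));
    }
}
```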

    Testing and exporting the “Data Integration Job”

    Test with logging

    When your job configuration is complete, you can run it to test its functionality. In this mode the data is not sent to an index; instead, it is output in Talend Open Studio.

    Test as pusher

    You can test the job by sending data to the Mindbreeze InSpire appliance without exporting the job first. The corresponding index must be created beforehand.

    Select the “MindbreezeIndexOutput” component and select “Use as Pusher”.

    Configure the following settings:

    Category

    The category

    Category Instance

    The category instance

    InSpire Base URL

    URL pointing to your Mindbreeze InSpire appliance

    Filter Pipeline ID

    Port of the Filter-Service (23400 by default)

    Index ID

    Port of the Index-Service

    Node ID

    Node ID of the appliance

    Inspire Generation

    • Auto Discovery: The appliance generation will be detected automatically
    • G6: Force G6
    • G7: Force G7, which makes the additional authentication settings configurable.

    Username

    For G6, use “inspireapi”. For a G7 appliance, any backend user with the role “InSpire Index Writer” can be used. Further configuration: https://help.mindbreeze.com/de/index.php?topic=doc/Konfiguration---Backend-Credentials/index.htm.

    Password

    The password of the user, WITHOUT quotation marks

    Tip: If your password contains characters that must not occur in a Java String literal, they must be escaped: https://docs.oracle.com/javase/tutorial/java/data/characters.html
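
To illustrate the tip above: inside a Java string literal, a backslash and a double quote must each be preceded by a backslash. The following sketch shows one way to produce the escaped form of a password. The class name is hypothetical, and it is an assumption (implied by the tip, not confirmed here) that the field value ends up inside a Java string literal in the generated job code.

```java
// Illustrative helper (hypothetical name): escapes a raw value for use inside a Java string literal.
final class JavaLiteralEscaper {

    static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash becomes two backslashes
                case '"':  sb.append("\\\""); break; // double quote becomes backslash + quote
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The raw password here is: pa\ss"word
        System.out.println(escape("pa\\ss\"word"));
    }
}
```

For the raw password pa\ss"word, the value to enter would then be pa\\ss\"word.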

    Additional settings for G7:

    Client ID

    Client ID of the OAuth2 client in Keycloak (e.g. mindbreeze-inspire-public)

    Client Secret

    Not necessary for “mindbreeze-inspire-public”

    Use external Auth URL

    Select this option if you use an external Keycloak installation

    External Auth URL

    The URL of the external Keycloak installation: {base-url}/realms/{realm-name}/protocol/openid-connect/token
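
The External Auth URL follows the pattern {base-url}/realms/{realm-name}/protocol/openid-connect/token. The following sketch fills in that pattern; the class name, the host keycloak.example.com and the realm name "master" are placeholders chosen for illustration and must be replaced with the values of your own Keycloak installation.

```java
// Illustrative helper (hypothetical name): fills in the token URL pattern from this document.
final class KeycloakTokenUrl {

    static String tokenUrl(String baseUrl, String realmName) {
        // Drop a single trailing slash so the result does not contain "//realms".
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + "/realms/" + realmName + "/protocol/openid-connect/token";
    }

    public static void main(String[] args) {
        // Placeholder values, not taken from a real installation.
        System.out.println(tokenUrl("https://keycloak.example.com/auth", "master"));
    }
}
```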

    Export

    If the functionality test runs smoothly, the job still needs to be exported. This is done via the context menu of the job.

    It is important that the generated ZIP-file is also unpacked.

    The "Main-Class" required for the configuration of Fabasoft Mindbreeze Enterprise can be found in the generated batch file.

    Configuration of Mindbreeze

    Select the “Advanced” installation method:

    Click on the “Indices” tab and then on the “Add new index” symbol to create a new index.

    Enter the index path, e.g. “C:\Index”. Adapt the Display Name of the Index Service and the related Filter Service if necessary.

    Add a new data source with the symbol “Add new custom source” at the bottom right.

    To configure the crawler, you need to enter the Job ZIP archive or the extracted job directory in "Path to Job" and the Java job class in "Main Class". Please note that Talend Job Zip archives are only supported starting with Mindbreeze InSpire G7.

    If the option “Delete Unprocessed Documents” is enabled, all unprocessed documents in the index are deleted if the crawl run was successful (the exit code of the Talend job is 0).
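
The rule for “Delete Unprocessed Documents” can be summarised as a simple condition. The sketch below only restates the rule from the paragraph above in code; it is not the actual Mindbreeze implementation, and the class and method names are hypothetical.

```java
// Illustrative only (hypothetical names): restates the documented rule, not Mindbreeze code.
final class CrawlRunOutcome {

    // Unprocessed documents are deleted only if the option is enabled
    // and the Talend job finished with exit code 0.
    static boolean shouldDeleteUnprocessed(boolean deleteOptionEnabled, int jobExitCode) {
        return deleteOptionEnabled && jobExitCode == 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldDeleteUnprocessed(true, 0));  // true
        System.out.println(shouldDeleteUnprocessed(true, 1));  // false
        System.out.println(shouldDeleteUnprocessed(false, 0)); // false
    }
}
```

In practice this means that a job which ends with a non-zero exit code leaves the existing documents in the index untouched.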
