Mindbreeze InSpire 25.2 Release

    Interface description
    api.chat.v1beta.generate

    Introduction

    This document deals with the Mindbreeze Web API for generating chat completions using RAG pipelines.

    Generate requests are sent as HTTP POST requests to a client service. The path for Generate requests is the following:

    <Client Service>/api/chat/v1beta/generate

    A JSON document describing the Generate request is sent in the body of the HTTP request. The structure of this JSON document is described in the section "Request Fields".

    Events (stream) sent by the server are also received as a response. The format is described in the chapter "Response Fields".

    An OpenAPI specification of the API is also available. More detailed instructions can be found in the OpenAPI Interface Description.
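    As a sketch, a non-streaming Generate request can be sent with the Python standard library. The client service URL below is a placeholder, and authentication (which depends on the appliance configuration) is omitted:

```python
import json
import urllib.request

# Hypothetical client service URL; replace with your appliance's client service.
CLIENT_SERVICE = "https://inspire.example.com"

def build_generate_request(inputs: str, model_id: str, stream: bool = False) -> dict:
    """Assemble the JSON body described under 'Request Fields [ConversationInput]'."""
    return {"inputs": inputs, "model_id": model_id, "stream": stream}

def generate(inputs: str, model_id: str) -> str:
    """POST a non-streaming Generate request and return the generated text."""
    body = json.dumps(build_generate_request(inputs, model_id)).encode("utf-8")
    req = urllib.request.Request(
        f"{CLIENT_SERVICE}/api/chat/v1beta/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["generated_text"]

# Example (requires a reachable appliance):
# text = generate("Who is the CEO of Mindbreeze?",
#                 "3a0e8612-a24f-4b16-93cc-aa6307d0c62b")
```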

    Request Fields [ConversationInput]

    id (optional)

    Identifier of the request (optional).

    This ID is shown in app.telemetry.

    Type: String

    inputs

    Input text for the Generate request.

    Type: String

    stream

    Controls whether the generation is streamed.

    Default: true

    Type: Boolean

    model_id

    ID of the RAG pipeline to be used.

    Type: String

    pipeline_id

    Alias for model_id.

    Type: String

    pipeline_key

    The key to a pipeline.

    For more information, see here.

    Type: String

    messages

    Conversation history (optional).

    See the chapter messages [List].

    parameters

    Generation parameters (optional).

    See the chapter parameters.

    prompt_dictionary

    Additional values of the prompt template (optional).

    See the chapter prompt dictionary.

    retrieval_options

    Additional search restrictions (optional).

    See the chapter retrieval_options.

    generation_options

    Additional setting options for the generation (optional).

    For more information, see the chapter generation_options.

    {
       "inputs": "Who is the CEO of Mindbreeze?",
       "stream": false,
       "model_id": "3a0e8612-a24f-4b16-93cc-aa6307d0c62b",
       "retrieval_options": {
          "constraint": {
             "unparsed": "title:management"
          }
       }
    }

    messages [List]

    The API is stateless. To use the Generate Endpoint for chat applications, a chat history can optionally be specified as a list of messages. The "Use chat history" option must also be activated in the pipeline for this.

    from

    Sender of the message ("user" or "assistant").

    Type: String

    id

    Message identifier (optional).

    Type: String

    content

    Text content of the message.

    Type: String

    content_processed

    Completed prompt template with search results (optional).

    Type: String

    [
      {
        "from": "user",
        "content": "Who is the CEO of Mindbreeze?",
        "content_processed": "Given the following extracted parts of ..."
      },
      {
        "from": "assistant",
        "content": "Daniel Fallmann is the CEO of Mindbreeze"
      }
    ]
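    Since the server keeps no state, a chat client maintains the history itself and resends it with every request. A minimal sketch (the model_id is the example pipeline ID from above):

```python
def append_turn(messages: list, user_text: str, assistant_text: str) -> list:
    """Record one completed user/assistant exchange in the client-side history.
    The API is stateless, so the full list is resent with each request."""
    messages.append({"from": "user", "content": user_text})
    messages.append({"from": "assistant", "content": assistant_text})
    return messages

history = []
append_turn(history, "Who is the CEO of Mindbreeze?",
            "Daniel Fallmann is the CEO of Mindbreeze")

# Follow-up request carrying the accumulated history:
follow_up = {
    "inputs": "When did he found the company?",
    "model_id": "3a0e8612-a24f-4b16-93cc-aa6307d0c62b",
    "messages": history,
}
```

    Note that the "Use chat history" option must be activated in the pipeline for the messages list to take effect.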

    parameters

    Optionally, the generation parameters of the pipeline used can be overwritten.

    temperature

    Overwrites "Randomness of the response (temperature)"

    Controls the randomness of the generated response (0 - 100%).

    Higher values make the output more creative, while lower values make it more targeted and deterministic.

    Type: Integer

    max_new_tokens

    Overwrites "Maximum response length (tokens)"

    Limits the number of tokens generated (100 tokens ~ 75 words; depending on the tokenizer).

    Type: Integer

    details

    Adds more detailed information about the individual tokens to the response in addition to the generated text.

    Type: Boolean

    Hint: only relevant when streaming is disabled ("stream": false).

    retrieval_details

    Adds more detailed information about the retrieved answers to the response in addition to the generated text.

    Type: Boolean

    {
      "temperature": 5,
      "max_new_tokens": 500,
      "details": true,
      "retrieval_details": true
    }

    optional_parameters

    This can be used to transfer optional parameters that are not supported by all LLMs.

    do_sample

    If do_sample is set to false, text generation is deterministic: the model always selects the token with the highest probability (logits value). This setting is recommended for clearly defined, predictable tasks.

    If do_sample is set to true, the next token is selected stochastically, based on the probability distribution computed by the model. This enables more creative and diverse outputs.

    Type: Boolean

    Supported LLM protocols: InSpire LLM

    truncate

    Reduces the number of input tokens to the specified size. Defining this setting is useful for staying within the context size of the LLM. It is recommended when very long prompts can be expected that might exceed the context length of the LLM.

    Type: Integer

    Attention: Information may be lost if the "truncate" parameter is used.

    Supported LLM protocols: InSpire LLM

    {
      "do_sample": true,
      "truncate": 8000
    }

    prompt_dictionary

    Optionally, key-value pairs can be specified to fill in placeholders in the prompt template of the pipeline used. To overwrite default placeholders, the setting "Allow overwriting of system prompt template variables" must be activated in the pipeline.

    "prompt_dictionary": {
       "question": "Tell me about Mindbreeze",
       "answer": "Mindbreeze is fast!"
    }

    retrieval_options

    Optionally, the retrieval settings of the pipeline used can be overwritten.

    constraint

    Query expression that is used for retrieval in addition to the search restriction configured in the pipeline.

    Type: String

    search_request

    Extends the search query that is used for the retrieval. Fields that are not present in the search query in the pipeline are added. To allow fields to be overwritten, the setting "Allow overwriting of search query template" must be activated in the pipeline.

    Type: Object

    For more information, see api.v2.search Interface Description - Fields in the search query.

    use_inputs

    Controls whether the input text (inputs) is used as a query for retrieval.

    Default: true

    Type: Boolean

    skip_retrieval

    Skips the retrieval step. This setting is helpful if you want to generate answers without additional context or if you want to supply the answers yourself.

    Default value: false

    Type: Boolean

    For more information, see the setting “answers” in the chapter generation_options.

    "retrieval_options": {
       "constraint": {
          "unparsed": "title:management"
       },
       "search_request": {
          "term": "mind"
       },
       "use_inputs": false
    }

    generation_options

    Optionally, the generation settings of the pipeline used can be overwritten:

    prompt_dictionary

    This prompt_dictionary takes precedence over the top-level prompt_dictionary.

    For more information, see the chapter prompt_dictionary.

    llm_selector

    This setting can be used to select an LLM for the generation via the name or the family.

    This is only possible if no pipeline has been specified with a model_id.

    The values of the individual LLMs can be found via the /data interface.

    answers

    With this setting, you can specify the answers yourself, provided that the retrieval has been deactivated with skip_retrieval in the retrieval_options.

    Type: List[Answer]

    For more information, see api.v2.search Interface Description - answers.

    message_templates

    With this setting, the messages to be sent to the LLM can be specified very precisely.

    For more information, see the chapter message_templates [List].

    "generation_options": {
       "prompt_dictionary": {
          "company": "Mindbreeze"
       },
       "llm_selector": {
          "family": "Meta Llama 3 Instruct"
       }
    }

    message_templates [List]

    role

    Defines the role of the conversation participant for this message. Possible values are:

    • system
    • user
    • assistant

    Type: String

    content

    The content of the message, which can consist of several parts.

    For more information, see the chapter content [List].

    {
       "role": "user",
       "content": [
          {
             "type": "text",
             "text": "Who is the CEO of Mindbreeze?"
          }
       ]
    }

    content [List]

    type

    Defines the type of this content. Possible values are:

    • text
    • text/fstring-template

    Type: String

    text

    Defines the actual content.

    If the setting type has the value text/fstring-template, placeholders from the setting prompt_dictionary or the standard placeholders (summaries and question) can be used here.

    Type: String

    {
       "type": "text/fstring-template",
       "text": "You are a helpful AI assistant. Please answer the question with the context below:\n{summaries}"
    }

    Response Fields

    The structure of the response depends on the "stream" field in the request:

    stream (Request)      Response structure
    true (default)        TokenStreamEvent
    false                 GenerateTextResponse

    TokenStreamEvent

    When streaming, only the last TokenStreamEvent contains the complete generated text.

    data: {"token": {"text": " Daniel", "logprob": -0.055236816, "id": 4173}}

    data: {"token": {"text": " ", "logprob": -0.0005774498, "id": 32106}}

    data: {"token": {"text": " case", "logprob": -7.176399e-05, "id": 2589}}

    ...

    data: {
       "token": {
          "text": "</s>",
          "logprob": -0.22509766,
          "id": 1,
          "special": true
       },
       "generated_text": "Daniel Fallmann is the CEO of Mindbreeze.\n\nRetrieved Sources: ...",
       "details": {
          "finish_reason": "eos_token",
          "generated_tokens": 19.0,
          "seed": null
       },
       "content_processed": "Given the following ..."
    }
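    A client consumes the stream by reading the "data:" event lines and decoding each as JSON. A minimal sketch (real server-sent-event handling may additionally need line buffering and reconnect logic):

```python
import json

def parse_token_stream(lines):
    """Parse 'data: {...}' event lines into dicts. In this API, the final
    event is the one that carries 'generated_text'."""
    events = []
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):]))
    return events

# Sample lines shaped like the stream shown above:
sample = [
    'data: {"token": {"text": " Daniel", "logprob": -0.055236816, "id": 4173}}',
    'data: {"token": {"text": "</s>", "logprob": -0.22509766, "id": 1, '
    '"special": true}, "generated_text": "Daniel Fallmann is the CEO of Mindbreeze."}',
]
events = parse_token_stream(sample)
full_text = events[-1]["generated_text"]
```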

    token

    Contains information about the generated token.

    text

    Text content of the token.

    Type: String

    logprob

    Logarithmic probability of the token.

    Type: Float, range (-∞, 0]

    id

    Numeric identifier of the token.

    Type: Integer

    special

    Token has a special meaning (e.g. end-of-sequence).

    Type: Boolean

    "token": {
      "text": "mind",
      "logprob": -0.0029792786,
      "id": 1,
      "special": false
    }

    generated_text [String]

    The complete generated text, i.e. all streamed tokens except special tokens.

    "generated_text": "The CEO of Mindbreeze is...",

    details

    finish_reason

    Reason for completing the token generation (e.g. "eos_token").

    Type: String

    generated_tokens

    Number of tokens generated.

    Type: Float

    seed

    Seed used during generation.

    Type: String | null

    "details": {
      "finish_reason": "eos_token",
      "generated_tokens": 51.0,
      "seed": null
    }

    content_processed [String]

    Contains the text that was sent to the LLM as input for generation (prompt). The text is generated using the prompt template of the pipeline used.

    "content_processed": "Given the following extracted parts of ..."

    retrieval_details

    Contains information on the received answers from the search. Only present if the parameter “retrieval_details” was sent with “true” in the request.

    answers

    The list of answers from the search.

    Type: List[Answer]

    For more information, see api.v2.search Interface Description - answers.

    GenerateTextResponse

    Without streaming, the complete response is returned as a single JSON document instead of a stream of events. The details and retrieval_details fields are included only if requested.

    {
      "generated_text": "Daniel Fallmann is the CEO…",
      "details": {
        "generated_tokens": 19,
        "tokens": [
          { "text": "Daniel", "logprob": -0.055236816 },
          { "text": " ", "logprob": -0.0005774498 },
          { "text": "Fall", "logprob": -7.176399e-05 },
          …
          { "text": "</s>", "logprob": -0.22509766, "special": true }
        ]
      },
      "retrieval_details": {
        "answers": [
          {
            "score": 0.7742171883583069,
            "text": {
              "text": "Management \nDaniel Fallmann Founder & CEO\nDaniel Fallmann founded Mindbreeze in 2005 and as its CEO he is a living example of high quality and innovation standards.",
              "context_after": " From the company’s very beginning, Fallmann, together with his team, laid the foundation for the highly scalable and intelligent Mindbreeze InSpire appliance.",
              "text_start_pos": 0,
              "text_end_pos": 164
            },
            "property_name": "content",
            "properties": [
              {
                "id": "extension",
                "name": "Extension",
                "data": [
                  { "value": { "str": "html" } }
                ]
              },
              ...
            ]
          }
        ]
      }
    }
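    A client typically picks the generated text and the answer snippets out of such a response. A minimal sketch over an abbreviated response of the shape shown above:

```python
def answer_snippets(response: dict) -> list:
    """Collect the answer texts from retrieval_details (present only when the
    request was sent with "retrieval_details": true)."""
    details = response.get("retrieval_details") or {}
    return [a["text"]["text"] for a in details.get("answers", [])]

# Abbreviated response, shaped like the example above:
response = {
    "generated_text": "Daniel Fallmann is the CEO…",
    "retrieval_details": {
        "answers": [
            {"score": 0.77,
             "text": {"text": "Management \nDaniel Fallmann Founder & CEO"}}
        ]
    },
}
```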

    generated_text [String]

    The complete generated text, i.e. all streamed tokens except special tokens.

    "generated_text": "The CEO of Mindbreeze is…",

    details

    Contains information about the generated text. Only present if the "details" parameter was sent with "true" in the request.

    generated_tokens

    Number of tokens generated.

    Type: Integer

    tokens

    The generated tokens.

    Type: Array[Token]

    token

    Contains information about the generated token.

    text

    Text content of the token.

    Type: String

    logprob

    Logarithmic probability of the token.

    Type: Float, range (-∞, 0]

    retrieval_details

    Contains information on the received answers from the search. Only present if the parameter “retrieval_details” was sent with “true” in the request.

    answers

    The list of answers from the search.

    Type: List[Answer]

    For more information, see api.v2.search Interface Description - answers.
