Language Detection

LanguageDetector Plug-In

Copyright ©

Mindbreeze GmbH, A-4020 Linz, 2018.

All rights reserved. All hardware and software names used are registered trade names and/or registered trademarks of the respective manufacturers.

These documents are highly confidential. No rights to our software or our professional services, or results of our professional services, or other protected rights can be based on the handing over and presentation of these documents. Distribution, publication or duplication is not permitted.

Introduction

Mindbreeze provides language detection for documents through the LanguageDetector ItemTransformer plugin.

LanguageDetector Plug-In

To use language detection, add the LanguageDetector to your Mindbreeze installation by loading the corresponding plugin (the Item Transformation Services are included in the package “Mindbreeze Item Transformation Plugins”).

The plugin also has to be included in your Mindbreeze license.

Installation

  • Install the plugin, using either the manager UI or the command line tool mesextension:

mesextension --interface=plugin --type=archive --file=LanguageDetector-Text-<version>.zip install

Configuration

  • Activate the plugin for each needed index using the manager UI:
    • Select the “Indices” tab and activate “Advanced Settings”.
    • Scroll to the “Item Transformation Services” section.
    • Select the “TextPlugin.LanguageDetector” plugin and click “Add”.

  • Language Probability Threshold: Specifies the probability threshold that a language has to reach to be included.
  • Source Property Pattern: Specifies the property used for language detection.
  • Language Target Property: Specifies the new property in which the detected languages are stored.
  • Language Property: Specifies a property that already contains the language. If it is set, language detection is skipped and the target property is set from this property.
  • Language Property Pattern: Defines which languages from the “Language Property” should be considered.
  • Included Languages: Defines the languages that should be considered by the detector.
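
The interaction between “Language Probability Threshold” and “Included Languages” can be pictured with a small sketch. This is only an illustration of the filtering idea and not the plugin's actual implementation: the function name select_languages, the score dictionary and the example values are all hypothetical, and it assumes that a language is kept when its probability is at least the threshold.

```python
from typing import Dict, Iterable, List, Optional


def select_languages(
    probabilities: Dict[str, float],
    threshold: float,
    included: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the language codes that reach the threshold.

    Hypothetical helper for illustration only, not a Mindbreeze API.
    """
    # None means "no restriction"; otherwise only listed languages count.
    allowed = set(included) if included is not None else None
    selected = [
        lang
        for lang, prob in probabilities.items()
        if prob >= threshold and (allowed is None or lang in allowed)
    ]
    # Most probable language first.
    return sorted(selected, key=lambda lang: probabilities[lang], reverse=True)


# Made-up detector scores for a single document.
scores = {"de": 0.62, "en": 0.35, "fr": 0.03}
print(select_languages(scores, threshold=0.3, included=["de", "en", "it"]))
```

With these made-up scores, a higher threshold would leave fewer languages in the result, and a language missing from the included list is dropped regardless of its score.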

Run the LanguageDetector as a Separate Service

The LanguageDetector plugin can also be used as a separate service. This can improve performance on large installations with multiple indices.

Add a new service in the “Services” section of the “Indices” tab and choose “ItemTransformationServicePlugin.LanguageDetector”. In the settings of the new service, configure a “Display Name” and a free TCP port as “Bind port”. Configure the other settings as described above. Add the newly created Item Transformation Service to each index that should use it.
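
Before entering a value for “Bind port”, it can help to check that the port is really free on the host that will run the service. The following is a minimal sketch using only the Python standard library; it is not part of Mindbreeze, the helper name is_port_free is hypothetical, and the port number in the usage line is an arbitrary example. It checks the loopback address by default, so pass a different host if the service binds to another interface.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, e.g. because another process uses the port.
            return False
        return True


if __name__ == "__main__":
    candidate = 23900  # arbitrary example value, not a Mindbreeze default
    state = "free" if is_port_free(candidate) else "in use"
    print(f"Port {candidate} is {state}")
```

The check only reflects the moment it runs; another process could still claim the port before the service starts.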