Microsoft File Connector
Installation and Configuration
Mindbreeze GmbH, A-4020 Linz, 2018.
All rights reserved. All hardware and software names used are registered trade names and/or registered trademarks of the respective manufacturers.
These documents are highly confidential. No rights to our software or our professional services, or results of our professional services, or other protected rights can be based on the handing over and presentation of these documents.
Distribution, publication or duplication is not permitted.
The term ‘user’ is used in a gender-neutral sense throughout the document.
Before installing the Microsoft File Connector, ensure that the Mindbreeze Server is already installed and that this connector is included in the Mindbreeze license.
Extending Mindbreeze to use Microsoft File Connector
The Microsoft File Connector is available as a ZIP file. This file must be registered with the Mindbreeze Server via mesextension.exe as follows:
mesextension --interface=plugin --type=archive --file=MicrosoftFileConnector<version>.zip install
PLEASE NOTE: The Connector can be updated by running the same mesextension command. Mindbreeze Enterprise will automatically carry out the required update.
Needed Rights for Crawling User
- The user must have “Read” permission on the shares to be crawled.
Configuration of Mindbreeze
Select the “Advanced” installation method:
Click on the “Indices” tab and then on the “Add new index” symbol to create a new index.
Enter the index path, e.g. “/data/indices/filesystem”. Change the Display Name of the Index Service and the related Filter Service if necessary.
Add a new data source with the symbol “Add new custom source” at the bottom right.
Configuration of Data Source
Caching Principal Resolution Service
To use the Caching Principal Resolution Service, select CachingLdapPrincipalResolution. The service is then used to resolve the AD group membership of a user during search.
For more details, see Caching Principal Resolution Service.
- “Root Paths”: The root path must be a UNC path.
- “Thread Count”: The number of threads used by the crawler. Both traversal of directories and retrieval of documents are performed in parallel.
- “Batch Size”: The size of the queue where the documents are queued before retrieving their properties and content.
- “Includes”: Only those files and directories whose paths match this pattern are crawled (case sensitive).
- “Document Size Limit (MB)”: The crawler ignores documents larger than this size. If this limit is changed, the corresponding limit and the rpc-timeout of the filter service should be adapted as well.
- “Include Patterns”: The “regexpIgnoreCase:” prefix allows the use of case-insensitive regular expressions.
- “Excludes”: Files and directories whose paths match this pattern are not crawled (case sensitive).
- “Exclude Patterns”: The “regexpIgnoreCase:” prefix allows the use of case-insensitive regular expressions.
- “Exclude Directories”: If selected, directories are not crawled.
- “Always Use Directory Rights”: If selected, security permissions set directly on the files themselves are ignored.
- “Full Traversal Interval (Hours)”: Interval between two full traversals of all documents in the file share. Modified documents are crawled during the incremental traversal after the “Crawler Interval”.
- “Remove Deleted Documents from Index”: If selected, documents that have been deleted from the file share are removed from the index at the end of a full traversal.
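The interplay of the include, exclude and size-limit options can be illustrated with a small Python sketch. It is not the connector's implementation: the function names, the example paths, the assumption that patterns are regular expressions searched anywhere in the path, the rule that excludes win over includes, and the treatment of an empty include list as "crawl everything" are all assumptions made for illustration. Only the “regexpIgnoreCase:” prefix and the case-sensitive default are taken from the option descriptions above.

```python
import re

IGNORE_CASE_PREFIX = "regexpIgnoreCase:"  # prefix named in the option descriptions


def compile_pattern(pattern):
    """Compile a pattern; the prefix switches to case-insensitive matching.

    Assumption: patterns without the prefix are case-sensitive regular
    expressions.
    """
    if pattern.startswith(IGNORE_CASE_PREFIX):
        return re.compile(pattern[len(IGNORE_CASE_PREFIX):], re.IGNORECASE)
    return re.compile(pattern)


def should_crawl(path, size_mb, includes, excludes, size_limit_mb):
    """Hypothetical decision helper, not the connector's actual logic."""
    if size_mb > size_limit_mb:
        return False  # larger than "Document Size Limit (MB)"
    if any(compile_pattern(p).search(path) for p in excludes):
        return False  # assumption: excludes take precedence over includes
    if not includes:
        return True  # assumption: no include pattern means "crawl everything"
    return any(compile_pattern(p).search(path) for p in includes)


# Hypothetical example values.
includes = [r"regexpIgnoreCase:\.(docx|pdf)$"]
excludes = [r"\\Archive\\"]
print(should_crawl(r"\\fileserver\share\Reports\Q1.PDF", 2, includes, excludes, 50))   # True
print(should_crawl(r"\\fileserver\share\Archive\old.pdf", 2, includes, excludes, 50))  # False
```

Without the “regexpIgnoreCase:” prefix, the first path would not match, because `.PDF` differs in case from `pdf`.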
Content Location Optimization
For crawling large files, such as Outlook PST files, it is beneficial to use Content Location Optimization.
Configure the mount point according to the screenshot above. The following configuration options are needed:
- “Root Directory (UNC Path)”: Use the same root path you used in the source config above.
- “Root Directory (Mount Path)”: The local path to which the UNC Path is mounted.
- “Files Pattern (Regex)”: A regex pattern matching those files that should be indexed using Content Location Optimization.
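How the three options relate can be sketched in Python as follows. The helper `to_mount_path` and the example values (`\\fileserver\share`, `/mnt/share`, the `\.pst$` pattern) are hypothetical. The sketch only illustrates the idea that a file under the UNC root is read from the corresponding location under the local mount when its path matches the files pattern; it does not reproduce the connector's internals.

```python
import posixpath
import re


def to_mount_path(unc_path, unc_root, mount_root, files_pattern):
    """Return the local path for unc_path, or None if it is not eligible.

    Hypothetical helper. Assumptions: the pattern is applied to the full
    UNC path, and the UNC root is compared case-insensitively.
    """
    if not re.search(files_pattern, unc_path):
        return None  # not selected by "Files Pattern (Regex)"
    root = unc_root.rstrip("\\")
    if not unc_path.lower().startswith(root.lower() + "\\"):
        return None  # file lies outside "Root Directory (UNC Path)"
    # Convert the remainder of the path to forward slashes for the local mount.
    relative = unc_path[len(root) + 1:].replace("\\", "/")
    return posixpath.join(mount_root, relative)


print(to_mount_path(r"\\fileserver\share\mail\archive.pst",
                    r"\\fileserver\share", "/mnt/share", r"\.pst$"))
# prints /mnt/share/mail/archive.pst
```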
The Content Location Optimization feature requires that the UNC Path is mounted locally. This can be configured using the System Configuration in the Management Center:
- Create the local folder using Filemin:
- Grant Permissions to the Mindbreeze user (mes):
- Add a CIFS mount using the “Disk and Network Filesystems” Module:
- Configure the mount:
- After you press “Create”, the network filesystem is mounted and ready for use.
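Once the share is mounted, a quick check that the mount path exists and is readable can help to catch permission problems before the first crawl. The following Python sketch is an optional sanity check and not part of the product. The path `/mnt/share` is a hypothetical example, and the script should be run as the Mindbreeze user (mes) for the result to be meaningful.

```python
import os


def check_mount(path):
    """Report basic facts about a mount path for the current user."""
    return {
        "exists": os.path.isdir(path),
        "is_mount_point": os.path.ismount(path),
        # Read plus execute permission is needed to list and enter a directory.
        "readable": os.access(path, os.R_OK | os.X_OK),
    }


if __name__ == "__main__":
    # Hypothetical mount path; replace with the "Root Directory (Mount Path)".
    for name, ok in check_mount("/mnt/share").items():
        print(f"{name}: {ok}")
```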
Crawling Outlook PST Files
In addition to the File Crawler configuration above, you also need to add an Outlook PST data source to crawl PST files. Remove “Default” from the “Category Instance” field.
Finally, ensure that a Filter Plugin is enabled for the .pst extension.
The user must have read access permission in order to crawl a share.
- “LDAP Server”: Provide this only if you want to override the LDAP setting configured under the “Network Setting” tab.
Open Search Results
To open search results from a Microsoft File data source, the current user has to be signed in to the corresponding file server.
Uninstalling the Microsoft File Connector
To uninstall the Microsoft File Connector, first delete all Microsoft File Crawlers and then run the following command:
mesextension --interface=plugin --type=archive --file=MicrosoftFileConnector<version>.zip uninstall