Mindbreeze GmbH, A-4020 Linz, 2018.
All rights reserved. All hardware and software names are brand names and/or trademarks of their respective manufacturers.
These documents are strictly confidential. The submission and presentation of these documents does not confer any rights to our software, our services and service outcomes or other protected rights.
The dissemination, publication or reproduction hereof is prohibited.
For ease of readability, gender differentiation has been waived. Corresponding terms and definitions apply within the meaning and intent of the equal treatment principle for both sexes.
Before you install the Atlassian Confluence Connector plugin, you need to ensure that the Mindbreeze server is installed and that this connector is also included in the Mindbreeze license. The Atlassian Confluence Connector is installed by default on the Mindbreeze InSpire server. If you want to install or update the connector manually, use the Mindbreeze Management Center.
To install or update Mindbreeze plugin files, open the Mindbreeze Management Center. Navigate to the "Plugins" tab under the menu item "Configuration". Select the ZIP file under the "Plugin management" section and use the "Upload" button to upload it. This will automatically install or update the connector. Mindbreeze services are restarted after a plugin installation.
When prompted to choose an installation method, select "Advanced".
Click on the "Indices" tab and then on the "Add new index" icon to create a new index.
Enter the index path, e.g. "/data/indices/confluence". If necessary, adjust the Display Name of the Index Service and the associated Filter Service.
Add a new data source using the symbol "Add new custom source" on the lower right.
If not already selected, select "Atlassian Confluence" using the "Category" button.
With the "Crawler Interval" setting, you configure the amount of time that elapses between two indexing runs.
The "Crawling Root" field specifies a URL through which an Atlassian Confluence sitemap is accessible. If the Mindbreeze Sitemap Generator add-on is installed on your Atlassian Confluence server and a sitemap has been generated, enter the URL <Atlassian Confluence URL>/plugins/servlet/sitemapservlet?jobbased=true.
The "URL Regex" field lets you define a regular expression that the links to be indexed must match.
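As an illustration, the crawling root can be derived from the base URL of the Confluence server. The following is a minimal sketch; the function name and the host confluence.example.com are hypothetical, only the servlet path is taken from the text above.

```python
def sitemap_url(confluence_base_url: str) -> str:
    """Append the sitemap servlet path to a Confluence base URL.

    A trailing slash on the base URL is removed so the result
    does not contain a double slash.
    """
    servlet_path = "/plugins/servlet/sitemapservlet?jobbased=true"
    return confluence_base_url.rstrip("/") + servlet_path


# Hypothetical host, used only for illustration.
print(sitemap_url("https://confluence.example.com/"))
```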
If certain URLs should be excluded from the crawl, they can be configured using a regular expression under "URL Exclude Pattern".
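Before saving the patterns, it can help to try them against a few sample URLs. The sketch below uses Python's re module purely for illustration; the patterns, the host and the function name are hypothetical, and it assumes that a URL must match "URL Regex" completely and must not match "URL Exclude Pattern". The connector's regular expression dialect and matching rules may differ from Python's.

```python
import re

# Hypothetical patterns; adapt them to your own Confluence URL layout.
URL_REGEX = re.compile(r"^https://confluence\.example\.com/display/.*")
URL_EXCLUDE_PATTERN = re.compile(r".*/display/ARCHIVE/.*")


def would_be_crawled(url: str) -> bool:
    """Return True if the URL matches the include pattern and not the exclude pattern."""
    included = URL_REGEX.fullmatch(url) is not None
    excluded = URL_EXCLUDE_PATTERN.fullmatch(url) is not None
    return included and not excluded


for sample in (
    "https://confluence.example.com/display/DOCS/Home",
    "https://confluence.example.com/display/ARCHIVE/Old+Page",
    "https://confluence.example.com/admin/console.action",
):
    print(sample, would_be_crawled(sample))
```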
With the option "Convert URL-s to lower case", all discovered URLs are converted to lower case.
If DNS resolution fails for certain web servers due to a network problem, you can specify their IP addresses using "Additional Hosts File".
Use the "Accept Headers" setting if you want to add specific HTTP headers (for example, Accept-Language).
To process Confluence sitemaps, activate "Delta Crawling" and enter the Confluence sitemap URL as the crawling root.
In this mode, the connector reads the pages exclusively from the sitemaps. The lastmod and changefreq properties of each sitemap entry are compared with the indexed pages. A precise sitemap makes very high-frequency indexing strategies possible.
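The comparison described above can be pictured as follows. This is a simplified sketch, not the connector's actual implementation: it assumes a sitemap in the standard sitemaps.org format, considers only lastmod (changefreq is ignored), and treats unknown pages or pages without lastmod as changed. The function name, the sample sitemap and the index state are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, used only for illustration.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://confluence.example.com/display/DOCS/Home</loc>
    <lastmod>2018-05-02T10:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://confluence.example.com/display/DOCS/Setup</loc>
    <lastmod>2018-04-01T08:30:00+00:00</lastmod>
  </url>
</urlset>
"""


def urls_to_reindex(sitemap_xml: str, indexed: dict) -> list:
    """Return sitemap URLs that are new or modified after their indexed timestamp."""
    root = ET.fromstring(sitemap_xml.strip())
    changed = []
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", namespaces=NS)
        lastmod = entry.findtext("sm:lastmod", namespaces=NS)
        if loc is None:
            continue
        loc = loc.strip()
        known = indexed.get(loc)
        # Unknown pages and pages without lastmod are re-indexed to be safe.
        if known is None or lastmod is None:
            changed.append(loc)
        elif datetime.fromisoformat(lastmod.strip()) > known:
            changed.append(loc)
    return changed


# Hypothetical index state: both pages were indexed on 15 April 2018.
indexed_at = datetime(2018, 4, 15, tzinfo=timezone.utc)
index_state = {
    "https://confluence.example.com/display/DOCS/Home": indexed_at,
    "https://confluence.example.com/display/DOCS/Setup": indexed_at,
}
print(urls_to_reindex(SAMPLE_SITEMAP, index_state))
```

In this example only the "Home" page is reported, because its lastmod is later than the time it was indexed.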
For the "Sitemap-based Delta Crawling" mode, two options are available:
The "Use Stream Parser" option uses a stream parser for processing the sitemap. This option is suitable for sitemaps with a lot of URLs.
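To illustrate why a stream parser suits large sitemaps, the following sketch reads the entries one at a time instead of loading the whole document into memory. It uses Python's xml.etree.ElementTree.iterparse and assumes the standard sitemaps.org namespace; it only demonstrates the general technique and says nothing about how the connector is implemented. The function name and sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
LOC_TAG = SITEMAP_NS + "loc"
URL_TAG = SITEMAP_NS + "url"


def iter_sitemap_locations(stream):
    """Yield each <loc> value from a sitemap while parsing it incrementally."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == LOC_TAG and elem.text:
            yield elem.text.strip()
        elif elem.tag == URL_TAG:
            # Discard the children of a finished <url> entry to keep memory use low.
            elem.clear()


# Hypothetical sitemap content, used only for illustration.
sample = io.BytesIO(
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://confluence.example.com/display/DOCS/Home</loc></url>"
    b"<url><loc>https://confluence.example.com/display/DOCS/Setup</loc></url>"
    b"</urlset>"
)
for location in iter_sitemap_locations(sample):
    print(location)
```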
In this section (available only when "Advanced Settings" is selected), the crawl speed can be adjusted.
Under "Number Of Crawler Threads", you can define how many threads fetch pages from the web server simultaneously.
"Request Interval" defines the number of milliseconds each crawler thread waits between two requests. However, a "crawl-delay" directive in the server's robots.txt is always taken into account and overrides this value.
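The precedence rule can be sketched as follows. The example uses Python's urllib.robotparser to read a crawl-delay value (given in seconds) and falls back to the configured interval when none is present. The function name and the sample robots.txt content are hypothetical, and the connector's own handling may differ in detail.

```python
from urllib.robotparser import RobotFileParser


def effective_delay_ms(request_interval_ms: int, robots_txt: str, user_agent: str = "*") -> int:
    """Return the wait time in milliseconds, letting crawl-delay override the interval."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawl_delay = parser.crawl_delay(user_agent)  # seconds, or None if not set
    if crawl_delay is not None:
        return int(crawl_delay * 1000)
    return request_interval_ms


# Hypothetical robots.txt files, used only for illustration.
with_delay = "User-agent: *\nCrawl-delay: 2\n"
without_delay = "User-agent: *\nDisallow: /admin\n"

print(effective_delay_ms(250, with_delay))
print(effective_delay_ms(250, without_delay))
```

With these inputs, the first call uses the 2-second crawl-delay, while the second keeps the configured 250 milliseconds.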
You can enter a proxy server in the "Network" tab if your infrastructure so requires.
If the Atlassian Confluence sitemap is protected by HTTP form authentication, the login parameters can be configured in the "Form Based Login" section as follows:
Add the Caching Confluence Principal Resolution Service. (Note: ConfluenceAccessx.x.x.zip must first be installed in the "Plugins" tab.)
Enter the "Confluence Server URL".
The login information required for accessing the "Confluence Server URL" must be configured in the "Network" tab and mapped to the "Confluence Server URL" endpoint.
Enter the directory path for the cache and, if necessary, change the "Cache In Memory Items Size", depending on the memory available to the JVM.
The service is available at the specified "Webservice Port". If multiple Principal Resolution Services are configured, make sure that the "Webservice Port" parameters are different and the configured ports are available.
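When several Principal Resolution Services are planned, the port assignment can be checked up front. The sketch below is one possible approach and not part of the product: the service names, port numbers and function names are hypothetical. It reports ports that are assigned twice and can test whether a port can currently be bound on the local machine.

```python
import socket


def find_port_conflicts(ports: dict) -> list:
    """Return a message for every port that is assigned to more than one service."""
    problems = []
    seen = {}
    for service, port in ports.items():
        if port in seen:
            problems.append(f"{service} and {seen[port]} both use port {port}")
        else:
            seen[port] = service
    return problems


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if the port can currently be bound on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


# Hypothetical services and ports, used only for illustration.
planned = {
    "Confluence Principal Resolution": 23900,
    "Second Principal Resolution": 23900,
}
print(find_port_conflicts(planned))
```

The availability check should be run on the server that hosts the services, since it only tests the local machine.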