Copyright © Mindbreeze GmbH, A-4020 Linz, 2022.
All rights reserved. All hardware and software names used are registered trade names and/or registered trademarks of the respective manufacturers.
These documents are highly confidential. No rights to our software or our professional services, or results of our professional services, or other protected rights can be based on the handing over and presentation of these documents.
Distribution, publication or duplication is not permitted.
The term “user” is used in a gender-neutral sense throughout the document.
The Mindbreeze Sitemap Generator add-on generates a sitemap of the Atlassian Confluence pages. The sitemap contains only those pages that the user generating the sitemap is permitted to access. Additionally, you can exclude pages using regular expressions.
The Remote API interface of Atlassian Confluence must be enabled for the Mindbreeze Sitemap Generator add-on to work. Activate it at: “Further Configuration > Remote API (XML-RPC & SOAP)”
Install the add-on using “Manage add-ons” and “Upload add-on”:
Please refer to the Product Information for the latest supported version.
Choose the file “confluence-5.6-mindbreeze-plugin-<version number>.jar” in the Plugin folder of the Web Connector and submit it using the “Upload” button:
The add-on installation is finished:
Use the “Configure” button to change the settings of the Mindbreeze Sitemap Generator add-on:
Sitemap Generating User | Atlassian Confluence user, used to generate the sitemap. Default: admin |
Sitemap Downloader Group | Only members of the given Atlassian Confluence group are allowed to download the sitemap. Default: confluence-users |
ACL Encryption Password | A password used for encrypting the ACL elements. If this parameter is left empty, the ACL elements will not be encrypted. |
Confluence Base URL | The base URL that should be used for generating the links in the sitemap. |
Sitemap Cache Directory | A directory where the generated sitemap.xml is stored on the Atlassian Confluence Server. |
Disable Parent Reference Metadata for Pages | If enabled, no reference metadata to the parent document is generated for Confluence pages. This reduces the number of database queries. |
Add Performance Metrics to Sitemap | If enabled, the times required for sitemap generation tasks are entered as comments in the sitemap. |
ACL Exempt Group Name (ex. confluence-administrators) | Group that has read-access to all Confluence Content regardless of the explicit rights. |
Custom Content Property Key Pattern | With this option, custom content properties can be included in the sitemap. Specify a regular expression that matches the names of the custom content properties (without the prefix custprop_). Matching properties are included in the sitemap. Note: Custom content property values of type JSON Object are flattened into one or more metadata items. Furthermore, custom content properties are only supported for pages, not for attachments. Default value: not set. Example values: .* (includes all custom content properties) or myProp.* (includes all custom content properties that begin with myProp, e.g. myPropLikes). Note: This feature is only supported for Confluence version 5.6 and later. |
Delta Sitemap Containing Last Modified Sites Minutes | The delta sitemap contains all documents that have been changed within the last N minutes. This option sets the value of N. If this option is not set, the delta sitemap will not contain any <url> elements. |
Generate REST URLs | Instead of the normal Confluence sitemap URLs, REST API URLs are generated, which are set as the document key in the Confluence crawler. One advantage is that no temporary duplicates are created during a delta crawl run when the title of a page has changed. If you enable this option, also make sure that the option "Use Rest API for Page Content" is active in the Atlassian Confluence Crawler. Attention: If you have already indexed Confluence and want to enable or disable this option afterwards, start from an empty index before changing it. Otherwise, duplicate documents result, because the mes:key scheme changes in the process. |
REST URL Base Path | If the REST API endpoint is not located directly on <your-confluence-url>/rest/api, the "REST URL Base Path" can be specified. For example, if it is located at <your-confluence-url>/mybasepath/rest/api, the "REST URL Base Path" value must be /mybasepath. |
Include Labels | If active, label metadata ("labels") for sites, spaces and attachments are included in the sitemap. |
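The effect of a “Custom Content Property Key Pattern” value can be tried out before it is entered in the configuration. The following is a minimal sketch, not the add-on's implementation. It assumes that the pattern has to match the whole property name (without the custprop_ prefix) and that an unset pattern selects nothing. The function name select_custom_properties and the sample property names are hypothetical, and Python's regular expression dialect may differ in details from the one the add-on uses.

```python
import re


def select_custom_properties(property_names, key_pattern):
    """Return the property names selected by the key pattern.

    Assumption: the pattern must match the whole name, and an unset
    (empty or None) pattern selects no custom content properties.
    """
    if not key_pattern:
        return []
    compiled = re.compile(key_pattern)
    return [name for name in property_names if compiled.fullmatch(name)]


# Hypothetical property names, given without the custprop_ prefix.
names = ["myPropLikes", "myPropViews", "otherProp"]

print(select_custom_properties(names, ".*"))        # all three names
print(select_custom_properties(names, "myProp.*"))  # names starting with myProp
print(select_custom_properties(names, None))        # nothing selected
```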
After a successful installation of the Atlassian Confluence Sitemap Generator Add-on, the sitemap can be generated with a scheduled job. To set up the sitemap generator job, navigate to the “Scheduled Jobs” section of the Confluence Admin interface.
The sitemap generator job can be started automatically according to a given schedule. This schedule can be specified using standard cron expressions by clicking on the “Edit” action of the “scheduledjob.desc.mindbreezeGenerateSitemapJob”.
The sitemap generator job can also be started manually by clicking on the “Run” action.
After the sitemap generator job has completed, the sitemap is available at the following URL: <confluence_url>/plugins/servlet/sitemapservlet?jobbased=true.
The delta sitemap is available at the following URL: <confluence_url>/plugins/servlet/sitemapservlet?jobbased=true&delta=true
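To check that the job has produced a sitemap, the two URLs above can be requested from a script. The following is a minimal sketch under stated assumptions: the host name confluence.example.com, the function names and the use of HTTP Basic authentication are illustrative and not taken from the add-on. The account used must be a member of the configured Sitemap Downloader Group.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SERVLET_PATH = "/plugins/servlet/sitemapservlet"


def sitemap_url(confluence_url, delta=False):
    """Build the job-based sitemap URL, optionally for the delta sitemap."""
    params = {"jobbased": "true"}
    if delta:
        params["delta"] = "true"
    return confluence_url.rstrip("/") + SERVLET_PATH + "?" + urlencode(params)


def count_url_elements(sitemap_xml):
    """Count <url> elements, ignoring any XML namespace prefix."""
    root = ET.fromstring(sitemap_xml)
    return sum(1 for element in root.iter()
               if element.tag.rsplit("}", 1)[-1] == "url")


def fetch_sitemap(confluence_url, user, password, delta=False):
    """Download the sitemap as bytes.

    Assumption: the Confluence instance accepts HTTP Basic authentication.
    """
    request = urllib.request.Request(sitemap_url(confluence_url, delta))
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()
```

A call such as count_url_elements(fetch_sitemap("https://confluence.example.com", "user", "secret", delta=True)) then reports how many documents the delta sitemap currently lists. A result of 0 is expected when the delta minutes option is not set.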
You can configure the log level for the Atlassian Confluence Sitemap Generator Add-On at “Administration -> Logging and Profiling”.
Create a new entry for the Class/Package name “com.mindbreeze.enterprisesearch.connectors” and select the log level.
The log files are available at the following path: <Confluence Home>/logs/atlassian-confluence.log
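Because atlassian-confluence.log also contains the messages of all other Confluence components, it can help to filter for the add-on's entries. The following is a minimal sketch that assumes each relevant log line contains the package name configured above. The function names are hypothetical, and continuation lines such as stack traces are not captured by this simple substring filter.

```python
from pathlib import Path

LOGGER_PREFIX = "com.mindbreeze.enterprisesearch.connectors"


def addon_log_lines(lines, logger_prefix=LOGGER_PREFIX):
    """Keep only the lines that mention the add-on's package name."""
    return [line.rstrip("\n") for line in lines if logger_prefix in line]


def read_addon_log(confluence_home):
    """Read <Confluence Home>/logs/atlassian-confluence.log and filter it."""
    log_file = Path(confluence_home) / "logs" / "atlassian-confluence.log"
    with log_file.open(encoding="utf-8", errors="replace") as handle:
        return addon_log_lines(handle)
```

For example, read_addon_log("/var/atlassian/application-data/confluence") returns the add-on's log lines, where the path stands in for the actual Confluence home directory of the installation.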