With the help of the Atlassian Confluence REST connector, you can connect your Confluence Cloud instance to the Mindbreeze search. This allows you to use your Confluence spaces, pages, blogs, attachments, comments, etc. in the Mindbreeze Insight Apps.
The Atlassian Confluence REST Connector is already included in Mindbreeze InSpire by default.
A Confluence Cloud user with access permissions for all spaces and pages is required for use in Mindbreeze InSpire. This user generates an API token with the following scopes, which primarily grant read permissions for the Confluence Cloud REST API:
read:account
read:analytics.content:confluence
read:app-data:confluence
read:attachment:confluence
read:audit-log:confluence
read:blogpost:confluence
read:configuration:confluence
read:comment:confluence
read:confluence-content.all
read:confluence-content.permission
read:confluence-content.summary
read:confluence-groups
read:confluence-props
read:confluence-space.summary
read:confluence-user
read:content:confluence
read:content-details:confluence
read:content.metadata:confluence
read:content.permission:confluence
read:content.property:confluence
read:content.restriction:confluence
read:database:confluence
read:group:confluence
read:me
read:task:confluence
read:user.property:confluence
read:custom-content:confluence
read:embed:confluence
read:inlinetask:confluence
read:label:confluence
read:page:confluence
read:permission:confluence
read:relation:confluence
read:space-details:confluence
read:space:confluence
read:space.permission:confluence
read:space.property:confluence
read:space.setting:confluence
read:template:confluence
read:user:confluence
read:watcher:confluence
read:whiteboard:confluence
readonly:content.attachment:confluence
search:confluence
Hint: Ensure that the API token is renewed regularly before it expires.
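When an API token is created or renewed, it can help to compare the scopes actually selected against the scope list above. The following is a minimal sketch of such a comparison; the helper names and the shortened excerpt of the list are illustrative assumptions and are not part of Mindbreeze InSpire or the Confluence API.

```python
from __future__ import annotations


def parse_scopes(text: str) -> set[str]:
    """Return the non-empty, stripped lines of a scope list as a set."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def missing_scopes(required: set[str], granted: set[str]) -> list[str]:
    """Return the required scopes that are not granted, sorted by name."""
    return sorted(required - granted)


# Excerpt of the scope list above; paste the complete list for a real check.
REQUIRED_EXCERPT = """
read:page:confluence
read:space:confluence
read:attachment:confluence
search:confluence
"""

# Hypothetical set of scopes selected while creating the token.
granted = {"read:page:confluence", "search:confluence"}

print(missing_scopes(parse_scopes(REQUIRED_EXCERPT), granted))
# ['read:attachment:confluence', 'read:space:confluence']
```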
Open the Mindbreeze InSpire Management Center in your browser to start with the configuration.
Add a new index in the tab “Indices” using the button “+Add Index”. Select the desired “Index Node” and “Client Service” and choose the data source “Atlassian Confluence REST” in the setting “Data Source”. Then confirm your entries with “Apply”:
Activate “Advanced Settings” and change the following settings:
Setting | Entry |
Use ACL References | Activated |
Enable Precomputed ACLs | Force |
Now configure the data source.
Setting | Description |
Confluence Base URL* | The URL for the Confluence Cloud instance in the format: https://api.atlassian.com/ex/confluence/<<CloudID>>/wiki |
Confluence Credential* | The username/password credential created in the tab “Network”. The following items must be configured for this: |
Log All Requests | When enabled, all requests to the Confluence API are logged in a file named “request-log.csv”. This can be useful for troubleshooting. |
Connection Timeout | Time in seconds to wait for a response before canceling the API call. |
Maximum Fetch Retries | The maximum number of retries that will be attempted when the server sends certain throttling responses (e.g., 429). |
Search Page Size | The page size that is used for search requests. Maximum is 25. |
Resource Page Size | The page size that is used for resource requests. Maximum is 100. |
Max Content Length (MB) | If documents exceed the size (in MB) specified in this setting, they will be indexed with empty content. |
User Agent | User agent header used for API calls. |
Redirect Pattern | List of regex patterns for allowed HTTP redirects. |
Trust all SSL Certificates | Allows the use of unsecured connections, for example for test systems. Attention: Must not be enabled in the production environment. |
* = These settings must be configured so that the cache works and is built. All other settings must be configured according to the application. |
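The value for “Confluence Base URL” follows a fixed format in which only the Cloud ID changes. As a small illustration, the sketch below assembles the URL from a Cloud ID. The function name, the example Cloud ID, and the character check (letters, digits and hyphens only) are assumptions made for this example.

```python
import re

# Format taken from the "Confluence Base URL" setting description.
BASE_URL_TEMPLATE = "https://api.atlassian.com/ex/confluence/{cloud_id}/wiki"


def confluence_base_url(cloud_id: str) -> str:
    """Build the base URL for a Cloud ID, rejecting obviously malformed input."""
    cloud_id = cloud_id.strip()
    # Assumption: a Cloud ID consists only of letters, digits and hyphens.
    if not re.fullmatch(r"[0-9A-Za-z-]+", cloud_id):
        raise ValueError(f"unexpected Cloud ID: {cloud_id!r}")
    return BASE_URL_TEMPLATE.format(cloud_id=cloud_id)


# Hypothetical Cloud ID, for illustration only.
print(confluence_base_url("11111111-2222-3333-4444-555555555555"))
```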
Setting | Description |
Include Private Spaces | When enabled, private spaces are also indexed. |
Include Space Keys | A list of spaces to be indexed. Note that only one space key can be specified per line. |
Exclude Space Keys | A list of spaces that should not be indexed. Note that only one space key can be specified per line.
Note: This setting cannot be used in conjunction with the setting “Include Space Keys”. |
Include Custom Property Pattern | This setting allows custom content properties to be indexed. A list of regular expressions is defined that match the names of the custom content properties. Matching properties are indexed. Example values:
Default setting: not set. |
Include Comments | When enabled, comments are indexed. |
Include Attachments | When enabled, attachments are indexed. |
Include Attachments Pattern | Controls which attachments are indexed. A list of regular expressions is defined that are matched against the download URL path. Example of a download URL path: /download/attachments/123456789/My Document.pdf Example to index only PDF attachments: |
Exclude Attachments Pattern | Controls which attachments are excluded from indexing. A list of regular expressions is defined that are matched against the download URL path.
Example of a download URL path: /download/attachments/123456789/My Document.pdf Example to exclude PDF attachments: |
Content Body Format | The format in which the API returns the content of pages and blog posts. |
Comment Body Format | The format in which the API returns the content of comments. |
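To try out attachment patterns before entering them in the configuration, they can be tested against sample download URL paths. The sketch below is only an approximation: the pattern `(?i)\.pdf$` is a hypothetical example value, and the matching semantics (partial match anywhere in the path, exclude patterns take precedence, an empty include list admits everything) are assumptions, since the connector's exact behavior is not described here.

```python
import re
from typing import Sequence


def is_attachment_indexed(
    path: str,
    include_patterns: Sequence[str] = (),
    exclude_patterns: Sequence[str] = (),
) -> bool:
    """Approximate the include/exclude decision for one download URL path."""
    # Assumption: a matching exclude pattern always wins.
    if any(re.search(p, path) for p in exclude_patterns):
        return False
    # Assumption: without include patterns, every attachment is admitted.
    if not include_patterns:
        return True
    return any(re.search(p, path) for p in include_patterns)


PDF_ONLY = [r"(?i)\.pdf$"]  # hypothetical pattern: paths ending in .pdf

print(is_attachment_indexed(
    "/download/attachments/123456789/My Document.pdf", include_patterns=PDF_ONLY))
# True
print(is_attachment_indexed(
    "/download/attachments/123456789/diagram.png", include_patterns=PDF_ONLY))
# False
```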
Setting | Description |
Enable Global Anonymous | If your Confluence instance has global anonymous access enabled (see https://support.atlassian.com/confluence-cloud/docs/set-up-public-access/), you should enable this setting so that users who are not logged in can also find anonymous documents in Mindbreeze. |
Delete Documents | When enabled, documents deleted in Confluence are also deleted in Mindbreeze. Hint: It is recommended to always leave this setting enabled on production systems. |
In the new or existing service, select the option “Atlassian Confluence Principal Resolution Service” in the setting “Service”.
For details on creating a cache, its basic configuration, and additional configuration options for a Principal Resolution Service, see Installation & Configuration - Caching Principal Resolution Service.
The following table describes the settings you need to configure for the Principal Resolution Service. Depending on the use case, additional settings are available as options.
For the Principal Resolution Service to resolve Project Roles or Groups for a user, the user’s email address must be public. This is configured in the profile and visibility settings of the user's Atlassian account: set the column "Who can see this?" to "Anyone" for the email address.
Setting | Description |
Confluence Base URL* | The URL for the Confluence Cloud instance in the format: https://api.atlassian.com/ex/confluence/<<CloudID>>/wiki |
Confluence Credential* | The username/password credential created in the tab “Network”. The following items must be configured for this: |
Log All Requests | When enabled, all requests to the Confluence API are logged in a file named “request-log.csv”. This can be useful for troubleshooting. |
Connection Timeout | Time in seconds to wait for a response before canceling the API call. |
Maximum Fetch Retries | The maximum number of retries that will be attempted when the server sends certain throttling responses (e.g., 429). |
Search Page Size | The page size that is used for search requests. Maximum is 25. |
Resource Page Size | The page size that is used for resource requests. Maximum is 100. |
Max Content Length (MB) | If documents exceed the size (in MB) specified in this setting, they will be indexed with empty content. |
User Agent | User agent header used for API calls. |
Redirect Pattern | List of regex patterns for allowed HTTP redirects. |
Trust all SSL Certificates | Allows the use of unsecured connections, for example for test systems. Attention: Must not be enabled in the production environment. |
* = These settings must be configured so that the cache works and is built. All other settings must be configured according to the application. |
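To illustrate what a setting such as “Maximum Fetch Retries” typically controls, the following generic sketch retries a request while the server answers with HTTP 429. It is not the connector's implementation: the function name, the `send` callable, the exponential backoff and the use of a Retry-After value are all assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, Tuple

# (HTTP status code, Retry-After value in seconds if the server sent one)
Response = Tuple[int, Optional[float]]


def fetch_with_retries(
    send: Callable[[], Response],
    max_retries: int,
    sleep: Callable[[float], None] = time.sleep,
    base_delay: float = 1.0,
) -> int:
    """Call send() and retry on 429 up to max_retries times; return the last status."""
    attempt = 0
    while True:
        status, retry_after = send()
        if status != 429 or attempt >= max_retries:
            return status
        # Prefer the server's hint; otherwise back off exponentially (assumption).
        delay = retry_after if retry_after is not None else base_delay * 2 ** attempt
        sleep(delay)
        attempt += 1


# Simulated server: throttles twice, then succeeds. Delays are recorded, not slept.
responses = iter([(429, None), (429, None), (200, None)])
delays = []
print(fetch_with_retries(lambda: next(responses), max_retries=3, sleep=delays.append))
# 200
print(delays)
# [1.0, 2.0]
```

With this counting, a value of 3 allows one initial request plus at most three retries.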
Problem: The following error is displayed:
ERROR: Found 200 group permissions for content 'Business Transaction Open'.
There might be more but they are skipped.
Solution: If a page or blog post has more than 200 group restrictions, only the first 200 are retrieved and the rest are skipped.
This behavior stems from a limitation of the Confluence REST API for Data Center.
Problem: The following error is displayed:
Failed to index all pages for space 'ABC'. Continuing with next space.
...
Error on requesting url: 'https://api.atlassian.com/ex/confluence/<<tenant-id>>/wiki/rest/api/search?next=true&cursor=...&expand=...&limit=5&start=25&cql=space+%3D+%22ABC%22+and+type+IN+%28blogpost%2C+page%29+order+by+lastModified+desc' status: 404 responseContent: '{"statusCode":404,"data":{"authorized":false,"valid":false,"errors":[{"message":{"translation":"No content with id <ContentId{id=123456789}> can be found","args":[]}}],"successful":false},"message":"com.atlassian.confluence.api.service.exceptions.NotFoundException: No content with id <ContentId{id=123456789}> can be found"}'
Use Case: This usually happens when already ‘broken’ data is migrated from Data Center to Cloud.
Solution: To ensure clean crawl runs, either exclude the affected spaces from crawling (via the setting “Exclude Space Keys”) or correct the content in Confluence.
The affected pages/blog posts can usually be identified in the Confluence user interface, where the parent entry should be listed but is empty.
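When analyzing such a log entry, it can be convenient to decode the CQL query from the failing URL and to pull the reported content IDs out of the response text. The sketch below does both with the Python standard library; the function names are illustrative, and the regular expression assumes the `ContentId{id=...}` notation shown in the error message above.

```python
import re
from urllib.parse import parse_qs, urlsplit


def cql_from_error_url(url: str) -> str:
    """Return the decoded 'cql' query parameter of a logged request URL."""
    values = parse_qs(urlsplit(url).query).get("cql", [])
    return values[0] if values else ""


def content_ids(response_content: str) -> list[str]:
    """Return the distinct content IDs mentioned as ContentId{id=...}, sorted."""
    return sorted(set(re.findall(r"ContentId\{id=(\d+)\}", response_content)))


url = (
    "https://api.atlassian.com/ex/confluence/<<tenant-id>>/wiki/rest/api/search"
    "?next=true&limit=5&start=25"
    "&cql=space+%3D+%22ABC%22+and+type+IN+%28blogpost%2C+page%29"
    "+order+by+lastModified+desc"
)
message = "No content with id <ContentId{id=123456789}> can be found"

print(cql_from_error_url(url))
# space = "ABC" and type IN (blogpost, page) order by lastModified desc
print(content_ids(message))
# ['123456789']
```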