Copyright © Mindbreeze GmbH, A-4020 Linz, 2024.
All rights reserved. All hardware and software names used are brand names and/or trademarks of their respective manufacturers.
The Microsoft Loop Connector indexes Microsoft Loop pages together with their metadata and content. Unlike other connectors, it works in two stages: a Loop Sitemap Generator service produces a sitemap of the Loop pages, and a Web Connector configured specifically for Loop then crawls this sitemap.
For restrictions that apply to this approach, see the chapter Limitations.
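To make the two-stage approach more concrete, the following sketch shows what a sitemap served by the Sitemap Generator might look like. It assumes the generator emits the standard sitemaps.org format, which this document does not confirm; the page URL path and the timestamp are hypothetical placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: standard sitemaps.org layout. The real output of the
     Loop Sitemap Generator may contain different or additional elements. -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Hypothetical Loop page URL; real entries are produced from Microsoft Loop. -->
    <loc>https://loop.cloud.microsoft/p/EXAMPLE-PAGE-ID</loc>
    <!-- Hypothetical modification date of the page. -->
    <lastmod>2024-01-01T00:00:00Z</lastmod>
  </url>
</urlset>
```

Under this assumption, each `<loc>` entry is one Loop page that the Web Connector fetches and indexes.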
In the Mindbreeze Management Center, open the ‘Configuration’ section to configure the Microsoft Loop Sitemap Generator and the Microsoft Loop Principal Resolution Service.
Add a new service in the tab “Indices” with “+Add Service”. Then select “Microsoft Loop Sitemap Generator” for the setting “Service” in the new service.
Now configure the Microsoft Loop Sitemap Generator with the settings in the section “Connection Settings”.
Setting | Description | Example/Default Setting |
User Credentials* | Specifies the Crawling User for Microsoft Loop. | Example: Loop Mindbreeze User |
Bind Port* | The port where the created sitemap is available. | Example: 23950 |
Sitemap Generation Interval (Minutes) | Defines the interval in which a new sitemap is generated. | Default Setting: 60 |
Page Size | Defines the number of objects that are fetched simultaneously from Microsoft Loop. | Default Setting: 100 |
Log All Requests | If this option is enabled, all requests to Microsoft Loop are written to the “request-log.csv” log file, provided that the login is successful. | Default Setting: Deactivated |
* = These settings are required for the Sitemap Generator to work. All other settings can be configured according to the use case.
Settings marked with “(Advanced Settings)” require “Advanced Settings” to be activated in the configuration. These settings are only necessary in special cases.
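The “Bind Port” setting appears to be tied to the crawling root of the Web Connector. The fragment below repeats the crawling root property from the Web Connector configuration in this document; the assumption, not stated explicitly here, is that the port in this URL has to match the configured “Bind Port” and that the sitemap is served under `/sitemap.xml` on the same host.

```xml
<!-- Assumption: the port (23950) must equal the "Bind Port" of the
     Microsoft Loop Sitemap Generator; adjust both together if you change it. -->
<property name="com.mindbreeze.datasource.crawlingroot" value="http://localhost:23950/sitemap.xml"/>
```

If the Sitemap Generator runs with a different port, the value of this property in the imported Web Connector configuration would need to be changed accordingly.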
Add a new service in the tab “Indices” with “+Add Service”. Then select “Microsoft Loop Principal Resolution Service” for the setting “Service” in the new service.
Now configure the Microsoft Loop Principal Resolution Service with the settings in the section “Connection Settings”.
Hint: For more information about creating and configuring a cache for a Principal Resolution Service, and about other configuration options, see Installation & Configuration - Caching Principal Resolution Service.
Setting | Description | Example/Default Setting |
User Credentials* | Specifies the Crawling User for Microsoft Loop. | Example: Loop Mindbreeze User |
Page Size | Defines the number of objects that are fetched simultaneously from Microsoft Loop. | Default Setting: 100 |
Log All Requests | If this option is enabled, all requests to Microsoft Loop are written to the “request-log.csv” log file, provided that the login is successful. | Default Setting: Deactivated |
* = These settings are required for the Principal Resolution Service to work. All other settings can be configured according to the use case.
Settings marked with “(Advanced Settings)” require “Advanced Settings” to be activated in the configuration. These settings are only necessary in special cases.
Add a new index in the tab “Indices” with “+Add Index”. Select the desired “Index Node” and “Client Service” and select “Web” as “Data Source”. Then confirm your entries with “Apply”.
To set up the Web Connector, copy the Config XML from the Web Connector Import/Export XML section and import it using the Import/Export button.
In addition, the following changes are necessary:
The following limitations should be noted:
Only workspaces to which the specified Loop User has access can be indexed.
The Loop Sitemap Generator can process a maximum of 1000 workspaces.
A workspace may contain a maximum of 1000 users.
The following XML can be used to set up the Web Connector. For more information, see the chapter Setup of the Web Connector.
<settings>
<id>plugin:com.mindbreeze.datasource.Crawler/Web</id>
<attributes>
<attribute name="category" value="Web"/>
<attribute name="categoryinstance" value="Microsoft Loop"/>
<attribute name="datasource" value="Web"/>
<attribute name="processtype" value="command"/>
<attribute name="interval" value="6"/>
<attribute name="intervalmult" value="3600"/>
<attribute name="launchedservice" value="true"/>
</attributes>
<properties>
<property name="com.mindbreeze.datasource.enable_javascript" value="true"/>
<property name="com.mindbreeze.datasource.include_network_resources_hostname">
<![CDATA[login.microsoftonline.com
aadcdn.msauth.net
aadcdn.msftauth.net
login.live.com
.*data.microsoft.com
graph.microsoft.com
substrate.office.com
ecs.office.com
odc.officeapps.live.com
clients.config.office.net
.*sharepoint.com
config.edge.skype.com
.*cdn.office.net
.*hubblecontent.osi.office.net
loop.cloud.microsoft]]>
</property>
<property name="com.mindbreeze.datasource.credential_scripts" value="composite">
<property name="com.mindbreeze.datasource.credential_scripts.script_name" value="MS Login Username"/>
<property name="com.mindbreeze.datasource.credential_scripts.script_allowed_hosts" value="login.microsoftonline.com"/>
<property name="com.mindbreeze.datasource.credential_scripts.script_selector_type" value="XPATH"/>
<property name="com.mindbreeze.datasource.credential_scripts.script_trigger_selector" value="//*[@type=\"email\"]"/>
<property name="com.mindbreeze.datasource.credential_scripts.script" value="// 24.7
event = new Event('change')
usernameField = document.evaluate("//*[@type=\"email\"]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
submitButton = document.evaluate("//*[@type=\"submit\"]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
usernameField.value = mesCredential.username;
usernameField.dispatchEvent(event);
submitButton.click(); "/>
<property name="com.mindbreeze.datasource.credential_scripts.scrip_credential" value="377777061136392"/></property>
<property name="com.mindbreeze.datasource.maxhopsfromcrawlingroot" value="1"/>
<property name="com.mindbreeze.datasource.crawlingroot" value="http://localhost:23950/sitemap.xml"/>
<property name="com.mindbreeze.datasource.robotshonoringpolicytype" value="IGNORE"/>
<property name="com.mindbreeze.datasource.presence_selectors" value="composite">
<property name="com.mindbreeze.datasource.presence_selectors.content_presence_selector_url_patterns" value="https://loop.cloud.microsoft/.*"/>
<property name="com.mindbreeze.datasource.presence_selectors.content_selector_type" value="XPATH"/>
<property name="com.mindbreeze.datasource.presence_selectors.content_presence_selector" value="//meta[@name=\"isready\"]"/></property>
<property name="com.mindbreeze.datasource.exclude_javascript_url_pattern">
<![CDATA[.*robots.txt]]>
</property>
<property name="com.mindbreeze.datasource.enable_verbose_logging" value="false"/>
<property name="com.mindbreeze.datasource.allowed_resource_types">
<![CDATA[DOCUMENT
STYLESHEET
IMAGE
MEDIA
FONT
SCRIPT
XHR
FETCH
PING
CSPVIOLATIONREPORT
OTHER]]>
</property>
<property name="com.mindbreeze.datasource.skip_head_request" value="true"/>
<property name="com.mindbreeze.datasource.parallelqueuecount" value=""/>
<property name="com.mindbreeze.datasource.crawlerthreadcount" value="5"/>
<property name="com.mindbreeze.datasource.scripts" value="composite">
<property name="com.mindbreeze.datasource.scripts.script_name" value="Reload Loop"/>
<property name="com.mindbreeze.datasource.scripts.script_url_patterns" value="https://loop.cloud.microsoft/.*"/>
<property name="com.mindbreeze.datasource.scripts.script_selector_type" value="XPATH"/>
<property name="com.mindbreeze.datasource.scripts.script_trigger_selector" value="//*[@id=\"loopApp-menu2\"]"/>
<property name="com.mindbreeze.datasource.scripts.script" value="// 24.7
location.reload();"/></property>
<property name="com.mindbreeze.datasource.scripts" value="composite">
<property name="com.mindbreeze.datasource.scripts.script_name" value="Press KMSI"/>
<property name="com.mindbreeze.datasource.scripts.script_url_patterns" value="https://login.microsoftonline.com/common/login"/>
<property name="com.mindbreeze.datasource.scripts.script_selector_type" value="XPATH"/>
<property name="com.mindbreeze.datasource.scripts.script_trigger_selector" value="//*[@type=\"submit\"]"/>
<property name="com.mindbreeze.datasource.scripts.script" value="// 24.7
document.evaluate("//*[@type=\"submit\"]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue.click();"/></property>
<property name="com.mindbreeze.datasource.credential_scripts" value="composite">
<property name="com.mindbreeze.datasource.credential_scripts.script_name" value="MS Login Password"/>
<property name="com.mindbreeze.datasource.credential_scripts.script_allowed_hosts" value="login.microsoftonline.com"/>
<property name="com.mindbreeze.datasource.credential_scripts.script_selector_type" value="XPATH"/>
<property name="com.mindbreeze.datasource.credential_scripts.script_trigger_selector" value="//*[@id=\"idA_PWD_ForgotPassword\"]"/>
<property name="com.mindbreeze.datasource.credential_scripts.script" value="// 24.7
event = new Event('change');
passwordField = document.evaluate("//*[@type=\"password\"]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
submitButton = document.evaluate("//*[@type=\"submit\"]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
passwordField.value = mesCredential.password;
passwordField.dispatchEvent(event);
submitButton.click();"/>
<property name="com.mindbreeze.datasource.credential_scripts.scrip_credential" value="377777061136392"/></property>
<property name="com.mindbreeze.datasource.page_load_timeout_seconds" value="20"/>
<property name="com.mindbreeze.datasource.network_timeout" value="20"/>
<property name="com.mindbreeze.datasource.isdeltarun" value="complete"/>
<property name="com.mindbreeze.datasource.on_new_document_script">
<![CDATA[// 24.7
window.open = function(...args) {
console.log("Popup blocked: window.open was called, but no action was taken.");
};
window.print = function () {
window.onbeforeprint();
const meta = document.createElement('meta');
meta.name = "isready";
meta.content = "true";
document.head.appendChild(meta);
};]]>
</property>
<property name="com.mindbreeze.datasource.match_network_resources_hostnames_as_regex" value="true"/>
<property name="com.mindbreeze.datasource.content_presence_selector" value=""/>
</properties>
</settings>