Google Drive Connector
Installation and Configuration

Requirements

Overview of the configuration possibilities

The Google Drive Connector normally uses server-to-server authentication when communicating with Google. This is the method recommended by Google, and nobody needs to be present to operate the connector. It requires a Google Service Account with the “G-Suite Domain-wide Delegation” option enabled. This account can then “impersonate” any user in the G-Suite domain and access that user's files. If you can use this method, follow the steps in the "Service Account" sections below.
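To make the idea of impersonation concrete, the following Python sketch builds the claim set that Google's documented server-to-server OAuth 2.0 flow expects when a service account acts on behalf of a user. It neither signs nor sends anything, and it makes no statement about how the connector works internally. The function name `delegation_claims` and the e-mail addresses are hypothetical placeholders.

```python
import time

# Token endpoint that Google's service-account flow uses as the JWT audience.
TOKEN_URI = "https://oauth2.googleapis.com/token"


def delegation_claims(service_account_email, impersonated_user, scopes,
                      now=None, lifetime_s=3600):
    """Return the JWT claim set for a service account impersonating a user."""
    issued_at = int(time.time() if now is None else now)
    return {
        "iss": service_account_email,   # the service account itself
        "sub": impersonated_user,       # the user whose files are accessed
        "scope": " ".join(scopes),      # space-separated inside the JWT
        "aud": TOKEN_URI,
        "iat": issued_at,
        "exp": issued_at + lifetime_s,  # Google allows at most one hour
    }


claims = delegation_claims(
    "crawler@my-project.iam.gserviceaccount.com",  # placeholder address
    "alice@myorganization.com",                    # placeholder address
    ["https://www.googleapis.com/auth/drive.readonly"],
)
print(claims["sub"])
```

The `sub` claim is what distinguishes delegation from a plain service-account login: without domain-wide delegation, Google rejects a token request that names another user.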

If you do not have a Google Service Account, the alternative is to run the Google Drive Connector using OAuth. The OAuth method is not recommended and should only be used in exceptional cases. It is considerably more complicated: the person whose files are to be indexed must be present once to enter their password in the Google OAuth process and accept the read permissions. If you want to use this method, follow the steps in the "OAuth" sections.

Creating and configuring the Google Service Account

Create a Google Service Account

Navigate to console.developers.google.com/ and register there.

Click on “Credentials” on the left-hand menu bar and then click on “Manage Service Accounts”.

Click on "Create Service Account" and enter an account name. Select the "Service Account Admin" role, and click "Enable G Suite Domain-wide Delegation".

Open the account options on the right-hand side and click "Create Key".

Select "P12" and save the .p12 file.

Note: The .p12 file is required for the Google Drive Crawler and the Google Drive Principal Resolution Service.

You should now see the client ID of your Google service account. Note this ID; it is needed in the next section, "Linking Gsuite to a Google Service Account".

Linking Gsuite to a Google Service Account

Navigate to Gsuite (https://gsuite.google.com/intl/de/) and log in to your domain with an account that has access to the Admin console. If necessary, log in with the same user that you used to register at console.developers.google.com.

Click “Security” > “Show more” > “Advanced Settings” > “Manage API Client Access”.

Enter the client ID of the Google service account in the client name field.

Enter the required API scopes as a single comma-separated string.

Necessary entries:

https://www.googleapis.com/auth/admin.directory.group.readonly,https://www.googleapis.com/auth/admin.directory.user.readonly,https://www.googleapis.com/auth/drive.readonly
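Because the three scopes are entered as one comma-separated string, a stray space or a missing entry is easy to overlook. The following is a minimal Python sketch for checking a string before pasting it; the helper name `missing_scopes` is hypothetical and not part of any Google or Mindbreeze tool.

```python
# The three scopes listed above.
REQUIRED_SCOPES = [
    "https://www.googleapis.com/auth/admin.directory.group.readonly",
    "https://www.googleapis.com/auth/admin.directory.user.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]


def missing_scopes(entered):
    """Return the required scopes that are absent from a comma-separated string."""
    present = {part.strip() for part in entered.split(",") if part.strip()}
    return [scope for scope in REQUIRED_SCOPES if scope not in present]


# The exact string to paste into the scope field.
scope_field = ",".join(REQUIRED_SCOPES)
print(scope_field)
print(missing_scopes(scope_field))  # an empty list means nothing is missing
```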

Configuring Google OAuth

A Google account within a GSuite domain is required. The account is used to index files that this account can access.

Creating the Credentials

This step is required to obtain access to the Google Drive files from the account.

Enabling the Drive API

Pull up the “Library Page” in the API Console. (https://console.developers.google.com/apis/library). Make sure you are logged on with the correct account.

Click on “G Suite APIs” -> “Drive API”.

If the Google Drive API page says "A project is needed to enable APIs", click "Create Project", specify a project name such as "Mindbreeze Crawler", and then click "Create".

Return to the Google Drive API page and click the "Enable" button.

Creating the authorization credentials

Open the Credentials page by clicking the "Credentials" button on the left side. (https://console.developers.google.com/apis/credentials)

Click "Create credentials" and choose "OAuth Client ID".

When the page prompts you to set up the Consent Screen, click "Configure consent screen". Set "Product name shown to users" to a name of your choice, for example "Mindbreeze Crawler". Then click on "Save".

Back on the Create Client ID page, a wizard prompts you to select the “Application Type”. Select "Other", enter a name such as "Mindbreeze Crawler", and click "Create".

A pop-up dialog titled "OAuth Client" appears. Close it by clicking on "OK"; the information it displays is downloaded separately afterwards. The newly created credential appears in the list. Click on the "Download JSON" icon on the right-hand side to download the credential as a JSON file.
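Before pointing the connector at the downloaded file, it can help to confirm that the download is complete and of the expected type. The sketch below assumes the file follows the layout Google uses for desktop-type ("Other") clients, with a top-level "installed" object; the function name `check_client_secret` and the sample values are hypothetical.

```python
import json

# Keys expected inside the "installed" object of a desktop-type client file.
REQUIRED_KEYS = ("client_id", "client_secret", "auth_uri", "token_uri")


def check_client_secret(text):
    """Return a list of problems found in the client secret JSON text."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    section = data.get("installed")
    if not isinstance(section, dict):
        return ['missing top-level "installed" object']
    return [f'missing "{key}"' for key in REQUIRED_KEYS if key not in section]


# Placeholder content; a real file holds the values issued by Google.
sample = json.dumps({"installed": {
    "client_id": "example-id",
    "client_secret": "example-secret",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
}})
print(check_client_secret(sample))  # an empty list means no problems were found
```

To check a real download, read the file as text and pass its content to `check_client_secret`.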

Enabling and configuring the Admin API

This step is necessary to give the account access to the user and group names of GSuite. The access rights for a specific file are computed from this information when searching.

Enable the Admin SDK

Pull up the “Library Page” in the API Console. (https://console.developers.google.com/apis/library). Make sure you are logged on with the correct account.

Click on “G Suite APIs” -> “Admin SDK”.

Make sure that the correct project (for example, "Mindbreeze Crawler") is selected above. Then click on "Enable".

Allow access to read the group and user names for the Google account

Only the “Users.Read” and “Groups.Read” permissions are required. This means that the Google account is only able to read its own files and the names of the groups and users in the GSuite domain. The Google account cannot impersonate other users. The Google account cannot read files from other users unless the files have been explicitly shared.

Open the Google Admin Console (https://admin.google.com) and log in with a GSuite account with administrator privileges.

Click "Admin roles" to navigate to the Admin roles page.

Click "Create a new Role". This role is configured with minimal access rights. Select a name (for example, "Mindbreeze Crawler") and click "Create".

The "Privileges" tab for the new role will appear. Under the "Admin API Privileges" section, expand the "Users" entry and check the "Read" box. Then, expand the "Groups" entry and check the "Read" box. Then click "Save" below. The Mindbreeze Crawler role now has the Admin API permissions “Users.Read” and “Groups.Read”.

Navigate to the "Admins" tab above. Click on "Assign Admins". Enter the e-mail address of the Google Account that you used to activate the Admin SDK. Click "Confirm Assignment". The Mindbreeze Crawler role now has an assigned administrator.

Installation

Before you install the Google Drive connector, make sure that the Mindbreeze server is installed and the Google Drive Connector is included in the license. To install or update the connector, use the Mindbreeze Management Center.

Installing the Plugins via Mindbreeze Management Center

To install the plug-in, open the Mindbreeze Management Center. Select "Configuration" from the menu on the left-hand side. Then navigate to the "Plugins" tab. In the "Plugin Management" section, select the appropriate zip file and upload it by clicking the "Upload" button. This automatically installs or updates the connector, as the case may be. The Mindbreeze services are restarted during this process.

Configuring Mindbreeze

Select the installation method "Advanced" for configuration.

Configuring Google Drive Caching Principal Resolution Service

In a new or existing service, select the "Google Drive Principal Resolution Service" option in the "Service" setting. For more information on creating and configuring a basic configuration of a Principal Resolution Service cache, see Installation & Configuration - Caching Principal Resolution Service.

In the "Connection Settings" section, set the connection settings. The connector supports two configuration variants: Service Account (recommended) and OAuth.

Connecting via a service account

The following settings must be made:

“GSuite Domain”	The domain that is used in Google Drive.
“Service Account Name”	The e-mail address of the Google service account. Usually ends with .gserviceaccount.com
“GSuite Admin User Mail Address”	The e-mail address of the GSuite administrator. Usually ends with the domain of your organization, for example @myorganization.com
“Path to P12 Certificate”	Path to the P12 certificate generated when the Google service account was created.

In addition, the following settings must be set in the "Cache Settings" section:

“Database Directory Path”	Directory in which the cache data is stored.
“Cache Update Interval (Minutes)”	Update interval of the cache, in minutes.
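The service account settings above can be sanity-checked before they are entered. The following Python sketch only applies the hints given in the table (the usual address suffix, an existing certificate file); it is not a Mindbreeze tool, and the function name and all sample values are hypothetical.

```python
import os


def check_service_account_settings(service_account_name, admin_mail, p12_path):
    """Return human-readable warnings; an empty list means nothing looked off."""
    warnings = []
    # The table says the name "usually" ends this way, so this is only a hint.
    if not service_account_name.endswith(".gserviceaccount.com"):
        warnings.append("service account name does not end with .gserviceaccount.com")
    if "@" not in admin_mail:
        warnings.append("admin address does not look like an e-mail address")
    if not os.path.isfile(p12_path):
        warnings.append(f"P12 file not found: {p12_path}")
    return warnings


for warning in check_service_account_settings(
    "crawler@my-project.iam.gserviceaccount.com",  # placeholder
    "admin@myorganization.com",                    # placeholder
    "/path/to/key.p12",                            # placeholder
):
    print("warning:", warning)
```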

Connecting via OAuth

Defining the settings

The following settings must be made:

“GSuite Domain”	The domain that is used in Google Drive.
“Use OAuth instead”	Tick this box so that OAuth is used.
“Client Secret JSON File Path”	Path to the JSON file that was downloaded when the OAuth credentials were created.
“Credential Persistence Directory Path”	Path to a directory where the credentials can be stored.
“OAuth response receive Port (HTTP)”	Port to receive the OAuth code. Select a free port. The port is opened only during initial setup.

In addition, the following settings must be set in the "Cache Settings" section:

“Database Directory Path”	Directory in which the cache data is stored.
“Cache Update Interval (Minutes)”	Update interval of the cache, in minutes.
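The “OAuth response receive Port (HTTP)” setting asks for a free port. One way to test a candidate port on the Mindbreeze system is to try binding to it, as in this Python sketch. It assumes the listener binds to the loopback address, which this guide does not state; the function name `is_port_free` is hypothetical.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
    return True


print(is_port_free(12345))  # 12345 is the example port used in this guide
```

The check is only a snapshot: another process can still claim the port between the check and the start of the service.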

Performing OAuth Authentication

Start the Principal Resolution Service.
The Principal Resolution Service log file displays a note instructing you to open a URL with a browser on the same system.

Open the URL in a browser on the same system. (If this is not possible, use a browser on a different system; however, this requires additional steps.)
You’ll be directed to the Google login page. Sign in with the Google account you planned to use for the connector. Follow the on-screen instructions. Allow access.

The browser displays the text "Received verification code. You may now close this window." This completes the process, and you can skip the remaining steps.
If you used a browser on another system, the browser instead reports that the connection could not be established. In that case, continue as follows:

Copy the URL from the address bar of the browser. It has the following form: “http://localhost:12345/oauthresponse?code=123abc”
On the Mindbreeze system, send an HTTP GET request to this URL. To do so, run the following command from the command line, replacing the example URL with the one you copied: curl "http://localhost:12345/oauthresponse?code=123abc"
The command line then outputs the following text: "Received verification code. You may now close this window." This completes the process.
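If curl is not available on the Mindbreeze system, the same HTTP GET request can be sent with a few lines of Python. This is a sketch of an equivalent request, not a Mindbreeze tool; the function names are hypothetical, and the URL is the example from the steps above.

```python
from urllib.parse import parse_qs, urlsplit
from urllib.request import urlopen


def extract_code(copied_url):
    """Return the value of the `code` query parameter, or raise ValueError."""
    values = parse_qs(urlsplit(copied_url).query).get("code")
    if not values:
        raise ValueError("URL contains no 'code' parameter")
    return values[0]


def relay(copied_url, timeout_s=10):
    """Send the GET request that the curl command sends; return the response body."""
    extract_code(copied_url)  # fail early if the URL was copied incompletely
    with urlopen(copied_url, timeout=timeout_s) as response:
        return response.read().decode("utf-8", errors="replace")


print(extract_code("http://localhost:12345/oauthresponse?code=123abc"))  # prints 123abc
```

Call `relay()` with the copied URL while the Principal Resolution Service is waiting for the code; it fails with a connection error if nothing is listening on the port.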

Configuring the index and crawler

Navigate to the "Indices" tab and click on the "Add new index" icon in the upper right corner to create a new index.

Enter the path to the index and, if necessary, change the display name.

Add a new data source by clicking the "Add new custom source" icon at the top right. Select the Google Drive category. For "Caching Principal Resolution Service," select the previously configured Google Drive Caching Principal Resolution Service.

In the "Connection Settings" section, set the connection settings. The connector supports two configuration variants: Service Account (recommended) and OAuth.

Connecting via a Service Account

The following settings must be made:

“Service Account Name”	The e-mail address of the Google service account. Usually ends with .gserviceaccount.com
“GSuite Admin User Mail Address”	The e-mail address of the GSuite administrator. Usually ends with the domain of your organization, for example @myorganization.com
“Path to P12 Certificate”	Path to the P12 certificate generated when the Google service account was created.

Connecting via OAuth

Make sure you have successfully set up the Google Drive Caching Principal Resolution Service.

The following settings must be made:

“Use OAuth instead”	Tick this box so that OAuth is used.
“Client Secret JSON File Path”	Path to the JSON file that was downloaded when the OAuth credentials were created.
“Credential Persistence Directory Path”	Path to a directory where the credentials can be stored. Select the same directory as for the Google Drive Caching Principal Resolution Service.
“OAuth response receive Port (HTTP)”	Port to receive the OAuth code. Select a free port. The port is opened only during initial setup.

Advanced Settings

In order to make these settings visible, activate the "Advanced" option in the filter tab at the top right. The following settings are available:

In the section "Crawler Settings":

“Maximum File Size (MB)”	Maximum file size. Larger files are ignored. This setting has no effect on Google Docs objects.
“Corpora”	Determines which corpus of documents is indexed. Available values are "User" (indexes the documents of the crawling user) and "Domain" (indexes documents shared with the domain of the crawling user). Default value: "User"
“Number of Crawler Threads”	Number of threads that download documents in parallel. Too high a number may cause Google Drive API errors.
“Max Fetch Retry Count”	For some errors that occur when downloading a document, the connector tries to download the document again. This setting determines the maximum number of attempts.
“Exponential Backoff Wait Time(s)”	For some errors that occur when downloading a document, the connector tries to download the document again. The connector waits for the set time before each attempt. The waiting time is doubled each time (exponential backoff).
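The interplay of “Max Fetch Retry Count” and “Exponential Backoff Wait Time(s)” can be illustrated with a short Python sketch. It is not the connector's implementation: it assumes that the retry count excludes the first attempt and that every exception is retried, neither of which this guide specifies. All names are hypothetical.

```python
import time


def backoff_waits(initial_wait_s, max_retries):
    """Return the wait before each retry, doubling every time."""
    return [initial_wait_s * 2 ** attempt for attempt in range(max_retries)]


def fetch_with_retry(fetch, initial_wait_s=1, max_retries=3, sleep=time.sleep):
    """Call fetch() once, then retry after each backoff wait; re-raise the last error."""
    try:
        return fetch()
    except Exception as err:  # assumption: any error is treated as retryable
        last_error = err
    for wait in backoff_waits(initial_wait_s, max_retries):
        sleep(wait)
        try:
            return fetch()
        except Exception as err:
            last_error = err
    raise last_error


print(backoff_waits(2, 4))  # [2, 4, 8, 16]
```

With a wait time of 2 seconds and 4 retries, a document that keeps failing is abandoned after 30 seconds of waiting in total, which is worth keeping in mind when raising either value.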

In the section “Content Settings“:

“Exclude MIME Types Pattern”	If the MIME type of a document matches this regular expression, the document is ignored. Example: application/vnd\.google\-apps.* ignores all Google Docs documents.
“Exclude Filename Pattern”	If the filename of a document matches this regular expression, the document is ignored. Example: .*\.zip ignores all ZIP archives.
“Enable GoogleDrive Delta Crawling”	If set, the connector fetches only the changes to files from the Google Drive API instead of crawling the whole Google Drive instance.
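The two example patterns can be tried out against sample values before they are entered. The sketch below uses Python's `re` module as a stand-in; the connector's own regular expression engine may differ in details, and matching against the whole value (rather than a substring) is an assumption. The sample filenames and the function name are hypothetical.

```python
import re

# The example patterns from the table above.
MIME_EXCLUDE = re.compile(r"application/vnd\.google\-apps.*")
NAME_EXCLUDE = re.compile(r".*\.zip")


def is_excluded(filename, mime_type):
    """Return True if either pattern matches the whole value."""
    return bool(MIME_EXCLUDE.fullmatch(mime_type) or NAME_EXCLUDE.fullmatch(filename))


samples = [
    ("report.pdf", "application/pdf"),
    ("backup.zip", "application/zip"),
    ("Meeting notes", "application/vnd.google-apps.document"),
]
for filename, mime_type in samples:
    print(filename, is_excluded(filename, mime_type))
```

Under these assumptions the patterns are case-sensitive, so a file named backup.ZIP would not be excluded by .*\.zip.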
