bernard

package module
v0.5.1
Published: Mar 15, 2021 License: MIT Imports: 9 Imported by: 2

README

Introduction

Bernard is an essential character in my Journey of Transfer narrative, as he is in charge of mirroring the state of a Shared Drive to a specified datastore. Specifically, Bernard acts as an engine that fetches changes from the Google Drive API and propagates them to a datastore such as SQLite.

Journey of Transfer is a narrative I am writing with projects named after characters of Westworld. The narrative is my exploration process of the Go language, while building a programme utilising service accounts to upload and sync files to Google Drive.

Bernard is the second character of this narrative and was created as an alternative to RClone, offering local low-latency access to Google Drive metadata.

Early Access

Bernard is provided as an early-access preview as the API may still change. Furthermore, not all components have associated tests.

This early-access preview comes with a small CLI to visually reflect the changes Bernard picks up. Once Bernard is proven to be stable and correct, this CLI will be removed.

Building the CLI

  1. Install Go.
  2. Set the environment variable CGO_ENABLED=1 and make sure you have a GCC compiler present.
  3. Clone this repository and cd into it.
  4. Run: go build -o bernard cmd/bernard/main.go

You should now see a binary called bernard in the current working directory.

Using the CLI

Make sure you create a Service Account which has read access to the Shared Drive in question. Additionally, please check whether you have the Drive API enabled in Google Cloud. Save a JSON key of this service account and store it somewhere you can easily access the file.

Bernard will store a SQLite database file called bernard.db in your current working directory. It is advised to store the JSON key of the service account in the same directory.

The CLI requires three arguments:

  1. full or diff
  2. The ID of the Shared Drive you want to synchronise
  3. The path to the JSON key of the service account

The first argument specifies the operation, where full will activate a full synchronisation of the Shared Drive and diff will fetch the latest changes. You must fully synchronise once before fetching the differences.

The second argument takes a string as input which should be the ID of your Shared Drive. Make sure the Service Account has read access to the Shared Drive in question.

The third argument takes a string as input which should point to the JSON key of the service account on your file system.

CLI example
bernard "full" "1234xxxxxxxxxxxxxVA" "./account.json"

In this example, a full synchronisation is activated for the Shared Drive 1234xxxxxxxxxxxxxVA with the Service Account ./account.json.
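The three arguments described above could be validated along the following lines. This is a minimal sketch and not the CLI's actual source: parseArgs and cliArgs are hypothetical names introduced for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// cliArgs is a hypothetical container for the three documented arguments.
type cliArgs struct {
	Mode    string // "full" or "diff"
	DriveID string // ID of the Shared Drive
	KeyPath string // path to the service account JSON key
}

// parseArgs validates the arguments that follow the programme name.
func parseArgs(args []string) (cliArgs, error) {
	if len(args) != 3 {
		return cliArgs{}, errors.New("expected: <full|diff> <driveID> <keyPath>")
	}
	if args[0] != "full" && args[0] != "diff" {
		return cliArgs{}, fmt.Errorf("unknown operation %q", args[0])
	}
	return cliArgs{Mode: args[0], DriveID: args[1], KeyPath: args[2]}, nil
}

func main() {
	// The same values as in the CLI example above.
	parsed, err := parseArgs([]string{"full", "1234xxxxxxxxxxxxxVA", "./account.json"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s sync of %s using %s\n", parsed.Mode, parsed.DriveID, parsed.KeyPath)
}
```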

Using Bernard in your Go project

Bernard is available as a Go module. To add Bernard to your Go project run the following command:

go get github.com/m-rots/bernard

Full & Partial Synchronisation

Bernard offers two ways of synchronising the datastore. FullSync() takes a considerable amount of time, depending on the number of files in the Shared Drive: Bernard processes roughly 1000 files every 1-2 seconds in full synchronisation mode.

Please note that the full synchronisation can be incomplete if you make changes to the Shared Drive in the minutes leading up to the full synchronisation.

Once you have fully synchronised the Shared Drive, you can use PartialSync() to fetch the differences between the last synchronisation (either full or partial) and the current state of the Shared Drive.
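The ordering described above (one full synchronisation, then repeated partial ones) can be sketched as follows. The syncer interface and fakeSyncer type are assumptions made so the example runs without Google credentials. The real PartialSync also accepts optional hooks, so a *Bernard value would need a small adapter to satisfy this simplified interface.

```go
package main

import (
	"errors"
	"fmt"
)

// syncer captures the two documented methods in simplified form.
// The real PartialSync additionally takes variadic hooks.
type syncer interface {
	FullSync(driveID string) error
	PartialSync(driveID string) error
}

// syncDrive runs one full synchronisation followed by `rounds` partial ones.
// A real programme would wait between rounds.
func syncDrive(s syncer, driveID string, rounds int) error {
	if err := s.FullSync(driveID); err != nil {
		return fmt.Errorf("full sync: %w", err)
	}
	for i := 0; i < rounds; i++ {
		if err := s.PartialSync(driveID); err != nil {
			return fmt.Errorf("partial sync %d: %w", i+1, err)
		}
	}
	return nil
}

// fakeSyncer is a stand-in that only tracks which calls were made.
type fakeSyncer struct {
	synced   bool
	partials int
}

func (f *fakeSyncer) FullSync(string) error {
	f.synced = true
	return nil
}

func (f *fakeSyncer) PartialSync(string) error {
	if !f.synced {
		return errors.New("partial sync requested before a full sync")
	}
	f.partials++
	return nil
}

func main() {
	fake := &fakeSyncer{}
	err := syncDrive(fake, "1234xxxxxxxxxxxxxVA", 3)
	fmt.Println(fake.partials, err)
}
```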

Hooks

Hooks allow you to run code between the fetch of changes and the processing of these changes to the datastore. The reference SQLite datastore comes with a NewDifferencesHook() function to check which of the Google-reported files have actually changed. Furthermore, it retrieves the last-known values of removed items and reports which items have been added (that is, do not exist in the datastore yet).

To create a DifferencesHook, you can utilise the following code:

// store is the SQLite datastore; bernie is the *Bernard returned by bernard.New.
hook, diff := store.NewDifferencesHook()
err = bernie.PartialSync("driveID", hook)

// access diff

diff is a pointer to a Difference struct and is filled with data by the PartialSync function. The Difference struct contains:

  • AddedFiles, a slice of files not currently present in the datastore.
  • AddedFolders, a slice of folders not currently present in the datastore.
  • ChangedFiles, a slice of FileDifferences, providing both the old and new state.
  • ChangedFolders, a slice of FolderDifferences, providing both the old and new state.
  • RemovedFiles, a slice of removed files with their last-known state stored by the datastore.
  • RemovedFolders, a slice of removed folders with their last-known state stored by the datastore.
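To illustrate how the fields listed above might be consumed, here is a minimal sketch. All types are local stand-ins: the field names of Difference follow the list, but the Old/New field names on the difference structs and the ID/Name fields on File and Folder are assumptions, and the real types carry more data.

```go
package main

import "fmt"

// Hypothetical stand-ins for the datastore types.
type File struct{ ID, Name string }
type Folder struct{ ID, Name string }
type FileDifference struct{ Old, New File }
type FolderDifference struct{ Old, New Folder }

// Difference mirrors the six slices listed above.
type Difference struct {
	AddedFiles     []File
	AddedFolders   []Folder
	ChangedFiles   []FileDifference
	ChangedFolders []FolderDifference
	RemovedFiles   []File
	RemovedFolders []Folder
}

// summarise reports how many items were added, changed and removed.
func summarise(d *Difference) string {
	return fmt.Sprintf("added=%d changed=%d removed=%d",
		len(d.AddedFiles)+len(d.AddedFolders),
		len(d.ChangedFiles)+len(d.ChangedFolders),
		len(d.RemovedFiles)+len(d.RemovedFolders))
}

// renamedFiles lists "old -> new" for changed files whose name differs.
func renamedFiles(d *Difference) []string {
	var out []string
	for _, c := range d.ChangedFiles {
		if c.Old.Name != c.New.Name {
			out = append(out, c.Old.Name+" -> "+c.New.Name)
		}
	}
	return out
}

func main() {
	diff := &Difference{
		AddedFiles: []File{{ID: "1", Name: "new.txt"}},
		ChangedFiles: []FileDifference{
			{Old: File{ID: "2", Name: "draft.md"}, New: File{ID: "2", Name: "final.md"}},
		},
		RemovedFolders: []Folder{{ID: "3", Name: "old"}},
	}
	fmt.Println(summarise(diff))
	fmt.Println(renamedFiles(diff))
}
```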

Datastore

The datastore is a core component of Bernard's operations. Bernard provides a reference implementation of a Datastore in the form of a SQLite database. This reference datastore can be expanded to allow other operations on the underlying database/sql interface.

If SQLite is not your database of choice, feel free to open a pull request with support for another database such as MongoDB, Fauna or CockroachDB. I highly advise you to have a look at the datastore/datastore.go and datastore/sqlite/sqlite.go files to get a feel for the operations the Datastore interface should perform.

Authenticator

Bernard exports an Authenticator interface which declares a single AccessToken method. This method should return a valid access token at all times. It returns the access token as a string, its UNIX expiry time as an int64, and an error in case the credentials are invalid.

type Authenticator interface {
  AccessToken() (string, int64, error)
}

To get started quickly, you can use Stubbs as it implements the Authenticator interface.
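If you prefer to write your own Authenticator, one possible shape is a small caching wrapper around whatever fetches your tokens. The sketch below is an assumption-laden illustration, not part of Bernard or Stubbs: cachingAuth, its fetch and now fields, and the 60-second refresh margin are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Authenticator mirrors the interface documented above.
type Authenticator interface {
	AccessToken() (string, int64, error)
}

// cachingAuth is a hypothetical wrapper that reuses a token until
// shortly before it expires.
type cachingAuth struct {
	mu     sync.Mutex
	fetch  func() (string, int64, error) // obtains a fresh token and its UNIX expiry
	now    func() int64                  // current UNIX time, injectable for tests
	token  string
	expiry int64
}

// AccessToken returns the cached token, refreshing it when none is
// cached or when it expires within the next 60 seconds.
func (c *cachingAuth) AccessToken() (string, int64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.token == "" || c.now() >= c.expiry-60 {
		token, expiry, err := c.fetch()
		if err != nil {
			return "", 0, err
		}
		c.token, c.expiry = token, expiry
	}
	return c.token, c.expiry, nil
}

func main() {
	calls := 0
	var auth Authenticator = &cachingAuth{
		fetch: func() (string, int64, error) {
			calls++
			return fmt.Sprintf("token-%d", calls), time.Now().Unix() + 3600, nil
		},
		now: func() int64 { return time.Now().Unix() },
	}
	auth.AccessToken()
	token, _, _ := auth.AccessToken()
	fmt.Println(token, calls)
}
```

Injecting the clock through the now field keeps the expiry logic testable without sleeping.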

Example code

In this example, Stubbs is used as the Authenticator and the reference SQLite datastore is used.

package main

import (
  "fmt"
  "os"

  "github.com/m-rots/bernard"
  "github.com/m-rots/bernard/datastore/sqlite"
  "github.com/m-rots/stubbs"
)

func getAuthenticator() (bernard.Authenticator, error) {
  clientEmail := "stubbs@westworld.iam.gserviceaccount.com"
  privateKey := "-----BEGIN PRIVATE KEY-----\n..."
  scopes := []string{"https://www.googleapis.com/auth/drive.readonly"}

  priv, err := stubbs.ParseKey(privateKey)
  if err != nil {
    // invalid private key
    return nil, err
  }

  account := stubbs.New(clientEmail, &priv, scopes, 3600)
  return account, nil
}

func main() {
  // Use Stubbs as the authenticator
  authenticator, err := getAuthenticator()
  if err != nil {
    fmt.Println("Invalid private key")
    os.Exit(1)
  }

  driveID := "1234xxxxxxxxxxxxxVA"
  datastorePath := "bernard.db"

  store, err := sqlite.New(datastorePath)
  if err != nil {
    // Either the database could not be created,
    // or the SQL schema is broken somehow...
    fmt.Println("Could not create SQLite datastore")
    os.Exit(1)
  }

  bernie := bernard.New(authenticator, store)

  err = bernie.FullSync(driveID)
  if err != nil {
    fmt.Println("Could not fully synchronise the drive")
    os.Exit(1)
  }

  err = bernie.PartialSync(driveID)
  if err != nil {
    fmt.Println("Could not partially synchronise the drive")
    os.Exit(1)
  }
}

Documentation

Index

Constants

This section is empty.

Variables

var ErrInvalidCredentials = errors.New("bernard: invalid credentials")

ErrInvalidCredentials can occur when the wrong authentication scopes are used, the access token does not have access to the specified resource, or the token is simply invalid or expired.

var ErrNetwork = errors.New("bernard: network related error")

ErrNetwork is the result of a networking error while contacting the Google Drive API. This error is only returned for status codes other than 200, 401 and 500.

var ErrNotFound = errors.New("bernard: cannot find Shared Drive")

ErrNotFound only occurs when the provided auth does not have access to the Shared Drive or if the Shared Drive does not exist.
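Callers can tell these sentinel errors apart with errors.Is, which also matches errors that have been wrapped along the way. The sketch below redeclares the three variables locally so that it compiles on its own; real code would compare against bernard.ErrNetwork and friends. The describe and shouldRetry helpers, and the policy they encode, are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented sentinel errors, for illustration only.
var (
	ErrInvalidCredentials = errors.New("bernard: invalid credentials")
	ErrNetwork            = errors.New("bernard: network related error")
	ErrNotFound           = errors.New("bernard: cannot find Shared Drive")
)

// shouldRetry is a hypothetical policy: only network errors are retried.
func shouldRetry(err error) bool {
	return errors.Is(err, ErrNetwork)
}

// describe maps a sync error to a short hint for the operator.
func describe(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrInvalidCredentials):
		return "check the service account key and scopes"
	case errors.Is(err, ErrNotFound):
		return "check the Shared Drive ID and access rights"
	case errors.Is(err, ErrNetwork):
		return "network problem, try again later"
	default:
		return "unexpected error: " + err.Error()
	}
}

func main() {
	// errors.Is sees through wrapping added with %w.
	err := fmt.Errorf("partial sync: %w", ErrNetwork)
	fmt.Println(describe(err), shouldRetry(err))
}
```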

Functions

This section is empty.

Types

type Authenticator

type Authenticator interface {
	AccessToken() (string, int64, error)
}

Authenticator represents any struct which can create an access token on demand.

type Bernard

type Bernard struct {
	// contains filtered or unexported fields
}

Bernard is a synchronisation backend for Google Drive.

func New

func New(auth Authenticator, store ds.Datastore, opts ...Option) *Bernard

New creates a new instance of Bernard.

func (*Bernard) FullSync

func (bernard *Bernard) FullSync(driveID string) error

FullSync syncs the entire content of Drive to the datastore.

func (*Bernard) PartialSync

func (bernard *Bernard) PartialSync(driveID string, hooks ...Hook) error

PartialSync syncs the latest changes within the Drive to the underlying datastore.

Optionally, you can provide one or multiple Hooks to get insight into the fetched changes.

type Hook

type Hook = func(drive ds.Drive, files []ds.File, folders []ds.Folder, removed []string) error

Hook allows the injection of functions between the fetch and datastore operations.

The hook provides the changes as provided by Google, which could contain data anomalies.

The first hook parameter, Drive, always provides the ID of the Drive. If the name of the Drive is changed in a partial sync, the drive.Name will be updated to reflect the new value. If the name did not change, then drive.Name is an empty string.

The second parameter, files, contains all the updated files in their new state.

The third parameter, folders, contains all the updated folders in their new state. Sometimes Google says a folder has changed when it actually has not. If you want to be sure that the folder changed, compare the old datastore state versus the new state as provided in this slice of folders.

The fourth parameter, removed, contains all the removed files and folders by ID. Google does not provide the last known state the files and folders were in. So to check the state of the removed items, use the datastore to get their last-known state.
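A custom hook following the four-parameter shape described above might look like the sketch below. Drive, File and Folder are local stand-ins for the ds types: only Drive.ID and Drive.Name are described in this documentation, so the remaining fields are assumptions. The newStatsHook helper and changeStats struct are hypothetical and loosely follow the hook-plus-pointer pattern of NewDifferencesHook.

```go
package main

import "fmt"

// Stand-ins for ds.Drive, ds.File and ds.Folder.
type Drive struct{ ID, Name string }
type File struct{ ID, Name string }
type Folder struct{ ID, Name string }

// Hook mirrors the documented signature using the stand-in types.
type Hook = func(drive Drive, files []File, folders []Folder, removed []string) error

// changeStats records what a single partial sync reported.
type changeStats struct {
	Renamed bool // true when the drive name changed
	Files   int
	Folders int
	Removed int
}

// newStatsHook returns a hook together with a pointer that the hook
// fills in each time it is called.
func newStatsHook() (Hook, *changeStats) {
	stats := &changeStats{}
	hook := func(drive Drive, files []File, folders []Folder, removed []string) error {
		// An empty drive.Name means the name did not change.
		stats.Renamed = drive.Name != ""
		stats.Files = len(files)
		stats.Folders = len(folders)
		stats.Removed = len(removed)
		return nil
	}
	return hook, stats
}

func main() {
	hook, stats := newStatsHook()
	err := hook(
		Drive{ID: "1234xxxxxxxxxxxxxVA"},
		[]File{{ID: "f1", Name: "report.pdf"}},
		[]Folder{{ID: "d1", Name: "archive"}},
		[]string{"f2"},
	)
	fmt.Println(*stats, err)
}
```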

type Option

type Option func(*Bernard)

An Option can override some of the default Bernard values.

func WithClient

func WithClient(client *http.Client) Option

WithClient allows one to override the default HTTP client.

func WithJSONDecoder

func WithJSONDecoder(d jsonDecoder) Option

WithJSONDecoder allows one to replace Go's default JSON decoding with a more memory-efficient and quicker solution.

func WithPreRequestHook

func WithPreRequestHook(preHook func()) Option

WithPreRequestHook allows one to apply rate-limiting before every request.

This function is called before fetching the authentication token to prevent tokens from expiring when a rate-limit is applied.
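Since the pre-request hook is a plain func(), any blocking function can act as a rate limiter. The following sketch builds one with the standard library only. newRateLimiter and measure are hypothetical helpers, and the chosen interval is arbitrary rather than a recommendation for the Drive API.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// newRateLimiter returns a func() that blocks so that consecutive calls
// are spaced at least `interval` apart. The first call returns at once.
func newRateLimiter(interval time.Duration) func() {
	var mu sync.Mutex
	var next time.Time
	return func() {
		mu.Lock()
		defer mu.Unlock()
		if wait := time.Until(next); wait > 0 {
			time.Sleep(wait)
		}
		next = time.Now().Add(interval)
	}
}

// measure calls limit n times and reports how long that took.
func measure(limit func(), n int) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		limit()
	}
	return time.Since(start)
}

func main() {
	limit := newRateLimiter(50 * time.Millisecond)
	// Three calls need at least two intervals.
	fmt.Println(measure(limit, 3) >= 100*time.Millisecond)
}
```

A function built this way could then be passed as bernard.WithPreRequestHook(limit) when calling bernard.New.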

func WithSafeSleep

func WithSafeSleep(duration time.Duration) Option

WithSafeSleep allows one to sleep between the pageToken fetch and the full sync. Setting this between 1 and 5 minutes guards against an incomplete or inconsistent full sync when changes are actively being made to the Shared Drive.

The default value of safeSleep is set at 0.
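The options above follow the shape of Go's functional options pattern, where each Option is a function that mutates the value under construction. The sketch below reproduces that pattern with local stand-ins. It is not Bernard's source: the settings struct is hypothetical, and the use of http.DefaultClient as the default client is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// settings is a hypothetical stand-in for Bernard's unexported fields.
type settings struct {
	client    *http.Client
	safeSleep time.Duration
}

// Option mirrors the documented shape: a function that mutates its target.
type Option func(*settings)

// WithClient overrides the HTTP client.
func WithClient(client *http.Client) Option {
	return func(s *settings) { s.client = client }
}

// WithSafeSleep sets the pause before the full sync starts.
func WithSafeSleep(d time.Duration) Option {
	return func(s *settings) { s.safeSleep = d }
}

// newSettings starts from defaults and applies the options in order.
func newSettings(opts ...Option) *settings {
	s := &settings{client: http.DefaultClient, safeSleep: 0}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	s := newSettings(
		WithClient(&http.Client{Timeout: 30 * time.Second}),
		WithSafeSleep(2*time.Minute),
	)
	fmt.Println(s.client.Timeout, s.safeSleep)
}
```

With the real package, the equivalent call would pass the options as trailing arguments to bernard.New(auth, store, ...).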

Directories

Path Synopsis
cmd
Package datastore provides the file and folder representations used in Bernard.
sqlite
Package sqlite provides a reference implementation of a Bernard datastore.
