rdss-archivematica-channel-adapter

command module
v0.9.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 26, 2021 License: AGPL-3.0 Imports: 5 Imported by: 0

README

Travis CI GoDoc Coverage Status Go Report Card Sourcegraph Apache

RDSS Archivematica Channel Adapter

Introduction

RDSS Archivematica Channel Adapter is an implementation of a channel adapter for Archivematica following the RDSS messaging API specification.

Installation

This application is distributed as a single static binary file that you can download from the Releases page. You can use a process manager such systemd to run it.

The following example runs the application server using the Docker image.

$ docker run \
    --tty --rm \
    --env "RDSS_ARCHIVEMATICA_ADAPTER_ADAPTER.QUEUE_RECV_MAIN_ADDR=https://queue.amazonaws.com/444455556666/recv" \
    --env "RDSS_ARCHIVEMATICA_ADAPTER_ADAPTER.QUEUE_SEND_MAIN_ADDR=arn:aws:sqs:us-east-2:444455556666:send" \
    --env "AWS_REGION=us-east-1" \
    --env "AWS_ACCESS_KEY=1234" \
    --env "AWS_SECRET_KEY=5678" \
    artefactual/rdss-archivematica-channel-adapter:latest \
        server

Read the configuration section before you proceed with the deployment.

Configuration

All configuration attributes are described in the source code. See config.go for more.

There are sensible defaults in place. You need to pay special attention to the attributes below and tweak them according to your environment:

  • adapter.processing_table
  • adapter.repository_table
  • adapter.registry_table
  • adapter.queue_recv_main_addr
  • adapter.queue_send_main_addr
  • adapter.validation_service_addr
Configuration file

We use the TOML configuration file format. The configuration file can be indicated via the --config command-line argument. When undefined, the application attempts to read from one of the following locations:

  • $HOME/.config/rdss-archivematica-channel-adapter.toml
  • /etc/archivematica/rdss-archivematica-channel-adapter.toml

This is a minimal configuration example:

[adapter]
processing_table = "adapter_processing"
repository_table = "repository_processing"
registry_table = "registry_processing"
queue_recv_main_addr = "https://queue.amazonaws.com/444455556666/recv"
queue_send_main_addr = "arn:aws:sqs:us-east-2:444455556666:send"
Environment variables

Configuration from environment variables have precedence over file-based configuration. All environment variables follow the same naming scheme: RDSS_ARCHIVEMATICA_ADAPTER_<SECTION>_<ATTRIBUTE>=<VALUE>. Some valid examples are:

  • RDSS_ARCHIVEMATICA_ADAPTER_LOGGING.LEVEL=DEBUG
    (section: LOGGING, attribute: LEVEL, value: DEBUG)
  • RDSS_ARCHIVEMATICA_ADAPTER_ADAPTER.VALIDATION_SERVICE_ADDR=http://service.tld:9000
    (sectoin: ADAPTER, attribute: VALIDATION_SERVICE_ADDR, value: http://service.tld:9000)
Service dependencies

This application sits between multiple services and assumes access to the following resources and actions.

Resource API action Configuration
AWS SQS sqs:ReceiveMessage adapter.queue_recv_main_addr
aws.sqs_profile (optional)
aws.sqs_endpoint (optional)
AWS SNS sns:Publish adapter.queue_send_main_addr
adapter.queue_send_invalid_addr
adapter.queue_send_error_addr
aws.sns_profile (optional)
aws.sns_endpoint (optional)
AWS DynamoDB dynamodb:GetItem
dynamodb:PutItem
dynamodb:Scan
adapter.processing_table
adapter.repository_table
adapter.registry_table
aws.dynamodb_profile (optional)
aws.dynamodb_endpoint (optional)
AWS S3 s3:GetObject adapter.s3_profile
adapter.s3_endpoint
(only needed when preservation requests point to S3 buckets.)
Archivematica N/A (configured via the adapter.registry_table)

SQS/SNS resources are expected to be provisioned by RDSS. The DynamoDB tables are local to the adapter and need to be created by the user. For example, they can be created using the AWS CLI as in the following example:

aws dynamodb create-table \
    --table-name="rdss_archivematica_adapter_local_data_repository" \
    --attribute-definitions="AttributeName=ID,AttributeType=S" \
    --key-schema="AttributeName=ID,KeyType=HASH" \
    --billing-mode="PAY_PER_REQUEST"

aws dynamodb create-table \
    --table-name="rdss_archivematica_adapter_processing_state" \
    --attribute-definitions="AttributeName=objectUUID,AttributeType=S" \
    --key-schema="AttributeName=objectUUID,KeyType=HASH" \
    --billing-mode="PAY_PER_REQUEST"

aws dynamodb create-table \
    --table-name="rdss_archivematica_adapter_registry" \
    --attribute-definitions="AttributeName=tenantJiscID,AttributeType=S" \
    --key-schema="AttributeName=tenantJiscID,KeyType=HASH" \
    --billing-mode="PAY_PER_REQUEST"
AWS service client configuration

The AWS service client configuration rely on the shared configuration functionality which is similar to the AWS CLI configuration system.

Additionally, you can override the configuration profile on each client as well as the endpoint using the following environment strings:

  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.S3_PROFILE
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.S3_ENDPOINT
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.DYNAMODB_PROFILE
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.DYNAMODB_ENDPOINT
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.SQS_PROFILE
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.SQS_ENDPOINT
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.SNS_PROFILE
  • RDSS_ARCHIVEMATICA_ADAPTER_AWS.SNS_ENDPOINT

This can be useful under a variety of scenarios:

  • Deployment of alternative services like LocalStack, Minio, etc...
  • Applying different credentials, e.g. assuming a IAM role in the SQS/SNS clients.
Registry of Archivematica pipelines

The adapter uses a registry of Archivematica pipelines stored in DynamoDB (table adapter.repository_table) that looks like the following:

tenantJiscID url user key transferDir
1 http://192.168.1.1/api user juoCah3o /mnt/share/tenant1
2 http://192.168.1.2/api user Ixie9aid /mnt/share/tenant2

It is possible to create, delete and scan items in various ways, including the AWS Management Console. The folowing is an example of item creation using the AWS CLI:

env \
    AWS_DEFAULT_REGION="us-east-1" \
    AWS_ACCESS_KEY_ID="1234" \
    AWS_SECRET_ACCESS_KEY="5678" \
        aws dynamodb put-item \
            --table-name="rdss_archivematica_adapter_registry"
            --item "file:///tmp/test-registry-item.json"

The previous command loads the record in /tmp/test-registry-item.json:

{
    "tenantJiscID": {"S": "3"},
    "url": {"S": "http://192.168.1.3/api"},
    "user": {"S": "user"},
    "key": {"S": "eh6eeDuu"},
    "transferDir": {"S": "/mnt/share/tenant3"}
}

The adapter loads the registry in three cases:

  • When the application starts.

  • Every 10 seconds once the application has been initialized properly.

  • When a USR1 signal is received, e.g.:

    killall -s SIGUSR1 rdss-archivematica-channel-adapter
    

Send the USR2 signal to log the current instances loaded:

killall -s SIGUSR2 rdss-archivematica-channel-adapter

Metrics and runtime profiling data

rdss-archivematica-channel-adapter server runs a HTTP server that listens on 0.0.0.0:6060 with two purposes:

  • /metrics serves metrics of the Go runtime and the application meant to be scraped by a Prometheus server.
  • /debug/pprof serves runtime profiling data in the format expected by the pprof visualization tool. Visit net/http/pprof docs for more.

Contributing

  • See CONTRIBUTING.md for information about setting up your environment and the workflow that we expect.
  • Check out the open issues.

Also, the broker package can be used to implement your own RDSS adapter using the Go programming language. The linked docs include documentation and examples. The API stability is not guaranteed.

Documentation

The Go Gopher

There is no documentation for this package.

Directories

Path Synopsis
message
Package message provide types and functions to work with the RDSS API.
Package message provide types and functions to work with the RDSS API.
`make test-integration` is the simplest way to run these tests.
`make test-integration` is the simplest way to run these tests.
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL