harmony-one-to-bigquery

Version: v0.1.0
Published: Jun 19, 2021 License: MPL-2.0

README

harmony-one-to-bigquery

A Go application that imports Harmony ONE blockchain data into GCP BigQuery. The program first requests the most recent block number submitted to the Harmony ONE blockchain. It then looks up the most recent block already inserted into GCP BigQuery. Finally, it backfills the gap by making an RPC request for each block that is missing from BigQuery.
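The three steps above can be sketched as a small loop. This is only an illustration, not the project's actual code: the Chain and Store interfaces, the Block type, and the in-memory fakes are hypothetical stand-ins for the Harmony RPC client and the BigQuery client, and the sketch assumes the block at the stored height is already present, so fetching starts one block later.

```go
package main

import "fmt"

// Block is a hypothetical, trimmed-down block record.
type Block struct {
	Number uint64
	Hash   string
}

// Chain is a hypothetical stand-in for the Harmony RPC client.
type Chain interface {
	LatestBlockNumber() (uint64, error)
	BlockByNumber(n uint64) (Block, error)
}

// Store is a hypothetical stand-in for the BigQuery client.
type Store interface {
	LatestBlockNumber() (uint64, error)
	InsertBlock(b Block) error
}

// Backfill copies every block the store is missing, oldest first,
// and returns how many blocks were inserted.
func Backfill(c Chain, s Store) (int, error) {
	head, err := c.LatestBlockNumber()
	if err != nil {
		return 0, fmt.Errorf("latest chain block: %w", err)
	}
	stored, err := s.LatestBlockNumber()
	if err != nil {
		return 0, fmt.Errorf("latest stored block: %w", err)
	}
	inserted := 0
	// Assumption: block `stored` is already in the store.
	for n := stored + 1; n <= head; n++ {
		b, err := c.BlockByNumber(n)
		if err != nil {
			return inserted, fmt.Errorf("fetch block %d: %w", n, err)
		}
		if err := s.InsertBlock(b); err != nil {
			return inserted, fmt.Errorf("insert block %d: %w", n, err)
		}
		inserted++
	}
	return inserted, nil
}

// memChain is an in-memory fake chain used only for this demo.
type memChain struct{ head uint64 }

func (m memChain) LatestBlockNumber() (uint64, error) { return m.head, nil }

func (m memChain) BlockByNumber(n uint64) (Block, error) {
	if n > m.head {
		return Block{}, fmt.Errorf("block %d not found", n)
	}
	// The hash is a placeholder, not a real block hash.
	return Block{Number: n, Hash: fmt.Sprintf("0x%x", n)}, nil
}

// memStore is an in-memory fake store used only for this demo.
type memStore struct {
	blocks []Block
	latest uint64
}

func (m *memStore) LatestBlockNumber() (uint64, error) { return m.latest, nil }

func (m *memStore) InsertBlock(b Block) error {
	m.blocks = append(m.blocks, b)
	m.latest = b.Number
	return nil
}

func main() {
	store := &memStore{latest: 14233100}
	n, err := Backfill(memChain{head: 14233108}, store)
	if err != nil {
		fmt.Println("backfill failed:", err)
		return
	}
	fmt.Printf("inserted %d blocks, store is now at %d\n", n, store.latest)
}
```

Keeping the chain and the store behind interfaces is one way to test the loop without a network connection or a GCP project.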

Example

Most recent block header specifies block number: 0xd92e14 -> 14233108

Most recent block number found in BigQuery: 0xd92e0c -> 14233100

BigQuery is therefore missing the 8 most recent blocks and their transactions. The program retrieves them in order, starting at 14233100 and working its way up to 14233108.
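The arithmetic in this example can be reproduced with a few lines of Go. The helper names parseHexBlockNumber and missingRange are hypothetical, and the sketch assumes the block at the stored height is already in BigQuery, so the first block still to fetch is one past it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseHexBlockNumber converts a 0x-prefixed hex quantity into a uint64.
func parseHexBlockNumber(s string) (uint64, error) {
	return strconv.ParseUint(strings.TrimPrefix(s, "0x"), 16, 64)
}

// missingRange returns the first and last block numbers still to be
// fetched, plus how many blocks that is. A count of 0 means the
// store is already caught up.
func missingRange(headHex, storedHex string) (uint64, uint64, uint64, error) {
	head, err := parseHexBlockNumber(headHex)
	if err != nil {
		return 0, 0, 0, fmt.Errorf("parse head %q: %w", headHex, err)
	}
	stored, err := parseHexBlockNumber(storedHex)
	if err != nil {
		return 0, 0, 0, fmt.Errorf("parse stored %q: %w", storedHex, err)
	}
	if stored >= head {
		return 0, 0, 0, nil
	}
	return stored + 1, head, head - stored, nil
}

func main() {
	first, last, count, err := missingRange("0xd92e14", "0xd92e0c")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("blocks %d..%d (%d missing)\n", first, last, count)
}
```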

Pre-requisites

To use the backfill binary in production you need a GCP project that has access to BigQuery. Because this program uses "streaming inserts" to write data into BigQuery, the project must also have billing enabled. Once you have a running project with BigQuery access and billing enabled, generate application credentials for the program to use. A complete guide on how to get that can be found here.

To use everything this program relies on, it is recommended that you use and understand Docker. A basic understanding of Kubernetes is also recommended.

Both were key to running this program in a production environment with consistent runtimes, avoiding the problems of trying to load ~1.4M blocks with transactions into GCP BigQuery from a local machine.

Building

Locally

You can build a local version of the binary for your own OS with the go build command.

$ go build -o ./bin/hmy-bq-import ./cmd/hmy-bq-import

The resulting binary in the ./bin folder will allow you to run the program locally and begin backfilling Harmony One blockchain data into your GCP BigQuery project.

You can also build a dockerized version of the application by first building a Linux-specific binary and placing it in the docker/artifacts folder.

$ GOOS=linux GOARCH=amd64 go build -o ./docker/artifacts/hmy-bq-import ./cmd/hmy-bq-import

The next step is to copy your credentials file into the artifacts folder.

When you downloaded your Google application credentials file, it most likely ended up in your Downloads folder. If you copy it to the docker/artifacts folder, it will be included in the Docker image at build time. Replace the source part of the cp command with the path where your credentials file lives.

$ cp $HOME/path/to/Downloads/google-application-credentials.json ./docker/artifacts/harmonyone-gcp-bigquery.json

You should now be ready to run docker build to create the hmy-bq-import Docker image. When you created your GCP project you were given a project ID. Export it as an environment variable, since it is used to tag the image and will be needed when pushing the image to GCR (Google Container Registry).

$ export PROJECT_ID=your-project-id

Next, let's build the Docker image from the docker folder in the project:

$ cd ./docker
$ docker build -t gcr.io/${PROJECT_ID}/hmy-bq-import:v1 .

Verify that the image was built:

$ docker images
REPOSITORY                                     TAG       IMAGE ID       CREATED        SIZE
gcr.io/${PROJECT_ID}/hmy-bq-import             v1        acafe4ca74a5   10 hours ago   23MB

Running

Locally

You can simply run the binary using the backfill command:

$ ./bin/hmy-bq-import backfill --gcp-project-id $PROJECT_ID --help

Docker

You can check that the Docker image works by running the following docker run command.

$ docker run -it --rm \
  -e GOOGLE_APPLICATION_CREDENTIALS=/etc/hmy/harmonyone-gcp-bigquery.json \
  -e GCP_PROJECT_ID=${PROJECT_ID} \
  gcr.io/${PROJECT_ID}/hmy-bq-import:v1

Using Kubernetes

Since this application is dockerized, it can be run in Kubernetes. It was deployed to GCP Kubernetes Engine so that the backfill runs continuously and keeps the public dataset as close to real time as possible.

If you wish to run this application on Kubernetes in GCP, a quickstart guide will help you do so.

Environment Variables

| Env Var Name | Description | Default Value | Required |
| --- | --- | --- | --- |
| NODE_URL | the URL of the node used to pull historical data from | https://api.s0.t.hmny.io | N |
| GCP_PROJECT_ID | the project ID used in GCP to store blockchain data in BigQuery | | Y |
| GCP_DATASET_ID | the dataset ID used in GCP to store blockchain data in BigQuery | crypto_harmony | N |
| GCP_BLOCKS_TABLE_ID | the blocks table ID used in GCP to store blockchain data in BigQuery | blocks | N |
| GCP_TXNS_TABLE_ID | the transactions table ID used in GCP to store blockchain data in BigQuery | transactions | N |
| CONCURRENCY | the number of concurrent goroutines pulling Harmony ONE blockchain data | 1 | N |
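One way to read these variables with the documented defaults is sketched below. The Config struct and loadConfig function are hypothetical and are not taken from the project's source. Only the variable names, defaults, and the required flag come from the table above. The positive-integer check on CONCURRENCY is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// Config is a hypothetical container for the documented settings.
type Config struct {
	NodeURL       string
	ProjectID     string
	DatasetID     string
	BlocksTableID string
	TxnsTableID   string
	Concurrency   int
}

// loadConfig reads settings through getenv (os.Getenv in production),
// applying the documented defaults to unset or empty variables.
func loadConfig(getenv func(string) string) (Config, error) {
	orDefault := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	cfg := Config{
		NodeURL:       orDefault("NODE_URL", "https://api.s0.t.hmny.io"),
		ProjectID:     getenv("GCP_PROJECT_ID"),
		DatasetID:     orDefault("GCP_DATASET_ID", "crypto_harmony"),
		BlocksTableID: orDefault("GCP_BLOCKS_TABLE_ID", "blocks"),
		TxnsTableID:   orDefault("GCP_TXNS_TABLE_ID", "transactions"),
	}
	// GCP_PROJECT_ID is the only variable marked as required.
	if cfg.ProjectID == "" {
		return Config{}, errors.New("GCP_PROJECT_ID is required")
	}
	// Assumption: CONCURRENCY must be a positive integer.
	raw := orDefault("CONCURRENCY", "1")
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return Config{}, fmt.Errorf("CONCURRENCY must be a positive integer, got %q", raw)
	}
	cfg.Concurrency = n
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function as a parameter keeps the loader testable with a plain map instead of the real process environment.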

LICENSE

Mozilla Public License Version 2.0

Directories

Path Synopsis
cmd
internal
clients/bigquery/mock_bigquery
Package mock_bigquery is a generated GoMock package.
clients/harmony/mock_harmony
Package mock_harmony is a generated GoMock package.
