entity-resolution

Module version: v0.1.1
Published: Sep 7, 2022 License: MIT

Entity Resolution

Entity resolution (ER) is the task of disambiguating records that correspond to real-world entities across and within datasets.

The problems associated with entity resolution grow with the data: as volume and velocity increase, inference across networks and semantic relationships between entities becomes increasingly difficult. This project attempts to provide a solution using Elasticsearch and a graph database.
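
As a rough illustration of the task itself (not of this project's Elasticsearch and graph-based approach), the sketch below groups records whose normalized name and email match into one candidate entity. The `Record` type, the `blockingKey` and `Resolve` functions, and the sample data are all hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Record is a hypothetical input row; the project's real schema may differ.
type Record struct {
	ID    string
	Name  string
	Email string
}

// blockingKey normalizes the matching fields so that differences in case
// and whitespace do not split one entity into two.
func blockingKey(r Record) string {
	name := strings.Join(strings.Fields(strings.ToLower(r.Name)), " ")
	email := strings.ToLower(strings.TrimSpace(r.Email))
	return name + "|" + email
}

// Resolve groups record IDs by blocking key; each group is one candidate entity.
func Resolve(records []Record) [][]string {
	groups := map[string][]string{}
	for _, r := range records {
		k := blockingKey(r)
		groups[k] = append(groups[k], r.ID)
	}
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output order
	out := make([][]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, groups[k])
	}
	return out
}

func main() {
	records := []Record{
		{"a1", "Ada  Lovelace", "ADA@example.com"},
		{"b7", "ada lovelace", "ada@example.com "},
		{"c3", "Alan Turing", "alan@example.com"},
	}
	fmt.Println(Resolve(records))
}
```

Exact-key grouping like this only catches trivial variations; fuzzy matching and relationship-based inference are where a search index and a graph database become useful.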

Overview

Links: RedisGraph, Data Modeling

Getting Started

Clone
gh repo clone xmlking/entity-resolution
Install dependencies
export GOPRIVATE=github.com/entity-resolution/*,go.buf.build
go work sync
go mod tidy
Install git hooks
# cog install-hook all
cog install-hook commit-msg
# you can verify if the hooks are installed by running
cat .git/hooks/commit-msg

Maintenance

After you publish protos to the Buf Schema Registry (BSR), update the generated proto code:

export GOPRIVATE=github.com/entity-resolution/*,go.buf.build
go get go.buf.build/grpc/go/entity-resolution/entityapis
go work sync

Update outdated Go dependencies interactively:

export GOPRIVATE=github.com/entity-resolution/*,go.buf.build 

go-mod-upgrade
# then commit the changes. 
Update deps
go work sync
task mod:outdated
task mod:sync
task mod:verify
Lint Code

Before committing, lint your code with the following commands, in order:

go fmt ./...
#golangci-lint --version
golangci-lint run -c .github/linters/.golangci.yml

Development

Launch Redis
docker compose up
# docker compose up redis
# open Grafana UI and enable redis plugin
open http://localhost:3000/plugins/redis-app/
open http://localhost:3000/dashboards

# open a shell in the grafana container
docker compose exec grafana /bin/bash
cd /etc/grafana/provisioning

# stop
docker compose down
# this will stop redis and remove all volumes
docker compose down -v 
Run
# first generate go code.
go generate ./... 
# run engine
go run ./service/entity/... 
#go run ./cmd/er/...   

To see all config environment variable options, run:

CONFY_LOG_LEVEL=debug \
CONFY_DEBUG_MODE=true \
CONFY_VERBOSE_MODE=true \
go run ./service/entity/... 
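
For readers unfamiliar with prefix-based configuration, here is a minimal sketch of how a service might read the three `CONFY_` variables shown above. It is not the project's config loader: the `Settings` type, `LoadSettings`, `boolVar`, and the `"info"` default are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Settings mirrors only the three variables shown above; the field names
// are hypothetical and are not taken from the project's config package.
type Settings struct {
	LogLevel    string
	DebugMode   bool
	VerboseMode bool
}

// LoadSettings reads CONFY_-prefixed variables through lookup, which has the
// same shape as os.LookupEnv so that tests can supply a fake environment.
func LoadSettings(lookup func(string) (string, bool)) Settings {
	s := Settings{LogLevel: "info"} // assumed default, for illustration only
	if v, ok := lookup("CONFY_LOG_LEVEL"); ok && v != "" {
		s.LogLevel = v
	}
	s.DebugMode = boolVar(lookup, "CONFY_DEBUG_MODE")
	s.VerboseMode = boolVar(lookup, "CONFY_VERBOSE_MODE")
	return s
}

// boolVar treats unset or unparsable values as false.
func boolVar(lookup func(string) (string, bool), key string) bool {
	v, ok := lookup(key)
	if !ok {
		return false
	}
	b, err := strconv.ParseBool(v)
	return err == nil && b
}

func main() {
	fmt.Printf("%+v\n", LoadSettings(os.LookupEnv))
}
```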
Test
# first generate go code. <-- IMPORTANT
go generate ./... 

go test -v ./service/entity/... 
go test -v ./cmd/er/...   
Build
# first generate go code.
go generate ./... 
go build -o build ./service/entity/... 
go build -o build ./cmd/er/...
Release

The following command bumps the VERSION number and pushes the changes and tag to the remote.
A GitHub Action then triggers the GoReleaser process.

NOTE: make sure you commit all changes before running this command.
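
The automatic bump derives the next version from commit types. A simplified sketch of that kind of rule, assuming Conventional Commits subjects, is shown here. It is not cocogitto's implementation: `level` and `NextVersion` are hypothetical names, and `BREAKING CHANGE` footers and pre-1.0 special cases are deliberately ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// Bump levels, ordered so that a larger value wins.
const (
	none = iota
	patch
	minor
	major
)

// level classifies one commit subject with simplified Conventional Commits
// rules: "!" before the colon is breaking, "feat" is a feature, "fix" is a fix.
func level(subject string) int {
	head, _, found := strings.Cut(subject, ":")
	if !found {
		return none
	}
	if strings.HasSuffix(head, "!") {
		return major
	}
	// Drop an optional scope such as "feat(api)".
	typ, _, _ := strings.Cut(head, "(")
	switch typ {
	case "feat":
		return minor
	case "fix":
		return patch
	}
	return none
}

// NextVersion applies the highest bump found in subjects to ma.mi.pa.
func NextVersion(ma, mi, pa int, subjects []string) string {
	highest := none
	for _, s := range subjects {
		if l := level(s); l > highest {
			highest = l
		}
	}
	switch highest {
	case major:
		ma, mi, pa = ma+1, 0, 0
	case minor:
		mi, pa = mi+1, 0
	case patch:
		pa++
	}
	return fmt.Sprintf("v%d.%d.%d", ma, mi, pa)
}

func main() {
	subjects := []string{"fix: handle empty name", "feat(api): add merge endpoint"}
	fmt.Println(NextVersion(0, 1, 1, subjects))
}
```

Because the bump depends entirely on commit subjects, the `commit-msg` hook installed earlier matters: malformed messages would otherwise be skipped by a rule like this one.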

```shell
# dry-run: calculate the next version based on the commit types since the latest tag
cog bump --auto --dry-run
# calculate the next version based on the commit types since the latest tag
cog bump --auto
```

Verify

Test multi-platform docker images

```shell
docker run -it --rm --init --platform linux/amd64 ghcr.io/xmlking/entity-resolution/entity:latest
docker run -it --rm --init --platform linux/arm64 ghcr.io/xmlking/entity-resolution/entity:latest
```

Local Docker Build

multi-platform, multi-stage, multi-module local build

#VERSION=$(git describe --tags || echo "HEAD")
VERSION=v0.1.262
BUILD_DATE=$(date +%FT%T%Z)
DOCKER_IMAGE=ghcr.io/xmlking/entity-resolution/entity

# build 
docker buildx create --use

docker buildx build --platform linux/arm64,linux/amd64 \
-t $DOCKER_IMAGE:$VERSION \
-t $DOCKER_IMAGE:latest \
--build-arg BUILD_DATE=$BUILD_DATE --build-arg VERSION=$VERSION \
--secret id=BUF_TOKEN,src=buf_token.txt \
-f Dockerfile.local --push .

# inspect
docker buildx imagetools inspect $DOCKER_IMAGE:$VERSION
docker buildx imagetools inspect --raw $DOCKER_IMAGE:$VERSION

# run
docker run -it --rm --platform linux/arm64 $DOCKER_IMAGE:$VERSION
docker run -it --rm --platform linux/amd64 $DOCKER_IMAGE:$VERSION
# Notice `platform` in the build_info logs
# build_info={"branch":"main","build_time":"","commit":"","compiler":"gc","go_version":"go1.19","platform":"linux/arm64","state":"dirty","tag":""}
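
To check the platform programmatically rather than by eye, a log line like the one above can be parsed. This is a minimal sketch that assumes the line ends with the JSON object, as in the sample; `BuildInfo` and `ParseBuildInfo` are hypothetical names, not part of the project.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// BuildInfo holds the two fields this check reads; encoding/json ignores
// the other keys in the object.
type BuildInfo struct {
	GoVersion string `json:"go_version"`
	Platform  string `json:"platform"`
}

// ParseBuildInfo decodes the JSON object that follows "build_info=" in a
// log line. It assumes nothing follows the object on that line.
func ParseBuildInfo(line string) (BuildInfo, error) {
	var info BuildInfo
	_, payload, found := strings.Cut(line, "build_info=")
	if !found {
		return info, fmt.Errorf("no build_info field in line")
	}
	err := json.Unmarshal([]byte(payload), &info)
	return info, err
}

func main() {
	line := `build_info={"branch":"main","build_time":"","commit":"","compiler":"gc","go_version":"go1.19","platform":"linux/arm64","state":"dirty","tag":""}`
	info, err := ParseBuildInfo(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(info.Platform, info.GoVersion)
}
```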

TODO

Reference

Directories

cmd/er (module)
internal
service/entity (module)
