geobed

package module
v0.0.0-...-f448944 Latest
Warning: This package is not in the latest version of its module.
Published: Feb 12, 2026 License: BSD-3-Clause Imports: 23 Imported by: 0

README

Geobed

A high-performance, offline geocoding library for Go. Geocode city names to coordinates and reverse geocode coordinates to city names without any external API calls.

Features

  • Offline: All data embedded in the binary - no network requests after import
  • Fast reverse geocoding: S2 spatial index delivers ~8μs per query (~150,000 queries/sec)
  • Forward geocoding: Fuzzy matching with scoring for city name lookups
  • 165K cities: Cities with population > 1000 from Geonames dataset
  • Thread-safe: Safe for concurrent use from multiple goroutines
  • Zero configuration: Works out of the box with NewGeobed()

Installation

go get github.com/andreiashu/geobed

Requires Go 1.24 or later.

Quick Start

package main

import (
    "fmt"
    "log"

    "github.com/andreiashu/geobed"
)

func main() {
    g, err := geobed.NewGeobed()
    if err != nil {
        log.Fatal(err)
    }

    // Forward geocoding: city name -> coordinates
    city := g.Geocode("Austin, TX")
    fmt.Printf("%s: %.4f, %.4f\n", city.City, city.Latitude, city.Longitude)
    // Output: Austin: 30.2672, -97.7431

    // Reverse geocoding: coordinates -> city
    result := g.ReverseGeocode(51.5074, -0.1278)
    fmt.Printf("%s, %s\n", result.City, result.Country())
    // Output: City of London, GB
}

API

Creating a GeoBed Instance

// Create a new instance (loads all city data into memory; see Memory Usage)
g, err := geobed.NewGeobed()

// Or use a shared singleton (thread-safe, initialized once)
g, err := geobed.GetDefaultGeobed()

Forward Geocoding

// Basic lookup
city := g.Geocode("Paris")

// With region qualifier
city = g.Geocode("Paris, TX")      // Paris, Texas
city = g.Geocode("Paris, France")  // Paris, France

// Access result fields
fmt.Println(city.City)        // "Paris"
fmt.Println(city.Country())   // "FR"
fmt.Println(city.Region())    // "" (or state code for US cities)
fmt.Println(city.Latitude)    // 48.8566
fmt.Println(city.Longitude)   // 2.3522
fmt.Println(city.Population)  // 2138551

Reverse Geocoding

// Find nearest city to coordinates
city := g.ReverseGeocode(37.7749, -122.4194)
fmt.Printf("%s, %s, %s\n", city.City, city.Region(), city.Country())
// Output: San Francisco, CA, US

GeobedCity Struct

type GeobedCity struct {
    City       string  // City name
    CityAlt    string  // Alternate names (comma-separated)
    Latitude   float32 // Latitude in degrees
    Longitude  float32 // Longitude in degrees
    Population int32   // Population count
}

// Methods
func (c GeobedCity) Country() string  // ISO 3166-1 alpha-2 country code
func (c GeobedCity) Region() string   // State/province code (e.g., "TX", "CA")

Performance

Operation        Time   Throughput
Reverse geocode  ~8μs   ~150,000/sec
Forward geocode  ~12ms  ~80/sec
Initial load     ~2s    -

Benchmarked on Apple M1. Forward geocoding is slower because it runs fuzzy string matching across the full city dataset.

Memory Usage

  • Runtime memory: ~57MB
  • Binary size: ~7MB (embedded compressed data)

The library loads all city data into memory on initialization. This enables fast lookups with minimal memory overhead.

How It Works

Forward Geocoding

Uses a scored fuzzy matching algorithm that considers:

  • Exact city name matches (highest priority)
  • Region/state matches
  • Country matches
  • Alternate city names
  • Partial matches
  • Population (as tiebreaker)
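
The criteria above can be sketched as a small additive scoring function. This is only an illustration of the idea, not the library's implementation: the `candidate` type, the weights, and the `best` helper are all hypothetical, and the second Paris entry uses a made-up population.

```go
package main

import (
	"fmt"
	"strings"
)

// candidate is a hypothetical, simplified city record used only in this sketch.
type candidate struct {
	Name       string
	AltNames   []string
	Region     string // e.g. "TX"
	Country    string // e.g. "US"
	Population int
}

// score rates how well c matches a city name plus an optional qualifier
// (a region or country code). The weights are illustrative assumptions.
func score(c candidate, city, qualifier string) int {
	s := 0
	name := strings.ToLower(c.Name)
	city = strings.ToLower(city)
	switch {
	case name == city:
		s += 10 // exact city name match (highest priority)
	case containsFold(c.AltNames, city):
		s += 7 // alternate name match
	case strings.HasPrefix(name, city):
		s += 3 // partial match
	}
	if qualifier != "" {
		if strings.EqualFold(c.Region, qualifier) {
			s += 5 // region/state match
		}
		if strings.EqualFold(c.Country, qualifier) {
			s += 4 // country match
		}
	}
	return s
}

// containsFold reports whether list contains s, ignoring case.
func containsFold(list []string, s string) bool {
	for _, v := range list {
		if strings.EqualFold(v, s) {
			return true
		}
	}
	return false
}

// best returns the highest-scoring candidate for a "City, Qualifier" query.
// Population breaks ties between equally scored candidates.
func best(cs []candidate, query string) (candidate, bool) {
	city, qualifier, _ := strings.Cut(query, ",")
	city, qualifier = strings.TrimSpace(city), strings.TrimSpace(qualifier)
	var top candidate
	topScore := 0
	for _, c := range cs {
		sc := score(c, city, qualifier)
		if sc > topScore || (sc == topScore && sc > 0 && c.Population > top.Population) {
			top, topScore = c, sc
		}
	}
	return top, topScore > 0
}

func main() {
	cities := []candidate{
		{Name: "Paris", Country: "FR", Population: 2138551},
		{Name: "Paris", Region: "TX", Country: "US", Population: 25000}, // population is made up
	}
	a, _ := best(cities, "Paris")
	b, _ := best(cities, "Paris, TX")
	fmt.Println(a.Country, b.Region) // FR TX
}
```

Without a qualifier both entries score the same and the larger population wins; adding "TX" lifts the Texas entry above it.
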

Reverse Geocoding

Uses Google's S2 Geometry library with a cell-based spatial index:

  1. Divides Earth into hierarchical cells at level 10 (~10km)
  2. Maps each city to its containing cell
  3. On query, checks the target cell plus 8 neighbors
  4. Returns the closest city by spherical distance

This achieves O(k) complexity where k ≈ 100-500 cities, compared to O(n) for naive scanning.
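
The cell-plus-neighbors lookup can be illustrated with the standard library alone. The sketch below is a simplified stand-in, not the library's code: it swaps S2 cells for a plain latitude/longitude grid, and the `point`, `gridIndex`, and `cellDeg` names are hypothetical. It ignores longitude wrap-around at the antimeridian and the shrinking of cells toward the poles, both of which S2 handles properly.

```go
package main

import (
	"fmt"
	"math"
)

// point is a hypothetical city record for this sketch.
type point struct {
	Name     string
	Lat, Lng float64
}

// cellKey identifies one cell of a fixed lat/lng grid (a stand-in for an S2 cell).
type cellKey struct{ row, col int }

// cellDeg is an assumed cell size in degrees (about 11 km of latitude).
const cellDeg = 0.1

func keyFor(lat, lng float64) cellKey {
	return cellKey{int(math.Floor(lat / cellDeg)), int(math.Floor(lng / cellDeg))}
}

// gridIndex maps each cell to the points it contains.
type gridIndex map[cellKey][]point

func buildIndex(pts []point) gridIndex {
	idx := gridIndex{}
	for _, p := range pts {
		k := keyFor(p.Lat, p.Lng)
		idx[k] = append(idx[k], p)
	}
	return idx
}

// haversineKm returns the great-circle distance between two points in kilometres.
func haversineKm(lat1, lng1, lat2, lng2 float64) float64 {
	const earthRadiusKm = 6371.0
	const rad = math.Pi / 180
	dLat := (lat2 - lat1) * rad
	dLng := (lng2 - lng1) * rad
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(lat1*rad)*math.Cos(lat2*rad)*math.Sin(dLng/2)*math.Sin(dLng/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}

// nearest scans the query's cell and its 8 neighbours and returns the
// closest point by spherical distance. It reports false if all 9 cells are empty.
func (idx gridIndex) nearest(lat, lng float64) (point, bool) {
	k := keyFor(lat, lng)
	var best point
	bestDist := math.Inf(1)
	for dr := -1; dr <= 1; dr++ {
		for dc := -1; dc <= 1; dc++ {
			for _, p := range idx[cellKey{k.row + dr, k.col + dc}] {
				if d := haversineKm(lat, lng, p.Lat, p.Lng); d < bestDist {
					best, bestDist = p, d
				}
			}
		}
	}
	return best, !math.IsInf(bestDist, 1)
}

func main() {
	idx := buildIndex([]point{
		{"San Francisco", 37.7749, -122.4194},
		{"Oakland", 37.8044, -122.2712},
	})
	p, ok := idx.nearest(37.78, -122.41)
	fmt.Println(p.Name, ok) // San Francisco true
}
```

Only the points in nine cells are compared, which is where the O(k) behaviour comes from. A query far from any indexed point finds nothing in this sketch; a real index needs a fallback for that case.
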

Data Sources

City data comes from Geonames:

  • cities1000.zip: Cities with population > 1000
  • countryInfo.txt: Country metadata

Data snapshot: February 2026

Updating the Data

To refresh the embedded data with the latest from Geonames:

# Download fresh data and regenerate cache
make update-data

# Commit the updated files
git add geobed-data geobed-cache
git commit -m "Update Geonames data to $(date +%Y-%m)"

Geonames updates its data daily around 3AM CET.

Limitations

  • City-level precision only (no street addresses)
  • Forward geocoding works best with well-known city names
  • No typo correction by default (the FuzzyDistance field of GeocodeOptions sets a maximum edit distance for typo tolerance)
  • US-centric region support (state codes work best for US)

License

BSD-3-Clause License - see LICENSE file.

Credits

  • Tom Maiaroto (@tmaiaroto) - Original author of geobed
  • jvmatl (@jvmatl) - Added embedded data files and offline capability

Acknowledgments

Documentation

Index

Constants

This section is empty.

Variables

var UsStateCodes = map[string]string{
	"AL": "Alabama", "AK": "Alaska", "AZ": "Arizona", "AR": "Arkansas",
	"CA": "California", "CO": "Colorado", "CT": "Connecticut", "DE": "Delaware",
	"FL": "Florida", "GA": "Georgia", "HI": "Hawaii", "ID": "Idaho",
	"IL": "Illinois", "IN": "Indiana", "IA": "Iowa", "KS": "Kansas",
	"KY": "Kentucky", "LA": "Louisiana", "ME": "Maine", "MD": "Maryland",
	"MA": "Massachusetts", "MI": "Michigan", "MN": "Minnesota", "MS": "Mississippi",
	"MO": "Missouri", "MT": "Montana", "NE": "Nebraska", "NV": "Nevada",
	"NH": "New Hampshire", "NJ": "New Jersey", "NM": "New Mexico", "NY": "New York",
	"NC": "North Carolina", "ND": "North Dakota", "OH": "Ohio", "OK": "Oklahoma",
	"OR": "Oregon", "PA": "Pennsylvania", "RI": "Rhode Island", "SC": "South Carolina",
	"SD": "South Dakota", "TN": "Tennessee", "TX": "Texas", "UT": "Utah",
	"VT": "Vermont", "VA": "Virginia", "WA": "Washington", "WV": "West Virginia",
	"WI": "Wisconsin", "WY": "Wyoming",

	"AS": "American Samoa", "DC": "District of Columbia",
	"FM": "Federated States of Micronesia", "GU": "Guam",
	"MH": "Marshall Islands", "MP": "Northern Mariana Islands",
	"PW": "Palau", "PR": "Puerto Rico", "VI": "Virgin Islands",

	"AA": "Armed Forces Americas", "AE": "Armed Forces Europe", "AP": "Armed Forces Pacific",
}

UsStateCodes maps US state abbreviations to full names.

Functions

func CountryCount

func CountryCount() int

CountryCount returns the number of unique country codes in the lookup table. Useful for testing and debugging.

func RegenerateCache

func RegenerateCache() error

RegenerateCache forces a reload from raw data files and regenerates the cache. This is useful for updating the embedded cache after downloading fresh data. The raw data files must exist in ./geobed-data/ before calling this function.

After running, compress the cache files with bzip2:

bzip2 -f geobed-cache/*.dmp

func RegionCount

func RegionCount() int

RegionCount returns the number of unique region codes in the lookup table. Useful for testing and debugging.

func ValidateCache

func ValidateCache() error

ValidateCache loads the cache and performs integrity and functional checks. Returns an error if validation fails.

Types

type AdminDivision

type AdminDivision struct {
	Code string // Admin1 code (e.g., "TX", "08")
	Name string // Full name (e.g., "Texas", "Ontario")
}

AdminDivision represents a first-level administrative division (state, province, etc.).

type Cities

type Cities []GeobedCity

Cities is a sortable slice of GeobedCity.

func (Cities) Len

func (c Cities) Len() int

func (Cities) Less

func (c Cities) Less(i, j int) bool

func (Cities) Swap

func (c Cities) Swap(i, j int)
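
Len, Less, and Swap together satisfy sort.Interface. How Less orders cities is not shown on this page; since the GeoBed struct comment describes the slice as sorted by name, the sketch below assumes a case-insensitive name comparison. The `city`, `byName`, `sortByName`, and `find` names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// city is a hypothetical stand-in for GeobedCity.
type city struct{ Name string }

// byName implements sort.Interface with an assumed case-insensitive ordering.
type byName []city

func (c byName) Len() int      { return len(c) }
func (c byName) Swap(i, j int) { c[i], c[j] = c[j], c[i] }
func (c byName) Less(i, j int) bool {
	return strings.ToLower(c[i].Name) < strings.ToLower(c[j].Name)
}

// sortByName sorts cs in place.
func sortByName(cs []city) { sort.Sort(byName(cs)) }

// find binary-searches a slice already sorted by sortByName.
func find(cs []city, name string) (int, bool) {
	target := strings.ToLower(name)
	i := sort.Search(len(cs), func(i int) bool {
		return strings.ToLower(cs[i].Name) >= target
	})
	return i, i < len(cs) && strings.ToLower(cs[i].Name) == target
}

func main() {
	cs := []city{{"Paris"}, {"austin"}, {"London"}}
	sortByName(cs)
	i, ok := find(cs, "LONDON")
	fmt.Println(cs[0].Name, i, ok) // austin 1 true
}
```

Keeping the slice sorted is what makes the binary search in `find` valid; the same ordering function must be used for both sorting and searching.
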

type CountryInfo

type CountryInfo struct {
	Country            string
	Capital            string
	Area               int32
	Population         int32
	GeonameId          int32
	ISONumeric         int16
	ISO                string
	ISO3               string
	Fips               string
	Continent          string
	Tld                string
	CurrencyCode       string
	CurrencyName       string
	Phone              string
	PostalCodeFormat   string
	PostalCodeRegex    string
	Languages          string
	Neighbours         string
	EquivalentFipsCode string
}

CountryInfo contains metadata about a country from Geonames.

type DataSource

type DataSource struct {
	URL  string       // Download URL
	Path string       // Local file path
	ID   DataSourceID // Identifier for processing logic
}

DataSource defines a data source for geocoding data.

type DataSourceID

type DataSourceID string

DataSourceID identifies a data source type.

const (
	DataSourceGeonamesCities  DataSourceID = "geonamesCities1000"
	DataSourceGeonamesCountry DataSourceID = "geonamesCountryInfo"
	DataSourceGeonamesAdmin1  DataSourceID = "geonamesAdmin1Codes"
	DataSourceMaxMindCities   DataSourceID = "maxmindWorldCities"
)

type GeoBed

type GeoBed struct {
	Cities    Cities        // All loaded cities, sorted by name
	Countries []CountryInfo // Country metadata from Geonames
	// contains filtered or unexported fields
}

GeoBed provides offline geocoding using embedded city data. Safe for concurrent use after initialization.

func GetDefaultGeobed

func GetDefaultGeobed() (*GeoBed, error)

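GetDefaultGeobed returns a shared GeoBed instance, initializing it on first call. Unlike a sync.Once-based singleton, a failed initialization (for example, a transient error such as the network being down during a download) does not block later attempts: the next call retries.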
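
One common way to get this retry-on-failure behaviour is a mutex-guarded lazy initializer that only remembers successful results. The sketch below shows that general pattern and is not taken from the package source; `lazy`, `resource`, and `failNTimes` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// resource stands in for an expensive-to-build value such as a geocoder.
type resource struct{ ready bool }

// lazy builds a value on first use. Unlike sync.Once, a failed build is
// not remembered, so the next call to get tries again.
type lazy struct {
	mu    sync.Mutex
	value *resource
	build func() (*resource, error)
}

func (l *lazy) get() (*resource, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.value != nil {
		return l.value, nil // already initialized: reuse the shared instance
	}
	v, err := l.build()
	if err != nil {
		return nil, err // value stays nil, so a later call retries
	}
	l.value = v
	return v, nil
}

// failNTimes returns a builder that fails its first n calls (demo only).
func failNTimes(n int) func() (*resource, error) {
	calls := 0
	return func() (*resource, error) {
		calls++
		if calls <= n {
			return nil, errors.New("transient failure")
		}
		return &resource{ready: true}, nil
	}
}

func main() {
	l := &lazy{build: failNTimes(1)}
	_, err := l.get()
	fmt.Println("first call:", err)
	r, err := l.get()
	fmt.Println("second call:", r.ready, err)
}
```

The mutex serializes initialization, so concurrent callers never build the value twice. After the first success, every caller receives the same pointer.
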

func NewGeobed

func NewGeobed(opts ...Option) (*GeoBed, error)

NewGeobed creates a new GeoBed instance with geocoding data loaded into memory.

Options can be provided to customize data and cache directories:

g, err := NewGeobed(WithDataDir("/custom/data"), WithCacheDir("/custom/cache"))

Example:

g, err := NewGeobed()
if err != nil {
    log.Fatal(err)
}
city := g.Geocode("Austin, TX")
fmt.Printf("%s: %f, %f\n", city.City, city.Latitude, city.Longitude)

func (*GeoBed) Geocode

func (g *GeoBed) Geocode(n string, opts ...GeocodeOptions) GeobedCity

Geocode performs forward geocoding, converting a location string to coordinates.

func (*GeoBed) ReverseGeocode

func (g *GeoBed) ReverseGeocode(lat, lng float64) GeobedCity

ReverseGeocode converts lat/lng coordinates to a city location.

type GeobedCity

type GeobedCity struct {
	City    string // City name
	CityAlt string // Alternate names (comma-separated)

	Latitude   float32 // Latitude in degrees
	Longitude  float32 // Longitude in degrees
	Population int32   // Population count
	// contains filtered or unexported fields
}

GeobedCity represents a city with geocoding data. Memory-optimized: uses indexes for Country/Region, float32 for coordinates.

func (GeobedCity) Country

func (c GeobedCity) Country() string

Country returns the ISO 3166-1 alpha-2 country code (e.g., "US", "FR").

func (GeobedCity) Region

func (c GeobedCity) Region() string

Region returns the administrative region code (e.g., "TX", "CA").

type GeobedConfig

type GeobedConfig struct {
	DataDir  string // Directory for raw data files (default: "./geobed-data")
	CacheDir string // Directory for cache files (default: "./geobed-cache")
}

GeobedConfig contains configuration options for GeoBed initialization.

type GeocodeOptions

type GeocodeOptions struct {
	ExactCity     bool // Require exact city name match
	FuzzyDistance int  // Max edit distance for typo tolerance (0 = disabled, 1-2 recommended)
}

GeocodeOptions configures geocoding behavior.
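
To make "max edit distance" concrete, here is a standard Levenshtein distance in Go. Which edit-distance variant the package applies for FuzzyDistance is not stated on this page, so treat this as an illustration of the concept; `editDistance`, `min3`, and `withinDistance` are hypothetical helpers.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// editDistance returns the Levenshtein distance between a and b: the minimum
// number of single-rune insertions, deletions, and substitutions needed to
// turn one into the other. It keeps only two rows of the DP table.
func editDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	cur := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev, cur = cur, prev
	}
	return prev[len(rb)]
}

// withinDistance reports whether query is within max edits of name.
// A max of 0 means only exact matches are accepted.
func withinDistance(query, name string, max int) bool {
	return editDistance(query, name) <= max
}

func main() {
	fmt.Println(editDistance("austin", "austn"))       // 1
	fmt.Println(editDistance("kitten", "sitting"))     // 3
	fmt.Println(withinDistance("austn", "austin", 1))  // true
	fmt.Println(withinDistance("autsin", "austin", 1)) // false: a swap counts as 2 edits
}
```

Under plain Levenshtein a transposition such as "autsin" costs two edits, which is one reason a limit of 1-2 is a sensible range for short city names.
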

type Option

type Option func(*GeobedConfig)

Option is a functional option for configuring GeoBed.

func WithCacheDir

func WithCacheDir(dir string) Option

WithCacheDir sets the directory for cache files.

func WithDataDir

func WithDataDir(dir string) Option

WithDataDir sets the directory for raw data files.
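
Option, WithCacheDir, and WithDataDir follow Go's functional options pattern. The sketch below re-creates that pattern with local, lower-case names so it runs on its own. The defaults come from the GeobedConfig field comments; `newConfig` is a hypothetical stand-in for what NewGeobed presumably does with its options before loading data.

```go
package main

import "fmt"

// config mirrors the two fields documented on GeobedConfig.
type config struct {
	DataDir  string
	CacheDir string
}

// option mutates a config, mirroring the documented Option type.
type option func(*config)

func withDataDir(dir string) option  { return func(c *config) { c.DataDir = dir } }
func withCacheDir(dir string) option { return func(c *config) { c.CacheDir = dir } }

// newConfig starts from the documented defaults and applies each option in order.
func newConfig(opts ...option) config {
	c := config{DataDir: "./geobed-data", CacheDir: "./geobed-cache"}
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	fmt.Println(newConfig())                              // {./geobed-data ./geobed-cache}
	fmt.Println(newConfig(withCacheDir("/custom/cache"))) // {./geobed-data /custom/cache}
}
```

Because options are applied in order, a later option overrides an earlier one for the same field, and callers who pass nothing get the defaults.
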

Directories

Path              Synopsis
cmd/update-cache  Command update-cache regenerates the geobed cache files from raw data and validates the result.
