tableschema-go

Version: v0.1.4
Published: Oct 9, 2017 License: MIT

Table schema tooling in Go.

Getting started

Installation

This package uses semantic versioning 2.0.0.

Using dep
$ dep init
$ dep ensure -add github.com/frictionlessdata/tableschema-go/csv@>=0.1

Main Features

Tabular Data Load

Have tabular data stored in local files? Remote files? Packages like csv help you load the data you need and make it ready for processing.

package main

import (
   "log"

   "github.com/frictionlessdata/tableschema-go/csv"
)

func main() {
   tab, err := csv.NewTable(csv.Remote("myremotetable"), csv.LoadHeaders())
   if err != nil {
      log.Fatal(err)
   }
   _ = tab // The table is now ready for processing.
}

Supported physical representations:

Would you like to use tableschema-go, but the physical representation you use is not listed here? No problem! Please create an issue before you start contributing. We will be happy to help you along the way.

Schema Inference and Configuration

Got a new dataset and want to get your hands dirty ASAP? No problem: let the schema package try to infer the data types based on the table data.

package main

import (
   "fmt"

   "github.com/frictionlessdata/tableschema-go/csv"
   "github.com/frictionlessdata/tableschema-go/schema"
)

func main() {
   tab, _ := csv.NewTable(csv.Remote("myremotetable"), csv.LoadHeaders())
   sch, _ := schema.Infer(tab)
   fmt.Printf("%+v", sch)
}

Want to go faster? Please give InferImplicitCasting a try and let us know how it goes.

There might be cases in which the inferred schema is not correct. One of those cases is when your data uses strings like "N/A" to represent missing cells. That would usually make our inference algorithm think the field is a string.

When that happens, you can manually apply those last-minute tweaks to the Schema.

   sch.MissingValues = []string{"N/A"}
   sch.GetField("ID").Type = schema.IntegerType
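
To see why a missing-value marker changes the outcome, here is a minimal standard-library sketch. It is not the library's inference algorithm: inferType is a hypothetical helper that only distinguishes integers from strings, and the set of missing values is passed in by hand.

```go
package main

import (
	"fmt"
	"strconv"
)

// inferType is a hypothetical helper, not the library's algorithm: it returns
// "integer" when every non-missing cell parses as an integer, else "string".
func inferType(cells []string, missing map[string]bool) string {
	seen := false
	for _, c := range cells {
		if missing[c] {
			continue // Skip cells declared as missing values.
		}
		if _, err := strconv.Atoi(c); err != nil {
			return "string"
		}
		seen = true
	}
	if !seen {
		return "string" // No usable cells at all: fall back to string.
	}
	return "integer"
}

func main() {
	ids := []string{"1", "2", "N/A", "4"}
	fmt.Println(inferType(ids, nil))                          // string
	fmt.Println(inferType(ids, map[string]bool{"N/A": true})) // integer
}
```

Without the marker declared, a single "N/A" cell is enough to push the whole column to string; once it is declared, the remaining cells decide the type.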

After all that, you could persist your schema to disk:

sch.SaveToFile("users_schema.json")

And use the local schema later:

sch, _ := schema.LoadFromFile("users_schema.json")

Finally, if your schema is saved remotely, you can also use it:

sch, _ := schema.LoadRemote("http://myfoobar/users/schema.json")
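
For a rough idea of what such a schema file holds, the sketch below parses a small descriptor with the standard library only. The field and descriptor types are hypothetical and cover just the "fields" and "missingValues" properties of a Table Schema descriptor; the library's own types are richer, and the sample content is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// field and descriptor are hypothetical types covering only a small part of a
// Table Schema descriptor; they are not the library's own types.
type field struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type descriptor struct {
	Fields        []field  `json:"fields"`
	MissingValues []string `json:"missingValues"`
}

// parseDescriptor decodes a JSON descriptor into the sketch types above.
func parseDescriptor(data []byte) (descriptor, error) {
	var d descriptor
	err := json.Unmarshal(data, &d)
	return d, err
}

func main() {
	// Assumed sample content for a file like users_schema.json.
	raw := []byte(`{
	  "fields": [
	    {"name": "ID", "type": "integer"},
	    {"name": "Name", "type": "string"}
	  ],
	  "missingValues": ["N/A"]
	}`)
	d, err := parseDescriptor(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range d.Fields {
		fmt.Printf("%s: %s\n", f.Name, f.Type)
	}
	fmt.Println("missing values:", d.MissingValues)
}
```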

Processing Tabular Data

Once you have the data, you will want to process it using language data types. Schema.Decode and Schema.DecodeTable are your friends on this journey.

package main

import (
   "github.com/frictionlessdata/tableschema-go/csv"
   "github.com/frictionlessdata/tableschema-go/schema"
)

type user struct {
   ID   int
   Age  int
   Name string
}

func main() {
   tab, _ := csv.NewTable(csv.FromFile("users.csv"), csv.LoadHeaders())
   sch, _ := schema.Infer(tab)
   var users []user
   sch.DecodeTable(tab, &users)
   // The users slice contains the table contents properly decoded into
   // language types. Each row will be a new user appended to the slice.
}
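
If you are curious what decoding a table into a slice of structs involves, here is a minimal sketch that uses only the standard library. decodeUsers is a hypothetical stand-in, not the library's implementation: it assumes the first record is the header and that ID and Age hold integers.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

type user struct {
	ID   int
	Age  int
	Name string
}

// decodeUsers is a hypothetical stand-in for table decoding: it reads CSV
// text whose first record is the header and converts each row into a user.
func decodeUsers(data string) ([]user, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, fmt.Errorf("missing header row")
	}
	// Map header names to column positions so column order does not matter.
	col := map[string]int{}
	for i, name := range records[0] {
		col[name] = i
	}
	for _, name := range []string{"ID", "Age", "Name"} {
		if _, ok := col[name]; !ok {
			return nil, fmt.Errorf("missing column %q", name)
		}
	}
	var users []user
	for _, rec := range records[1:] {
		id, err := strconv.Atoi(rec[col["ID"]])
		if err != nil {
			return nil, fmt.Errorf("ID: %v", err)
		}
		age, err := strconv.Atoi(rec[col["Age"]])
		if err != nil {
			return nil, fmt.Errorf("Age: %v", err)
		}
		users = append(users, user{ID: id, Age: age, Name: rec[col["Name"]]})
	}
	return users, nil
}

func main() {
	users, err := decodeUsers("ID,Age,Name\n1,39,Paul\n2,23,Jimmy\n")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", users)
}
```

Matching columns by header name rather than by position keeps the sketch working when the file reorders its columns.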

If you have a lot of data and cannot load everything in memory, you can easily iterate through it:

...
   iter, _ := tab.Iter()
   for iter.Next() {
      var u user
      sch.Decode(iter.Row(), &u)
      // Variable u is now filled with row contents properly decoded
      // to language types.
   }
   }
...
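
The same row-at-a-time idea can be sketched with the standard library alone. averageAge and averageAgeOf are hypothetical helpers, and the column layout (Age in the second column) is an assumption; the point is that only one record is held in memory at a time.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// averageAge is a hypothetical example of row-at-a-time processing: it reads
// one record per loop iteration instead of loading the whole table.
// It assumes a header row and the Age value in the second column.
func averageAge(r io.Reader) (float64, error) {
	cr := csv.NewReader(r)
	if _, err := cr.Read(); err != nil { // Consume the header row.
		return 0, err
	}
	sum, n := 0, 0
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return 0, err
		}
		if len(rec) < 2 {
			return 0, fmt.Errorf("row has no Age column")
		}
		age, err := strconv.Atoi(rec[1])
		if err != nil {
			return 0, err
		}
		sum += age
		n++
	}
	if n == 0 {
		return 0, fmt.Errorf("no data rows")
	}
	return float64(sum) / float64(n), nil
}

// averageAgeOf is a convenience wrapper for in-memory CSV text.
func averageAgeOf(data string) (float64, error) {
	return averageAge(strings.NewReader(data))
}

func main() {
	avg, err := averageAgeOf("ID,Age,Name\n1,40,Paul\n2,20,Jimmy\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(avg) // 30
}
```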

Even better if you could do it regardless of the physical representation! The table package declares some interfaces that will help you achieve this goal.

Saving Tabular Data

Once you're done processing the data, it is time to persist results. As an example, let us assume we have a remote table schema called summary, which contains two fields:

import (
   "os"
   "time"

   "github.com/frictionlessdata/tableschema-go/csv"
   "github.com/frictionlessdata/tableschema-go/schema"
)

type summaryEntry struct {
    Date       time.Time
    AverageAge float64
}

func WriteSummary(summary []summaryEntry, path string) {
   sch, _ := schema.LoadRemote("http://myfoobar/users/summary/schema.json")

   // os.Create opens the file for writing; os.Open would open it read-only.
   f, _ := os.Create(path)
   defer f.Close()

   w := csv.NewWriter(f)
   defer w.Flush()

   w.Write([]string{"Date", "AverageAge"})
   for _, summ := range summary {
       row, _ := sch.Encode(summ)
       w.Write(row)
   }
}
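
For comparison, here is a standard-library sketch of the same flow with errors checked instead of discarded. encodeEntry is a hypothetical stand-in for the schema-driven encoding; the YYYY-MM-DD date layout and the float formatting are assumptions, not the library's behavior.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
	"strconv"
	"time"
)

type summaryEntry struct {
	Date       time.Time
	AverageAge float64
}

// encodeEntry is a hypothetical stand-in for row encoding: it renders the
// date as YYYY-MM-DD and the average in its shortest exact decimal form.
func encodeEntry(e summaryEntry) []string {
	return []string{
		e.Date.Format("2006-01-02"),
		strconv.FormatFloat(e.AverageAge, 'f', -1, 64),
	}
}

// writeSummary writes a header row followed by one row per entry.
func writeSummary(w io.Writer, summary []summaryEntry) error {
	cw := csv.NewWriter(w)
	if err := cw.Write([]string{"Date", "AverageAge"}); err != nil {
		return err
	}
	for _, e := range summary {
		if err := cw.Write(encodeEntry(e)); err != nil {
			return err
		}
	}
	cw.Flush()
	return cw.Error() // Errors from buffered writes surface here.
}

func main() {
	entries := []summaryEntry{
		{Date: time.Date(2017, time.October, 9, 0, 0, 0, 0, time.UTC), AverageAge: 31.5},
	}
	if err := writeSummary(os.Stdout, entries); err != nil {
		fmt.Fprintln(os.Stderr, "write error:", err)
	}
}
```

Taking an io.Writer rather than a path keeps the sketch usable with files, buffers, and network connections alike.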

API Reference and More Examples

More detailed documentation about API methods and plenty of examples is available at https://godoc.org/github.com/frictionlessdata/tableschema-go

Contributing

Found a problem and would like to fix it? Have that great idea and would love to see it in the repository?

Please open an issue before you start working

That could save everyone a lot of time, and we are super happy to answer questions and help you along the way. Furthermore, feel free to join the frictionlessdata Gitter chat room and ask questions.

This project follows the Open Knowledge International coding standards.

  • Before you start coding:

    • Fork and pull the latest version of the master branch
    • Make sure you have go 1.8+ installed and you're using it
    • Make sure you have dep installed
  • Before sending the PR:

$ cd $GOPATH/src/github.com/frictionlessdata/tableschema-go
$ dep ensure
$ go test ./...

And make sure all your tests pass.

Directories

Path Synopsis
examples
Package table provides the main interfaces used to manipulate tabular data.