Documentation
¶
Overview ¶
Package chutils is a set of utilities designed to work with ClickHouse. The utilities are designed to facilitate import and export of data.
The chutils package facilitates these types of functions. The principal use cases are:
- file --> ClickHouse
- ClickHouse --> file
- ClickHouse --> ClickHouse
Why is use case 3 helpful?
- Automatic generation of the CREATE TABLE statement
- Data cleaning
- Renaming fields
- Adding fields that may be complex functions of the Input and/or use data from other Go variables.
The chutils package defines:
- An Input interface that reads data.
- A TableDef struct that specifies the structure of the input. Features include:
- The fields/types of a TableDef can be specified, or they can be imputed from the data.
- The corresponding CREATE TABLE statement can be built and issued.
- Checks of the range/values of fields as they are read.
- An Output interface that writes data.
- Concurrent execution of Input/Output interfaces
The file package implements Input and Output for text files. The sql package implements Input and Output for SQL.
These two packages can be mixed and matched for Input/Output.
Example uses ¶
- Load a CSV to ClickHouse -- Option 1 (see Example in package file) a. Define a file Reader to point to the CSV. b. Use Init to create the TableDef and then Impute to determine the fields and types. c. Use the Create method of TableDef to create the ClickHouse table to populate. d. Run TableSpecs().Check() to verify the TableSpec is set up correctly. e. Define a file Writer that points a temporary file. f. Use chutils Export to create a temporary file that uses the Reader/Writer. g. Use the Writer Insert method to issue a command to clickhouse-client to load the temporary file.
- Load a CSV to ClickHouse -- Option 2 (see Example in package sql). a. same as a, above. b. same as b, above. c. same as c, above. d. same as d, above. e. Define an SQL Writer that points to the table to populate. f. Use chutils Export to create a VALUES insert statement. g. Use the Writer Insert statement to execute the Insert.
- Insert to a ClickHouse table from a ClickHouse query -- Option 1. a. Define an sql Reader to define the source query. b. Use Init to define the TableDef and Create to create the output table. c. Run TableSpec().Check() to make sure the TableSpec is set up correctly. d. Use Insert to execute the insert query. (Note, there is no data validation).
- Insert to a ClickHouse table from a ClickHouse query -- Option 2. a. Same as a, above. b. Same as b, above. c. Run TableSpec().Check() to make sure the TableSpec is set up correctly. d. Define an sql Writer that points to the table to populate. e. Use chutils Export to create the VALUEs statement that is used to insert into the table. f. Use the Writer Insert statement to execute the Insert. (Note, there *is* data validation).
Index ¶
- Variables
- func Concur(nWorker int, rdrs []Input, wrtrs []Output, ...) error
- func Export(rdr Input, wrtr Output) error
- func Load(rdr Input, wrtr Output) (err error)
- func Wrapper(e error, text string) error
- type ChField
- type ChType
- type Connect
- type EngineType
- type ErrType
- type FieldDef
- type Input
- type LegalValues
- type Output
- type Row
- type Status
- type TableDef
- func (td *TableDef) Check() error
- func (td *TableDef) Create(conn *Connect, table string) error
- func (td *TableDef) Get(name string) (int, *FieldDef, error)
- func (td *TableDef) Impute(rdr Input, rowsToExamine int, tol float64) (err error)
- func (td *TableDef) Nest(nestName string, firstField string, lastField string) error
- type Valid
Constants ¶
This section is empty.
Variables ¶
var ( DateMissing = time.Date(1970, 1, 2, 0, 0, 0, 0, time.UTC) IntMissing = -1 //math.MaxInt32 FloatMissing = -1.0 //math.MaxFloat64 StringMissing = "!" )
Missing values used when the user does not supply them
var DateFormats = []string{"2006-01-02", "2006-1-2", "2006/01/02", "2006/1/2", "20060102", "01022006", "01/02/2006", "1/2/2006", "01-02-2006", "1-2-2006", "200601", time.RFC3339}
DateFormats are formats to try when guessing the field type in Impute()
Functions ¶
Types ¶
type ChField ¶
type ChField struct {
Base ChType // Base is base type of ClickHouse field.
Length int // Length is length of field (0 for String).
OuterFunc string // OuterFunc is the outer function applied to the field (e.g. LowCardinality(), Nullable())
Format string // Format for incoming dates from Input, or outgoing Floats
}
ChField struct holds the specification for a ClickHouse field
func NewChField ¶
type ChType ¶
type ChType int
ChType enum is supported ClickHouse field types.
type Connect ¶
type Connect struct {
Http string // Http is "http" or "https"
Host string // Host is host IP
User string // User is ClickHouse user name
Password string // Password is user's password
*sql.DB // ClickHouse database connector
}
Connect holds the ClickHouse connect information
func NewConnect ¶
type EngineType ¶
type EngineType int
EngineType enum specifies ClickHouse engine types
const ( MergeTree EngineType = 0 + iota Memory )
func (EngineType) String ¶
func (i EngineType) String() string
type FieldDef ¶
type FieldDef struct {
Name string // Name of the field.
ChSpec ChField // ChSpec is the Clickhouse specification of field.
Description string // Description is an optional description for CREATE TABLE statement.
Legal *LegalValues // Legal are optional bounds/list of legal values.
Missing interface{} // Missing is the value used for a field if the value is missing/illegal.
Width int // Width of field (for flat files)
}
FieldDef struct holds the full definition of single ClickHouse field.
func NewFieldDef ¶
func (*FieldDef) CheckRange ¶
CheckRange checks whether checkVal is a legal value. Returns fd.Missing, if not.
type Input ¶
type Input interface {
Read(nTarget int, validate bool) (data []Row, valid []Valid, err error) // Read from the Input, possibly with validation
Reset() error // Reset to beginning of source
CountLines() (numLines int, err error) // CountLines returns # of lines in source
Seek(lineNo int) error // Seek moves to lineNo in source
Close() error // Close the source
TableSpec() *TableDef // TableSpec returns the TableDef for the source
}
The Input interface specifies the requirments for reading source data.
type LegalValues ¶
type LegalValues struct {
LowLimit interface{} // Minimum legal value for types Int, Float
HighLimit interface{} // Maximum legal value for types Int, Float
Levels []string // Legal values for types String, FixedString
}
LegalValues holds bounds and lists of legal values for a ClickHouse field
func NewLegalValues ¶
func NewLegalValues() *LegalValues
NewLegalValues creates a new LegalValues type
type Output ¶
type Output interface {
Write(b []byte) (n int, err error) // Write byte array, n is # of bytes written. Writes do not go to ClickHouse (see Insert)
Name() string // Name of output (file, table)
Insert() error // Inserts into ClickHouse
Separator() rune // Separator returns the field separator
EOL() rune // EOL returns the End-of-line character
Close() error // Close writer
}
The Output interface specifies requirements for writing data.
type Row ¶
type Row []interface{}
Row is single row of the table. The fields may be of any type. A Row is stored in the same order of the TableDef FieldDefs slice.
type Status ¶
type Status int
Status enum is the validation status of a particular instance of a ChField field as judged against its ClickHouse type and acceptable values
type TableDef ¶
type TableDef struct {
Key string // Key is the key for the table.
Engine EngineType // EngineType is the ClickHouse table engine.
FieldDefs map[int]*FieldDef // Map of fields in the table. The int key is the column order in the table.
// contains filtered or unexported fields
}
TableDef struct defines a table.
func NewTableDef ¶
func NewTableDef(key string, engine EngineType, fielddefs map[int]*FieldDef) *TableDef
NewTableDef creates a new TableDef struc
func (*TableDef) Check ¶
Check verifies that the fields are of the type chutils supports. Check also converts, if needed, the Legal.HighLimit and Legal.LowLimit and Missing interfaces to the correct type and checks LowLimit <= HighLimit. It's a good idea to run Check before reading the data.
func (*TableDef) Create ¶
Create builds and issues CREATE TABLE ClickHouse statement. The table created is "table"
func (*TableDef) Get ¶
Get returns the FieldDef for field "name". The FieldDefs map is by column order, so access by field name is needed.
Source Files
¶
Directories
¶
| Path | Synopsis |
|---|---|
|
Package file implements Input/Output for text files.
|
Package file implements Input/Output for text files. |
|
Package nested allows for additional calculations to be added to the output of a chutils.Input reader.
|
Package nested allows for additional calculations to be added to the output of a chutils.Input reader. |
|
Package sql implements Input/Output for SQL.
|
Package sql implements Input/Output for SQL. |