tm

package module
v0.0.0-...-092b941 Latest
Published: Aug 26, 2017 License: MIT Imports: 23 Imported by: 1

README

TMFRAME

TMFRAME, pronounced "time frame", is a simple and efficient binary standard for encoding time series data.

Starting with a 64-bit timestamp of nanoseconds since the Unix epoch, the idea here is that the low 3 bits are really just random noise, given that our clock calibrations just aren't that accurate.

So we replace those 3 bits with a useful data payload to get a highly compressed timeseries format.

specification

The TMFRAME format allows very compact expression of time series. For a simple time series, the TMFRAME encoding can be as short as a sequence of 64-bit timestamps (whose resolution is limited to 8 nanoseconds, because the low 3 bits are overwritten). However, the same format can carry much longer additional event data if need be. The common situation where a single float64 is needed for the timepoint's value is supported with exactly two 64-bit words: one for the timestamp and one for the float64 payload.

overview of the format

A TMFRAME message always starts with a primary word.

Depending on the content of the low 3 bits of the primary word, the primary word may be the entire message. However, there may also be additional words and bytes following the primary word that complete the message.

TMFRAME messages can be classified as being 8 bytes long (primary word only), 16 bytes long, or greater than 16 bytes long.

Frequently a TMFRAME message will consist of one primary word, one UDE word, and a variable length payload.

The primary word and UDE word are always 64-bit words each. The payload can be up to 2^43 - 1 bytes in length, the largest value of the 43-bit UCOUNT field.

We illustrate the possible TMFRAME message lengths here:

a) primary word only

+---------------------------------------------------------------+
|      primary word (64-bits) with PTI={0, 4, 5, or 6}          |
+---------------------------------------------------------------+

b) primary word and UDE word only:

+---------------------------------------------------------------+
|                primary word (64-bits) with PTI=7              |
+---------------------------------------------------------------+
|            User-defined-encoding (UDE) descriptor             |
+---------------------------------------------------------------+

c) primary word + UDE word + variable byte-length message:

+---------------------------------------------------------------+
|                primary word (64-bits) with PTI=7              |
+---------------------------------------------------------------+
|            User-defined-encoding (UDE) descriptor             |
+---------------------------------------------------------------+
|               variable length                                 |
|                message here                          ----------
|     (the UDE supplies the exact byte-count)          |
+-------------------------------------------------------

There are also three special payload types that are not UDE based. They handle the common need to attach one or two 64-bit values to a timestamp.

d) primary word + one int64

+---------------------------------------------------------------+
|                primary word (64-bits) with PTI=1              |
+---------------------------------------------------------------+
|                     V1 (int64)                                |
+---------------------------------------------------------------+

e) primary word + one float64

+---------------------------------------------------------------+
|                primary word (64-bits) with PTI=2              |
+---------------------------------------------------------------+
|                     V0 (float64)                              |
+---------------------------------------------------------------+

f) primary word + one float64 + one int64

+---------------------------------------------------------------+
|                primary word (64-bits) with PTI=3              |
+---------------------------------------------------------------+
|                     V0 (float64)                              |
+---------------------------------------------------------------+
|                     V1 (int64)                                |
+---------------------------------------------------------------+
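
The layouts (a) through (f) imply a simple rule for the total size of a message. The following is a minimal sketch of that rule; messageBytes is a hypothetical helper written for illustration and is not part of the package API.

```go
package main

import "fmt"

// messageBytes returns the total on-the-wire size of one TMFRAME message,
// given the PTI from the primary word and, for PTI=7, the UCOUNT taken
// from the UDE word. Hypothetical helper, not part of the tm package.
func messageBytes(pti uint8, ucount uint64) uint64 {
	switch pti & 7 {
	case 0, 4, 5, 6:
		return 8 // (a) primary word only
	case 1, 2:
		return 16 // (d), (e) primary word + one 64-bit value
	case 3:
		return 24 // (f) primary word + float64 + int64
	default: // 7
		return 16 + ucount // (b), (c) primary word + UDE word + payload
	}
}

func main() {
	for pti := uint8(0); pti < 8; pti++ {
		fmt.Printf("PTI=%d: %d bytes\n", pti, messageBytes(pti, 5))
	}
}
```

The ucount argument is ignored for every PTI other than 7, since only UDE messages carry a variable-length payload.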

1. number encoding rules

Integers and floating point numbers are used in the protocol that follows, so we fix our definitions of these.

  • Integers: Encoded in little-endian format. Signed integers use two’s complement. Integers are signed unless otherwise noted.
  • float64, also known as 64-bit floating-point numbers: Encoded in little-endian IEEE-754 format.
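
As an illustration of these two rules, here is a small sketch using only the Go standard library. The helper names (putInt64, putFloat64, getInt64, getFloat64) are ours and are not taken from the package.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// putInt64 writes v into b[0:8] as a little-endian two's complement integer.
func putInt64(b []byte, v int64) {
	binary.LittleEndian.PutUint64(b, uint64(v))
}

// putFloat64 writes v into b[0:8] as a little-endian IEEE-754 double.
func putFloat64(b []byte, v float64) {
	binary.LittleEndian.PutUint64(b, math.Float64bits(v))
}

// getInt64 reads a little-endian two's complement integer from b[0:8].
func getInt64(b []byte) int64 {
	return int64(binary.LittleEndian.Uint64(b))
}

// getFloat64 reads a little-endian IEEE-754 double from b[0:8].
func getFloat64(b []byte) float64 {
	return math.Float64frombits(binary.LittleEndian.Uint64(b))
}

func main() {
	buf := make([]byte, 16)
	putInt64(buf[0:8], -2)
	putFloat64(buf[8:16], 1.5)
	fmt.Printf("% x\n", buf)
	fmt.Println(getInt64(buf[0:8]), getFloat64(buf[8:16]))
}
```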

2. primary word encoding

A TMFRAME message always starts with a primary word.


msb                  primary word (64-bits)                   lsb
+-----------------------------------------------------------+---+
|                        TMSTAMP                            |PTI|
+-----------------------------------------------------------+---+

TMSTAMP (61 bits) =
     The primary word is generated by starting
     with a 64-bit signed little endian integer, the number
     of nanoseconds since the unix epoch; then truncating off
     the lowest 3-bits and overwriting them with the value of PTI.
     The resulting TMSTAMP value is the 61 most significant
     bits of the timestamp and can be used directly as an
     integer timestamp by first copying the full 64-bits of the
     timeframe word and then zero-ing out the 3 bits of PTI.
     
PTI (3 bits) = Payload type indicator, decoded as follows:

    0 => a zero value is indicated for this timestamp.
         (the zero value can also be encoded, albeit
         less efficiently, by a UDE word with bits all 0).
         
         Use the zero-value for time-stamp only time-series.

         The primary word is the only word in this message.
         The next word will be the primary word of the next
         message on the wire.

         By convention, the 0 value can indicate the
         payload false for boolean series.

    1 => exactly one 64-bit int64 payload value follows.
         The message has exactly two 64-bit words.
         The payload is known as V1.

    2 => exactly one 64-bit float64 payload value follows.
         Mnemonic: The total number of 64-bit words in the message is 2.
         The payload is known as V0.

    3 => exactly two 64-bit payload values follow, one float64 and one int64.
         Mnemonic: The total number of 64-bit words in the message is 3.
         The payload components are known as V0 (the float64), and
         V1 (the int64).

    4 => NULL: the null-value, a known and intentionally null value. Written as NULL.

         NB By convention, for a strictly boolean series, PTI=4 is the true value,
         while PTI=0 is the false value.

         The primary word is the only word in this message.

    5 => NA: not-available, an unintentionally missing value.
         In statistics, this indicates that *any* value could
         have been the correct payload here, but that the
         observation was not recorded. a.k.a. "Missing data". Written as NA.

         The primary word is the only word in this message.

    6 => NaN: not-a-number, IEEE-754 floating point NaN value.
         Obtained when dividing zero by zero, for example. math.IsNaN()
         detects these.

         The primary word is the only word in this message.

    7 => user-defined-encoding (UDE) descriptor word follows.
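
The primary word construction described above can be sketched in a few lines of Go. The names makePrimary and splitPrimary are hypothetical and chosen for illustration; the package itself exposes this through Frame.Tm() and Frame.GetPTI().

```go
package main

import (
	"fmt"
	"time"
)

// ptiMask selects the low 3 bits of the primary word.
const ptiMask int64 = 7

// makePrimary overwrites the low 3 bits of a UnixNano timestamp with pti.
func makePrimary(unixNano int64, pti int64) int64 {
	return (unixNano &^ ptiMask) | (pti & ptiMask)
}

// splitPrimary recovers the truncated timestamp (low 3 bits zeroed)
// and the PTI from a primary word.
func splitPrimary(prim int64) (tmstamp int64, pti int64) {
	return prim &^ ptiMask, prim & ptiMask
}

func main() {
	t := time.Date(2016, 2, 25, 12, 0, 0, 123456789, time.UTC)
	prim := makePrimary(t.UnixNano(), 2) // PTI=2: one float64 follows
	ts, pti := splitPrimary(prim)
	fmt.Println(time.Unix(0, ts).UTC(), pti)
}
```

The recovered timestamp differs from the original by at most 7 nanoseconds, which is the precision given up to make room for the PTI.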

3. User-defined-encoding descriptor

msb    user-defined-encoding (UDE) descriptor 64-bit word     lsb
+---------------------------------------------------------------+
| EVTNUM (21-bits)  |                UCOUNT (43-bits)           |
+---------------------------------------------------------------+

  UCOUNT => is a 43-bit unsigned integer number of bytes that
       follow as a part of this message. Zero is allowed as a
       value in UCOUNT, and is useful when the type information in EVTNUM
       suffices to convey the event. Mask off the high 21-bits
       of the UDE to erase the EVTNUM before using the count
       of bytes found in UCOUNT. The payload starts immediately
       after the UDE word, and can be just under 8TB long (at most 2^43 - 1 bytes).
       Shorter payloads are recommended whenever possible.

       There is no requirement that UCOUNT be padded to
       any alignment boundary. It should be the exact length
       of the payload in bytes.

       The next message's primary word will commence after the
       UCOUNT bytes that follow the UDE.

       If UCOUNT is > 0, then the payload of bytes must
       include a 0 byte as its last value. This assists
       in language bindings (e.g. C) where strings need a
       terminating zero byte.

  EVTNUM => a 21-bit signed two's-complement integer capable
       of expressing values in the range [-(2^20), (2^20)-1].

       Positive numbers are for pre-defined system event
       types. Negative numbers are reserved for user-defined
       event types starting with -2, -3, -4, ...

       There is one pre-defined user-defined event number.
       The one pre-defined user EVTNUM value is:

       -1 => an error message string in utf8 follows; it is
             of length UCOUNT, and the count includes a
             zero termination byte if and only if the string has
             one or more bytes in it.

       Any custom user-defined types added by the user will
       therefore start at EVTNUM = -2. The last usable EVTNUM is
       the -1 * (2^20) value; so over one million user
       defined event types are available.

       System defined EVTNUM values as of this writing are:

       0 => this is also a zero value payload. The corresponding
            UCOUNT must also be 0. There are no other words
            in this message. This allows encoders to not
            have to go back and compress out a zero value by
            writing a PTI of zero; although they are encouraged
            to do so whenever possible to save a word of space.

       1..7 => these are reserved and should never actually
               appear in the EVTNUM field on the wire. However
               we defined them here to make the API
               easier to use--when NewFrame() is called
               with evtnum in this range, the PTI field is automatically
               set accordingly, and the implicit compression
               of the primary word-only messages is recognized.
               To provide documentation for the NewFrame() call's
               evtnum formal parameter, we will restate the encoding here:

               1 => EvOneInt64, the payload is defined to be the
                    following int64 value.
               2 => EvOneFloat64, the payload will be the
                    following float64 value.
               3 => EvTwo64, the payload will be the float64
                    and the int64 that follow.
               4 => EvNull, payload is defined as the NULL value.
               5 => EvNA, payload is defined as NA, or Not-Available,
                    denoting missing data.
               6 => EvNaN, payload denotes IEEE-754 Not-a-Number, NaN.
               7 => EvUDE, payload is described by the UDE word that
                    follows.

       8 => a TMFRAME-HEADER value follows, giving time-series
            metadata. To be described later.
            
       9 => a Msgpack[version 2] encoded message follows.
       
       10 => a Binc encoded message follows.
       
       11 => a Capnproto encoded message segment follows.

       12 => a sequence of S-expressions (code or data) in zygomys
            parse format follows. [note 1]
 
       13 => the payload is a UTF-8 encoded string. As noted
             above in the UCOUNT section, the wire
             format will include a zero termination byte after
             the string to help with C bindings, and
             the UCOUNT will reflect that inclusion if the
             string length itself is greater than zero.

       14 => the payload is a JSON UTF-8 string. The UCOUNT
             will include an additional terminating zero byte
             if the string has length > 0.

       15 => the payload is a Kafka/msgpack message.

       16 => the payload is in ZebraPack format.
             See https://github.com/glycerine/zebrapack for
             the specification and a Go implementation.

After any variable length payload that follows the UDE word, the next TMFRAME message will commence with its primary word.
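
To make the UDE layout concrete, the sketch below packs and unpacks a UDE word and builds the zero-terminated payload for a UTF-8 string message (EVTNUM 13). The function names are hypothetical and the code assumes evtnum is already within the 21-bit range; only the mask value is taken from the package's KeepLow43Bits constant.

```go
package main

import "fmt"

const (
	// keepLow43Bits has the same value as the package's KeepLow43Bits.
	keepLow43Bits uint64 = 0x000007FFFFFFFFFF
	evUtf8        int32  = 13
)

// makeUDE packs a 21-bit signed EVTNUM above a 43-bit unsigned UCOUNT.
// It assumes evtnum is within [-(2^20), (2^20)-1].
func makeUDE(evtnum int32, ucount uint64) uint64 {
	return uint64(evtnum)<<43 | (ucount & keepLow43Bits)
}

// splitUDE recovers the sign-extended EVTNUM and the UCOUNT.
func splitUDE(ude uint64) (evtnum int32, ucount uint64) {
	// The arithmetic right shift on int64 restores the sign of EVTNUM.
	return int32(int64(ude) >> 43), ude & keepLow43Bits
}

// stringPayload appends the terminating zero byte that the format
// requires whenever the string is non-empty.
func stringPayload(s string) []byte {
	if len(s) == 0 {
		return nil
	}
	return append([]byte(s), 0)
}

func main() {
	payload := stringPayload("hello")
	ude := makeUDE(evUtf8, uint64(len(payload)))
	ev, n := splitUDE(ude)
	fmt.Println(ev, n) // 13 6

	ev, n = splitUDE(makeUDE(-2, 0)) // first free user-defined EVTNUM
	fmt.Println(ev, n)               // -2 0
}
```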

This concludes the specification of the TMFRAME format.

conclusion

TMFRAME is a very simple yet flexible format for time series data. It allows very compact and dense information capture, while providing the ability to convey and attach full event information to each timepoint as required.

implementation

There is a full reference Go implementation in this repo. Docs here.

NB EVTNUM convention for display only in the Go implementation

EVTNUM values between 2000 and 9999 are assumed to be JSON, and will be displayed by tfcat as such.

notes

[1] For zygomys parse format, see https://github.com/glycerine/zygomys

Copyright (c) 2016, Jason E. Aten.

LICENSE: MIT license.

Documentation

Overview

See https://github.com/glycerine/tmframe for the specification of the TMFRAME format which we implement here.

Index

Constants

const KeepLow43Bits uint64 = 0x000007FFFFFFFFFF

KeepLow43Bits allows one to mask off a UDE and discover the UCOUNT in the lower 43 bits quickly. For example: ucount := ude & KeepLow43Bits

Variables

var DataTooBigErr = fmt.Errorf("data cannot be over 8TB - 1 byte")

DataTooBigErr is returned from NewFrame() if the user tries to submit more than 2^43 -1 bytes of data.

var EastCoastUSLocation *time.Location

var EvtnumOutOfRangeErr = fmt.Errorf("evtnum out of range. min allowed is -1048576, max is 1048575")

EvtnumOutOfRangeErr is returned from NewFrame() when the evtnum is out of the allowed range.

var FrameTooLargeErr = fmt.Errorf("frame was larger than FrameReader's maximum")

var LondonLocation *time.Location

var MyNaN float64

MyNaN provides the IEEE-754 floating point NaN value without having to make a call each time to math.NaN().

var NoDataAllowedErr = fmt.Errorf("data must be empty for this evtnum")

NoDataAllowedErr is returned from NewFrame() when a data argument is supplied but the specified evtnum does not carry data.

var TooShortErr = fmt.Errorf("data supplied is too short to represent a TMFRAME frame")

TooShortErr is returned by Frame.Unmarshal() when the bytes supplied in by are insufficient for the encoded EVTNUM or UCOUNT.

var UTCLocation = time.UTC

var WestCoastUSLocation *time.Location

Functions

func DateAfter

func DateAfter(a *Date, b *Date) bool

DateAfter returns true if a > b.

func DateBefore

func DateBefore(a *Date, b *Date) bool

DateBefore returns true if a < b.

func DatesEqual

func DatesEqual(a *Date, b *Date) bool

DatesEqual returns true if a and b are the exact same day.

func Dedup

func Dedup(r io.Reader, w io.Writer, windowSize int, dupsW io.Writer, detectOnly bool) error

Dedup dedups over a window of windowSize Frames a stream of frames from r into w. dupsW can be nil. If dupsW is supplied, recognized duplicate events will be written to this io.Writer. If detectOnly is true, we will return a DupDetectedErr at the first duplicate, to enable scanning a filesystem. With detectOnly set, no dedupped output Frames are written.
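
The package's actual deduplication algorithm is not shown on this page. As a rough sketch of what deduplicating over a sliding window can look like, the following works on raw byte records rather than Frames, and uses SHA-256 from the standard library as a stand-in for the BLAKE2b hash that Frame.Blake2b() provides. The function dedupWindow and its exact window semantics are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// dedupWindow splits in into kept and dups. A record is a duplicate only if
// an identical record appeared among the previous windowSize input records.
// Hypothetical sketch; not the package's Dedup implementation.
func dedupWindow(in [][]byte, windowSize int) (kept, dups [][]byte) {
	type digest [sha256.Size]byte
	var recent []digest           // hashes of the last windowSize records, oldest first
	count := make(map[digest]int) // occurrences of each hash within recent

	for _, rec := range in {
		h := digest(sha256.Sum256(rec))
		if count[h] > 0 {
			dups = append(dups, rec)
		} else {
			kept = append(kept, rec)
		}
		// Slide the window forward by one record.
		recent = append(recent, h)
		count[h]++
		if len(recent) > windowSize {
			old := recent[0]
			recent = recent[1:]
			count[old]--
			if count[old] == 0 {
				delete(count, old)
			}
		}
	}
	return kept, dups
}

func main() {
	in := [][]byte{[]byte("a"), []byte("b"), []byte("a"), []byte("c"), []byte("d"), []byte("a")}
	kept, dups := dedupWindow(in, 2)
	fmt.Printf("kept=%q dups=%q\n", kept, dups)
}
```

With a window of 2, the second "a" is caught because the first is still in the window, while the third "a" is kept because every earlier "a" has already slid out.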

func DirExists

func DirExists(name string) bool

DirExists returns true if the named path is a directory presently in the filesystem.

func FileExists

func FileExists(name string) bool

FileExists returns true if the named path exists in the filesystem and is a file (and not a directory).

func FramesEqual

func FramesEqual(a, b *Frame) bool

FramesEqual calls Marshal() on both frames a and b and returns true if the serialized versions of a and b are byte-for-byte identical. FramesEqual will panic if there is a marshaling error.

func GetDateSubdirs

func GetDateSubdirs(dir string) ([]string, error)

func GetProperFilesInDir

func GetProperFilesInDir(dir string) ([]string, error)

GetProperFilesInDir returns a slice listing the files (not directories) found in dir.

func IntToPrimTm

func IntToPrimTm(t int64) int64

IntToPrimTm converts from a UnixNano timestamp (int64 number of nanoseconds) to a timestamp comparable with Frame.Tm().

func IntersectDays

func IntersectDays(begin *Date, endx *Date, avail []string) (readDays []string, err error)

IntersectDays: avail must already be in sorted calendar increasing order. endx can be nil, begin cannot be nil.

func IsDateDir

func IsDateDir(d string) bool

IsDateDir says true to '2016', '01', and '31', but rejects '1999', '41' or '00'.

func NewMarshalledFrame

func NewMarshalledFrame(writeHere []byte, tm time.Time, evtnum Evtnum, v0 float64, v1 int64, data []byte) ([]byte, error)

NewMarshalledFrame creates a frame already marshalled into the writeHere buffer, assuming that writeHere is large enough. It returns the marshalled frame as bytes, and any error. Effectively this is a convenience combination of NewFrame() followed by Marshal().

func NewMsgpackFrame

func NewMsgpackFrame(tm time.Time, m msgp.Marshaler, buf []byte) ([]byte, error)

NewMsgpackFrame is a convenience function, taking a value m whose type has had github.com/tinylib/msgp code generated for it. Such a type will have an msgp.Marshaler implementation defined by the generated code. The provided buf will be used if it has sufficient space, but is optional and can be nil. The marshalled frame's bytes are returned, along with any error encountered.

func ReadAvailDays

func ReadAvailDays(readDir string) (res []string, err error)

func ReadNewlineDelimFile

func ReadNewlineDelimFile(path string) ([]string, error)

ReadNewlineDelimFile reads in newline-delimited text from a file.

func SkeletonDemoCopyFames

func SkeletonDemoCopyFames(r io.Reader, w io.Writer) error

SkeletonDemoCopyFames is a skeleton function for Frame processing. Bare bones, it simply copies frames without doing any other transformation. It is meant to serve as a starting point for other customized processing functions.

The emphasis on safety here means that this is deliberately not a zero copy implementation. Optimization is possible, but not demonstrated here.

func TimeToPrimTm

func TimeToPrimTm(t time.Time) int64

TimeToPrimTm converts from a time.Time to a timestamp comparable with Frame.Tm().

func ValidEvtnum

func ValidEvtnum(evtnum Evtnum) bool

Validate our acceptable range of evtnum. The min allowed is -1048576, max allowed is 1048575

Types

type BufferedFrameReader

type BufferedFrameReader struct {
	Name     string
	Reader   *FrameReader
	Next     *Frame
	TmpFrame Frame
}

BufferedFrameReader supports Peek(), Advance(), and ReadOne(), which help in merging (merge sorting) two streams.

func NewBufferedFrameReader

func NewBufferedFrameReader(r io.Reader, maxFrameBytes int64, name string) *BufferedFrameReader

NewBufferedFrameReader makes a new BufferedFrameReader. It imposes a message size limit of maxFrameBytes in order to size its internal FrameReader's buffer.

func (*BufferedFrameReader) Advance

func (s *BufferedFrameReader) Advance() error

Advance skips forward a frame in the stream. We discard the next frame, the next frame being the one that would have been returned if Peek had been called instead.

func (*BufferedFrameReader) Peek

func (s *BufferedFrameReader) Peek() (*Frame, error)

Peek gets a look at the next Frame, without advancing past it. Repeated calls to Peek without any intervening ReadOne or Advance calls will return the same Frame.

func (*BufferedFrameReader) ReadOne

func (s *BufferedFrameReader) ReadOne() (*Frame, error)

ReadOne reads the next frame and then advances past it. Calling it repeatedly will read all frames in a stream in order. ReadOne may return the next Frame and a non-nil error from the Advance() call such as io.EOF. Callers should process the returned *Frame (if not nil) before considering the returned error.
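
The "process the frame before the error" contract is easy to get wrong, so here is a sketch of a read loop that follows it. To keep the example runnable without the package, frame, oneReader, and sliceReader are stand-ins that we define here; only the shape of the ReadOne() return values is taken from the documentation above.

```go
package main

import (
	"fmt"
	"io"
)

// frame stands in for *Frame; only a timestamp is modeled.
type frame struct{ tm int64 }

// oneReader stands in for the ReadOne method of *BufferedFrameReader.
type oneReader interface {
	ReadOne() (*frame, error)
}

// sliceReader returns its final frame together with io.EOF, mimicking the
// documented case of a non-nil Frame accompanied by a non-nil error.
type sliceReader struct{ frames []*frame }

func (s *sliceReader) ReadOne() (*frame, error) {
	if len(s.frames) == 0 {
		return nil, io.EOF
	}
	f := s.frames[0]
	s.frames = s.frames[1:]
	if len(s.frames) == 0 {
		return f, io.EOF
	}
	return f, nil
}

// drain handles every returned frame first and only then inspects the error.
func drain(r oneReader, handle func(*frame)) error {
	for {
		f, err := r.ReadOne()
		if f != nil {
			handle(f)
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	r := &sliceReader{frames: []*frame{{tm: 8}, {tm: 16}, {tm: 24}}}
	n := 0
	err := drain(r, func(f *frame) { n++ })
	fmt.Println(n, err) // 3 <nil>
}
```

Checking err before f in the loop body would silently drop the last frame of the stream.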

func (*BufferedFrameReader) WriteTo

func (b *BufferedFrameReader) WriteTo(w io.Writer) (n int64, err error)

WriteTo implements io.WriterTo. It bypasses Frame handling and allows copying from the underlying stream directly. It should be used to skip any further Frame processing and copy the rest of the byte stream directly.

type Date

type Date struct {
	Year  int
	Month int
	Day   int
}

Date represents a UTC time zone day

func NextDate

func NextDate(d *Date) *Date

NextDate returns the next calendar day after d.

func ParseDate

func ParseDate(datestring string) (*Date, error)

ParseDate converts a datestring '2016/02/25' into a Date{} struct.
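
For readers unfamiliar with Go's layout strings, a date in this '2016/02/25' form can be parsed with the standard time package. This is a minimal sketch, assuming the format is always zero-padded year/month/day; the date type and parseDate function are stand-ins and may differ from how the package implements ParseDate.

```go
package main

import (
	"fmt"
	"time"
)

// date mirrors the fields of the package's Date struct.
type date struct {
	Year  int
	Month int
	Day   int
}

// parseDate is a stand-in for ParseDate. It accepts "YYYY/MM/DD" and
// rejects impossible calendar days such as 2016/02/30.
func parseDate(s string) (*date, error) {
	t, err := time.Parse("2006/01/02", s)
	if err != nil {
		return nil, err
	}
	return &date{Year: t.Year(), Month: int(t.Month()), Day: t.Day()}, nil
}

func main() {
	d, err := parseDate("2016/02/25")
	fmt.Println(d.Year, d.Month, d.Day, err)

	_, err = parseDate("2016/02/30")
	fmt.Println(err != nil)
}
```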

func PrevDate

func PrevDate(d *Date) *Date

PrevDate returns the first calendar day prior to d.

func TimeToDate

func TimeToDate(tm time.Time) Date

TimeToDate returns the UTC Date associated with tm.

func UTCDateFromTime

func UTCDateFromTime(tm time.Time) *Date

UTCDateFromTime returns the date after tm is moved to the UTC time zone.

func (*Date) DecodeMsg

func (z *Date) DecodeMsg(dc *msgp.Reader) (err error)

DecodeMsg implements msgp.Decodable

func (Date) EncodeMsg

func (z Date) EncodeMsg(en *msgp.Writer) (err error)

EncodeMsg implements msgp.Encodable

func (Date) MarshalMsg

func (z Date) MarshalMsg(b []byte) (o []byte, err error)

MarshalMsg implements msgp.Marshaler

func (Date) Msgsize

func (z Date) Msgsize() (s int)

func (*Date) String

func (d *Date) String() string

String turns the date into a string.

func (*Date) ToGoTime

func (d *Date) ToGoTime() time.Time

ToGoTime turns the date into UTC time.Time, at the 0 hrs 0 min 0 second start of the day.

func (*Date) Unix

func (d *Date) Unix() int64

Unix converts the date into an int64 representing the nanoseconds since the unix epoch for the ToGoTime() output of Date d.

func (*Date) UnmarshalMsg

func (z *Date) UnmarshalMsg(bts []byte) (o []byte, err error)

UnmarshalMsg implements msgp.Unmarshaler

type DupDetectedErr

type DupDetectedErr struct {
	Msg string
}

func NewDupDetectedErr

func NewDupDetectedErr(msg string) *DupDetectedErr

func (*DupDetectedErr) Error

func (dd *DupDetectedErr) Error() string

type Evtnum

type Evtnum int32

The Evtnum is the message type when pti = PtiUDE and UDE descriptors are in use for describing TMFRAME messages longer than just the one Primary word.

const (
	EvErr Evtnum = -1

	// 0-7 deliberately match the PTI to make the
	// API easier to use. Callers to NewFrame need
	// only specify an Evtnum, and the framing code
	// sets PTI and EVTNUM correctly.
	EvZero       Evtnum = 0
	EvOneInt64   Evtnum = 1
	EvOneFloat64 Evtnum = 2
	EvTwo64      Evtnum = 3
	EvNull       Evtnum = 4
	EvNA         Evtnum = 5
	EvNaN        Evtnum = 6
	EvUDE        Evtnum = 7

	EvHeader    Evtnum = 8
	EvMsgpack   Evtnum = 9
	EvBinc      Evtnum = 10
	EvCapnp     Evtnum = 11
	EvZygo      Evtnum = 12
	EvUtf8      Evtnum = 13
	EvJson      Evtnum = 14
	EvMsgpKafka Evtnum = 15
	EvZebraPack Evtnum = 16
)

func (Evtnum) String

func (e Evtnum) String() string

String pretty prints the names of the events into a string.

type Frame

type Frame struct {
	Prim int64 // the primary word

	V0 float64 // primary float64 value, for EvOneFloat64 and EvTwo64

	// Ude alternatively represents V1 for EvTwo64 and EvOneInt64
	// GetV1() to access as V1.
	Ude int64 // the User-Defined-Encoding word

	Data []byte // the variable length payload after the UDE
}

Frame holds a fully parsed TMFRAME message.

func GenTestFrames

func GenTestFrames(n int, outpath *string) (frames []*Frame, tms []time.Time, by []byte)

GenTestFrames generates n test Frames, with 4 different frame types and randomly varying sizes. If outpath is non-nil, it writes them to that file.

func GenTestFramesSequence

func GenTestFramesSequence(n int, outpath *string) (frames []*Frame, tms []time.Time, by []byte)

generate 0..(n-1) as floating point EvOneFloat64 frames

func GenTestTwo64Frames

func GenTestTwo64Frames(n int, outpath *string) (frames []*Frame, tms []time.Time, by []byte)

generate n test Two64Frames

func GenTestdataZebraPackTestFrames

func GenTestdataZebraPackTestFrames(n int, outpath *string) (frames []*Frame, tms []time.Time, by []byte)

GenTestdataZebraPackTestFrames generates n test Frames with ZebraPack-encoded payloads. If outpath is non-nil, it writes them to that file.

func MakeTwo64Frames

func MakeTwo64Frames(tm time.Time,
	deltas []float64,
	typ []int64,
	addSec []int64) []*Frame

MakeTwo64Frames creates a slice of *Frames, starting at tm, and occurring thereafter as determined by the offsets in the addSec (seconds) slice.

The deltas argument determines the length and the content of the slice. Each element is an EvOneFloat64 valued Frame, use Frame.GetV0() to observe the value. deltas of 0 are skipped. typ gives the int64 V1 value for each frame, accessed with Frame.GetV1().

func NewFrame

func NewFrame(tm time.Time, evtnum Evtnum, v0 float64, v1 int64, data []byte) (*Frame, error)

NewFrame creates a new TMFRAME message, ready to have Marshal called on it for serialization into bytes. It will not make an internal copy of data. When copied onto the wire with Marshal(), a zero byte will be added to the data to make interop with C bindings easier; hence the UCOUNT will always include this terminating zero byte in its count if len(data) > 0.

func ReadAllFrames

func ReadAllFrames(inputFile string) ([]*Frame, error)

ReadAllFrames is a helper function, reading all the Frames found in inputFile and returning them.

func (*Frame) Blake2b

func (f *Frame) Blake2b() []byte

Blake2b returns the 64-byte BLAKE2b cryptographic hash of the Frame. This is useful for hashing and de-duplicating a stream of Frames.

reference: https://godoc.org/github.com/codahale/blake2
reference: https://blake2.net/
reference: https://tools.ietf.org/html/rfc7693

func (*Frame) DisplayForR

func (frame *Frame) DisplayForR(w io.Writer)

func (*Frame) DisplayFrame

func (frame *Frame) DisplayFrame(w io.Writer, i int64, prettyPrint bool, skipPayload bool, rReadable bool, zSchema *zebra.Schema)

DisplayFrame prints a frame to w (e.g. pass os.Stdout as w), along with optional number i.

If i < 0, then i is not printed. If prettyPrint is true and the payload is json or msgpack, we will display it in an easier-to-read, pretty-printed json format. If skipPayload is true we will only print the Frame header information.

If rReadable, then we print in a format that can be consumed by R's read.table() call. The skipPayload, prettyPrint, and i values are ignored.

func (*Frame) GetEvtnum

func (f *Frame) GetEvtnum() Evtnum

func (*Frame) GetPTI

func (f *Frame) GetPTI() PTI

func (*Frame) GetUDE

func (f *Frame) GetUDE() int64

func (*Frame) GetUlen

func (f *Frame) GetUlen() int64

func (*Frame) GetV0

func (f *Frame) GetV0() float64

func (*Frame) GetV1

func (f *Frame) GetV1() int64

GetV1 retrieves the V1 value if the frame is of type PtiTwo64. Otherwise it returns 0.

func (*Frame) Marshal

func (f *Frame) Marshal(buf []byte) ([]byte, error)

Marshal serializes the Frame into bytes. We'll reuse the space pointed to by buf if there is sufficient space in it. We return the bytes that we wrote, plus any error.

func (*Frame) NumBytes

func (f *Frame) NumBytes() int64

NumBytes returns the number of bytes that the serialized Frame will consume on the wire. The count will be at least 8 bytes, and at most 16 + 2^43 bytes (which is 16 bytes + 8TB).

func (*Frame) SetTm

func (f *Frame) SetTm(t int64)

SetTm sets the Prim timestamp from t. It zeros the low 3 bits of t before storing it, and preserves the PTI already in the primary word.

func (*Frame) SetV1

func (f *Frame) SetV1(v1 int64)

SetV1 sets the Frame's V1 value if the frame is of type PtiTwo64. Otherwise it is a no-op.

func (Frame) String

func (f Frame) String() string

String converts the Frame's header information to a string. It doesn't read or stringify any variable length UDE payload, even if present.

func (*Frame) Stringify

func (frame *Frame) Stringify(i int64, prettyPrint bool, skipPayload bool, rReadable bool) string

Stringify is like DisplayFrame but it returns a string.

func (*Frame) StringifyForR

func (f *Frame) StringifyForR() string

func (*Frame) Tm

func (f *Frame) Tm() int64

Tm extracts and returns the Prim timestamp from the frame (this is a UnixNano nanosecond timestamp, with the low 3 bits zeroed).

func (*Frame) TmTime

func (f *Frame) TmTime() time.Time

TmTime extracts and returns the Prim timestamp from the frame (this is a UnixNano nanosecond timestamp, with the low 3 bits zeroed), then converts it to a UTC timezone time.Time

func (*Frame) Unmarshal

func (f *Frame) Unmarshal(by []byte, copyData bool) (rest []byte, err error)

Unmarshal overwrites f with the restored value of the TMFRAME found in the by []byte data. If copyData is true, we'll make a copy of the underlying data into the frame f.Data; otherwise we merely point to it. NB If the underlying buffer by is recycled/changes, and you want to keep around multiple frames, you should use copyData = true.
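
The aliasing hazard behind the copyData flag is a property of Go slices in general, and can be shown without the package. In this sketch, view is a hypothetical helper that either aliases the input buffer or copies it, mirroring the choice that copyData offers.

```go
package main

import "fmt"

// view returns a payload slice that either shares memory with buf
// (copyData == false) or owns an independent copy (copyData == true).
// Hypothetical helper illustrating the trade-off; not package code.
func view(buf []byte, copyData bool) []byte {
	if copyData {
		return append([]byte(nil), buf...)
	}
	return buf
}

func main() {
	buf := []byte("first")
	aliased := view(buf, false)
	copied := view(buf, true)

	// The reader recycles its buffer for the next frame.
	copy(buf, "2nd!!")

	fmt.Printf("%s %s\n", aliased, copied) // 2nd!! first
}
```

The aliased slice avoids an allocation but is only valid until the buffer is reused; the copy stays intact.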

type FrameChWriter

type FrameChWriter struct {
	Frames []*Frame

	SendOnMe chan *nats.Msg
	// contains filtered or unexported fields
}

FrameChWriter provides merge-sort (via Merge) with output sent to a channel rather than an io.Writer. Output frames are sent on the SendOnMe channel. FrameChWriter may buffer frames and does not force sending immediately.

func NewFrameChWriter

func NewFrameChWriter(maxFrameBytes int64, sendOnMe chan *nats.Msg) *FrameChWriter

NewFrameChWriter constructs a new FrameChWriter for buffering and sending Frames on sendOnMe. It imposes a message size limit of maxFrameBytes in order to size its internal marshalling buffer.

func (*FrameChWriter) Merge

func (fw *FrameChWriter) Merge(datestr string, strms ...*BufferedFrameReader) error

Merge merges the strms input into timestamp order, based on the Frame.Tm() timestamp, and writes the ordered sequence out to fw.SendOnMe
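
The general technique of merging already-sorted streams with peek and advance operations can be sketched as follows. The stream type here is a stand-in that models only the Frame.Tm() values of a BufferedFrameReader; the package's Merge may differ in its details, for instance in how it breaks ties or handles read errors.

```go
package main

import "fmt"

// stream stands in for a BufferedFrameReader whose frames are already in
// timestamp order. Only the timestamps are modeled.
type stream struct{ tms []int64 }

// peek returns the next timestamp without consuming it.
func (s *stream) peek() (int64, bool) {
	if len(s.tms) == 0 {
		return 0, false
	}
	return s.tms[0], true
}

// advance discards the timestamp that peek would have returned.
func (s *stream) advance() { s.tms = s.tms[1:] }

// merge repeatedly emits the smallest peeked timestamp across all streams.
// On ties, the earlier stream in the argument list wins.
func merge(strms ...*stream) []int64 {
	var out []int64
	for {
		best := -1
		var bestTm int64
		for i, s := range strms {
			tm, ok := s.peek()
			if ok && (best < 0 || tm < bestTm) {
				best, bestTm = i, tm
			}
		}
		if best < 0 {
			return out // every stream is exhausted
		}
		out = append(out, bestTm)
		strms[best].advance()
	}
}

func main() {
	a := &stream{tms: []int64{8, 32, 40}}
	b := &stream{tms: []int64{16, 24, 48}}
	fmt.Println(merge(a, b)) // [8 16 24 32 40 48]
}
```

A linear scan over the streams is fine for a handful of inputs; with many streams a min-heap keyed on the peeked timestamp would be the usual refinement.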

func (*FrameChWriter) SendOnCh

func (fw *FrameChWriter) SendOnCh(f *Frame, subject string, datestr string)

SendOnCh sends f on the fw.SendOnMe channel, using a nats.Msg with subject: "tseries.replay." + subject + "." + datestr. SendOnCh marshals the frame, effectively copying it.

type FrameReader

type FrameReader struct {
	R             *bufio.Reader
	MaxFrameBytes int64
	By            []byte
}

FrameReader provides assistance for reading successive Frames from an io.Reader. FrameReader uses bufio to peek ahead and determine the size of the next frame; see PeekNextFrameBytes() and NextFrame().

func NewFrameReader

func NewFrameReader(r io.Reader, maxFrameBytes int64) *FrameReader

NewFrameReader makes a new FrameReader. It imposes a message size limit of maxFrameBytes in order to size its internal read buffer.

func (*FrameReader) NextFrame

func (fr *FrameReader) NextFrame(fillme *Frame) (frame *Frame, nbytes int64, err error, raw []byte)

NextFrame reads the next frame into fillme if provided. If fillme is nil, NextFrame allocates a new Frame. NextFrame returns a pointer to the filled frame, along with the number of bytes on the wire used by the frame. If err is not nil, we return a nil *Frame and 0 for nbytes.

Warning about the returned 'raw' bytes:

If err is nil, the 4th return argument, raw, holds the raw bytes of the frame. Copy these bytes immediately if you need them, as the raw bytes will be overwritten on the next call to this library. If err is not nil, raw will be nil.

func (*FrameReader) NextFrameBytes

func (fr *FrameReader) NextFrameBytes(fillme []byte) (nextbytes []byte, err error)

NextFrameBytes is like NextFrame but avoids Unmarshalling and so can be more efficient. NextFrameBytes reads the next frame into fillme if provided, but does not Unmarshal it; only the raw bytes of the frame are copied into fillme. If fillme is nil, NextFrameBytes allocates a new byte slice, copies the raw bytes for the next frame in, and returns it as nextbytes.

func (*FrameReader) PeekNextFrameBytes

func (fr *FrameReader) PeekNextFrameBytes() (nBytes int64, err error)

PeekNextFrameBytes returns the size of the next frame in bytes.

The returned err will be non-nil if we encountered insufficient data to determine the size of the next frame. If err is non-nil then nBytes will be 0.

Otherwise, if err is nil then nBytes holds the number of bytes in the next frame in FrameReader's underlying io.Reader.

func (*FrameReader) WriteTo

func (b *FrameReader) WriteTo(w io.Writer) (n int64, err error)

WriteTo implements io.WriterTo. It bypasses any Frame handling and allows copying from the underlying stream directly. WriteTo writes data to w until there is no more data to write or an error occurs.

type FrameRingBuf

type FrameRingBuf struct {
	A        []*Frame
	N        int // MaxView, the total size of A, whether or not in use.
	Beg      int // start of in-use data in A
	Readable int // number of *Frame available in A (in use)
}

FrameRingBuf is a fixed-size circular ring buffer of *Frame.

func NewFrameRingBuf

func NewFrameRingBuf(n int) *FrameRingBuf

NewFrameRingBuf is the constructor. It allocates an internal slice of size n.

func (*FrameRingBuf) Adopt

func (b *FrameRingBuf) Adopt(me []*Frame)

Adopt is non-standard.

For efficiency's sake, Adopt may take ownership of the already allocated slice offered in me.

If me is large, the ring adopts it and may subsequently write into the me buffer. If the ring already has a bigger buffer, me is copied into the existing buffer instead.

func (*FrameRingBuf) Advance

func (b *FrameRingBuf) Advance(n int)

Advance is non-standard, but better than Next(), because it does not have to unwrap the buffer and pay the CPU time for the copy that unwrapping may need. It is useful in conjunction with, and after, ReadWithoutAdvance().

func (*FrameRingBuf) Avail

func (f *FrameRingBuf) Avail() int

Avail returns the number of available readable Frames stored in the ring.

func (*FrameRingBuf) First

func (f *FrameRingBuf) First() int

First returns the earliest index, or -1 if the ring is empty.

func (*FrameRingBuf) Kth

func (f *FrameRingBuf) Kth(k int) *Frame

Kth presents the contents of the ring as a strictly linear sequence, so the user doesn't need to think about modular arithmetic. Here k indexes from [0, f.Readable-1], assuming f.Avail() is greater than 0. Kth() returns the k-th frame starting from f.Beg, where f.Beg itself is at k = 0. If k is out of bounds, or the ring is empty, nil is returned.
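
The modular arithmetic that Kth hides can be sketched as follows. This is an illustration only, assuming the field meanings documented for FrameRingBuf (A, N, Beg, Readable); the ring type and kth method are hypothetical stand-ins that hold ints instead of *Frame and report out-of-bounds with a boolean instead of nil.

```go
package main

import "fmt"

// ring is a hypothetical stand-in with the same field meanings as
// FrameRingBuf (A, N, Beg, Readable), holding ints instead of *Frame.
type ring struct {
	A        []int
	N        int
	Beg      int
	Readable int
}

// kth returns the k-th readable element counting from Beg, and false when
// k is out of bounds or the ring is empty.
func (r *ring) kth(k int) (int, bool) {
	if r.Readable == 0 || k < 0 || k >= r.Readable {
		return 0, false
	}
	return r.A[(r.Beg+k)%r.N], true
}

func main() {
	// Readable data starts at index 3 and wraps around: 30, 40, 0, 10.
	r := &ring{A: []int{0, 10, 20, 30, 40}, N: 5, Beg: 3, Readable: 4}
	for k := 0; k < r.Readable; k++ {
		v, _ := r.kth(k)
		fmt.Println(k, v)
	}
}
```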

func (*FrameRingBuf) Last

func (f *FrameRingBuf) Last() int

Last returns the index of the last element, or -1 if the ring is empty.

func (*FrameRingBuf) LegalPos

func (b *FrameRingBuf) LegalPos() (a0, aLast, b0, bLast int)

LegalPos returns the legal index positions, [a0,aLast] and [b0,bLast] inclusive, where the [a0,aLast] holds the first FIFO ordered segment, and the [b0,bLast] holds the second ordered segment, if any. A position of -1 means the segment is not used, perhaps because b.Readable is zero, or because the second segment [b0,bLast] is not in use (when everything fits in the first [a0,aLast] segment).
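
To make the two-segment description concrete, here is a small sketch that re-derives the inclusive ranges from a capacity, a start index, and a readable count. The legalPos function below is a hypothetical free function written for this illustration, not the library method, and it assumes readable never exceeds n.

```go
package main

import "fmt"

// legalPos is a hypothetical re-derivation of the two inclusive index
// ranges for a ring of capacity n whose readable data starts at beg and
// spans readable slots. Unused segments are reported as -1.
func legalPos(n, beg, readable int) (a0, aLast, b0, bLast int) {
	a0, aLast, b0, bLast = -1, -1, -1, -1
	if readable == 0 {
		return
	}
	a0 = beg
	last := beg + readable - 1
	if last < n {
		aLast = last // everything fits in the first segment
		return
	}
	// The data wraps: the first segment runs to the end of the slice,
	// and the second segment starts again at index 0.
	aLast = n - 1
	b0 = 0
	bLast = last % n
	return
}

func main() {
	fmt.Println(legalPos(5, 1, 3)) // 1 3 -1 -1
	fmt.Println(legalPos(5, 3, 4)) // 3 4 0 1
	fmt.Println(legalPos(5, 0, 0)) // -1 -1 -1 -1
}
```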

func (*FrameRingBuf) Nextpos

func (f *FrameRingBuf) Nextpos(from int) int

Nextpos returns the index of the element after from, or -1 if there are no more. It returns -2 on erroneous input (a bad from position).

func (*FrameRingBuf) Prevpos

func (f *FrameRingBuf) Prevpos(from int) int

Prevpos returns the index of the element before from, or -1 if no more and from is the first in the ring. Returns -2 on bad from position.

func (*FrameRingBuf) ReadFrames

func (b *FrameRingBuf) ReadFrames(p []*Frame) (n int, err error)

ReadFrames is modeled on bytes.Buffer.Read(): it reads the next len(p) *Frame from the buffer or until the buffer is drained. The return value n is the number of *Frame read. If the buffer has no data to return, err is io.EOF (unless len(p) is zero); otherwise it is nil.

func (*FrameRingBuf) ReadWithoutAdvance

func (b *FrameRingBuf) ReadWithoutAdvance(p []*Frame) (n int, err error)

ReadWithoutAdvance reads the data but leaves it in the buffer, for example so as to peek ahead.

func (*FrameRingBuf) Reset

func (b *FrameRingBuf) Reset()

Reset quickly forgets any data stored in the ring buffer.

func (*FrameRingBuf) TwoContig

func (b *FrameRingBuf) TwoContig(makeCopy bool) (first []*Frame, second []*Frame)

TwoContig returns all readable *Frame, but in two separate slices, to avoid copying. The two slices are from the same buffer, but are not contiguous. Either or both may be empty slices.

func (*FrameRingBuf) WriteCapacity

func (b *FrameRingBuf) WriteCapacity() int

WriteCapacity returns the number of spaces left to write in the ring before it is full. When the ring is full, 0 is returned.

func (*FrameRingBuf) WriteFrames

func (b *FrameRingBuf) WriteFrames(p []*Frame) (n int, err error)

WriteFrames writes len(p) *Frame values from p to the underlying ring. It returns the number of Frames written from p (0 <= n <= len(p)) and any error encountered that caused the write to stop early. WriteFrames returns a non-nil error if it returns n < len(p).
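
The read and write contracts described for the ring can be sketched with a simplified stand-in. The ring type, newRing, write, and read below are hypothetical and hold ints instead of *Frame; the use of io.ErrShortWrite for a partial write is an assumption, since the documentation only states that the error is non-nil.

```go
package main

import (
	"fmt"
	"io"
)

// ring is a hypothetical stand-in for a fixed-size ring of values.
type ring struct {
	A        []int
	N        int
	Beg      int
	Readable int
}

func newRing(n int) *ring { return &ring{A: make([]int, n), N: n} }

// write copies as many values from p as fit and reports a short write.
func (r *ring) write(p []int) (int, error) {
	n := 0
	for n < len(p) && r.Readable < r.N {
		r.A[(r.Beg+r.Readable)%r.N] = p[n]
		r.Readable++
		n++
	}
	if n < len(p) {
		return n, io.ErrShortWrite // assumed error value
	}
	return n, nil
}

// read moves up to len(p) values into p; err is io.EOF when nothing is
// readable, unless len(p) is zero.
func (r *ring) read(p []int) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	if r.Readable == 0 {
		return 0, io.EOF
	}
	n := 0
	for n < len(p) && r.Readable > 0 {
		p[n] = r.A[r.Beg]
		r.Beg = (r.Beg + 1) % r.N
		r.Readable--
		n++
	}
	return n, nil
}

func main() {
	r := newRing(3)
	n, err := r.write([]int{1, 2, 3, 4}) // the fourth value does not fit
	fmt.Println(n, err)
	out := make([]int, 2)
	n, err = r.read(out)
	fmt.Println(n, out[:n], err)
}
```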

type FrameWriter

type FrameWriter struct {
	Frames []*Frame

	Out io.Writer
	// contains filtered or unexported fields
}

FrameWriter writes Frames to Out, an underlying io.Writer. FrameWriter may buffer frames and does not force i/o immediately.

func NewFrameWriter

func NewFrameWriter(w io.Writer, maxFrameBytes int64) *FrameWriter

NewFrameWriter constructs a new FrameWriter for buffering and writing Frames to w. It imposes a message size limit of maxFrameBytes in order to size its internal marshalling buffer.

func (*FrameWriter) Append

func (fw *FrameWriter) Append(f *Frame)

Append adds f to the stream to be written, assuming it can take ownership. Copy f first if need be and do not write into *f after calling Append.

func (*FrameWriter) Flush

func (b *FrameWriter) Flush() error

Flush writes any buffered b.Frames to b.Out.

func (*FrameWriter) Merge

func (fw *FrameWriter) Merge(strms ...*BufferedFrameReader) error

Merge merges the strms input into timestamp order, based on the Frame.Tm() timestamp, and writes the ordered sequence out to the fw.Out io.Writer.
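
Merging several already-ordered streams into one timestamp-ordered sequence is commonly done with a min-heap holding the head of each stream. The sketch below shows that general technique on plain int64 timestamps; it is not taken from this package's implementation, and item, minHeap, and mergeByTime are hypothetical names.

```go
package main

import (
	"container/heap"
	"fmt"
)

// item is the head of one input stream; ts stands in for Frame.Tm().
type item struct {
	ts     int64
	stream int
}

// minHeap orders stream heads by timestamp, smallest first.
type minHeap []item

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].ts < h[j].ts }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// mergeByTime merges already-sorted timestamp streams into one sorted slice.
func mergeByTime(streams ...[]int64) []int64 {
	h := &minHeap{}
	pos := make([]int, len(streams)) // next unread index per stream
	for i, s := range streams {
		if len(s) > 0 {
			heap.Push(h, item{ts: s[0], stream: i})
			pos[i] = 1
		}
	}
	var out []int64
	for h.Len() > 0 {
		it := heap.Pop(h).(item)
		out = append(out, it.ts)
		s := streams[it.stream]
		if pos[it.stream] < len(s) {
			heap.Push(h, item{ts: s[pos[it.stream]], stream: it.stream})
			pos[it.stream]++
		}
	}
	return out
}

func main() {
	merged := mergeByTime([]int64{10, 40, 70}, []int64{20, 30, 80}, []int64{50})
	fmt.Println(merged) // [10 20 30 40 50 70 80]
}
```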

func (*FrameWriter) Sync

func (s *FrameWriter) Sync() error

Sync writes the stream to disk, forcing any pending buffered writes to be persisted on disk.

func (*FrameWriter) Write

func (b *FrameWriter) Write(p []byte) (n int, err error)

Write writes len(p) bytes from p to the underlying FrameWriter.Out. It returns the number of bytes written from p (0 <= n <= len(p)) and any error encountered that caused the write to stop early. Write must return a non-nil error if it returns n < len(p).

func (*FrameWriter) WriteTo

func (b *FrameWriter) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes any buffered frames pending in the FrameWriter to w and returns the number of bytes n written along with any error encountered during writing.

type JsonBytesAsStringExt

type JsonBytesAsStringExt struct{}

JsonBytesAsStringExt allows github.com/ugorji/go/codec to passthrough json bytes without conversion.

func (JsonBytesAsStringExt) ConvertExt

func (x JsonBytesAsStringExt) ConvertExt(v interface{}) interface{}

func (x JsonBytesAsStringExt) WriteExt(interface{}) []byte { panic("unsupported") } func (x JsonBytesAsStringExt) ReadExt(interface{}, []byte) { panic("unsupported") }

func (JsonBytesAsStringExt) UpdateExt

func (x JsonBytesAsStringExt) UpdateExt(dest interface{}, v interface{})

type PTI

type PTI byte

PTI is the Payload Type Indicator. It is the low 3-bits of the Primary word in a TMFRAME message.

const (
	PtiZero       PTI = 0
	PtiOneInt64   PTI = 1
	PtiOneFloat64 PTI = 2
	PtiTwo64      PTI = 3
	PtiNull       PTI = 4
	PtiNA         PTI = 5
	PtiNaN        PTI = 6
	PtiUDE        PTI = 7
)
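
Because the PTI occupies the low 3 bits of the primary word, it can be separated from the timestamp portion with a mask. The following is a minimal sketch of that bit manipulation only; ptiMask, splitPrimary, and joinPrimary are hypothetical helpers, and the sketch does not model how the library rounds or truncates timestamps.

```go
package main

import "fmt"

// PTI mirrors the documented type: the low 3 bits of the primary word.
type PTI byte

// ptiMask selects the low 3 bits (hypothetical helper constant).
const ptiMask = 0x7

// splitPrimary separates a primary word into its PTI and the remaining
// high 61 bits (the timestamp portion, with the low 3 bits cleared).
func splitPrimary(word uint64) (PTI, uint64) {
	return PTI(word & ptiMask), word &^ ptiMask
}

// joinPrimary stores pti in the low 3 bits of a timestamp word.
func joinPrimary(tm uint64, pti PTI) uint64 {
	return (tm &^ ptiMask) | uint64(pti&ptiMask)
}

func main() {
	const tm = 1500000000000000000 // nanoseconds since the Unix epoch
	word := joinPrimary(tm, 2)     // 2 is PtiOneFloat64 in the table above
	pti, back := splitPrimary(word)
	fmt.Println(pti, back == tm)
}
```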

type SearchStatus

type SearchStatus int

SearchStatus is returned by the four search functions FirstInForceBefore(), LastInForceBefore(), FirstAtOrBefore(), and LastAtOrBefore() to indicate the result of the search.

const (
	InPast   SearchStatus = 0
	Avail    SearchStatus = 1
	InFuture SearchStatus = 2
)

func (SearchStatus) String

func (s SearchStatus) String() string

String returns the SearchStatus as a string, for printing.

type Series

type Series struct {
	Frames []*Frame
}

Series represents a set of sequential Frames, which, taken in order, represent points in a timeseries.

FirstInForceBefore(), LastInForceBefore(), FirstAtOrBefore(), and LastAtOrBefore() are the main methods to search for a specific timepoint, or the last event in force before a specific timepoint.

See LastInForceBefore() for the most detailed description of the arguments and return values. The other three functions are analogous.

func GenerateSeriesWithRepeats

func GenerateSeriesWithRepeats(reps []int) *Series

func NewSeriesFromFrames

func NewSeriesFromFrames(fr []*Frame) *Series

NewSeriesFromFrames creates a new Series from a set of Frame pointers.

func (*Series) FirstAtOrBefore

func (s *Series) FirstAtOrBefore(tm time.Time) (*Frame, SearchStatus, int)

FirstAtOrBefore(): looking at the ties for the nearest timestamp s <= tm, return the earliest (first in the presented sequence order) of these ties at s. Nearest means that there is no other timestamp r such that s <= r <= tm.

func (*Series) FirstInForceBefore

func (s *Series) FirstInForceBefore(tm time.Time) (*Frame, SearchStatus, int)

FirstInForceBefore(): looking at the ties for the nearest timestamp s < tm, return the earliest (first in the presented sequence order) of these ties at s. Nearest means that there is no other timestamp r such that s < r < tm.

func (*Series) LastAtOrBefore

func (s *Series) LastAtOrBefore(tm time.Time) (*Frame, SearchStatus, int)

LastAtOrBefore(): looking at the ties for the nearest timestamp s <= tm, return the newest (last in the presented sequence order) of these ties at timestamp s. Nearest means that there is no other timestamp r such that s <= r <= tm.

func (*Series) LastInForceBefore

func (s *Series) LastInForceBefore(tm time.Time) (*Frame, SearchStatus, int)

LastInForceBefore():

If tm is greater than any seen Frame, LastInForceBefore() will return the last seen Frame and a SearchStatus of InFuture.

If tm is smaller than the oldest Frame available, LastInForceBefore will return (nil, InPast). Otherwise, it returns the Frame where Frame.Tm() is strictly before tm (using 10 nanosecond resolution, with tm truncated by the TimeToPrimTm(tm) function).

The 3rd returned argument provides the integer index of the returned frame in s.Frames, or -1 if SearchStatus is InPast.

In summary:

LastInForceBefore(): looking at the ties for the nearest timestamp s < tm, return the most recent (last in the presented sequence order) of these ties at s. Nearest means that there is no other timestamp r such that s < r < tm.
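
The strictly-before search and its handling of ties can be sketched with a binary search over sorted timestamps. This is an illustration under assumptions, not the library's code: it works on plain int64 values rather than Frames, returns only the index and status, and treats an empty input as InPast. The function names lastInForceBefore and firstInForceBefore are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// SearchStatus mirrors the documented constants.
type SearchStatus int

const (
	InPast   SearchStatus = 0
	Avail    SearchStatus = 1
	InFuture SearchStatus = 2
)

// lastInForceBefore searches sorted timestamps ts for the last entry
// strictly before tm, returning its index (or -1) and a status.
func lastInForceBefore(ts []int64, tm int64) (int, SearchStatus) {
	// i is the number of entries with a timestamp < tm.
	i := sort.Search(len(ts), func(k int) bool { return ts[k] >= tm })
	switch {
	case i == 0:
		return -1, InPast
	case i == len(ts):
		return i - 1, InFuture
	default:
		return i - 1, Avail
	}
}

// firstInForceBefore returns the earliest of the ties at the same
// timestamp that lastInForceBefore selects.
func firstInForceBefore(ts []int64, tm int64) (int, SearchStatus) {
	last, status := lastInForceBefore(ts, tm)
	if last < 0 {
		return last, status
	}
	s := ts[last]
	first := sort.Search(len(ts), func(k int) bool { return ts[k] >= s })
	return first, status
}

func main() {
	ts := []int64{10, 20, 20, 30} // two frames tie at 20
	fmt.Println(lastInForceBefore(ts, 25))  // 2 1
	fmt.Println(firstInForceBefore(ts, 25)) // 1 1
	fmt.Println(lastInForceBefore(ts, 10))  // -1 0
}
```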

type Syncable

type Syncable interface {
	Sync() error
}

Syncable allows us to sync os.File to disk, if they are in use in FrameStream.Out

type TfcatConfig

type TfcatConfig struct {
	PrettyPrint         bool
	SkipPayload         bool
	Follow              bool
	RawCount            int
	RawSkip             int
	ReadStdin           bool
	Rreadable           bool
	ZebraPackSchemaPath string

	ZebraSchema zebra.Schema
}

TfcatConfig configures the tfcat command utility.

func (*TfcatConfig) DefineFlags

func (c *TfcatConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TfcatConfig) ValidateConfig

func (c *TfcatConfig) ValidateConfig() error

Call c.ValidateConfig() after myflags.Parse().
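
The call order shared by the config types (DefineFlags, then Parse, then ValidateConfig) can be sketched with the standard flag package. The demoConfig type, its flag names, and its validation rule are hypothetical and chosen only to show the sequence; they do not reflect the actual flags of tfcat or any other utility here.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// demoConfig is a hypothetical config following the same two-step pattern.
type demoConfig struct {
	RawCount int
	RawSkip  int
}

// DefineFlags registers the flags on fs; call it before fs.Parse().
func (c *demoConfig) DefineFlags(fs *flag.FlagSet) {
	fs.IntVar(&c.RawCount, "count", 0, "number of frames to show (hypothetical flag)")
	fs.IntVar(&c.RawSkip, "skip", 0, "number of frames to skip (hypothetical flag)")
}

// ValidateConfig checks the parsed values; call it after fs.Parse().
func (c *demoConfig) ValidateConfig() error {
	if c.RawCount < 0 || c.RawSkip < 0 {
		return errors.New("count and skip must be non-negative")
	}
	return nil
}

// parseArgs runs the sequence: DefineFlags, Parse, ValidateConfig.
func parseArgs(args []string) (*demoConfig, error) {
	cfg := &demoConfig{}
	myflags := flag.NewFlagSet("demo", flag.ContinueOnError)
	cfg.DefineFlags(myflags)
	if err := myflags.Parse(args); err != nil {
		return nil, err
	}
	if err := cfg.ValidateConfig(); err != nil {
		return nil, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs([]string{"-count", "5", "-skip", "2"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.RawCount, cfg.RawSkip)
}
```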

type TfdedupConfig

type TfdedupConfig struct {
	WriteDupsToFile string
	WindowSize      int
	DetectOnly      bool
}

TfdedupConfig configures the tfdedup command utility.

func (*TfdedupConfig) DefineFlags

func (c *TfdedupConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TfdedupConfig) ValidateConfig

func (c *TfdedupConfig) ValidateConfig() error

Call c.ValidateConfig() after myflags.Parse().

type TffilterConfig

type TffilterConfig struct {
	Help           bool
	ExcludeMatches bool
	RegexFile      string
	Any            bool
	Sub            bool
}

func (*TffilterConfig) DefineFlags

func (c *TffilterConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TffilterConfig) ValidateConfig

func (c *TffilterConfig) ValidateConfig() error

type TfgroupConfig

type TfgroupConfig struct {
	Help          bool
	GroupInterval string // "min, sec, hour, ..."
}

func (*TfgroupConfig) DefineFlags

func (c *TfgroupConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TfgroupConfig) ValidateConfig

func (c *TfgroupConfig) ValidateConfig() error

type TfindexConfig

type TfindexConfig struct {
}

TfindexConfig configures the tfindex command utility.

func (*TfindexConfig) DefineFlags

func (c *TfindexConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TfindexConfig) ValidateConfig

func (c *TfindexConfig) ValidateConfig() error

Call c.ValidateConfig() after myflags.Parse().

type TfsortConfig

type TfsortConfig struct {
	KeepTmpFiles bool
}

TfsortConfig configures the tfsort command utility.

func (*TfsortConfig) DefineFlags

func (c *TfsortConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TfsortConfig) ValidateConfig

func (c *TfsortConfig) ValidateConfig() error

Call c.ValidateConfig() after myflags.Parse().

type TfsumConfig

type TfsumConfig struct {
	Help bool
}

func (*TfsumConfig) DefineFlags

func (c *TfsumConfig) DefineFlags(fs *flag.FlagSet)

Call DefineFlags before myflags.Parse().

func (*TfsumConfig) ValidateConfig

func (c *TfsumConfig) ValidateConfig() error

type TimeExt

type TimeExt struct{}

TimeExt allows github.com/ugorji/go/codec to understand Go time.Time

func (TimeExt) ConvertExt

func (x TimeExt) ConvertExt(v interface{}) interface{}

func (TimeExt) ReadExt

func (x TimeExt) ReadExt(interface{}, []byte)

func (TimeExt) UpdateExt

func (x TimeExt) UpdateExt(dest interface{}, v interface{})

func (TimeExt) WriteExt

func (x TimeExt) WriteExt(interface{}) []byte

type TimeSorter

type TimeSorter []*Frame

TimeSorter is used to do the merge sort.

func (TimeSorter) Len

func (p TimeSorter) Len() int

Len is the sorting Len function.

func (TimeSorter) Less

func (p TimeSorter) Less(i, j int) bool

Less is the sorting Less function.

func (TimeSorter) Swap

func (p TimeSorter) Swap(i, j int)

Swap is the sorting Swap function.
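
Len, Less, and Swap together satisfy sort.Interface, which is what lets a slice type such as TimeSorter be passed to the standard sort package. The sketch below shows the pattern with stand-in types; frame, byTime, and sortByTime are hypothetical, the tm field plays the role of Frame.Tm(), and the choice of sort.Stable (to keep ties in their presented order) is an assumption rather than something the documentation states.

```go
package main

import (
	"fmt"
	"sort"
)

// frame is a hypothetical stand-in; tm plays the role of Frame.Tm().
type frame struct {
	tm   int64
	name string
}

// byTime implements sort.Interface over a slice of frame pointers.
type byTime []*frame

func (p byTime) Len() int           { return len(p) }
func (p byTime) Less(i, j int) bool { return p[i].tm < p[j].tm }
func (p byTime) Swap(i, j int)      { p[i], p[j] = p[j], p[i] }

// sortByTime orders fs by timestamp, keeping ties in their input order.
func sortByTime(fs []*frame) {
	sort.Stable(byTime(fs))
}

func main() {
	fs := []*frame{{30, "c"}, {10, "a"}, {20, "b1"}, {20, "b2"}}
	sortByTime(fs)
	for _, f := range fs {
		fmt.Println(f.tm, f.name)
	}
}
```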

Directories

Path Synopsis
cmd
