packfile

package
v0.9.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 27, 2021 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Overview

Package packfile provides types for operating on Git packfiles. Packfiles are used for storing Git objects on disk and when sending Git objects over the network. The format is described in https://git-scm.com/docs/pack-format.

Objects in a packfile may be either stored in their entirety or stored in a "deltified" representation. Deltified objects are stored as a patch on top of another object.

Example
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"

	"gg-scm.io/pkg/git/githash"
	"gg-scm.io/pkg/git/packfile"
)

func main() {
	// Open a packfile.
	file, err := os.Open(filepath.Join("testdata", "DeltaObject.pack"))
	if err != nil {
		// handle error
	}
	fileInfo, err := file.Stat()
	if err != nil {
		// handle error
	}

	// Index the packfile.
	idx, err := packfile.BuildIndex(file, fileInfo.Size(), nil)
	if err != nil {
		// handle error
	}

	// Find the position of an object.
	commitID, err := githash.ParseSHA1("45c3b785642598057cf65b79fd05586dae5cba10")
	if err != nil {
		// handle error
	}
	i := idx.FindID(commitID)
	if i == -1 {
		// handle not-found error
	}

	// Read the object from the packfile.
	undeltifier := new(packfile.Undeltifier)
	bufferedFile := packfile.NewBufferedReadSeeker(file)
	prefix, content, err := undeltifier.Undeltify(bufferedFile, idx.Offsets[i], &packfile.UndeltifyOptions{
		Index: idx,
	})
	if err != nil {
		// handle error
	}
	fmt.Println(prefix)
	io.Copy(os.Stdout, content)

}
Output:

blob 13
Hello, delta

Constants

This section is empty.

Variables

This section is empty.

Functions

func DeltaObjectSize

func DeltaObjectSize(delta ByteReader) (int64, error)

DeltaObjectSize calculates the size of an object constructed from delta instructions.

func ResolveType added in v0.9.0

func ResolveType(f ByteReadSeeker, offset int64, opts *UndeltifyOptions) (object.Type, error)

ResolveType determines the type of the object located at the given offset. If the object is deltified, then it follows the delta base object references until it encounters a non-delta object and returns its type.
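The chain-following idea can be sketched without the package. The entry type, the in-memory map, and resolveType below are hypothetical stand-ins for illustration only, not the package's implementation, and the sketch handles only offset deltas (type 6).

```go
package main

import (
	"errors"
	"fmt"
)

// entry is a hypothetical, simplified view of one packfile object:
// its packfile type number and, for offset deltas, where its base starts.
type entry struct {
	typ        int8  // 1=commit, 2=tree, 3=blob, 4=tag, 6=offset delta
	baseOffset int64 // only meaningful when typ == offsetDelta
}

const offsetDelta = 6

// resolveType walks base references starting at offset until it reaches
// an object that is not a delta, then reports that object's type.
func resolveType(entries map[int64]entry, offset int64) (int8, error) {
	// A valid chain visits each object at most once, so taking more hops
	// than there are entries means the references form a cycle.
	for hops := 0; hops <= len(entries); hops++ {
		e, ok := entries[offset]
		if !ok {
			return 0, fmt.Errorf("no object at offset %d", offset)
		}
		if e.typ != offsetDelta {
			return e.typ, nil
		}
		offset = e.baseOffset
	}
	return 0, errors.New("delta chain contains a cycle")
}

func main() {
	entries := map[int64]entry{
		12: {typ: 3},
		40: {typ: offsetDelta, baseOffset: 12},
		75: {typ: offsetDelta, baseOffset: 40},
	}
	typ, err := resolveType(entries, 75)
	fmt.Println(typ, err) // 3 <nil>
}
```

A delta that refers to its base by object ID would first need an ID-to-offset lookup, which is presumably why UndeltifyOptions carries an Index.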

Types

type BufferedReadSeeker added in v0.9.0

type BufferedReadSeeker struct {
	// contains filtered or unexported fields
}

BufferedReadSeeker implements buffering for an io.ReadSeeker object.

func NewBufferedReadSeeker added in v0.9.0

func NewBufferedReadSeeker(r io.ReadSeeker) *BufferedReadSeeker

NewBufferedReadSeeker returns a new BufferedReadSeeker whose buffer has the default size.

func NewBufferedReadSeekerSize added in v0.9.0

func NewBufferedReadSeekerSize(r io.ReadSeeker, size int) *BufferedReadSeeker

NewBufferedReadSeekerSize returns a new BufferedReadSeeker whose buffer has at least the specified size.

func (*BufferedReadSeeker) Read added in v0.9.0

func (rs *BufferedReadSeeker) Read(p []byte) (int, error)

Read reads data into p. The bytes are taken from at most one Read on the underlying Reader, hence the returned count may be less than len(p). To read exactly len(p) bytes, use io.ReadFull(rs, p). At EOF, the count will be zero and err will be io.EOF.

func (*BufferedReadSeeker) ReadByte added in v0.9.0

func (rs *BufferedReadSeeker) ReadByte() (byte, error)

ReadByte reads and returns a single byte. If no byte is available, returns an error.

func (*BufferedReadSeeker) Seek added in v0.9.0

func (rs *BufferedReadSeeker) Seek(offset int64, whence int) (int64, error)

Seek implements the io.Seeker interface.

type ByteReadSeeker added in v0.9.0

type ByteReadSeeker interface {
	io.Reader
	io.ByteReader
	io.Seeker
}

ByteReadSeeker is the interface that groups the io.Reader, io.ByteReader, and io.Seeker interfaces.

type ByteReader

type ByteReader interface {
	io.Reader
	io.ByteReader
}

ByteReader is a combination of io.Reader and io.ByteReader.

type DeltaReader added in v0.9.0

type DeltaReader struct {
	// contains filtered or unexported fields
}

DeltaReader decompresses a deltified object from a packfile. See details at https://git-scm.com/docs/pack-format#_deltified_representation

func NewDeltaReader added in v0.9.0

func NewDeltaReader(base io.ReadSeeker, delta ByteReader) *DeltaReader

NewDeltaReader returns a new DeltaReader that applies the given delta to a base object.

func (*DeltaReader) Read added in v0.9.0

func (d *DeltaReader) Read(p []byte) (int, error)

Read implements io.Reader by decompressing the deltified object.

func (*DeltaReader) Size added in v0.9.0

func (d *DeltaReader) Size() (int64, error)

Size returns the expected size of the decompressed bytes as reported by the delta header. Use DeltaObjectSize to determine the precise number of bytes that the DeltaReader will produce.
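To make the difference between the declared size and the bytes actually produced concrete, here is a minimal sketch of applying a delta entirely in memory, following the copy and insert instructions described in the Git pack-format documentation. readVarint and applyDelta are hypothetical helpers written for this illustration; they are not the package's API, and unlike DeltaReader they do not stream.

```go
package main

import (
	"errors"
	"fmt"
)

var errTruncated = errors.New("truncated delta")

// readVarint decodes a little-endian base-128 size from the start of b
// and returns the value and the number of bytes consumed.
func readVarint(b []byte) (value uint64, n int, err error) {
	for i, c := range b {
		if i == 10 {
			return 0, 0, errors.New("size varint too long")
		}
		value |= uint64(c&0x7f) << (7 * uint(i))
		if c&0x80 == 0 {
			return value, i + 1, nil
		}
	}
	return 0, 0, errTruncated
}

// applyDelta returns the reconstructed object and the result size that
// the delta header declares. The two can disagree for a malformed delta.
func applyDelta(base, delta []byte) ([]byte, uint64, error) {
	baseSize, n, err := readVarint(delta)
	if err != nil {
		return nil, 0, err
	}
	if baseSize != uint64(len(base)) {
		return nil, 0, errors.New("base size mismatch")
	}
	delta = delta[n:]
	declared, n, err := readVarint(delta)
	if err != nil {
		return nil, 0, err
	}
	delta = delta[n:]

	var out []byte
	for len(delta) > 0 {
		op := delta[0]
		delta = delta[1:]
		switch {
		case op&0x80 != 0:
			// Copy from base: bits 0-3 select which offset bytes follow,
			// bits 4-6 select which size bytes follow.
			var offset, size uint32
			for bit := uint(0); bit < 7; bit++ {
				if op&(byte(1)<<bit) == 0 {
					continue
				}
				if len(delta) == 0 {
					return nil, 0, errTruncated
				}
				if bit < 4 {
					offset |= uint32(delta[0]) << (8 * bit)
				} else {
					size |= uint32(delta[0]) << (8 * (bit - 4))
				}
				delta = delta[1:]
			}
			if size == 0 {
				size = 0x10000 // a zero size means 65536 bytes
			}
			end := uint64(offset) + uint64(size)
			if end > uint64(len(base)) {
				return nil, 0, errors.New("copy exceeds base object")
			}
			out = append(out, base[offset:end]...)
		case op != 0:
			// Insert: the next op bytes of the delta are literal data.
			n := int(op)
			if len(delta) < n {
				return nil, 0, errTruncated
			}
			out = append(out, delta[:n]...)
			delta = delta[n:]
		default:
			return nil, 0, errors.New("reserved delta instruction 0")
		}
	}
	return out, declared, nil
}

func main() {
	base := []byte("Hello, world")
	// Base size 12, result size 12, copy 7 bytes from offset 0, insert "delta".
	delta := []byte{0x0c, 0x0c, 0x90, 0x07, 0x05, 'd', 'e', 'l', 't', 'a'}
	out, declared, err := applyDelta(base, delta)
	fmt.Println(string(out), declared, err) // Hello, delta 12 <nil>
}
```

Comparing len(out) with the declared size is a cheap consistency check on untrusted deltas.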

type Header

type Header struct {
	// Offset is the location in the packfile this object starts at. It can be
	// used as a key for BaseOffset. Writer ignores this field.
	Offset int64

	Type ObjectType

	// Size is the uncompressed size of the object in bytes.
	Size int64

	// BaseOffset is the Offset of a previous Header for an OffsetDelta type object.
	BaseOffset int64
	// BaseObject is the hash of an object for a RefDelta type object.
	BaseObject githash.SHA1
}

A Header holds a single object header in a packfile.
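For orientation, the Type and Size fields correspond to a variable-length encoding at the start of each packed object, as described in the Git pack-format documentation. The sketch below decodes just those two fields from a byte slice; parseObjectHeader is a hypothetical helper for illustration and does not decode the base reference that follows for delta types.

```go
package main

import (
	"errors"
	"fmt"
)

// parseObjectHeader decodes the type and uncompressed size from the
// start of a packed object and reports how many bytes were consumed.
func parseObjectHeader(b []byte) (typ int, size int64, n int, err error) {
	if len(b) == 0 {
		return 0, 0, 0, errors.New("empty object header")
	}
	c := b[0]
	typ = int(c>>4) & 7     // bits 6-4 hold the object type
	size = int64(c & 0x0f)  // bits 3-0 hold the low 4 bits of the size
	n = 1
	// While the high bit is set, each following byte adds 7 more size bits.
	for shift := uint(4); c&0x80 != 0; shift += 7 {
		if n >= len(b) {
			return 0, 0, 0, errors.New("truncated object header")
		}
		if shift > 56 {
			return 0, 0, 0, errors.New("object size overflows int64")
		}
		c = b[n]
		n++
		size |= int64(c&0x7f) << shift
	}
	return typ, size, n, nil
}

func main() {
	// 0x3e: no continuation bit, type 3 (blob), size 14.
	typ, size, n, err := parseObjectHeader([]byte{0x3e})
	fmt.Println(typ, size, n, err) // 3 14 1 <nil>

	// 0x9c 0x12: type 1 (commit), size 12 + 18<<4 = 300.
	typ, size, n, err = parseObjectHeader([]byte{0x9c, 0x12})
	fmt.Println(typ, size, n, err) // 1 300 2 <nil>
}
```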

func ReadHeader added in v0.9.0

func ReadHeader(offset int64, r ByteReader) (*Header, error)

ReadHeader reads a packfile object header from r. The returned Header's Offset field will be set to the given offset. If ReadHeader does not return an error, the data of the object will be available on r as a zlib-compressed stream.

Example

This example uses ReadHeader to perform random access in a packfile.

package main

import (
	"bufio"
	"compress/zlib"
	"fmt"
	"io"
	"os"
	"path/filepath"

	"gg-scm.io/pkg/git/packfile"
)

func main() {
	// Open a packfile.
	f, err := os.Open(filepath.Join("testdata", "FirstCommit.pack"))
	if err != nil {
		// handle error
	}

	// Seek to a specific offset. You can get this from an index or a previous read.
	const offset = 12
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		// handle error
	}

	// Read the object and its header.
	reader := bufio.NewReader(f)
	hdr, err := packfile.ReadHeader(offset, reader)
	if err != nil {
		// handle error
	}
	fmt.Println(hdr.Type)
	// The object is zlib-compressed in the packfile after the header.
	zreader, err := zlib.NewReader(reader)
	if err != nil {
		// handle error
	}
	if _, err := io.Copy(os.Stdout, zreader); err != nil {
		// handle error
	}

}
Output:

OBJ_BLOB
Hello, World!

type Index added in v0.9.0

type Index struct {
	// ObjectIDs is a sorted list of object IDs in the packfile.
	ObjectIDs []githash.SHA1
	// Offsets holds the offsets from the start of the packfile that an object
	// header starts at. The i'th element of Offsets corresponds with the
	// i'th element of ObjectIDs.
	Offsets []int64
	// PackedChecksums holds the CRC32 checksums of each packfile object header
	// and its zlib-compressed contents. The i'th element of PackedChecksums
	// corresponds with the i'th element of ObjectIDs. Version 1 index files do
	// not have this information.
	PackedChecksums []uint32
	// PackfileSHA1 is a copy of the SHA-1 hash present at the end of the packfile.
	PackfileSHA1 githash.SHA1
}

Index is an in-memory mapping of object IDs to offsets within a packfile. This maps 1:1 with index files produced by git-index-pack(1). (*Index)(nil) is treated the same as the Index for an empty packfile.

Example
package main

import (
	"fmt"
	"os"
	"path/filepath"

	"gg-scm.io/pkg/git/packfile"
)

func main() {
	// Open a packfile.
	file, err := os.Open(filepath.Join("testdata", "FirstCommit.pack"))
	if err != nil {
		// handle error
	}
	fileInfo, err := file.Stat()
	if err != nil {
		// handle error
	}

	// Index the packfile.
	idx, err := packfile.BuildIndex(file, fileInfo.Size(), nil)
	if err != nil {
		// handle error
	}

	// Print a sorted list of all objects in the packfile.
	for _, id := range idx.ObjectIDs {
		fmt.Println(id)
	}

}
Output:

8ab686eafeb1f44702738c8b0f24f2567c36da6d
aef8a4c3fe8d296dec2d9b88d4654cd596927867
bc225ea23f53f06c0c5bd3ba2be85c2120d68417

func BuildIndex added in v0.9.0

func BuildIndex(f io.ReaderAt, fileSize int64, opts *IndexOptions) (*Index, error)

BuildIndex indexes a packfile. This is equivalent to running git-index-pack(1) on the packfile.

func ReadIndex added in v0.9.0

func ReadIndex(r io.Reader) (*Index, error)

ReadIndex parses a packfile index file from r. It performs no buffering and will not read more bytes than necessary.

func (*Index) EncodeV1 added in v0.9.0

func (idx *Index) EncodeV1(w io.Writer) error

EncodeV1 writes idx in Git's packfile index version 1 format. This generally should only be used for compatibility, since the version 1 format does not store PackedChecksums and does not support packfiles larger than 4 GiB.

func (*Index) EncodeV2 added in v0.9.0

func (idx *Index) EncodeV2(w io.Writer) error

EncodeV2 writes idx in Git's packfile index version 2 format.

func (*Index) FindID added in v0.9.0

func (idx *Index) FindID(id githash.SHA1) int

FindID finds the position of id in idx.ObjectIDs or -1 if the ID is not present in the index. The result is undefined if idx.ObjectIDs is not sorted. This search is O(log len(idx.ObjectIDs)).
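The logarithmic bound follows from binary search over the sorted IDs. A minimal sketch of that lookup with the standard library, using a local 20-byte array type as a stand-in for githash.SHA1 (the id type and findID function are hypothetical):

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// id is a local stand-in for a 20-byte SHA-1 object ID.
type id [20]byte

// findID returns the position of want in the sorted slice ids,
// or -1 if it is not present.
func findID(ids []id, want id) int {
	// sort.Search finds the first element that is >= want.
	i := sort.Search(len(ids), func(i int) bool {
		return bytes.Compare(ids[i][:], want[:]) >= 0
	})
	if i < len(ids) && ids[i] == want {
		return i
	}
	return -1
}

func main() {
	ids := []id{{0x8a}, {0xae}, {0xbc}} // already sorted
	fmt.Println(findID(ids, id{0xae})) // 1
	fmt.Println(findID(ids, id{0xff})) // -1
}
```

If the slice is not sorted, the search can skip over the element it is looking for, which matches the "result is undefined" caveat above.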

func (*Index) Len added in v0.9.0

func (idx *Index) Len() int

Len returns the number of objects in the index.

func (*Index) Less added in v0.9.0

func (idx *Index) Less(i, j int) bool

Less returns whether the i'th object ID is lexicographically less than the j'th object ID.

func (*Index) MarshalBinary added in v0.9.0

func (idx *Index) MarshalBinary() ([]byte, error)

MarshalBinary encodes the index in Git's packfile index version 2 format.

func (*Index) Swap added in v0.9.0

func (idx *Index) Swap(i, j int)

Swap swaps the i'th and j'th rows of the index.

func (*Index) UnmarshalBinary added in v0.9.0

func (idx *Index) UnmarshalBinary(data []byte) error

UnmarshalBinary decodes Git's packfile index format into idx.

type IndexOptions added in v0.9.0

type IndexOptions struct {
}

IndexOptions holds optional arguments to BuildIndex.

type ObjectType

type ObjectType int8

An ObjectType holds the type of an object inside a packfile.

const (
	Commit ObjectType = 1
	Tree   ObjectType = 2
	Blob   ObjectType = 3
	Tag    ObjectType = 4

	OffsetDelta ObjectType = 6
	RefDelta    ObjectType = 7
)

Object types

func (ObjectType) NonDelta added in v0.9.0

func (typ ObjectType) NonDelta() object.Type

NonDelta returns the Git object type that the packfile object type represents or the empty string if the type represents a deltified object.

func (ObjectType) String

func (t ObjectType) String() string

String returns the Git object type constant name like "OBJ_COMMIT".
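As an illustration of the two views of a packfile type, the sketch below defines a local objectType with the same numeric values and maps it both to the OBJ_ names used in the Git pack-format documentation and to plain object type names. Everything here is a hypothetical stand-in: the lowercase names assume that object.Type values look like "blob", and the fallback string for unknown values is invented for the example.

```go
package main

import "fmt"

// objectType is a local stand-in for the package's ObjectType.
type objectType int8

const (
	commit      objectType = 1
	tree        objectType = 2
	blob        objectType = 3
	tag         objectType = 4
	offsetDelta objectType = 6
	refDelta    objectType = 7
)

// String returns the constant name used in Git's pack-format documentation.
func (t objectType) String() string {
	switch t {
	case commit:
		return "OBJ_COMMIT"
	case tree:
		return "OBJ_TREE"
	case blob:
		return "OBJ_BLOB"
	case tag:
		return "OBJ_TAG"
	case offsetDelta:
		return "OBJ_OFS_DELTA"
	case refDelta:
		return "OBJ_REF_DELTA"
	default:
		return fmt.Sprintf("objectType(%d)", int8(t))
	}
}

// nonDelta returns the plain object type name, or "" for delta types.
func (t objectType) nonDelta() string {
	switch t {
	case commit:
		return "commit"
	case tree:
		return "tree"
	case blob:
		return "blob"
	case tag:
		return "tag"
	}
	return ""
}

func main() {
	for _, t := range []objectType{blob, refDelta} {
		fmt.Printf("%v %q\n", t, t.nonDelta())
	}
	// Output:
	// OBJ_BLOB "blob"
	// OBJ_REF_DELTA ""
}
```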

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader reads a packfile serially.

Use ReadHeader if you want random access in a packfile.
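A serial read starts at the 12-byte packfile header, which is why the first object in a packfile begins at offset 12. The sketch below parses that header as laid out in the Git pack-format documentation; parsePackHeader is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// packHeaderSize is the length of the signature, version, and object count.
const packHeaderSize = 12

// parsePackHeader validates the "PACK" signature and returns the format
// version and the number of objects the packfile claims to contain.
func parsePackHeader(b []byte) (version, objectCount uint32, err error) {
	if len(b) < packHeaderSize {
		return 0, 0, errors.New("packfile header too short")
	}
	if string(b[:4]) != "PACK" {
		return 0, 0, errors.New("missing PACK signature")
	}
	// Both integers are stored in network (big-endian) byte order.
	version = binary.BigEndian.Uint32(b[4:8])
	if version != 2 && version != 3 {
		return 0, 0, fmt.Errorf("unsupported packfile version %d", version)
	}
	return version, binary.BigEndian.Uint32(b[8:12]), nil
}

func main() {
	hdr := []byte{'P', 'A', 'C', 'K', 0, 0, 0, 2, 0, 0, 0, 3}
	version, count, err := parsePackHeader(hdr)
	fmt.Println(version, count, err) // 2 3 <nil>
}
```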

func NewReader

func NewReader(r ByteReader) *Reader

NewReader returns a Reader that reads from the given stream.

func (*Reader) Next

func (r *Reader) Next() (*Header, error)

Next advances to the next object in the packfile. The Header.Size determines how many bytes can be read for the next object. Any remaining data in the current object is automatically discarded.

io.EOF is returned at the end of the input.

func (*Reader) Read

func (r *Reader) Read(p []byte) (int, error)

Read reads from the current object in the packfile. It returns (0, io.EOF) when it reaches the end of that object, until Next is called to advance to the next object.

type Undeltifier added in v0.9.0

type Undeltifier struct {
	// contains filtered or unexported fields
}

An Undeltifier decompresses deltified objects in a packfile. The zero value is a valid Undeltifier. Undeltifiers have cached internal state, so Undeltifiers should be reused instead of created as needed.

For more information on deltification, see https://git-scm.com/docs/pack-format#_deltified_representation

func (*Undeltifier) Undeltify added in v0.9.0

func (u *Undeltifier) Undeltify(f ByteReadSeeker, offset int64, opts *UndeltifyOptions) (object.Prefix, io.Reader, error)

Undeltify decompresses the object at the given offset from the beginning of the packfile, undeltifying the object if needed. The returned io.Reader may read from f, so the caller should not use f until they are done reading from the returned io.Reader.

type UndeltifyOptions added in v0.9.0

type UndeltifyOptions struct {
	// Index allows the undeltify operation to resolve delta object base ID
	// references within the same packfile.
	Index *Index
}

UndeltifyOptions contains optional parameters for processing deltified objects.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer writes a packfile.

Example
package main

import (
	"bytes"
	"io"
	"strings"
	"time"

	"gg-scm.io/pkg/git/object"
	"gg-scm.io/pkg/git/packfile"
)

func main() {
	// Create a writer.
	buf := new(bytes.Buffer)
	const objectCount = 3
	writer := packfile.NewWriter(buf, objectCount)

	// Write a blob.
	const blobContent = "Hello, World!\n"
	_, err := writer.WriteHeader(&packfile.Header{
		Type: packfile.Blob,
		Size: int64(len(blobContent)),
	})
	if err != nil {
		// handle error
	}
	if _, err := io.WriteString(writer, blobContent); err != nil {
		// handle error
	}
	blobSum, err := object.BlobSum(strings.NewReader(blobContent), int64(len(blobContent)))
	if err != nil {
		// handle error
	}

	// Write a tree (directory).
	tree := object.Tree{
		{Name: "hello.txt", Mode: object.ModePlain, ObjectID: blobSum},
	}
	treeData, err := tree.MarshalBinary()
	if err != nil {
		// handle error
	}
	_, err = writer.WriteHeader(&packfile.Header{
		Type: packfile.Tree,
		Size: int64(len(treeData)),
	})
	if err != nil {
		// handle error
	}
	if _, err := writer.Write(treeData); err != nil {
		// handle error
	}

	// Write a commit.
	const user object.User = "Octocat <octocat@example.com>"
	commitTime := time.Unix(1608391559, 0).In(time.FixedZone("-0800", -8*60*60))
	commit := &object.Commit{
		Tree:       tree.SHA1(),
		Author:     user,
		AuthorTime: commitTime,
		Committer:  user,
		CommitTime: commitTime,
		Message:    "First commit\n",
	}
	commitData, err := commit.MarshalBinary()
	if err != nil {
		// handle error
	}
	_, err = writer.WriteHeader(&packfile.Header{
		Type: packfile.Commit,
		Size: int64(len(commitData)),
	})
	if err != nil {
		// handle error
	}
	if _, err := writer.Write(commitData); err != nil {
		// handle error
	}

	// Finish the write.
	if err := writer.Close(); err != nil {
		// handle error
	}
}
Output:

func NewWriter

func NewWriter(w io.Writer, objectCount uint32) *Writer

NewWriter returns a Writer that writes to the given stream. It is the caller's responsibility to call Close on the returned Writer after the last object has been written.

func (*Writer) Close

func (w *Writer) Close() error

Close closes the packfile by writing the trailer. If the current object (from a prior call to WriteHeader) is not fully written or WriteHeader has been called fewer times than the object count passed to NewWriter, Close returns an error. This method does not close the underlying writer.

func (*Writer) Write

func (w *Writer) Write(p []byte) (n int, err error)

Write writes to the current object in the packfile. Write returns an error if more than Header.Size bytes are written after WriteHeader.

func (*Writer) WriteHeader

func (w *Writer) WriteHeader(hdr *Header) (offset int64, err error)

WriteHeader writes hdr and prepares to accept the object's contents. WriteHeader returns the offset of the header from the beginning of the stream. The Header.Size determines how many bytes can be written for the next object. If the current object is not fully written or WriteHeader has been called more times than the object count passed to NewWriter, WriteHeader returns an error.

Directories

Path Synopsis
Package client provides a Git packfile protocol client for sending and receiving Git objects.
