Documentation
Overview ¶
Package s3reader provides io.ReadSeeker, io.ReaderAt, and io.WriterTo implementations using S3 ranged GetObject.
Constants ¶
const (
// DefaultThreshold is the default value for Options.Threshold.
//
// S3's [Recommendation] is actually 8MB-16MB.
//
// [Recommendation]: https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html
DefaultThreshold = int64(5 * 1024 * 1024)
// DefaultConcurrency is the default value for Options.Concurrency.
DefaultConcurrency = 3
// DefaultPartSize is the default value for Options.PartSize.
//
// S3's [Recommendation] is actually 8MB-16MB.
//
// [Recommendation]: https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html
DefaultPartSize = int64(5 * 1024 * 1024)
// DefaultBufferSize is the default value for Options.BufferSize.
DefaultBufferSize = 1024 * 1024
)
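To make the interplay of the threshold and part size concrete, here is a minimal sketch of how a read could be split into HTTP Range header values. The helper `splitRange` and the lowercase constants are hypothetical and not part of s3reader; the package's real splitting logic is not shown in this documentation and may differ, for example in how a range exactly equal to the threshold is treated.

```go
package main

import "fmt"

// Local copies of the documented default values (assumption: used here only
// for illustration, the package exports DefaultThreshold and DefaultPartSize).
const (
	defaultThreshold = int64(5 * 1024 * 1024)
	defaultPartSize  = int64(5 * 1024 * 1024)
)

// splitRange is a hypothetical helper, not part of s3reader. It returns the
// Range header values needed to fetch n bytes starting at offset off.
func splitRange(off, n, threshold, partSize int64) []string {
	if n <= 0 || partSize <= 0 {
		return nil
	}
	last := off + n - 1
	// A range shorter than the threshold is served by a single request.
	if n < threshold {
		return []string{fmt.Sprintf("bytes=%d-%d", off, last)}
	}
	var ranges []string
	for start := off; start <= last; start += partSize {
		end := start + partSize - 1
		if end > last {
			end = last // the final part may be shorter than partSize
		}
		ranges = append(ranges, fmt.Sprintf("bytes=%d-%d", start, end))
	}
	return ranges
}

func main() {
	// A 12 MiB read from offset 0 is split into parts of 5 MiB, 5 MiB, and 2 MiB.
	for _, r := range splitRange(0, 12*1024*1024, defaultThreshold, defaultPartSize) {
		fmt.Println(r)
	}
}
```

Each returned string has the form accepted by the `Range` field of a GetObject request.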
Variables ¶
var (
// ErrSeekBeforeFirstByte is returned by Reader.Seek if the parameters would end up moving the internal read
// offset to a negative number.
ErrSeekBeforeFirstByte = errors.New("seek ends up before first byte")
// ErrSeekPastLastByte is returned by Reader.Seek if the parameters would end up moving the internal read
// offset past the offset of the last byte (Reader.Size-1).
ErrSeekPastLastByte = errors.New("seek ends up past of last byte")
// ErrClosed is returned by all Reader read methods after Close returns.
ErrClosed = errors.New("reader already closed")
)
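The two seek errors can be illustrated with a small sketch of offset validation. `resolveSeek` and the lowercase error variables are hypothetical stand-ins written for this example; the bounds follow a literal reading of the comments above (any offset greater than Size-1 is rejected), and the package's actual Seek may behave differently at the boundary.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Local stand-ins for ErrSeekBeforeFirstByte and ErrSeekPastLastByte.
var (
	errSeekBeforeFirstByte = errors.New("seek ends up before first byte")
	errSeekPastLastByte    = errors.New("seek ends up past of last byte")
)

// resolveSeek is a hypothetical helper, not part of s3reader. Given the
// current offset cur and the object size, it computes the offset a Seek
// call would move to, or returns an error and leaves the offset unchanged.
func resolveSeek(cur, size, offset int64, whence int) (int64, error) {
	var next int64
	switch whence {
	case io.SeekStart:
		next = offset
	case io.SeekCurrent:
		next = cur + offset
	case io.SeekEnd:
		next = size + offset
	default:
		return cur, fmt.Errorf("unknown whence %d", whence)
	}
	if next < 0 {
		return cur, errSeekBeforeFirstByte
	}
	// Assumption: the last valid offset is size-1, as the comment on
	// ErrSeekPastLastByte suggests.
	if next > size-1 {
		return cur, errSeekPastLastByte
	}
	return next, nil
}

func main() {
	const size = 100
	fmt.Println(resolveSeek(0, size, -1, io.SeekEnd))    // offset of the last byte
	fmt.Println(resolveSeek(0, size, -1, io.SeekStart))  // before the first byte
	fmt.Println(resolveSeek(50, size, 50, io.SeekCurrent)) // past the last byte
}
```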
Functions ¶
func WithProgressBar ¶
func WithProgressBar(options ...progressbar.Option) func(*Options)
WithProgressBar adds a progress bar that displays download progress.
You can also use `r.WriteTo(io.MultiWriter(file, bar))` if you're using Reader.WriteTo to write to file.
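The `io.MultiWriter` approach mentioned above can be sketched with the standard library alone. In this example a `strings.Reader` stands in for Reader (both implement io.WriterTo), a `bytes.Buffer` stands in for the file, and `countingWriter` is a hypothetical substitute for a progress bar that only counts bytes.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// countingWriter is a hypothetical stand-in for a progress bar: it records
// how many bytes have passed through it and discards the data.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// copyWithProgress drains src into dst and mirrors every write to progress.
func copyWithProgress(src io.WriterTo, dst, progress io.Writer) (int64, error) {
	return src.WriteTo(io.MultiWriter(dst, progress))
}

// demo runs the copy with in-memory stand-ins for the S3 reader and the file.
func demo(payload string) (written, counted int64, content string, err error) {
	var file bytes.Buffer
	bar := &countingWriter{}
	written, err = copyWithProgress(strings.NewReader(payload), &file, bar)
	return written, bar.n, file.String(), err
}

func main() {
	written, counted, content, err := demo("hello from s3")
	if err != nil {
		panic(err)
	}
	fmt.Println(written, counted, content)
}
```

Because the progress writer sees exactly the bytes written to the destination, this approach needs no cooperation from the reader itself.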
func WithProgressLogger ¶
WithProgressLogger adds a progress logger that logs download progress with the given interval.
For example, if interval is `5*time.Second`, every 5 seconds, the given logger will print `downloaded X / Y so far` where X is the number of bytes that have been downloaded, Y the total number of expected bytes, both X and Y are displayed in a human-friendly format (e.g. 5 KiB, 1 MiB, etc.).
Note: WithProgressLogger is useful only if you're using Reader as an io.Reader or io.WriterTo. Any Reader.Seek will cause the progress logger to become incorrect. Reader.ReadAt will not provide updates.
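As an illustration of the human-friendly format described above, the following sketch formats byte counts with binary (IEC) units. `humanBytes` and `progressLine` are hypothetical helpers written for this example; the package's actual formatting, rounding, and logger interface are not shown in this documentation and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// humanBytes is a hypothetical formatter, not part of s3reader. It renders n
// using binary units (KiB, MiB, ...) with at most one decimal place.
func humanBytes(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	value := fmt.Sprintf("%.1f", float64(n)/float64(div))
	// Drop a trailing ".0" so that 5120 prints as "5 KiB" rather than "5.0 KiB".
	value = strings.TrimSuffix(value, ".0")
	return fmt.Sprintf("%s %ciB", value, "KMGTPE"[exp])
}

// progressLine builds the message format quoted in the text above.
func progressLine(downloaded, total int64) string {
	return fmt.Sprintf("downloaded %s / %s so far", humanBytes(downloaded), humanBytes(total))
}

func main() {
	fmt.Println(progressLine(5*1024, 1024*1024))
}
```

A periodic logger could call `progressLine` from a `time.Ticker` loop with the configured interval.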
Types ¶
type GetAndHeadObjectClient ¶
type GetAndHeadObjectClient interface {
GetObject(context.Context, *s3.GetObjectInput, ...func(*s3.Options)) (*s3.GetObjectOutput, error)
HeadObject(context.Context, *s3.HeadObjectInput, ...func(*s3.Options)) (*s3.HeadObjectOutput, error)
}
GetAndHeadObjectClient abstracts the S3 APIs that are needed for New to determine the object size.
type GetObjectClient ¶
type GetObjectClient interface {
GetObject(context.Context, *s3.GetObjectInput, ...func(*s3.Options)) (*s3.GetObjectOutput, error)
}
GetObjectClient abstracts the S3 APIs that are needed to implement Reader.
type Options ¶
type Options struct {
// Threshold indicates the minimum number of bytes needed for parallel GetObject.
//
// If the range is shorter than the threshold, a single GetObject will be used.
//
// Default to DefaultThreshold. Must be a positive integer.
Threshold int64
// Concurrency controls the number of goroutines in the pool that supports parallel GetObject.
//
// Default to DefaultConcurrency. Must be a positive integer. Set to 1 to disable the feature.
//
// Because a single goroutine pool is shared for all Reader.Read and Reader.ReadAt calls, it is acceptable
// to set this value to a high number (`runtime.NumCPU()`) and use MaxBytesInSecond instead to add rate
// limiting.
Concurrency int
// MaxBytesInSecond limits the number of bytes that are downloaded in one second.
//
// The zero-value indicates no limit. Must be a positive integer otherwise.
MaxBytesInSecond int64
// PartSize is the size of each parallel GetObject.
//
// Default to DefaultPartSize. Must be a positive integer; unused if Concurrency is 1.
PartSize int64
// BufferSize is used to provide buffered read ahead for every Read call.
//
// BufferSize provides buffered read to reduce the number of small-range GetObject by making one mid-range
// GetObject instead, extremely helpful if Reader is being used strictly as an io.Reader.
//
// Default to DefaultBufferSize. Must be a non-negative integer. Set to 0 to disable the feature.
BufferSize int64
// contains filtered or unexported fields
}
Options customises the returned Reader of New and NewReaderWithSize.
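The `optFns ...func(*Options)` parameters follow the functional options pattern. The sketch below shows one way defaults and validation could be applied, using a local `options` struct that mirrors the exported fields above. `buildOptions` and its error messages are hypothetical; only the default values and the documented constraints come from this page.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a local mirror of the exported fields of Options, used so that
// this example does not depend on the package itself.
type options struct {
	Threshold        int64
	Concurrency      int
	MaxBytesInSecond int64
	PartSize         int64
	BufferSize       int64
}

// buildOptions is a hypothetical helper: it starts from the documented
// defaults, applies each option function in order, then validates the result.
func buildOptions(optFns ...func(*options)) (*options, error) {
	opts := &options{
		Threshold:   5 * 1024 * 1024, // DefaultThreshold
		Concurrency: 3,               // DefaultConcurrency
		PartSize:    5 * 1024 * 1024, // DefaultPartSize
		BufferSize:  1024 * 1024,     // DefaultBufferSize
	}
	for _, fn := range optFns {
		fn(opts)
	}
	switch {
	case opts.Threshold <= 0:
		return nil, errors.New("threshold must be a positive integer")
	case opts.Concurrency <= 0:
		return nil, errors.New("concurrency must be a positive integer")
	case opts.MaxBytesInSecond < 0:
		return nil, errors.New("max bytes in second must not be negative")
	case opts.PartSize <= 0:
		return nil, errors.New("part size must be a positive integer")
	case opts.BufferSize < 0:
		return nil, errors.New("buffer size must not be negative")
	}
	return opts, nil
}

func main() {
	// Raise the part size to 8 MiB and disable buffered read-ahead.
	opts, err := buildOptions(func(o *options) {
		o.PartSize = 8 * 1024 * 1024
		o.BufferSize = 0
	})
	fmt.Println(opts, err)

	// A zero Concurrency violates the documented constraint.
	_, err = buildOptions(func(o *options) { o.Concurrency = 0 })
	fmt.Println(err)
}
```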
type Reader ¶
type Reader interface {
// Read reads up to len(p) bytes into p and advances the internal read offset accordingly.
//
// Read should not be called concurrently as they share the same internal read offset and buffer. If len(p) is
// larger than Options.Threshold, Read will use parallel GetObject to retrieve the data with each part
// downloading up to Options.PartSize in bytes. Otherwise, Read will use the larger of len(p) or
// Options.BufferSize for one GetObject call to provide buffered read.
//
// See io.Reader for more information on the return values.
Read(p []byte) (int, error)
// Seek moves the internal read offset for the next Read or WriteTo.
//
// See io.Seeker for more information on the return values. This implementation returns either
// ErrSeekBeforeFirstByte or ErrSeekPastLastByte for invalid seek parameters.
Seek(offset int64, whence int) (int64, error)
// Close shuts down the internal goroutine pool that supports parallel GetObject requests.
//
// Close does not always have to be called as garbage collection will be able to reclaim the goroutines
// eventually. If you end up creating a lot of Reader instances, however, it is sensible to Close them as soon
// as possible.
Close() error
// ReadAt reads a specific range of the S3 reader starting at offset off and reads no more than len(p) bytes.
//
// Concurrent ReadAt calls are safe as they do not advance the internal read offset.
//
// See io.ReaderAt for more information on the return values.
ReadAt(p []byte, off int64) (int, error)
// WriteTo writes and advances internal read offset until remaining data from S3 is exhausted.
//
// See io.WriterTo for more information on the return values.
WriteTo(dst io.Writer) (int64, error)
// Size returns the size of the S3 reader that was determined from the initial HeadObject request or given by
// way of NewReaderWithSize.
Size() int64
// Reopen returns a new Reader with identical settings as this instance but starting at the initial state
// (read offset at first byte).
//
// The new instance has its own goroutine pool and, as a result, can Close independently of this instance.
// Useful if you need to start reading from first byte again but need to keep this instance for some other
// usage.
Reopen() Reader
}
Reader uses ranged GetObject to implement io.ReadSeekCloser, io.ReaderAt, and io.WriterTo.
Each Read, ReadAt, or WriteTo may be done with one GetObject or several smaller GetObject calls in parallel, depending on the Options passed to New. Methods from io.ReadSeeker (Read and Seek) and io.WriterTo (WriteTo) update the internal read offset and, as a result, should not be called concurrently. Concurrent calls to ReadAt, on the other hand, are safe because they do not advance the internal read offset. Once Close returns, all subsequent reads return ErrClosed.
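Because concurrent ReadAt calls are safe, a caller can fetch disjoint parts of an object in parallel. The sketch below works against any io.ReaderAt; a `strings.Reader` stands in for Reader so the example runs without S3. `readAllParallel` and `roundTrip` are hypothetical helpers, and the sketch starts one goroutine per part rather than using a bounded pool.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
)

// readAllParallel is a hypothetical helper, not part of s3reader. It reads
// size bytes from r using concurrent ReadAt calls of at most partSize bytes.
func readAllParallel(r io.ReaderAt, size, partSize int64) ([]byte, error) {
	if partSize <= 0 {
		return nil, errors.New("partSize must be positive")
	}
	buf := make([]byte, size)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for off := int64(0); off < size; off += partSize {
		end := off + partSize
		if end > size {
			end = size
		}
		wg.Add(1)
		go func(off, end int64) {
			defer wg.Done()
			// Each goroutine fills its own disjoint slice of buf.
			part := buf[off:end]
			// io.ReaderAt guarantees a non-nil error whenever n < len(part).
			if n, err := r.ReadAt(part, off); n < len(part) {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(off, end)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return buf, nil
}

// roundTrip reads data back through readAllParallel using an in-memory reader.
func roundTrip(data string, partSize int64) (string, error) {
	b, err := readAllParallel(strings.NewReader(data), int64(len(data)), partSize)
	return string(b), err
}

func main() {
	out, err := roundTrip("0123456789", 3)
	fmt.Println(out, err)
}
```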
func New ¶
func New(ctx context.Context, client GetAndHeadObjectClient, input *s3.GetObjectInput, optFns ...func(*Options)) (Reader, error)
New returns a Reader with the given GetObject input parameters.
The given context will be used for all subsequent S3 calls.
New will call HeadObject using identical input parameters to determine the reader size. If you already know the object's size, use NewReaderWithSize instead. New may return a non-nil error from the HeadObject or from invalid options.
func NewReaderWithSize ¶
func NewReaderWithSize(ctx context.Context, client GetObjectClient, input *s3.GetObjectInput, size int64, optFns ...func(*Options)) (Reader, error)
NewReaderWithSize returns a Reader with the given GetObject input parameters and known size.
The given context will be used for all subsequent S3 calls.
NewReaderWithSize will only return a non-nil error if there are invalid options.