merged_fs

v1.3.0
Published: Feb 22, 2024 License: MIT Imports: 8 Imported by: 17

README

Merged FS: Compose Multiple Go Filesystems

Version 1.16 of the Go programming language introduced a standard interface for read-only filesystems, defined in the io/fs standard library package. Other standard-library changes came with it: archive/zip now provides a "filesystem" interface for zip files, and net/http can serve files from any filesystem that implements the io/fs interface. Together, this means utilities like the HTTP server can now serve content directly from zip files, without the data needing to be extracted manually.

While that's already pretty cool, wouldn't it be nice if you could, for example, transparently serve data from multiple zip files as if they were a single directory? This library provides the means to do so: it implements the io/fs.FS interface using two underlying filesystems. The underlying filesystems can even include additional MergedFS instances, enabling combining an arbitrary number of filesystems into a single io/fs.FS.

This repository provides a roughly similar function to laher/mergefs, but it offers one key distinction: it correctly lists the contents of merged directories that are present in both FS's. This adds quite a bit of complexity, so laher/mergefs will be more performant for filesystems that do not require directory-listing capabilities.

Usage

Documentation on pkg.go.dev

Simply pass two io/fs.FS instances to merged_fs.NewMergedFS(...) to obtain a new FS serving data from both. See the following example:

import (
    "archive/zip"
    "github.com/yalue/merged_fs"
    "net/http"
)

func main() {
    // ...

    // Assume that zipFile1 and zipFile2 are two zip files that have been
    // opened using os.Open(...).
    zipFS1, _ := zip.NewReader(zipFile1, file1Size)
    zipFS2, _ := zip.NewReader(zipFile2, file2Size)

    // Serve files contained in either zip file.
    mergedFS := merged_fs.NewMergedFS(zipFS1, zipFS2)
    http.Handle("/", http.FileServer(http.FS(mergedFS)))

    // ...
}

Additional notes:

  • Both underlying FS's must support the ReadDirFile interface when opening directories. Without this, we have no way of determining the contents of merged directories.

  • If a file with the same name is present in both FSs given to NewMergedFS, then the file in the first of the two always overrides the file with the same name in the second FS.

  • Following the prior point, if a directory in the second FS has the same name as a regular file in the first, neither the directory in the second FS nor any of its contents will be present in the merged FS (the regular file will take priority). For example, if FS A contains a regular file named a/b, and FS B contains a regular file c at the path a/b/c (in which a/b is a directory), then a/b/c will not be available in the FS returned by NewMergedFS(A, B), because the directory b is overridden by the regular file b in the first FS.
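
The file-over-directory rule in the last note can be illustrated with a small standard-library sketch. This is not the library's implementation: hiddenByFile and demoA are hypothetical names introduced here, and the check simply walks each component-wise prefix of a path and asks whether the higher-priority filesystem holds a regular file there.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// hiddenByFile reports whether name, or any leading part of its path, is a
// regular file in the higher-priority filesystem a. When it is, an entry at
// name in a lower-priority filesystem would not be reachable after merging.
// (Hypothetical helper for illustration only.)
func hiddenByFile(a fs.FS, name string) bool {
	parts := strings.Split(name, "/")
	for i := 1; i <= len(parts); i++ {
		prefix := strings.Join(parts[:i], "/")
		info, err := fs.Stat(a, prefix)
		if err != nil {
			// The prefix is missing from a (or unreadable), so nothing
			// deeper can exist in a either.
			return false
		}
		if info.Mode().IsRegular() {
			return true
		}
	}
	return false
}

// demoA returns a hypothetical first filesystem in which a/b is a regular file.
func demoA() fs.FS {
	return fstest.MapFS{"a/b": &fstest.MapFile{Data: []byte("regular file")}}
}

func main() {
	a := demoA()
	fmt.Println(hiddenByFile(a, "a/b/c")) // true: a/b is a regular file in a
	fmt.Println(hiddenByFile(a, "x/y"))   // false: x does not exist in a
}
```

Here a/b/c from a second filesystem would be hidden, matching the FS A / FS B example in the note, while x/y would still be served from the second filesystem.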

Multi-Way Merging

If you want to merge more than two filesystems, you can use the MergeMultiple function, which takes an arbitrary number of filesystem arguments:

    merged := merged_fs.MergeMultiple(fs_1, fs_2, fs_3, fs_4)

The earlier arguments to MergeMultiple will have higher priority over the later filesystems, in the same way that the first argument to NewMergedFS has priority over the second. For now, the MergeMultiple function just provides a convenient wrapper for building a tree of MergedFS instances.
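
To see why a tree of two-way merges preserves argument priority, consider the following sketch. It uses only the standard library; overlay is a deliberately simplified stand-in for a two-way merge (it tries the first filesystem, then the second, with no directory merging or file-over-directory handling), and mergeAll, demoLayers, and readString are hypothetical names, not part of this package.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// overlay is a simplified two-way merge: Open tries a first and falls back
// to b. It does not merge directories or apply the shadowing rules.
type overlay struct{ a, b fs.FS }

func (o overlay) Open(name string) (fs.File, error) {
	if f, err := o.a.Open(name); err == nil {
		return f, nil
	}
	return o.b.Open(name)
}

// mergeAll builds a balanced tree of overlays. Every filesystem in the left
// half outranks every filesystem in the right half, so earlier arguments
// keep priority over later ones.
func mergeAll(filesystems ...fs.FS) fs.FS {
	switch len(filesystems) {
	case 0:
		return fstest.MapFS{} // an empty in-memory filesystem
	case 1:
		return filesystems[0]
	}
	mid := len(filesystems) / 2
	return overlay{mergeAll(filesystems[:mid]...), mergeAll(filesystems[mid:]...)}
}

// demoLayers returns four hypothetical in-memory filesystems.
func demoLayers() []fs.FS {
	return []fs.FS{
		fstest.MapFS{"shared.txt": {Data: []byte("from fs1")}},
		fstest.MapFS{"shared.txt": {Data: []byte("from fs2")}},
		fstest.MapFS{"only3.txt": {Data: []byte("from fs3")}},
		fstest.MapFS{"shared.txt": {Data: []byte("from fs4")}, "only4.txt": {Data: []byte("from fs4")}},
	}
}

// readString reads a file and returns its contents, or the error text.
func readString(fsys fs.FS, name string) string {
	data, err := fs.ReadFile(fsys, name)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(data)
}

func main() {
	merged := mergeAll(demoLayers()...)
	fmt.Println(readString(merged, "shared.txt")) // from fs1
	fmt.Println(readString(merged, "only4.txt"))  // from fs4
}
```

The tree shape keeps lookup depth logarithmic in the number of filesystems, which is one plausible reason to prefer it over a linear chain of merges.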

Documentation

Overview

The merged_fs library implements go1.16's filesystem interface (fs.FS) using two underlying FSs, presenting two (or more) filesystems as a single FS.

Usage:

// fs1 and fs2 can be anything that supports the fs.FS interface,
// including other MergedFS instances.
fs1, _ := zip.NewReader(zipFile, fileSize)
fs2, _ := zip.NewReader(zipFile2, file2Size)
// Implements the fs.FS interface, resolving conflicts in favor of fs1.
merged := NewMergedFS(fs1, fs2)

Functions

func MergeMultiple added in v1.2.0

func MergeMultiple(filesystems ...fs.FS) fs.FS

Merges an arbitrary list of filesystems into a single filesystem. The first filesystems are higher priority than the later filesystems, and all higher-priority FS's abide by the same rules that a two-way MergedFS does. For example, a directory in a lower-priority FS will not be reachable if any part of its path is a regular file in *any* higher-priority FS. For now, this function simply constructs a balanced tree of MergedFS instances. In the future, it may use a different underlying implementation with the same semantics. Returns a valid empty filesystem (see the EmptyFS type) if no filesystem arguments are provided.

Types

type EmptyFS added in v1.2.2

type EmptyFS struct{}

Implements the FS interface, but provides a filesystem containing no files. The only path you can "Open" is ".", which provides an empty directory.

func (*EmptyFS) Open added in v1.2.2

func (f *EmptyFS) Open(path string) (fs.File, error)

type MergedDirectory

type MergedDirectory struct {
	// contains filtered or unexported fields
}

This is the key component of this library. It represents a directory that is present in both filesystems. Implements the fs.File, fs.DirEntry, and fs.FileInfo interfaces.
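
As a rough illustration of what listing a merged directory involves, the sketch below combines the entries of one directory from two filesystems using only the standard library. It is an assumption-laden simplification, not this type's implementation: mergedEntries, entryNames, and the demo filesystems are hypothetical, and a name present in both filesystems is simply taken from the first.

```go
package main

import (
	"fmt"
	"io/fs"
	"sort"
	"strings"
	"testing/fstest"
)

// mergedEntries lists dir in both filesystems and returns the union of the
// entries, sorted by name. When a name appears in both, the entry from a wins.
func mergedEntries(a, b fs.FS, dir string) ([]fs.DirEntry, error) {
	entriesA, err := fs.ReadDir(a, dir)
	if err != nil {
		return nil, err
	}
	entriesB, err := fs.ReadDir(b, dir)
	if err != nil {
		return nil, err
	}
	seen := make(map[string]bool, len(entriesA))
	merged := make([]fs.DirEntry, 0, len(entriesA)+len(entriesB))
	for _, e := range entriesA {
		seen[e.Name()] = true
		merged = append(merged, e)
	}
	for _, e := range entriesB {
		if !seen[e.Name()] {
			merged = append(merged, e)
		}
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].Name() < merged[j].Name() })
	return merged, nil
}

// entryNames returns the merged entry names joined by commas.
func entryNames(a, b fs.FS, dir string) string {
	entries, err := mergedEntries(a, b, dir)
	if err != nil {
		return "error: " + err.Error()
	}
	names := make([]string, len(entries))
	for i, e := range entries {
		names[i] = e.Name()
	}
	return strings.Join(names, ",")
}

// demoPair returns two hypothetical filesystems that both contain docs/.
func demoPair() (fs.FS, fs.FS) {
	a := fstest.MapFS{"docs/a.txt": {}, "docs/shared.txt": {}}
	b := fstest.MapFS{"docs/b.txt": {}, "docs/shared.txt": {}}
	return a, b
}

func main() {
	a, b := demoPair()
	fmt.Println(entryNames(a, b, "docs")) // a.txt,b.txt,shared.txt
}
```

A full implementation also has to handle names that are directories in both filesystems, so that the nested directory is itself presented as merged.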

func (*MergedDirectory) Close

func (d *MergedDirectory) Close() error

func (*MergedDirectory) Info

func (d *MergedDirectory) Info() (fs.FileInfo, error)

func (*MergedDirectory) IsDir

func (d *MergedDirectory) IsDir() bool

func (*MergedDirectory) ModTime

func (d *MergedDirectory) ModTime() time.Time

func (*MergedDirectory) Mode

func (d *MergedDirectory) Mode() fs.FileMode

func (*MergedDirectory) Name

func (d *MergedDirectory) Name() string

func (*MergedDirectory) Read

func (d *MergedDirectory) Read(data []byte) (int, error)

func (*MergedDirectory) ReadDir

func (d *MergedDirectory) ReadDir(n int) ([]fs.DirEntry, error)

func (*MergedDirectory) Size

func (d *MergedDirectory) Size() int64

func (*MergedDirectory) Stat

func (d *MergedDirectory) Stat() (fs.FileInfo, error)

func (*MergedDirectory) Sys

func (d *MergedDirectory) Sys() interface{}

func (*MergedDirectory) Type

func (d *MergedDirectory) Type() fs.FileMode

type MergedFS

type MergedFS struct {
	// The two filesystems that have been merged. Do not modify these directly,
	// instead use NewMergedFS.
	A, B fs.FS
	// contains filtered or unexported fields
}

Implements the fs.FS interface, using the two underlying FS's. If a file is present in both filesystems, then the copy in A will always be preferred. This has an important implication: if a file is regular in A, but a directory in B, the entire directory in B will be ignored. If a file is a directory in both, then Open()-ing the file will result in a directory that contains the content from both FSs.

func NewMergedFS

func NewMergedFS(a, b fs.FS) *MergedFS

Takes two FS instances and returns an initialized MergedFS.

func (*MergedFS) Open

func (m *MergedFS) Open(path string) (fs.File, error)

If the path corresponds to a directory present in both A and B, this returns a MergedDirectory file. If the path is present in A, but isn't a directory in both A and B, then this will simply return the copy in A. Otherwise, it returns the copy in B, so long as no prefix of the path corresponds to a regular file in A.

func (*MergedFS) ReadFile added in v1.3.0

func (m *MergedFS) ReadFile(name string) ([]byte, error)

ReadFile reads the named file and returns the contents. A successful call returns err == nil, not err == EOF. Because ReadFile reads the whole file, it does not treat an EOF from Read as an error to be reported. This fulfills the io/fs.ReadFileFS interface. https://pkg.go.dev/io/fs#ReadFileFS

func (*MergedFS) UsePathCaching added in v1.1.0

func (m *MergedFS) UsePathCaching(enabled bool)

Enables or disables path prefix caching, and clears the cache.

I doubt most users will care about this function, but it allows working around what may be an occasional bug. Explaining it, however, unfortunately requires giving a few implementation details.

First, know that this matters only if *all* of the following conditions apply to your use case:

1) Filesystem A is something that can change during runtime, such as an os.DirFS. (It doesn't matter if filesystem B changes.)

2) You expect filesystem A to actually change at runtime.

3) You want to make sure that *adding* a regular file to A correctly prevents access to the contents of a directory in B with the same name.

Checking whether a regular file in A has the same name as a directory in B potentially requires checking every component-wise prefix of a path when opening a file. To speed this up, this library uses a cache of path prefixes that we know do *not* correspond to regular files in A. (This caching is enabled by default.) The problem can then arise if A changes after the cache already says that a path doesn't correspond to any regular files. So, if all three of the above conditions apply to you, you have two choices:

First, you can use merged_fs in conjunction with another library, such as github.com/fsnotify/fsnotify, to determine whether the contents of FS A have changed. If A has changed, simply call merged.UsePathCaching(true) to clear the cache while leaving caching enabled.

Alternatively, call merged.UsePathCaching(false) to disable path caching entirely, ensuring correctness but potentially costing performance.
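
The stale-cache problem described above can be reproduced in miniature. The following is a standard-library sketch under stated assumptions, not this library's internals: prefixCache, its methods, and staleDemo are hypothetical, and the cache only records prefixes known not to be regular files, mirroring the description above.

```go
package main

import (
	"fmt"
	"io/fs"
	"sync"
	"testing/fstest"
)

// prefixCache remembers path prefixes that were found NOT to be regular files.
type prefixCache struct {
	mu       sync.Mutex
	enabled  bool
	notFiles map[string]bool
}

func newPrefixCache() *prefixCache {
	return &prefixCache{enabled: true, notFiles: make(map[string]bool)}
}

// usePathCaching enables or disables caching and always clears the cache.
func (c *prefixCache) usePathCaching(enabled bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.enabled = enabled
	c.notFiles = make(map[string]bool)
}

// isRegularFile reports whether prefix is a regular file in a, consulting
// the cache first when caching is enabled.
func (c *prefixCache) isRegularFile(a fs.FS, prefix string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.enabled && c.notFiles[prefix] {
		return false // cached answer; may be stale if a has changed
	}
	info, err := fs.Stat(a, prefix)
	regular := err == nil && info.Mode().IsRegular()
	if c.enabled && !regular {
		c.notFiles[prefix] = true
	}
	return regular
}

// staleDemo checks a/b before it exists, after it is added (stale cache),
// and after the cache has been cleared.
func staleDemo() [3]bool {
	a := fstest.MapFS{} // a mutable in-memory filesystem
	cache := newPrefixCache()
	var results [3]bool
	results[0] = cache.isRegularFile(a, "a/b")
	a["a/b"] = &fstest.MapFile{Data: []byte("new file")}
	results[1] = cache.isRegularFile(a, "a/b")
	cache.usePathCaching(true) // clear the cache, keep caching enabled
	results[2] = cache.isRegularFile(a, "a/b")
	return results
}

func main() {
	fmt.Println(staleDemo()) // [false false true]
}
```

The middle result is the occasional bug in question: the file now exists, but the cached answer still says otherwise until the cache is cleared.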
