filesystem_apfs

package module
v0.0.0-...-466711e Latest Latest
Published: Jun 22, 2026 License: BSD-3-Clause Imports: 26 Imported by: 0

README

go-filesystems/apfs

APFS

Pure-Go reader/writer for real APFS, the on-disk format that Apple's apfs.kext mounts. Containers written by this package are byte-mountable through apfs.kext (hdiutil attach → fsck_apfs -n clean → mount_apfs); encrypted containers written by FormatContainerEncrypted reach parity with diskutil apfs encryptVolume.

The cross-compatibility test matrix and the per-cell rationale live in COMPAT.md. The current matrix has every cell PASS except a handful of cells blocked on Apple-private code paths (snapshot creation gated on a private entitlement, fsck-clean-stage-5 on encrypted containers which fsck_apfs cannot structurally verify even on Apple's own output — see F-2).

Container API (OpenContainer* / FormatContainer*)

| Entry point | Purpose |
| --- | --- |
| OpenContainer(path) | Read-only open; resolves NX SB → container OMAP. |
| OpenContainerRW(path) | Same plus mutating APIs (commit, write, create, …). |
| OpenContainerFromBackend(r) | Open from any containerReader; if r also satisfies containerWriter, write APIs are enabled. |
| OpenContainerAuto(path) | GPT-aware open: detects EFI PART magic at LBA 1 and offsets into the Apple_APFS partition. Falls through to OpenContainer for raw images. |
| OpenContainerRWAuto(path) | Read-write GPT-aware open. |
| FormatContainer(path, sizeBytes, label) | Write a fresh, kext-mountable APFS container. |
| FormatContainerEncrypted(path, sizeBytes, label, passphrase) | Write a FileVault-style software-encrypted APFS container (raw image). |
| FormatContainerEncryptedGPT(path, totalSize, label, passphrase) | Same wrapped in a GPT with the Apple_APFS partition GUID. |
| FormatAppleDmg(path, sizeBytes, cfg) | Write an Apple-compatible DMG (no UDIF wrapper). |

Container carries Volumes(), OpenVolume(index), OpenSnapshot(snap), AddVolume(label), Commit(), Close(), and SetVerifyHashes(on).

Volume exposes the read paths (Name, ListInodes, ListSnapshots, LookupSnapshotByName, ListXAttrs, ReadXAttrStream, XAttrStreamReaderAt, ListSiblings, LookupInodeRecord, LookupInodeRawValue, FindInode, ReadFile, FileReaderAt, ReadFileTransparent) and the mutating paths (WriteFile, WriteFileInPlace, OverwriteFile, TruncateFile, CreateFile, CreateDirectory, CreateSymlink, CreateHardlink, CreateSparseFile, CreateFifo, CreateSocket, CreateBlockDevice, CreateCharDevice, DeleteFile, DeleteDirectory, Rename, SetXAttr, SetXAttrStream, CreateSnapshot, SetSuppressSnapshotGuard).

filesystem.Filesystem entry points

These wrap a *Container + first *Volume in a path-based driver satisfying pkg/go-filesystems/interface.Filesystem. They are what pkg/go-diskimages/diskimage and other callers consume.

| Entry point | Purpose |
| --- | --- |
| Open(imagePath, partIndex) | Open a real APFS container. On macOS, falls back to hdiutil attach if the path isn't a parseable container. |
| OpenWithKeys(imagePath, partIndex, keys…) | Same, trying each key as a FileVault passphrase before falling back. |
| Format(path, sizeBytes, cfg) | Create a new real APFS container. cfg.Encryption = &FDEConfig{Passphrase: …} produces a FileVault-encrypted container. |
| OpenFDE(imagePath, passphrase, partIndex) | Open a FileVault-encrypted real APFS container directly. |
| OpenFromBlockDevice(dev, partIndex) | Open a BlockRW backend (already decrypted, e.g. behind QCOW2). |

The implementation type (driver) is unexported; callers get filesystem.Filesystem. The compile-time assertion var _ filesystem.Filesystem = (*driver)(nil) in driver.go guards against drift.

Real APFS — supported features

Read paths
  • NX superblock decode (block size, fs_oid array, container OMAP oid, nx_keylocker, nx_flags).

  • Object map B-tree lookup (single-level and multi-level descend along the matching key path).

  • Per-volume APSB decode (volume name, root tree oid, volume OMAP).

  • Full FS-tree traversal at any B-tree height, with hashed-internal-node support (sealed volumes: values larger than 8 bytes are accepted; only the leading uint64 child OID is read for descent).

  • FS-tree leaf decoding for J_INODE_VAL, J_DIR_REC, J_FILE_EXTENT, J_XATTR, J_SIBLING_LINK, J_SNAP_META, J_SNAP_NAME, J_DSTREAM_ID.

  • File reading across multiple contiguous extents (extents sorted by logical offset; sparse holes zero-filled; trailing zero region honoured).

  • FindInode(oid): O(log n + k) lookup returning a fully populated Inode (Name + dataExtents). Implemented via two seekAndIterate passes (one over the inode's own records, one over the parent's drec range).

  • LookupInodeRecord(oid): O(log n) lookup returning just the J_INODE_VAL (Mode, Size, IsDir, ParentID).

  • seekAndIterate(target, visit): B-tree forward iterator. A binary-search descent positions the cursor at the first key ≥ target, then the visit callback walks every subsequent record in ascending order, with early termination via (stop bool, err error).

  • ListSnapshots() enumerates every J_SNAP_META record in the volume's snapshot metadata tree.

  • LookupSnapshotByName(name) — fast path O(log n) binary search on J_SNAP_NAME records, then a second O(log n) seek on J_SNAP_META. Falls back to a linear scan when an image has J_SNAP_META records without matching J_SNAP_NAME side records.

  • OpenSnapshot(snap) returns a read-only *Volume exposing the volume as it was at snap.XID. Internally every virtual-oid resolution through the volume OMAP is clamped to the snapshot's XID.

  • ListXAttrs(inode) returns every embedded xattr; stream xattrs surface their stream id and size. ReadXAttrStream(xattr) fetches the payload of stream xattrs (concatenates J_FILE_EXTENT records keyed by xattr_obj_id).

  • ListSiblings(inode) returns every hard-link record (alternate parent + name) for the inode.

  • Optional hash verification for sealed volumes via Container.SetVerifyHashes(true). When enabled, every B-tree descent through a hashed internal node validates the child block's SHA-256 against the 32-byte digest stored after the child OID in btn_index_node_val. Disabled by default for performance.

  • Streaming reads via FileReaderAt(inode) and XAttrStreamReaderAt(xattr): both return an io.ReaderAt over the decoded bytes without buffering the whole payload. Bounded by the inode size / xattr stream size (reads past EOF return io.EOF); sparse holes return zeros without consuming I/O. Use these instead of ReadFile / ReadXAttrStream for large files (boot images, archives, kernel binaries) where the all-at-once allocation would waste memory.

  • Transparent file decompression via ReadFileTransparent(inode) covers the full decmpfs matrix:

    • Type 1 — uncompressed inline
    • Type 3 — zlib inline (with 0xFF raw passthrough)
    • Type 4 — zlib resource fork (HFS+ rsrc header + chunked block table)
    • Type 5 — raw resource fork (chunked, verbatim)
    • Type 7 — LZVN inline (raw payload wrapped in synthetic bvxn)
    • Type 8 — LZVN resource fork (offset-table layout)
    • Type 11 — LZFSE inline (block stream)
    • Type 12 — LZFSE resource fork (offset-table layout)

    Resource-fork variants automatically fetch the file's com.apple.ResourceFork xattr (embedded or stream). LZVN/LZFSE decoding is delegated to pkg/go-compressions/lzfse.
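The seekAndIterate contract described in the list above (position at the first key ≥ target, then visit records in ascending order until the callback asks to stop) can be illustrated on a flat sorted slice. This is only a sketch of the iteration contract, not the package's B-tree code: the record type, the slice-based storage, and the collectRange helper are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sort"
)

// record stands in for one leaf entry; real FS-tree keys are richer.
type record struct {
	Key   uint64
	Value string
}

// seekAndIterate binary-searches for the first record with Key >= target,
// then calls visit on each following record in ascending key order. It stops
// early when visit returns stop=true or a non-nil error.
func seekAndIterate(recs []record, target uint64, visit func(record) (stop bool, err error)) error {
	start := sort.Search(len(recs), func(i int) bool { return recs[i].Key >= target })
	for _, r := range recs[start:] {
		stop, err := visit(r)
		if err != nil {
			return err
		}
		if stop {
			return nil
		}
	}
	return nil
}

// collectRange gathers the values of every record with lo <= Key < hi,
// using early termination once the upper bound is reached.
func collectRange(recs []record, lo, hi uint64) ([]string, error) {
	var out []string
	err := seekAndIterate(recs, lo, func(r record) (bool, error) {
		if r.Key >= hi {
			return true, nil // past the range: stop without error
		}
		out = append(out, r.Value)
		return false, nil
	})
	return out, err
}

func main() {
	recs := []record{{1, "a"}, {3, "b"}, {5, "c"}, {7, "d"}, {9, "e"}}
	vals, err := collectRange(recs, 4, 8)
	fmt.Println(vals, err)
}
```

A range scan such as "all records belonging to one inode" fits this shape: seek to the lowest key of the range and return stop=true at the first key outside it, which is where the O(log n + k) cost quoted for FindInode comes from.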

Write paths
  • FormatContainer(path, sizeBytes, label) writes a kext-mountable unencrypted APFS container (cell N-2 in COMPAT.md): full Apple-shape NX SB / spaceman / OMAP / APSB / FS-tree pre-population, including the four make_cat_root records the kext requires. Verified end-to-end via hdiutil attach + fsck_apfs -n + mount_apfs + read+write round-trip in TestCompatNative_KextMountsOurFormat.

  • FormatContainerEncrypted(path, sizeBytes, label, passphrase) and FormatContainerEncryptedGPT(path, totalSize, label, passphrase) write FileVault-style software-encrypted containers (cell F-2 in COMPAT.md). Output is structurally byte-identical to what diskutil apfs encryptVolume emits:

    • container + volume keybags encrypted at rest with AES-XTS-128 keyed on containerUUID || containerUUID (resp. volumeUUID || volumeUUID), 512-byte XTS sectors, tweak = paddr × 8 + sector_index;
    • ASN.1 DER VEKBLOB / KEKBLOB with HMAC-SHA256 keyed by SHA-256 of \x01\x16\x20\x17\x15\x05 || salt, computed over the [3] inner-keyblob envelope;
    • PBKDF2-SHA256 (100,000 iterations) protecting the KEK, AES-KW (RFC 3394) wrapping the VEK with the KEK;
    • five-ephemeral checkpoint (SPACEMAN, REAPER, SFQ_IP, SFQ_MAIN, and the optional INTEGRITY_META) matching what Apple writes;
    • APSB with APFS_FS_ONEKEY set, APFS_FS_UNENCRYPTED cleared, and APFS_INCOMPAT_ENC_ROLLED set;
    • GPT-wrapped variant emits an Apple_APFS partition entry (7C3457EF-…) so apfs.kext binds the synthesised container's physical store correctly.

    The recipe was reverse-engineered byte-by-byte against two independently-encrypted Apple reference DMGs across eight rounds of fsck_apfs / byte-diff bisection. fsck_apfs stops at the same stage and with the same status code (result=92 pl=5:1 pl=9:1 fp=30 fl=10) for both Apple's reference and our output — fsck reads the encrypted keybag's RAW bytes without decrypting and validates them as plaintext, which always fails for any encrypted APFS container, including Apple's own. Parity locked in by TestCompatFDE_FormatContainerEncrypted_FsckParityWithApple.

    apfsfde.Open(path, passphrase) round-trips the keybag chain and recovers the VEK end-to-end through the public API (TestFormatContainerEncrypted_ApfsfdeOpenRoundtrip).

  • WriteFileInPlace(inode, data) overwrites a file's already-allocated extents in place. No metadata cascade, no allocator: the file's extents must be contiguous from logical offset 0 and len(data) must fit within them; the inode's declared size is not updated.

  • WriteFile(inode, data) is the metadata-aware variant: in-place overwrite plus a patch of the inode's J_DSTREAM.size inside its FS-tree leaf, so subsequent reads see len(data) as the file's logical size.

  • OverwriteFile(oid, newData) is the size-changing variant. Three branches: (1) newData fits in the file's total existing capacity → in-place overwrite across the existing extents in logical order + size patch; (2) newData exceeds capacity → fill the existing extents head-to-tail then allocate one fresh contiguous extent at logical offset = old_total_capacity for the rest, insert J_FILE_EXTENT, update extent-ref tree, mark blocks allocated, bump apfs_fs_alloc_count, patch J_DSTREAM.{size,alloced_size, total_bytes_written}; (3) newData < current size → size patch only (use TruncateFile afterwards to free the trailing blocks). Multi-extent files are supported on both grow and in-place paths.

  • TruncateFile(oid, newSize) resizes the file. When newSize ≥ current size: only the inode's J_DSTREAM.size is patched (the file becomes sparse past the existing extents). When newSize < current size: extents that fall entirely past newSize are freed (chunk bitmap, ci_free_count, sm_free_count, extent-ref tree, and apfs_fs_alloc_count all updated); when newSize lands inside an extent, that extent is shrunk to its smallest block-aligned size that still holds newSize and only the trailing blocks within it are freed. POSIX-tolerant: alloced_size ≥ size invariant preserved when newSize is mid-block.

  • CreateFile(parentOID, name, data) allocates a fresh inode oid (from apsb.apfs_next_obj_id), allocates blocks for the payload, writes the file content, and inserts the four records J_INODE_VAL, J_FILE_EXTENT, J_DIR_REC, J_DSTREAM_ID, with multi-leaf FS-tree splits when the root would overflow.

  • CreateDirectory, CreateSymlink, CreateHardlink, CreateSparseFile, CreateFifo, CreateSocket, CreateBlockDevice, CreateCharDevice — full POSIX special-file set with the right inode mode + content (symlink target as inline data; device files with rdev).

  • DeleteFile(parentOID, name), DeleteDirectory(parentOID, name) — POSIX-style delete: drop records, free extents through markBlocksFreed, refresh parent nchildren, decrement APSB counters. Hardlinked files (nlink > 1) take a separate path: only the named alias's drec + matching J_SIBLING_LINK + J_SIBLING_MAP records are removed and the inode's nlink is decremented in place; the inode, its extents, xattrs and extent-ref records stay alive because the other names still reference them.

  • Rename(oldParentOID, oldName, newParentOID, newName) — drop old drec, insert new drec preserving file_id + drec val (incl. optional sibling_id xfield), patch inode parent_id, refresh both parents' nchildren. If the destination already exists AND refers to a regular file with nlink == 1, that file is deleted first (records dropped, extents freed, APSB counters updated) so the rename can complete — matching POSIX rename(2) semantics for the regular-file → regular-file case. Overwriting a directory or a hardlinked target is rejected.

  • SetXAttr(oid, name, payload) — embedded xattr (XATTR_DATA_EMBEDDED) for short payloads; SetXAttrStream(oid, name, payload) — stream xattr (separate dstream) for large payloads.

  • CreateSnapshot(name) — pick the container's current xid, CoW the live APSB to a fresh paddr (with o_oid = paddr, retyped to PHYSICAL), insert J_SNAP_META + J_SNAP_NAME records, materialise the OMAP snapshot tree (subtype APFS_OBJECT_TYPE_OMAP_SNAPSHOT), bump apfs_num_snapshots.

  • Container.AddVolume(label) extends a freshly-formatted single-volume container with additional volumes (up to Apple's max of 100). Each new volume gets 6 fresh metadata blocks past the format-time metadata (APSB + volume OMAP + leaf + FS-tree + snap-meta + extent-ref).

  • Container.Commit() promotes in-memory mutations to a fresh on-disk checkpoint at xid=N+1; refreshes APSB counters from a fresh FS-tree scan.

  • Concurrent stress testing: two heavy-load tests exercise the thread-safety contract under sustained pressure:

    • TestConcurrentStress_MixedOps — 16 creator + 16 reader + 4 mutator (create+grow+truncate+rename+delete) goroutines, ~1000 operations end-to-end. Validates leaf-rewrites don't crash concurrent readers and the rename-overwrite cross-call path holds the lock correctly through Rename → deleteFileLocked.
    • TestConcurrentStress_ReaderHeavy — 32 readers + 2 writers against a 100-file volume, ~16k+ parallel reads. Verifies the RWMutex actually lets readers run in parallel rather than serialising on the write lock.

    Both pass clean under go test -race.

  • Thread-safe Container / Volume API: every public method on Container and Volume is wrapped with sync.RWMutex (Container.mu). Mutating ops (CreateFile, Commit, Rename, DeleteFile, OverwriteFile, WriteFile, CreateSnapshot, …) take a write lock; read ops (ListInodes, FindInode, ReadFile, ListSnapshots, …) take a read lock so many readers run in parallel. Cross-method calls that previously chained two public methods (Rename → DeleteFile, WriteFile → WriteFileInPlace, LookupSnapshotByName → ListSnapshots) now go through unexported *Locked helpers to avoid recursive re-locking. The streaming readers (FileReaderAt, XAttrStreamReaderAt) take a snapshot of the inode's extent list under the lock at construction time; subsequent ReadAt calls do NOT re-lock — concurrent mutation after construction may serve stale (but valid) bytes. Verified by TestConcurrent_CreateAndRead (4 writer + 4 reader goroutines, 100 files end-to-end) and TestConcurrent_RenameAndDelete, both clean under go test -race.

  • TruncateFile shrink on multi-level FS-trees: the shrink path used to reject any volume whose FS-tree had been promoted to level≥1 (~30+ files). It now dispatches per dropped/shrunk J_FILE_EXTENT key through descendToLeafForKey + removeKeyFromLeaf / modifyLeafAtPaddrAndInsert, then refreshes the root index. Verified by TestTruncateFile_MultiLevelTree (50 files force level-1 FS-tree, target file shrunk to 100 bytes, unrelated files unaffected).

  • Commit ring-buffer wrap: Container.Commit() now wraps the checkpoint descriptor + data ring buffers when the next checkpoint wouldn't fit linearly past xp_desc_next / xp_data_next. Apple's apfs.kext / fsck pick the latest checkpoint by xid rather than position, so wrapping is transparent: the oldest checkpoint's slots are silently overwritten. Previously the writer errored out with "descriptor area exhausted" after ~3 commits; an arbitrary number of commits is now supported. Verified by TestCommit_RingBufferWrap (20 round-trip commits, last file readable after re-open).

  • Volume OMAP, snap-meta and extent-ref multi-level: each of the three PHYSICAL trees (apsb.apfs_omap_oid, apsb.apfs_snap_meta_tree_oid, apsb.apfs_extentref_tree_oid) starts as a single leaf, promotes to level-1 (split into two non-root leaves under an internal root) when the leaf would overflow, and promotes to level-2 in place at the APSB-pointed root paddr when the level-1 index would overflow. Promotion splits the level-1 children into two halves written as level-1 non-root internals at fresh paddrs, then rewrites the original root as a level-2 internal with two children. Subsequent inserts use a recursive descent (level-2 → level-1 → level-0) with leaf-split → L1-internal-split → L2-root index-add propagation. Reads route through traverseBTreeWithOmap which detects btreeFlagPhysical and descends child paddrs directly at any level. The extent-ref modify path also collapses an empty leaf back out of the index (free + drop) when its level-1 parent still has another sibling. Tree-wide totals on the root trailer (bt_key_count / bt_node_count) are recomputed on every rewrite by scanning the live leaves, so fsck-strict cross-checks stay clean. Level-3 is the next unimplemented jump (capacity at level-2: ≈ 122² × 108 ≈ 1.6M entries — far past typical disk-image scale). Verified by:

    • TestRootPromotion_FilesLevel2 (FS-tree + volume OMAP, 1500 single-extent files).
    • TestSnapMetaMultiLevel_PromotesAtThreshold (level-1, 200 records).
    • TestExtentRefMultiLevel_PromotesAtThreshold + TestExtentRefMultiLevel_DeleteAfterPromote (level-1, 130 files).
    • TestOMAP_PromotesToLevel2 (3000 files force OMAP level-2 via omapInternalRootCap=4).
    • TestSnapMeta_PromotesToLevel2 (800 J_SNAP_META records via snapMetaInternalCapEntries=4).
    • TestExtentRef_PromotesToLevel2 (700 files via extentRefInternalCapEntries=4).

    Test-only cap vars (omapInternalRootCap, snapMetaInternalCapEntries, extentRefInternalCapEntries) lower the natural per-block byte cap so the level-2 path fires under workloads tolerable in CI.

Limitations

  • Mount-backed Open mode is only used on macOS (proxies to hdiutil attach) and only when the file isn't a parseable real APFS container — i.e. for Apple-produced DMGs that the pure-Go reader can't (yet) consume directly.
  • LookupSnapshotByName falls back to a linear scan when an image carries J_SNAP_META records without matching J_SNAP_NAME side records (Apple's tmutil snapshot always emits both, so the fast path covers the common case).
  • T2 / Secure Enclave mediated keys are not supported (hardware access required).
  • fsck_apfs cannot structurally verify any encrypted APFS keybag — by design (fsck reads the encrypted bytes without decrypting and validates them as plaintext). Our FormatContainerEncrypted output is at parity with Apple's reference DMG under fsck. See F-2 in COMPAT.md.

Testing

extra_coverage_test.go carries smoke tests for the public entry points that aren't exercised through the larger end-to-end suites (OpenWithKeys unencrypted-hit and bogus-input miss). decmpfsDecodeRsrcChunk is covered across every branch (raw type with truncation, zlib empty / 0xFF passthrough / decode error / unsupported codec). OpenWithKeys is exercised on its per-key fall-through loop.

The mountpoint-dispatch branch at the top of every path-based driver method (the d.mountpoint != "" check that routes to the darwin hdiutil-attached mountpoint) is covered by constructing a driver{mountpoint: tempdir} synthetically.

Container/volume open entry points are exercised on their early-error branches: OpenContainer / OpenContainerRW on missing file + garbage content, OpenVolume with out-of-range index, OpenSnapshot with zero APSBOID and nonexistent (xid, oid). ReadFileTransparent is covered on a directory and on a plain file (no decmpfs xattr). Rename and DeleteDirectory carry tests for the apfsRootDirParent (parent_id=1) → apfsRootDirInoNum rebind branch.

Multi-level FS-tree paths are covered by bulk-creating ~150 files (no cap-injection var exists for the FS-tree the way it does for extent-ref / snap-meta) so subsequent writers descend through the non-leaf code path. Variants drive: every writer on the root dir, every writer on a non-root parent (refreshNonRootParentNchildren isRootDir=false branch), and DeleteFile on a hardlink alias under a multi-level tree (deleteHardlinkAlias multi-level descend).

Each public Volume writer (CreateFile / CreateDirectory / CreateSymlink / CreateSparseFile / SetXAttr / SetXAttrStream / TruncateFile / Create{Fifo,Socket,BlockDevice,CharDevice}) is also covered against its shared early-error preconditions: read-only container, snapshot-view (xidLimit != ∞), snapshot-guard not suppressed, empty name / empty target / empty payload. DeleteFile, DeleteDirectory, and Rename carry additional error-path tests (missing source, wrong type, non-empty directory, identical src+dst, multi-link source, directory destination, overwrite-regular-file success).

driver_filesystem_test.go exercises the path-based filesystem.Filesystem driver: Format (plain + encrypted + default-label + preexisting file), Open (success + non-APFS), the full MkDir / WriteFile / ReadFile / Stat / ListDir / Rename / DeleteFile / DeleteDir / ReadLink lifecycle, the read-only-fallback in openContainerAsFilesystem, plus targeted unit tests for drecTypeToDT, mountModeDeleteDir wipe-root, decmpfs{Zlib,LZFSE,LZVN}Inline edge cases, bytesReaderAt.ReadAt, OpenFromBlockDevice success, the fdeContainerBackend WriteAt/Close passthrough, partial-extent shrink via TruncateFile, snapshot delete that rewinds om_most_recent_snap, and the multi-level B-tree manipulation paths (snapMetaRemoveOneRecordMultiLevel, extentRefModifyLeafLevel2, rewriteExtentRefRootAtLevel, rewriteSnapMetaRootAtLevel) driven via cap-injected fixtures.

Documentation

Constants

This section is empty.

Variables

var ErrHasSnapshot = errors.New("apfs: refusing to mutate a volume with active snapshots (would corrupt the frozen view)")

ErrHasSnapshot is returned by every writer-side entry point on a volume whose APSB reports `apfs_num_snapshots > 0`. The frozen snapshot view shares physical blocks (FS-tree root, extent-ref tree, etc.) with the live volume, so an in-place mutation would corrupt the snapshot. Until copy-on-write is implemented for every mutating path, callers must remove the snapshot first OR explicitly suppress the guard via `Volume.SetSuppressSnapshotGuard(true)`.

var ErrNoHeader = errors.New("apfs: no header")

ErrNoHeader is returned by Open when the file is neither a real APFS container nor (on darwin) a hdiutil-mountable image.

var ErrReadOnly = errors.New("apfs: container is read-only")

ErrReadOnly is returned by write paths when the container was opened without write capability (e.g. via OpenContainer which opens the file O_RDONLY, or via OpenContainerFromBackend with a read-only backend).

var ErrResizeUnsupported = errors.New("apfs: resize crosses chunk boundary (not implemented)")

ErrResizeUnsupported is returned when a Grow/Shrink would require allocating a fresh chunk_info_block (i.e. cross a 128 MiB chunk boundary). The single-chunk regime covers every test container in this package; a future iteration will lift the restriction.
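The 128 MiB chunk figure can be checked with a little arithmetic. The sketch below assumes 4 KiB blocks (the block size FormatContainer's documentation refers to) and treats a resize as unsupported when the old and new sizes need a different number of chunks. The constant and function names are illustrative and are not taken from the package.

```go
package main

import "fmt"

const (
	blockSize      = 4096                   // assumption: 4 KiB blocks
	chunkBytes     = 128 << 20              // 128 MiB per chunk, as stated above
	blocksPerChunk = chunkBytes / blockSize // blocks tracked by one chunk
)

// chunksFor returns how many chunks are needed to cover sizeBytes.
func chunksFor(sizeBytes int64) int64 {
	if sizeBytes <= 0 {
		return 0
	}
	return (sizeBytes + chunkBytes - 1) / chunkBytes
}

// crossesChunkBoundary reports whether resizing from oldSize to newSize
// changes the number of chunks required.
func crossesChunkBoundary(oldSize, newSize int64) bool {
	return chunksFor(oldSize) != chunksFor(newSize)
}

func main() {
	fmt.Println(blocksPerChunk)
	fmt.Println(crossesChunkBoundary(64<<20, 128<<20))
	fmt.Println(crossesChunkBoundary(64<<20, 128<<20+1))
}
```

Under these assumptions one chunk covers 32768 blocks, which is also the number of bits in a single 4 KiB bitmap block.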

var ErrShrinkUnsupported = errors.New("apfs: shrink would lose allocated extents (relocation not implemented)")

ErrShrinkUnsupported is returned by Shrink when at least one block at or above the requested new boundary is still marked allocated in the spaceman bitmap. Relocating those extents downward would require a pure-Go defragmenter, which is out of scope here.

var ErrUnsupported = errors.New("apfs: feature not implemented in this iteration")

ErrUnsupported is returned for code paths the parser knows exist on disk but does not yet implement (compressed extents, hashed FS-tree, etc.).

Functions

func Format

func Format(path string, sizeBytes int64, cfg FormatConfig) (filesystem.Filesystem, error)

Format creates a fresh APFS container at `path` and returns a filesystem.Filesystem wrapping its first volume. The file is auto-created if missing. For encrypted containers, populate cfg.Encryption with the passphrase.

func FormatAppleDmg

func FormatAppleDmg(path string, sizeBytes int64, cfg FormatConfig) error

FormatAppleDmg creates a real Apple DMG formatted as APFS using `hdiutil`. This is used by diskimage on macOS to produce images mountable by native tools.

func FormatContainer

func FormatContainer(path string, sizeBytes int64, volumeLabel string) error

FormatContainer writes a fresh, empty APFS container to the file at path. The file must already exist and be at least formatMetadataBlocks * 4 KiB in size; FormatContainer writes the metadata blocks at offsets 0 through (formatMetadataBlocks-1) * 4096 and leaves the remainder zeroed.

The resulting container has the layout described in the package-level comment of format.go. Open the file with OpenContainer to verify it is a valid APFS container; ListInodes will return an empty slice and ListSnapshots a nil slice.

func FormatContainerEncrypted

func FormatContainerEncrypted(path string, sizeBytes int64, volumeLabel string, passphrase []byte) error

FormatContainerEncrypted writes a fresh APFS container at path with FileVault-style software encryption enabled, protected by passphrase. The resulting container has NX_CRYPTO_SW set in nx_flags and carries a container + volume keybag pair that the kext-style unlock walk (passphrase → PBKDF2-derived key → KEK → VEK) can recover.

The container is byte-compatible with apfsfde.Open at the keybag-chain level — see TestFormatContainerEncrypted_Roundtrips. It is NOT yet expected to mount via `hdiutil attach -stdinpass` because the volume metadata blocks (APSB, OMAPs, FS-tree root, …) are still plaintext; adding the AES-XTS-VEK layer on top is the next iteration.

func FormatContainerEncryptedGPT

func FormatContainerEncryptedGPT(path string, totalSize int64, volumeLabel string, passphrase []byte) error

FormatContainerEncryptedGPT writes an Apple_APFS-GPT-wrapped FileVault-style encrypted container to path. The output is a single file totalSize bytes large with a protective MBR + primary GPT at the head, the APFS container starting at byte 1 MiB (LBA 2048), and a GPT backup at the tail. apfs.kext recognises the Apple_APFS (7C3457EF-…) partition GUID in the GPT and binds the synthesised container's physical store correctly — without it the kext attaches the raw image but the container scheme device shows `+0 B` capacity and no inner volumes.

The APFS container itself is exactly what FormatContainerEncrypted produces. totalSize must accommodate the GPT overhead (~1 MiB at the head + ~16 KiB at the tail) plus the formatMetadataBlocks-class minimum APFS container size.
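As a rough illustration of the size budget described above, the sketch below computes where the container starts and how much room is left for it. It assumes 512-byte sectors and a conventional backup GPT of 33 sectors (32 sectors of partition entries plus one header sector, about 16.5 KiB); neither value is confirmed by this documentation, and containerSpan is a hypothetical helper. The real writer may round the container down further, for example to a block multiple.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	sectorSize      = 512             // assumption: 512-byte logical sectors
	containerOffset = 1 << 20         // container starts at byte 1 MiB (LBA 2048)
	backupGPTBytes  = 33 * sectorSize // assumption: 32 entry sectors + 1 header sector
)

// containerSpan returns the byte offset of the APFS container and an upper
// bound on its length inside an image of totalSize bytes.
func containerSpan(totalSize int64) (offset, maxLen int64, err error) {
	maxLen = totalSize - containerOffset - backupGPTBytes
	if maxLen <= 0 {
		return 0, 0, errors.New("image too small for GPT overhead")
	}
	return containerOffset, maxLen, nil
}

func main() {
	off, n, err := containerSpan(64 << 20)
	fmt.Println(off/sectorSize, n, err)
}
```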

func Open

func Open(imagePath string, partIndex int) (filesystem.Filesystem, error)

Open opens an existing APFS container at `imagePath`. Resolution order:

  1. Try real APFS (pure Go, all platforms).
  2. Try FileVault-encrypted real APFS (requires a passphrase, see OpenWithKeys for that variant).
  3. On darwin: hdiutil-mount the image and proxy operations to the mountpoint.

Returns ErrNoHeader when nothing matches.

The partIndex parameter is accepted for API compatibility but currently ignored (the pure-Go reader uses the container's first volume; GPT-partitioned images are handled transparently by OpenContainerAuto in fs.go).

func OpenFDE

func OpenFDE(imagePath string, passphrase []byte, partIndex int) (filesystem.Filesystem, error)

OpenFDE opens a FileVault 2-encrypted APFS container at imagePath, unlocking it with passphrase, and returns a Filesystem backed by the decrypted container.

If imagePath does not look like a FileVault container, an error is returned; the caller should fall back to Open.

func OpenFromBlockDevice

func OpenFromBlockDevice(dev BlockRW, partIndex int) (filesystem.Filesystem, error)

OpenFromBlockDevice opens an APFS container from any read-write block device satisfying BlockRW. Useful for QCOW2 or memory-backed backends. For FileVault-encrypted devices, use OpenFDE instead.

func OpenWithKeys

func OpenWithKeys(imagePath string, partIndex int, keys ...string) (filesystem.Filesystem, error)

OpenWithKeys opens an APFS container, trying the supplied keys against the FileVault keybag (if encrypted). For unencrypted images the keys are ignored. Falls back to hdiutil-mount on darwin.

Types

type BlockRW

type BlockRW interface {
	ReadAt(p []byte, off int64) (int, error)
	WriteAt(p []byte, off int64) (int, error)
	Close() error
}

BlockRW is the minimal interface for an arbitrary read-write block device accepted by OpenFromBlockDevice. Kept for source-level compatibility with callers that still pass custom backends.

type Container

type Container struct {
	// contains filtered or unexported fields
}

Container is an opened APFS container. It does not hold any keys — callers must unlock the underlying device with go-fde/apfs (or supply a non-encrypted reader) before passing it here.

func OpenContainer

func OpenContainer(path string) (*Container, error)

OpenContainer opens a real APFS container at path read-only.

func OpenContainerAuto

func OpenContainerAuto(path string) (*Container, error)

OpenContainerAuto opens an APFS container at path read-only, auto-detecting whether the file is naked APFS (NX SB at offset 0) or GPT-wrapped (NX SB inside the Apple_APFS partition). Use this when you don't know up-front whether you're looking at the output of our `FormatContainer` (naked) or Apple's `hdiutil create -fs APFS` (GPT).

func OpenContainerFromBackend

func OpenContainerFromBackend(r containerReader) (*Container, error)

OpenContainerFromBackend opens an APFS container from any ReadAt-capable backend. If the backend additionally satisfies containerWriter (WriteAt), write APIs are enabled. The caller retains ownership of the backend (Close on the returned container will not close it).

func OpenContainerRW

func OpenContainerRW(path string) (*Container, error)

OpenContainerRW opens an APFS container at path read-write so callers can invoke the mutating APIs (WriteFileInPlace, ...). Read paths behave identically to OpenContainer.

func OpenContainerRWAuto

func OpenContainerRWAuto(path string) (*Container, error)

OpenContainerRWAuto is OpenContainerAuto with read+write capability. When the image is GPT-wrapped, both reads and writes are offset into the Apple_APFS partition; the GPT header and protective MBR are untouched (writes outside the APFS partition would corrupt them).

func (*Container) AddVolume

func (c *Container) AddVolume(label string) (int, error)

AddVolume adds a new volume to an open APFS container. Returns the new volume's index in the container's fs_oid array (0-based; the existing first volume is at index 0). Caller does NOT need to invoke Commit afterward — the changes are persisted in place at block 0 + the current desc-area NX SB copy + the container OMAP leaf.

Limit: the container starts with one volume from FormatContainer and can grow to a total of 100. Each additional volume needs 6 fresh metadata blocks; AddVolume returns an error when the chunk bitmap can't supply 6 contiguous free blocks past the format-time metadata.

func (*Container) Close

func (c *Container) Close() error

Close releases the underlying file descriptor when one was opened by OpenContainer. For a container obtained from OpenContainerFromBackend it is a no-op, since the caller retains ownership of the backend.

func (*Container) Commit

func (c *Container) Commit() error

Commit promotes the in-memory state to a new on-disk checkpoint. Callers that have run CreateFile / WriteFile / etc. must Commit before macOS will mount the result; without a Commit the mutations live only at the FS-tree level and the (older) checkpoint that fsck uses to validate the container does not see them.

The Commit cascade:

  1. Compute the next checkpoint's xid (= current xid + 1).
  2. Compute the next slots in the desc + data ring buffers.
  3. Write a fresh SPACEMAN, REAPER, FQ_IP and FQ_MAIN at the new data slots, all carrying the new xid.
  4. Write a CheckpointMap at the next desc slot mapping each ephemeral OID to its new paddr.
  5. Write a new block-0 NX SB pointing at the new checkpoint, then replicate it at the desc slot AFTER the CheckpointMap.

All writes are sealed (Fletcher64) before WriteAt; the underlying backend's WriteAt is called sequentially in cascade order so a crash before block 0 is updated leaves the previous checkpoint intact.
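For readers unfamiliar with the sealing step, the sketch below shows the Fletcher-64 variant commonly described for APFS object checksums: the sum covers everything after the 8-byte checksum field, and the result is stored in those first 8 bytes. The helper names `fletcher64`, `seal` and `verify` are hypothetical, and this is not necessarily how the package structures its own code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const mod32 = 0xFFFFFFFF

// fletcher64 computes the checksum over block[8:], i.e. everything after
// the 8-byte checksum field. len(block) is assumed to be a multiple of 4.
func fletcher64(block []byte) uint64 {
	var sum1, sum2 uint64
	for i := 8; i+4 <= len(block); i += 4 {
		sum1 = (sum1 + uint64(binary.LittleEndian.Uint32(block[i:]))) % mod32
		sum2 = (sum2 + sum1) % mod32
	}
	c1 := mod32 - ((sum1 + sum2) % mod32)
	c2 := mod32 - ((sum1 + c1) % mod32)
	return c2<<32 | c1
}

// seal stores the checksum in the first 8 bytes of the block.
func seal(block []byte) {
	binary.LittleEndian.PutUint64(block[:8], fletcher64(block))
}

// verify recomputes the checksum and compares it with the stored value.
func verify(block []byte) bool {
	return binary.LittleEndian.Uint64(block[:8]) == fletcher64(block)
}

func main() {
	block := make([]byte, 4096)
	copy(block[8:], "example payload")
	seal(block)
	fmt.Println(verify(block)) // true
	block[100] ^= 1            // simulate a single-bit corruption
	fmt.Println(verify(block)) // false
}
```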

func (*Container) Grow

func (c *Container) Grow(newSizeBytes int64) error

Grow extends the container to at least newSizeBytes. The growth is rejected when newSizeBytes is not strictly larger than the current container, when it would require a new spaceman chunk, or when the container is read-only. On success the NX superblock, spaceman, and chunk_info_block are all updated and the backing storage is extended where the backend supports Truncate.

func (*Container) OpenSnapshot

func (c *Container) OpenSnapshot(snap Snapshot) (*Volume, error)

OpenSnapshot returns a read-only Volume that exposes the volume as it was at the snapshot's transaction id. The frozen APSB is resolved through the container OMAP with xid = snap.XID, and every subsequent virtual-oid resolution inside that volume is similarly clamped via Volume.xidLimit so the snapshot's FS-tree, OMAP and snap_meta tree all read their frozen state.

func (*Container) OpenVolume

func (c *Container) OpenVolume(index int) (*Volume, error)

OpenVolume materialises the volume at the given index of Volumes(). It resolves the APSB through the container omap, then loads the volume's own omap and FS-tree root.

func (*Container) Resize

func (c *Container) Resize(newSizeBytes int64) error

Resize is a convenience dispatcher: it computes the direction from the current container size and forwards to Grow or Shrink. A no-op (newSizeBytes equal to the current size) returns nil.

func (*Container) SetVerifyHashes

func (c *Container) SetVerifyHashes(on bool)

SetVerifyHashes toggles SHA-256 verification of hashed B-tree children. When enabled, every traversal that descends a hashed internal node validates the child block's hash against the 32-byte digest stored after the child OID in the parent's value. Mismatches surface as errors from FindInode, LookupInodeRecord, ListInodes, ListSnapshots, etc.

Apple uses hashed B-trees for sealed (signed) volumes such as the macOS system volume; non-hashed trees are silently exempt.
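The check itself amounts to hashing the child block and comparing it with the digest recorded in the parent. A minimal sketch, assuming the whole child block is hashed as-is; `verifyChild` and `digestOf` are hypothetical names:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digestOf returns the SHA-256 digest of a block.
func digestOf(block []byte) [sha256.Size]byte { return sha256.Sum256(block) }

// verifyChild checks a child block against the 32-byte digest that the
// parent node stores next to the child's OID.
func verifyChild(child []byte, want [sha256.Size]byte) error {
	got := digestOf(child)
	if got != want { // fixed-size arrays compare by value
		return fmt.Errorf("hashed b-tree child mismatch: got %x, want %x", got[:8], want[:8])
	}
	return nil
}

func main() {
	child := []byte("leaf node bytes")
	stored := digestOf(child)
	fmt.Println(verifyChild(child, stored) == nil) // true

	tampered := []byte("leaf node bytez")
	fmt.Println(verifyChild(tampered, stored) == nil) // false
}
```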

func (*Container) Shrink

func (c *Container) Shrink(newSizeBytes int64) error

Shrink reduces the container to exactly newSizeBytes. The operation is rejected when newSizeBytes is not strictly smaller than the current container, when any block ≥ newBlocks is allocated, when it would require a new spaceman chunk (i.e. shrink to less than formatMetadataBlocks * blockSize), or when the container is read-only. On success the spaceman, chunk_info_block, and NX superblock all advertise the smaller geometry and (for a Truncate-capable backend) the underlying file is trimmed.

func (*Container) Volumes

func (c *Container) Volumes() []VolumeInfo

Volumes lists the volumes declared in the NX superblock fs_oid array. Names are NOT decoded here (that requires opening the volume).

type FDEConfig

type FDEConfig struct {
	Passphrase string
}

FDEConfig holds the passphrase for FileVault-encrypted format.

type FormatConfig

type FormatConfig struct {
	Label      string
	Encryption *FDEConfig
}

FormatConfig configures Format. The Encryption field accepts an `*FDEConfig` to produce a FileVault-encrypted APFS container.

type Inode

type Inode struct {
	ID       uint64 // file system object identifier
	ParentID uint64
	Name     string // populated when discovered through a parent's directory record
	Mode     uint16 // file mode (POSIX bits)
	Size     uint64 // logical file size (J_DSTREAM.size)
	IsDir    bool
	// contains filtered or unexported fields
}

Inode is the minimal projection of a J_INODE_VAL record exposed by this iteration of the parser.

type Sibling

type Sibling struct {
	OwnerID   uint64 // the inode this sibling refers to
	SiblingID uint64
	ParentID  uint64
	Name      string
}

Sibling is one J_SIBLING_LINK record: an alternate (parent, name) path for the inode it belongs to (i.e., a hard link).

type Snapshot

type Snapshot struct {
	XID        uint64
	APSBOID    uint64 // sblock_oid: the frozen APSB to open for read access
	Name       string
	CreateTime uint64
	ChangeTime uint64
	Inum       uint64
	Flags      uint32
}

Snapshot is one entry from the volume's snapshot metadata tree. It corresponds to a J_SNAP_META record (apfs_snap_meta_val): the frozen transaction id (XID), human-readable name, and the OID of the volume superblock captured by the snapshot.

type Volume

type Volume struct {
	// contains filtered or unexported fields
}

Volume is an opened volume inside a container.

func (*Volume) CreateBlockDevice

func (v *Volume) CreateBlockDevice(parentOID uint64, name string, perm uint16, rdev uint32) (uint64, error)

CreateBlockDevice creates a block-device node under (parentOID, name). `rdev` is the encoded device number (major / minor pair packed via the platform's `mkdev` macro — the kernel decodes it into st_rdev on stat(2)). The inode carries an INO_EXT_TYPE_RDEV xfield.
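As a rough illustration of the `rdev` packing, Darwin's `makedev` macro places the major number in the top 8 bits and the minor number in the low 24 bits. The helpers below are a sketch of that layout, not part of this package; other platforms (Linux, for example) pack the pair differently.

```go
package main

import "fmt"

// makedev packs a major/minor pair the way Darwin's makedev macro does:
// major in bits 24-31, minor in bits 0-23.
func makedev(maj, mnr uint32) uint32 { return (maj&0xff)<<24 | (mnr & 0xffffff) }

// major extracts the major number from a Darwin-style device number.
func major(rdev uint32) uint32 { return rdev >> 24 & 0xff }

// minor extracts the minor number from a Darwin-style device number.
func minor(rdev uint32) uint32 { return rdev & 0xffffff }

func main() {
	rdev := makedev(1, 4)
	fmt.Printf("%#x %d %d\n", rdev, major(rdev), minor(rdev)) // 0x1000004 1 4
}
```

A value built this way is what a caller would pass as the `rdev` argument when targeting a macOS host.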

func (*Volume) CreateCharDevice

func (v *Volume) CreateCharDevice(parentOID uint64, name string, perm uint16, rdev uint32) (uint64, error)

CreateCharDevice creates a character-device node — same as CreateBlockDevice but with `mode = S_IFCHR` and drec type DT_CHR.

func (*Volume) CreateDirectory

func (v *Volume) CreateDirectory(parentOID uint64, name string, perm uint16) (uint64, error)

CreateDirectory creates a new directory inode under parentOID with the given name and POSIX permission bits. Returns the new directory's inode oid. The directory starts empty (nchildren = 0); its parent's nchildren is incremented to reflect the new dentry.

parentOID may be APFS_ROOT_DIR_PARENT (1) or APFS_ROOT_DIR_INO_NUM (2) to bind the dentry under the canonical root directory; both rebind to oid 2.

func (*Volume) CreateFifo

func (v *Volume) CreateFifo(parentOID uint64, name string, perm uint16) (uint64, error)

CreateFifo creates a named pipe under (parentOID, name) with the given permission bits. Returns the new FIFO's inode oid. FIFOs have no content blocks; the inode + drec are sufficient.

func (*Volume) CreateFile

func (v *Volume) CreateFile(parentOID uint64, name string, data []byte) (uint64, error)

CreateFile inserts a regular file under parentOID with the given name and content. Returns the new inode's object id on success.

Preconditions: the FS-tree root must currently be a single leaf node (true immediately after FormatContainer and for small populated volumes), and the parent oid must reference an existing directory (or be 1, the synthetic root). The container must be opened with write capability (OpenContainerRW).

func (*Volume) CreateHardlink

func (v *Volume) CreateHardlink(targetOID, newParentOID uint64, newName string) error

CreateHardlink adds a second name (alias) for an existing file at targetOID under newParentOID. After the call the file has nlink=2: it is reachable both through its original drec (the one CreateFile installed) and through the freshly added drec named newName under newParentOID. Both names show the same inode number to the kernel (`stat` returns the same st_ino), and removing either link decrements nlink — the inode persists until nlink reaches 0.

Limits in this iteration:

  • the target's nlink must be exactly 1 (single-name file). The 1→2 transition retroactively creates J_SIBLING_LINK records for both the existing primary drec and the new alias.
  • both the original drec and the new alias must live in the same leaf so the in-place upsert path stays simple. This is the case when the existing drec's parent is the root dir AND newParentOID is also the root dir, which is the common test workload.

func (*Volume) CreateSnapshot

func (v *Volume) CreateSnapshot(name string) (uint64, error)

CreateSnapshot adds a snapshot named `name` to the current volume. Returns the snapshot's xid (which Apple uses both as the on-disk identifier and as the OMAP key for resolving the frozen APSB later via OpenSnapshot).

The snapshot's xid is taken from the container's nextXID counter, matching Apple's convention of stamping a snapshot with the xid the container will hand out at the next Commit. The Commit cascade then promotes this xid into the live state.

func (*Volume) CreateSocket

func (v *Volume) CreateSocket(parentOID uint64, name string, perm uint16) (uint64, error)

CreateSocket creates a UNIX-domain socket node. Same shape as a FIFO but with `mode = S_IFSOCK` and drec type DT_SOCK.

func (*Volume) CreateSparseFile

func (v *Volume) CreateSparseFile(parentOID uint64, name string, size uint64) (uint64, error)

CreateSparseFile creates an empty (all-zero) regular file under (parentOID, name) with the given declared logical size. The file reads as N bytes of zeros without consuming any physical blocks. Returns the new inode's oid.

Use case: pre-allocating a file's logical size without paying for the storage, e.g. for a sparse VM disk image. Replacing the sparse hole with real data through a later `OverwriteFile` call is not supported yet: the existing OverwriteFile path doesn't handle hole-to-real transitions, which would need additional extent-replacement logic.

func (*Volume) CreateSymlink

func (v *Volume) CreateSymlink(parentOID uint64, name, target string) (uint64, error)

CreateSymlink creates a symbolic-link inode under parentOID with the given name. The target path is stored as the embedded payload of a `com.apple.fs.symlink` xattr — Apple's documented convention for APFS symlinks. Returns the new symlink's inode oid.

Mode is fixed to S_IFLNK | 0o777 (the canonical UNIX symlink mode); timestamps come from `time.Now()` and owner/group from `os.Geteuid()/Getegid()` to match what `apfs.kext` writes when the host user creates a symlink through the mounted volume.
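For reference, the mode arithmetic works out as follows. The constants are the standard POSIX file-type bits; `fileType` is a hypothetical helper written for this sketch and is not part of the package API.

```go
package main

import "fmt"

// Standard POSIX file-type bits, as stored in the upper part of the mode.
const (
	sIFMT   = 0o170000 // mask for the file-type bits
	sIFIFO  = 0o010000
	sIFCHR  = 0o020000
	sIFDIR  = 0o040000
	sIFBLK  = 0o060000
	sIFREG  = 0o100000
	sIFLNK  = 0o120000
	sIFSOCK = 0o140000
)

// fileType names the node type encoded in a 16-bit mode value.
func fileType(mode uint16) string {
	switch mode & sIFMT {
	case sIFIFO:
		return "fifo"
	case sIFCHR:
		return "char device"
	case sIFDIR:
		return "directory"
	case sIFBLK:
		return "block device"
	case sIFREG:
		return "regular file"
	case sIFLNK:
		return "symlink"
	case sIFSOCK:
		return "socket"
	}
	return "unknown"
}

func main() {
	mode := uint16(sIFLNK | 0o777)
	fmt.Printf("%o %s\n", mode, fileType(mode)) // 120777 symlink
}
```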

func (*Volume) DebugWalkInodes

func (v *Volume) DebugWalkInodes(visit func(oid uint64, val []byte)) error

DebugWalkInodes calls visit(oid, rawJInodeVal) for every J_INODE record in the FS-tree, walking multi-level trees via traverseFSTree.

func (*Volume) DeleteDirectory

func (v *Volume) DeleteDirectory(parentOID uint64, name string) error

DeleteDirectory removes an empty directory at (parentOID, name). Like POSIX `rmdir(2)`, it refuses non-empty directories: it first counts every J_DIR_REC under the target oid and returns an error if the count is non-zero. It also refuses to remove the canonical root or private-dir oids.

On success: drops the directory's J_INODE (+ any J_XATTR records it owned), drops the J_DIR_REC under (parentOID, name), refreshes the parent's nchildren, and decrements `apfs_num_directories` (APSB +0xC0).

func (*Volume) DeleteFile

func (v *Volume) DeleteFile(parentOID uint64, name string) error

DeleteFile removes the file at (parentOID, name) from the volume. For nlink==1 files: all four (inode, drec, file_extent, dstream_id) records are removed; the file's extent blocks are freed; the parent's nchildren is decremented; APSB counters (apfs_num_files, apfs_fs_alloc_count) are updated. For nlink>1 files: only this name's drec + its matching J_SIBLING_LINK + J_SIBLING_MAP records are dropped, and the inode's nlink is decremented in place. The inode, its extents, xattrs and dstream_id stay because the other names still reference them.

func (*Volume) DeleteSnapshot

func (v *Volume) DeleteSnapshot(name string) error

DeleteSnapshot removes the snapshot named `name` from the volume. Returns os.ErrNotExist when no snapshot of that name exists.

The snapshot's frozen APSB block is freed, the J_SNAP_NAME + J_SNAP_META records are dropped from the snap-meta tree, and `apsb.apfs_num_snapshots` is decremented. If `name` was the most-recent snapshot, the volume OMAP's `om_most_recent_snap` is rolled back to the new maximum xid (or 0 when no snapshots remain).

func (*Volume) FileReaderAt

func (v *Volume) FileReaderAt(inode Inode) (io.ReaderAt, error)

FileReaderAt returns an io.ReaderAt that streams the bytes of the regular file at `inode` on demand without buffering the whole payload. Bounded to `inode.Size`: reads past the end return `io.EOF`. Sparse holes return zeros without consuming I/O.

The returned reader holds a snapshot of the inode's extent list at the time of the call; subsequent writes to the file will not be visible.

Returns an error if `inode` is a directory.
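The behaviour described above (bounded reads, zero-filled holes, a fixed extent list) can be sketched with a small `io.ReaderAt`. Everything here is hypothetical: the `extent` and `extentReader` types and their fields are illustrative and do not mirror the package's internals.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// extent maps a logical byte range of a file onto the backing store.
type extent struct {
	logical  int64 // offset within the file
	physical int64 // byte offset within the backing store
	length   int64
}

// extentReader serves reads from an extent list captured at creation time.
type extentReader struct {
	src     io.ReaderAt
	extents []extent
	size    int64 // logical file size; reads are bounded to it
}

func newExtentReader(store []byte, exts []extent, size int64) *extentReader {
	return &extentReader{src: bytes.NewReader(store), extents: exts, size: size}
}

func (r *extentReader) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, errors.New("negative offset")
	}
	if off >= r.size {
		return 0, io.EOF
	}
	end := off + int64(len(p))
	if end > r.size {
		end = r.size
	}
	out := p[:end-off]
	for i := range out {
		out[i] = 0 // holes and the trailing region read as zeros
	}
	for _, e := range r.extents {
		lo, hi := e.logical, e.logical+e.length
		if lo < off {
			lo = off
		}
		if hi > end {
			hi = end
		}
		if lo >= hi {
			continue // extent does not overlap the requested range
		}
		n, err := r.src.ReadAt(out[lo-off:hi-off], e.physical+lo-e.logical)
		if err != nil && int64(n) < hi-lo {
			return 0, err
		}
	}
	if len(out) < len(p) {
		return len(out), io.EOF
	}
	return len(out), nil
}

func main() {
	r := newExtentReader([]byte("....ABCD....EF"), []extent{
		{logical: 0, physical: 4, length: 4},
		{logical: 8, physical: 12, length: 2},
	}, 12)
	buf := make([]byte, 12)
	n, err := r.ReadAt(buf, 0)
	fmt.Printf("%d %v %q\n", n, err, buf)
}
```

Because the extent slice is copied into the reader when it is built, later changes to the file's extents would not affect it, which matches the snapshot behaviour noted above.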

func (*Volume) FindInode

func (v *Volume) FindInode(oid uint64) (Inode, error)

FindInode locates an inode by object id and returns a fully populated Inode (Mode, Size, IsDir, ParentID, Name and dataExtents).

Implementation: two B-tree seeks via seekAndIterate, both O(log n + k) where k is the number of records visited.

  • Phase 1 seeks (oid, type=0) and walks forward while j_key.oid == oid, gathering J_INODE_VAL, J_FILE_EXTENT (and could trivially gather xattrs / sibling links — those expose dedicated APIs already).
  • Phase 2 seeks (parent_id, jTypeDirRec) and walks forward while the j_key prefix stays at that (parent_id, type), looking for the drec whose value's file_id field matches our oid; that drec carries the directory entry name.

Requires the FS-tree leaves to be sorted in canonical APFS order (by compareFSKey) — synthetic test images built with this package's helpers honour that automatically.
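The seek-then-iterate pattern relies on how FS-tree keys pack the object id and record type into one 64-bit word (low 60 bits for the oid, high 4 bits for the type). The sketch below models that over a sorted slice instead of a B-tree. `recordsFor` is a hypothetical stand-in for the package's seekAndIterate, and the ordering shown is simplified: the real comparison also orders records that share an (oid, type) pair.

```go
package main

import (
	"fmt"
	"sort"
)

const (
	objIDMask    = 0x0fffffffffffffff
	objTypeShift = 60

	typeInode      = 3 // APFS_TYPE_INODE
	typeFileExtent = 8 // APFS_TYPE_FILE_EXTENT
	typeDirRec     = 9 // APFS_TYPE_DIR_REC
)

// key is the decoded form of a j_key_t header word.
type key struct {
	oid uint64
	typ uint8
}

func encode(k key) uint64   { return k.oid&objIDMask | uint64(k.typ)<<objTypeShift }
func decode(raw uint64) key { return key{oid: raw & objIDMask, typ: uint8(raw >> objTypeShift)} }

// less orders keys by oid first, then by record type.
func less(a, b key) bool {
	if a.oid != b.oid {
		return a.oid < b.oid
	}
	return a.typ < b.typ
}

// recordsFor seeks to (oid, type=0) and walks forward while the oid matches.
func recordsFor(sorted []key, oid uint64) []key {
	i := sort.Search(len(sorted), func(i int) bool {
		return !less(sorted[i], key{oid: oid})
	})
	var out []key
	for ; i < len(sorted) && sorted[i].oid == oid; i++ {
		out = append(out, sorted[i])
	}
	return out
}

func main() {
	keys := []key{
		{oid: 2, typ: typeInode}, {oid: 2, typ: typeDirRec},
		{oid: 16, typ: typeInode}, {oid: 16, typ: typeFileExtent},
		{oid: 17, typ: typeInode},
	}
	fmt.Println(recordsFor(keys, 16))                           // [{16 3} {16 8}]
	fmt.Println(decode(encode(key{oid: 16, typ: typeDirRec}))) // {16 9}
}
```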

func (*Volume) ListInodes

func (v *Volume) ListInodes() ([]Inode, error)

ListInodes walks the entire FS-tree and returns every J_INODE_VAL projected through Inode. Names and data extents discovered in the same traversal are folded into the matching inode. This is now a full traversal — every leaf contributes, regardless of B-tree height.

func (*Volume) ListSiblings

func (v *Volume) ListSiblings(owner Inode) ([]Sibling, error)

ListSiblings walks the FS-tree and returns every J_SIBLING_LINK record that names inode owner.ID. Each sibling is a hard-link path (parent + name) pointing at owner.

func (*Volume) ListSnapshots

func (v *Volume) ListSnapshots() ([]Snapshot, error)

ListSnapshots opens the volume's snapshot metadata tree and returns every J_SNAP_META record it contains. Returns an empty slice when the volume has no snapshots (apfs_snap_meta_tree_oid = 0).

func (*Volume) ListXAttrs

func (v *Volume) ListXAttrs(owner Inode) ([]XAttr, error)

ListXAttrs walks the FS-tree and returns every J_XATTR record attached to inode owner.ID. Stream xattrs are reported with empty EmbeddedValue and non-zero StreamID; fetch their payload via `ReadXAttrStream` or `XAttrStreamReaderAt`.

func (*Volume) LookupInodeRawValue

func (v *Volume) LookupInodeRawValue(oid uint64) ([]byte, error)

LookupInodeRawValue returns the raw J_INODE_VAL bytes for the inode with the given oid. Used by debug helpers; production code should prefer LookupInodeRecord / FindInode.

func (*Volume) LookupInodeRecord

func (v *Volume) LookupInodeRecord(oid uint64) (Inode, error)

LookupInodeRecord locates the J_INODE_VAL for the given oid using B-tree binary search through the FS-tree (O(log n) reads, without the extra record walk and directory-entry seek that FindInode performs). The returned Inode has Mode, Size, IsDir and ParentID populated; Name and dataExtents are NOT populated (those records live under different keys in the tree). Use FindInode when full inode information is required.

func (*Volume) LookupSnapshotByName

func (v *Volume) LookupSnapshotByName(name string) (Snapshot, error)

LookupSnapshotByName resolves a snapshot by its human-readable name.

Fast path (Apple-spec-compliant images): the snapshot metadata tree is expected to carry a J_SNAP_NAME record alongside every J_SNAP_META. J_SNAP_NAME records sort alphabetically by name within their (oid=0, type=jTypeSnapName) range, so a single seekAndIterate finds the entry in O(log n) and yields the matching XID; a second seekAndIterate then resolves the J_SNAP_META at that XID to populate the full Snapshot.

Fallback path (synthetic test images that only carry J_SNAP_META): if the fast path returns no match, ListSnapshots is scanned linearly. This keeps the helper compatible with images built incrementally without the J_SNAP_NAME side records.

Returns os.ErrNotExist when neither path turns up a match.

func (*Volume) Name

func (v *Volume) Name() string

Name returns the volume name (apfs_volname_t, NUL-trimmed UTF-8).

func (*Volume) OverwriteFile

func (v *Volume) OverwriteFile(oid uint64, newData []byte) error

OverwriteFile replaces the entire content of the file at `oid` with `newData`. The file's logical size becomes `len(newData)`. The file must be a regular file with at least one existing extent.

Allocation policy:

  • newData fits in the existing extents' total capacity: payload is written across them in logical-offset order, the partial tail of the boundary extent is zeroed, and the inode size is updated. No new extents are allocated, no extents are freed.
  • newData EXCEEDS existing capacity: the head fills the existing extents, then a single fresh contiguous extent is allocated at logical offset = old total capacity for the tail. The new J_FILE_EXTENT is inserted, chunk bitmap + ci_free_count + sm_free_count + extent-ref tree + apfs_fs_alloc_count are all updated, and the inode's J_DSTREAM `alloced_size` is bumped.
  • newData is smaller than the existing logical size: the inode's size is reduced; trailing extent blocks stay allocated. Use `TruncateFile(oid, len(newData))` afterwards if you also want to free the no-longer-used blocks.

Multi-extent files are supported on both the in-place and the grow paths.
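The three cases of the allocation policy can be summarised as a small decision function. This is only a sketch of the rules as stated above: `planOverwrite`, its parameters and its return labels are hypothetical, and the block count assumes the fresh extent is rounded up to whole blocks.

```go
package main

import "fmt"

// planOverwrite classifies an overwrite request.
// capacity is the total byte capacity of the existing extents, oldSize the
// current logical size, newLen the length of the new payload.
// It returns a label and the number of fresh blocks a grow would need.
func planOverwrite(capacity, oldSize, newLen, blockSize uint64) (string, uint64) {
	switch {
	case newLen > capacity:
		// The tail beyond the old capacity goes into one new extent.
		extra := newLen - capacity
		return "grow", (extra + blockSize - 1) / blockSize
	case newLen < oldSize:
		// Trailing blocks stay allocated until TruncateFile is called.
		return "in-place, size reduced", 0
	default:
		return "in-place", 0
	}
}

func main() {
	fmt.Println(planOverwrite(8192, 5000, 10000, 4096)) // grow 1
	fmt.Println(planOverwrite(8192, 5000, 6000, 4096))  // in-place 0
	fmt.Println(planOverwrite(8192, 5000, 100, 4096))   // in-place, size reduced 0
}
```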

func (*Volume) ReadFile

func (v *Volume) ReadFile(inode Inode) ([]byte, error)

ReadFile reads the contents of a regular file by concatenating every J_FILE_EXTENT for the inode in logical-offset order. Sparse holes (gaps between extents) are zero-filled. The trailing zero region implied by inode.Size > sum(extent.length) is also zero-filled.

func (*Volume) ReadFileTransparent

func (v *Volume) ReadFileTransparent(inode Inode) ([]byte, error)

ReadFileTransparent reads a regular file and decompresses it on the fly when the file carries a com.apple.decmpfs xattr (transparent file compression). For uncompressed files it falls back to ReadFile.

Supported decmpfs types:

inline  — 1 (uncompressed), 3 (zlib), 7 (LZVN), 11 (LZFSE)
rsrc-fork — 4 (zlib), 5 (raw), 8 (LZVN), 12 (LZFSE)

Resource-fork variants automatically fetch the file's com.apple.ResourceFork xattr (embedded or stream).

func (*Volume) ReadXAttrStream

func (v *Volume) ReadXAttrStream(x XAttr) ([]byte, error)

ReadXAttrStream returns the payload of an extended attribute stored as a stream (xattrFlagDataStream). For embedded xattrs it returns the payload already in x.EmbeddedValue. It collects every J_FILE_EXTENT keyed by the stream's xattr_obj_id, sorts them by logical offset, and concatenates (zero-filling sparse holes, trimming to x.StreamSize).

func (*Volume) Rename

func (v *Volume) Rename(oldParentOID uint64, oldName string, newParentOID uint64, newName string) error

Rename moves the entry at (oldParentOID, oldName) to (newParentOID, newName). The two (parent, name) pairs must differ (we reject the no-op case). Both parents may be either APFS_ROOT_DIR_PARENT (1) or APFS_ROOT_DIR_INO_NUM (2); they're rebound to oid 2 either way.

Overwrite semantics: if `(newParentOID, newName)` already exists AND points at a regular file with nlink==1, that file is deleted (records dropped, extents freed, APSB counters updated) before the rename proceeds — matching POSIX `rename(2)` for the regular-file case. Overwriting a directory or a multi-link inode is rejected.

Limit: single-link source inodes only (nlink == 1). Multi-link rename requires updating the corresponding J_SIBLING_LINK record's stored (parent_id, name) and is left as follow-up work.

func (*Volume) SetSuppressSnapshotGuard

func (v *Volume) SetSuppressSnapshotGuard(on bool)

SetSuppressSnapshotGuard toggles the snapshot-write guard on this volume handle. The default (false) is the safe choice: writers return ErrHasSnapshot when num_snapshots > 0. Setting to true tells the package "I know what I'm doing — proceed with in-place writes even if it corrupts the snapshot". Used by test fixtures that need post-snapshot writes for byte-diff diagnostics.
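The guard boils down to a check at the top of every writer. In the sketch below, `volumeHandle`, `checkWritable` and `errHasSnapshot` are hypothetical stand-ins for the package's own handle and its ErrHasSnapshot sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// errHasSnapshot stands in for the package's ErrHasSnapshot sentinel.
var errHasSnapshot = errors.New("volume has snapshots; in-place write refused")

// volumeHandle is a hypothetical handle carrying the guard state.
type volumeHandle struct {
	numSnapshots  uint64
	suppressGuard bool
}

// checkWritable is what each writer would call before touching the disk.
func (v *volumeHandle) checkWritable() error {
	if v.numSnapshots > 0 && !v.suppressGuard {
		return errHasSnapshot
	}
	return nil
}

func main() {
	v := &volumeHandle{numSnapshots: 1}
	fmt.Println(errors.Is(v.checkWritable(), errHasSnapshot)) // true

	v.suppressGuard = true // opt out, accepting possible snapshot corruption
	fmt.Println(v.checkWritable() == nil) // true
}
```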

func (*Volume) SetXAttr

func (v *Volume) SetXAttr(oid uint64, name string, payload []byte) error

SetXAttr sets (or replaces) an embedded extended attribute on the inode at oid. Payload sizes up to a few hundred bytes are typical for xattrs like `com.apple.FinderInfo` (32 bytes), `com.apple.metadata:*` (a few hundred bytes), and `com.apple.quarantine` (variable). For payloads that don't fit in a single FS-tree leaf alongside the rest of the inode's records, callers should fall back to a stream xattr via SetXAttrStream.

Replace semantics: if a J_XATTR record with the same (oid, name) already exists, its value is overwritten. Reads via `ListXAttrs` after a Commit see the new payload.

func (*Volume) SetXAttrStream

func (v *Volume) SetXAttrStream(targetOID uint64, name string, payload []byte) error

SetXAttrStream sets (or replaces) a stream-mode extended attribute on the inode at `targetOID`. Use this for large xattr payloads (Time Machine's `com.apple.metadata:_kTimeMachineSnapshotMetadata`, quarantine attributes for big files, etc.) that don't fit inline in a single FS-tree leaf alongside the rest of the inode's records. The payload is written to a fresh extent at the volume's next free block; the extent + a J_DSTREAM_ID + a J_XATTR record (with the stream flag) are inserted into the FS-tree under a fresh `xattr_obj_id`.

If a stream xattr with the same name already exists, its old payload extent is freed, its extent-ref record is removed, and its J_FILE_EXTENT + J_DSTREAM_ID records (under the previous xattr_obj_id) are dropped before the new ones are inserted. The J_XATTR record itself is replaced via upsert semantics. An existing embedded xattr of the same name is rejected — call SetXAttr (or delete first) for that case.

func (*Volume) TruncateFile

func (v *Volume) TruncateFile(oid uint64, newSize uint64) error

TruncateFile sets the file at `oid` to exactly `newSize` bytes.

Semantics:

  • newSize ≥ existing logical size: only the inode's size field is bumped. The file becomes sparse past the existing extents — reads past them return zero. No new extents are allocated.
  • newSize < existing logical size: the inode's size is reduced AND extents that fall entirely past `newSize` are freed (chunk bitmap, ci_free_count, sm_free_count, extent-ref tree, and apfs_fs_alloc_count are all updated). When `newSize` lands in the middle of an extent, that extent is shrunk to the smallest block-aligned size that still contains `newSize`; the tail blocks of that extent are freed individually.

fsck tolerates the case where the surviving extent's last block contains bytes past `newSize`: the documented invariant is `alloced_size ≥ size`, not `alloced_size == size`.
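The block arithmetic for an extent that straddles `newSize` can be worked through as below. `truncateExtent` is a hypothetical helper; it assumes the extent length is a multiple of the block size.

```go
package main

import "fmt"

// truncateExtent returns how many bytes of an extent survive a truncate to
// newSize and how many of its blocks are freed. length is assumed to be a
// multiple of blockSize.
func truncateExtent(logical, length, newSize, blockSize uint64) (keepBytes, freedBlocks uint64) {
	switch {
	case newSize <= logical:
		// The extent lies entirely past newSize: free all of it.
		return 0, length / blockSize
	case newSize >= logical+length:
		// The extent lies entirely below newSize: keep all of it.
		return length, 0
	default:
		// Round the surviving part up to the next block boundary.
		keepBytes = (newSize - logical + blockSize - 1) / blockSize * blockSize
		return keepBytes, (length - keepBytes) / blockSize
	}
}

func main() {
	fmt.Println(truncateExtent(0, 16384, 5000, 4096))     // 8192 2
	fmt.Println(truncateExtent(16384, 8192, 5000, 4096))  // 0 2
	fmt.Println(truncateExtent(0, 16384, 20000, 4096))    // 16384 0
}
```

In the first case the surviving 8192 bytes exceed the new size of 5000, which is the `alloced_size ≥ size` situation described above.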

func (*Volume) WriteFile

func (v *Volume) WriteFile(inode Inode, data []byte) error

WriteFile is iteration B of the read/write roadmap: it performs the in-place data overwrite of WriteFileInPlace AND patches the inode's J_DSTREAM.size field on disk so subsequent reads see len(data) as the file's logical size. The FS-tree leaf carrying the J_INODE_VAL is re-emitted to the same physical block (the inode value's length is unchanged), so this call does not trigger a checkpoint cascade.

Returns ErrReadOnly when the container has no write capability; returns the same capacity / sparsity errors as WriteFileInPlace; and returns an error when the on-disk inode value has no J_DSTREAM xfield to update (most regular files have one).

func (*Volume) WriteFileInPlace

func (v *Volume) WriteFileInPlace(inode Inode, data []byte) error

WriteFileInPlace overwrites the contents of inode with data, writing directly into the physical extents already allocated to the file. Returns ErrReadOnly when the container has no write capability; returns a descriptive error when the file's extent layout cannot accommodate the requested write.

The caller is expected to read inode via FindInode (which populates dataExtents); a stale Inode whose extents no longer match the on-disk layout will silently corrupt unrelated blocks. Read first, write second, in the same session, with no intervening mutation.

func (*Volume) XAttrStreamReaderAt

func (v *Volume) XAttrStreamReaderAt(x XAttr) (io.ReaderAt, error)

XAttrStreamReaderAt returns an io.ReaderAt that streams the bytes of the stream-extent xattr `x` on demand. Bounded to `x.StreamSize`. For embedded xattrs (no stream id) the returned reader wraps `x.EmbeddedValue` via `bytes.Reader`-like semantics: a single in-memory copy with the same ReaderAt interface.

type VolumeInfo

type VolumeInfo struct {
	Index uint32
	OID   uint64 // virtual oid of the APSB
	Name  string // populated lazily by OpenVolume
}

VolumeInfo describes a volume found inside a container.

type XAttr

type XAttr struct {
	OwnerID       uint64
	Name          string
	Flags         uint16
	EmbeddedValue []byte
	StreamID      uint64 // valid when Flags & xattrFlagDataStream != 0
	StreamSize    uint64
}

XAttr is one extended-attribute record decoded from the FS-tree. EmbeddedValue is non-nil when the attribute payload is stored inline in the J_XATTR_VAL record (xattrFlagDataEmbedded). Stream xattrs (whose data lives in a separate J_DSTREAM_ID chain) leave EmbeddedValue nil and expose StreamID + StreamSize so the caller can fetch them later.
