Documentation ¶
Overview ¶
Package blob provides a file archive format optimized for random access via HTTP range requests against OCI registries.
Archives consist of two OCI blobs:
- Index blob: FlatBuffers-encoded file metadata enabling O(log n) lookups
- Data blob: Concatenated file contents, sorted by path for efficient directory fetches
The package implements fs.FS and related interfaces for stdlib compatibility.
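The O(log n) lookup mentioned above follows from keeping index entries sorted by path. The sketch below illustrates the idea with a plain slice and a binary search; the `entry` type and `lookup` function are hypothetical stand-ins, not the package's FlatBuffers-backed index.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical, simplified index record. The real index is
// FlatBuffers-encoded and carries more metadata than shown here.
type entry struct {
	Path   string
	Offset int64 // byte offset of the file within the data blob
	Size   int64 // stored size in bytes
}

// lookup finds path in entries, which must be sorted by Path.
// sort.Search performs a binary search, so the cost is O(log n).
func lookup(entries []entry, path string) (entry, bool) {
	i := sort.Search(len(entries), func(i int) bool {
		return entries[i].Path >= path
	})
	if i < len(entries) && entries[i].Path == path {
		return entries[i], true
	}
	return entry{}, false
}

func main() {
	entries := []entry{
		{"a/x.txt", 0, 10},
		{"a/y.txt", 10, 5},
		{"b/z.txt", 15, 7},
	}
	e, ok := lookup(entries, "a/y.txt")
	fmt.Println(e.Offset, e.Size, ok) // prints "10 5 true"
}
```

Once an entry's offset and size are known, a single range read against the data blob is enough to fetch that file.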
Index ¶
- Constants
- Variables
- func Create(ctx context.Context, dir string, indexW, dataW io.Writer, opts ...CreateOption) error
- type Blob
- func (b *Blob) CopyDir(destDir, prefix string, opts ...CopyOption) error
- func (b *Blob) CopyTo(destDir string, paths ...string) error
- func (b *Blob) CopyToWithOptions(destDir string, paths []string, opts ...CopyOption) error
- func (b *Blob) Entries() iter.Seq[EntryView]
- func (b *Blob) EntriesWithPrefix(prefix string) iter.Seq[EntryView]
- func (b *Blob) Entry(path string) (EntryView, bool)
- func (b *Blob) IndexData() []byte
- func (b *Blob) Len() int
- func (b *Blob) Open(name string) (fs.File, error)
- func (b *Blob) ReadDir(name string) ([]fs.DirEntry, error)
- func (b *Blob) ReadFile(name string) ([]byte, error)
- func (b *Blob) Reader() *file.Reader
- func (b *Blob) Save(indexPath, dataPath string) error
- func (b *Blob) Stat(name string) (fs.FileInfo, error)
- func (b *Blob) Stream() io.Reader
- type BlobFile
- type ByteSource
- type ChangeDetection
- type Compression
- type CopyOption
- func CopyWithCleanDest(enabled bool) CopyOption
- func CopyWithOverwrite(overwrite bool) CopyOption
- func CopyWithPreserveMode(preserve bool) CopyOption
- func CopyWithPreserveTimes(preserve bool) CopyOption
- func CopyWithReadAheadBytes(limit uint64) CopyOption
- func CopyWithReadConcurrency(n int) CopyOption
- func CopyWithWorkers(n int) CopyOption
- type CreateBlobOption
- func CreateBlobWithChangeDetection(cd ChangeDetection) CreateBlobOption
- func CreateBlobWithCompression(compression Compression) CreateBlobOption
- func CreateBlobWithDataName(name string) CreateBlobOption
- func CreateBlobWithIndexName(name string) CreateBlobOption
- func CreateBlobWithMaxFiles(n int) CreateBlobOption
- func CreateBlobWithSkipCompression(fns ...SkipCompressionFunc) CreateBlobOption
- type CreateOption
- type Entry
- type EntryView
- type Option
- type SkipCompressionFunc
Constants ¶
const (
	CompressionNone = blobtype.CompressionNone
	CompressionZstd = blobtype.CompressionZstd
)
Re-export compression constants.
const (
	DefaultIndexName = "index.blob"
	DefaultDataName  = "data.blob"
)
Default file names for blob archives.
const DefaultMaxFiles = 200_000
DefaultMaxFiles is the default limit used when no MaxFiles option is set.
Variables ¶
var (
	// ErrHashMismatch is returned when file content does not match its hash.
	ErrHashMismatch = blobtype.ErrHashMismatch
	// ErrDecompression is returned when decompression fails.
	ErrDecompression = blobtype.ErrDecompression
	// ErrSizeOverflow is returned when byte counts exceed supported limits.
	ErrSizeOverflow = blobtype.ErrSizeOverflow
)
Sentinel errors re-exported from internal/blobtype.
var (
	// ErrSymlink is returned when a symlink is encountered where not allowed.
	ErrSymlink = errors.New("blob: symlink")
	// ErrTooManyFiles is returned when the file count exceeds the configured limit.
	ErrTooManyFiles = errors.New("blob: too many files")
)
Sentinel errors specific to the blob package.
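Sentinel errors like these are normally matched with errors.Is, which also works when the error has been wrapped with extra context. The sketch below uses a local stand-in, `errSymlink`, and a hypothetical `checkEntry` helper; whether the package wraps its sentinels this way is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errSymlink is a local stand-in for a package-level sentinel error.
var errSymlink = errors.New("blob: symlink")

// checkEntry is a hypothetical helper that rejects symlinks and wraps
// the sentinel with the offending path using %w.
func checkEntry(path string, isSymlink bool) error {
	if isSymlink {
		return fmt.Errorf("%s: %w", path, errSymlink)
	}
	return nil
}

// isSymlinkErr reports whether err is, or wraps, errSymlink.
func isSymlinkErr(err error) bool {
	return errors.Is(err, errSymlink)
}

func main() {
	err := checkEntry("bin/tool", true)
	fmt.Println(isSymlinkErr(err)) // prints "true"
	fmt.Println(err)               // prints "bin/tool: blob: symlink"
}
```

Comparing with `==` would miss the wrapped case, which is why errors.Is is the safer default for callers.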
var DefaultSkipCompression = write.DefaultSkipCompression
DefaultSkipCompression returns a SkipCompressionFunc that skips small files and known already-compressed extensions.
var EntryFromViewWithPath = blobtype.EntryFromViewWithPath
EntryFromViewWithPath creates an Entry from an EntryView with the given path.
Functions ¶
func Create ¶
func Create(ctx context.Context, dir string, indexW, dataW io.Writer, opts ...CreateOption) error
Create builds an archive from the contents of dir.
Files are written to the data writer in path-sorted order, enabling efficient directory fetches via single range requests. The index is written as a FlatBuffers-encoded blob to the index writer.
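To see why path-sorted order allows a directory to be fetched with one range request, consider the following sketch. It assumes files are laid out back to back in sorted order, so every file under a prefix occupies one contiguous byte span. The `entry` type and `dirSpan` function are hypothetical and only illustrate the layout property.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// entry is a hypothetical, simplified index record.
type entry struct {
	Path   string
	Offset int64
	Size   int64
}

// dirSpan returns the byte range [start, end) that covers every file
// under dir. entries must be sorted by Path and stored contiguously in
// that same order.
func dirSpan(entries []entry, dir string) (start, end int64, ok bool) {
	prefix := strings.TrimSuffix(dir, "/") + "/"
	// Binary search for the first path at or after the prefix.
	lo := sort.Search(len(entries), func(i int) bool {
		return entries[i].Path >= prefix
	})
	// Paths sharing the prefix are adjacent because the slice is sorted.
	hi := lo
	for hi < len(entries) && strings.HasPrefix(entries[hi].Path, prefix) {
		hi++
	}
	if lo == hi {
		return 0, 0, false
	}
	last := entries[hi-1]
	return entries[lo].Offset, last.Offset + last.Size, true
}

func main() {
	entries := []entry{
		{"a.txt", 0, 3},
		{"a/x", 3, 10},
		{"a/y", 13, 5},
		{"b/z", 18, 7},
	}
	start, end, ok := dirSpan(entries, "a")
	fmt.Println(start, end, ok) // prints "3 18 true"
}
```

The resulting span maps directly onto an HTTP `Range: bytes=start-(end-1)` header.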
Create builds the entire index in memory, so memory use scales with entry count and path length. As a rough guide, expect about 30-50 MB for 100k files with paths averaging about 60 bytes (entries plus the FlatBuffers buffer), which works out to roughly 300-500 bytes per entry.
Create walks dir recursively, including all regular files. Empty directories are not preserved. Symbolic links are not followed.
The context can be used for cancellation of long-running archive creation.
Types ¶
type Blob ¶
type Blob struct {
// contains filtered or unexported fields
}
Blob provides random access to archive files.
Blob implements fs.FS, fs.StatFS, fs.ReadFileFS, and fs.ReadDirFS for compatibility with the standard library.
func New ¶
func New(indexData []byte, source ByteSource, opts ...Option) (*Blob, error)
New creates a Blob for accessing files in the archive.
The indexData is the FlatBuffers-encoded index blob and source provides access to file content. Options can be used to configure size and decoder limits.
func (*Blob) CopyDir ¶
func (b *Blob) CopyDir(destDir, prefix string, opts ...CopyOption) error
CopyDir extracts all files under a directory prefix to a destination.
If prefix is "" or ".", all files in the archive are extracted.
Files are written atomically using temp files and renames by default. CopyWithCleanDest instead clears the destination prefix and writes directly to the final path; this is faster but less safe because it skips the temp-file step.
Parent directories are created as needed.
By default:
- Existing files are skipped (use CopyWithOverwrite to overwrite)
- File modes and times are not preserved (use CopyWithPreserveMode and CopyWithPreserveTimes)
- Range reads are pipelined (when beneficial) with concurrency 4 (use CopyWithReadConcurrency to change)
func (*Blob) CopyTo ¶
func (b *Blob) CopyTo(destDir string, paths ...string) error
CopyTo extracts specific files to a destination directory.
Parent directories are created as needed.
By default:
- Existing files are skipped (pass CopyWithOverwrite to CopyToWithOptions to overwrite)
- File modes and times are not preserved (pass CopyWithPreserveMode and CopyWithPreserveTimes to CopyToWithOptions)
- Range reads are pipelined (when beneficial) with concurrency 4 (pass CopyWithReadConcurrency to CopyToWithOptions to change)
func (*Blob) CopyToWithOptions ¶
func (b *Blob) CopyToWithOptions(destDir string, paths []string, opts ...CopyOption) error
CopyToWithOptions extracts specific files with options.
func (*Blob) Entries ¶
func (b *Blob) Entries() iter.Seq[EntryView]
Entries returns an iterator over all entries as read-only views.
The returned views are only valid while the Blob remains alive.
func (*Blob) EntriesWithPrefix ¶
func (b *Blob) EntriesWithPrefix(prefix string) iter.Seq[EntryView]
EntriesWithPrefix returns an iterator over entries with the given prefix as read-only views.
The returned views are only valid while the Blob remains alive.
func (*Blob) Entry ¶
func (b *Blob) Entry(path string) (EntryView, bool)
Entry returns a read-only view of the entry for the given path.
The returned view is only valid while the Blob remains alive.
func (*Blob) IndexData ¶
func (b *Blob) IndexData() []byte
IndexData returns the raw FlatBuffers-encoded index data. This is useful for creating new Blobs with different data sources.
func (*Blob) Open ¶
func (b *Blob) Open(name string) (fs.File, error)
Open implements fs.FS.
Open returns an fs.File for reading the named file. The returned file verifies the content hash on Close (unless disabled by WithVerifyOnClose) and returns ErrHashMismatch if verification fails. Callers must read to EOF or Close to ensure integrity; partial reads may return unverified data.
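The verify-on-Close behavior described above can be pictured as a reader that hashes every byte passing through it and, on Close, drains whatever is left before comparing digests. The sketch below is a hypothetical illustration: SHA-256, the `verifyingReader` type, and the local `errHashMismatch` are assumptions, since this page does not name the hash algorithm or show the internal types.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
	"hash"
	"io"
)

// errHashMismatch is a local stand-in for a hash mismatch sentinel.
var errHashMismatch = errors.New("hash mismatch")

// verifyingReader hashes everything read through it and checks the
// digest when closed. SHA-256 is an assumption made for this sketch.
type verifyingReader struct {
	r    io.Reader
	h    hash.Hash
	want []byte
}

func newVerifyingReader(r io.Reader, want []byte) *verifyingReader {
	h := sha256.New()
	// TeeReader feeds every byte read from r into the hash.
	return &verifyingReader{r: io.TeeReader(r, h), h: h, want: want}
}

func (v *verifyingReader) Read(p []byte) (int, error) { return v.r.Read(p) }

// Close drains any unread bytes so the digest covers the whole file,
// then compares it with the expected value.
func (v *verifyingReader) Close() error {
	if _, err := io.Copy(io.Discard, v.r); err != nil {
		return err
	}
	if !bytes.Equal(v.h.Sum(nil), v.want) {
		return errHashMismatch
	}
	return nil
}

// digest returns the SHA-256 digest of data.
func digest(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

// verify reads a single byte of data and then closes the reader,
// showing that Close still validates the unread remainder.
func verify(data, want []byte) error {
	v := newVerifyingReader(bytes.NewReader(data), want)
	buf := make([]byte, 1)
	if _, err := v.Read(buf); err != nil && err != io.EOF {
		return err
	}
	return v.Close()
}

func main() {
	good := []byte("hello")
	fmt.Println(verify(good, digest(good)))            // prints "<nil>"
	fmt.Println(verify([]byte("hellp"), digest(good))) // prints "hash mismatch"
}
```

If Close skipped the drain step, a caller that stopped reading early would have consumed bytes that were never checked, which is the trade-off WithVerifyOnClose exposes.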
func (*Blob) ReadDir ¶
func (b *Blob) ReadDir(name string) ([]fs.DirEntry, error)
ReadDir implements fs.ReadDirFS.
ReadDir returns directory entries for the named directory, sorted by name. Directory entries are synthesized from file paths—the archive does not store directories explicitly.
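Because directories are not stored, a listing has to be derived from the file paths themselves. A minimal sketch of that derivation follows; the `child` type and `readDir` function are hypothetical and scan all paths linearly for clarity, whereas a sorted index would allow narrowing to the prefix range first.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// child is a hypothetical synthesized directory entry.
type child struct {
	Name  string
	IsDir bool
}

// readDir lists the immediate children of dir using only file paths.
// Use "." for the archive root.
func readDir(paths []string, dir string) []child {
	prefix := ""
	if dir != "." && dir != "" {
		prefix = strings.TrimSuffix(dir, "/") + "/"
	}
	seen := map[string]bool{} // child name -> is a directory
	for _, p := range paths {
		if !strings.HasPrefix(p, prefix) || len(p) == len(prefix) {
			continue
		}
		rest := p[len(prefix):]
		// The first path segment is the child; a remaining "/" means
		// the child is itself a directory.
		name, _, nested := strings.Cut(rest, "/")
		seen[name] = seen[name] || nested
	}
	out := make([]child, 0, len(seen))
	for name, isDir := range seen {
		out = append(out, child{Name: name, IsDir: isDir})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	paths := []string{"a/b/c.txt", "a/d.txt", "e.txt"}
	fmt.Println(readDir(paths, ".")) // prints "[{a true} {e.txt false}]"
	fmt.Println(readDir(paths, "a")) // prints "[{b true} {d.txt false}]"
}
```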
func (*Blob) ReadFile ¶
func (b *Blob) ReadFile(name string) ([]byte, error)
ReadFile implements fs.ReadFileFS.
ReadFile reads and returns the entire contents of the named file. The content is decompressed if necessary and verified against its hash.
func (*Blob) Reader ¶
func (b *Blob) Reader() *file.Reader
Reader returns the underlying file reader. This is useful for cached readers that need to share the decompression pool.
func (*Blob) Save ¶
func (b *Blob) Save(indexPath, dataPath string) error
Save writes the blob archive to the specified paths.
Uses atomic writes (temp file + rename) to prevent partial writes on failure. Parent directories are created as needed.
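The temp file plus rename technique can be sketched with the standard library alone. The helper below, `writeFileAtomic`, is a hypothetical illustration of the general pattern rather than the package's own code; it relies on the temp file living in the same directory as the target so the rename stays on one filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to path through a temp file in the same
// directory followed by a rename, so readers never observe a partially
// written file. Parent directories are created as needed.
func writeFileAtomic(path string, data []byte, perm os.FileMode) (err error) {
	dir := filepath.Dir(path)
	if err = os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	tmp, err := os.CreateTemp(dir, ".tmp-*")
	if err != nil {
		return err
	}
	// Clean up the temp file if any later step fails.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		return err
	}
	if err = tmp.Chmod(perm); err != nil {
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes a file atomically into a scratch directory and
// reads it back.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "out", "index.blob")
	if err := writeFileAtomic(path, []byte("index bytes"), 0o644); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := roundTrip()
	fmt.Println(got, err)
}
```

On failure the destination is left untouched, and only the temp file needs to be removed.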
type BlobFile ¶
type BlobFile struct {
*Blob
// contains filtered or unexported fields
}
BlobFile wraps a Blob with its underlying data file handle. Close must be called to release file resources.
func CreateBlob ¶
func CreateBlob(ctx context.Context, srcDir, destDir string, opts ...CreateBlobOption) (*BlobFile, error)
CreateBlob creates a blob archive from srcDir and writes it to destDir.
By default, files are named "index.blob" and "data.blob". Use CreateBlobWithIndexName and CreateBlobWithDataName to override.
Returns a BlobFile that must be closed to release file handles.
type ByteSource ¶
ByteSource provides random access to the data blob.
Implementations exist for local files (*os.File) and HTTP range requests.
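As an illustration of how random access can be served over HTTP, the sketch below implements io.ReaderAt with a Range header. It assumes ByteSource resembles io.ReaderAt, which this page does not confirm, and `rangeSource`, `handlerTransport`, and the URL are hypothetical names. It is not the implementation found in the http subpackage. An in-memory transport stands in for a real server so the example runs without a network.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// rangeSource is a hypothetical io.ReaderAt backed by HTTP range requests.
type rangeSource struct {
	client *http.Client
	url    string
}

func (s rangeSource) ReadAt(p []byte, off int64) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	req, err := http.NewRequest(http.MethodGet, s.url, nil)
	if err != nil {
		return 0, err
	}
	// The end of an HTTP byte range is inclusive.
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, off+int64(len(p))-1))
	resp, err := s.client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return 0, fmt.Errorf("range request: unexpected status %s", resp.Status)
	}
	n, err := io.ReadFull(resp.Body, p)
	if err == io.ErrUnexpectedEOF {
		err = io.EOF // io.ReaderAt reports short reads at end of data as EOF
	}
	return n, err
}

// handlerTransport serves requests from an http.Handler in memory.
type handlerTransport struct{ h http.Handler }

func (t handlerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	t.h.ServeHTTP(rec, req)
	return rec.Result(), nil
}

// demo reads 5 bytes at offset 6 from a 12-byte blob.
func demo() (string, error) {
	data := []byte("hello world!")
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ServeContent understands Range headers and replies with 206.
		http.ServeContent(w, r, "data.blob", time.Time{}, bytes.NewReader(data))
	})
	src := rangeSource{
		client: &http.Client{Transport: handlerTransport{handler}},
		url:    "http://registry.invalid/data.blob", // placeholder URL
	}
	buf := make([]byte, 5)
	if _, err := src.ReadAt(buf, 6); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```

A production source would also need authentication, retries, and context handling, all of which are omitted here.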
type ChangeDetection ¶
type ChangeDetection uint8
ChangeDetection controls how strictly file changes are detected during creation.
const (
	ChangeDetectionNone ChangeDetection = iota
	ChangeDetectionStrict
)
type Compression ¶
type Compression = blobtype.Compression
Compression identifies the compression algorithm used for a file.
type CopyOption ¶
type CopyOption func(*copyConfig)
CopyOption configures CopyTo and CopyDir operations.
func CopyWithCleanDest ¶
func CopyWithCleanDest(enabled bool) CopyOption
CopyWithCleanDest clears the destination prefix before copying and writes directly to the final path (no temp files). This is only supported by CopyDir.
func CopyWithOverwrite ¶
func CopyWithOverwrite(overwrite bool) CopyOption
CopyWithOverwrite allows overwriting existing files. By default, existing files are skipped.
func CopyWithPreserveMode ¶
func CopyWithPreserveMode(preserve bool) CopyOption
CopyWithPreserveMode preserves file permission modes from the archive. By default, modes are not preserved (files use umask defaults).
func CopyWithPreserveTimes ¶
func CopyWithPreserveTimes(preserve bool) CopyOption
CopyWithPreserveTimes preserves file modification times from the archive. By default, times are not preserved (files use current time).
func CopyWithReadAheadBytes ¶
func CopyWithReadAheadBytes(limit uint64) CopyOption
CopyWithReadAheadBytes caps the total size of buffered group data. A value of 0 disables the byte budget.
func CopyWithReadConcurrency ¶
func CopyWithReadConcurrency(n int) CopyOption
CopyWithReadConcurrency sets the number of concurrent range reads. Use 1 to force serial reads. Zero uses the default concurrency (4).
func CopyWithWorkers ¶
func CopyWithWorkers(n int) CopyOption
CopyWithWorkers sets the number of workers for parallel processing. Values < 0 force serial processing. Zero uses automatic heuristics. Values > 0 force a specific worker count.
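CopyOption follows the functional options pattern: each option is a function that mutates an unexported config, and the callee applies them over its defaults. The sketch below shows the pattern in isolation. The field names are hypothetical because copyConfig is unexported, and treating negative concurrency like zero is an assumption; only the default of 4 comes from the text above.

```go
package main

import "fmt"

const defaultReadConcurrency = 4

// copyConfig mirrors the shape of an unexported options struct.
// The field names are hypothetical.
type copyConfig struct {
	overwrite       bool
	readConcurrency int
}

type copyOption func(*copyConfig)

func withOverwrite(v bool) copyOption {
	return func(c *copyConfig) { c.overwrite = v }
}

// withReadConcurrency sets the number of concurrent reads. Zero selects
// the default; this sketch also maps negative values to the default,
// which is an assumption.
func withReadConcurrency(n int) copyOption {
	return func(c *copyConfig) {
		if n < 1 {
			n = defaultReadConcurrency
		}
		c.readConcurrency = n
	}
}

// newCopyConfig starts from the defaults and applies options in order,
// so later options override earlier ones.
func newCopyConfig(opts ...copyOption) copyConfig {
	cfg := copyConfig{readConcurrency: defaultReadConcurrency}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Println(newCopyConfig())
	fmt.Println(newCopyConfig(withOverwrite(true), withReadConcurrency(1)))
}
```

The pattern lets new options be added later without changing the signatures of the functions that accept them.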
type CreateBlobOption ¶
type CreateBlobOption func(*createBlobConfig)
CreateBlobOption configures CreateBlob.
func CreateBlobWithChangeDetection ¶
func CreateBlobWithChangeDetection(cd ChangeDetection) CreateBlobOption
CreateBlobWithChangeDetection sets the change detection mode.
func CreateBlobWithCompression ¶
func CreateBlobWithCompression(compression Compression) CreateBlobOption
CreateBlobWithCompression sets the compression algorithm.
func CreateBlobWithDataName ¶
func CreateBlobWithDataName(name string) CreateBlobOption
CreateBlobWithDataName sets the data file name (default: "data.blob").
func CreateBlobWithIndexName ¶
func CreateBlobWithIndexName(name string) CreateBlobOption
CreateBlobWithIndexName sets the index file name (default: "index.blob").
func CreateBlobWithMaxFiles ¶
func CreateBlobWithMaxFiles(n int) CreateBlobOption
CreateBlobWithMaxFiles limits the number of files in the archive.
func CreateBlobWithSkipCompression ¶
func CreateBlobWithSkipCompression(fns ...SkipCompressionFunc) CreateBlobOption
CreateBlobWithSkipCompression adds skip compression predicates.
type CreateOption ¶
type CreateOption func(*createConfig)
CreateOption configures archive creation.
func CreateWithChangeDetection ¶
func CreateWithChangeDetection(cd ChangeDetection) CreateOption
CreateWithChangeDetection controls whether the writer verifies files did not change during archive creation. The zero value disables change detection to reduce syscalls; enable ChangeDetectionStrict for stronger guarantees.
func CreateWithCompression ¶
func CreateWithCompression(c Compression) CreateOption
CreateWithCompression sets the compression algorithm to use. Use CompressionNone to store files uncompressed, CompressionZstd for zstd.
func CreateWithMaxFiles ¶
func CreateWithMaxFiles(n int) CreateOption
CreateWithMaxFiles limits the number of files included in the archive. Zero uses DefaultMaxFiles. Negative means no limit.
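The three cases for the limit value (zero, negative, positive) can be made explicit with a small helper. `effectiveLimit` and `exceeds` are hypothetical names used only to spell out the rule stated above; the constant mirrors DefaultMaxFiles.

```go
package main

import "fmt"

const defaultMaxFiles = 200_000

// effectiveLimit resolves the option value n into a concrete limit.
// Zero selects the default, a negative value disables the limit, and a
// positive value is used as given.
func effectiveLimit(n int) (limit int, limited bool) {
	switch {
	case n < 0:
		return 0, false
	case n == 0:
		return defaultMaxFiles, true
	default:
		return n, true
	}
}

// exceeds reports whether count files would be rejected for option value n.
func exceeds(count, n int) bool {
	limit, limited := effectiveLimit(n)
	return limited && count > limit
}

func main() {
	fmt.Println(exceeds(250_000, 0))  // prints "true": default limit applies
	fmt.Println(exceeds(250_000, -1)) // prints "false": no limit
	fmt.Println(exceeds(10, 10))      // prints "false": at the limit, not over it
}
```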
func CreateWithSkipCompression ¶
func CreateWithSkipCompression(fns ...SkipCompressionFunc) CreateOption
CreateWithSkipCompression adds predicates that decide to store a file uncompressed. If any predicate returns true, compression is skipped for that file. These checks are on the hot path, so keep them cheap.
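The "any predicate returns true" rule amounts to a short-circuiting loop over cheap checks. The sketch below assumes a predicate signature of `func(name string, size int64) bool`; the real SkipCompressionFunc signature is not shown on this page, so `skipFunc`, the helper names, and the 256-byte threshold are all hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// skipFunc is a hypothetical predicate shape; the real
// SkipCompressionFunc signature may differ.
type skipFunc func(name string, size int64) bool

// skipSmallerThan skips files below threshold bytes.
func skipSmallerThan(threshold int64) skipFunc {
	return func(_ string, size int64) bool { return size < threshold }
}

// skipExtensions skips files whose extension is in exts (case-insensitive).
func skipExtensions(exts ...string) skipFunc {
	set := make(map[string]bool, len(exts))
	for _, e := range exts {
		set[strings.ToLower(e)] = true
	}
	return func(name string, _ int64) bool {
		return set[strings.ToLower(path.Ext(name))]
	}
}

// shouldSkip returns true as soon as any predicate matches.
func shouldSkip(name string, size int64, fns ...skipFunc) bool {
	for _, fn := range fns {
		if fn(name, size) {
			return true
		}
	}
	return false
}

func main() {
	fns := []skipFunc{skipSmallerThan(256), skipExtensions(".zst", ".png")}
	fmt.Println(shouldSkip("img/logo.PNG", 4096, fns...)) // prints "true"
	fmt.Println(shouldSkip("main.go", 100, fns...))       // prints "true"
	fmt.Println(shouldSkip("main.go", 4096, fns...))      // prints "false"
}
```

Both predicates avoid I/O and allocation per call, in keeping with the advice to keep hot-path checks cheap.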
type Option ¶
type Option func(*Blob)
Option configures a Blob.
func WithDecoderConcurrency ¶
WithDecoderConcurrency sets the zstd decoder concurrency (default: 1). A value of 0 uses GOMAXPROCS; values < 0 are treated as 0.
func WithDecoderLowmem ¶
WithDecoderLowmem sets whether the zstd decoder should use low-memory mode (default: false).
func WithMaxDecoderMemory ¶
WithMaxDecoderMemory limits the maximum memory used by the zstd decoder. Set limit to 0 to disable the limit.
func WithMaxFileSize ¶
WithMaxFileSize limits the maximum per-file size (compressed and uncompressed). Set limit to 0 to disable the limit.
func WithVerifyOnClose ¶
WithVerifyOnClose controls whether Close drains the file to verify the hash.
When false, Close returns without reading the remaining data. Integrity is only guaranteed when callers read to EOF.
type SkipCompressionFunc ¶
type SkipCompressionFunc = write.SkipCompressionFunc
SkipCompressionFunc returns true when a file should be stored uncompressed. It is called once per file and should be inexpensive.
Directories ¶
| Path | Synopsis |
|---|---|
| cache | Package cache provides content-addressed caching for blob archives. |
| cache/disk | Package disk provides a disk-backed cache implementation. |
| cmd | |
| cmd/profiler | command |
| http | Package http provides a ByteSource backed by HTTP range requests. |
| internal | |
| internal/batch | Package batch provides batch processing for reading multiple entries from a blob archive. |
| internal/blobtype | Package blobtype defines shared types used across the blob package and its internal packages. |
| internal/file | Package file provides internal file reading operations for the blob package. |
| internal/sizing | Package sizing provides safe size arithmetic and conversions to prevent overflow. |
| internal/write | Package write provides internal file writing operations for the blob package. |