c4

Version: v0.0.0-...-7ecc481
Published: Aug 17, 2025 License: MIT Imports: 9 Imported by: 0

README

C4 ID - Universally Unique and Consistent Identification

import "github.com/bgyss/c4"

This is a Go package that implements the C4 ID system, SMPTE standard ST 2114:2017. C4 IDs are universally unique and consistent identifiers that standardize the derivation and formatting of data identification so that all users independently agree on the identification of any block or set of blocks of data.

C4 IDs are 90-character strings suitable for use in filenames, URLs, database fields, or anywhere else that a string identifier might normally be used. In RAM, C4 IDs are represented in a 64-byte "digest" format.

Features
  • A single C4 ID can represent multiple files.
  • C4 IDs are unique, random, and unforgeable.
  • C4 IDs are identical for the same file in different locations or points in time.
  • A network connection is not required to generate C4 IDs.
  • A C4 ID can be used in filenames, URLs, JSON, and XML.
  • C4 IDs can be selected easily with a double-click (a problem for many unique identifiers).
  • C4 IDs are easy to discover in arbitrary text with a simple regex: c4[1-9A-HJ-NP-Za-km-z]{88}
  • Naming files by their C4 ID automatically deduplicates them.
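
The regex from the feature list can be dropped into Go's standard regexp package as-is. The following is a minimal sketch that uses only the standard library; findC4IDs is a hypothetical helper name, not part of this package, and it checks only the shape of a candidate string, not whether any data actually hashes to it.

```go
package main

import (
	"fmt"
	"regexp"
)

// c4Pattern is the regular expression quoted in the feature list.
var c4Pattern = regexp.MustCompile(`c4[1-9A-HJ-NP-Za-km-z]{88}`)

// findC4IDs (hypothetical helper) returns every substring of text that has
// the shape of a C4 ID: "c4" followed by 88 characters from the ID alphabet.
func findC4IDs(text string) []string {
	return c4Pattern.FindAllString(text, -1)
}

func main() {
	// The ID embedded here is the sample ID quoted in this README.
	line := "stored asset c43inc2qGhSWQUMRvDMW6GAjJnRFY5sxq399wcUcWLTuPai84A2QWTfYu1gAW8f5FmZFGeYpLsSPyrSUh9Ao3J68Cc in bucket 7"
	for _, id := range findC4IDs(line) {
		fmt.Println(id)
	}
}
```

Strings that merely start with "c4" but are too short, or that contain characters outside the alphabet (0, O, I, l), are not matched.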
Comparison of Encodings

C4 is the shortest self-identifying SHA-512 encoding and is the only standardized encoding. To illustrate, the following is the SHA-512 of "foo" in hex, base64 and c4 encodings:

# encoding     length   id
  hex          135:     sha512-f7fbba6e0636f890e56fbbf3283e524c6fa3204ae298382d624741d0dc6638326e282c41be5e4254d8820772c5518a2c5a8c0c7f7eda19594a7eb539453e1ed7
  base64        95:     sha512-9/u6bgY2+JDlb7vzKD5STG+jIErimDgtYkdB0NxmODJuKCxBvl5CVNiCB3LFUYosWowMf37aGVlKfrU5RT4e1w==
  c4            90:     c43inc2qGhSWQUMRvDMW6GAjJnRFY5sxq399wcUcWLTuPai84A2QWTfYu1gAW8f5FmZFGeYpLsSPyrSUh9Ao3J68Cc
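
The hex and base64 lengths in the table follow directly from the digest size and can be checked with the standard library alone. This sketch is illustrative; prefixedEncodings is a hypothetical helper, and the "sha512-" prefix is taken from the table above.

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// prefixedEncodings (hypothetical helper) returns the "sha512-"-prefixed hex
// and base64 encodings of the SHA-512 digest of data.
func prefixedEncodings(data []byte) (hexForm, b64Form string) {
	sum := sha512.Sum512(data) // 64-byte digest
	hexForm = "sha512-" + hex.EncodeToString(sum[:])
	b64Form = "sha512-" + base64.StdEncoding.EncodeToString(sum[:])
	return hexForm, b64Form
}

func main() {
	h, b := prefixedEncodings([]byte("foo"))
	fmt.Println(len(h), h) // 7-character prefix + 128 hex digits = 135
	fmt.Println(len(b), b) // 7-character prefix + 88 base64 characters = 95
}
```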
Example Usage
package main

import (
  "fmt"
  "strings"

  "github.com/bgyss/c4"
)

func main() {

  // Generate a C4 ID for any contiguous block of data...
  id := c4.Identify(strings.NewReader("alfa"))
  fmt.Println(id)
  // output: c43zYcLni5LF9rR4Lg4B8h3Jp8SBwjcnyyeh4bc6gTPHndKuKdjUWx1kJPYhZxYt3zV6tQXpDs2shPsPYjgG81wZM1

  // Generate a C4 ID for any number of non-contiguous blocks...
  var ids c4.IDs
  var inputs = []string{"alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel", "india"}
  for _, input := range inputs {
    ids = append(ids, c4.Identify(strings.NewReader(input)))
  }
  fmt.Println(ids.ID())
  // output: c435RzTWWsjWD1Fi7dxS3idJ7vFgPVR96oE95RfDDT5ue7hRSPENePDjPDJdnV46g7emDzWK8LzJUjGESMG5qzuXqq
}

Installation & Building

This project includes a Nix flake for reproducible builds and development environments across multiple platforms.

Quick Start
# Build the CLI tool
nix build

# Run directly without installing
echo "Hello World" | nix run

# Enter development environment
nix develop
Supported Platforms

The flake supports building for all major platforms:

  • aarch64-darwin (Apple Silicon macOS)
  • aarch64-linux (ARM64 Linux)
  • i686-linux (32-bit x86 Linux)
  • x86_64-darwin (Intel macOS)
  • x86_64-linux (64-bit x86 Linux)
Platform-Specific Builds
# Build for specific platform
nix build .#packages.x86_64-linux.c4
nix build .#packages.aarch64-linux.c4

# View all available platforms
nix flake show --all-systems
Development Environment

The development shell includes all necessary tools:

# Enter the development environment
nix develop

# Available tools in the shell:
go build ./cmd/c4          # Build the CLI tool
go test ./...              # Run all tests
go test -cover ./...       # Run tests with coverage
golangci-lint run          # Run linter
CI/CD Integration
# Run all checks (build, test, lint)
nix flake check

# Run checks for all platforms
nix flake check --all-systems
direnv Integration (Optional)

For automatic environment loading when entering the directory:

# Allow direnv (if you have direnv installed)
direnv allow

# The development environment will now load automatically
Traditional Go Build
# Build manually with Go
go build -o c4 ./cmd/c4

# Run tests
go test ./...

Testing & Quality

Running Tests
# Run all tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run tests with verbose output
go test -v ./...

# Run benchmarks
go test -bench=. -run=^$ ./...
Coverage Reports

The project maintains high test coverage across all packages:

  • Overall Coverage: ~78%
  • Core Package: 92.4%
  • Store Package: 80.9%
  • Manifest Package: 72.9%
  • Util Package: 100%
Performance Benchmarks

Performance benchmarks are run continuously to track regression:

# Run core performance benchmarks
go test -bench=BenchmarkIdentify -benchmem ./...

# Run memory allocation benchmarks
go test -bench=BenchmarkMemoryAllocation -benchmem ./...

# Run all benchmarks with memory statistics
go test -bench=. -benchmem ./...
Code Quality

The project uses several tools to maintain code quality:

  • golangci-lint: Comprehensive linting with 20+ enabled linters
  • gosec: Security vulnerability scanning
  • govulncheck: Dependency vulnerability checking
  • gofmt: Code formatting consistency
  • go vet: Static analysis for potential issues
Continuous Integration

All code is validated through GitHub Actions CI/CD:

  • ✅ Multi-platform testing (Linux, macOS, Windows)
  • ✅ Multi-version Go support (1.20, 1.21)
  • ✅ Automated security scanning
  • ✅ Performance regression testing
  • ✅ Coverage reporting
  • ✅ Dependency vulnerability checks
  • ✅ Nix build validation

Releases

Current release: v0.8.1

Videos:

C4 ID Whitepaper

Contributing

Contributions are welcome. The following are some general guidelines for project organization. If you have questions please open an issue.

The master branch holds the current release, and older releases can be found by their version number. The dev branch represents the development branch from which bug and feature branches should be taken. Pull requests that are accepted will be merged against the dev branch and then pushed to versioned releases as appropriate.

Feature and bug branches should follow the GitHub integrated naming convention. Features should be given the new tag, and bugs the bug tag. Here is an example of checking out a feature branch:

> git checkout dev
Switched to branch 'dev'
Your branch is up-to-date with 'origin/dev'.
> git checkout -b new/#99_some_github_issue
...

If a branch for an issue is already listed in this repository, then check it out and work from it.

License

This software is released under the MIT license. See LICENSE for more information.

Documentation

Overview

This package implements the C4 ID system, SMPTE standard ST 2114:2017. C4 IDs are universally unique and consistent identifiers that standardize the derivation and formatting of data identification so that all users independently agree on the identification of any given block or set of blocks of data.

C4 IDs are 90-character strings suitable for use in filenames, URLs, database fields, or anywhere else that a string identifier might normally be used.

In RAM, C4 IDs are represented in a 64-byte "digest" format.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ID

type ID [64]byte

// Encoder generates an ID for a contiguous block of data.
type Encoder struct {
	err error
	h   hash.Hash
}

// NewEncoder makes a new Encoder.
func NewEncoder() *Encoder {
	return &Encoder{
		h: sha512.New(),
	}
}

// Write writes bytes to the hash that makes up the ID.
func (e *Encoder) Write(b []byte) (int, error) {
	return e.h.Write(b)
}

// ID returns the ID for the bytes written so far.
func (e *Encoder) ID() (id ID) {
	copy(id[:], e.h.Sum(nil))
	return id
}

// Reset resets the encoder so it can identify a new block of data.
func (e *Encoder) Reset() {
	e.h.Reset()
}

ID represents a C4 ID.
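
Because Encoder has a Write method, it can be used anywhere an io.Writer is accepted, for example as the destination of io.Copy. The sketch below redeclares ID and Encoder locally, mirroring the listing above, so that it runs without importing the package; identifyAll is a hypothetical helper introduced only for illustration.

```go
package main

import (
	"crypto/sha512"
	"fmt"
	"hash"
	"io"
	"strings"
)

// ID and Encoder are local stand-ins that mirror the listing above.
type ID [64]byte

type Encoder struct {
	h hash.Hash
}

func NewEncoder() *Encoder                     { return &Encoder{h: sha512.New()} }
func (e *Encoder) Write(b []byte) (int, error) { return e.h.Write(b) }
func (e *Encoder) Reset()                      { e.h.Reset() }

func (e *Encoder) ID() (id ID) {
	copy(id[:], e.h.Sum(nil))
	return id
}

// identifyAll (hypothetical helper) streams each input through a single
// Encoder, calling Reset before each one so the IDs are independent.
func identifyAll(inputs ...string) ([]ID, error) {
	enc := NewEncoder()
	ids := make([]ID, 0, len(inputs))
	for _, in := range inputs {
		enc.Reset()
		if _, err := io.Copy(enc, strings.NewReader(in)); err != nil {
			return nil, err
		}
		ids = append(ids, enc.ID())
	}
	return ids, nil
}

func main() {
	ids, err := identifyAll("alfa", "bravo", "alfa")
	if err != nil {
		fmt.Println("identify failed:", err)
		return
	}
	fmt.Println(ids[0] == ids[2]) // same data yields the same ID
	fmt.Println(ids[0] == ids[1]) // different data yields a different ID
}
```

Reusing one Encoder with Reset avoids allocating a new hash for every input.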

func Identify

func Identify(src io.Reader) (id ID)

Identify generates an ID from an io.Reader.

func Parse

func Parse(source string) (ID, error)

Parse parses a C4 ID string into an ID.
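
As a rough illustration of what parsing involves, the sketch below decodes a 90-character string into a 64-byte digest using only the standard library. It is not the package's implementation. It assumes the 88 characters after "c4" are a big-endian base58 number over the alphabet implied by the regex in the README; parseSketch and alphabet are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
	"strings"
)

// alphabet is the assumed base58 character set (no 0, O, I, or l).
const alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// parseSketch (hypothetical) decodes a C4-shaped string into a 64-byte digest.
func parseSketch(s string) ([64]byte, error) {
	var digest [64]byte
	if len(s) != 90 || !strings.HasPrefix(s, "c4") {
		return digest, errors.New(`c4 sketch: want 90 characters starting with "c4"`)
	}
	n := new(big.Int)
	base := big.NewInt(58)
	for i := 2; i < len(s); i++ {
		v := strings.IndexByte(alphabet, s[i])
		if v < 0 {
			return digest, fmt.Errorf("c4 sketch: invalid character %q at position %d", s[i], i)
		}
		n.Mul(n, base)
		n.Add(n, big.NewInt(int64(v)))
	}
	// 88 base58 digits can express slightly more than 512 bits.
	if n.BitLen() > 512 {
		return digest, errors.New("c4 sketch: value does not fit in 64 bytes")
	}
	n.FillBytes(digest[:]) // big-endian, left-padded with zero bytes
	return digest, nil
}

func main() {
	// "1" is the zero digit, so this string encodes the number 1.
	s := "c4" + strings.Repeat("1", 87) + "2"
	digest, err := parseSketch(s)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(digest[63]) // prints 1
}
```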

func (ID) Cmp

func (l ID) Cmp(r ID) int

Cmp compares two IDs. There are 3 possible return values.

-1 : Argument id is less than calling id.

0 : Argument id and calling id are identical.

+1 : Argument id is greater than calling id.

Comparison is done on the numerical value of the IDs, not on their string representation.

func (ID) Digest

func (id ID) Digest() []byte

Digest returns the C4 Digest of the ID.

func (ID) IsNil

func (id ID) IsNil() bool

func (ID) Less

func (id ID) Less(idArg ID) bool

Less returns true if B is less than A in A.Less(B).

func (ID) MarshalJSON

func (id ID) MarshalJSON() ([]byte, error)

func (ID) String

func (id ID) String() string

String returns the standard string representation of a C4 id.
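
The sketch below shows one way a 64-byte digest can be turned into a 90-character string of this shape. It is not the package's code. It assumes the digest is read as a big-endian integer, written as 88 base58 digits over the alphabet implied by the regex in the README, left-padded with "1" (the zero digit), and prefixed with "c4"; encodeSketch and alphabet are hypothetical names.

```go
package main

import (
	"crypto/sha512"
	"fmt"
	"math/big"
)

// alphabet is the assumed base58 character set (no 0, O, I, or l).
const alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// encodeSketch (hypothetical) renders a 64-byte digest as "c4" plus 88
// base58 digits. 88 digits always suffice because 2^512 < 58^88.
func encodeSketch(digest [64]byte) string {
	n := new(big.Int).SetBytes(digest[:]) // big-endian interpretation
	base := big.NewInt(58)
	mod := new(big.Int)
	out := make([]byte, 90)
	out[0], out[1] = 'c', '4'
	// Fill digits from the right; once n reaches zero the remaining
	// positions receive '1', which acts as left padding.
	for i := 89; i >= 2; i-- {
		n.DivMod(n, base, mod)
		out[i] = alphabet[mod.Int64()]
	}
	return string(out)
}

func main() {
	var zero [64]byte
	fmt.Println(encodeSketch(zero)) // "c4" followed by 88 '1' characters

	id := encodeSketch(sha512.Sum512([]byte("alfa")))
	fmt.Println(len(id), id)
}
```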

func (ID) Sum

func (l ID) Sum(r ID) ID

func (*ID) UnmarshalJSON

func (id *ID) UnmarshalJSON(data []byte) error

type IDs

type IDs []ID

func (IDs) ID

func (d IDs) ID() ID

func (IDs) Len

func (d IDs) Len() int

func (IDs) Less

func (d IDs) Less(i, j int) bool

func (IDs) Swap

func (d IDs) Swap(i, j int)
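
Len, Less, and Swap together are the method set of the standard library's sort.Interface, so a slice type with these methods can be passed to sort.Sort. The sketch below uses local stand-in types rather than the package itself, and its byte-wise Less is an assumption about ordering made for illustration; sortedIDs is a hypothetical helper.

```go
package main

import (
	"bytes"
	"crypto/sha512"
	"fmt"
	"sort"
)

// ID and IDs are local stand-ins with the same shape as the package types.
type ID [64]byte
type IDs []ID

func (d IDs) Len() int      { return len(d) }
func (d IDs) Swap(i, j int) { d[i], d[j] = d[j], d[i] }

// Less orders IDs by their raw digest bytes (an assumption of this sketch).
func (d IDs) Less(i, j int) bool { return bytes.Compare(d[i][:], d[j][:]) < 0 }

// sortedIDs (hypothetical helper) hashes each input and returns the IDs
// in ascending order.
func sortedIDs(inputs ...string) IDs {
	ids := make(IDs, 0, len(inputs))
	for _, in := range inputs {
		ids = append(ids, ID(sha512.Sum512([]byte(in))))
	}
	sort.Sort(ids)
	return ids
}

func main() {
	ids := sortedIDs("charlie", "alfa", "bravo")
	fmt.Println(sort.IsSorted(ids)) // prints true
}
```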

func (IDs) Tree

func (d IDs) Tree() Tree

Tree provides a computed C4 Tree for the slice of digests.

type Identifiable

type Identifiable interface {
	ID() ID
}

Identifiable is an interface that requires an ID() method that returns the C4 ID of the object.

type Tree

type Tree []byte

Tree implements an ID tree as used for calculating IDs of non-contiguous sets of data. A C4 ID Tree is a type of Merkle tree except that the list of IDs is sorted. According to the standard, this is done to ensure that two identical lists of IDs always resolve to the same ID.
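
The effect of sorting can be seen in a small stand-alone sketch of a sorted Merkle tree. This is not the package's Tree and does not claim to reproduce the standard's exact byte layout. The pairing rule (hash the two IDs with the smaller one first, promote an unpaired ID unchanged) is an assumption, and identify, pairID, and rootID are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/sha512"
	"fmt"
	"sort"
)

type ID [64]byte

// identify hashes a string into an ID-sized digest.
func identify(data string) ID { return ID(sha512.Sum512([]byte(data))) }

// pairID hashes two IDs together, smaller first (assumed rule).
func pairID(l, r ID) ID {
	if bytes.Compare(l[:], r[:]) > 0 {
		l, r = r, l
	}
	h := sha512.New()
	h.Write(l[:])
	h.Write(r[:])
	var id ID
	copy(id[:], h.Sum(nil))
	return id
}

// rootID sorts a copy of ids, then reduces it level by level to one ID.
func rootID(ids []ID) ID {
	if len(ids) == 0 {
		return ID{}
	}
	level := append([]ID(nil), ids...)
	sort.Slice(level, func(i, j int) bool {
		return bytes.Compare(level[i][:], level[j][:]) < 0
	})
	for len(level) > 1 {
		next := make([]ID, 0, (len(level)+1)/2)
		for i := 0; i+1 < len(level); i += 2 {
			next = append(next, pairID(level[i], level[i+1]))
		}
		if len(level)%2 == 1 {
			next = append(next, level[len(level)-1]) // promote the odd one out
		}
		level = next
	}
	return level[0]
}

func main() {
	a := []ID{identify("alfa"), identify("bravo"), identify("charlie")}
	b := []ID{identify("charlie"), identify("alfa"), identify("bravo")}
	fmt.Println(rootID(a) == rootID(b)) // prints true: order does not matter
}
```

Without the sort step, the two orderings would generally produce different roots.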

func NewTree

func NewTree(s []ID) Tree

NewTree creates a new Tree from a slice of IDs and copies the digests into the tree. However, it does not compute the tree.

func ReadTree

func ReadTree(r io.Reader) (Tree, error)

func (Tree) Bytes

func (t Tree) Bytes() []byte

Bytes returns the tree as a slice of bytes.

func (Tree) ID

func (t Tree) ID() (id ID)

The ID of the list (i.e. the level 0 of the tree).

func (Tree) Len

func (t Tree) Len() int

Number of IDs in the list (i.e. the length of the bottom row of the tree).

func (Tree) String

func (t Tree) String() string

Directories

Path            Synopsis
cmd/c4          (command) C4 ID
store           github.com/bgyss/c4/store is a package for representing generic C4 storage.
