Method · Folder Merkle
Folder anchoring by Merkle root.
The single-file receipt is the elementary instrument the office issues. The present document describes the protocol's extension to working sets — directories of files anchored as a single object, with the structural option to later disclose any one file and prove its membership without revealing the rest. The page is published as defensive prior art and is itself anchored on issuance.
1 · The single-file case and its limit
An Orphograph receipt for a single file proves that the exact byte sequence whose SHA-256 fingerprint matches the recorded digest existed by the time of the recorded Bitcoin block. That proposition is sound, and for many uses it is sufficient. It is not, however, sufficient for the common working-set case, in which a customer holds a directory (a job's photographs, a manuscript's chapters, a matter's exhibits, a contractor's daily progress images) and wishes for a single attestation to cover the entire set. The customer further wishes to retain the option of later disclosing only one item from the set, with mathematical proof that the disclosed item belonged to the anchored set and without enabling any other inference about the remaining contents.
Issuing one receipt per file does not yield this guarantee. A pile of receipts does not, in itself, bind the files to one another; the customer cannot show that the disclosed item was part of the same submission as the others without revealing the others. The protocol therefore requires a structure that commits to the set as a whole and yet supports per-element disclosure. The Merkle tree, as standardized in RFC 6962, is that structure.
2 · The Merkle tree as the natural extension
A binary hash tree is the construction in which each leaf is the hash of a record, each internal node is the hash of the concatenation of its two children, and the root is a single fixed-size value whose definition transitively binds every leaf. The construction is the structure underlying Git's commit identifiers, the Certificate Transparency append-only log defined in RFC 6962 and updated in RFC 9162, the per-block transaction commitment in the Bitcoin block header itself, the content-addressed object graph of IPFS, and the transparency log behind Sigstore. The construction is neither novel nor proprietary; it is the standard primitive for committing to a set with the option of per-element proof.
The OpenTimestamps protocol already operates a Merkle tree internally. The calendars aggregate the digests submitted to them during an aggregation interval, construct a tree over those submissions, and commit only the root to the Bitcoin chain in a single transaction. Each submitter then receives a per-digest proof path. The folder-anchoring extension applies the same primitive one level higher: the customer builds a tree over the files in a folder, submits only the resulting root to the calendars, and receives an OpenTimestamps receipt that anchors that root to the chain. Diagrammatically: leaf hashes → folder root → OpenTimestamps aggregation → Bitcoin block.
3 · Domain separation
Leaves and internal nodes are not hashed under the same input shape. A leaf is prefixed with a single byte of value 0x00 before being passed to SHA-256; an internal node is prefixed with a single byte of value 0x01. The convention is the one adopted by RFC 6962 and is termed domain separation.
The argument for domain separation is one of second-preimage resistance. Without it, the byte sequence that constitutes an internal-node value would be indistinguishable, in its hash input, from the byte sequence of a file whose contents happened to match. An adversary could in principle construct a file whose bytes are the concatenation of two child hashes and present that file as a leaf whose hash collides with a known internal node, thereby corrupting an inclusion proof. The single-byte prefix removes the ambiguity at negligible cost.
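The ambiguity can be illustrated with a minimal Python sketch. It uses the generic RFC 6962 shapes (a 0x00 prefix for leaves, a 0x01 prefix for internal nodes) over arbitrary records; the function names and sample records are illustrative assumptions, not part of the published protocol code.

```python
import hashlib

LEAF_PREFIX = b"\x00"  # RFC 6962 marker for a leaf input
NODE_PREFIX = b"\x01"  # RFC 6962 marker for an internal-node input


def leaf_hash(record: bytes) -> bytes:
    """Hash a record as a leaf (generic RFC 6962 shape)."""
    return hashlib.sha256(LEAF_PREFIX + record).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash two child hashes as an internal node."""
    return hashlib.sha256(NODE_PREFIX + left + right).digest()


# Two sibling hashes and the internal node above them.
left = leaf_hash(b"record A")
right = leaf_hash(b"record B")
parent = node_hash(left, right)

# An adversarial "file" whose bytes are exactly the two child hashes.
forged_record = left + right

# Without prefixes, the forged leaf and the internal node share one hash input.
collides_without_prefix = (
    hashlib.sha256(forged_record).digest() == hashlib.sha256(left + right).digest()
)

# With prefixes, the two inputs differ in their first byte.
collides_with_prefix = leaf_hash(forged_record) == parent
```

In the unprefixed case the two inputs are byte-for-byte identical, so the collision is structural and requires no cryptanalytic effort; the prefix makes the inputs differ before SHA-256 is ever applied.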
4 · Path binding
The leaf hash binds the file's relative path together with the file's content fingerprint. The construction is:
leaf = SHA-256( 0x00 || utf8(rel_path) || 0x00 || sha256(file_bytes) )
The interior 0x00 separator between the path and the file digest is a second domain marker and removes ambiguity between paths of differing length. The consequence is deliberate: two receipts produced for the same byte sequence under different filenames yield different leaves and therefore different roots. The protocol treats the relative path as evidentiary metadata, not as an arbitrary label. A photograph filed as 2026-05-20/IMG_0042.cr3 and the same bytes filed as misc.cr3 are, under the folder receipt, distinguishable artifacts.
Customers who consider the path itself to be sensitive may anchor under a hash-only labeling scheme, in which the recorded path is the file's own digest. The protocol does not require any particular naming policy; it records, faithfully, the policy the customer chose.
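A minimal Python sketch of the leaf construction follows. It assumes that sha256(file_bytes) in the formula denotes the raw 32-byte digest rather than its hexadecimal text, and that a hash-only label is the lowercase hexadecimal digest; both are assumptions of this sketch, and the function names are hypothetical.

```python
import hashlib


def file_digest(file_bytes: bytes) -> bytes:
    """SHA-256 fingerprint of the file content (raw 32 bytes, by assumption)."""
    return hashlib.sha256(file_bytes).digest()


def folder_leaf(rel_path: str, file_bytes: bytes) -> bytes:
    """leaf = SHA-256( 0x00 || utf8(rel_path) || 0x00 || sha256(file_bytes) )"""
    return hashlib.sha256(
        b"\x00" + rel_path.encode("utf-8") + b"\x00" + file_digest(file_bytes)
    ).digest()


def hash_labeled_path(file_bytes: bytes) -> str:
    """Hash-only label: the file's own digest stands in for its path.

    Hex encoding is an assumption of this sketch.
    """
    return file_digest(file_bytes).hex()


photo = b"example raw image bytes"

# The same bytes under two paths yield two different leaves.
leaf_dated = folder_leaf("2026-05-20/IMG_0042.cr3", photo)
leaf_misc = folder_leaf("misc.cr3", photo)

# Under hash-only labeling the leaf carries no human-readable name.
leaf_private = folder_leaf(hash_labeled_path(photo), photo)
```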
5 · Canonical ordering
Two parties producing trees over the same set of files must produce the same root, or the receipt is not reproducible by an independent verifier. The protocol fixes ordering as the lexicographic UTF-8 byte order of the POSIX relative path. The path is normalized to forward slashes; case is preserved; Unicode is left unnormalized beyond the requirement that the bytes used in the leaf hash are the bytes recorded in the manifest.
The default exclude list is recorded in the manifest so a verifier reproduces the exact set the customer anchored. The default excludes are: .DS_Store, Thumbs.db, desktop.ini, .git/*, node_modules/*, __pycache__/*, *.tmp, *.swp, *.swo, and ~$*. A customer who anchors a folder under a non-default exclude list records that list in the manifest; the verifier reproduces ordering and exclusion from the manifest alone.
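The ordering and exclusion rules can be sketched in Python as follows. The pattern list is the default list given above; the matching semantics are not specified in this document, so the sketch assumes that a pattern ending in /* excludes anything beneath a directory of that name at any depth, and that every other pattern is matched case-sensitively against the file name alone. The backslash conversion assumes Windows-style input paths. All function names are hypothetical.

```python
from fnmatch import fnmatchcase

DEFAULT_EXCLUDES = [
    ".DS_Store", "Thumbs.db", "desktop.ini", ".git/*", "node_modules/*",
    "__pycache__/*", "*.tmp", "*.swp", "*.swo", "~$*",
]


def to_posix(rel_path: str) -> str:
    """Normalize separators to forward slashes; case is preserved."""
    return rel_path.replace("\\", "/")


def is_excluded(posix_path: str, patterns: list[str]) -> bool:
    """Apply the exclude list (matching semantics are an assumption)."""
    *parents, name = posix_path.split("/")
    for pattern in patterns:
        if pattern.endswith("/*"):
            # Directory pattern: excluded if any ancestor has that name.
            if pattern[:-2] in parents:
                return True
        elif fnmatchcase(name, pattern):
            return True
    return False


def canonical_file_list(rel_paths, patterns=DEFAULT_EXCLUDES):
    """Drop excluded paths, then sort by the UTF-8 bytes of each path."""
    kept = [p for p in map(to_posix, rel_paths) if not is_excluded(p, patterns)]
    return sorted(kept, key=lambda p: p.encode("utf-8"))


listing = [
    "src\\main.py", ".git/config", "a.txt", "B.txt", "notes.swp",
    "a/1.txt", "~$draft.docx", "lib/node_modules/x.js", ".DS_Store",
]
ordered = canonical_file_list(listing)
# ordered == ["B.txt", "a.txt", "a/1.txt", "src/main.py"]
```

Byte order places uppercase before lowercase and "." (0x2E) before "/" (0x2F), which is why B.txt sorts first and a.txt precedes a/1.txt. An implementation in a language whose native string comparison is not byte-wise over UTF-8 would need to sort on the encoded bytes explicitly to reproduce the same order.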
6 · Odd-level promotion
At each level of the tree, pairs of adjacent nodes are combined to produce the next level. When the level contains an odd number of nodes, the final node has no sibling. The RFC 6962 construction, which splits the leaves at the largest power of two below the leaf count, is equivalent to promoting the lone last node to the next level unchanged; it is not paired with itself, and it is not duplicated.
The argument for promotion over duplication is, again, second-preimage resistance. Under a duplicate-the-last-node rule, two distinct leaf sequences can yield the same root: a sequence with an odd number of leaves, and the same sequence with its final leaf repeated. Bitcoin's transaction Merkle rule uses duplication and carries the corresponding ambiguity (tracked as CVE-2012-2459), which implementations must guard against separately. The RFC 6962 construction avoids the ambiguity by design, and the folder-anchoring extension adopts the RFC 6962 rule. The customer's manifest records the rule it used; the verifier applies the same rule.
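The difference between the two rules can be sketched in Python. The sketch builds the root level by level under promotion, compares it against the recursive definition given in RFC 6962, and shows the ambiguity that a duplication rule admits. The leaves are placeholder values and the function names are hypothetical.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def root_with_promotion(leaves: list[bytes]) -> bytes:
    """Level-by-level root; a lone last node moves up unchanged."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # promoted, not paired with itself
        level = nxt
    return level[0]


def rfc6962_root(leaves: list[bytes]) -> bytes:
    """Recursive RFC 6962 definition: split at the largest power of two < n."""
    n = len(leaves)
    if n == 1:
        return leaves[0]
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(rfc6962_root(leaves[:k]), rfc6962_root(leaves[k:]))


def root_with_duplication(leaves: list[bytes]) -> bytes:
    """Duplicate-the-last-node rule, shown only to expose its ambiguity."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder leaves (already-hashed values), not real file leaves.
leaves = [hashlib.sha256(b"\x00" + bytes([i])).digest() for i in range(3)]
repeated = leaves + [leaves[-1]]  # same sequence with the final leaf repeated

duplication_is_ambiguous = root_with_duplication(leaves) == root_with_duplication(repeated)
promotion_is_ambiguous = root_with_promotion(leaves) == root_with_promotion(repeated)
```

Under duplication the three-leaf and four-leaf sequences share a root; under promotion they do not, because the three-leaf tree hashes the last leaf directly into the root while the four-leaf tree first hashes it with its copy.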
7 · What the customer submits, what crosses the wire
The browser, or the local command-line tool, computes every leaf hash and every interior node hash on the customer's machine. The data that crosses the wire to the office consists of the manifest — the ordered list of relative paths, per-file SHA-256 fingerprints, per-file leaf hashes, and per-file byte sizes — together with the resulting root. The file content itself is never transmitted. The office never holds, copies, or observes the bytes of any file in the anchored set.
A trade-off attends this design: the path names and the file sizes are part of the receipt that the office persists, and are therefore visible to the office in the way ordinary receipt metadata is visible. Customers for whom path names are themselves sensitive may anchor under hash-labeled paths, as noted above, in which case the manifest carries no human-readable filename information at all.
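The shape of the transmitted data can be sketched in Python as follows. The field names (files, path, sha256, leaf, size, odd_level_rule, root) and the JSON encoding are hypothetical; this document specifies what the manifest contains, not how it is serialized. The exclude list, which the manifest also records, is omitted for brevity.

```python
import hashlib
import json


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_manifest(files: dict[str, bytes]) -> dict:
    """Compute every hash locally; return only metadata and the root."""
    if not files:
        raise ValueError("a folder must contain at least one file")
    entries = []
    for rel_path in sorted(files, key=lambda p: p.encode("utf-8")):
        data = files[rel_path]
        digest = hashlib.sha256(data).digest()
        leaf = hashlib.sha256(
            b"\x00" + rel_path.encode("utf-8") + b"\x00" + digest
        ).digest()
        entries.append(
            {"path": rel_path, "sha256": digest.hex(), "leaf": leaf.hex(), "size": len(data)}
        )
    level = [bytes.fromhex(entry["leaf"]) for entry in entries]
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # lone last node is promoted
        level = nxt
    return {"odd_level_rule": "promote", "files": entries, "root": level[0].hex()}


# In-memory stand-ins for files read from disk.
files = {"report/ch2.txt": b"chapter two", "report/ch1.txt": b"chapter one"}
manifest = build_manifest(files)
wire_payload = json.dumps(manifest, indent=2)  # what would cross the wire
```

The payload carries paths, sizes, and hashes, and nothing from which file content can be read directly; under hash-labeled paths the path field would itself be a digest.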
8 · Inclusion proofs and selective disclosure
The structure's central evidentiary property is this. For any file in the anchored set, the customer can produce an inclusion proof consisting of at most ⌈log₂ N⌉ sibling hashes (at most one per level of the tree) that, when combined with the file's own leaf hash in the documented order, reconstructs the anchored root. A third party in possession of the original file, its recorded path, the inclusion proof, and the OpenTimestamps receipt for the root can verify that this specific file belonged to the anchored set while learning nothing about the other files beyond the sibling hashes carried in the proof.
The proof's size scales logarithmically with the set's size. For a folder of ten thousand files, the proof is at most fourteen sibling hashes (⌈log₂ 10,000⌉ = 14), or 448 bytes at 32 bytes per SHA-256 hash: well under a single kilobyte, and small enough to attach to a single email or dispute filing.
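Proof generation and the size figure for ten thousand files can be checked with a short Python sketch. The proof encoding used here, a list of (side, sibling hash) pairs in which side records whether the sibling sits to the left or the right, is a hypothetical choice of this sketch; the leaves are placeholder values rather than real file leaves.

```python
import hashlib
import math

HASH_BYTES = 32  # size of one SHA-256 digest


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def inclusion_proof(leaves: list[bytes], index: int):
    """Return (proof, root) for the leaf at `index`, using odd-level promotion."""
    if not 0 <= index < len(leaves):
        raise IndexError("leaf index out of range")
    proof = []
    level = list(leaves)
    while len(level) > 1:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append(("L" if sibling < index else "R", level[sibling]))
        # Otherwise this node is the lone last node: promoted, no sibling.
        nxt = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
        index //= 2
    return proof, level[0]


def fold_proof(leaf: bytes, proof) -> bytes:
    """Recombine a leaf with its siblings, bottom to top."""
    acc = leaf
    for side, sibling in proof:
        acc = node_hash(sibling, acc) if side == "L" else node_hash(acc, sibling)
    return acc


# Placeholder leaves standing in for a folder of ten thousand files.
leaves = [hashlib.sha256(b"\x00" + i.to_bytes(4, "big")).digest() for i in range(10_000)]
proof, root = inclusion_proof(leaves, 0)

bound = math.ceil(math.log2(len(leaves)))   # 14
proof_bytes = len(proof) * HASH_BYTES       # 448 for the first leaf
```

Leaves near the end of the list can have shorter proofs, because a promoted node contributes no sibling at the level where it is promoted; the bound is a maximum, not a fixed length.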
9 · Tamper detection
A single-bit modification to any file in the set produces a different file digest, which produces a different leaf hash, which propagates upward and produces a different root. A renaming of any file — even one preserving the bytes — produces a different leaf, because the path is bound into the leaf. An insertion or deletion of a file shifts the leaves of the tree and likewise alters the root. The structure is intentionally rigid: any modification of the anchored set produces a mismatch that any independent verifier observes upon recomputation.
The rigidity is the point. A folder receipt is useful precisely because it cannot be partially renounced; the customer cannot, after the fact, claim that one file was always different. The set is what was anchored, and the set is what the verifier reconstructs.
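The four kinds of modification described above can be exercised against a small in-memory folder. The sketch below is illustrative only: the file names and contents are invented, and folder_root is a hypothetical helper that combines the leaf construction, canonical ordering, and odd-level promotion described in this document.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def folder_root(files: dict[str, bytes]) -> bytes:
    """Root over path-bound leaves in UTF-8 byte order, with promotion."""
    level = [
        hashlib.sha256(
            b"\x00" + path.encode("utf-8") + b"\x00" + hashlib.sha256(data).digest()
        ).digest()
        for path, data in sorted(files.items(), key=lambda kv: kv[0].encode("utf-8"))
    ]
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]


baseline = {"a.txt": b"alpha", "b.txt": b"bravo", "c.txt": b"charlie"}
original = baseline["b.txt"]

# Flip the lowest bit of the first byte of one file.
bit_flipped = {**baseline, "b.txt": bytes([original[0] ^ 0x01]) + original[1:]}
# Rename one file while preserving its bytes.
renamed = {k: v for k, v in baseline.items() if k != "b.txt"}
renamed["b-renamed.txt"] = original
# Insert and delete.
inserted = {**baseline, "d.txt": b"delta"}
deleted = {k: v for k, v in baseline.items() if k != "c.txt"}

baseline_root = folder_root(baseline)
variant_roots = {
    name: folder_root(variant)
    for name, variant in [
        ("bit_flipped", bit_flipped), ("renamed", renamed),
        ("inserted", inserted), ("deleted", deleted),
    ]
}
mismatches = {name: r != baseline_root for name, r in variant_roots.items()}
```

Every entry in mismatches is true: each variant recomputes to a root other than the anchored one.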
10 · Verification independence
The standalone verifier extends naturally to the folder case. The customer, or a third party reviewing the receipt, holds the original file together with its recorded relative path, the inclusion proof, and the OpenTimestamps proof for the root; the full manifest is needed only when the entire set is being re-verified. The verifier recomputes the file's leaf hash from its bytes and its recorded path, walks the inclusion proof to reconstruct the root, and checks that root against the OpenTimestamps Bitcoin commitment. No call to the office is required for verification at any point. The receipt is the instrument; the chain is the trust anchor.
The independent verifier at /verify-js.html handles the single-file case in browser cryptography today; the folder extension preserves the same property.
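The folder-side portion of that check can be sketched in Python. The sketch is not the code of the published verifier; the proof encoding, a list of (side, sibling hash) pairs, and the function names are hypothetical. The final step, checking the reconstructed root against the OpenTimestamps receipt, is left to OpenTimestamps tooling and is not shown.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def folder_leaf(rel_path: str, file_bytes: bytes) -> bytes:
    digest = hashlib.sha256(file_bytes).digest()
    return hashlib.sha256(b"\x00" + rel_path.encode("utf-8") + b"\x00" + digest).digest()


def verify_inclusion(file_bytes: bytes, rel_path: str, proof, anchored_root: bytes) -> bool:
    """Recompute the leaf, walk the proof, and compare against the anchored root.

    `anchored_root` must separately be confirmed against the OpenTimestamps
    receipt; that step is outside this sketch.
    """
    acc = folder_leaf(rel_path, file_bytes)
    for side, sibling in proof:
        if side == "L":
            acc = node_hash(sibling, acc)
        elif side == "R":
            acc = node_hash(acc, sibling)
        else:
            raise ValueError("side must be 'L' or 'R'")
    return acc == anchored_root


# A three-file folder built by hand: root = H(H(la, lb), lc), lc promoted once.
la = folder_leaf("a.txt", b"alpha")
lb = folder_leaf("b.txt", b"bravo")
lc = folder_leaf("c.txt", b"charlie")
root = node_hash(node_hash(la, lb), lc)

proof_b = [("L", la), ("R", lc)]          # discloses b.txt only
proof_c = [("L", node_hash(la, lb))]      # one sibling: c.txt was promoted

genuine = verify_inclusion(b"bravo", "b.txt", proof_b, root)
altered_bytes = verify_inclusion(b"bravo!", "b.txt", proof_b, root)
altered_path = verify_inclusion(b"bravo", "other.txt", proof_b, root)
promoted = verify_inclusion(b"charlie", "c.txt", proof_c, root)
```

The verifier of b.txt receives two hashes and never sees the bytes or names of a.txt or c.txt; altering either the disclosed bytes or the claimed path causes the reconstruction to miss the anchored root.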
11 · Trade-offs and the v1 boundary
The first published version of the folder-anchoring extension makes the following choices explicit. The tree is a single-pass RFC 6962 binary hash tree over the canonically ordered set of leaves; no incremental-append log is provided in v1. Empty folders are not supported as leaves; a folder must contain at least one file that survives the exclude list. Symbolic links are not followed; a symlink encountered during enumeration is recorded in the manifest as a symlink with its target string but is not dereferenced. Browser-side cryptography reads each file into memory once during hashing; a folder containing files larger than browser memory permits is handled by the desktop command-line tool, which streams each file in megabyte chunks and never exceeds a fixed working-set size.
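The streaming behavior attributed to the command-line tool can be sketched in Python as follows. The sketch is not the tool's code: the 1 MiB chunk size and the function name are assumptions, and the temporary file exists only to make the example self-contained.

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; the tool's actual chunk size is not stated


def stream_sha256(path: str, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Hash a file without holding more than one chunk in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.digest()


# Demonstration: a file slightly larger than three chunks.
payload = os.urandom(3 * CHUNK_SIZE + 17)
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "large.bin")
    with open(sample, "wb") as handle:
        handle.write(payload)
    streamed = stream_sha256(sample)

in_memory = hashlib.sha256(payload).digest()
digests_match = streamed == in_memory
```

Because SHA-256 is computed incrementally, the streamed digest is identical to the digest of the whole file read at once, so the choice between browser and command-line hashing does not affect the resulting leaf.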
The boundaries are stated plainly because the boundary is part of the instrument. A receipt that overreached the v1 protocol would be a worse receipt, not a better one.
12 · Publication date and prior-art status
This document was first published on 2026-05-20. It is anchored to the Bitcoin chain on publication using the protocol it describes; the receipt identifier for this revision is recorded in the footer below once issuance completes. Subsequent revisions are anchored separately; their receipts are appended to the same footer record. The combination of detailed disclosure, MIT licensing of the verifier and protocol code, and a Bitcoin-anchored publication date establishes this page as prior art for the methods described. Any later filing — by any party — that claims novelty on the construction described here is rebuttable by reference to this publication.
Citations and verification
Publication receipt for this revision: pending Bitcoin commitment (the receipt id is recorded once issuance completes).