Method · Filenames

Why filenames are not stored.

The protocol distinguishes, in a manner deliberate and structural, between the content of a file and the labels by which a file is referred to in conversation. The content is what the receipt attests. The labels are not received, and are therefore not stored, and could not be disclosed even were disclosure compelled. The following describes the reasoning, the consequences, and the alternatives considered and rejected.

1 · A filename is a label, not content

A file's content is its bytes: the sequence of octets the operating system delivers when the file is opened for reading. The fingerprint computed by the protocol is a function of those bytes; no other property of the file influences it. Two files whose bytes are identical produce identical fingerprints; two files whose contents differ by even a single byte produce, with overwhelming probability, distinct fingerprints. The fingerprint is content-defined.
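The content-defined property can be illustrated with a minimal sketch. The hash function is an assumption: this page does not name the algorithm the protocol uses, and SHA-256 stands in for it here. The function name fingerprint is likewise illustrative.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a hex digest computed from the bytes alone.

    SHA-256 is an assumption made for illustration; the page does not
    name the hash function the protocol uses.
    """
    return hashlib.sha256(content).hexdigest()


original = b"Chapter seven, third draft.\n"
identical_copy = bytes(original)              # same octets, separate object
altered = b"Chapter seven, third draft!\n"    # one byte differs

same = fingerprint(original) == fingerprint(identical_copy)
different = fingerprint(original) != fingerprint(altered)
```

The function accepts nothing but bytes, so no name, path, or timestamp can enter the computation.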

A filename, by contrast, is a label imposed by a filesystem, by a sender, by an editor's "Save As" dialogue, or by any one of a long sequence of intermediate processes. A filename may be altered without changing the bytes. The bytes may be moved to a new filename without alteration. The filename "manuscript-final.docx" and the filename "draft-3.docx" can refer to the same bytes; they can also refer to wholly different bytes. The label and the content stand in no necessary relation to each other.

An attestation about the bytes is therefore, by its nature, mute on the question of the label. The receipt records what was received; what was received was the fingerprint of the bytes; the label was part of neither the submission nor the record.

2 · The structural privacy contract

The privacy posture of the office is intended to be structural rather than promissory. A promissory privacy contract is one in which an operator pledges not to disclose information already held; the strength of the pledge depends on the operator's continued willingness and ability to honour it, on the absence of compulsion, and on the operator's continued solvency. A structural privacy contract is one in which the information is not held; the strength of the contract depends only on the architecture, and the architecture cannot be coerced into surrendering what it does not contain.

Filenames are excluded from the receipt on the structural model. The browser performs the hashing locally and submits only the resulting digest. The filename is never transmitted; the office does not receive it; the office cannot record it; the office cannot return it under subpoena, cannot lose it in a breach, cannot inadvertently include it in a log, and cannot accidentally render it on a public page. The contract is enforced by absence rather than by undertaking.
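The shape of such a submission might be sketched as follows. The office's client runs in the browser; the Python below is only a stand-in for that logic, and the JSON layout, the "fingerprint" key, and the use of SHA-256 are all assumptions made for illustration rather than the published wire format.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_submission(path: Path) -> str:
    """Return the body that would leave the customer's machine.

    Only the digest of the bytes is included; path.name is never copied
    into the payload. The "fingerprint" key is a hypothetical field name.
    """
    digest = hashlib.sha256(path.read_bytes()).hexdigest()  # SHA-256 assumed
    return json.dumps({"fingerprint": digest})


with tempfile.TemporaryDirectory() as tmp:
    sensitive = Path(tmp) / "codename-aurora-client-x.docx"
    sensitive.write_bytes(b"confidential draft\n")
    body = build_submission(sensitive)

# The filename was used to open the file locally but is absent from the body.
name_leaked = "aurora" in body
```

Because the payload is built from the digest alone, there is no field in which a filename could travel, which is the sense in which the exclusion is structural.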

This is a stronger guarantee than the one customarily offered. A guarantee not to disclose stored information is contingent on the operator. A guarantee that the information was never received is independent of the operator. The first relies on the operator's good faith and continued existence; the second relies only on the architecture, which is published and which the customer is invited to inspect.

3 · The practical consequence: identical fingerprints, distinct names

A direct consequence of the structural exclusion is that two receipts for the same byte sequence under different filenames produce the same fingerprint and, accordingly, the same proof of existence. The protocol cannot distinguish between them, and the office holds that this is the correct behaviour rather than an inadequacy. The fingerprint is the answer to the question the protocol is willing to answer; if the bytes are the same, the answer is the same.
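The consequence can be observed on disk. In the sketch below the same bytes are written under two names and hashed from each file in turn; SHA-256 and the helper name fingerprint_file are assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint_file(path: Path) -> str:
    """Hash a file's bytes in chunks; the path is used only to open it."""
    hasher = hashlib.sha256()  # SHA-256 assumed; the page names no algorithm
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


payload = b"identical manuscript bytes\n"

with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "manuscript-final.docx"
    second = Path(tmp) / "draft-3.docx"
    first.write_bytes(payload)
    second.write_bytes(payload)
    fp_first = fingerprint_file(first)
    fp_second = fingerprint_file(second)

indistinguishable = fp_first == fp_second
```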

Where the customer requires distinction — for example, where the same image is anchored under two business identifiers — the customer may attach a client_label, a short human-readable string of the customer's choosing, which is recorded on the receipt and presented in the vault listing. The client label is a deliberate, optional, customer-controlled annotation; it is not the filename, and the customer chooses whether and how to populate it. The label permits per-customer disambiguation without imposing any disclosure of the underlying file's identity beyond what the customer has chosen to record.

A receipt is therefore self-describing on its own terms: the fingerprint identifies the bytes; the client label, if present, records what the customer chose to call this attestation. The label is not, and is not represented to be, the file's filename. It is whatever short string the customer found useful at the moment of anchoring.
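A receipt of this kind might be modelled as below. The field names client_label and created_at appear on this page; receipt_id, fingerprint, the identifier format, and the vault_line helper are hypothetical names chosen for the sketch, not the office's schema.

```python
import hashlib
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    receipt_id: str                      # hypothetical identifier format
    fingerprint: str                     # digest of the bytes
    created_at: str                      # timestamp as recorded on the receipt
    client_label: Optional[str] = None   # optional, customer-chosen annotation


def vault_line(receipt: Receipt) -> str:
    """Render one listing row; an absent label is shown as such."""
    label = receipt.client_label if receipt.client_label is not None else "(no label)"
    return f"{receipt.created_at}  {receipt.fingerprint[:12]}  {label}"


image_fp = hashlib.sha256(b"same image bytes").hexdigest()  # SHA-256 assumed

# The same bytes anchored under two business identifiers.
first = Receipt("r-0001", image_fp, "2026-05-19T00:00:00Z", "Brand A logo")
second = Receipt("r-0002", image_fp, "2026-05-19T00:05:00Z", "Brand B logo")
unlabelled = Receipt("r-0003", image_fp, "2026-05-19T00:10:00Z")

field_names = {f.name for f in fields(Receipt)}
```

The record has no filename field at all; distinction between the first two receipts rests entirely on what the customer chose to write in the label.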

4 · The cost of not storing filenames

The structural exclusion has a cost, and the office is unwilling to leave it understated. A customer who anchors many files cannot, from the office's records alone, recover the human-readable name of the file to which any given receipt corresponds. The receipt records a fingerprint, not a name; to make the receipt useful in conversation ("the receipt for the third draft of chapter seven"), the customer must maintain their own correspondence between receipt identifiers and the files those identifiers describe.

The /account vault is provided as a convenience for this purpose. Receipts issued to an authenticated account are stored under the account's HMAC-keyed identifier; the vault lists those receipts, their created_at timestamps, their fingerprints, and whatever client_label the customer chose to attach. The customer may, at their option, populate the client label with as much or as little human-readable detail as they wish; the office's holdings remain the same in either case.
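What an HMAC-keyed identifier could look like is sketched below. This page does not say which value is keyed, which key is used, or which digest; the account handle, the server-held key, and HMAC-SHA-256 are all assumptions, and the names are illustrative.

```python
import hashlib
import hmac


def vault_identifier(server_key: bytes, account_handle: str) -> str:
    """Derive an opaque storage key for an account.

    Assumption: the identifier is HMAC-SHA-256 over some account handle
    under a key held by the operator. Without the key, the identifier
    cannot be recomputed from the handle.
    """
    return hmac.new(server_key, account_handle.encode("utf-8"), hashlib.sha256).hexdigest()


key_a = b"example-key-not-a-real-secret"
key_b = b"a-different-example-key"

ident = vault_identifier(key_a, "alice@example.com")
ident_again = vault_identifier(key_a, "alice@example.com")
ident_other_key = vault_identifier(key_b, "alice@example.com")
```

Under these assumptions the vault can group an account's receipts under a stable key without the raw handle appearing in the stored identifier.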

For customers whose practice involves many files, a sensible discipline is to keep a private ledger — a text file, a spreadsheet, or a directory listing — that maps the customer's chosen filenames to the corresponding receipt identifiers. The ledger lives on the customer's own machine, under the customer's own control, and exits the office's threat model entirely.
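A private ledger of the kind described could be as small as a CSV file. The sketch below is one possible layout, not a format the office prescribes; the column names, the helper names, and the receipt identifiers are hypothetical.

```python
import csv
import tempfile
from pathlib import Path
from typing import List

COLUMNS = ["filename", "receipt_id"]  # hypothetical column names


def record(ledger: Path, filename: str, receipt_id: str) -> None:
    """Append one filename-to-receipt mapping, writing a header on first use."""
    is_new = not ledger.exists()
    with ledger.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow({"filename": filename, "receipt_id": receipt_id})


def receipts_for(ledger: Path, filename: str) -> List[str]:
    """Return every receipt identifier recorded against a filename."""
    with ledger.open(newline="", encoding="utf-8") as handle:
        return [row["receipt_id"] for row in csv.DictReader(handle)
                if row["filename"] == filename]


with tempfile.TemporaryDirectory() as tmp:
    ledger = Path(tmp) / "ledger.csv"
    record(ledger, "chapter-07-draft-3.docx", "r-0001")
    record(ledger, "cover-image.png", "r-0002")
    found = receipts_for(ledger, "chapter-07-draft-3.docx")
    missing = receipts_for(ledger, "unknown.txt")
```

The ledger never leaves the customer's machine, so the names it holds are disclosed to no one unless the customer chooses to share the file.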

5 · The alternative architecture, considered and rejected

The alternative — to accept and persist filenames alongside receipts — was considered at length and rejected. The enumeration below records the failure modes that would attend that architecture, and explains why each is judged to outweigh the convenience the alternative would confer.

Disclosure surface. Every datum the office holds is a datum the office may be compelled, by lawful process or by adversarial breach, to surrender. Filenames frequently contain matter the customer would consider sensitive: project names, client names, internal codenames, dated draft markers, jurisdictional indicators, intimate annotations of the customer's working habits. The architecture that holds them is the architecture that may be made to disclose them. The architecture that does not hold them is the architecture that cannot.

Search-warrant exposure. A warrant or subpoena directed at the office asks for what the office holds. If the office holds filenames, the office must produce them; if the office does not, the office cannot. The distinction matters in the design of the instrument. A receipt that travels with a filename attached travels with a beacon for forensic correlation; a receipt that does not, does not.

Accidental publication. Every operational error — a misconfigured log, a debug page left visible, a mis-applied access control, a vendor's index of public objects — risks exposing what the office holds. Holdings that include filenames are holdings that may be inadvertently published in human-readable form. The architecture that excludes them excludes the failure mode entirely.

Breach radius. The blast radius of a hypothetical full-database compromise is determined by what the database contains. A database of fingerprints, opaque to human reading without correlation against an external file held by the customer, conveys far less to an attacker than a database of fingerprints accompanied by the names by which their referents are commonly known. The architecture chooses the smaller blast radius.

Convenience misalignment. The convenience the alternative would confer — that the customer not maintain their own mapping — is real but small; the discipline of a local ledger is modest, and customers who do not require human-readable disambiguation incur no cost at all. The privacy guarantee, by contrast, is a feature the customer cannot replicate at home. The architecture allocates each property to the party best positioned to provide it.

6 · A customer-facing summary

The customer who requires human-readable file identification keeps that mapping on their own machine. The office issues an attestation about the bytes; the customer maintains, in private, the correspondence between the bytes and whatever name those bytes go by in their working practice. The receipt is the instrument; the local ledger is the index to the instrument. Both are necessary; both serve their respective parties; neither is held by the party that does not need it.

The receipt remains complete as an instrument under this division. The fingerprint is sufficient to identify the bytes; the chain commitment is sufficient to establish the time; the customer's local ledger is sufficient to record the name. The three together compose a record that is verifiable without the office, private without reliance on the office's good faith, and durable without dependence on the office's continued existence.
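The first of these three checks, that the fingerprint identifies the bytes, can be performed by the customer alone. The sketch below recomputes the digest locally and compares it with the value on a receipt; SHA-256 and the function name are assumptions, and verification of the chain commitment is outside its scope.

```python
import hashlib
import hmac


def matches_receipt(content: bytes, receipt_fingerprint: str) -> bool:
    """Recompute the digest locally and compare it with the receipt's value.

    No request to the office is made. SHA-256 is assumed, and the
    fingerprint is expected to be an ASCII hex string.
    """
    recomputed = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(recomputed, receipt_fingerprint.lower())


content = b"the bytes the customer kept\n"
receipt_fingerprint = hashlib.sha256(content).hexdigest()

intact = matches_receipt(content, receipt_fingerprint)
tampered = matches_receipt(content + b" ", receipt_fingerprint)
```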

7 · Publication date and prior-art status

This page was first published on 2026-05-19. It is anchored to the Bitcoin chain on publication using the protocol it describes; the receipt identifier for this revision is recorded in the footer below. The combination of detailed disclosure, MIT licensing of the verifier, and a chain-anchored publication date establishes this page as prior art for the structural-privacy framing and the rejection of the filename-storage alternative.

Citations and verification

Publication receipt for this revision: pending Bitcoin commitment (the receipt id is recorded once issuance completes).