Computer Storage
Perhaps the most fundamental aspect of a computing paradigm, especially for its social and political implications, is how we store, identify, organize, and control access to the data we are processing.
There's a war out there, old friend, a world war. And it's not about who's got the most bullets. It's about who controls the information, what we see and hear, how we work, what we think. It's all about the information. -- Sneakers (1992) @ 01:54
The State of Storage
Block Storage
Hard disk drives, solid-state drives, floppy disks, and optical disks all present a block-storage interface. Each block holds a fixed number of bits, each drive holds a fixed number of blocks, and a block is identified by a natural number. Access control is usually purely physical. Some drives integrate encryption, but that encryption is of dubious quality and usually grants access to either all blocks or none.
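As a rough illustration of the interface described above, here is a minimal in-memory sketch. The class name `BlockDevice` and its methods are hypothetical, and block size is measured in bytes rather than bits for convenience. It is not modeled on any particular drive protocol.

```python
class BlockDevice:
    """A toy block device: a fixed number of fixed-size blocks,
    each addressed by a natural number (hypothetical interface)."""

    def __init__(self, block_size: int, block_count: int) -> None:
        self.block_size = block_size      # bytes per block (assumption: bytes, not bits)
        self.block_count = block_count    # total number of blocks on the device
        self._data = bytearray(block_size * block_count)

    def _check(self, index: int) -> None:
        # Blocks are identified by natural numbers below block_count.
        if not 0 <= index < self.block_count:
            raise IndexError(f"block {index} out of range")

    def read_block(self, index: int) -> bytes:
        self._check(index)
        start = index * self.block_size
        return bytes(self._data[start:start + self.block_size])

    def write_block(self, index: int, block: bytes) -> None:
        self._check(index)
        # Blocks are always written whole; there are no partial writes.
        if len(block) != self.block_size:
            raise ValueError("a block must be written whole")
        start = index * self.block_size
        self._data[start:start + self.block_size] = block
```

Note that nothing in this interface knows who is asking: anyone holding the device can read or write any block, which is the "purely physical" access control mentioned above.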
Files
The current dominant paradigm for data storage is the file: a named, variable-length chunk of data stored along with a variety of metadata, most universally the latest modification time. Files (and hierarchical file systems) originated on mainframe computers in the 1960s as a way to delineate which users could access and modify which data. Because the primary means of access control is still physical, and file permissions are only a secondary means enforced by the system itself, the administrators of a system can access any file stored on it.
Files may be identified by their host and location on that host, but different hosts encode the location differently. Copying a file to another location on the same host or to another host results in a new file with a new identifier. This makes referencing the data in one file from another file ad hoc and error-prone.
Mutual Storage Paradigms
For a storage paradigm to be suitable for mutual computing:
- Copying objects from one computer to another must not change the global identifier for that object.
- Multiple operators/computers must be able to edit an object without needing to check with each other. If edits conflict, the operator(s) must be able to reconcile those conflicts.
- Physical possession of an object should not grant the ability to read it (it should be encrypted), and must not grant the ability to edit it (it must be cryptographically authenticated).
- Edits to objects should be able to propagate between computers on any medium (networks, flash drives, exports to files that can be sent with popular messaging applications).
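The first requirement in the list above can be sketched with content-derived identifiers. This is one possible approach, not the only one. The names `object_id`, `Store`, and `copy_object` are hypothetical, and SHA-256 is an arbitrary choice of hash. The sketch does not address encryption, authentication, or conflict reconciliation.

```python
import hashlib


def object_id(data: bytes) -> str:
    """Derive an identifier from the content alone, not from host or path."""
    return hashlib.sha256(data).hexdigest()


class Store:
    """A toy object store standing in for one computer (hypothetical)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        oid = object_id(data)
        self._objects[oid] = data
        return oid

    def get(self, oid: str) -> bytes:
        data = self._objects[oid]
        # Anyone can re-check that the bytes match the identifier.
        if object_id(data) != oid:
            raise ValueError("stored data does not match its identifier")
        return data


def copy_object(oid: str, source: Store, dest: Store) -> str:
    """Copy an object between stores and return its identifier at the destination."""
    return dest.put(source.get(oid))
```

Copying between two stores leaves the identifier unchanged, so a reference made on one computer remains valid on another:

```python
laptop, server = Store(), Store()
oid = laptop.put(b"meeting notes")
assert copy_object(oid, laptop, server) == oid
```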
Versioned Nodes
Versioned Nodes is a storage paradigm that bundles data into nodes that also contain a list of links to other nodes. There are two node types, blobs and versions. Blobs can only be linked to immutably (they are typically identified by a hash), while versions may be linked to either immutably (by a signature) or mutably (by a public key). Versions also record their parent versions.
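The node and link types described above could be modeled roughly as follows. This is a data-model sketch only: every name here is hypothetical, the hash input layout is an assumption, and the `public_key` and `signature` fields are opaque placeholders, since no real signing is performed.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    BLOB_HASH = b"B"          # immutable link to a blob
    VERSION_SIGNATURE = b"S"  # immutable link to one specific version
    PUBLIC_KEY = b"K"         # mutable link: follows whatever version the key holder signs


@dataclass(frozen=True)
class Link:
    kind: LinkKind
    value: bytes


@dataclass(frozen=True)
class Blob:
    data: bytes
    links: tuple[Link, ...] = ()

    def link(self) -> Link:
        """Blobs can only be linked to immutably, here by a hash of data and links."""
        h = hashlib.sha256()
        h.update(len(self.data).to_bytes(8, "big"))
        h.update(self.data)
        for item in self.links:
            h.update(item.kind.value)
            h.update(len(item.value).to_bytes(8, "big"))
            h.update(item.value)
        return Link(LinkKind.BLOB_HASH, h.digest())


@dataclass(frozen=True)
class Version:
    data: bytes
    links: tuple[Link, ...]
    parents: tuple[Link, ...]  # immutable links to the parent versions
    public_key: bytes          # placeholder bytes, not a real key
    signature: bytes           # placeholder bytes, not a real signature

    def immutable_link(self) -> Link:
        return Link(LinkKind.VERSION_SIGNATURE, self.signature)

    def mutable_link(self) -> Link:
        return Link(LinkKind.PUBLIC_KEY, self.public_key)
```

Because a blob's hash covers its links, changing anything reachable through immutable links changes the hash of every blob above it, while a mutable link keeps pointing at the same public key across versions.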
Access control is capability-based. If you open a node with just a read-capability, you get a read-capability to all the nodes it links to. If you also use a write-capability, you may additionally get write-capabilities for some of those linked nodes, if they are provided. Despite being immutable, blobs can also have write-capabilities, which serve only to reveal the write-capabilities of the nodes they reference.
Versioned nodes themselves do not provide a nice interface for the operator, as they do not have human-readable names. However, they are amenable to a variety of naming and search systems that are challenging for file systems, in particular because references to a node always remain valid.
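The way capabilities propagate through links can be illustrated with a toy model. All names below are hypothetical. In a real system the two sections would presumably be protected cryptographically by keys carried in the capabilities. Here a plain identifier comparison stands in for that, which is a loud simplification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ReadCap:
    node_id: str


@dataclass(frozen=True)
class WriteCap:
    node_id: str


@dataclass
class Node:
    node_id: str
    # Read-capabilities for every linked node.
    read_section: list[ReadCap] = field(default_factory=list)
    # Write-capabilities for those linked nodes where the author chose to provide one.
    write_section: list[WriteCap] = field(default_factory=list)


def open_node(
    node: Node, read_cap: ReadCap, write_cap: Optional[WriteCap] = None
) -> tuple[list[ReadCap], list[WriteCap]]:
    """Return the capabilities revealed by opening `node`.

    Assumption: matching node_id stands in for successful decryption.
    """
    if read_cap.node_id != node.node_id:
        raise PermissionError("read-capability does not match this node")
    write_caps: list[WriteCap] = []
    if write_cap is not None:
        if write_cap.node_id != node.node_id:
            raise PermissionError("write-capability does not match this node")
        write_caps = list(node.write_section)
    return list(node.read_section), write_caps
```

For example, a node linking to `a` and `b` but carrying a write-capability only for `a`:

```python
doc = Node("doc", [ReadCap("a"), ReadCap("b")], [WriteCap("a")])
reads, writes = open_node(doc, ReadCap("doc"))
# reads covers both linked nodes; writes is empty without a write-capability
reads, writes = open_node(doc, ReadCap("doc"), WriteCap("doc"))
# writes now contains only WriteCap("a")
```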