mutual computing project

Computer Storage

Perhaps the most fundamental aspect of a computing paradigm, especially for its social and political implications, is how we store, identify, organize, and control access to the data we are processing.

There's a war out there, old friend, a world war. And it's not about who's got the most bullets. It's about who controls the information, what we see and hear, how we work, what we think. It's all about the information. -- Sneakers (1992) @ 01:54

The State of Storage

Block Storage

Hard disk drives, solid-state drives, floppy disks, and optical disks all present a block-storage interface. Each block holds a fixed number of bits, each drive holds a fixed number of blocks, and a block is identified by a natural number. Access control is usually purely physical. Some drives integrate encryption, but that encryption is of dubious quality and usually grants access to either all blocks or none.
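The block-storage interface described above can be sketched in a few lines of Python. This is only an in-memory illustration: the class name BlockDevice, its method names, and the block sizes are assumptions made for the example, not the interface of any real drive or driver.

```python
class BlockDevice:
    """In-memory stand-in for a drive: a fixed number of fixed-size blocks."""

    def __init__(self, block_count: int, block_size: int = 512) -> None:
        self.block_count = block_count
        self.block_size = block_size
        # Every block starts out zero-filled.
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def _check(self, index: int) -> None:
        # Blocks are identified by a natural number below block_count.
        if not 0 <= index < self.block_count:
            raise IndexError(f"block {index} out of range")

    def read_block(self, index: int) -> bytes:
        self._check(index)
        return self._blocks[index]

    def write_block(self, index: int, data: bytes) -> None:
        self._check(index)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._blocks[index] = bytes(data)


# Hypothetical usage: a tiny 8-block device with 16-byte blocks.
disk = BlockDevice(block_count=8, block_size=16)
disk.write_block(3, b"hello, block #3!")
```

Note that nothing in this interface names an owner or checks a permission: whoever holds the device can read or write every block, which is the all-or-nothing property noted above.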

Files

The currently dominant paradigm for data storage is the file: a named, variable-length chunk of data stored along with a variety of metadata, most universally the latest modification time. Files (and hierarchical file systems) originated on mainframe computers in the 1960s as a way to delineate which users could access and modify which data. Because the primary means of access control is physical, and file permissions are only a secondary means enforced by the system itself, the administrators of a system can access any file stored on it.

Files may be identified by their host and their location on that host, but different hosts encode the location differently. Copying a file to another location on the same host, or to another host, results in a new file. This makes referencing the data in one file from another file ad hoc and error-prone.

Mutual Storage Paradigms

Storage paradigms suitable for mutual computing:

Versioned Nodes

Versioned Nodes is a storage paradigm that bundles data into nodes that also contain a list of links to other nodes. There are two node types, blobs and versions. Blobs can only be linked to immutably (they are typically identified by a hash), while versions may be linked to either immutably (by a signature) or mutably (by a public key). Versions also record their parent versions.
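One way to picture the node types and link kinds described above is the following Python sketch. It is not the project's actual data format: the names Link, Blob, Version, and NodeStore are hypothetical, SHA-256 is an assumed choice of hash, and the signature and public-key fields are opaque placeholder bytes (no real signing or verification is performed).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Link:
    kind: str      # "hash" or "signature" (immutable), or "key" (mutable)
    target: bytes


def _field(h, value: bytes) -> None:
    # Length-prefix each field so that field boundaries are unambiguous.
    h.update(len(value).to_bytes(8, "big"))
    h.update(value)


@dataclass(frozen=True)
class Blob:
    data: bytes
    links: Tuple[Link, ...] = ()

    def node_id(self) -> bytes:
        # A blob is identified by a hash over its data and its links.
        h = hashlib.sha256()
        _field(h, self.data)
        for link in self.links:
            _field(h, link.kind.encode())
            _field(h, link.target)
        return h.digest()


@dataclass(frozen=True)
class Version:
    data: bytes
    public_key: bytes                 # placeholder bytes, not a real key
    signature: bytes                  # placeholder bytes, not a real signature
    parents: Tuple[bytes, ...] = ()   # signatures of the parent versions
    links: Tuple[Link, ...] = ()


class NodeStore:
    def __init__(self) -> None:
        self.blobs: Dict[bytes, Blob] = {}
        self.versions: Dict[bytes, Version] = {}
        self.heads: Dict[bytes, bytes] = {}   # public key -> latest signature

    def put_blob(self, blob: Blob) -> Link:
        self.blobs[blob.node_id()] = blob
        return Link("hash", blob.node_id())

    def put_version(self, version: Version) -> Link:
        self.versions[version.signature] = version
        self.heads[version.public_key] = version.signature
        return Link("signature", version.signature)

    def resolve(self, link: Link):
        if link.kind == "hash":
            return self.blobs[link.target]
        if link.kind == "signature":
            return self.versions[link.target]
        if link.kind == "key":
            # A mutable link follows the key to whatever version is newest.
            return self.versions[self.heads[link.target]]
        raise ValueError(f"unknown link kind: {link.kind}")


# Hypothetical usage: one blob referenced by two versions under the same key.
store = NodeStore()
photo = store.put_blob(Blob(b"pixels"))
v1 = store.put_version(Version(b"album v1", b"KEY", b"SIG1", links=(photo,)))
v2 = store.put_version(
    Version(b"album v2", b"KEY", b"SIG2", parents=(b"SIG1",), links=(photo,))
)
mutable = Link("key", b"KEY")
```

In this sketch the link v1 keeps resolving to "album v1" after v2 is stored, while the mutable link follows the key to "album v2".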

Access control is capability-based. If you open a node with just a read-capability, you get a read-capability to all the nodes it links to. If you also use a write-capability, you may additionally get write-capabilities for some of those linked nodes, if they are provided. Despite being immutable, blobs can also have write-capabilities, which only reveal the write-capabilities of the nodes they reference.
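The capability rules above can be illustrated with a toy model in which capabilities are unguessable random tokens. This is a sketch under stated assumptions, not the project's design: a real system would more likely derive capabilities from cryptographic keys, and the names CapNode and CapStore are hypothetical.

```python
import hmac
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CapNode:
    data: bytes
    write_cap: bytes
    # Each link holds the target's read-capability and, optionally,
    # its write-capability.
    links: List[Tuple[bytes, Optional[bytes]]] = field(default_factory=list)


class CapStore:
    def __init__(self) -> None:
        self._nodes: Dict[bytes, CapNode] = {}

    def create(self, data: bytes, links=()) -> Tuple[bytes, bytes]:
        # Assumption: capabilities are random 16-byte tokens.
        read_cap = secrets.token_bytes(16)
        write_cap = secrets.token_bytes(16)
        self._nodes[read_cap] = CapNode(data, write_cap, list(links))
        return read_cap, write_cap

    def open(self, read_cap: bytes, write_cap: Optional[bytes] = None):
        node = self._nodes[read_cap]
        # A read-capability always reveals read-capabilities to linked nodes.
        read_links = [r for r, _ in node.links]
        write_links: Dict[bytes, bytes] = {}
        if write_cap is not None:
            if not hmac.compare_digest(write_cap, node.write_cap):
                raise PermissionError("invalid write-capability")
            # A write-capability additionally reveals whichever
            # write-capabilities the node provides for its links.
            write_links = {r: w for r, w in node.links if w is not None}
        return node.data, read_links, write_links


# Hypothetical usage: a directory that links to one writable node
# and one read-only node.
store = CapStore()
note_r, note_w = store.create(b"shared note")
log_r, _log_w = store.create(b"read-only log")
dir_r, dir_w = store.create(b"directory", links=[(note_r, note_w), (log_r, None)])
```

Opening the directory with dir_r alone yields read access to both linked nodes. Opening it with dir_r and dir_w also yields note_w, but nothing for the log, because no write-capability was provided for that link.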

Versioned nodes themselves do not provide a nice interface for the operator, as they do not have human-readable names. However, they are amenable to a variety of naming and search systems that are challenging to build on file systems, in particular because references to a node always remain valid.