The Internet protocol suite is wonderful, but it was designed before the advent of modern cryptography and without the benefit of hindsight. On the modern Internet, cryptography is typically squeezed into a single, incredibly complex layer, Transport Layer Security (TLS; formerly known as Secure Sockets Layer, or SSL). Over the last few months, three entirely unrelated (but equally catastrophic) bugs have been uncovered in three independent TLS implementations (Apple SSL/TLS, GnuTLS, and most recently OpenSSL, which powers most “secure” servers on the Internet), making the TLS system difficult to trust in practice.
What if cryptographic functions were spread out into more layers? Would the stack of layers become too tall, inefficient, and hard to debug, making the problem worse instead of better? On the contrary, I propose that appropriate cryptographic protocols could replace most existing layers, improving security as well as other functions generally not thought of as cryptographic, such as concurrency control of complex data structures, lookup or discovery of services and data, and decentralized passwordless login. Perhaps most importantly, the new architecture would enable individuals to internetwork as peers rather than as tenants of the telecommunications oligopoly, putting net neutrality directly in the hands of citizens and potentially enabling a drastically more competitive bandwidth market.
|Layer||Current OSI model||In practice||Proposed update|
|2||Data Link||e-UTRA (LTE), 802.11 (WiFi), 802.3 (Ethernet), etc.||Data Link|
Of course, the layers I propose will doubtless introduce new problems of their own, but I’d like to start this conversation with some concrete ideas, even if I don’t have a final answer. (Please feel free to email me your comments or tweet @davidad.)
Descriptions follow for each of the five new layers I suggest, four of which are named after common information security requirements, and one of which (Transactions) is borrowed from database requirements (and also vaguely suggestive of cryptocurrency).
General disclaimer for InfoSec articles: Reading this article does not qualify you to design secure systems. Writing this article does not qualify me to design secure systems. In fact, nobody is qualified to design secure systems. A system should not be considered secure unless it has been reviewed by multiple security experts and resisted multiple serious attempts to violate its security claims in practice. The information contained in this article is offered “as is” and without warranties of any kind (express, implied, and statutory), all of which the author expressly disclaims to the fullest extent permitted by law.
Data Link and Physical layers
For our purposes today, the Data Link and Physical layers are a black box (perhaps literally), to which we have an interface (the “network interface”) which looks like a transmit queue and a receive queue. These queues can store “payloads” of anywhere from 1 to 1280 octets (bytes). The next layer in the stack can push a payload onto the Data Link transmit queue (and possibly get an error if it’s full) and can pop a payload from the Data Link receive queue (and possibly get an error if it’s empty). The Data Link layer is responsible for (eventually) flushing the transmit queue, and any payload which leaves the transmit queue must appear on the receive queues of all other devices connected to the same channel (a technical term, which may refer to a radio channel in the case of cellular devices, or simply to a particular length of cable in a point-to-point wired connection).
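To make the queue interface concrete, here is a minimal in-memory sketch. The class name `DataLinkStub`, the `flush` helper, the exception names, and the queue depth of 64 are all hypothetical; only the 1 to 1280 octet bound and the push/pop/broadcast behavior come from the description above.

```python
from collections import deque

MAX_PAYLOAD = 1280  # octets, the upper bound given in the text


class QueueFull(Exception):
    pass


class QueueEmpty(Exception):
    pass


class DataLinkStub:
    """Hypothetical in-memory stand-in for a network interface."""

    def __init__(self, capacity=64):
        self.capacity = capacity  # assumed queue depth, not from the text
        self.tx = deque()         # transmit queue
        self.rx = deque()         # receive queue

    def push(self, payload: bytes) -> None:
        """Queue a payload for transmission, or raise if it cannot be queued."""
        if not 1 <= len(payload) <= MAX_PAYLOAD:
            raise ValueError("payload must be 1 to 1280 octets")
        if len(self.tx) >= self.capacity:
            raise QueueFull()
        self.tx.append(payload)

    def pop(self) -> bytes:
        """Take the oldest received payload, or raise if none is waiting."""
        if not self.rx:
            raise QueueEmpty()
        return self.rx.popleft()


def flush(sender, channel):
    """Deliver every queued payload to all other devices on the channel."""
    while sender.tx:
        payload = sender.tx.popleft()
        for device in channel:
            if device is not sender:
                device.rx.append(payload)
```

With three stubs on one channel, a payload pushed on the first and flushed shows up on the receive queues of the other two, but not on the sender's own.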
Integrity layer
We would like a received payload to self-evidently be the same payload which was sent. Although the Data Link layer is supposed to provide such an assurance, various kinds of attacks on the system might invalidate this assumption. Integrity protocols mitigate these attacks:
|Paranoia Level||Attacks||Mitigation||Common Implementation||My Preferred Implementation|
|1||Thermal noise, cosmic rays||checksum hash||TCP Checksum||CRC-32C|
|2||Deliberate corruption||cryptographic hash||SHA-1||BLAKE2b|
|3||Spoofing of trusted contacts||keyed hash||HMAC-SHA1||SipHash|
|4||Spoofing of strangers||public-key signature of cryptographic hash||SHA-1 + RSA||BLAKE2b + Ed25519|
Integrity protocols are fairly simple: the appropriate verification material is placed at the beginning of every Data Link payload. The Integrity layer exposes the same kind of “transmit queue and receive queue” interface as the Data Link layer, but the payload which can be passed to the Integrity layer must be somewhat smaller, so that there is room for the verification material and the Integrity payload together to fit into 1280 octets. Overhead ranges from 4 octets for a CRC-32C checksum to 96 octets for an Ed25519 signature (64 octets) sent together with the signer’s 32-octet public key.
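The framing described above can be sketched for paranoia levels 1 to 3 with only the Python standard library. This is an illustration under stated substitutions, not the preferred implementations from the table: `zlib.crc32` is plain CRC-32 rather than CRC-32C, keyed BLAKE2b stands in for SipHash, and level 4 (Ed25519) is omitted because the standard library has no signature primitive. The function names and the tag lengths of 32 and 16 octets are assumptions.

```python
import hashlib
import hmac
import zlib

MAX_DATA_LINK_PAYLOAD = 1280  # octets


def tag_checksum(body: bytes) -> bytes:
    # Level 1 stand-in: zlib's CRC-32 (not CRC-32C), 4 octets.
    return zlib.crc32(body).to_bytes(4, "big")


def tag_hash(body: bytes) -> bytes:
    # Level 2: unkeyed BLAKE2b, truncated to 32 octets (an assumed length).
    return hashlib.blake2b(body, digest_size=32).digest()


def make_keyed_tag(key: bytes):
    # Level 3 stand-in: keyed BLAKE2b instead of SipHash, 16 octets.
    def tag(body: bytes) -> bytes:
        return hashlib.blake2b(body, key=key, digest_size=16).digest()
    return tag


def frame(body: bytes, tag_fn) -> bytes:
    """Prepend verification material and enforce the Data Link size limit."""
    framed = tag_fn(body) + body
    if len(framed) > MAX_DATA_LINK_PAYLOAD:
        raise ValueError("tag plus body exceeds 1280 octets")
    return framed


def unframe(framed: bytes, tag_fn, tag_len: int) -> bytes:
    """Split off the tag, recompute it over the body, and compare."""
    tag, body = framed[:tag_len], framed[tag_len:]
    if not hmac.compare_digest(tag, tag_fn(body)):
        raise ValueError("integrity check failed")
    return body
```

Because the tag is a fixed-length prefix, the receiver only needs to know which protocol (and hence which tag length) is in use to recover the body.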
In the keyed hash case, some state is necessary at the Integrity protocol level: each API customer must be able to add “trusted contacts” to its “address book” by specifying a symmetric key corresponding to a given endpoint name (which may have been negotiated at a higher protocol level, or simply out-of-band entirely). Since some advanced higher-level protocols may define symmetric authentication keys that are only good for a single use (e.g. Axolotl ratcheting after the handshake phase), “address book entries” should be single-use by default, with renewal explicitly required after each payload received from a given contact.
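A minimal sketch of such a single-use address book follows. The class and method names are hypothetical, keyed BLAKE2b again stands in for SipHash, and the choice that a failed check does not consume the entry is an assumption (otherwise anyone could burn a contact's key by sending garbage).

```python
import hashlib
import hmac


class AddressBook:
    """Hypothetical per-client store of single-use symmetric keys."""

    def __init__(self):
        self._keys = {}  # endpoint name -> symmetric key

    def add_contact(self, endpoint: str, key: bytes) -> None:
        """Add or renew the key trusted for one payload from this endpoint."""
        self._keys[endpoint] = key

    def verify_once(self, endpoint: str, tag: bytes, body: bytes) -> bool:
        """Check a keyed tag; on success the entry is used up."""
        key = self._keys.get(endpoint)
        if key is None:
            return False
        expected = hashlib.blake2b(body, key=key, digest_size=16).digest()
        if not hmac.compare_digest(tag, expected):
            return False          # assumed: a failed check keeps the entry
        del self._keys[endpoint]  # consumed: renewal must be explicit
        return True
```

A higher-level protocol that ratchets its keys would call `add_contact` with the next key after each accepted payload; a protocol with a long-lived key would simply re-add the same key.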
Availability layer
We would like networked endpoints to be available to receive packets from other endpoints in a way that is robust to unannounced changes in network topology. This layer conceptually takes the place of the Network layer in the original model, as it will be responsible for routing packets. Significantly, in this proposal, there are no “hosts” or “ports”: only “endpoints”, identified by public keys. This is simply taking the end-to-end principle one step further, by considering the “host” merely part of the network infrastructure which makes applications available.
A fully implemented Availability layer should provide unicast (deliver to a unique endpoint authenticated by a given public key, wherever it may be), anycast (deliver to nearest endpoint authenticated by a given public key), and multicast (a.k.a. pub/sub: route to all endpoints who have asked to subscribe to a given ID, and provide a subscription method).
|Routing Semantics||Current Reliability||New Implementation: Overlay on existing Internet||New Implementation: Native Mesh|
|Multicast||awful||S/Kademlia message broker||Straightforward extension of unicast|
|Anycast||decent||No advantage over load balancers||Possible extension of unicast|
|Unicast||excellent||Special case of multicast||Electric Routing|
I believe the Electric Routing algorithm is up to the challenge of replacing unicast, and that it could be extended to provide multicast and even anycast, but other algorithms could be developed at this protocol layer as well. The first real-world implementation of the system I’m describing will very likely be developed as an overlay network on top of IP, in which case multicast can be implemented simply atop S/Kademlia, with unicast as a special case, and anycast can be emulated with standard load-balancing techniques.
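For the overlay case, the core lookup idea in Kademlia-style networks is the XOR distance between identifiers. The sketch below shows only that metric, with endpoint IDs derived by hashing public keys (as S/Kademlia does for node IDs). The BLAKE2b hash, the 256-bit ID width, and the function names are assumptions, and none of the actual routing-table or message-broker machinery is shown.

```python
import hashlib


def endpoint_id(public_key: bytes) -> int:
    """Derive a 256-bit overlay ID from a public key (hash choice assumed)."""
    digest = hashlib.blake2b(public_key, digest_size=32).digest()
    return int.from_bytes(digest, "big")


def topic_id(topic: bytes) -> int:
    """Map a multicast subscription ID into the same 256-bit ID space."""
    digest = hashlib.blake2b(topic, digest_size=32).digest()
    return int.from_bytes(digest, "big")


def closest(target: int, known_ids, k: int = 1):
    """Return the k known IDs nearest to target under the XOR metric."""
    return sorted(known_ids, key=lambda node: node ^ target)[:k]
```

Under these assumptions, a node forwarding a unicast packet would pick `closest(endpoint_id(destination_key), peers)` as the next hop, while the brokers for a multicast topic would be the nodes returned by `closest(topic_id(name), peers, k)`, which is why unicast falls out as a special case.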
The tradeoff here is that routers have a lot more work to do, since there are no “addresses” corresponding directly to geographic location. But it also means that every node on the network can participate as a router, so there is a lot more capacity to do that work. In addition, the endpoints-only scheme has many potentially desirable properties with respect to features like pseudonymity, NAT transparency, redundancy, and decentralization of the telecommunications market (especially in densely settled areas).
Confidentiality layer
Ideally, we would like to not transmit any information to anything other than the destination endpoint(s). This ideal is not in general achievable on a public network, but some types of mitigation are possible:
|Paranoia Level||Attacks||Mitigation||Common Implementation||My Preferred Implementation|
|1||Sniffing payloads to trusted contacts||symmetric encryption||AES||ChaCha|
|2||Sniffing payloads to strangers||public-key encryption||RSA||RSA|
|3||Chosen plaintext attacks||key agreement + symmetric encryption||ECDH + AES||Curve25519 + ChaCha|
|4||Key compromise||ephemeral key agreement + symmetric encryption||ECDHE + AES||Axolotl ratchet with Curve25519, SipHash, PBKDF2, ChaCha|
In cases 3 and 4, this layer has to maintain some state, holding session keys or message keys, and the Axolotl ratchet is a little complicated; but this layer does not have to worry about the verification of identity (which will be provided on a higher layer, by services such as keybase.io or using pronounceable hash fingerprints) or integrity (which will be provided by a lower layer).
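To give a feel for the message-key state this layer holds in case 4, here is a sketch of a symmetric hash ratchet only. It is not the Axolotl ratchet, which additionally mixes fresh Curve25519 key-agreement outputs into the chain. HMAC-SHA256 and the constants `0x01` and `0x02` are assumptions chosen because they are in the Python standard library, not the primitives listed in the table.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes):
    """Derive (next_chain_key, message_key) from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


class SendingChain:
    """Hypothetical holder of the per-session ratchet state."""

    def __init__(self, initial_chain_key: bytes):
        self._chain_key = initial_chain_key

    def next_message_key(self) -> bytes:
        # The old chain key is overwritten, so it cannot be recovered later.
        self._chain_key, message_key = ratchet_step(self._chain_key)
        return message_key
```

Because each step is one-way, an attacker who captures the current chain key cannot derive the keys of messages already sent; both ends stay in sync as long as they start from the same agreed chain key and step once per message.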
Non-Repudiation and/or Repudiation layer
We would like for a receiver to be sure that a message they receive was sent by a given sender, and we would like for a sender to be sure that a given message was successfully received. Sometimes, we would also like for a receiver to be unaware of the location a message was sent from. The result is three related but orthogonal protocol types, which may be nested:
|Non-Repudiation of Sending||Recipient knows immediate sender||Sender includes a hash of their public key in the message. To understand why this is necessary given the Integrity layer, read this excellent article|
|Non-Repudiation of Receipt||Sender knows message was received||Recipient must send a signed acknowledgement for every message. This also implements “reliable delivery”|
|Repudiation of Origin||Message is difficult to trace||Onion Routing|
Transactions layer
We would like for sets of nodes which wish to maintain common mutable state variables to be able to do so, even in the presence of various types of adversaries. This is a common abstraction for the requirements of git, cryptocurrencies, and distributed databases (i.e. ACID MVCC). I propose that (borrowing most directly from git, but also from Clojure’s concurrent data structures) changes in large or complex mutable states be represented as changes to the root of a Merkle tree, thus reducing the state subject to transactional semantics to single-packet size.[4]
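The Merkle-root idea can be sketched in a few lines: however large the state, the value that needs transactional agreement is one 32-octet hash. The tree layout below (BLAKE2b, leaf and interior prefixes, an unpaired node promoted unchanged) and the `RootRegister` compare-and-swap are assumptions made for illustration; git and the other systems mentioned use their own formats.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()


def merkle_root(leaves) -> bytes:
    """Hash a list of byte strings down to a single 32-octet root."""
    if not leaves:
        return h(b"")
    level = [h(b"\x00" + leaf) for leaf in leaves]  # leaf nodes
    while len(level) > 1:
        pairs = range(0, len(level) - 1, 2)
        parents = [h(b"\x01" + level[i] + level[i + 1]) for i in pairs]
        if len(level) % 2:
            parents.append(level[-1])  # unpaired node moves up unchanged
        level = parents
    return level[0]


class RootRegister:
    """Hypothetical shared variable: the only state needing transactions."""

    def __init__(self, root: bytes):
        self.root = root

    def commit(self, expected_old: bytes, new_root: bytes) -> bool:
        # Compare-and-swap: reject an update that was built on a stale root.
        if self.root != expected_old:
            return False
        self.root = new_root
        return True
```

Editing any leaf changes the root, so two writers who both start from the same root cannot both commit: the second `commit` sees a stale `expected_old` and has to rebase, much as a non-fast-forward push does in git.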
To make it obvious what I’m intending to refer to, the ownership of a particular “domain name” or a particular “coin” (or, generally, of any cryptographically controlled resource) is an example of a mutable state. But so is, for instance, the contents of any social media profile, email inbox, hypertext page, or source code repository. These things could all be managed without reference to central authorities or single points of failure.
|Paranoia Level||Attacks||Mitigation|
|1||Asynchrony; node failure/disconnection||D1HT tracker|
|2||Sybil attacks; eclipse attacks; churn attacks||S/Kademlia tracker|
|3||Malicious trackers||Leaderless Byzantine Paxos or Byzantine gossip|
|4||Any attack that Bitcoin can survive||Block-chain protocol|
Many (including myself) have claimed that the core contribution of Bitcoin, the block-chain protocol, is a novel solution to the Byzantine Generals Problem, but this turns out to be somewhat misleading. Although the block-chain protocol is Byzantine-fault-tolerant in a novel way, there has been plenty of research on Byzantine protocols over the years, and constant “mining” (i.e. solving cryptopuzzles) is probably unnecessary for achieving Byzantine fault tolerance. The main reason to introduce cryptopuzzles is to reduce the efficacy of Sybil attacks, in which one malicious actor fabricates arbitrarily many identities in order to exceed the Byzantine fault tolerance threshold and control the system. However, these attacks can also be mitigated by requiring cryptopuzzles only for joining the network (as in S/Kademlia), and by blacklisting nodes which behave suspiciously (the latter being how most attacks on Bitcoin are stopped in practice).
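A join-time cryptopuzzle can be as simple as the generic hash puzzle sketched below: a new identity must present a nonce whose hash, together with its public key, has a required number of leading zero bits. This is not S/Kademlia's exact construction; the BLAKE2b hash, the 8-octet nonce encoding, and the function names are assumptions.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def puzzle_digest(public_key: bytes, nonce: int) -> bytes:
    data = public_key + nonce.to_bytes(8, "big")
    return hashlib.blake2b(data, digest_size=32).digest()


def solve_join_puzzle(public_key: bytes, difficulty: int) -> int:
    """Search for a nonce; expected work is about 2**difficulty hashes."""
    for nonce in itertools.count():
        if leading_zero_bits(puzzle_digest(public_key, nonce)) >= difficulty:
            return nonce


def check_join_puzzle(public_key: bytes, nonce: int, difficulty: int) -> bool:
    """Verification costs a single hash, however hard the puzzle was."""
    return leading_zero_bits(puzzle_digest(public_key, nonce)) >= difficulty
```

The asymmetry is the point: each fabricated identity costs a Sybil attacker a fresh search, while existing nodes verify a newcomer with one hash and never have to keep mining afterwards.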
Application layer
In such an environment, applications (or application components!) are essentially just maps from one mutable state to another, in functional reactive programming style. In the same way that you might encode packet filters into a kernel’s TCP/IP stack today, you might encode entire applications into a kernel’s “mesh” stack in the future. Various search functions, including full-text search, could be provided using the OneSwarm approach or potentially by distributed Bloom filters implemented atop this platform (an idea due to Andrée Monette). Resource control and access control can be provided by means of cryptographic capabilities.
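As an illustration of the Bloom filter building block mentioned above, here is a small single-node sketch. How filters would actually be distributed and queried across the mesh is not specified in the text; the `union` method only hints at one possibility (nodes merging the filters of their neighbors). The sizes, hash construction, and names are assumptions.

```python
import hashlib


class BloomFilter:
    """Set summary with no false negatives and tunable false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: bytes):
        # One salted BLAKE2b call per hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(item, digest_size=8, salt=bytes([i])).digest()
            yield int.from_bytes(digest, "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def union(self, other: "BloomFilter") -> "BloomFilter":
        """Merge two filters built with the same parameters."""
        if (self.size_bits, self.num_hashes) != (other.size_bits, other.num_hashes):
            raise ValueError("filters must share size and hash count")
        merged = BloomFilter(self.size_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged
```

A node could advertise such a filter over the words in the documents it holds; a searcher then only contacts nodes whose filter reports `might_contain` for every query term, accepting the occasional false positive.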
But, in general, this layer is completely open for all sorts of applications. Essentially, any end-user service that runs on a network (and what doesn’t, these days?) would fit here.
I’ve outlined some radical ideas for how to re-build the Internet protocol stack in a way that is ultimately more coherent with Internet cultural values (freedom of expression, pseudonymity, reduced potential for abuses of power). This outline still needs quite a bit of work and thought before being turned into implementations, but I feel like I’ve reached a turning point in making my ideas about next-generation architectures concrete, and at a timely moment with respect to conversations about TLS and net neutrality. If you would like to see these concepts made into working code, please reach out and let me know.
[4] This is similar in principle to the trick used by most practical public-key cryptosystems, which use the actual public-key algorithm only to encrypt a key from some symmetric cryptosystem, and then encrypt arbitrarily large content using a stream cipher. The common principle is that you can do the hard security algorithm on a small piece of data, and use easier security algorithms to apply those hard security properties to large chunks of data.