Technical Journal

Stuff I hack

An OSI layer model for the 21st century

The Internet protocol suite is wonderful, but it was designed before the advent of modern cryptography and without the benefit of hindsight. On the modern Internet, cryptography is typically squeezed into a single, incredibly complex layer, Transport Layer Security (TLS; formerly known as Secure Sockets Layer, or SSL). Over the last few months, 3 entirely unrelated (but equally catastrophic) bugs have been uncovered in 3 independent TLS implementations (Apple SSL/TLS, GnuTLS, and most recently OpenSSL, which powers most “secure” servers on the Internet), making the TLS system difficult to trust in practice.

What if cryptographic functions were spread out into more layers? Would the stack of layers become too tall, inefficient, and hard to debug, making the problem worse instead of better? On the contrary, I propose that appropriate cryptographic protocols could replace most existing layers, improving security as well as other functions generally not thought of as cryptographic, such as concurrency control of complex data structures, lookup or discovery of services and data, and decentralized passwordless login. Perhaps most importantly, the new architecture would enable individuals to internetwork as peers rather than as tenants of the telecommunications oligopoly, putting net neutrality directly in the hands of citizens and potentially enabling a drastically more competitive bandwidth market.

Current OSI model In practice Proposed update
8 (none) Application Application
7 Application HTTP Transactions
6 Presentation SSL/TLS (Non-)Repudiation
5 Session TCP Confidentiality
4 Transport Availability
3 Network IP Integrity
2 Data Link e-UTRA (LTE), 802.11 (WiFi), 802.3 (Ethernet), etc. Data Link
1 Physical Physical

Of course, the layers I propose will doubtless introduce new problems of their own, but I’d like to start this conversation with some concrete ideas, even if I don’t have a final answer. (Please feel free to email me your comments or tweet @davidad.)

Descriptions follow for each of the five new layers I suggest, four of which are named after common information security requirements, and one of which (Transactions) is borrowed from database requirements (and also vaguely suggestive of cryptocurrency).

All Boolean functions are polynomials

…in the integers mod 2 (a.k.a. the finite field of order 2). Multiplication mod 2 is AND:

0 0 0 0
0 1 0 0
1 0 0 0
1 1 1 1

Adding one mod 2 is NOT:

A (A+1) A NOT
0 1 1
1 0 0

So, multiplication plus one is NAND:

0 0 1 1
0 1 1 1
1 0 1 1
1 1 0 0

Since NAND is universal, and any finite composition of polynomials is a polynomial, any finite boolean circuit is a polynomial. Here’s all 16 two-input functions:

Getting started with nginx configuration

Thanks to fellow Hacker Schooler Leah Steinberg for inspiring this post!

Having intermittently struggled with apache2 configuration files for the majority of my adult life, I find nginx an absolute joy to set up. I’m completely sincere about that. But, for those who are just getting into Web development, nginx is just about as much of a struggle as Apache used to be—in fact, probably more so, because there’s less abundant learning material out there on the Internet.

So, here’s an attempt to make that situation just the slightest bit better.

If you don’t already have nginx installed, I encourage you to follow these directions for building OpenResty, an enhanced version of nginx that enables building entire Web apps within the nginx process using the beautiful programming language Lua.

But, from here on, I’m going to assume that you already have a stock version of nginx installed. Verify that if you run

$ nginx -v

you get some kind of reasonable response, like

nginx version: nginx/1.2.3


Now, make a file called hi.conf:

error_log stderr;
http {
access_log off;
server {
listen 4945;
location / {
return 200;
events {}

VNC as a graphical interface medium

The Virtual Network Computing (VNC) system for accessing the GUI environments of remote computers uses a protocol called Remote Frame Buffer (RFB) to exchange data about graphics output as well as keyboard and mouse input. RFB turns out to be a very sane protocol (specification PDF here) compared with X11, and infinitely more sane than Cocoa (which requires the ObjC runtime) or Win32 (no explanation needed). So, I thought, why not just expose a program’s graphical interface as a VNC server? Then we can let a VNC client deal with the vagaries of the host windowing environment, and we only need to speak a well-specified protocol on a socket.

So far, this is what I have to show (code on github):

Concurrency Primitives in Intel 64 Assembly

Now that nearly every computer has some form of multi-processing (that is, multiple CPUs sharing a single address space), some high-level languages are starting to get attention for their concurrency features. Many languages refer to such features as “concurrency primitives.” But since these are high-level languages, we know that these “primitives” must ultimately be implemented with hardware operations. Older high-level languages, like C, don’t have baked-in support for such operations – not because such languages are lower-level, but simply because the operations in question weren’t a thing when C was invented. Assembly language, being up to date with the latest CPU capabilities by definition1, should provide the best window into the true nature of today’s concurrency operations.

In this post I’m going to walk you through a (relatively) simple concurrent assembly program which runs on OSX or Linux. Here’s the demo (github):

bash-3.2$ time ./concurrency-noprint-x1 foo    # single-worker version

real  0m1.458s
user  0m1.445s
sys   0m0.010s
bash-3.2$ # now run two at once
bash-3.2$ time ./concurrency-noprint-x1 foo-2 & ./concurrency-noprint-x1 foo-2
[1] 71366

real  0m0.785s
user  0m0.780s
sys   0m0.001s
[1]+  Done                    time ./concurrency-noprint-x1 foo-2
bash-3.2$ time ./concurrency-noprint-x4 foo-3  # four-worker version

real  0m0.417s
user  0m0.413s
sys   0m0.003s
bash-3.2$ time ./concurrency-noprint-x7 foo-4  # seven-worker version

real  0m0.295s
user  0m0.283s
sys   0m0.001s
bash-3.2$ diff -s --from-file=foo foo-*
Files foo and foo-2 are identical
Files foo and foo-3 are identical
Files foo and foo-4 are identical

The Security/Product Design Correspondence

General disclaimer for InfoSec articles: Reading this article does not qualify you to design secure systems. Writing this article does not qualify me to design secure systems. In fact, nobody is qualified to design secure systems. A system should not be considered secure unless it has been reviewed by multiple security experts and resisted multiple serious attempts to violate its security claims in practice. The information contained in this article is offered “as is” and without warranties of any kind (express, implied, and statutory), all of which the author expressly disclaims to the fullest extent permitted by law.

If programming is the art of adding functionality to computers, security is the art of removing it.

This maxim is a bit unfair to deep and wonderful world of information security (InfoSec), but it has a point. A lot of essential concepts in InfoSec have natural opposites in software product design.

Let’s start at the top. Every professional software project begins with specifications. In product design, the specifications are called use cases: stories about an external agent who wants to perform some function, and how they would go about performing the function using your software. In InfoSec, the specifications are called threats. These are also stories about an external agent who wants to perform some function, and how would go about performing the function using your software. The difference is, in product design, you want to make the agent’s job as easy as possible, while in InfoSec, you want to make it as hard as possible. We also have these related correspondences:

Systems Past: the only 8 software innovations we actually use

Note: This is a position piece, not a technical article. Hat tip to Jake Skelcy for requesting such a piece.

Computers didn’t always have operating systems. The earliest machines, like the Harvard Mark I and the EDVAC, performed one “computation” at a time. Whenever a computation finished, with its output printed by a teletypewriter or recorded on a magnetic tape, the machine would shut down. A person would then have to notice the machine stopped, unload the output, set up a new computation by manually loading the input and program instructions, and finally, press the start button to get the machine cranking again. On the Harvard Mark I, for instance, restarting would involve separately turning on multiple electric motors and then pressing a button marked MAIN SEQUENCE.

The control panel of the Harvard Mark I.

This is the context in which the programming language (PL) and the operating system (OS) were invented. The year was 1955. Almost everything since then has been window dressing (so to speak). In this essay, I’m going to tell you my perspective on the PL and the OS, and the six other things since then which I consider significant improvements, which have made it into software practice, and which are neither algorithms nor data structures (but rather system concepts). Despite those and other incremental changes, to this day1, we work exclusively2 within software environments which can definitely be considered programming languages and operating systems, in exactly the same sense as those phrases were used almost 60 years ago. My position is:

  • Frankly, this is backward, and we ought to admit it.
  • Most of this stuff was invented by people who had a lot less knowledge and experience with computing than we have accumulated today. All of it was invented by people: mortal, fallible humans like you and me who were just trying to make something work. With a solid historical perspective we can dare to do better.

How I Think About Math,
Lecture 1: Relations

See the slides (PDF). (You may want to use your PDF viewer’s presentation mode; there are a lot of pseudo-animations that could get annoying to scroll through.)

Update: Today, I drew up the field axioms in this notation. I’m almost to the point where I can define linearity!

Last week at Hacker School, I floated the idea of giving a presentation about linear algebra. Over a decade after taking it in college, I finally feel like I understand linear algebra well enough to express clearly, to an audience of programmers, most of the concepts from linear algebra that they might find useful.

I figured the very first thing to present would be the concept of linearity itself. After all, a linear operator is just any operator that commutes with addition and scalar multiplication. But wait– what is “commuting”?

Python to Scheme to Assembly,
Part 1: Recursion and Named Let

In 2001, my favorite programming language was Python. In 2008, my favorite programming language was Scheme. In 2014, my favorite programming language is x64 assembly. For some reason, that progression tends to surprise people. Come on a journey with me.


In this article, we’re going to consider a very simple toy problem: recursively summing up a list of numbers1.

def sum_list(list):
if len(list) == 0:
return 0
return list[0]+sum_list(list[1:])
 >>> sum_list(range(101))

Young Carl Gauss would be proud.

 >>> sum_list(range(1001))
 RuntimeError: maximum recursion depth exceeded


Young programmers often learn from this type of experience that recursion sucks. (Or, as a modern young programmer might say, it doesn’t scale.) If they Google around a bit, they might find the following “solution”:

Overkilling the 8-queens problem

Last night, a fellow Hacker Schooler challenged me to a running-time contest on the classic eight queens puzzle. Naturally, I pulled up my trusty Intel® 64 manual and got to work. It turned out to be even faster than I expected, churning out pretty-printed output in 15ms, which is totally dominated by the time it takes the terminal to display it (it takes only 2ms if redirected to a file).

Update: Very slightly more scientific testing, spurred by curious Hacker News commenters, indicates that, without pretty-printing and other overhead, the solving time is actually closer to 11.2µs – about a factor of 7 speedup over commenter bluecalm’s C implementation.

pretty-printed output
pretty-printed output
a davidad production