Universality and APIs
Imagine, for a moment, that Caesar had decided to make an honest woman of Cleopatra and married her. And, because of his deep affection, Caesar adopted Egyptian hieroglyphics and made it the new written language across the Roman empire.
Now fast-forward a thousand years and think about what that would mean for the printing-press. Hieroglyphics are one-to-one representations of words or concepts. Hand-writing them is fairly efficient. Creating a hieroglyphic-printing-press, on the other hand, is much harder than a Latin-based one. The Egyptians had roughly 900 hieroglyphics in regular use, so you'd need at least that many keys. It would also stunt development of the language - if anyone wanted to coin a new term they'd first have to run a marketing campaign to teach everyone what the new symbol meant, and then convince the printers that it was worth their investment to create yet another hieroglyphic-key.
Interestingly enough, the first movable-type printing did use a script with thousands of symbols - Chinese characters - and was invented around AD 1040 in China. But the technology wasn't mechanized until roughly 400 years later, when Gutenberg introduced the printing press in Europe. The small number of alphabetic characters needed for European languages was an important factor in its success.
Languages written in the Latin alphabet are extremely versatile. With 26 letters and a few rules I can write a poem to my lover, a recipe for chicken soup, or a scientific paper that revolutionizes physics. Every idea that has ever been pondered can be serialized into English. This is because Latin-based alphabets are capable of representing every word in the language, including words nobody has coined yet - they're universal systems. Hieroglyphics aren't capable of representing all states, so they aren't universal. In other words, the move to Latin-based alphabets was a jump into universality.
This is not the only time humanity has made this jump. We've stumbled into it a few times, most notably when moving from the Roman numeral system to the Indian numeral system. In Roman numerals, the highest-valued symbol stands for one thousand (written ↀ, later M). Writing anything above that meant appending ↀ's to each other, and then you're back to tallying, an unscalable system for anything beyond basic arithmetic. (This clearly didn't bother the Romans, who had little need for large numbers.)
The Indian number system we use today has ten symbols, the digits 0 to 9, and its universality is due to a rule that the value of a digit depends on its position in the number. For instance, the digit 2 means two when written by itself, but means two-hundred in the numeral 204. Such ‘positional’ systems require ‘placeholders’, such as the digit 0 in 204, whose only function is to place the 2 into the position where it means two hundred.
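The positional rule can be sketched in a few lines of Python. This is only an illustration; the function names `positional_value` and `place_values` are made up for this example.

```python
def positional_value(numeral: str, base: int = 10) -> int:
    """Add up each digit multiplied by the base raised to its position.

    Positions are counted from the right, starting at zero.
    """
    total = 0
    for position, digit in enumerate(reversed(numeral)):
        total += int(digit, base) * base ** position
    return total


def place_values(numeral: str, base: int = 10) -> list:
    """Return what each digit contributes, left to right."""
    width = len(numeral)
    return [
        int(digit, base) * base ** (width - 1 - index)
        for index, digit in enumerate(numeral)
    ]


print(place_values("204"))      # the 2 contributes 200, the 0 is a placeholder
print(positional_value("204"))
```

The same two functions work for any base up to 36 simply by changing the `base` argument, which is why no new symbols are needed as numbers get larger.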
Mother Nature came upon universality long before we rediscovered it. Using just four bases - A, C, G and T - DNA can encode the instructions to make a T-Rex or a chicken. It can even contain the source code for the most complex creation in the known universe: the human brain.
Can we get simpler than our Indian 0 to 9 digit system? Yes - we actually only need two symbols to represent numbers: 0 and 1. Each such binary digit is called a 'bit'. This is the base-2 (binary) system and yes, it's also universal, which means it can represent any number.
Whenever we have a system that can represent numbers, it can also represent letters in the alphabet. All we need is a lookup table that has numbers on one side, and letters on the other. ASCII is one such system - for example in ASCII the number 65 represents the letter 'A'.
So now we can take ideas (like the concept of a chicken), serialize them into letters by describing them in English, and then serialize them into numbers (using ASCII), and then serialize those numbers into binary 0s and 1s. In other words, any idea can be represented by a series of 0s and 1s. Quite extraordinary.
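As a rough sketch of that chain, the following Python turns a word into ASCII codes, then into bits, and back again. The helper names `text_to_bits` and `bits_to_text` are invented for this example.

```python
def text_to_bits(text: str) -> str:
    """Letters -> ASCII numbers -> 8-bit binary strings."""
    codes = text.encode("ascii")  # e.g. 'A' becomes the number 65
    return " ".join(format(code, "08b") for code in codes)


def bits_to_text(bits: str) -> str:
    """Reverse the trip: binary strings -> numbers -> letters."""
    codes = bytes(int(chunk, 2) for chunk in bits.split())
    return codes.decode("ascii")


encoded = text_to_bits("chicken")
print(encoded)
print(bits_to_text(encoded))
```

Nothing is lost along the way: decoding the bits gives back exactly the word that went in.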
We can even go a step further and remove English from the equation by passing each word through a clever algorithm called Word2vec. This maps words into a vector space, so each word is represented by a vector of typically 100 to 300 dimensions. What is fascinating about this is that we now have a collection of numbers that refers to an abstract idea (rather than a collection of numbers that refers to an English sentence, which then refers to an abstract idea). This allows us to do all sorts of fancy things, like translating the idea into French, or guessing the next word in a sequence (the core mechanic behind GPT-3).
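To give a feel for what "words as vectors" buys us, here is a toy sketch. The three-dimensional vectors below are invented by hand purely for illustration; they are not real Word2vec output, which would be learned from text and have hundreds of dimensions.

```python
import math

# Hand-made toy vectors (an assumption for this example, not trained data).
VECTORS = {
    "chicken": [0.9, 0.1, 0.0],
    "hen": [0.8, 0.2, 0.1],
    "physics": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, vectors=VECTORS):
    """Return the other word whose vector points in the most similar direction."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


print(nearest("chicken"))
```

Once ideas are points in space, "related" becomes "nearby", and that is something a computer can calculate.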
Encoding letters as numbers is very convenient for storage (just persist a 0 or a 1 in a memory cell) and also for processing (number transformations using math!). A computer program is just a series of simple instructions (like 'add these two numbers together' or 'remember this number') that a CPU runs. And remember, because a binary system is universal it can store the computer program too!
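A minimal sketch of that idea: the "program" below is just data (a list of simple instructions), and a small loop plays the role of the CPU. The instruction names `set` and `add` are made up for this example and don't correspond to any real instruction set.

```python
def run(program, memory=None):
    """Execute a list of simple instructions against a dictionary 'memory'."""
    memory = dict(memory or {})
    for op, *args in program:
        if op == "set":        # 'remember this number'
            name, value = args
            memory[name] = value
        elif op == "add":      # 'add these two numbers together'
            target, left, right = args
            memory[target] = memory[left] + memory[right]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# The program itself is plain data, so it can be stored like any other data.
program = [
    ("set", "x", 2),
    ("set", "y", 3),
    ("add", "total", "x", "y"),
]

print(run(program)["total"])
```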
So to recap:
- Any idea (or knowledge) can be explained in the English language
- English can be translated into numbers
- Those numbers can then be stored, processed, and manipulated by computer programs.
Taken together, these points lead to a conclusion: all knowledge can be computationally processed. Computers, given the right program, can perform any transformation of knowledge whatsoever, including (albeit theoretically at this point) knowledge creation.
The power in universal systems comes from their reach. They can do a lot more than the use-case they were originally designed to do.
Take your washing machine, for instance. The computer it uses is Turing complete, which means that, given enough memory and time, it can run any program. It's universal. It can run a washing program, or it can run a program that performs astrophysics calculations. Sure, the latter program might take a while to process and run into memory limitations, but it would still run.
(As an aside, it's fascinating that the program for an artificial-general-intelligence, once written, could plausibly run on your washing-machine.)
Computation is limited by only two things: memory and time. All computer advances are simply improving one of those two things. For example, a GPU improves the 'time' part of the equation by giving the computer access to specialized hardware for making fast floating-point calculations. Those calculations were possible before GPUs existed, but now they're much faster.
Reach doesn’t give us progress, but it is a catalyst for it. The fact that anyone can write any program which can run on any computer has catapulted humanity’s progress forward this century.
How do computers talk to each other? They use an interface called an API. Except an API isn't really a standardized format: every time a new API is created, a software engineer has to get involved, and they build something custom.
If two computers use the same API to talk to each other (as they do on the web) then things, in general, go quite smoothly. Both computers are following the same rules, they're speaking the same language.
But often we want two different APIs to talk to each other. For example, we want our billing system to talk to our marketing system. In these cases we need to pay a software engineer to:
- Read the docs for each API
- Then create a custom translation layer between these two APIs so they can understand each other
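Such a translation layer often looks something like the sketch below. Both APIs here are hypothetical, and every field name (`email_address`, `amount_cents`, `lifetime_value_usd` and so on) is invented for this example.

```python
def billing_to_marketing(invoice: dict) -> dict:
    """Hand-written mapping from a hypothetical billing API's invoice
    to the contact shape a hypothetical marketing API expects."""
    customer = invoice["customer"]
    return {
        "email": customer["email_address"],
        "full_name": f'{customer["first"]} {customer["last"]}',
        "lifetime_value_usd": invoice["amount_cents"] / 100,
    }


invoice = {
    "customer": {"email_address": "ada@example.com", "first": "Ada", "last": "Lovelace"},
    "amount_cents": 2500,
}

print(billing_to_marketing(invoice))
```

Every line encodes knowledge that someone had to dig out of the docs by hand. If either side renames a single field, the function raises a `KeyError` until an engineer comes back to patch it.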
Not only is this expensive and time-consuming, these API-layers are one-off and brittle. They are not universal. In other words, they're a lot like hieroglyphics.
So this, to me, is the crazy part. We have universal writing systems. We have universal numbers. We have universal computers that are Turing complete and can run any program. Yet our communication layer between computers is not universal. It's no better than hieroglyphics.
This problem, although perhaps not framed in terms of universality before, is not a new one. Many people have banged their heads against it and have considered solutions.
A common first thought people have is "why can't everyone use a universal standard API?" So they introduce a new standard, proudly proclaiming 'one standard to rule-them-all!'. Then they realize all they have done is create one-more competing standard and quickly try to change the subject. It's like asking the entire world to speak the same language. Who knows - maybe one day that’ll happen. But let's not plan on it.
More entrepreneurially inclined engineers might think "Aha, this is a great business idea! I'll create a company that bridges all the APIs, solve the problem, and become rich along the way!" They are both right and wrong. Yes, they are probably right about getting rich (see Zapier). But no, it won't solve the problem. It's like trying to make a thousand Rosetta stones. Each integration they build out is custom, brittle, and un-universal.
So we need a simple universal system for computer communication, but one that doesn't involve rewriting existing APIs, and also doesn't involve everyone having to use the same API standard. Fortunately for us, such a system already exists – it's called OpenAPI. Unfortunately for us, it's not very popular.
OpenAPI (previously Swagger) is a standard for describing an API in a way that a computer can understand. It wraps APIs in a similar fashion to the way ASCII wraps letters. Rather than using a lookup table, though, it provides a structured list of all the API endpoints, the parameters these endpoints take, and so forth.
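To make that concrete, here is a minimal sketch of an OpenAPI 3.0 description, written as a Python dictionary (the same structure is normally stored as JSON or YAML). The pizza API, its paths and its `operationId` values are all hypothetical; only the overall layout (`openapi`, `info`, `paths`, `parameters`, `responses`) follows the specification.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# A hypothetical API described in OpenAPI 3.0 form.
PIZZA_API = {
    "openapi": "3.0.3",
    "info": {"title": "Pizza Joint API", "version": "1.0.0"},
    "paths": {
        "/pizzas": {
            "get": {
                "operationId": "listPizzas",
                "parameters": [
                    {"name": "topping", "in": "query", "schema": {"type": "string"}}
                ],
                "responses": {"200": {"description": "A list of pizzas"}},
            }
        },
        "/orders": {
            "post": {
                "operationId": "createOrder",
                "responses": {"201": {"description": "Order created"}},
            }
        },
    },
}


def list_operations(spec: dict) -> list:
    """Walk the description and report every endpoint a machine could call."""
    operations = []
    for path, path_item in spec["paths"].items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # path items may also hold non-method keys
            params = [p["name"] for p in operation.get("parameters", [])]
            operations.append((method.upper(), path, operation["operationId"], params))
    return operations


for entry in list_operations(PIZZA_API):
    print(entry)
```

The point is that `list_operations` knows nothing about pizza. It would work unchanged on a description of a billing or marketing API.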
This has a number of benefits. Once you’ve created an OpenAPI file, you can then generate bindings to the API from any programming language. And generate the documentation too. And it works for any existing API - no changes required.
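Generating bindings can be sketched roughly as follows. This is a toy, not how any particular OpenAPI generator works: it only builds the HTTP method and URL for each operation rather than sending a request, and the API description, base URL and operation names are hypothetical.

```python
from urllib.parse import urlencode

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# A hypothetical, heavily trimmed OpenAPI-style description.
SPEC = {
    "paths": {
        "/pizzas": {
            "get": {
                "operationId": "listPizzas",
                "parameters": [{"name": "topping", "in": "query"}],
            }
        },
        "/pizzas/{pizzaId}": {
            "get": {
                "operationId": "getPizza",
                "parameters": [{"name": "pizzaId", "in": "path"}],
            }
        },
    }
}


def _make_binding(method, path, operation, base_url):
    """Build one callable for one operation in the description."""
    params = operation.get("parameters", [])
    path_params = {p["name"] for p in params if p["in"] == "path"}
    query_params = {p["name"] for p in params if p["in"] == "query"}

    def binding(**kwargs):
        unknown = set(kwargs) - path_params - query_params
        if unknown:
            raise TypeError(f"unexpected parameters: {sorted(unknown)}")
        # OpenAPI path templates use {name}, which str.format understands.
        url = base_url + path.format(**{k: kwargs[k] for k in path_params})
        query = {k: v for k, v in kwargs.items() if k in query_params}
        if query:
            url += "?" + urlencode(query)
        return method.upper(), url

    return binding


def make_client(spec, base_url):
    """Turn every operation in the description into a named callable."""
    client = {}
    for path, path_item in spec["paths"].items():
        for method, operation in path_item.items():
            if method in HTTP_METHODS:
                client[operation["operationId"]] = _make_binding(
                    method, path, operation, base_url
                )
    return client


client = make_client(SPEC, "https://pizza.example")
print(client["getPizza"](pizzaId=7))
print(client["listPizzas"](topping="ham"))
```

No one wrote a `getPizza` function by hand. It fell out of the description, and the same `make_client` would produce bindings for any other API described the same way.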
But the biggest benefit is one we haven’t scratched the surface of yet: computers dynamically talking to each other.
Taking action in the real-world typically involves interfacing with a company. Almost every company has an API (albeit some are quite unstructured like your local-pizza-joint’s 90s HTML website). Currently humans need to get involved whenever APIs talk to each other. But soon, very soon, AI assistants will be doing all this interfacing on our behalf. We need to create the language for them to communicate in.
One alternative would be to teach AIs to browse the internet, do research, submit forms, etc. And maybe that is what will end up happening. But surely machines will have an easier job interfacing with APIs designed specifically for computer consumption?
Why is OpenAPI not more popular, then? I'm not sure, but if I were to take a guess I'd say it's a combination of poor tooling, bad documentation, and a cottage industry of enterprise-first companies who care little about individual developers.
For OpenAPI to gain wide-spread adoption it will need:
- Clear, beautiful documentation and tooling to help generate OpenAPI files
- Best-in-class language-binding generation
- A package manager for finding and submitting OpenAPI files
- API frameworks that run on platforms like Next.js to generate OpenAPI files for free
This is a lot to build, but it is very doable. We don’t know what will be possible if we unlock another universal system, but we do know that it’ll catalyze progress.
If you've enjoyed the ideas above, you'll probably love reading The Beginning of Infinity by David Deutsch where many of these ideas are from. I've written a brief synopsis here.