How is knowledge created? This seems like a pretty important question, perhaps even the most important, given how much we owe to knowledge.
Our technology, clothes that keep us warm, farming techniques, medicines, and even our morality are all types of knowledge.
We should always strive for more knowledge because it will improve our quality of life and give humanity the best chance of surviving. In fact, all of our present and future problems can be solved with knowledge, just as all of our past problems have been.
If we understand how knowledge creation works, then we can automate the process using computers, greatly increasing the speed at which we can generate new knowledge.
So what does our current understanding look like?
The best attempt I've seen at documenting how human knowledge is created is from David Deutsch in his book The Beginning of Infinity. He describes how, during the Enlightenment, we built a culture of conjecture and criticism. These two elements play off against each other, honing new knowledge.
Conjecture entails using creativity to solve a problem. So how does creativity work? At this point, the trail runs cold. No one knows for sure. If we did, we'd know how to build an artificial general intelligence. Perhaps though, we're getting close.
Can we bottle creativity into a machine?
Yes, it's entirely feasible. The proof is our own existence, and the fact that we're creative.
Our brains are made of atoms in a particular arrangement, abiding by the laws of physics. All of this can be modeled by a computer, which is a universal simulator.
Of course in practice, a full brain-emulation would require a lot more knowledge of how the brain works than we currently have, plus it would likely require a vast amount of processing power.
So hopefully we can build artificial creativity without modeling an entire brain, otherwise it may be some time before we have the knowledge and computation required.
Perhaps creativity is a type of information extrapolation?
This would explain why there are so many instances of inventors running down to the patent office to file the same patent on the same day. Like some kind of hive mind, they've come across the same information and made the same extrapolations.
It's kind of like progressing through the different Ages in Age of Empires. There's no skipping Ages because they build on each other. Knowledge begets knowledge.
Other times it's a case of a lone individual having a stroke of inspiration. Take Einstein: he took what he knew about discrepancies in physics and applied logical thought experiments to come up with General Relativity.
But even then, there is no genius in isolation. Einstein wasn't operating on any secret information; he was combining knowledge from different fields and extrapolating from what he already knew.
Another way of looking at this: would a brain in a vat, devoid of sensory input, come up with new explanatory knowledge, or is a constant stream of incoming information necessary for creativity?
Are large language models creative?
If you play with them, LLMs can seem creative. But is this just an illusion? After all, they're just regurgitating from a static dataset; how can they come up with new knowledge that isn't in their dataset?
Can't the same be said of us, though? Whenever we come up with a new idea, aren't we also operating from a static dataset, that is, whatever is in our memory at a given moment?
If you subscribe to the idea that creativity is a form of information extrapolation, then I suspect you're in the "LLMs are already creative" camp. It doesn't matter that the dataset is static. You can still combine previously uncombined areas of the dataset to generate new knowledge.
What about hallucinations though? Sometimes LLMs can seem pretty dumb, making mistakes humans wouldn’t.
I think, though, that we sometimes gloss over our internal error correction. When someone comes up with a smart idea and shares it, that spark of creativity has usually undergone a phase of internal self-reflection and reasoning before being shared.
Pure creativity includes fallacies and hallucinations. We seldom see it in this light, however, because by the time ideas are shared, they've already been refined through internal criticism and reasoning.
Or perhaps creativity is an algorithm we haven’t found yet, and we’re barking up the wrong tree with language models. We shall see.
Can large language models reason?
As with creativity, if you play with LLMs, they seem to have a degree of reasoning. But is this just an illusion? After all, they're just predicting the next token. How is that reasoning?
What exactly is reasoning? When a teacher asks a pupil to show their reasoning, what they're asking for are the steps taken to get to an answer. This is meant to demonstrate understanding of the source material versus just a lucky guess at the answer.
Curiously, when LLMs first arrived on the scene, people noticed that responses got much better if you instructed the model to "show your working". In fact, one of the best-performing prompts turned out to be "Take a deep breath and work on this problem step-by-step".
More curious still is that chain-of-thought reasoning seemed to emerge after the models were trained on source code.
But LLMs still make silly mistakes and hallucinate. This is especially prevalent when you ask them to solve long math problems. And as you know from math class, one mistake in your equations throws the rest off.
What today's LLMs are missing is a dedicated self-reflection and reasoning stage. We've already seen that LLMs generate much more accurate results when asked to show their working step by step. If we can give them even more time to think and consider each step, perhaps we can improve the accuracy of their output?
Can we improve the model's reasoning abilities?
Let's say I asked you to solve a math problem, say 18 x 22. Most likely you'd break that problem down into steps, writing down each step, verifying as you go. It might take you some time, but the answer you'd end up with would be much more accurate than if you'd just guessed at it.
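That break-it-down-and-verify style of working can be written out explicitly. Here's a toy sketch (the decomposition and function names are mine, purely for illustration):

```python
# Toy illustration: break 18 x 22 into smaller steps, checking each
# intermediate result before trusting the final answer -- the "write it
# down and verify as you go" style of deliberate thinking.

def multiply_step_by_step(a: int, b: int) -> int:
    tens, ones = divmod(b, 10)     # 22 -> (2, 2)
    step1 = a * tens * 10          # 18 * 20 = 360
    step2 = a * ones               # 18 * 2  = 36
    total = step1 + step2          # 360 + 36 = 396
    # Verify against the direct computation before returning.
    assert total == a * b, "a step went wrong somewhere"
    return total

print(multiply_step_by_step(18, 22))  # -> 396
```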
In cognitive psychology, this is commonly referred to as Type 2 thinking: a slower, more deliberate mode of thinking that takes some effort. This is in contrast to Type 1 thinking, which is more instinctual and much faster.
Different activities require different modes of thinking. You engage Type 1 when you're crossing the road, and Type 2 when you're solving the crossword. LLMs currently only do Type 1, the quick instinctual guessing type. They return tokens at a consistent rate, and they don't go back to retrace their steps.
So here's the question: can we get language models into a Type 2 mode, giving them more time to ponder, reflect, and think through their answers? In other words, can we convert time into accuracy?
What if we simply copy the way humans do it? Like a mathematician with a blackboard, we:
- Ask the LLM to break its answer into numbered steps
- Then verify each step along the way
In May 2023, OpenAI published a paper doing precisely this, titled "Let's Verify Step by Step", showing a big gain in accuracy from this technique.
First, they fine-tuned a verifier LLM, which they call a process reward model (PRM), using human-labeled training data. Next, they instructed a language model to "think step by step", outputting a series of steps showing its working. Then they scored each step using the verifier, labeling it as positive (correct), negative (incorrect), or neutral (ambiguous).
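A rough sketch of that pipeline, with the step generator and the PRM stubbed out as placeholder functions (in the paper both are fine-tuned language models; everything here, including the min-score aggregation, is illustrative):

```python
# Sketch of process supervision: score each reasoning step with a
# verifier and combine the per-step scores into a confidence for the
# whole solution. `generate_steps` and `score_step` are stand-ins.

def generate_steps(question: str) -> list[str]:
    # Stand-in for an LLM prompted to "think step by step".
    return [
        "18 x 22 = 18 x 20 + 18 x 2",
        "18 x 20 = 360",
        "18 x 2 = 36",
        "360 + 36 = 396",
    ]

def score_step(step: str) -> float:
    # Stand-in for the PRM: 1.0 = correct, 0.0 = incorrect,
    # something in between = ambiguous.
    return 1.0

def solution_score(question: str) -> float:
    steps = generate_steps(question)
    # A chain of reasoning is only as strong as its weakest step,
    # so one simple aggregation is to take the minimum step score.
    return min(score_step(s) for s in steps)

print(solution_score("What is 18 x 22?"))  # -> 1.0
```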
Using this technique, they reported a significant improvement in accuracy. Which makes logical sense: if you verify that each step is accurate, the final answer is more likely to be correct.
But the part that I found particularly fascinating was that they detected generalization outside of pure mathematics. By fine-tuning the verifier LLM on math, it was able to improve its reasoning across chemistry, physics, and other STEM tests.
Can we give the model more time to think?
So perhaps we can emulate the 'verification' part of Type 2 thinking, but what about the pondering part? How can we give the model more time to think through answers?
One approach is simply to ask the model the same question many times, scoring each answer, then only surfacing the best one. This lets us spend an arbitrary amount of time answering questions, trading time for quality.
The convenient thing about computers is that whenever there's a time tradeoff, we can throw more processing power at the problem. So rather than wait years to complete the sentence "the next president will be ...", we can throw a million dollars of server compute at it. The technical term for this kind of spending is test-time compute.
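In code, this best-of-N idea is tiny. A toy sketch, with the model and the scorer replaced by stand-ins (the "model" here just guesses numbers near 396, and the "verifier" prefers guesses closer to the true answer):

```python
import random

# Best-of-N sampling: answer the same question many times, score each
# candidate, and keep only the best one -- trading compute for quality.

def sample_answer(question: str) -> int:
    # Stand-in for an LLM: a noisy "guess" at 18 x 22.
    return random.randint(390, 400)

def score(question: str, answer: int) -> float:
    # Stand-in for a verifier: closer to 396 scores higher.
    return -abs(answer - 396)

def best_of_n(question: str, n: int) -> int:
    candidates = [sample_answer(question) for _ in range(n)]
    return max(candidates, key=lambda a: score(question, a))

random.seed(0)
print(best_of_n("What is 18 x 22?", n=1000))
```

More samples cost more compute but make a good answer more likely to appear among the candidates, which is the whole tradeoff.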
So now we have both the pondering and verification part of Type 2 thinking (at least for a narrow domain such as mathematics). Let's try and model another characteristic of thinking: tree of thought.
Consider how you think through the next move in chess. You first consider the legal moves available to you. Then, for each move, you imagine your opponent's retaliation, your counter-move, and so on. You might see a quick way to win the game, or perhaps just a way to strengthen your position. If you noted down each of these possibilities, it would look like a tree, branching whenever there was a possible move.
So now, let's apply that back to our artificial Type 2 thinking. Rather than considering each step by step linearly, we can explore a tree of thought, evaluating branches in turn until we come up with a good answer. Whenever we branch, we can generate a lot of options, scoring each one in turn.
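That branching search can be sketched as a small beam search over "thoughts". The proposer and scorer below are toy stand-ins for an LLM and an evaluator (here the "best thought" is simply the string with the most a's):

```python
# Toy tree-of-thought search: from each partial solution, branch into
# several candidate next steps, score every branch, and expand only the
# most promising ones.

def propose(state: str) -> list[str]:
    # Stand-in for an LLM proposing candidate next steps.
    return [state + ch for ch in "abc"]

def score(state: str) -> float:
    # Stand-in for an evaluator; this toy one just prefers more 'a's.
    return state.count("a")

def tree_of_thought(start: str, depth: int, beam: int = 2) -> str:
    frontier = [start]
    for _ in range(depth):
        candidates = [nxt for s in frontier for nxt in propose(s)]
        # Keep only the `beam` best branches before expanding again.
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return max(frontier, key=score)

print(tree_of_thought("", depth=3))  # -> "aaa"
```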
Can AI solve novel problems?
Using ChatGPT to devise creative marketing campaigns and humorous poems is one thing, but coming up with a new scientific discovery is another thing entirely. If we had an example of that, it would be hard to argue they aren't creative.
The closest thing we have to this is a paper from Google DeepMind that uses a large language model to solve a previously unsolved math problem. 'FunSearch' pairs a code-writing LLM with an evaluator algorithm to, in this case, attack the cap set problem. The LLM spat out millions of variations of a Python program; the evaluator ran each one, analyzed the output, and saved the best candidates. After a few days, FunSearch produced a correct and previously unknown solution to the cap set problem.
Is this the first time an AI solved a math problem using its own creativity? Perhaps so. Critics, though, were quick to point out the narrow scope of the problem, and also that this approach wouldn't necessarily generalize past math problems.
As an aside, the most fascinating aspect for me was the uncanny parallels between Deutsch's conjecture/criticism theory of knowledge and the dynamic between the LLM and the evaluator. The LLM provides the conjecture, while the evaluator offers the criticism.
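That conjecture/criticism loop has a simple shape in code. Here's a toy sketch of a FunSearch-style generate-and-evaluate loop, with the LLM and the program evaluator replaced by stand-ins (all names are mine; the real system mutates Python programs, not numbers):

```python
import random

# FunSearch in miniature: a "generator" conjectures candidates, an
# evaluator criticizes them, and only the best survive to seed the
# next round of conjectures.

def generate(seed_value: float) -> float:
    # Stand-in for an LLM mutating the best candidate found so far.
    return seed_value + random.uniform(-1.0, 1.0)

def evaluate(candidate: float) -> float:
    # Stand-in for running a program and measuring the result;
    # here the "best program" is whatever gets closest to 10.
    return -abs(candidate - 10.0)

def search(rounds: int, per_round: int) -> float:
    random.seed(42)
    best = 0.0
    for _ in range(rounds):
        candidates = [generate(best) for _ in range(per_round)]
        best = max([best, *candidates], key=evaluate)
    return best

print(round(search(rounds=50, per_round=20), 2))
```

The generator plays the role of conjecture, the evaluator plays the role of criticism, and iteration does the honing.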
Setting aside LLMs, are there any examples of AIs being creative?
Actually, yes: on the 10th of March 2016, an AI displayed such a leap of creativity that the people of South Korea had a mini national crisis.
The DeepMind Challenge was a five-game Go match between one of the world's best players, Lee Sedol, and AlphaGo, an artificial intelligence. In game two, AlphaGo's 37th move floored spectators. At first, the move seemed unusual, perhaps a software malfunction. Slowly it dawned on spectators that this was a new way of playing Go, an entirely alien way, and that this move would be studied for years to come.
AlphaGo was initially trained on past human games, but it was honed by playing millions of games against itself, developing new strategies previously unused by humans. This kind of self-play is called reinforcement learning. The neural network proposes a move, an evaluator algorithm scores it, and the system learns. If you repeat this process millions of times, you get a superhuman Go player.
A subsequent version of AlphaGo, dubbed AlphaGo Zero, dropped the human data and trained itself entirely with self-play. After 40 days of training it beat all its predecessors to become the world's best Go player.
So perhaps reinforcement learning is the secret to creativity? Essentially using self-play to iteratively improve, treading a path no human has trod before.
But will this generalize to broad domains outside of games and math?
Ever since AlphaGo, this is the question researchers around the world have been asking. Do these techniques expand outside closed systems like games, or narrow domains like math?
The issue here is verification. Within a game, or a math equation, verification is pretty straightforward: you have a fitness function that rewards winning the game, or an equation you can evaluate. But what if you're looking for a good explanation of a problem? How would you evaluate something that nebulous?
The solution so far has been to use humans for the evaluation. This is called RLHF, or reinforcement learning from human feedback, a technique that uses humans to train a reward model. You take the output of an LLM, then you get a human to vote on whether it's useful or garbage, and then fine-tune the model with the result. Essentially Hot or Not for language models.
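The reward-model half of RLHF can be sketched with a simple logistic (Bradley-Terry style) model: given pairs where a human preferred one response over another, fit weights so the preferred response scores higher. The feature vectors and names below are illustrative assumptions, not the real method (real reward models are fine-tuned neural networks):

```python
import math

# Toy reward model trained from pairwise human preferences, in the
# spirit of RLHF. Each "response" is a feature vector; a human label
# says which of two responses they preferred.

def reward(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def train(pairs, dims, lr=0.1, epochs=200):
    weights = [0.0] * dims
    for _ in range(epochs):
        for preferred, rejected in pairs:
            # Probability the model agrees with the human label.
            margin = reward(weights, preferred) - reward(weights, rejected)
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient step pushing the preferred response's score up.
            for i in range(dims):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights

# Hypothetical features: [helpfulness, length]. In these votes, humans
# preferred helpful answers regardless of length.
pairs = [([1.0, 0.2], [0.1, 0.9]), ([0.9, 0.5], [0.2, 0.4])]
w = train(pairs, dims=2)
print(reward(w, [1.0, 0.2]) > reward(w, [0.1, 0.9]))  # -> True
```

Once trained, a reward model like this can score new outputs automatically, which is what lets reinforcement learning run without a human voting on every sample.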
So far RLHF has been successfully used to reduce hallucinations and coax language models into responding with something useful and, as the "Let's Verify Step by Step" paper showed, it can also improve reasoning.
However RLHF has an obvious limitation: humans. It's impossible to do the kind of self-play iteration that AlphaGo did with a meat-suit in the loop. Can we automate reinforcement learning in a loop, AlphaGo style?
A paper titled "STaR: Self-Taught Reasoner" purports to do exactly that, demonstrating a model improving itself by learning from its own generated reasoning. Its performance is impressive, comparable to a fine-tuned model 30× larger. Hopefully STaR is just the start of breakthroughs in automated reinforcement learning.
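The core STaR loop (generate rationales, keep only the ones that reach a correct answer, fine-tune on the survivors) can be sketched like this. `sample_rationale` and its canned outputs are toy stand-ins for an LLM:

```python
# STaR-style bootstrapping in miniature: sample rationales, filter by
# whether the final answer checks out, and keep the survivors as new
# training data for the next round of fine-tuning.

def sample_rationale(question: str) -> tuple[str, int]:
    # Stand-in for an LLM: returns (rationale, final_answer).
    toy = {
        "18*22": ("18*20 + 18*2 = 360 + 36", 396),
        "7*8": ("7*8 = 54, I think", 54),  # a wrong rationale
    }
    return toy[question]

def bootstrap(dataset: dict[str, int]) -> list[tuple[str, str]]:
    keep = []
    for question, correct in dataset.items():
        rationale, answer = sample_rationale(question)
        if answer == correct:  # only reasoning that works survives
            keep.append((question, rationale))
    return keep  # in STaR, the model is then fine-tuned on `keep`

print(bootstrap({"18*22": 396, "7*8": 56}))
```

Because the filter is automatic, the generate-filter-train cycle can repeat without a human in the loop, which is what makes the AlphaGo comparison tempting.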
Update: two days into 2024, we have another promising paper: "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models".
Artificial General Intelligence
So we have some very promising areas of active research:
- With step-by-step verification, we can improve accuracy
- Using test-time compute, we can trade time for quality, or spend more money to get better responses
- Using techniques like STaR, we could potentially remove the human-feedback part of reinforcement learning, automating it entirely
- And there's some evidence to suggest that reasoning generalizes
Combine all that along with a rumored OpenAI breakthrough called Q*, and it looks like we're going to have a very interesting year.
I will leave you with this thought.
The search for knowledge is the search for good explanations. What constitutes a good explanation? Unlike with an equation, two explanations for something can both be true; it's just that one is better than the other.
Take the explanation for gravity, for example. We have Newton's law of universal gravitation, and then we have Einstein's General Relativity, which supersedes it. Newton wasn't wrong; it's just that Einstein's explanation is better.
What makes Einstein's explanation better? A better explanation offers a more accurate and precise description of phenomena. While Newton's laws predict the motion of planets accurately in most cases, they do not explain certain observations, like the precession of Mercury's orbit. Einstein's explanation accounts for these discrepancies and has broader reach.
If we can get a machine to both conjecture new explanations, and to understand the difference between a good and bad explanation, then we can automate the search for knowledge.
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- Scaling Scaling Laws with Board Games
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Let’s Verify Step by Step
- Training Verifiers to Solve Math Word Problems
- STaR: Bootstrapping Reasoning With Reasoning
- Can LLMs Critique and Iterate on Their Own Outputs?