To Forge Deeper Thought
Jul 2024 - Alex Alejandre

Every now and then, we hear of some miracle material or process which will revolutionize this or that, perhaps cure cancer or replace concrete. We are excited, thirsting to drink from the firehose, and r/tech has our back (“and drinking largely sobers us again.”) Then we never hear of them again, amnesia warding off disappointment. But why does this happen?

In AI research, papers optimize but one aspect while every other part of their methodology lags a few years. Just using last year’s cutting edge approaches everywhere should blow everyone out of the water. It appears DeepSeek is actually synthesizing.

DeepSeek’s approach is currently the open source state of the art, table stakes for new research. (That other researchers don’t pluck this low-hanging fruit is a travesty.) Since they focus on bootstrapping ever better models at low cost, their papers illustrate a good learning path too! They did just this: surveyed the field, found the Pareto frontier leading to frontier capabilities, began to blaze a trail down it and shared the hollowed path! After a few months they hit and published their next milestone, which just drove victory in a math olympiad.

That’s fine and good, but how can we apply this methodology elsewhere? The replication crisis infects psychology, medicine, behavioral economics and of course computer science. How did DeepSeek’s researchers cut through the cruft?

Git gud.

They listened to their coach and built strong fundamentals. Where “shut up and calculate” has rendered much recent physics research useless, just being aware of underlying principles puts you above the rest. When everyone misses the forest for their own special tree, forgetting to account for recent developments, simply curating and collating what works yields novel theory to guide innovation. While being at the 95th percentile in one field isn’t that good, combining 95th-percentile skill in finance, math and geology unlocks many doors. Why?

Underlying every field reign the very same rules, the very grammar and architecture of complexity. It’s no wonder Christopher Alexander’s “design patterns” so inspired software developers, jealously crafting ever arcaner tools to navigate and mine ontological veins.

DeepSeek employs jacks of all trades, knowledgeable in history, languages and chemistry, beside the expected mathematicians and computer scientists. Attacking the problem from bisociated angles, they see different branches. Western academicians and corporate researchers pursue the stale meta, restricted by what their colleagues and superiors, yea investors and grant boards, will understand and accept, and by what they can quickly publish lest they perish. With courage and curiosity, DeepSeek delves ever deeper.

Commenting on an interview (trans.):

Innovating on model structure means there’s no path to follow, requiring many failures and enormous time and economic costs.

As the tortoise beat the hare, duly building out a framework breaks ground faster. Testing each part, you can determine what research (other papers) is chaff, focusing your model-theory.

Amidst a clamor that LLM technology will inevitably converge and that following is a smarter shortcut, DeepSeek values the accumulation in “detours” and believes that Chinese LLM entrepreneurs can join the global technological innovation torrent in addition to application innovation.

If we can form a complete industry upstream and downstream, we don’t need to do applications ourselves.

Where in China common wisdom commandeth man make products sitting on the West’s giant shoulders, in the US profit commandeth man guard but jealously his knowledge and wallow away from network effects, outside opinions etc. From Carpenter’s They Became What They Beheld:

Specialists don’t welcome discovery, only new proofs of what they know. All specialists understand that discoveries are fatal … Discovery makes the field of the specialist and the expert obsolete.

True passion and curiosity cut straight through this wilderness of mirrors. Instead of resting on your laurels and retarding advancement, actually thriving is its own reward. Actual ownership of your ideas and research direction leads you from zero to one.

Even if OpenAI is closed source, it can’t prevent being overtaken by others. So we deposit value in our team, our colleagues grow in this process, accumulate a lot of know-how, form an organization and culture that can innovate, that’s our moat. Open source, publishing papers, actually doesn’t lose anything. For technical people, being followed is a very fulfilling thing. In fact, open source is more like a cultural behavior than a commercial behavior. Giving is actually an extra honor. A company doing this will also have cultural attractiveness.


In my own work, I blend a fruit smoothie of diachronic philology, scholastic philosophy, value investing, biomechanical economics and GOFAI. After obsessive years spent with each of these lenses in turn, they rhyme like the music of the spheres and, when I behold the new hotness, help cut through the chaff.