How to achieve feats fast

Scaling in the right order.

First, figure out what to scale. People who execute fast understand
that a proof of concept doesn't need to take long. Running many POCs
in many directions in a short amount of time lets them see the future
as a predictable roadmap. This is why OpenAI is making significant
progress very fast: they tested which directions to scale, namely
model size, training compute, and recently, inference compute.
Scaling laws describe how results improve along each of these
directions. Once you know the scaling law, you know where to scale
effectively. Each direction you scale brings a multiplicative effect
on the overarching goal.
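
To make "multiplicative" concrete, here is a minimal sketch. The
improvement factors are made-up numbers for illustration, not measured
results, and the assumption that the directions are independent (so
their gains simply multiply) is mine, not something a real scaling law
guarantees.

```python
from math import prod


def combined_gain(axis_gains):
    """Multiply per-direction improvement factors into one overall factor."""
    return prod(axis_gains.values())


# Hypothetical improvement factors, one per scaling direction.
# These are illustrative numbers only, not measurements.
gains = {
    "model_size": 2.0,
    "training_compute": 1.5,
    "inference_compute": 1.2,
}

# If the directions compound, the gains multiply.
overall = combined_gain(gains)

# For comparison: what you would get if the gains merely added up.
additive = 1 + sum(g - 1 for g in gains.values())

print(f"multiplicative: {overall:.2f}x, additive: {additive:.2f}x")
```

With these made-up numbers the compounded gain is noticeably larger
than the sum of the individual gains, which is the point of scaling a
few directions that multiply rather than many that don't.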

I think slowness is often caused by doing one thing in 10 different
ways, i.e., scaling redundancy. To succeed, there are often no more
than two things to focus on scaling at any given time. A useful
mindset is to ask "how do I achieve 80% of the effect with 10% of the
effort?"; this helps narrow the focus to what really matters.
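
One way to apply that mindset is to rank candidate directions by
effect per unit of effort and keep only the top two. The sketch below
assumes you can put rough numbers on effect and effort at all; the
candidate names, the numbers, and the pick_focus helper are all
hypothetical.

```python
def pick_focus(candidates, max_focus=2):
    """Return the names of the candidates with the best effect/effort ratio."""
    ranked = sorted(
        candidates,
        key=lambda c: c["effect"] / c["effort"],
        reverse=True,
    )
    return [c["name"] for c in ranked[:max_focus]]


# Hypothetical candidates: "effect" and "effort" are rough fractions of
# the total goal and the total budget, guessed after a quick POC.
candidates = [
    {"name": "A", "effect": 0.80, "effort": 0.10},
    {"name": "B", "effect": 0.10, "effort": 0.30},
    {"name": "C", "effect": 0.05, "effort": 0.40},
    {"name": "D", "effect": 0.30, "effort": 0.10},
]

focus = pick_focus(candidates)
print(focus)
```

The ranking itself is trivial; the useful part is that it forces an
explicit estimate for every direction and a hard cap on how many of
them get attention.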

Second, it's worth thinking about the order in which to scale.
Scaling in the right order means that doing one thing first makes the
next thing more effective. Think about why language was the initial
entry point to AGI: it's the lowest-cost abstract storage of
information about the world, we already have plenty of it, and we can
easily create more. Images and videos, by contrast, are less
supervised and are a more costly way to represent information. Text
has the beautiful property of filtering out what doesn't matter and
keeping only what does. So making language models really good pushes
other directions forward. It's like a set of interlinked bricks:
pushing one moves the others as well.