• Turing Post

01.AI: How Kai-Fu Lee Shapes the AI Narrative in Both the U.S. and China

From open-source models to practical applications, discover how 01.AI is bridging AI strengths between the U.S. and China

Introduction

Chinese models often evoke both interest and suspicion when they launch and succeed. But one company stands out, drawing attention from the highest caliber of Western media: 01.AI and its founder, Kai-Fu Lee. 01.AI achieved unicorn status in the GenAI space almost immediately after its launch. And the company doesn’t focus only on models; practical products are what Lee is pushing forward. This aligns with his belief that while the U.S. will lead when breakthroughs are needed, China, with its tenacious work ethic and vast market opportunities, can catch up when it comes to executing known technologies.

Kai-Fu Lee is a unique figure. Although he builds for China, he is an AI expert whom Bloomberg, the Financial Times, Time, and other Western media regularly turn to for comments, interviews, and stories. Can Kai-Fu Lee manage to influence both China and the US? And where is he driving 01.AI? Let’s explore together.

In today’s episode:

  • Zero One everything – how it all started

  • The master behind the idea – Kai-Fu Lee, the legend and the creator of the narrative of the future

  • Differences between the US and China

  • First models (what does 01.AI specialize in?)

  • TechSpec of Yi Model series and Yi Coder

  • How does 01.AI make money? Any problems with that?

  • Suggesting a new approach to Product-Market Fit, or: What is TC-PMF?

  • Conclusion

  • Bonus: Resources

Zero One Everything – how it all started

In March 2023, the legendary Kai-Fu Lee, noticing that China lagged behind in the generative AI race, shook up his network and started to assemble a team. In three months, this effort became 01.AI (known as Zero One Everything in China). It quickly gained momentum, securing $200 million in funding from investors like Alibaba in November and reaching AI unicorn status with a $1 billion valuation.

If you remember from our coverage of Zhipu and Moonshot, China’s AI landscape was originally dominated by the “AI tigers,” which included companies like Zhipu, Moonshot, Baichuan Intelligence, and MiniMax. This group has now evolved into the “AI dragons,” with two AI unicorns added: 01.AI and StepFun.

In June, five of these companies announced a strategic partnership with Alibaba to help the tech giant develop advanced generative AI features for DingTalk, Alibaba’s enterprise communication platform. From the very beginning, this has been Kai-Fu Lee’s approach: focusing on practical applications of AI as the only way for China to catch up with the US in the AI race.

The master behind the idea – Kai-Fu Lee

I first learned who Kai-Fu Lee was in 2018, devouring his just-published book “AI Superpowers” in a couple of days. In this book, he argues that while Silicon Valley’s entrepreneurs are motivated by a combination of intellectual curiosity and financial success, China’s tech entrepreneurs are driven by a sheer survival instinct, knowing they must adapt and innovate or be left in the dust. He wrote: “The West may have sparked the fire of deep learning, but China will be the biggest beneficiary of the heat the AI fire is generating. That global shift is the product of two transitions: from the age of discovery to the age of implementation, and from the age of expertise to the age of data. Much of the difficult but abstract work of AI research has been done, and it’s now time for entrepreneurs to roll up their sleeves and get down to the dirty work of turning algorithms into sustainable businesses.”

Though generally correct, he didn’t anticipate generative AI as it actually unfolded (as we can tell from his sci-fi book “AI 2041: Ten Visions for Our Future,” which already seems outdated), but the dawn of generative AI finally proved his thesis: it is the Age of Implementation now. There is no longer a compelling reason to create more foundation models; the real prize is finding a killer app for them.

Kai-Fu Lee is a genuine AI expert. He started with neural networks during his time as a Ph.D. student at Carnegie Mellon University (CMU) in the 1980s, right when the "connectionist winter" came to an end as researchers such as Hopfield, Rumelhart, Williams, Hinton, and others demonstrated the effectiveness of backpropagation in neural networks and their ability to represent complex distributions. Lee’s dissertation focused on developing a pioneering speech recognition system called Sphinx, which utilized neural network algorithms. Later, he held executive roles at Apple, Microsoft, and Google, where he led Google China. He then founded Beijing-based Sinovation Ventures, a venture capital firm investing in AI and technology startups. Though originally from Taiwan, Kai-Fu Lee has long championed China and its development, encouraging Chinese students to return to China to work in tech while presenting himself to the outside world as a bridge in US-China tech relations.

And the Western media listens to him. He is featured in Bloomberg, Financial Times, The New York Times, etc. He is also on the TIME 100 Most Influential People in AI 2023 (but didn’t make it to 2024).

Differences between the US and China

China's progress in the generative AI field has encountered some challenges, particularly due to its continued use of U.S. technology. Although Beijing aims to lead in AI by 2030, Chinese companies have found it difficult to match the pace of U.S. innovation. The story of 01.AI, which built its system using Meta’s LLaMA model, reflects this dynamic. While regulatory and economic factors have impacted growth, Chinese firms are leveraging U.S. open-source AI software to bridge the gap. Some experts note that China still trails behind the U.S. in certain areas, with broader geopolitical and security concerns adding complexity.

Kai-Fu Lee underscores the distinct strengths of both nations in the AI race. The U.S. is often celebrated for its pioneering innovation, particularly in foundational technologies like the transformer model used in GPT. Lee acknowledges that many future AI breakthroughs may continue to emerge from the U.S. due to this innovative drive. On the other hand, he highlights China’s expertise in execution – efficiently collecting and cleaning data and optimizing infrastructure. Lee emphasizes China’s ability to develop and scale successful applications, pointing to companies like TikTok, WeChat, Shein, and Temu as examples. As the focus shifts towards AI applications, he suggests China may have a key advantage.

But there is always this special “China” moment, meaning an overbearing control by the Chinese government. According to the Financial Times: “The Cyberspace Administration of China (CAC), a powerful internet overseer, has forced large tech companies and AI start-ups including ByteDance, Alibaba, Moonshot, and 01.AI to take part in a mandatory government review of their AI models, according to multiple people involved in the process.”

While the digital world may seem borderless, national policies and regulations heavily influence AI development and global perceptions.

Why open-source

In a sense, it’s a way to 'pay back,' since a lot of Chinese companies benefit from open-source themselves and would not be able to catch up without it. But it’s also a highly strategic move. 01.AI’s decision to release open-source AI models helps them build a loyal developer base, setting them apart from companies like OpenAI and Google, which keep their technology tightly controlled. Founder Kai-Fu Lee believes this openness fosters innovation and contributes to a more balanced AI ecosystem.

From a PR perspective, open-sourcing is also a smart strategy for generating word-of-mouth. By offering accessible models, 01.AI encourages early adoption and developer engagement, which can amplify the brand's presence in the AI community. This creates a buzz that positions the company as an innovator →

"I think that on a pure technology basis, yes, the U.S. is ahead. China’s caught up very quickly. There are two very good open-source models from China. One is YiLarge, which is the model from Kai-Fu Lee‘s company, 01.AI. And then the other one is Qwen 2, which is out of Alibaba, and these are two of the best open-source models in the world, and they’re actually pretty good."

Alexander Wang, Scale.AI

However, open-sourcing may only be an entry point – building initial momentum through community engagement could lead to eventual commercial success, where proprietary control becomes key to activating the business flywheel.

What 01.AI offers / Timeline

Model Releases and Key Events:

→ November 6, 2023:

  • Yi Model Series: 01.AI open-sourced three model variants: Yi-6B, Yi-9B, and Yi-34B. These models marked the beginning of their large-scale LLM offerings, designed for text generation, understanding, and complex reasoning.

→ Early 2024:

  • Yi-Large – 01.AI’s first closed-source model designed for high-performance reasoning and content creation. It competes with models like GPT-4, offering a more precise and targeted approach for various AI tasks.

→ May 7, 2024:

  • Wanzhi, an AI-powered productivity platform that provides features such as meeting minutes, weekly reports, and document interpretation. The platform is available in both Chinese and English, and supports collaboration through mobile and web apps, including on platforms like WeChat.

→ July 29, 2024

  • Yi-Vision model released, available in 6B and 34B variants.

→ September 5, 2024:

  • Yi-Coder, an open-source code generation model in two configurations: Yi-Coder-1.5B and Yi-Coder-9B. These models focus on efficient coding tasks such as code generation, editing, and long-context comprehension. Despite its smaller size, Yi-Coder performs competitively against larger models on key benchmarks like HumanEval and LiveCodeBench.

Key Model APIs

01.AI offers six performance-oriented APIs optimized for different tasks:

  • Yi-Large API:
    Designed for intricate reasoning, text generation, and deep content creation.

  • Yi-Large-Turbo API:
    A faster, more cost-effective version of Yi-Large, optimized for complex inference tasks.

  • Yi-Medium API:
    Built for instruction-following tasks like chat, translation, and general text processing.

  • Yi-Medium-200K API:
    Specializes in handling a context window of up to 200K tokens, ideal for long-form content processing.

  • Yi-Vision API:
    Focuses on image-based tasks like analysis, recognition, and multimodal interactions.

  • Yi-Spark API:
    A lightweight model optimized for quick responses in text and math-based tasks like code generation and simple mathematical analysis.

There is an important notice about their API though: “We regret to inform you that our API services will be temporarily unavailable starting from August 25th due to business adjustments. To continue using our services, we invite you to register on our official Chinese website at https://platform.lingyiwanwu.com/.”

Another “China” moment.

TechSpec of open-source Yi model series

Yi-6B, Yi-9B, and Yi-34B

Shortly after the release of the Yi model series (November 2023), developers spotted that 01.AI's code initially referenced Meta’s model, though these mentions were later removed. Richard Lin, head of open source at 01.AI, later confirmed that the company would restore the references. 01.AI has since acknowledged Llama 2 as part of the architectural foundation for Yi-34B, with key enhancements to improve efficiency and performance.

The series includes three open-source variants: Yi-6B, Yi-9B, and Yi-34B, which focus on large-scale language and multimodal tasks like text generation, complex reasoning, and multimodal understanding. The models incorporate Grouped-Query Attention (GQA), which reduces computational overhead for both training and inference, a feature typically reserved for LLaMA’s largest models but integrated into all Yi variants.

Other architectural improvements include using SwiGLU activation functions to enhance computational efficiency and Rotary Position Embedding (RoPE) for managing long-context windows up to 200K tokens. These modifications significantly boost the Yi series’ versatility, enabling it to excel in both short- and long-context tasks across different domains.
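RoPE can be sketched in a few lines of NumPy. The version below is a simplified single-vector form using the "rotate-half" pairing common in Llama-style implementations — an illustration of the idea, not 01.AI’s actual code. Its defining property is that the dot product between a rotated query and key depends only on their relative position, which is what makes very long context windows tractable.

```python
import numpy as np

def rope(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Rotary Position Embedding for one vector at position `pos`.
    Dimension pairs (i, i + d/2) are rotated by position-dependent angles."""
    d = x.shape[-1]
    half = d // 2
    freqs = base ** (-2.0 * np.arange(half) / d)  # one frequency per pair
    theta = pos * freqs
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * np.cos(theta) - x2 * np.sin(theta),
                           x1 * np.sin(theta) + x2 * np.cos(theta)])

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)
# Rotations preserve norms, and q·k depends only on the position offset:
print(np.allclose(np.linalg.norm(rope(q, 10)), np.linalg.norm(q)))    # True
print(np.allclose(rope(q, 5) @ rope(k, 7), rope(q, 1) @ rope(k, 3)))  # True
```

The second check is the crucial one: positions (5, 7) and (1, 3) have the same offset of 2, so attention scores are identical, which is why the same weights generalize across absolute positions.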

Pretraining plays a crucial role in Yi’s performance, leveraging 3.1 trillion tokens in a dual-language corpus (English and Chinese) and employing rigorous data-cleaning techniques. This high-quality dataset, along with advanced data deduplication methods, positions the Yi models competitively on standard benchmarks such as MMLU and Chatbot Arena.
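To get a feel for what deduplication does, here is the simplest possible version — exact matching on normalized text. The pipelines behind models like Yi use much heavier fuzzy techniques (e.g., MinHash-style near-duplicate detection), so treat this as a toy illustration only.

```python
import hashlib

def dedup(docs: list[str]) -> list[str]:
    """Keep the first copy of each document, comparing after light
    normalization (lowercasing + collapsed whitespace)."""
    seen, kept = set(), []
    for doc in docs:
        key = hashlib.sha256(" ".join(doc.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept

print(dedup(["Hello  world", "hello world", "Different text"]))
# ['Hello  world', 'Different text']
```

Even this crude pass removes the verbatim repeats that inflate benchmark scores; near-duplicate detection then catches paraphrases and boilerplate that exact hashing misses.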

Yi-34B, the largest of the series, approaches GPT-3.5-level performance while maintaining a focus on cost efficiency, particularly through 4- and 8-bit quantization, making it deployable on consumer-grade GPUs. This combination of advanced architecture, data quality, and model optimization makes the Yi series a strong choice for research and commercial applications​.
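Quantization is what makes the consumer-GPU deployment possible: storing weights as 8-bit (or 4-bit) integers plus a scale cuts memory roughly 4x (or 8x) versus float32. Below is a minimal sketch of symmetric per-tensor int8 quantization — an illustration of the idea, not 01.AI’s actual scheme.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: int8 codes plus one float scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
print(w.nbytes // q.nbytes)                         # 4: int8 is 4x smaller
print(np.abs(w - dequantize(q, s)).max() <= s / 2)  # True: half-step error bound
```

Production schemes refine this with per-channel or per-group scales and 4-bit codes, trading a little accuracy for even smaller footprints.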

Yi-Coder

A recent addition (September 2024) to the open-source Yi model series is Yi-Coder. It excels in various coding tasks such as code completion, editing, and long-context modeling, handling up to a 128K token window. Its long-context capabilities were tested in the "Needle in the Code" benchmark, where it flawlessly identified key functions within large codebases. Additionally, Yi-Coder demonstrates strong mathematical reasoning, achieving a 70.3% accuracy rate on program-aided language modeling (PAL) mathematical benchmarks, outperforming larger models like DeepSeek-Coder-33B.

Available in two sizes – 1.5B and 9B parameters – Yi-Coder builds upon the core Yi-9B architecture and is pretrained on 2.4 trillion tokens across 52 major programming languages. Yi-Coder generated a lot of excitement in the coding community, since such a small model can easily run on a laptop. But opinions about its quality vary.


How does 01.AI make money? Any problems with that?

01.AI’s revenue model blends both open- and closed-source models, targeting domestic and international B2B and B2C markets. According to Lee, 01.AI is nearing profitability. With a projected revenue of RMB 100 million (USD 13.8 million) this year, the company focuses on monetizing its AI products like the Yi-Large API, priced at RMB 20 (USD 2.7) per million tokens – less than a third of GPT-4 Turbo’s price. 01.AI also plans to adopt a cloud-based approach, further commercializing its industry solutions and APIs.
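The pricing claim is easy to sanity-check. The Yi-Large figure comes from the text above; the GPT-4 Turbo figure below is our assumption (USD 10 per million input tokens, its list price at the time), so treat the comparison as approximate.

```python
YI_LARGE_USD_PER_M = 2.7     # from the text: RMB 20 ≈ USD 2.7 per 1M tokens
GPT4_TURBO_USD_PER_M = 10.0  # assumed input-token list price at the time

tokens = 50_000_000          # e.g., a month of app traffic
yi_cost = tokens / 1e6 * YI_LARGE_USD_PER_M
gpt_cost = tokens / 1e6 * GPT4_TURBO_USD_PER_M
print(f"Yi-Large: ${yi_cost:.0f} vs GPT-4 Turbo: ${gpt_cost:.0f}")  # $135 vs $500
print(YI_LARGE_USD_PER_M / GPT4_TURBO_USD_PER_M < 1 / 3)  # True: under a third
```

Under these assumed prices the "less than a third" claim holds on input tokens; output-token prices differ and would change the exact ratio.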

In August 2024, 01.AI reported that the Wanzhi productivity app reached 10 million users and generated over 100 million yuan (around $13.8 million) in revenue.

But there are also reports that 01.AI has been struggling to win bids in the Chinese government procurement market in 2024. This lack of success, highlighted in the ChinAI Newsletter by Jeffrey Ding, suggests challenges in adapting to the complex and diverse needs of the Chinese enterprise market. The insight comes from Gui Xingren, using data from Zhiliaobiaoxun, which tracks bids in China's public procurement market. The report suggests that startups like 01.AI may struggle due to fragmented market demands and a preference for established companies.

Suggesting a new approach to Product-Market Fit, or: What is TC-PMF?

Since Kai-Fu Lee always insists on the importance of immediate AI implementation, he suggested a new approach – Technical Cost x Product-Market Fit (TC-PMF). It addresses the unique challenges AI companies face, where traditional Product-Market Fit isn't enough.

Unlike mobile internet's low-cost scaling, AI development incurs significant computational costs in both training and inference. TC-PMF highlights the need to align technology capabilities with cost efficiency to ensure scalability. Lee stresses that balancing technological advancements with sustainable cost practices is key, urging AI companies to avoid unsustainable methods, like traffic buying, and focus on rational growth strategies.

Conclusion

Yet, even for someone as influential as Kai-Fu Lee, establishing a global AI enterprise from China is challenging. While 01.AI is well-covered by Western media and well-known in the developer community, it cannot escape the realities of operating within China's regulated tech environment. Whether it’s navigating government reviews, securing international bids, or maintaining API services, 01.AI exemplifies the delicate balance between global ambition and local control. Lee’s vision of fostering cross-border collaboration and advancing the practical application of AI has positioned 01.AI as a significant player, but it also highlights the tightrope companies must walk when balancing innovation with governmental oversight.

Ultimately, 01.AI’s journey highlights the broader tension in the global AI race: balancing breakthrough innovation with practical execution. This mirrors the U.S.-China dynamic, where the U.S. excels in pioneering advancements, and China demonstrates strength in scaling and application. 01.AI is Kai-Fu Lee’s attempt to merge these two strengths, crafting a hybrid approach that leverages the best of both worlds. With this approach, Kai-Fu Lee is entering into a long-distance race—first to catch up, and then to surge ahead.


Bonus: Resources

# install the LlamaIndex Yi integration (shell)
pip install llama-index-llms-yi

# then, in Python
from llama_index.llms.yi import Yi

