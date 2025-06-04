

Welcome to the Sinica Podcast, a weekly discussion of current affairs in China.

I'm Kaiser Kuo.

Those of you who follow the show know that Sinica has been partnering quite closely with the team at Trivium, and today we get to showcase one of their standout pieces of recent work from one of their standout analysts. In January of this year, an AI startup called DeepSeek, of course, stunned the world by dropping a highly capable large language model that rivaled top-tier Western offerings. That debut raised a lot of eyebrows, and not a few questions about where China’s generative AI ecosystem really stands. Enter Kendra’s report, Seeking the Next DeepSeek. It digs into a remarkable data set maintained by China’s cyberspace regulator, the CAC, which requires registration of all public-facing generative algorithms. She told me about the database and this work she was doing on it over coffee last month when I was visiting Pittsburgh.

That was great. And a couple of weeks ago, she put it out. The report and the database offer a rare window into what’s actually being built in China, built and deployed, not just by the giants like Tencent, Alibaba, ByteDance and Baidu, but also by hundreds of startups, state labs, and even government agencies. As of April 2025, this list included 3,739 registered generative algorithmic tools, or GATs (that’s an acronym you’re going to be hearing a bit), ranging from foundational LLMs like DeepSeek to B2C voice assistants, image generators, video generators. Kendra’s analysis gives us the most comprehensive look to date at where generative AI innovation is happening in China: who’s doing it, which sectors are really heating up, where foundational labs are proliferating, and what role the state is playing in shaping this landscape.

It also offers a window into China’s evolving innovation strategy, where it’s succeeding, where it’s duplicating efforts, and where the next breakthroughs might come from. And the report illustrates, really, the value of mining China’s regulatory disclosures when you know where to look. There’s actually a surprising amount of actionable and public info on AI development in China. So, let’s get into it with Kendra Schaefer. Kendra, welcome back to Sinica.

Kendra Schaefer: Thank you. So good to be here as always.

Kaiser: Well, before we get into the data set itself, there’s the very reason it exists in the first place. And that is, of course, that it was mandatory to register these generative AI algorithms, as I mentioned just now. Folks interested in the AI regs and in their genesis can listen to the episode that you did on Sinica with Jeremy Daum, which was great. But for today, Kendra, can you give us a quick overview on where else in the world this type of regulation currently exists? Do we have something like this in the U.S.? Does the EU have something like this?

Kendra: No. And that’s why this data set is so interesting. And it’s also one of the reasons this data set is relatively unknown. I mean, where else in the world can you get a complete list of all of the different generative AI tools operating within the borders of a country? Now, there are some caveats about what’s actually on the list and what’s actually not on the list. But this is incredibly unique. It’s not just unique for the era. It’s unique globally. The U.S. does not require generative AI tools to register with the state, obviously. And Europe, even though they do regulate algorithms quite heavily, they haven’t, certainly not this heavily, and they haven’t taken this tactic, where they require companies to kind of file with a regulator for any tool that is interacting with the general public.