Hong Kong Stock Research Society
2025.07.15 01:51

Bilibili promotes the revival of 'video-based' podcasts, another 'supply-side reform' in the content industry.

When podcasts are no longer just private sounds in headphones, they begin to reach the screens of billions of users as video. The global podcast market is heading toward $30.72 billion by 2026, with a compound annual growth rate of 27%.

Against this backdrop, Bilibili, China's largest youth culture community, has released a support policy for video podcasts called the 'Video Podcast Outbreak Plan,' aimed at helping audio and text creators move into video and grow their accounts.

On the other side of the ocean, YouTube announced as early as February this year that monthly active users of podcast content on its platform had exceeded 1 billion, a figure that not only puts it far ahead of audio giant Spotify but also forced Spotify to launch a video revenue-sharing plan to retain creators.

As deep content becomes a necessity for users looking to escape fragmented media, video podcasts are returning to the center of the content industry.

Video podcasts show breakout value, Bilibili welcomes a moment of breakthrough

For a long time, growth in the gaming business has been sluggish and advertising revenue has hit a ceiling. Bilibili now stands at a crossroads in its commercial transformation.

Despite having a large Gen Z user base, the platform has always struggled to balance the community atmosphere of 'powered by love' with commercial efficiency.

As short video platforms rapidly capture user time, Bilibili urgently needs a middle path that preserves its tone of deep content while opening up monetization channels.

The quiet rise of video podcasts has shown the platform a glimmer of hope for a breakthrough. According to 'Cooperation Planning' data, this quarter, Bilibili's video podcast audience exceeded 40 million, and user watch time increased from 6.9 billion minutes to 25.9 billion minutes, a growth of over 270%.

These figures came from organic growth, in a period when 'operations and products did not intervene,' and they reveal users' strong demand for deep content.
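As a quick check on the watch-time figures quoted above, the growth rate can be recomputed from the two reported values. This is only an illustrative sketch: the function name `percent_growth` is an assumption introduced here, and the inputs are the 6.9 billion and 25.9 billion minutes cited in the article.

```python
def percent_growth(before: float, after: float) -> float:
    """Return the percentage increase from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100.0


# Watch-time figures quoted in the article, in billions of minutes.
watch_time_before = 6.9
watch_time_after = 25.9

growth = percent_growth(watch_time_before, watch_time_after)
print(f"Watch-time growth: {growth:.1f}%")  # prints "Watch-time growth: 275.4%"
```

The result, roughly 275%, is consistent with the article's 'over 270%'.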

What excites Bilibili even more is the monetization ability shown by top creators. Legal-field UP hosts (Bilibili's term for uploaders) 'Wang Yikuai' (Mita Technology COO Wang Yiwei) and 'Zhong Er Da Xuan Ge' earned considerable annual income through user 'charging' (paid content) and knowledge courses, far exceeding the revenue of equivalent content on audio podcast platforms.

This monetization model based on deep trust relationships is the best commercial expression of Bilibili's community genes.

It is reported that Bilibili's video podcast support policy mainly includes three parts: cold start support for traffic, free recording venues in major first-tier cities, and AI creation tools exclusive to video podcasts.

Among them, the 'Code H' AI tool directly addresses creators' pain points. It is mainly meant to save podcast creators time on searching for video material and on editing: creators input their content, and the tool automatically generates images. It accepts two input formats, text and audio, and can cut the video production time for a thousand-word piece to within 6 minutes, with the potential to shorten it to 3 minutes in the future.

Technological empowerment is clearly dismantling the barriers to professional video production, letting knowledge elites and audio creators get started without a heavy production burden.

Bilibili was long trapped in the contradiction between its 'powered by love' community genes and commercial efficiency. Although blockbuster hits in the gaming business briefly lifted profitability, over-reliance on single titles exposed the fragility of the profit structure.

In this context, the rapid, unmanaged growth of video podcasts has become a key variable. To seize this track, Bilibili has rolled out a combination of measures and is using AI to lower the technical barriers to professional video production. This gamble is not only Bilibili's last stand to break the commercialization curse but also the opening act for deep content's return to the stage.

From 'niche private land' to 'mass content infrastructure,' the video podcast battle begins

When Bilibili heavily bets on video podcasts with '1 billion traffic + AI tools,' the global battlefield is already filled with smoke.

YouTube, with 1 billion monthly active podcast users, has become the world's largest podcast platform, forcing audio giant Spotify to follow suit urgently. The domestic market, meanwhile, is still in an 'underwater game' stage: Douyin and Xiaohongshu are testing quietly, and Himalaya and WeChat Video Channel are quietly positioning themselves.

The Chinese and American markets differ significantly. The US market has already formed a mature ecosystem in which 'top podcasts = video podcasts.' Whether it is Lex Fridman interviewing Bezos or Joe Rogan talking to Musk, YouTube is the core distribution platform for this content.

The success of 'The Joe Rogan Experience' shows that video can turn a podcast into an immersive space for ideas: audiences not only hear the conversation but also see the micro-expressions and body language of Rogan and Musk during their exchange, making abstract ideas more concrete and dramatic.

In contrast, most of China's top podcasts still remain in pure audio form. The dual obstacles behind this difference are that audio podcasts themselves are difficult to commercialize, and video production requires higher cost input. When audio monetization is difficult, creators naturally lack the motivation to upgrade to video.

Video podcasts, by giving personal IP a visual identity and diversifying advertising formats, are therefore expected to break the 'popular but not profitable' dilemma faced by traditional audio podcasts.

It is no accident that Bilibili is the first platform to enter this space in a high-profile way. Its unique genes give it a differentiated advantage.

First is its inclusiveness toward medium- and long-form content. Bilibili is currently the only video platform proven to support consumption of diverse medium- and long-form content, and its distinctive 'black listening' culture (users listening to the audio without watching the picture) lets users freely choose how they consume it.

Second is the danmaku (bullet-comment) culture and the secondary-creation ecosystem. Danmaku lets viewers' real-time thoughts resonate with one another, and users spontaneously clip and remix podcasts, forming a 'metabolic system' that distills core ideas from long content and then drives long-content consumption back through fragmented distribution.

This ecosystem is a competitive advantage that platforms like Douyin and Xiaohongshu find difficult to replicate.

It is worth noting that Bilibili's 'video podcast plan' is not an isolated action but deeply aligns with the overall evolution trend of Chinese podcasts. According to the 'CPA Podcast Marketing White Paper 2025,' the number of Chinese podcast listeners will exceed 150 million by 2025, with China ranking first globally with a growth rate of 43.6% in 2024.

JustPod data shows that 74% of users are willing to pay for podcasts, and 71.6% of users have made a purchase because of a podcast. These figures indicate that podcasts are moving from 'niche private land' to 'mass content infrastructure,' and Bilibili's decision to go all-in on video podcasts now is both a response to market trends and an attempt to capture the dividends of an industry upgrade.

Revival of deep content, 'podcast videoization' completes self-transcendence

In a content ecosystem dominated by algorithmic recommendations and 3-second highlights, video podcasts remain 'slow media' at heart. They do not chase instant gratification but ask viewers to invest sustained time; they do not serve information fast food but cook a feast of thoughts.

This contrasts sharply with short video ads that pursue instant conversion, and it calls for different ways of measuring commercial value. The rise of video podcasts is therefore essentially a rediscovery of the value of deep content.

Looking deeper, the Elaboration Likelihood Model (ELM) reveals the dual persuasion paths of video podcasts: when users think deeply, they rationally accept content through the 'central route'; when attention is scattered, they rely on the host's persona and other 'peripheral routes' to build trust. This 'dual-track overlay' trust-building ability is precisely the ecological niche advantage shared by Bilibili and podcasts.

Moreover, the 'slow philosophy' that resists fragmentation has become a core competitive strength. JustPod research shows that 91.2% of Chinese podcast users have a bachelor's degree or above, and 73.4% live in first-tier and new first-tier cities, forming a precise traffic pool of high-net-worth individuals.

For example, after a skincare brand placed customized content in 'Business Is Like This,' search volume for its Tmall flagship store surged by 180%, confirming the distinctive business logic of podcasts: 'content seeding, mind occupation, long-term conversion.' In an era of consumption upgrade, this 'slow conversion' model may do more for both brand building and sales than the 'watch and buy instantly' model of short videos.

Furthermore, Chinese podcasts are undergoing a value leap from a 'traffic business' to a 'trust economy.' Surveys show that users have a high tolerance for podcast ads, with only 0.6% of listeners leaving because of them. This means that the core of podcast commercialization is not hard selling but emotional connection built on trust.

The addition of video makes this trust-building more three-dimensional. When hosts move from behind the voice to in front of the camera, listeners form a more complete perception of their personalities, creating a new type of social interaction marked by 'physical absence, emotional presence' that traditional media cannot reach.

Finally, there is the visual expression of the thought market. Video podcasts do not simply move the recording process onto the screen; they build immersive spaces for ideas around personal IP. The success of 'The Joe Rogan Experience' on YouTube shows that the host's charisma, on-camera presence, and capacity for deep conversation are the core assets.

Bilibili's 1 billion in traffic support thus does more than ignite creators' enthusiasm; it also sounds the horn for deep content's counterattack against the fragmented era.

Now, globally, the boundaries between audio, video, and live streaming are blurring, and their ultimate form may be neither pure audio nor traditional video. Bilibili's bet on video podcasts at this time is both a forward-looking judgment on content consumption trends and a strategic attempt to break through the commercialization dilemma.

Author: Sang Yu

Source: Hong Kong Stock Research Society