Why the Creator Economy Needs More Data Science

The term “creator economy” refers to the now-massive industry made up of individuals who use their talent, expertise, and unique personalities to create content, build audiences online, and grow their own social media-based businesses. As screen time skyrockets, so does the creator economy.

At the moment, the creator economy is worth about $250 billion (roughly a third of the size of the U.S. commercial banking market) and is expected to almost double in size to $480 billion by 2027. According to Influencer Marketing Hub, more than 200 million people around the world consider themselves “creators,” and almost 30% of young Americans plan to join this new economy as social media influencers.

However, despite the industry’s size, consistent income remains elusive for many creators. In fact, 97.5% of YouTubers have yet to earn enough from their creations to reach the U.S. poverty line ($14,580 for individuals). Nearly 60% of beginner creators make less than $100 in a year, and more than a third make $1,000 or less. Only a scant 4% of creators make more than $100,000 a year.

In this article, we’ll explore some of the ways data science and digital tools can empower creators to overcome challenges in content strategy and monetization.

Data Science to the Rescue

In the current digital landscape, creators can often capture attention initially, but sustaining it and turning it into long-term income remains challenging. To date, more than two-thirds of creators work in the field on a part-time basis only, and almost 60% hold full-time jobs. Many creators take a casual approach to their work, relying solely on trial and error for feedback to guide their content planning and research. Furthermore, trying to build their businesses “on the side” strains creators’ ability to keep up with key industry trends and content, especially when content editing and production take up the majority of their “free” time.

Recognizing these challenges, I saw an opportunity for data science to play a key role for content creators. Data science can offer actionable insights and help creators to streamline their content strategies and pave the way to monetization.  

My participation in the #BCGHacks AI Hackathon helped me turn these initial ideas into a Python-based tool that leverages machine learning (specifically, Natural Language Processing) to help content creators optimize their content. This two-day online event was the perfect motivation to build a working product, which, incidentally, won the hackathon’s Best Prototype Award under the Social Impact Category.

Leveraging Python to Optimize Content Creation  

To describe the implementation we created, we will use YouTube as an example.

We used YouTube’s API to aggregate contextual data along with engagement metrics from one content creator’s catalog. We implemented this in Python, using the following steps:

  1. Extracting and summarizing existing content. First we extracted data from a number of the creator’s YouTube videos, capturing both text-based content like video captions, tags, and labels, and engagement metrics such as likes, views, and comments. Then we summarized the statistical metrics, using weighting and normalization to minimize any biases based on timeframes and external factors.
  2. Analyzing content and engagement correlations using Natural Language Processing. Next, we used standard Natural Language Processing techniques to analyze the textual context of the data we had just extracted. Using stemming and lemmatization through a WordNet model, we cleaned the text to bring each word in the caption to its root form. For example, we converted “car,” “cars,” “car’s,” and “cars’” to “car.” Collapsing these variations helped us understand the key meaning or topic hidden behind each of these grammatical nuances.

    To help us understand the key topics or subject behind each video, we used a KeyBERT Machine Learning Model to perform keyword extraction, which, in turn, enabled us to summarize the text caption for each video. As a result, we could now represent each video (which averaged 20 minutes in run time and 3,000 words of text) using just 7-10 keywords or key topics.

  3. Identifying and visualizing improvement opportunities. By combining the summarized textual context and the normalized statistical metrics, we could assess correlations between content topics and viewer engagement. This step enabled us to identify those key topics that earned the most viewer engagement—information that would enable the creator to refine content to match the interests of the target audience.
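
The steps above can be illustrated with a minimal, standard-library-only sketch. It is not the tool built for the hackathon: the sample videos and their numbers are invented, the keywords are assumed to have been extracted already (for example by KeyBERT), `normalize_keyword` is a crude stand-in for WordNet lemmatization, and "likes plus comments per view" is just one plausible way to normalize engagement.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: keywords are assumed to be already extracted
# (for example by KeyBERT); all numbers are invented for illustration.
videos = [
    {"keywords": ["iPhone", "camera"], "views": 10_000, "likes": 900, "comments": 100},
    {"keywords": ["iphones", "battery"], "views": 5_000, "likes": 400, "comments": 100},
    {"keywords": ["laptop", "battery"], "views": 20_000, "likes": 500, "comments": 100},
]


def normalize_keyword(word):
    """Crude stand-in for lemmatization: lowercase, drop possessives and a plural 's'."""
    word = word.lower().replace("’", "'")
    for suffix in ("'s", "'"):
        if word.endswith(suffix):
            word = word[: -len(suffix)]
            break
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word


def engagement_rate(video):
    """Likes plus comments per view, so large and small videos are comparable."""
    if video["views"] == 0:
        return 0.0
    return (video["likes"] + video["comments"]) / video["views"]


def engagement_by_topic(videos):
    """Average engagement rate of the videos that mention each topic."""
    rates = defaultdict(list)
    for video in videos:
        # A set avoids counting a video twice when two keywords share a root.
        topics = {normalize_keyword(k) for k in video["keywords"]}
        for topic in topics:
            rates[topic].append(engagement_rate(video))
    return {topic: mean(values) for topic, values in rates.items()}


# Rank topics from most to least engaging.
ranking = sorted(engagement_by_topic(videos).items(), key=lambda item: item[1], reverse=True)
for topic, rate in ranking:
    print(f"{topic}: {rate:.3f}")
```

Dividing by views is a deliberate simplification. A fuller version would probably also account for how long each video has been online, since older videos have had more time to collect likes and comments.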

To check whether the key topics we identified from a single creator’s content reflected interests that were shared across that industry, we then extracted the same contextual and engagement data from content created by the industry’s top 10 creators. Comparing the two sets of results revealed gaps in the individual creator’s content: key topics that were garnering the most industry engagement, but that were not among the creator’s strongest topics. Now cognizant of these gaps, the creator could plan new content accordingly.
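
One simple way to express this gap analysis is as a comparison of two ranked topic lists. The sketch below is an assumption about how it could be done, not the original implementation: the engagement figures are invented, and `top_topics` and `find_content_gaps` are hypothetical helper names.

```python
# Hypothetical average engagement per topic; all values are invented.
creator_engagement = {"iphone": 0.10, "camera": 0.08, "battery": 0.065, "laptop": 0.03}
industry_engagement = {"iphone": 0.12, "foldable": 0.11, "ai": 0.09, "camera": 0.07, "laptop": 0.04}


def top_topics(engagement, n):
    """Return the n topics with the highest engagement, best first."""
    return sorted(engagement, key=engagement.get, reverse=True)[:n]


def find_content_gaps(creator, industry, n=3):
    """Industry top-n topics that are not among the creator's own top-n topics."""
    creator_top = set(top_topics(creator, n))
    return [topic for topic in top_topics(industry, n) if topic not in creator_top]


gaps = find_content_gaps(creator_engagement, industry_engagement)
print(gaps)  # ['foldable', 'ai']
```

With these sample numbers, "iphone" is strong for both the creator and the industry, so it is not a gap, while "foldable" and "ai" perform well across the industry but are absent from the creator's top topics.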

The following graphic demonstrates how the use of data science can improve creators’ ability to optimize content. The chart, which aggregates contextual data and engagement metrics from a Tech Reviews Channel, reveals that the topic “iPhone” returns the greatest number of likes and comments:

A Promising Path to Monetization  

Clearly, the creator economy has tremendous promise, but few creators have been able to turn that potential into reliable income. As our initial test using YouTube data has shown, a data-centric, machine learning-based approach can help creators identify industry trends that will improve their ability to create sought-after content, increase engagement among their existing audience members, and grow their own slice of the burgeoning creator economy.
