As user-generated content and data volumes grow, trust and safety on digital platforms is becoming increasingly critical. Unmoderated content can not only hurt brand reputation but also directly affect a business's bottom line through lost users, advertisers, and revenue. Regulators worldwide are also implementing stricter rules around content moderation, online safety, misinformation, and disinformation.
To address these growing risks, more businesses are looking to AI and machine learning as part of robust trust and safety strategies. State-of-the-art AI solutions enable unprecedented scale, nuance, consistency, and efficiency in identifying and taking action on high-risk user content.
However, businesses can face multiple challenges when implementing AI for trust and safety. These include:
Labelbox is a data-centric AI platform that enables businesses to build state-of-the-art AI solutions for enhanced controls, transparency, efficiency in content moderation, and greater brand safety. Rather than spending valuable time building an in-house solution or relying on disparate systems, businesses can explore data, use foundation models for assisted-enrichment, and evaluate models to quickly build more accurate AI systems for analyzing user behavior, detecting disinformation, and enhancing ad-targeting.
In this guide, we’ll walk through how your team can leverage Labelbox’s platform to build a model for content moderation, such as detecting and classifying disinformation, allowing you to elevate brand trust and improve the trust and safety of your applications.
Part 1: Explore and enhance your data: Google Colab Notebook
Part 2: Create a model run, fine-tune an LLM, and evaluate model performance: Google Colab Notebook
Follow along with the tutorial and walkthrough in this Colab Notebook. If you are following along, please make a copy of the notebook.
With the growing amount of user-generated content, businesses want to ensure that inappropriate content and disinformation do not spread on their platforms. To implement content moderation at scale, teams can leverage AI to detect harmful content and classify disinformation in existing data stored in a cloud bucket or a local folder.
To upload a sample of your content to Labelbox for labeling, you have a few options:
Upload a dataset through the SDK
Using the Google Colab notebook, upload the sample dataset into Labelbox or use it to import data from various sources like BigQuery, Databricks, or Snowflake.
In this notebook, we’re going to bring in two libraries of interest:
You’ll need your Labelbox API key to initiate the Labelbox Client and create a dataset. For this guide, we’ll be using a dataset stored in a Google Cloud bucket as a CSV and we can use Labelpandas to bring this data in.
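As a rough sketch of this upload step, the snippet below turns CSV rows into data row payloads and, optionally, uploads them with the Labelbox Python SDK instead of Labelpandas. The column names `id` and `content`, the dataset name, and both helper names are assumptions made for illustration, so adjust them to match your own file and SDK version.

```python
import csv
import io


def rows_to_data_rows(csv_text, text_column="content", key_column="id"):
    """Convert CSV text into a list of data row dicts.

    `text_column` and `key_column` are assumed column names; adjust them
    to match your own file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    data_rows = []
    for row in reader:
        text = (row.get(text_column) or "").strip()
        if not text:
            continue  # skip rows that have no article text
        data_rows.append({
            "row_data": text,                    # raw article text
            "global_key": str(row[key_column]),  # should be unique per data row
        })
    return data_rows


def upload_to_labelbox(data_rows, api_key, dataset_name="news-articles-sample"):
    """Upload the payloads with the Labelbox SDK (needs `pip install labelbox`)."""
    import labelbox as lb  # imported lazily so the helper above runs without it

    client = lb.Client(api_key=api_key)
    dataset = client.create_dataset(name=dataset_name)
    task = dataset.create_data_rows(data_rows)
    task.wait_till_done()
    return dataset


# Tiny made-up CSV: the second row has no content and is skipped.
sample_csv = (
    "id,headline,content\n"
    '1,Heat record,"Global temperatures hit a new high."\n'
    "2,Empty,\n"
)
payload = rows_to_data_rows(sample_csv)
print(payload)
```

Keeping the payload construction separate from the upload call makes it easy to inspect the rows before anything is sent to Labelbox.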
The provided sample dataset includes:
Upload a dataset through the UI
If your dataset is stored locally, you can upload it through the Labelbox UI by clicking "new dataset" in Catalog.
Once you’ve successfully uploaded your text, you can browse the dataset in Catalog, along with its metadata. You can visualize your data in a no-code interface to quickly pinpoint and curate data for model training.
Leverage custom and out-of-the-box smart filters and embeddings to quickly explore your articles, surface similar data, and optimize data curation for ML. You can:
Create and save data slices
If you have a search query that you’re interested in saving or reusing in the future, you can save it as a slice. You can construct a slice by using one or more filters to curate a collection of data rows. Users often combine filters to surface high-impact data and then save the results as a slice.
In this example, we are interested in saving the surfaced data rows as “Climate Articles” so that this filtered dataset can easily be surfaced later on for annotation or data discovery purposes.
1) Create a text project in Annotate
2) Sample and send your uploaded dataset as a batch to your newly created project. In this case, we can send the two dataset slices that we created: “Climate related articles” and “Non-climate related articles”.
3) Create an ontology to determine how to structure your data. If you have a previous ontology you’d like to use, you can do so. If not, you’ll need to create a new ontology. For this use case, our ontology consists of two classifications:
4) If you’re relying on an external team of labelers or want to provide your internal labeling team with more instructions, you can upload instructions as a PDF for your labelers during the ontology creation process.
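The ontology from step 3 can also be created programmatically. The sketch below assumes two Yes/No radio classifications named `climate_related` and `disinformation_flag` (names borrowed from the columns used later in this guide, not confirmed ontology names) and a hypothetical ontology name. The SDK calls sit behind a lazy import and may need adjusting for your SDK version.

```python
# Plain-Python description of the ontology: two Yes/No radio questions.
# The classification names are assumptions borrowed from the column names
# used later in this guide.
ONTOLOGY_SPEC = {
    "climate_related": ["Yes", "No"],
    "disinformation_flag": ["Yes", "No"],
}


def validate_spec(spec):
    """Check that every classification has at least two distinct options."""
    for name, options in spec.items():
        if len(set(options)) < 2:
            raise ValueError(f"{name!r} needs at least two distinct options")
    return True


def create_ontology(api_key, spec, name="Climate disinformation ontology"):
    """Create the ontology with the Labelbox SDK (needs `pip install labelbox`)."""
    import labelbox as lb  # imported lazily so validation runs without the SDK

    classifications = [
        lb.Classification(
            class_type=lb.Classification.Type.RADIO,
            name=cls_name,
            options=[lb.Option(value=option) for option in options],
        )
        for cls_name, options in spec.items()
    ]
    builder = lb.OntologyBuilder(classifications=classifications)
    client = lb.Client(api_key=api_key)
    return client.create_ontology(
        name, builder.asdict(), media_type=lb.MediaType.Text
    )


print(validate_spec(ONTOLOGY_SPEC))
```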
Now that we have a project with our data set up in Annotate, we’ll need to label this training data.
Since this project is a classification use case, we can also leverage bulk classification to speed up our labeling process and maximize labeling efficiency. Teams who have used bulk classification in Labelbox have seen labeling time decrease from a full quarter to a few days. Since we’ve leveraged filters in Catalog to identify “Climate related articles,” we can send these articles to our newly created labeling project with pre-labels.
To bulk classify and pre-label data rows, you can:
1) Highlight any data rows of interest (in our use case, the data rows in the slice "Climate related articles") and select "Manage selection" > "Add classifications"
2) Select the labeling project that you made in the previous step and choose the step of the project’s review workflow that should receive the classifications. In this example, we send them to the "Initial labeling task" because we want a labeler to verify that these are indeed all climate related articles.
3) Select the desired classification — in this case it would be "Climate related"
4) You can determine the batch’s data row priority (from 1-5) and submit the bulk classification job
Rather than labeling from scratch, a team of labelers can now simply verify or correct the pre-labels used during this bulk classification step.
With Labelbox, you can label your data in the following ways:
1) Internal team of labelers: your team can start labeling directly in the Labelbox editor, using automation tools and custom workflows to maintain quality with human-in-the-loop review.
2) External team of expert labelers with Labelbox Boost: leverage our global network of specialized labelers for a variety of tasks.
Workforce Boost provides a collaborative platform for labeling services in a self-serve manner — this is great for teams that don’t have the technical expertise to build a machine learning system yet are looking for an easy-to-use technology to get a quick turnaround on quality training data. You can learn more about our Boost offerings here.
3) Create pre-labels with foundation models
In addition to creating pre-labels for classification projects, you have the ability to send model predictions as pre-labels to your labeling project. This can be done in one of two ways:
Content moderation relies heavily on training the model on accurate and verified data. To ensure that you’re producing the most reliable and high-quality training datasets, you can customize your labeling review workflow.
You can create customizable, multi-step review and rework pipelines to drive efficiency and automation for your review tasks. Set a review task based on specific parameters that are unique to your labeling team or desired outcome.
Follow along with the tutorial and walkthrough in this Colab Notebook. If you are following along, please make a copy of the notebook.
In this part of the tutorial, we’ll be taking the ground truth labels created in Part 1 to fine-tune a large language model (LLM). From there, we’ll evaluate model performance in Labelbox Model to diagnose model strengths and weaknesses and look to continuously boost and improve model performance.
Once you have your labeled data in your project in Annotate, you’re ready to move on to creating a model run in Labelbox Model.
To create a new model, you’ll need to:
1) Navigate to the "Experiments" tab in Model. The "Experiments" tab will be where you can find all model experiments across iterations.
2) Create a new model by selecting the "New model" button.
Create a model run
Once you’ve created a new model, we will need to create a new model run.
A model run is a model training experiment — each model run provides a versioned data snapshot of the data rows, annotations, and data splits for that model run. You can upload predictions to the model run and compare its performance against other model runs in a model directory.
The model run we create will be the initial model run for our LLM fine-tuning experiment. To add a new model run:
1) Select "New model run"
2) Give the model run a name (e.g., “model run #1”)
3) Set data splits for the model run (for train, validate, and test)
4) Create the model run
After creating a model run, you’ll be able to see the corresponding data rows with ground truth populated into the appropriate train, validate, and test splits. This model run will be the gateway for us to export ground truth data to fine-tune a large language model.
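Labelbox sets the splits for you when you create the model run. If you ever need to reproduce a comparable split outside the UI, one common approach is to hash a stable key for each data row. The sketch below assumes an 80/10/10 ratio and made-up keys; neither comes from this guide.

```python
import hashlib


def assign_split(global_key, train=0.8, validate=0.1):
    """Map a data row key to 'train', 'validate', or 'test' deterministically."""
    digest = hashlib.sha256(global_key.encode("utf-8")).hexdigest()
    # Turn the first 32 bits of the hash into a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    if bucket < train:
        return "train"
    if bucket < train + validate:
        return "validate"
    return "test"


# Made-up keys for illustration.
keys = [f"article-{i}" for i in range(1000)]
counts = {"train": 0, "validate": 0, "test": 0}
for key in keys:
    counts[assign_split(key)] += 1
print(counts)  # roughly 800 / 100 / 100; exact counts depend on the hashes
```

Because the assignment depends only on the key, a data row stays in the same split across reruns, which keeps test data from leaking into training when the dataset grows.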
Model training occurs outside of Labelbox. Labelbox Model works with any model training and inference framework, major cloud providers (AWS, Azure, GCP), and any data lake (Databricks, Snowflake).
We’ll be using this Colab notebook to fine-tune a model and bring back inferences from the fine-tuned model for evaluation and diagnosis.
For this step, you will need:
Export ground truth from the model run experiment
Labelbox will return the ground truth export in a JSON format. With the provided Colab notebook, we can visualize the exported JSON into a DataFrame format for us to view corresponding ground truth for each article.
Given that we want to fine-tune a Google Vertex model with this data, we’ll need to convert the ground truth export to a GCP vertex tuning format (JSONL):
    # Build the LLM prompt and convert to the GCP Vertex tuning format (JSONL).
    # Assumes `df` is the ground truth DataFrame built earlier in the notebook.
    prompt = 'Given the following headline and content, determine whether the article is related to climate change or similar topics. Also determine whether the article contains inaccurate or disinformation. Answer in the following format with Yes/No Answers: [climate related? / disinformation?]'

    # Separate the instruction from the article text so they do not run together.
    df['input_text'] = prompt + '\n\n' + df['content']
    df['output_text'] = 'climate related: ' + df['climate_related'] + ' disinformation: ' + df['disinformation_flag']

    # Write one JSON object per line with the keys input_text and output_text.
    with open('modelPrompt_GCP.jsonl', 'w') as file:
        for _, row in df[['input_text', 'output_text']].iterrows():
            json_line = row.to_json()
            file.write(json_line + '\n')
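Before uploading the tuning file, it can be worth checking that every line is valid JSON with both fields present. The helper below is a minimal sketch of such a check; the function name and the sample lines are made up for illustration.

```python
import json


def validate_tuning_lines(lines, required=("input_text", "output_text")):
    """Return a list of (line_number, problem) tuples for malformed records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "invalid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for field in required:
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty {field}"))
    return problems


# Made-up sample lines: one valid, one missing a field, one not JSON.
sample = [
    '{"input_text": "Prompt and article", "output_text": "climate related: Yes disinformation: No"}',
    '{"input_text": "Only a prompt"}',
    "not json",
]
print(validate_tuning_lines(sample))
```

To check the real file, pass the open file object, for example `validate_tuning_lines(open('modelPrompt_GCP.jsonl'))`; an empty list means no problems were found.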
Fine-tuning is a technique whereby we take an off-the-shelf open-source or proprietary model and retrain it on a variety of concrete examples, and save the updated weights as a new model checkpoint. You can learn more about other techniques to leverage LLMs in this guide.
For this use case, we’ll be using Google Vertex AI to fine-tune an LLM with the ground truth from Part 1 of this tutorial. Once in the Vertex AI console, we’ll want to create a tuned model:
Now, we can start the model tuning process. Once the model fine-tuning job has been completed, we can head over to the Google Vertex sandbox and give the newly tuned model a prompt.
For example, we can ask if the article is climate related and if it contains disinformation and it will provide a response based on the training dataset we provided.
Now that we’ve fine-tuned a model, we can use it to make predictions on the initial dataset and compare it with our ground truth data to assess the fine-tuned model’s performance.
Create inferences with the tuned model
We’ll need to:
We can then start generating predictions from the tuned model on our news articles. Use Pandas to clean up the responses, strip out the corresponding prompts, and save the results as a DataFrame. The DataFrame contains each article’s headline and the model’s response on whether the data row is climate related or contains disinformation.
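The cleanup step might look something like the sketch below, which assumes the tuned model answers in the same `climate related: ... disinformation: ...` format used for the tuning outputs. The function name is hypothetical, and responses that do not match the format are returned as `None` so they can be reviewed by hand.

```python
import re

# Matches the answer format used in the tuning data, ignoring case.
_PATTERN = re.compile(
    r"climate related:\s*(yes|no)\s*disinformation:\s*(yes|no)",
    re.IGNORECASE,
)


def parse_response(text):
    """Extract the two Yes/No answers from a tuned-model response.

    Returns a dict of booleans, or None if the response does not follow
    the expected format.
    """
    match = _PATTERN.search(text)
    if match is None:
        return None
    climate, disinfo = (answer.lower() == "yes" for answer in match.groups())
    return {"climate_related": climate, "disinformation": disinfo}


print(parse_response("climate related: Yes disinformation: No"))
print(parse_response("I am not sure."))
```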
Once we have model inferences, we can send the inferences back to a model run in Labelbox for further evaluation and analysis.
Evaluate and diagnose model effectiveness
To evaluate the effectiveness of the fine-tuned model in Labelbox, we’ll need to:
Once that’s complete, you can hop back to the original Labelbox model run and view the corresponding ground truth data and model inferences on each data row. You can visually compare the effectiveness of the fine-tuned model predictions (in red) with ground truth (in green).
For this use case, our goal is to minimize the spread of disinformation, so we can focus on the false negatives: articles that labelers marked as "disinformation" but that the model incorrectly predicted as "not disinformation".
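Labelbox Model surfaces these metrics in the UI. To make the idea concrete, the sketch below counts the same outcomes by hand, treating "disinformation" as the positive class; the function name and the example labels are made up for illustration.

```python
def disinformation_metrics(ground_truth, predictions):
    """Count outcomes with 'disinformation' (True) as the positive class."""
    if len(ground_truth) != len(predictions):
        raise ValueError("ground_truth and predictions must have equal length")
    tp = fp = fn = tn = 0
    for truth, predicted in zip(ground_truth, predictions):
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1  # disinformation the model missed
        elif predicted:
            fp += 1
        else:
            tn += 1
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "true_negatives": tn,
        # Recall is the share of real disinformation the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else None,
        "precision": tp / (tp + fp) if (tp + fp) else None,
    }


# Made-up labels: True means the article is disinformation.
truth = [True, True, True, False, False]
predicted = [True, False, True, False, True]
metrics = disinformation_metrics(truth, predicted)
print(metrics)
```

When missed disinformation is the costliest error, recall and the false negative count are the numbers to watch across model runs.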
After running error analysis, you can make more informed decisions on how to iterate and improve your model’s performance with corrective action or targeted data selection.
Once you’ve identified an example of a corner-case where the model might be struggling, you can easily leverage Catalog to surface similar unlabeled examples to improve model performance.
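Catalog handles similarity search for you. Purely as an illustration of the general idea, the sketch below ranks candidate items by cosine similarity between embedding vectors; the toy two-dimensional vectors and the function names are assumptions, and this is not a description of how Catalog is implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(query, candidates, top_k=2):
    """Return the keys of the top_k candidates most similar to the query."""
    ranked = sorted(
        candidates,
        key=lambda key: cosine_similarity(query, candidates[key]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy embeddings: "a" points almost the same way as the query.
query_embedding = [1.0, 0.0]
unlabeled = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
print(most_similar(query_embedding, unlabeled))
```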
With new high-impact data labeled, you can retrain the model and compare model runs to track how the new data has affected model performance.
Unmoderated content poses mounting risks to businesses, from the spread of misinformation and disinformation to an unsafe online environment. With responsible implementation, businesses can leverage AI for trust and safety to efficiently and consistently identify high-risk content at scale. This not only helps create an online environment that is safe for users, but also helps protect brand reputation.
Labelbox is a data-centric AI platform that empowers teams to iteratively build powerful AI systems for content moderation and trust and safety. To get started, sign up for a free Labelbox account or request a demo.