Threads Ads Brand Safety Infrastructure

Implemented brand safety controls at Meta that enabled enterprise advertiser adoption on Threads, using ML-driven content classification and advertiser-configurable safety profiles.

Kotlin · Android · Ads SDK · Meta · Threads · Monetization · ML Classification · System Design

Threads Ads Brand Safety

Company: Meta
Surface: Threads
Role: Android Software Engineer
Impact Area: Advertiser Trust, Platform Monetization Readiness


The Problem

Launching ads on a new social platform is not just a technical problem. It is a trust problem. Every major advertiser has a version of the same question before they commit budget to a new platform: will my brand appear next to content I would not want to be associated with?

On established platforms, brand safety infrastructure is a given. Advertisers expect it. On Threads, a platform still scaling its ads product, this infrastructure needed to be built from the ground up before top-tier advertiser adoption could happen. Without it, campaigns from brand-sensitive advertisers simply could not run, because there was no contractual or technical mechanism to guarantee content adjacency controls.

I led the implementation of the brand safety system on Threads that solved this problem and enabled the platform to pursue advertiser relationships that require enterprise-grade content controls.


Why Threads Is Different

Brand safety on a text-based social surface like Threads is meaningfully different from brand safety on video platforms like Reels or YouTube.

On a video platform, content classification can operate at the media level: analyze the video, assign a safety score, cache it. The classification is relatively stable because the video itself does not change.

On Threads, content is conversational, text-based, and highly dynamic. A single post may appear harmless, but the reply thread it anchors can shift the context dramatically. Hashtags can associate a post with a controversy that the post itself does not mention. A post that is safe at the moment of classification may become unsafe hours later when a news event recontextualizes it.

The brand safety system needed to handle all of this at scale, with low latency, while staying within the ads delivery SLA.


System Architecture

The system is organized into two layers: a pre-computation classification layer and a real-time filtering layer at ad serve time.

Content Classification Pipeline. Threads posts are processed asynchronously through a text-based content safety model. The model assigns each post a continuous safety score between 0 and 1 and a set of categorical flags across six dimensions: violent content, hate speech, adult content, political sensitivity, controversy potential, and potential misinformation. These classification records are stored per post with a short TTL, ensuring freshness while keeping the critical path at ad serve time free of inline ML inference.
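The shape of the per-post record described above can be sketched roughly as follows. This is an illustrative model only, assuming the six categorical dimensions named above; the class name, field names, and the specific TTL value are hypothetical, not the production schema.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;

// Hypothetical per-post classification record: a continuous safety score,
// categorical flags across the six dimensions, and a freshness timestamp.
public class SafetyRecord {
    public enum SafetyCategory {
        VIOLENT_CONTENT, HATE_SPEECH, ADULT_CONTENT,
        POLITICAL_SENSITIVITY, CONTROVERSY_POTENTIAL, MISINFORMATION
    }

    // Illustrative TTL; the actual freshness window is not specified here.
    public static final Duration TTL = Duration.ofMinutes(30);

    public final String postId;
    public final double safetyScore;         // continuous score in [0, 1]
    public final Set<SafetyCategory> flags;  // categorical flags that fired
    public final Instant classifiedAt;

    public SafetyRecord(String postId, double safetyScore,
                        Set<SafetyCategory> flags, Instant classifiedAt) {
        this.postId = postId;
        this.safetyScore = safetyScore;
        this.flags = Set.copyOf(flags);
        this.classifiedAt = classifiedAt;
    }

    // A record is only usable at serve time while it is still fresh;
    // anything older than the TTL is treated as missing.
    public boolean isFresh(Instant now) {
        return Duration.between(classifiedAt, now).compareTo(TTL) <= 0;
    }
}
```

Keeping the record small and immutable matters because it sits on the ad-serving read path: the serve-time filter only reads precomputed records and never invokes the model inline.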

Ads Delivery Integration. When an ad request is processed, an adjacency context builder fetches the content safety records for the posts surrounding the candidate ad slot. A brand safety filter then evaluates those records against the requesting advertiser's brand safety profile. Only ads whose adjacency context satisfies the profile rules pass through to delivery.


Advertiser Brand Safety Profile

One of the most important design decisions was making the brand safety controls advertiser-configurable rather than platform-level-only. Different advertisers have legitimately different needs.

The brand safety profile includes three levels of control:

Blocked content categories. Advertisers select which of the six safety categories they want to block adjacency for. A children's brand might block all six. A news organization might only block hate speech and violent content.

Custom keyword blocklist. Advertisers can provide a list of specific keywords or phrases that trigger a block regardless of the ML classification score. This handles edge cases that the ML model may not capture, such as brand-specific sensitivities or current events not yet represented in the training data.

Sensitivity level. A three-tier setting (Strict, Standard, Expanded) that controls how conservatively the filter interprets the adjacency rules. Strict applies blocklist rules to a wider window of adjacent posts. Expanded tolerates more borderline adjacency for advertisers who prioritize reach over maximum safety conservatism.

These profile settings are configured once at the campaign level and enforced automatically across every ad serve for that campaign, with no per-impression advertiser action required.
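The three levels of control above can be combined into a single evaluation, sketched below. Everything here is an assumption for illustration: the class and method names, the concrete window sizes per sensitivity tier, and the string-based category flags are placeholders, not Meta's API.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical campaign-level brand safety profile with the three controls
// described above: blocked categories, keyword blocklist, sensitivity level.
public class BrandSafetyProfile {
    public enum Sensitivity { STRICT, STANDARD, EXPANDED }

    public final Set<String> blockedCategories; // subset of the six categories
    public final Set<String> keywordBlocklist;  // lowercase phrases
    public final Sensitivity sensitivity;

    public BrandSafetyProfile(Set<String> blockedCategories,
                              Set<String> keywordBlocklist,
                              Sensitivity sensitivity) {
        this.blockedCategories = blockedCategories;
        this.keywordBlocklist = keywordBlocklist;
        this.sensitivity = sensitivity;
    }

    // Posts inspected on each side of the ad slot; Strict widens the window,
    // Expanded narrows it. The concrete values are illustrative.
    public int adjacencyWindow() {
        switch (sensitivity) {
            case STRICT:   return 5;
            case EXPANDED: return 1;
            default:       return 3;
        }
    }

    // An adjacent post blocks the ad if it carries a blocked category flag,
    // or if its text contains any custom blocklist phrase regardless of the
    // ML classification.
    public boolean blocksAdjacency(Set<String> postFlags, String postText) {
        for (String flag : postFlags) {
            if (blockedCategories.contains(flag)) return true;
        }
        String text = postText.toLowerCase(Locale.ROOT);
        for (String phrase : keywordBlocklist) {
            if (text.contains(phrase)) return true;
        }
        return false;
    }
}
```

Note the keyword check runs independently of the category flags, matching the rule above that the blocklist triggers regardless of the ML classification score.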


The Critical Design Decision: Safety-First Default

The most important engineering decision in the entire system was what happens when classification data is unavailable.

There are several scenarios where a content safety record may be missing at ad serve time: a cache miss, a classifier processing delay for a very new post, or a classification service timeout. In each of these cases, the system has two choices: serve the ad anyway, or withhold it.

I designed the system to always withhold the ad when safety data is uncertain or missing.

This is a conservative stance that sacrifices a small fraction of impression volume in edge cases. But the alternative is to occasionally serve an ad in an unsafe context, which is a qualitatively different kind of failure. An impression lost to a safety withhold is invisible to the advertiser. An ad that serves next to genuinely unsafe content is a brand safety incident that can damage the advertiser relationship and the platform's reputation with advertisers broadly.

At the platform level, advertiser trust is more valuable than the marginal impression volume recovered by serving in ambiguous cases.
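The fail-closed rule reduces to a small decision function, sketched below. The record store and its method are stand-ins for the real lookup path; the key point is that an empty lookup result (cache miss, classifier lag, or timeout) takes the same branch as an unsafe result.

```java
import java.util.List;
import java.util.Optional;

// Minimal sketch of the safety-first serve decision: if any adjacent post
// lacks a fresh safety verdict for this advertiser, withhold the ad.
public class ServeDecision {
    public enum Decision { SERVE, WITHHOLD }

    // Stand-in for the record lookup path. Empty means the safety record is
    // missing: a cache miss, a classifier processing delay, or a timeout.
    public interface RecordStore {
        Optional<Boolean> isSafeFor(String postId, String advertiserId);
    }

    public static Decision decide(List<String> adjacentPostIds,
                                  String advertiserId, RecordStore store) {
        for (String postId : adjacentPostIds) {
            Optional<Boolean> safe = store.isSafeFor(postId, advertiserId);
            // Safety-first default: uncertainty takes the same branch as
            // a confirmed-unsafe result. Never serve on missing data.
            if (safe.isEmpty() || !safe.get()) {
                return Decision.WITHHOLD;
            }
        }
        return Decision.SERVE;
    }
}
```

Collapsing "missing" and "unsafe" into one branch is what makes the system fail closed: no code path can serve an ad without a positive, fresh safety verdict for every adjacent post.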


Instrumentation

I instrumented the full filtering pipeline with metrics covering classification coverage rate (the percentage of adjacent posts with a valid, fresh classification record), filter trigger rate by category (which safety categories most frequently cause blocks), impression withhold rate by advertiser sensitivity level, and false positive rate measured against a human-labeled evaluation set used for model quality tracking.

These metrics drove two improvement loops: continuous retraining of the content classifier using confirmed false positive and false negative examples, and recalibration of the sensitivity level thresholds as the training data matured.
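The first two rate metrics reduce to simple counter roll-ups, sketched below. Counter and method names are assumptions for illustration, not the production metric names.

```java
// Illustrative roll-up for two of the pipeline metrics described above:
// classification coverage rate and impression withhold rate.
public class SafetyMetrics {
    private long adjacentPostsSeen, adjacentPostsWithFreshRecord;
    private long adsRequested, adsWithheld;

    // Called once per adjacent post inspected at serve time.
    public void recordPost(boolean hasFreshRecord) {
        adjacentPostsSeen++;
        if (hasFreshRecord) adjacentPostsWithFreshRecord++;
    }

    // Called once per candidate ad serve.
    public void recordServe(boolean withheld) {
        adsRequested++;
        if (withheld) adsWithheld++;
    }

    // Fraction of adjacent posts with a valid, fresh classification record.
    public double classificationCoverageRate() {
        return adjacentPostsSeen == 0 ? 0.0
            : (double) adjacentPostsWithFreshRecord / adjacentPostsSeen;
    }

    // Fraction of candidate serves withheld by the safety filter.
    public double withholdRate() {
        return adsRequested == 0 ? 0.0
            : (double) adsWithheld / adsRequested;
    }
}
```

In practice, coverage rate and withhold rate move together: because the system fails closed, any drop in classification coverage shows up directly as a rise in withheld impressions.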


Outcome

The Threads Ads Brand Safety system enabled the platform to offer contractually backed brand safety guarantees to top-tier advertisers, a prerequisite for enterprise campaign adoption that did not exist before this project. It established the classification and enforcement infrastructure that now supports the full Threads ads product at scale, and it set the standard for how brand safety is handled across future Meta surface launches.