When AI Companies Police Themselves

  • ShiftQuality Contributor
  • Feb 3
  • 7 min read

There's a theory that AI companies, given the right incentives and enough public pressure, will regulate themselves effectively. They'll create safety teams. They'll establish ethics boards. They'll make voluntary commitments. They'll do the right thing because it's good business, because their employees demand it, or because they genuinely believe in responsible development.

This theory has been tested. The results are in.

The Voluntary Commitment Era

In July 2023, the White House secured voluntary commitments from seven major AI companies: Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI. The commitments covered safety testing, information sharing, watermarking AI-generated content, prioritizing research on societal risks, and developing AI to address major challenges.

These commitments were announced with ceremony. They were covered as meaningful progress. And they were voluntary, which means unenforceable. No penalties for non-compliance. No independent verification that commitments were being met. No mechanism for accountability when they weren't.

Some of those commitments have been honored in some form. Others have quietly faded. The watermarking commitment is a useful example: despite being one of the most concrete pledges, implementation across the industry remains inconsistent and easily circumvented. Companies watermark some outputs in some contexts, but there's no standard, no verification, and no consequence for gaps.

Voluntary commitments work when the cost of compliance is low and the reputational benefit is high. They fail when compliance requires sacrificing competitive advantage. In AI, the most important safety decisions are exactly the ones where compliance is most costly.

Safety Teams That Get Overruled

Most major AI companies have established internal safety teams. These teams do real work. They conduct red-teaming exercises, evaluate model capabilities, identify potential harms, and recommend mitigations. The problem isn't that these teams don't exist or that they lack talented people. The problem is their organizational authority.

A safety team that can delay a release is fundamentally different from a safety team that can recommend delaying a release. The distinction matters because when safety concerns conflict with business objectives, the question of who makes the final decision determines the outcome.

The pattern we've seen repeatedly: a safety team raises concerns, the concerns are acknowledged, and then a business decision is made to proceed anyway, sometimes with partial mitigations, sometimes with a promise to address the issue post-launch, sometimes with a reframing of the concern as less serious than the safety team believed.

This isn't a failure of individual people. It's a structural problem. When the safety function reports to the same leadership that's evaluated on product velocity and market share, safety will lose the close calls. Not every time. But often enough that the pattern is unmistakable.

Several high-profile departures from AI companies' safety teams have confirmed this dynamic publicly. Researchers who left have described environments where safety work was valued in principle but deprioritized in practice, where raising persistent concerns was career-limiting, and where the pressure to ship outweighed the pressure to ship safely.

Ethics Boards: A Case Study in Symbolic Governance

AI ethics boards have followed a pattern that's worth examining in detail because it illustrates the fundamental problem with self-regulation.

A company announces an ethics board with external members. The announcement generates positive coverage. The board is presented as evidence that the company takes ethical considerations seriously. Then one of several things happens.

The board issues recommendations that conflict with business plans, and the board is restructured or disbanded. Or the board's scope is narrowly defined so that it can only advise on questions the company is already comfortable addressing. Or the board operates without real authority: its recommendations are non-binding, and the company selectively implements the ones that don't cost anything.

Google's removal of its Ethical AI team's leadership, the disbanding of various advisory boards across the industry, and the pattern of ethics researchers being pushed out when their findings were inconvenient are not isolated incidents. They're the predictable outcome of asking a profit-driven organization to create an internal body whose purpose is to sometimes say "no" to profitable activities.

This isn't a commentary on the character of the people running these companies. It's a commentary on incentive structures. Effective oversight requires independence from the entity being overseen. Internal ethics boards are, by design, not independent.

The Fox and the Henhouse

The self-regulation argument implicitly asks us to believe that AI companies will prioritize public safety over shareholder value when the two conflict. This belief has essentially no support from the history of any industry, anywhere, ever.

Consider the precedents.

Tobacco. The tobacco industry funded research, created advisory panels, and made voluntary commitments for decades. Internal documents later revealed that these efforts were designed to create doubt and delay regulation, not to protect public health.

Financial services. Before 2008, the financial industry argued that self-regulation and market discipline were sufficient to prevent systemic risk. Internal risk management teams existed at every major bank. They were overruled by trading desks that were generating enormous profits.

Social media. Tech companies spent years arguing that self-regulation was sufficient to address content moderation, misinformation, and user safety concerns. Internal research documenting harms was sometimes suppressed or downplayed.

Pharmaceuticals. Before the modern regulatory framework, pharmaceutical companies marketed products with limited safety testing. The thalidomide disaster demonstrated the cost of insufficient oversight and led to stricter drug-approval rules, including the 1962 Kefauver-Harris Amendment in the United States, that underpin the regulations we have today.

The pattern is consistent across industries and decades. When self-regulation is the only check on behavior, and when compliance has real costs, self-regulation fails. Not because business leaders are uniquely bad people, but because organizations optimize for their stated objectives, and the stated objective of a corporation is not public safety.

The "We'll Slow Down" Promise

A particular version of self-regulation in AI is the promise to slow down or pause development if safety concerns arise. Some companies have made explicit statements about their willingness to halt progress for safety.

Evaluate this claim against the competitive landscape. The major AI labs are in an intense capability race. Billions of dollars of investment are predicated on continued progress. Market position depends on being at or near the frontier. Employee compensation, through equity, is tied to company valuation, which is tied to perceived capability leadership.

In this environment, what would it actually take for a company to voluntarily slow down? The safety concern would need to be severe enough to justify the competitive cost. The evidence would need to be unambiguous. And the company would need to believe that competitors would also slow down; otherwise, it would be unilaterally surrendering market position.

These conditions are almost never simultaneously met. The result is that "we'll slow down if needed" becomes "we'll slow down if the danger is existential and undeniable and everyone else also slows down," which in practice means "we won't slow down."

Some companies have delayed specific releases for safety reasons, and that's worth acknowledging. But there's a difference between delaying a release by a few weeks to fix a specific issue and fundamentally slowing the pace of capability development in response to broader safety concerns. The former happens. The latter, so far, hasn't.

What Effective Oversight Actually Requires

If self-regulation doesn't work, what does? The answer isn't complicated, but it is difficult to implement.

Independence

The entity doing the oversight cannot be funded by, employed by, or dismissible by the entity being overseen. This is the most basic requirement and the one that internal safety teams and ethics boards structurally cannot meet.

Independent oversight means external auditors with legal authority to access systems, data, and personnel. It means regulators with technical capacity to evaluate claims. It means funding structures that don't create dependence on the regulated industry.

Authority

Oversight without authority is theater. The oversight body needs the ability to require changes, delay releases, impose penalties, and if necessary, prohibit specific activities. Advisory recommendations that a company can ignore are not oversight.

This is where regulation becomes necessary. Only government authority (or government-delegated authority) carries the enforcement power needed to make oversight meaningful.

Technical Capacity

AI systems are technically complex, and effective oversight requires the ability to understand that complexity. This is a genuine challenge. Regulatory bodies need to attract and retain people who understand how these systems work, and they're competing for talent with the companies they're regulating.

Some approaches to this challenge include rotating technical experts between industry and regulatory roles (with appropriate cooling-off periods), funding academic research that supports regulatory capacity, and creating specialized technical bodies like the national AI safety institutes established in the UK and the US.

Transparency Requirements

Oversight works best when it operates on good information. Companies should be required to disclose training data documentation, safety testing results, known failure modes, significant incidents, and the governance processes behind release decisions.

This disclosure should be structured and standardized so that it can be meaningfully compared across companies, not free-form narratives that each company crafts to present itself in the best light.
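
To make "structured and standardized" concrete, here is a minimal sketch in Python of what a common disclosure record might look like. Everything in it is an assumption for illustration: the `ModelDisclosure` class, its field names, and the `missing_fields` helper are hypothetical and do not come from any existing regulation or company filing. The fields simply mirror the disclosure items listed above.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ModelDisclosure:
    """Hypothetical standardized disclosure record (illustrative only)."""
    company: str
    model_name: str
    training_data_documentation: str   # summary of or pointer to data documentation
    safety_test_results: List[str]     # one entry per evaluation performed
    known_failure_modes: List[str]
    significant_incidents: List[str]
    release_governance_process: str    # who signed off on release, and how


def missing_fields(disclosure: ModelDisclosure) -> List[str]:
    """Return the names of fields left empty, in declaration order.

    Assumption: an empty string or empty list counts as "not reported".
    A real scheme would need a way to distinguish "none" from "not reported".
    """
    return [name for name, value in asdict(disclosure).items() if not value]


# Example with made-up placeholder values.
example = ModelDisclosure(
    company="ExampleCo",
    model_name="example-model-1",
    training_data_documentation="Data sheet v1",
    safety_test_results=["Red-team exercise completed"],
    known_failure_modes=[],  # left empty, so it is flagged below
    significant_incidents=["Incident report filed"],
    release_governance_process="Reviewed by release committee",
)
print(missing_fields(example))
```

Because every company would fill in the same fields, a reviewer could line up two disclosures side by side and see immediately what one of them left out, which a free-form narrative does not allow.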

The Industry's Counter-Arguments

The AI industry makes several arguments against external regulation in favor of self-regulation. These arguments deserve honest engagement.

"Regulation will slow innovation." This is true. Regulation imposes costs and constraints that slow some activities. The question is whether the activities being slowed include harmful ones, and whether the benefit of preventing harm outweighs the cost of slower development. For high-risk applications, the answer is clearly yes. For low-risk applications, proportional regulation can minimize the burden.

"Regulators don't understand the technology." This is partially true and partially a self-serving argument. Regulatory technical capacity is a real challenge, but it's a solvable one, and the alternative, trusting the industry to regulate itself, has failed. Imperfect external regulation is better than absent regulation for systems that affect millions of people.

"Regulation will push development overseas." This argument has been made by every industry facing regulation in the history of regulation. Sometimes it's true at the margins. But the largest AI companies are deeply embedded in the economies and talent markets of the US and EU. The idea that Meta or Google will move their AI development to an unregulated jurisdiction to avoid compliance costs is not realistic.

"We need flexibility to address rapidly changing technology." This is the strongest argument. Technology-specific regulation risks becoming obsolete quickly. The solution is principles-based regulation that sets outcomes and requirements rather than prescribing specific technical implementations, combined with delegated authority to technical bodies that can update specific requirements as technology evolves.

Where We Are Now

The current state of AI governance is a mixture of emerging regulation and continued self-regulation. The EU AI Act is the most significant external oversight framework. US regulation remains fragmented. International coordination is embryonic.

Within this landscape, self-regulation continues to play a role, but its role should be understood clearly. Company safety teams, responsible AI practices, and voluntary commitments are useful supplements to regulation. They are not substitutes for it.

The AI companies that are genuinely committed to responsible development should welcome effective external oversight, because it levels the playing field. When every company is held to the same standard, no company loses competitive advantage by doing the right thing. Self-regulation, by contrast, punishes the most responsible actors and rewards those willing to cut corners.

The question isn't whether the AI industry should be regulated. The historical record across every comparable industry answers that question definitively. The question is how to design regulation that is effective, technically informed, proportional, and adaptable. That's a hard problem. But it's a much better problem to be solving than the one we get if we continue to let the industry police itself.
