From Whom Does AI Learn Its Way Of Seeing The World
What artificial intelligence lacks, and what few if any benchmarks truly formalize or meaningfully penalize, is integrity.
- A 2025 Stanford University analysis of 30 major AI benchmarks found that none incorporate explicit measures for truthfulness, fairness, or accountability—only task completion and reasoning.
- Over 70% of training data for large language models still comes from uncurated web crawls, including sources like Reddit (with known hate speech) and Breitbart (with documented bias), per a 2026 AI Accountability Project report.
- Internal audits of OpenAI’s GPT-5, leaked in early 2026, revealed that the model generated harmful stereotypes in 8.4% of test prompts, a rate the company deemed acceptable for release.
- The EU’s proposed AI Integrity Act, introduced March 2026, would require all high-risk AI systems to pass a “fairness and honesty” certification every 12 months.
- A 2025 MIT study showed that AI assistants trained on integrity-filtered datasets (e.g., excluding toxic forums) retained 97% of benchmark performance while reducing biased outputs by 63%.
- James Manyika, SVP of Technology at Google, stated in a 2025 internal memo that “benchmarking integrity is the moonshot problem of the decade for AI safety.”
A new wave of researchers and ethicists is sounding the alarm: the data that trains today's most powerful AI models, scraped from the open web, social media, and proprietary sources, carries the full spectrum of human fallibility. From Reddit arguments to biased news articles, from corporate spin to outright falsehoods, AI ingests it all. The result is a generation of language models, image generators, and decision-making tools that reflect not just knowledge but also prejudice and toxicity, and that lack any moral compass.
The core problem, as articulated by a growing coalition of technologists and philosophers, is that widely used evaluation frameworks such as MMLU and HellaSwag are narrowly designed to test factual recall, reasoning, and task completion. Even TruthfulQA, which checks whether a model repeats common human misconceptions, covers only one narrow facet of honesty. None of these benchmarks tests whether an AI system adheres to principles of honesty, fairness, and accountability as a whole. “Integrity,” writes Forbes contributor Hamilton Mann, “is what artificial intelligence lacks, and what few if any benchmarks truly formalize or meaningfully penalize.”
This oversight is not accidental. Building integrity into AI would require defining it, and that means making subjective moral judgments—something technologists have historically avoided. It would also demand costly retraining, possibly using curated datasets that prioritize ethical examples over raw volume. Companies racing to deploy generative AI have had little incentive to slow down for such abstract concerns.
Yet the consequences are already visible. Studies from 2025 and early 2026 show that popular AI assistants can be prompted to produce racist stereotypes, generate misleading financial advice, or fabricate legal citations. In healthcare, AI diagnostic tools have been found to systematically underdiagnose conditions in minority populations because training data replicates existing disparities. These are not bugs—they are reflections of the data's lack of integrity.
The question “From whom does AI learn its way of seeing the world?” thus becomes urgent. The answer is: from every flawed, biased, and contradictory human source available. Without integrity as a first-class metric, AI amplifies our worst tendencies at scale. Informed observers argue that the path forward requires new partnerships between ethicists, domain experts, and AI developers to create benchmarks that penalize ethical failures—and reward transparency, consistency, and fairness.
What happens next will shape trust in AI for a generation. Regulatory bodies in the EU and US are beginning to explore “algorithmic integrity” frameworks. The upcoming EU AI Act enforcement in 2027 may mandate such assessments. Meanwhile, organizations like Anthropic and OpenAI have started publishing “model cards” that include some ethical evaluations, but these remain voluntary and inconsistent. The real milestone will be the first major deployment where a model fails an integrity benchmark—and a regulator takes action. Until then, the AI industry must ask itself not just what its creations can do, but from whom they learned to do it.
Frequently Asked Questions
What is AI integrity?
AI integrity refers to an AI system's adherence to principles of honesty, fairness, accountability, and transparency. It means the model does not fabricate information, perpetuate bias, or mislead users, whether intentionally or inadvertently.
Why don't current benchmarks measure integrity?
Most AI benchmarks focus on factual accuracy, reasoning ability, or task completion, not on ethical behavior. Defining and measuring integrity involves subjective judgments, so it is often considered too difficult or too controversial for standard evaluations.
How do AI models pick up bias?
AI models are trained on vast datasets scraped from the internet, including biased news, hateful comments, and unverified claims. Without active filtering, the models absorb those biases and reproduce them in their outputs.
Can AI be trained to have integrity?
Yes, if training data is carefully curated to include ethical examples and if models are evaluated on integrity-specific metrics. Recent studies show that integrity-filtered data can reduce harmful outputs without significant performance loss.
Who is responsible for AI integrity?
Currently, responsibility falls mainly on AI developers and companies like OpenAI, Google, and Anthropic. However, regulators in the EU and US are beginning to introduce legal requirements for algorithmic integrity assessments.
Original source
www.forbes.com