Artificial intelligence (AI) promises to revolutionize scientific discovery, but its integration requires a thoughtful, cautious approach. Renowned mathematician Terence Tao, a Fields Medalist, warns that while AI can dramatically accelerate certain aspects of research, true scientific breakthroughs still demand human rigor, critical verification, and a deep understanding of the discovery process. He emphasizes that the path to genuine scientific insight with AI is not just about speed, but about patience, careful planning, and a renewed commitment to foundational research principles.
The Promise and Peril of AI in Science
AI tools are undeniably transforming research across disciplines. From mathematics to coding, AI is making significant inroads, often performing tasks that once required extensive human effort. However, Tao highlights a fundamental shift: AI drives the cost of idea generation to near zero, yet in doing so, it shifts the primary bottleneck to the verification and evaluation of these ideas. This critical insight underpins his call for a balanced perspective.
AI: Idea Generator, Human: Verifier
Tao draws a compelling analogy between the automobile’s impact on cities and AI’s effect on scientific practice. Just as early cars overwhelmed urban roads built for pedestrians, AI-assisted proofs can be incredibly efficient. However, they often clash with existing scientific infrastructure designed for human interaction. Traditional journals, conferences, and mentoring systems thrive on human-generated proofs that offer valuable byproducts. These byproducts include researchers developing expertise, mapping intellectual terrain, identifying new directions, and documenting instructive dead ends. AI-generated proofs, while fast, often bypass these crucial human-centric narratives, making them unsuitable for traditional publication without significant human refinement.
Tao’s personal experience reflects this nuanced view. He notes that AI has made his work “richer and broader” by enhancing literature research, graphics generation, and coding assistance. Yet, the core of his mathematical work still relies on traditional pen and paper. AI doesn’t drastically speed up the fundamental mathematical insights but unlocks new possibilities by improving auxiliary elements, ultimately making the entire paper assembly process more efficient.
AI’s Remarkable Feats: From Olympiad to Erdős Problems
Recent years have seen AI achieve impressive milestones in mathematics, including Google DeepMind’s Gemini Deep Think reaching gold-medal-level performance in the International Mathematical Olympiad (IMO). AlphaGeometry and AlphaProof also secured silver medals, showcasing AI’s growing prowess. AlphaEvolve further broke a 50-year-old record in matrix multiplication, discovering a more efficient method. OpenAI’s GPT-5.2 Pro even “autonomously” solved an Erdős Problem, generating a new proof for a tightened version.
Beyond Brute Force: Hybrid Systems and the “Long Tail”
These achievements are a “real milestone” that arrived “earlier than expected,” according to Tao. Early large language models (LLMs) often struggled with basic math, producing “hallucinations” and fabricating references. The key breakthrough involved a paradigm shift towards hybrid systems that integrate symbolic logic, algebraic search, and formal verification. These systems act as collaborators, proposing conjectures and checking steps. This partnership allows AI to “explore the long tail” of numerous overlooked mathematical problems, triaging routine cases and highlighting genuinely difficult ones for human experts.
However, Tao cautions against overinterpreting these successes. He explains that Erdős problems vary significantly in difficulty, and many simpler ones simply haven’t been systematically studied. He estimates that only one to two percent of currently open mathematical problems are simple enough for contemporary AI tools to solve with minimal human intervention. For more complex problems, human-AI collaboration remains indispensable, with humans providing strategic guidance and AI handling specific calculations or proof components. AI, while fast, excels in text generation and manipulation, rather than autonomously solving the most challenging mathematical enigmas.
The Unseen Costs: AI’s Threat to Research Apprenticeship
While generative AI offers significant productivity gains, its widespread adoption poses a serious threat to the fundamental apprenticeship model of PhD education. A PhD is more than cheap research labor; it’s a vital training ground for future research leaders, teaching crucial skills like formulating the right questions, critically evaluating findings, and accepting full responsibility for scientific output.
Preserving the Human Element in Learning and Discovery
One researcher, focusing on computer system security, leveraged generative AI to accomplish the equivalent of a year’s PhD student work in just six weeks. While remarkable, this efficiency came with a “cost to learning” for students. No student was taught how to differentiate a worthwhile research problem, rigorously test hypotheses, contextualize research, or manage time effectively. Effective researchers develop awareness of knowledge gaps, healthy skepticism, and a strong sense of intellectual responsibility – attributes that AI is “poorly placed” to cultivate. Integrating generative AI in academia must proceed with extreme caution to avoid undermining this established apprenticeship model.
Navigating the AI Frontier: Tao’s Guidelines for Responsible Integration
As AI’s role deepens, ensuring it does not compromise the rigor and value of academic papers becomes paramount. Terence Tao has proposed clear guidelines for integrating AI responsibly into mathematical research, encompassing LLMs, neural networks, satisfiability solvers, and proof assistants.
Transparency: Declaring AI Use
All substantial applications of AI, beyond basic functions like spell-checking, must be explicitly declared within research papers. This transparency is crucial for maintaining academic integrity and allowing readers to understand the tools used in the research process.
Mitigating Risks: Hallucinations, Reproducibility, Verifiability
Authors must discuss the general risks posed by AI tools and detail the strategies employed to mitigate them. Common risks include:
Fabricated Content (“Hallucinations”): AI may produce false references or proofs. Tao advises avoiding AI-generated text in the main body or clearly marking it if used.
Lack of Reproducibility: Results from proprietary or computationally expensive AIs can be hard to verify. Open-sourcing prompts, workflows, and certified data can help.
Lack of Interpretability: AI outputs are often opaque. Human-written, readable content, such as a non-formal proof alongside an AI-generated formal proof, should accompany each AI output.
Lack of Verifiability: AI can embed subtle errors. Formal verification, consistency checks, and a multi-level approach are recommended, with clear marking of verified and unverified parts.
Improper Formalization of Goals: AI might precisely solve “misaligned” goals. Formalized goals should be obtained from independent sources or thoroughly human-reviewed.
Exploitation of Loopholes: AI may exploit weaknesses in formalized statements. Listing known loopholes and discussing exclusion mechanisms are crucial countermeasures.
Bugs in AI-Generated Code: AI-generated code can have hidden bugs. Extensive unit tests, external verifications, or limiting AI to simple scenarios with human oversight for complex tasks are advised.
Unwavering Responsibility: The Human Factor
Ultimately, all paper authors bear full responsibility for AI-contributed content, including any inaccuracies or omissions. This holds true unless the content is explicitly marked as “unverified.” This principle reinforces the idea that AI is a tool, and humans are the ultimate arbiters of scientific truth and accountability. Tao also cautions against using AI as a primary screening tool for paper review, suggesting it should only assist human reviewers to avoid “Goodhart’s law,” where AI might find loopholes to bypass review.
The Future of Human-AI Collaboration: A Postulator-Verifier Partnership
The path forward for AI in scientific discovery is not one of replacement, but of collaboration. Researchers like Lior Horesh of IBM Research describe this as a “postulator–verifier partnership,” where humans define desired properties, and AI proposes and checks candidate structures. AI can offer “scale and stamina,” while humans provide the “melody, taste, and direction.” AI can curate vast mathematical knowledge and develop complex theoretical frameworks that exceed individual human cognitive capacity.
AI as an “Extension,” Not a Replacement
Terence Tao believes AI will expand the types of problems humans can tackle by handling the “long tail” of issues that don’t require deep creativity. He views AI as a “mirror,” generating ideas that still require human expertise to discern their validity. While proof assistants allow language models to generate verifiable proofs, AI can still “cheat really badly” by adding axioms to make false statements true, illustrating the difference between “wishful thinking at scale” and true reasoning. AI “doesn’t understand beauty. That part is still ours.”
Both Tao and Horesh maintain that current AI systems perform pattern recognition rather than classical reasoning with transparency and reliability. Mathematics has always advanced by understanding failure, and AI tools need to “fail rather visibly” to allow for learning. The critical question remains: “If an AI produces a proof that no one can follow, does that count as knowledge?” For mathematicians, understanding why something is true is as critical as knowing that* it is true, making “black-box proofs” more akin to “engineering with symbols.” Real progress will stem from collaborative efforts where machines assist in searching and testing, but humans retain the vital role of defining and recognizing genuine discovery.
Frequently Asked Questions
What are Terence Tao’s main concerns about AI in scientific discovery?
Terence Tao’s primary concerns center on the shift from idea generation to verification as AI’s bottleneck. He worries that current scientific infrastructure is ill-equipped for AI-generated proofs, which often lack the human-centric narrative and byproducts crucial for learning and developing expertise. He also highlights risks like AI hallucinations, lack of reproducibility, and the ethical dilemma of authorship and accountability, stressing that AI needs “patience” and human oversight to prevent compromising rigor.
How can researchers responsibly integrate AI tools into their work, according to Tao?
Tao advocates for responsible AI integration through several key guidelines. Researchers should explicitly declare all substantial AI use in their papers. They must discuss potential AI risks, such as fabricated content, lack of reproducibility, and opaque outputs, alongside specific mitigation strategies. Crucially, authors bear full responsibility for all AI-contributed content, reinforcing that AI is a tool requiring human accountability and critical verification.
Should aspiring researchers rely heavily on AI for their PhD studies?
While AI offers significant productivity gains, relying too heavily on it for PhD studies poses a serious threat to the apprenticeship model of research education. AI may accelerate practical work, as seen in one example where a year’s work was done in weeks. However, it can bypass crucial learning stages, failing to teach students how to formulate research questions, critically evaluate findings, develop skepticism, or accept intellectual responsibility. Experts like Tao and others caution against its widespread use without careful consideration for preserving the human element in learning and discovery.
Conclusion
Terence Tao’s insights offer a crucial compass for navigating the rapidly evolving landscape of AI in scientific discovery. While AI undoubtedly offers unprecedented speed in idea generation and data processing, it underscores the enduring importance of human verification, critical thinking, and ethical responsibility. The integration of AI into science must be approached not as a replacement for human intellect, but as a powerful extension, thoughtfully guided by a commitment to transparency, rigorous oversight, and the preservation of the essential human elements of discovery and learning. By embracing a strategy of “AI planning” and fostering a “postulator-verifier partnership,” the scientific community can harness AI’s transformative power while safeguarding the integrity and depth of future breakthroughs. This balanced perspective ensures that AI serves as a true accelerator of knowledge, not merely a shortcut to unverified information.