Artificial intelligence is making strides in mathematical research, demonstrating abilities that both impress and challenge human experts. While AI has solved some complex problems, leading mathematicians emphasize that human judgment, intuition, and the ability to frame meaningful questions remain essential to the field.

In May, OpenAI announced that its AI model had disproved a longstanding conjecture posed by Paul Erdős in 1946 concerning the number of pairs of points in a plane separated by the same distance. Erdős, renowned for posing hundreds of challenging problems, offered a financial reward for a proof or disproof of the conjecture, which remained unresolved for over 80 years. Princeton mathematician Noga Alon described the AI’s breakthrough as a “spectacular solution,” underscoring the growing impact of AI on mathematics research. Thomas Bloom, a research mathematician at the University of Manchester, likened AI’s role to helping “more fully explore the cathedral of mathematics.”

Sébastien Bubeck, a mathematician and researcher at OpenAI, noted that the solution emerged from a general AI model that was not specifically fine-tuned for mathematical problem-solving and required no human input. While he acknowledged AI’s advantage in managing the mental complexity of proofs, Bubeck also highlighted a key limitation: the models do not understand the broader context and motivation behind the problems they address. “I’m trying to not only solve this problem … but it’s part of a broader program. And these models don’t have broader agendas,” he said.

Rodrigo Ochigame, an anthropologist and historian of computing at Leiden University, suggested that the role of mathematicians will evolve rather than disappear. He anticipated a shift toward setting research directions, developing new techniques, cultivating insight, and linking specific problems to larger conceptual frameworks in mathematics and beyond.

Despite these advances, many in the mathematical community remain cautious. They note that AI solutions often require careful verification and that the technology can sometimes produce incorrect or incomplete results. Martin Hairer, a Fields Medal-winning mathematician affiliated with EPFL and Imperial College London, pointed out the difficulty of trusting AI-generated proofs, describing them as lacking the clarity and honesty characteristic of human-written proofs.

In response, the First Proof project was established to provide a rigorous benchmark for AI’s performance on mathematical problems. By presenting a set of previously solved but unpublished problems, the initiative aims to gauge AI’s problem-solving capabilities more transparently. In its most recent assessment, four AI systems attempted 10 problems, and at least one system produced a correct solution for seven of them. Results ranged from flawless proofs to proofs needing minor corrections to outright failures, and some AI-generated strategies impressed expert referees by differing from human approaches.

Terry Tao, a Fields Medalist at the University of California, Los Angeles, contributed to efforts that combine AI models with software tools to aid reasoning. Tao described the distinction between human mathematicians and AI as one between “mountain climbers” and “jumpers.” According to Tao, human experts carefully map out intermediate steps and support each other, while AI systems try to leap to answers quickly but “do not ‘fail gracefully,’” making it harder to build on partial successes.

As AI continues to develop, the experts quoted here largely regard it as a new tool rather than a replacement for human mathematicians. The challenge lies in harnessing its capabilities while preserving the essential human elements of insight and strategic vision within the discipline.