📊 Full opportunity report: Engineering Is Automated. Research Is the Residual. on ThorstenMeyerAI.com — validation score, market gap, and execution plan.
TL;DR
AI systems are now capable of automating most engineering tasks involved in AI development, with research remaining the primary challenge. This shift could reshape the AI R&D landscape within the next 32 months.
Performance data from multiple AI benchmarks indicate that AI systems can now automate most of the core engineering tasks in AI development, while research tasks remain largely human-driven.
Six key benchmarks measuring AI capabilities in research reproduction, Kaggle competitions, and kernel design show rapid progress, with some reaching saturation within 16 to 21 months. For example, CORE-Bench, which tests research reproduction, has improved from 21.5% to 95.5%, a level its author describes as ‘solved’. Similarly, MLE-Bench, which assesses performance on Kaggle competitions, has advanced from 16.9% to 64.4%, approaching mid-tier human performance.
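To make the size of these jumps concrete, here is a minimal sketch that computes the absolute (percentage-point) gain and the multiplicative improvement from the scores quoted above. The function name `gain` and the dictionary layout are illustrative assumptions, not part of any benchmark's tooling.

```python
# Scores quoted in the article, in percent: (starting score, latest score).
# The dictionary layout is an illustrative assumption.
scores = {
    "CORE-Bench": (21.5, 95.5),
    "MLE-Bench": (16.9, 64.4),
}


def gain(start, end):
    """Return (percentage-point gain, multiplicative improvement)."""
    return end - start, end / start


for name, (start, end) in scores.items():
    points, ratio = gain(start, end)
    print(f"{name}: +{points:.1f} points, {ratio:.1f}x the starting score")
```

On these figures, CORE-Bench gained 74.0 points and MLE-Bench 47.5 points. The two measures answer different questions: the point gain shows how much headroom was consumed, while the ratio shows how far the score moved relative to where it started.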
Experts interpret these results as evidence that AI can now handle large portions of engineering work involved in AI research—such as reproducing experiments, optimizing models, and designing hardware kernels—at a reliability that approaches human competence. However, the same benchmarks suggest that the creative and exploratory aspects of research, which may involve novel hypothesis generation and strategic insight, are not yet fully automatable.
Engineering is automated.
Research is the residual.
Six skill benchmarks. Edison’s framing. The question Clark leaves open is whether research is just engineering at scale.
Jack Clark’s Import AI #455 catalogs six benchmarks measuring AI capability on AI R&D tasks and concludes “AI can today automate vast swatches, perhaps the entirety, of AI engineering.” The residual question is research. The structural read on the residual: it may not be a permanent moat.
Six skills. One trajectory.
Clark catalogs six benchmarks measuring AI capability on AI R&D-relevant tasks. Any one benchmark could be noise. Six benchmarks moving together form a curve. The pattern matches the cascade observed across the broader Clark series, visible here in the specific R&D-skill domain.

Three data points. Mixed signal.
Clark provides three data points on the creative-spark question. Yes-evidence: Erdős-1051, centaur math discovery, sporadic Move-37-style moments. No-evidence: low yield, framing dependence, absence of acceleration. The mixed signal is the honest read.
The data supports two readings. Pessimistic: rare moments suggest creative insight is qualitatively distinct from engineering work. Optimistic: rare moments are an artifact of low-volume exploration; more shots on goal yield more discoveries. Both readings are consistent with Clark’s “vast swatches, perhaps the entirety” claim. They differ on the residual.
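The optimistic "shots on goal" reading can be sketched with a simple probability model. Assuming each attempt is independent and succeeds with a small fixed probability p, the chance of at least one discovery in n attempts is 1 - (1 - p)^n. The function name and the value of p below are hypothetical choices for illustration, not figures from Clark or from the benchmarks.

```python
def p_at_least_one(p, n):
    """Probability of at least one success in n independent attempts,
    each succeeding with probability p."""
    return 1.0 - (1.0 - p) ** n


# p = 0.001 is a purely hypothetical per-attempt discovery rate.
for n in (100, 1_000, 10_000):
    print(n, round(p_at_least_one(0.001, n), 3))
```

Under this model, volume alone eventually makes a discovery close to certain, which is the optimistic reading. The pessimistic reading corresponds to p being effectively zero for a given class of insight, in which case no number of attempts helps.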

Five dimensions Clark gestures at but leaves underdeveloped.
Clark’s section is rigorous on the empirical evidence. Five strategic dimensions matter for the institutional response, which the Clark series synthesis argues is structurally inadequate.

Two readings. Different equilibria.
The structural question Clark leaves open: is research a permanent moat that bounds automated AI R&D, or is it engineering at scale that dissolves with more shots on goal? Both readings are consistent with the current data. They differ by orders of magnitude in consequences.
[Figure: the two readings compared. Panel labels: "Productivity multiplier years" and "Recursive loop operational".]

Five audiences. Asymmetric cost of being wrong.
The institutional response should not bet on inspiration being a permanent moat. If the distinction holds, capacity built is still useful. If it closes, capacity is necessary. Asymmetric cost-of-being-wrong points toward building now.
The five audiences: industry, academia, policymakers, investors, and everyone else.
Engineering is automated. The residual is the question. The institutional response should not bet on inspiration being a permanent moat.
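The asymmetric cost-of-being-wrong argument can be sketched as a small decision table. Every cost value below is a hypothetical placeholder on an arbitrary scale, chosen only to mirror the shape of the argument (building is a modest cost either way, while waiting is cheap if the moat holds and expensive if it closes). The names `costs` and `worst_case` are likewise illustrative.

```python
# All cost values are hypothetical placeholders on an arbitrary scale.
# Keys: (institutional choice, how the research question resolves).
costs = {
    ("build", "moat_holds"): 1,    # capacity built and still useful
    ("build", "moat_closes"): 1,   # capacity built and necessary
    ("wait", "moat_holds"): 0,     # nothing spent, nothing lost
    ("wait", "moat_closes"): 10,   # capacity needed but missing
}


def worst_case(action):
    """Largest cost an action can incur across both outcomes."""
    return max(cost for (a, _), cost in costs.items() if a == action)


# Minimax choice: pick the action whose worst case is least bad.
best = min(("build", "wait"), key=worst_case)
print(best)
```

With these placeholder numbers the minimax choice is to build. The conclusion depends on the assumed asymmetry, not on the specific values: it holds whenever the cost of missing capacity exceeds the cost of building it.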
Implications for AI Development and Research Roles
This rapid progress in engineering capabilities implies that AI systems could soon take over much of the routine and technical work traditionally performed by human researchers. As the bottleneck shifts from engineering to the more creative and strategic aspects of research, organizations may need to reconsider how they allocate human talent and resources. The potential for near-complete automation of engineering tasks also raises questions about the future of AI innovation, intellectual property, and the role of human intuition in scientific discovery.
Recent Benchmark Progress and AI Capability Trends
Over the past 18 months, AI capabilities have improved rapidly across multiple domains relevant to AI R&D. CORE-Bench, which assesses research reproduction, reached near-complete performance in December 2025. The Kaggle-based MLE-Bench, which measures practical model performance, followed a similar trajectory, and its leaderboard was paused in April 2026 so that more robust evaluation methods could be developed. Together with advances in kernel design and hardware optimization, these benchmarks indicate a broad trend: AI is approaching or surpassing human-level performance in the engineering tasks essential to AI research and development.
This pattern suggests that the ‘perspiration’ (the engineering work) may soon be fully automated, leaving the ‘inspiration’ (the research and innovation) as the remaining frontier. The structural question Clark leaves open is whether research itself is becoming a form of large-scale engineering, in which case the residual challenge could shrink to near zero.
“The pattern across these benchmarks indicates that AI is approaching or has reached saturation in core engineering skills, potentially automating the bulk of AI development work.”
— Thorsten Meyer
Unresolved Questions About Research Automation
It is still unclear how much of the broader research process—such as hypothesis generation, strategic planning, and novel scientific insight—can be automated. While engineering tasks are nearing full automation, the creative aspects of research may remain human-driven for the foreseeable future. Additionally, the pace at which research itself might become a form of large-scale engineering remains an open question, as does the potential for AI to fully replace human researchers in the long term.
Next Steps for Monitoring AI R&D Automation
In the coming months, researchers and organizations will likely focus on refining evaluation benchmarks, exploring the limits of AI in creative research tasks, and assessing the impact of engineering automation on innovation cycles. Continued benchmarking and real-world deployment will clarify whether AI can fully handle research at scale, or if human insight will remain essential for breakthrough discoveries. Policy discussions around AI’s role in scientific research are also expected to intensify as capabilities advance.
Key Questions
What are the main engineering tasks AI can now automate?
AI can automate tasks such as reproducing research experiments, optimizing hardware kernels, designing neural network architectures, and conducting model training and evaluation at near-human reliability.
Does this mean human researchers are no longer needed?
Not entirely. While engineering tasks are increasingly automated, the creative, strategic, and hypothesis-driven aspects of research remain largely human-driven, at least for now.
How soon could AI fully automate research processes?
It is uncertain. Benchmarks suggest engineering automation is imminent or achieved, but automating the entire research cycle, including innovation and discovery, may take longer and depends on future breakthroughs.
What are the risks of fully automating AI research?
Potential risks include reduced human oversight, loss of scientific diversity, and ethical concerns about AI-driven discovery without human judgment. These issues are under active discussion among policymakers and researchers.
Source: ThorstenMeyerAI.com