In a warning to devs, Devin AI gets very close to being an autonomous end-to-end software engineer


By Matthew Griffin | Robo Revolution | 21st March 2024

WHY THIS MATTERS IN BRIEF

In time we’ll see a fully autonomous AI software developer. At first it will be okay, then it will be great.


A new Artificial Intelligence (AI) model has triggered unease among developers because of its astounding ability to write complex code, then scan for any errors that arise during compilation and automatically correct them, just like a human programmer would. The model, dubbed ‘Devin’, is developed by AI startup Cognition.


Backed by bigwigs like Peter Thiel’s Founders Fund, former Twitter exec Elad Gil, and DoorDash co-founder Tony Xu, Cognition has secured $21 million in funding. And while AI coding assistants like those on Stack Overflow and GitHub have been around for a while, including GitHub’s celebrated Copilot, which is built on OpenAI models, Devin purportedly raises the bar by taking on end-to-end development responsibilities.

If Cognition’s claims hold water, Devin could mark a shift in the world of AI-assisted coding. Rather than playing second fiddle to human developers, this AI seems primed to operate as a self-sufficient, dare we say ‘edging on autonomous,’ software engineer in its own right. According to the startup’s founder and CEO Scott Wu, Devin operates within a secure sandbox, planning and executing complex engineering tasks through common dev tools like code editors and web browsers.


All a human needs to do is feed Devin instructions via a chat interface. From there, the AI dynamically maps out a solution, gets its hands dirty writing the actual code, fixes bugs along the way, tests its work, and keeps the user updated in real time. If the programmer spots any issues, they can simply message Devin to course-correct.
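To make that workflow concrete, here is a minimal Python sketch of a generic write, test, and fix loop. It is an illustration only and not Cognition's implementation: `agent_loop`, `toy_write`, `toy_tests`, and `toy_fix` are hypothetical names invented for this example, and the "model" is a hard-coded stand-in that first produces a buggy function and then patches it.

```python
from typing import Callable, List, Optional, Tuple


def agent_loop(
    task: str,
    write_code: Callable[[str], str],
    run_tests: Callable[[str], Optional[str]],
    fix_code: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Tuple[str, bool, List[str]]:
    """Write code for a task, test it, and retry with fixes until tests pass."""
    updates = [f"Planning: {task}"]  # progress messages shown to the user
    code = write_code(task)
    for attempt in range(1, max_attempts + 1):
        error = run_tests(code)  # None means every test passed
        if error is None:
            updates.append(f"Attempt {attempt}: tests passed")
            return code, True, updates
        updates.append(f"Attempt {attempt}: tests failed ({error})")
        if attempt < max_attempts:
            code = fix_code(code, error)  # feed the error back and patch
    updates.append("Giving up: attempt limit reached")
    return code, False, updates


# Hypothetical stand-ins for a real code-generating model and test runner.
def toy_write(task: str) -> str:
    # Deliberately buggy first draft: subtracts instead of adding.
    return "def add(a, b):\n    return a - b\n"


def toy_tests(code: str) -> Optional[str]:
    namespace: dict = {}
    exec(code, namespace)  # acceptable for a toy; never exec untrusted code
    result = namespace["add"](2, 3)
    return None if result == 5 else f"add(2, 3) returned {result}"


def toy_fix(code: str, error: str) -> str:
    return code.replace("a - b", "a + b")


code, ok, updates = agent_loop("write add(a, b)", toy_write, toy_tests, toy_fix)
for line in updates:
    print(line)
```

In this toy run the first draft fails its test, the error message is fed back into the fix step, and the second attempt passes. A real system would swap the stand-ins for a language model and an isolated build and test environment.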


Wu demonstrated Devin’s impressive range in a blog post, from deploying web apps and websites to fine-tuning large language models using GitHub repos.

Perhaps its biggest feat, however, is its performance on the SWE-bench test, which evaluates an AI’s ability to resolve real open-source software issues from GitHub. Devin solved 13.86% of these cases entirely independently, compared to figures like 4.8% for Claude, 3.97% for a different AI called SWE-Llama, and 1.74% for GPT-4.
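For a sense of scale, the short Python sketch below turns those percentages into ratios. It uses only the figures quoted above and assumes they are directly comparable, which the article does not confirm. The `rates` dictionary and `ratio_vs` helper are names made up for this example.

```python
# Share of SWE-bench issues resolved (percent), as quoted in the article.
rates = {
    "Devin": 13.86,
    "Claude": 4.8,
    "SWE-Llama": 3.97,
    "GPT-4": 1.74,
}


def ratio_vs(baseline: str, model: str = "Devin") -> float:
    """Return how many times higher `model` scores than `baseline`."""
    return rates[model] / rates[baseline]


for name in rates:
    if name != "Devin":
        print(f"Devin vs {name}: {ratio_vs(name):.1f}x")
```

On these numbers Devin's score is roughly 2.9 times Claude's, 3.5 times SWE-Llama's, and 8 times GPT-4's, although even at 13.86% most issues in the benchmark remain unresolved.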


While Devin remains under wraps for now, Cognition hopes to make it available to select customers soon. The company seems to view coding as just the start, too, suggesting it could leverage its core “long-term reasoning and planning” advances to create AI workers for other domains.

Matthew Griffin / About Author

Matthew Griffin, multi-award-winning Futurist and named Futurist of the Year 2024, has been described as a "Walking encyclopaedia of the future" by NASA and a futurist polymath. One of the world's most renowned futurists and strategic foresight experts, Matthew is the 15 times author of the blockbuster "Codex of the Future" series, and is the Founder and Futurist in Chief of the 311 Institute, a global Futures and Deep Futures advisory firm working across the next 50 years, XPotential University, the world's first free futures and foresight university, and the World Futures Forum, which works with the United Nations to solve the world's greatest challenges. Matthew is an in-demand international keynote, acclaimed university lecturer and mentor, and host of the hit Fanatical Futurist podcast.

A rare talent, Matthew in his past helped build and run several multi-billion-dollar business units for Atos, Dell-EMC, and IBM, and his ability to identify, track, and explain the impacts of hundreds of emerging technologies and trends on global business, culture, and society has earned him a powerful reputation and a roster of clients that includes royal households, world leaders, G7, G20, and G77+ governments, and many of the world's most respected brands including ABB, Accenture, Adidas, AON, ARM, BCG, Centrica, Citi Group, Coca Cola, Dentons, Deloitte, Disney, Dow, EY, KPMG, Lego, Legal & General, LinkedIn, Microsoft, PepsiCo, Qualcomm, RWE, Samsung, T-Mobile, UBS, VISA, and many others. He was also the only futurist invited to talk at the UN COP28 held in Dubai alongside world leaders.

Regularly featured in the global media including the AP, BBC, Bloomberg, CNBC, Discovery, Forbes, Khaleej Times, Telegraph, TIME, ViacomCBS, WIRED, and the WSJ, Matthew's mission is to help organisations create a fair and sustainable future whose benefits are shared by everyone irrespective of their ability, background, or circumstances.