Message-Id: <2C67A739-491E-4672-94F1-5C78DBC55C97@dwheeler.com>
Date: Mon, 19 Aug 2024 17:02:29 -0400
From: "David A. Wheeler" <dwheeler@...eeler.com>
To: oss-security@...ts.openwall.com
Subject: Re: AI Cyber Challenge (AIxCC) semi-final results from DEF CON 32 (2024)

> On Aug 17, 2024, at 4:32 PM, Alfredo Ortega <ortegaalfredo@...il.com> wrote:
>
> I found a real bug (OpenBSD IPv6 Multicast Forwarding Cache sysctl
> kernel heap overflow) using Mistral-Medium almost 6 months ago:
> https://github.com/ortegaalfredo/vulns-ai/blob/main/openbsd_mfc6_sysctl_overflow.txt
>
> The simple tool that did it is also released as open-source here:
>
> https://github.com/ortegaalfredo/autokaker
>
> About to release the second version, and a vscode plugin, next week.

That's even more evidence that LLMs can find at least some vulnerabilities.

Also - here's a visualization that tries to show how AIxCC competitors
did against the challenge problems:
https://dashboard.aicyberchallenge.com/collectivesolvehealth

You can see that the tools found & fixed many of the seeded vulnerabilities
in nginx, a few in all but one of the others, and they struggled with
the Linux kernel. The Linux kernel is *huge* compared to most projects,
so that isn't too surprising.

The final competition is in about a year, so there's hope that the tools
will make improvements in that time as part of the challenge.

To be honest, even finding and fixing *some* problems automatically is
a big win, especially if false reports are rare. Still, the better the
tools are at finding and fixing vulnerabilities, the better off we are.

--- David A. Wheeler