Augment Code · @augmentcode

𝐆𝐏𝐓-𝟒.𝟏 𝐚𝐥𝐦𝐨𝐬𝐭 𝐭𝐨𝐩𝐬 𝐂𝐥𝐚𝐮𝐝𝐞 𝟑.𝟕 𝐨𝐧 𝐜𝐨𝐝𝐢𝐧𝐠?! New eval dropping using our #1 SWE-...

View this X/Twitter post from @augmentcode published on 15 April 2025 at 00:02. This post contains 1 image.

𝐆𝐏𝐓-𝟒.𝟏 𝐚𝐥𝐦𝐨𝐬𝐭 𝐭𝐨𝐩𝐬 𝐂𝐥𝐚𝐮𝐝𝐞 𝟑.𝟕 𝐨𝐧 𝐜𝐨𝐝𝐢𝐧𝐠?!

New eval dropping using our #1 SWE-bench coding agent!

- GPT-4.1 beats Gemini 2.5 Pro and almost tops Claude 3.7 Sonnet!
- Even GPT-4.1 mini matches Claude 3.5 Sonnet V2 performance. It was the top model just 2mo ago!

The evaluation uses AugmentQA, our proprietary codebase-understanding benchmark. You can learn more at: https://www.augmentcode.com/blog/you-make-your-evals-then-your-evals-make-you-introducing-augmentqa

Try our agent yourself at: http://www.augmentcode.com.
