Google’s latest model, Gemini 3 Pro, is now available to all Warp users. Give it a try and let us know what you think.
Gemini 3 Pro impressed the Warp team right out of the gate. It performed well on both benchmarks and real tasks; more on each below.
Over 15% improvement on Terminal-Bench 2.0
Warp’s last score on Terminal-Bench was 50.1%, which ranks #2 on the board. Switching from our previous default model to Gemini 3 Pro raised our score to 59.1%, a gain of 9 percentage points, or roughly 18% in relative terms.
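For readers who want to check the headline figure, here is a minimal sketch of the arithmetic. It assumes the "over 15%" in the heading refers to relative improvement rather than percentage points. The function names are illustrative only and are not part of any Warp or Terminal-Bench tooling.

```python
def percentage_point_gain(old_score: float, new_score: float) -> float:
    """Absolute difference between two scores, in percentage points."""
    return new_score - old_score


def relative_gain(old_score: float, new_score: float) -> float:
    """Improvement expressed as a percentage of the old score."""
    return (new_score - old_score) / old_score * 100


# Scores quoted in the post: 50.1% before, 59.1% with Gemini 3 Pro.
old, new = 50.1, 59.1
print(f"{percentage_point_gain(old, new):.1f} percentage points")  # 9.0 percentage points
print(f"{relative_gain(old, new):.1f}% relative improvement")      # 18.0% relative improvement
```

Under that reading, the 9-point jump works out to just under 18% relative to the earlier score, which is consistent with "over 15%".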

Standout performance in tests
The Warp team has been using Gemini 3 Pro internally for around a week. Below are some comments that came up organically in our Slack channel.
I had Gemini implement this change for me. The thing I was most impressed by was the change that Gemini made to our UI framework itself, adding an API that I as a developer would have added — it felt like Gemini wrote really high quality code in addition to creating something that worked.
Zach Lloyd
Just had Gemini 3 take a stack of 6 prs, start on the first one, run presubmit, fix everything, amend, go to next one, and do the same for all 6, then push all the PRs back up. All the fixes (albeit kinda trivial) were correct on the first try. Kinda mundane, but it felt magical to not have to do that grunt work.
Zach Bai
Okay, Gemini 3 cooked. This new model is a game-changer. It is consistently nailing multi-step fixes.
Suraj Gupta
This is the first model that I don't even need to wait on our Rust compiler to run. It works every time. I've gotten 3 features done today and the code quality is very high.
Ben Holmes




