Older 2.5 models showed significantly less improvement, which Google attributes to weaker reasoning abilities. However, a Vercel study suggests that direct instructions via AGENTS.md could be even more effective. Google is therefore also exploring alternative approaches, including MCP services.

Success rate of Gemini models with and without the Agent Skill across 117 coding tasks. According to Google, newer models in the 3.x series benefited more from the skill than older models because of their stronger reasoning capabilities. | Image: Google

Conclusion:

Google’s new Agent Skill for the Gemini API highlights a growing shift toward dynamic AI systems that can access real-time knowledge instead of relying solely on static training data.