<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>devtake.dev — #tool-calling</title><description>Articles tagged tool-calling on devtake.dev.</description><link>https://devtake.dev/</link><language>en-us</language><item><title>Cactus Compute distilled Gemini into a 26M tool-calling model. The trick: no feed-forward layers.</title><link>https://devtake.dev/article/needle-cactus-compute-tool-calling/</link><guid isPermaLink="true">https://devtake.dev/article/needle-cactus-compute-tool-calling/</guid><description>Needle is a 26M-parameter function caller distilled from Gemini 3.1 Flash-Lite. The Simple Attention Network drops MLPs and runs at 6,000 tok/s prefill on edge silicon.</description><pubDate>Wed, 13 May 2026 10:00:00 GMT</pubDate><category>ai</category><category>ai-models</category><category>gemini</category><category>open-weights</category><category>function-calling</category><category>tool-calling</category><category>edge-ai</category><category>on-device-ai</category><category>small-models</category><author>dieter-morelli</author></item></channel></rss>