Curious if anyone has seen differences in how models handle conflicting tool descriptions — e.g., two tools with overlapping capabilities where the boundary isn't clear. In my experience that's where most bad tool calls come from, not from missing descriptions but from ambiguous overlap between tools.
We've been exposing tools via MCP and the biggest lesson so far: the tool description is basically a meta tag. It's the only thing the model reads before deciding whether to call your tool.
Two things that surprised us: (1) being explicit about what the tool doesn't do matters as much as what it does - vague descriptions get hallucinated calls constantly, and (2) inline examples in the description beat external documentation every time. The agent won't browse to your docs page.
The schema side matters too - clean parameter names, sensible defaults, clear required vs optional. It's basically UX design for machines rather than humans. Different models do have different calling patterns (Claude is more conservative, will ask before guessing; others just fire and hope) so your descriptions need to work for both styles.
> inline examples in the description beat external documentation every time. The agent won't browse to your docs page.
That seems... surprising, and if necessary something that could easily be corrected on the harness side.
> The schema side matters too - clean parameter names, sensible defaults, clear required vs optional. It's basically UX design for machines rather than humans.
I don't follow. Wouldn't you do all those things to design for humans anyway?
CRIPIX seems to be a new and unusual concept. I came across it recently and noticed it’s available on Amazon. The description mentions something called the Information Sovereign Anomaly and frames the work more like a technological and cognitive investigation than a traditional book. What caught my attention is that it appears to question current AI and computational assumptions rather than promote them. Has anyone here heard about it or looked into it ?
tool description wording does matter, at least in my testing. models seem to use the description to reason about whether a tool "should" apply, not just whether it can. two things that helped: (1) explicit input format with an example, (2) a one-sentence note about what the tool does NOT handle. the negative case helps models avoid calling it on edge cases and then failing, which trains them (in context) to prefer it when it's actually the right fit.
Tool description quality matters way more than people expect.
In my experience with MCP servers, the biggest win is specificity about when not to use the tool. Agents pick confidently when there's a clear boundary, not a vague capability statement.
Curious if anyone has seen differences in how models handle conflicting tool descriptions — e.g., two tools with overlapping capabilities where the boundary isn't clear. In my experience that's where most bad tool calls come from, not from missing descriptions but from ambiguous overlap between tools.
We've been exposing tools via MCP and the biggest lesson so far: the tool description is basically a meta tag. It's the only thing the model reads before deciding whether to call your tool.
Two things that surprised us: (1) being explicit about what the tool doesn't do matters as much as what it does - vague descriptions get hallucinated calls constantly, and (2) inline examples in the description beat external documentation every time. The agent won't browse to your docs page.
The schema side matters too - clean parameter names, sensible defaults, clear required vs optional. It's basically UX design for machines rather than humans. Different models do have different calling patterns (Claude is more conservative, will ask before guessing; others just fire and hope) so your descriptions need to work for both styles.
> inline examples in the description beat external documentation every time. The agent won't browse to your docs page.
That seems... surprising, and if necessary something that could easily be corrected on the harness side.
> The schema side matters too - clean parameter names, sensible defaults, clear required vs optional. It's basically UX design for machines rather than humans.
I don't follow. Wouldn't you do all those things to design for humans anyway?
CRIPIX seems to be a new and unusual concept. I came across it recently and noticed it’s available on Amazon. The description mentions something called the Information Sovereign Anomaly and frames the work more like a technological and cognitive investigation than a traditional book. What caught my attention is that it appears to question current AI and computational assumptions rather than promote them. Has anyone here heard about it or looked into it ?
tool description wording does matter, at least in my testing. models seem to use the description to reason about whether a tool "should" apply, not just whether it can. two things that helped: (1) explicit input format with an example, (2) a one-sentence note about what the tool does NOT handle. the negative case helps models avoid calling it on edge cases and then failing, which trains them (in context) to prefer it when it's actually the right fit.
Tool description quality matters way more than people expect. In my experience with MCP servers, the biggest win is specificity about when not to use the tool. Agents pick confidently when there's a clear boundary, not a vague capability statement.