  • tonetegeatinst 10 hours ago

    Are you able to share which models you used, or any fine-tuning you did, for the agents?

    I'm currently studying security in college, and most of my time is spent working on a good system card and premade prompts for certain situations like using nmap or burpsuite.

    • gauravbsinghal 10 hours ago

      Great question. We use frontier models (Claude- and Gemini-class) with no fine-tuning. The insight that changed everything for us: prompt engineering alone hits a ceiling fast in offensive security.

      What matters more than the model:

      1. Architecture over prompts. Cipher isn't one agent with a great prompt; it's multiple agents with distinct roles (recon, attack, verification) that coordinate. The "judge" agent that tries to disprove findings is more important than the attacker agent.

      2. Tool use over reasoning. The model doesn't "know" how to pentest; it reasons about what tool to use next based on what it's learned so far. We give it real tools (not simulated ones) and let it chain them.

      3. Invariant-based testing over checklist-based. Instead of "try SQLi on every input," Cipher defines security properties ("User A can't access User B's data") and tries to violate them. This catches logic bugs that no scanner finds.

      Rough sketches of each idea below.
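
      To make the judge idea concrete, here's a minimal Python sketch of the attacker/judge split. This isn't Cipher's code: llm() is a stand-in for whatever chat-completions call you use, and the prompts and the 5-candidate batch are illustrative.

        def llm(system: str, user: str) -> str:
            # Stand-in for any chat-completions call (Claude, Gemini, etc.).
            raise NotImplementedError("wire up your model provider here")

        def attack(target_notes: str) -> str:
            # Attacker agent: propose one finding with concrete repro steps.
            return llm(
                system="You are a pentester. Propose one finding with exact "
                       "reproduction steps against the notes you are given.",
                user=target_notes,
            )

        def judge(finding: str) -> bool:
            # Judge agent: actively try to disprove the finding.
            verdict = llm(
                system="You are a skeptical reviewer. Try to disprove this "
                       "finding. Answer VALID only if the repro steps hold "
                       "up; otherwise answer INVALID.",
                user=finding,
            )
            return verdict.strip().upper().startswith("VALID")

        def run(target_notes: str) -> list[str]:
            # Keep only findings that survive the adversarial review.
            candidates = [attack(target_notes) for _ in range(5)]
            return [f for f in candidates if judge(f)]

      The split is the point: one prompt optimized to find things, a separate one optimized to kill false positives.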
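
      The tool-chaining loop, with the same caveats: the nmap entry and the JSON reply format are assumptions of mine, and you'd only ever run this against a target you're authorized to test.

        import json
        import subprocess

        def llm(system: str, user: str) -> str:
            raise NotImplementedError  # same stand-in as above

        # Real tools, not simulated ones.
        TOOLS = {
            "nmap": lambda target: subprocess.run(
                ["nmap", "-sV", target], capture_output=True, text=True
            ).stdout,
        }

        def chain(target: str, max_steps: int = 10) -> list[dict]:
            history: list[dict] = []
            for _ in range(max_steps):
                # The model sees everything learned so far, then picks
                # the next tool (or declares itself done) as JSON.
                choice = json.loads(llm(
                    system="Pick the next tool from: " + ", ".join(TOOLS)
                           + '. Reply as JSON: {"tool": "...", "done": false}',
                    user=json.dumps(history),
                ))
                if choice.get("done"):
                    break
                choice["output"] = TOOLS[choice["tool"]](target)
                history.append(choice)
            return history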
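
      And an invariant check in the same spirit. The endpoints (/login, /api/orders) and credentials here are hypothetical; the idea is that you assert the property directly instead of scanning inputs for payloads.

        import requests

        BASE = "https://target.example"  # authorized test target only

        def login(user: str, password: str) -> requests.Session:
            s = requests.Session()
            s.post(f"{BASE}/login", data={"user": user, "password": password})
            return s

        def test_user_a_cannot_read_user_b() -> None:
            alice = login("alice", "password-a")
            bob = login("bob", "password-b")
            # Alice creates a record; the invariant says Bob must never read it.
            order = alice.post(f"{BASE}/api/orders", json={"item": "x"}).json()
            resp = bob.get(f"{BASE}/api/orders/{order['id']}")
            # Any 2xx here violates the property, whatever bug caused it.
            assert resp.status_code in (401, 403, 404), "invariant violated"

      A vuln scanner never flags this, because nothing in it is a malformed input; it's a perfectly valid request that the business logic should have refused.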

      Since you're studying security: the best thing you can do is get really good at manual pentesting first. Understanding why an attack chain works is what lets you build agents that reason about it. The prompts matter less than the mental model you encode into the system's architecture.

      Happy to chat more; feel free to DM or join our Discord.