Apps tagged with 'ai-evaluation'

All apps in the 'ai-evaluation' category. Use the filters below to narrow down your search.
  1. Spec27

    Automates AI agent validation using specification-driven, black-box testing for robustness, regressions, and security, without requiring code-level or SDK access.

    Cost / License

    • Free
    • Proprietary

    Platforms

    • Online
    • Software as a Service (SaaS)
    Robustness Evaluations
    Robustness Specifications
    1 alternative
  2. Agentuity

    The full-stack cloud platform for AI agents. Build with intelligent routing, persistent state, and seamless handoffs. Deploy with built-in APIs, React frontends, databases, sandboxes, and monitoring — on our cloud, your VPC, or on-prem.

    Cost / License

    Platforms

    • Online
    • Self-Hosted
    • Docker
    • Windows
    • Linux
    • Mac
    Observability
    Evals
    Agent Workbench
    Sandboxes
    6 alternatives
  3. Prompt evaluation platform that scores prompts 0-100 across 4 dimensions (clarity, specificity, structure, goal alignment) and suggests improvements.

    Cost / License

    • Freemium
    • Proprietary

    Platforms

    • Online
    • Software as a Service (SaaS)
    Evaluator macro score
    Recommendations and advanced technical analysis
    Playground A/B results
  4. Netra

    Netra is the reliability platform for AI agents to observe, evaluate, simulate, and continuously improve every decision your agents make, so you can ship with confidence and catch regressions before your users do.

    Cost / License

    • Freemium
    • Proprietary

    Platforms

    • Online
    • Software as a Service (SaaS)
  5. Vivgrid

    Vivgrid is a platform for building production-ready AI agents. Vivgrid streamlines prompt editing, testing, and evaluation to lower development and coordination overhead.

    Cost / License

    • Freemium
    • Proprietary

    Platforms

    • Online
    2 alternatives