Apps tagged with 'llm-evaluation'

All apps in the 'llm-evaluation' category.

Arena AI
15 likes
Public platform for evaluating large language models using anonymous pairwise comparisons, crowd-sourced voting, real-time result updates, and global performance tracking.
Cost / License
Free
Proprietary (Apache-2.0)
Application types
Large Language Model (LLM) Tool
AI Chatbot
Origin
United States
Platforms
Online
Best alternatives are ChatGPT and DeepSeek
49 alternatives
Langfuse
1 like
Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more.
Cost / License
Freemium
Proprietary
Application type
Large Language Model (LLM) Tool
Origin
Germany
EU
Platforms
Online
Software as a Service (SaaS)
Self-Hosted
Docker
+3
Best alternatives are RapidClaw and Opik
4 alternatives
Helicone
1 like
Helicone is the open-source LLM observability platform for developers to monitor, debug, and improve production-ready applications.
Cost / License
Free Personal
Open Source (Apache-2.0)
Application type
Large Language Model (LLM) Tool
Origin
United States
Platforms
Online
Self-Hosted
Docker
Best alternatives are RapidClaw and Langfuse
6 alternatives
AIQ-X
Test AI models yourself, privately, with a standardized benchmark, and get both technical scores AND practical recommendations.
Cost / License
Free
Proprietary
Origin
United States
Platforms
Online
LightEval
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
Cost / License
Free
Open Source (MIT)
Origin
United States
Platforms
Self-Hosted
Python
Best alternatives are Opik and xseek
3 alternatives
Lisapet.ai
Lisapet.ai is the next-level AI product development platform that empowers teams to prototype, test, and ship robust AI features 10x faster.
Cost / License
Paid
Proprietary
Platforms
Online
Best alternative is Vellum
1 alternative
Netra
Netra is the reliability platform for AI agents to observe, evaluate, simulate, and continuously improve every decision your agents make, so you can ship with confidence and catch regressions before your users do.
Cost / License
Freemium
Proprietary
Origin
United States
Platforms
Online
Software as a Service (SaaS)
+2
Best alternatives are Langfuse and Helicone
