# Llm Evaluator

LLM-as-a-Judge evaluation system using Langfuse. Score AI outputs on relevance, accuracy, hallucination, and helpfulness. Backfill scoring on historical trac...

## Discovery Metadata

- Category: `coding`
- Framework: `ClawHub`
- Tags: `llm`, `evaluator`, `llm-as-a-judge`, `evaluation`, `system`, `using`, `langfuse`, `score`, `outputs`, `relevance`

## Agent Execution Policy

This listing is **discovery metadata only**. Canonical instructions are maintained by ClawHub.

### Before Executing Actions

1. **Fetch canonical instructions** from: https://clawhub.ai/skill/llm-evaluator
2. **Parse the skill page** for setup, usage, and input/output requirements.
3. **Only proceed** after successfully loading and understanding the full instructions.

### If Fetch Fails

- Return `instruction_unavailable` with reason.
- Do **not** attempt to infer or improvise execution steps from this metadata alone.

## Source

- ClawHub listing: https://clawhub.ai/skill/llm-evaluator
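
The Agent Execution Policy above (fetch first, fall back to `instruction_unavailable` on failure) could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not ClawHub's implementation: the names `load_instructions`, `InstructionResult`, and `http_fetch` are hypothetical, and the listing does not specify a response shape beyond the `instruction_unavailable` value and an accompanying reason. Parsing the fetched page for setup and usage is left out because the listing does not describe the page format.

```python
from dataclasses import dataclass
from typing import Callable, Optional
import urllib.request

# URL taken from the listing; everything else in this sketch is an assumption.
CANONICAL_URL = "https://clawhub.ai/skill/llm-evaluator"


@dataclass
class InstructionResult:
    """Hypothetical result type: status is "ok" or "instruction_unavailable"."""
    status: str
    instructions: Optional[str] = None
    reason: Optional[str] = None


def http_fetch(url: str, timeout: float = 10.0) -> str:
    """Default fetcher: a plain HTTP GET that returns the page body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def load_instructions(
    url: str = CANONICAL_URL,
    fetch: Callable[[str], str] = http_fetch,
) -> InstructionResult:
    """Fetch canonical instructions, or report why they are unavailable.

    The caller should only proceed when status == "ok"; on any failure no
    execution steps are inferred from the listing metadata.
    """
    try:
        body = fetch(url)
    except Exception as exc:  # network error, HTTP error, bad URL, timeout
        return InstructionResult(
            "instruction_unavailable", reason=f"fetch failed: {exc}"
        )
    if not body.strip():
        # An empty page cannot be "loaded and understood", so treat it as a failure.
        return InstructionResult(
            "instruction_unavailable", reason="empty response"
        )
    return InstructionResult("ok", instructions=body)
```

An agent would call `load_instructions()` before any action and stop if the returned `status` is not `"ok"`, passing `reason` back to the user. The `fetch` parameter is injectable so the failure path can be exercised without network access.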