LLM Evaluations

LLM Evaluations is an observability capability, available through Arize Phoenix on Aweb, for automated evaluation of LLM outputs. Access it through a single unified API with automatic failover and intelligent routing.

Best for

Highest quality: Arize Phoenix (Premium tier)

Most affordable: Arize Phoenix (Economy tier)

Contract

Max latency: 30000 ms
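
A caller may want to enforce the same latency ceiling on the client side. The following is a minimal sketch, assuming the 30000 ms contract value is also a sensible client timeout; `withTimeout` and `MAX_LATENCY_MS` are hypothetical names and not part of any Alfred SDK.

```typescript
// Assumption: 30000 ms mirrors the "Max Latency" contract value above.
// withTimeout is a hypothetical helper, not part of any Alfred SDK.
const MAX_LATENCY_MS = 30_000;

// Reject if `work` has not settled within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number = MAX_LATENCY_MS): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Any promise-returning call can be wrapped this way, for example `withTimeout(alfred.execute({ ... }))`. Note that the wrapper only stops waiting; it does not cancel the underlying request.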

Providers (1)

Provider                 Score  Quality  Pricing
Arize Phoenix (default)  80     premium  economy
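
To make the idea of provider selection with failover concrete, here is a rough sketch of how a routing layer could order candidates from a table like the one above. The `Provider` shape, the `failoverOrder` function, and the ordering rule (default first, then highest score) are all assumptions for illustration; the page does not document Alfred's actual routing logic.

```typescript
// Hypothetical model of the provider table above; the field names and the
// ordering rule are assumptions, not Alfred's documented routing logic.
interface Provider {
  name: string;
  score: number;
  isDefault: boolean;
}

// Order candidates for failover: the default provider first, then by descending score.
function failoverOrder(providers: Provider[]): Provider[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...providers].sort((a, b) => {
    if (a.isDefault !== b.isDefault) return a.isDefault ? -1 : 1;
    return b.score - a.score;
  });
}

// The only provider listed on this page.
const listedProviders: Provider[] = [
  { name: 'Arize Phoenix', score: 80, isDefault: true },
];

console.log(failoverOrder(listedProviders).map((p) => p.name));
```

With a single listed provider the ordering is trivial; the sketch only becomes interesting once more providers are available for this capability.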

Quick start

Call LLM Evaluations through Alfred — automatic provider selection, failover, and load balancing included.

cURL

curl -X POST https://api.alfred-ai.app/v1/execute \
  -H "Authorization: Bearer $ALFRED_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "observability.evals",
    "input": { "prompt": "Hello world" }
  }'

TypeScript

import { Alfred } from '@alfred/core';

const alfred = new Alfred({ apiKey: process.env.ALFRED_API_KEY });

// Alfred automatically selects the best provider
const result = await alfred.execute({
  capability: 'observability.evals',
  input: { prompt: 'Hello world' },
});

console.log(result.output);
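
Where the SDK is not available, the cURL request above can probably be reproduced with plain `fetch`. The sketch below only builds the request; the endpoint, headers, and body fields are taken from the cURL example, while `buildExecuteRequest` and `ExecuteRequest` are hypothetical names introduced here.

```typescript
// Endpoint, headers, and body shape are copied from the cURL example above.
// buildExecuteRequest is a hypothetical helper, not part of @alfred/core.
const ALFRED_EXECUTE_URL = 'https://api.alfred-ai.app/v1/execute';

interface ExecuteRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Assemble the URL and fetch options for a single capability call.
function buildExecuteRequest(
  apiKey: string,
  capability: string,
  input: Record<string, unknown>,
): ExecuteRequest {
  return {
    url: ALFRED_EXECUTE_URL,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ capability, input }),
    },
  };
}

const exampleRequest = buildExecuteRequest('your-api-key', 'observability.evals', {
  prompt: 'Hello world',
});
console.log(exampleRequest.init.body);
```

Passing `exampleRequest.url` and `exampleRequest.init` to `fetch` should send the same call as the cURL example. The response shape is not documented on this page, so treat any fields you read from it as unverified.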

Orchestration pipeline

import { Alfred } from '@alfred/core';

const alfred = new Alfred({ apiKey: process.env.ALFRED_API_KEY });

// Multi-step pipeline with automatic failover
const result = await alfred.orchestrate({
  steps: [
    { id: 'step1', capability: 'observability.evals', input: { prompt: 'Hello world' } },
    { id: 'step2', capability: 'llm.chat', dependsOn: ['step1'],
      input: { prompt: 'Summarize: $step1.output' } },
  ],
});
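
The pipeline above relies on two ideas: `dependsOn` fixes the order of steps, and `$step1.output` is filled in once that step has run. The sketch below is a local model of those two ideas, under the assumption that placeholders are plain text substitutions; `orderSteps` and `resolvePrompt` are hypothetical helpers, and Alfred's own resolution logic may differ.

```typescript
// Illustration only: a hypothetical local model of `dependsOn` ordering and
// `$stepId.output` substitution. Alfred's real logic is not shown on this page.
interface Step {
  id: string;
  dependsOn?: string[];
  input: { prompt: string };
}

// Return steps in an order where every step follows its dependencies.
function orderSteps(steps: Step[]): Step[] {
  const byId = new Map<string, Step>();
  for (const s of steps) byId.set(s.id, s);
  const ordered: Step[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  // Depth-first visit: dependencies are pushed before the step itself.
  const visit = (step: Step): void => {
    const seen = state.get(step.id);
    if (seen === 'done') return;
    if (seen === 'visiting') throw new Error(`Dependency cycle at ${step.id}`);
    state.set(step.id, 'visiting');
    for (const depId of step.dependsOn ?? []) {
      const dep = byId.get(depId);
      if (!dep) throw new Error(`Unknown dependency ${depId}`);
      visit(dep);
    }
    state.set(step.id, 'done');
    ordered.push(step);
  };

  for (const s of steps) visit(s);
  return ordered;
}

// Replace `$<id>.output` placeholders with outputs of completed steps;
// placeholders for unknown steps are left unchanged.
function resolvePrompt(prompt: string, outputs: Map<string, string>): string {
  return prompt.replace(/\$(\w+)\.output/g, (match: string, id: string) =>
    outputs.get(id) ?? match,
  );
}
```

Running the steps in the order returned by `orderSteps`, and calling `resolvePrompt` on each prompt with the outputs collected so far, mimics what the orchestration call is described as doing.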
