Fast LLM Inference
Fast LLM Inference is a large language model capability available through Groq on Alfred: ultra-low-latency LLM inference optimized for speed on dedicated hardware. Access it through a single unified API with automatic failover and intelligent routing.
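The automatic failover mentioned above can be sketched conceptually. This is an illustrative example only, not the Alfred implementation; the `Provider` type and `executeWithFailover` helper are hypothetical names introduced here:

```typescript
// Hypothetical sketch of gateway-style failover: try each provider in
// priority order and fall through to the next on error.
type Provider = {
  name: string;
  call: (prompt: string) => Promise<string>;
};

async function executeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<string> {
  let lastError: unknown;
  for (const provider of providers) {
    try {
      // First provider to respond successfully wins.
      return await provider.call(prompt);
    } catch (err) {
      // Record the failure and try the next provider in the list.
      lastError = err;
    }
  }
  throw new Error(`All providers failed: ${String(lastError)}`);
}
```

With a real gateway, health checks and latency-based routing would decide the provider order; here the caller simply passes providers in priority order.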
Best for
- Highest quality: Groq (Premium tier)
- Most affordable: Groq (Economy tier)

Providers (1): Groq
Quick start
Call Fast LLM Inference through Alfred; automatic provider selection, failover, and load balancing are included.
cURL

```shell
curl -X POST https://api.alfred-ai.app/v1/execute \
  -H "Authorization: Bearer $ALFRED_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "llm.fast-inference",
    "input": { "prompt": "Hello world" }
  }'
```

TypeScript
```typescript
import { Alfred } from '@alfred/core';

const alfred = new Alfred({ apiKey: process.env.ALFRED_API_KEY });

// Alfred automatically selects the best provider
const result = await alfred.execute({
  capability: 'llm.fast-inference',
  input: { prompt: 'Hello world' },
});

console.log(result.output);
```

Orchestration pipeline
```typescript
import { Alfred } from '@alfred/core';

const alfred = new Alfred({ apiKey: process.env.ALFRED_API_KEY });

// Multi-step pipeline with automatic failover
const result = await alfred.orchestrate({
  steps: [
    { id: 'step1', capability: 'llm.fast-inference', input: { prompt: 'Hello world' } },
    {
      id: 'step2',
      capability: 'llm.chat',
      dependsOn: ['step1'],
      input: { prompt: 'Summarize: $step1.output' },
    },
  ],
});
```
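The `$step1.output` reference in the second step's input suggests simple template substitution from earlier step results. A minimal sketch of how such references might be resolved, assuming a `$<stepId>.output` syntax (the `resolveRefs` helper is hypothetical, not part of the Alfred SDK):

```typescript
// Hypothetical sketch: replace each "$<stepId>.output" token in a step's
// input template with the recorded output of that earlier step.
function resolveRefs(
  template: string,
  outputs: Record<string, string>,
): string {
  return template.replace(/\$(\w+)\.output/g, (match, stepId: string) => {
    const value = outputs[stepId];
    if (value === undefined) {
      // A reference to a step that has not run (or does not exist) is an error.
      throw new Error(`Unknown step reference: ${match}`);
    }
    return value;
  });
}
```

For example, `resolveRefs('Summarize: $step1.output', { step1: 'Hello world' })` yields `'Summarize: Hello world'`; in a real orchestrator, `dependsOn` would guarantee that `step1` completes before `step2`'s input is resolved.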