Serverless GPU

Serverless GPU is a serverless compute capability on Aweb that provides on-demand GPU compute for ML inference. It is available through four providers: Replicate, RunPod, Apify, and Cloudflare Workers. Access it through a single unified API with automatic failover and intelligent routing.

Best for

Highest quality: Apify, Cloudflare Workers (Premium tier)

Most affordable: RunPod (Economy tier)

Contract

Max latency: 60000 ms
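The 60000 ms figure reads as an upper bound on how long a single call may take. Whether the platform enforces it server-side is not stated here, so a caller who wants the same bound on the client could wrap calls in a timeout. The sketch below is one way to do that; `withTimeout` and `MAX_LATENCY_MS` are illustrative names and are not part of the Alfred SDK.

```typescript
// Hypothetical client-side guard; not part of the Alfred SDK.
// Assumes the contract's max latency (60000 ms) is the budget for one call.
const MAX_LATENCY_MS = 60_000;

function withTimeout<T>(work: Promise<T>, ms: number = MAX_LATENCY_MS): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

A call such as `withTimeout(alfred.execute({ ... }))` would then reject after 60 seconds. Note that this only stops waiting; it does not cancel the underlying request.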

Providers (4)

Provider | Score | Quality | Pricing

Quick start

Call Serverless GPU through Alfred. Automatic provider selection, failover, and load balancing are included.

cURL

curl -X POST https://api.alfred-ai.app/v1/execute \
  -H "Authorization: Bearer $ALFRED_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "compute.serverless",
    "input": { "prompt": "Hello world" }
  }'

TypeScript

import { Alfred } from '@alfred/core';

const alfred = new Alfred({ apiKey: process.env.ALFRED_API_KEY });

// Alfred automatically selects the best provider
const result = await alfred.execute({
  capability: 'compute.serverless',
  input: { prompt: 'Hello world' },
});

console.log(result.output);
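How Alfred selects a provider and fails over is not documented on this page. As a rough mental model only, failover can be pictured as trying providers in some preferred order and returning the first success. The sketch below illustrates that idea; `executeWithFailover` and `ProviderCall` are hypothetical names, and the ordering of providers is an assumption left to the caller.

```typescript
// Illustrative only: a generic failover loop, not Alfred's actual routing logic.
type ProviderCall<T> = (provider: string) => Promise<T>;

async function executeWithFailover<T>(
  providers: string[],
  call: ProviderCall<T>,
): Promise<{ provider: string; output: T }> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      // The first provider that succeeds wins.
      return { provider, output: await call(provider) };
    } catch (err) {
      // Record the failure and fall through to the next provider.
      failures.push(`${provider}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}
```

With a managed service this loop runs on the platform side, which is why the SDK call above needs no provider argument.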

Orchestration pipeline

import { Alfred } from '@alfred/core';

const alfred = new Alfred({ apiKey: process.env.ALFRED_API_KEY });

// Multi-step pipeline with automatic failover
const result = await alfred.orchestrate({
  steps: [
    { id: 'step1', capability: 'compute.serverless', input: { prompt: 'Hello world' } },
    { id: 'step2', capability: 'llm.chat', dependsOn: ['step1'],
      input: { prompt: 'Summarize: $step1.output' } },
  ],
});
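The pipeline above relies on two ideas: `dependsOn` fixes the order in which steps run, and `$step1.output` is replaced with the earlier step's result. The orchestrator's real behavior is not described on this page, so the following is only a sketch of how those two ideas could work. `executionOrder`, `resolveRefs`, and the `Step` shape are hypothetical and assume step outputs are strings.

```typescript
// Hypothetical helpers; the real orchestrator's behavior is not documented here.
interface Step {
  id: string;
  dependsOn?: string[];
}

// Order steps so each runs after everything it depends on (depth-first topological sort).
function executionOrder(steps: Step[]): string[] {
  const byId = new Map(steps.map((s) => [s.id, s] as const));
  const state = new Map<string, "visiting" | "done">();
  const order: string[] = [];

  const visit = (id: string): void => {
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") throw new Error(`Dependency cycle at ${id}`);
    const step = byId.get(id);
    if (!step) throw new Error(`Unknown step ${id}`);
    state.set(id, "visiting");
    for (const dep of step.dependsOn ?? []) visit(dep);
    state.set(id, "done");
    order.push(id);
  };

  steps.forEach((s) => visit(s.id));
  return order;
}

// Replace "$<stepId>.output" references with that step's recorded output.
// References to steps with no recorded output are left untouched.
function resolveRefs(template: string, outputs: Map<string, string>): string {
  return template.replace(
    /\$(\w+)\.output/g,
    (match: string, id: string) => outputs.get(id) ?? match,
  );
}
```

Under these assumptions, step2 would run only after step1 has finished, and its prompt would contain step1's output in place of the `$step1.output` reference.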

Related Serverless Compute capabilities

Code Sandbox (compute)
Code Execution (compute)
Synthetic Data Generation (compute)
Data Anonymization (compute)
Cloud Browser (compute)
