Avatars
Create reusable AI presenters (portrait + cloned voice) and generate lipsync talking videos. Avatar creation costs $1.00. Video generation is billed at $0.05/s (avatar) or $0.08/s (avatar_pro).
Overview
The Avatars API lets you create a persistent AI presenter from either a text description or a talking-head video clip. Each avatar bundles a generated portrait with a cloned TTS voice — both reusable across unlimited lipsync video generations.
Feature flag: The API is gated by AVATARS_API_ENABLED=true. All endpoints return 503 FEATURE_DISABLED when the flag is absent or falsy.
System avatars: Pre-seeded public avatars (2 English, 4 Vietnamese) are always included in listing and fetch responses alongside your own avatars.
| Action | Cost | Notes |
|---|---|---|
| Create avatar | $1.00 | Charged immediately; refunded on failure |
| Create pose | $0.013 | Image generation (i2i) rate |
| Video — avatar tier | $0.05 / s | Fast; good for production |
| Video — avatar_pro tier | $0.08 / s | Premium quality; detailed expressions |
avatar tier
Powered by pruna-ai/p-video/avatar. Fast turnaround, suitable for production workflows. Billed at $0.05 per second of actual audio.
avatar_pro tier
Powered by skywork-ai/skyreels-v3-standard/single-avatar. Slower generation with richer facial expressions and lip motion. Billed at $0.08 per second of actual audio.
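To make the pricing concrete, the sketch below totals a rough budget for a project. The helper name `estimateBudgetUsd` and its input shape are illustrative and not part of the API or SDK; only the rates come from the pricing table. Amounts are summed in thousandths of a dollar so the arithmetic stays exact.

```javascript
// Rates from the pricing table, in thousandths of a dollar (integer math).
const RATES_MILLI_USD = {
  avatarCreation: 1000, // $1.00 per avatar
  pose: 13, // $0.013 per custom pose
  videoPerSecond: { avatar: 50, avatar_pro: 80 }, // $0.05/s and $0.08/s
};

// Hypothetical helper (not part of the API): rough project budget in USD.
function estimateBudgetUsd({ avatars = 0, poses = 0, videoSeconds = 0, tier = "avatar" }) {
  if (!Object.hasOwn(RATES_MILLI_USD.videoPerSecond, tier)) {
    throw new Error(`Unknown tier: ${tier}`);
  }
  const totalMilli =
    avatars * RATES_MILLI_USD.avatarCreation +
    poses * RATES_MILLI_USD.pose +
    videoSeconds * RATES_MILLI_USD.videoPerSecond[tier];
  return totalMilli / 1000;
}

// One avatar, two custom poses, 30 seconds of avatar_pro video.
console.log(estimateBudgetUsd({ avatars: 1, poses: 2, videoSeconds: 30, tier: "avatar_pro" }));
```

Because video is billed on actual audio duration, a figure like this is only an upper-level planning number, not a quote.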
Quick Start
```bash
# Create avatar from text description (generated)
curl -X POST https://getvrex.com/api/v1/avatars \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Aria",
    "description": "Professional female presenter, confident and warm expression, business attire",
    "language": "en",
    "nationality": "American",
    "gender": "female",
    "accent": "Neutral"
  }'

# Poll for completion
curl https://getvrex.com/api/v1/avatars/av_abc123def456 \
  -H "Authorization: Bearer sk-your-api-key"
```
Avatar creation involves portrait generation, voice harvesting, and TTS voice cloning; expect 60–300 seconds. Poll every 5–10 seconds or use webhooks for zero-latency notification.
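The polling advice can be sketched as a small loop. This is a minimal illustration, not the SDK's implementation: `waitForAvatar` and `fetchAvatar` are hypothetical names, the 10-minute default timeout is an assumption chosen to sit above the 60–300 second range, and the built-in `fetch` requires Node 18 or later.

```javascript
// Minimal polling sketch. `fetchAvatar` is a caller-supplied async function
// (hypothetical) that performs GET /api/v1/avatars/:id and returns parsed JSON.
async function waitForAvatar(fetchAvatar, { intervalMs = 5000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const avatar = await fetchAvatar();
    if (avatar.status === "ready") return avatar;
    if (avatar.status === "failed") {
      throw new Error(`Avatar creation failed: ${avatar.error ?? "unknown error"}`);
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error("Timed out waiting for avatar to become ready");
    }
    // Still pending or processing: wait before the next poll.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Example wiring with the built-in fetch; the ID is the one returned on creation.
const fetchAvatarOnce = async () => {
  const res = await fetch("https://getvrex.com/api/v1/avatars/av_abc123def456", {
    headers: { Authorization: "Bearer sk-your-api-key" },
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
};
// waitForAvatar(fetchAvatarOnce).then((avatar) => console.log(avatar.image_url));
```

Passing the fetcher in as a parameter keeps the loop testable without network access.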
API Reference — Avatars
POST /avatars
Submit an async avatar creation job. Two modes: 'generated' (AI portrait from description) or 'video' (face + voice extracted from a talking-head clip).
Request Body
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Required | — | Avatar display name (1–60 chars). |
| description | string | Optional | — | Portrait/persona description (1–2000 chars). Required for source="generated"; optional for source="video" (auto-defaulted). |
| language | string | Required | — | Language code: en, vi, zh, ja, ko, es, fr, de, pt, ar. |
| nationality | string | Required | — | Nationality label (1–60 chars), e.g. "American", "Vietnamese". |
| gender | string | Optional | — | Gender hint: male or female (guides voice selection). |
| accent | string | Optional | — | Optional accent label (max 40 chars), e.g. "Northern", "Southern". |
| source | string | Optional | generated | Creation mode: "generated" (text description → AI portrait) or "video" (talking-head clip → face frame + real-voice clone). |
| ref_image_data_url | string | Optional | — | Seed image for i2i portrait. Data URL (data:image/...;base64,...) or HTTPS URL (≤4 MB). Ignored when source="video". |
| source_video_url | string | Optional | — | Required when source="video". Data URL (data:video/*;base64,...) or public HTTPS URL. Max 80 MB. MP4, MOV, WebM, M4V. The clip becomes the avatar's demo video. |
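The constraints in the table can be checked client-side before spending a request. The sketch below is a hypothetical pre-check (`validateCreateAvatar` is not part of the API or SDK); the server remains the source of truth, and it may count characters differently from JavaScript's `length`, which counts UTF-16 code units.

```javascript
const SUPPORTED_LANGUAGES = ["en", "vi", "zh", "ja", "ko", "es", "fr", "de", "pt", "ar"];

// Hypothetical pre-check: returns a list of problems (empty when the body looks valid).
function validateCreateAvatar(body) {
  const problems = [];
  const source = body.source ?? "generated"; // documented default
  const lengthBetween = (value, min, max) =>
    typeof value === "string" && value.length >= min && value.length <= max;

  if (!lengthBetween(body.name, 1, 60)) problems.push("name must be 1-60 chars");
  if (!SUPPORTED_LANGUAGES.includes(body.language)) problems.push("language is not supported");
  if (!lengthBetween(body.nationality, 1, 60)) problems.push("nationality must be 1-60 chars");
  if (source !== "generated" && source !== "video") {
    problems.push('source must be "generated" or "video"');
  }
  if (body.description !== undefined && !lengthBetween(body.description, 1, 2000)) {
    problems.push("description must be 1-2000 chars");
  } else if (body.description === undefined && source === "generated") {
    problems.push('description is required when source is "generated"');
  }
  if (body.gender !== undefined && !["male", "female"].includes(body.gender)) {
    problems.push("gender must be male or female");
  }
  if (body.accent !== undefined && !lengthBetween(body.accent, 0, 40)) {
    problems.push("accent must be at most 40 chars");
  }
  if (source === "video" && !body.source_video_url) {
    problems.push('source_video_url is required when source is "video"');
  }
  return problems;
}

console.log(validateCreateAvatar({ name: "Aria", language: "en", nationality: "American" }));
```

The size and format limits on `ref_image_data_url` and `source_video_url` are left to the server here, since checking them needs the actual payload.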
Response (202 Accepted)
```json
{
  "id": "av_abc123def456",
  "status": "pending",
  "estimated_cost_usd": 1.00,
  "poll_url": "/api/v1/avatars/av_abc123def456"
}
```
Errors
| Code | Status | Description |
|---|---|---|
| INVALID_JSON | 400 | Request body is not valid JSON |
| VALIDATION_ERROR | 400 | Invalid request (name/description missing, language unsupported, etc.) |
| INSUFFICIENT_BALANCE | 402 | Insufficient balance. Please top up your account. |
| QUEUE_FULL | 429 | Avatar creation queue at capacity (sends Retry-After: 60) |
| RATE_LIMITED | 429 | Rate limit exceeded |
| FEATURE_DISABLED | 503 | Avatar creation temporarily disabled (AVATARS_API_ENABLED not set) |
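Of the errors above, only the 429 responses are natural retry candidates. The sketch below shows one way to pick a wait time; `retryDelayMs` is a hypothetical helper, and the exponential fallback for a missing `Retry-After` header is an assumption rather than documented behaviour.

```javascript
// Hypothetical helper: how long to wait before retrying a request.
// Returns a delay in milliseconds, or null when a blind retry is not useful.
function retryDelayMs(status, retryAfterHeader, attempt) {
  if (status !== 429) return null; // 400, 402 and 503 need a fix, not a retry
  const seconds = Number.parseInt(retryAfterHeader ?? "", 10);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return seconds * 1000; // honour the server's hint, e.g. Retry-After: 60
  }
  // Assumption: no usable header, so fall back to capped exponential backoff.
  return Math.min(1000 * 2 ** attempt, 60000);
}

console.log(retryDelayMs(429, "60", 0));
```

`attempt` is zero-based, so the fallback waits 1 s, 2 s, 4 s and so on, up to a 60 s cap.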
GET /avatars
List all avatars accessible to the caller: owned avatars plus all public system avatars. Supports pagination.
Query Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| page | number | Optional | 1 | Page number for pagination. |
| limit | number | Optional | 20 | Items per page (max 100). |
Response (200 OK)
```json
{
  "data": [
    {
      "id": "av_abc123def456",
      "name": "Aria",
      "description": "Professional female presenter...",
      "language": "en",
      "nationality": "American",
      "gender": "female",
      "accent": "Neutral",
      "image_url": "https://r2.getvrex.com/avatars/...",
      "status": "ready",
      "stage": null,
      "is_system": false,
      "is_public": false,
      "cost_usd": 1.00,
      "error": null,
      "created_at": "2025-01-01T00:00:00.000Z",
      "updated_at": "2025-01-01T00:05:00.000Z"
    }
  ],
  "page": 1,
  "limit": 20
}
```
status: pending, processing, ready, failed
stage: Creation progress — one of portrait, demo, voice, poses
image_url: Presigned portrait URL (1-hour expiry) when status="ready", else null
is_system: true for pre-seeded system avatars (not deletable via API)
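Walking every page might look like the sketch below. `listAllAvatars` and `fetchPage` are hypothetical names. Since the response shown carries no total count, the stop condition (a page shorter than `limit`) is an assumption, and a page cap guards against looping forever if that assumption is wrong.

```javascript
// `fetchPage` is a caller-supplied async function (hypothetical) that performs
// GET /api/v1/avatars?page=<page>&limit=<limit> and returns the parsed JSON body.
async function listAllAvatars(fetchPage, { limit = 100, maxPages = 50 } = {}) {
  const all = [];
  for (let page = 1; page <= maxPages; page++) {
    const { data } = await fetchPage(page, limit);
    all.push(...data);
    // Assumption: a page shorter than `limit` is the last one.
    if (data.length < limit) break;
  }
  return all;
}

// Separate pre-seeded system avatars from the caller's own.
function partitionAvatars(avatars) {
  return {
    system: avatars.filter((a) => a.is_system),
    owned: avatars.filter((a) => !a.is_system),
  };
}
```

Because `image_url` values expire after an hour, it is safer to re-fetch an avatar when the URL is needed than to cache the list for long.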
GET /avatars/:id
Fetch a single avatar by ID. Must be owned by the caller or a public system avatar.
Response (200 OK)
```json
{
  "id": "av_abc123def456",
  "name": "Aria",
  "description": "Professional female presenter...",
  "language": "en",
  "nationality": "American",
  "gender": "female",
  "accent": "Neutral",
  "image_url": "https://r2.getvrex.com/avatars/...",
  "status": "ready",
  "stage": null,
  "is_system": false,
  "is_public": false,
  "tts_voice_id": "voice-abc123xyz",
  "cost_usd": 1.00,
  "error": null,
  "created_at": "2025-01-01T00:00:00.000Z",
  "updated_at": "2025-01-01T00:05:00.000Z"
}
```
tts_voice_id: Cloned TTS voice ID populated once status="ready". Use it directly with the /tts endpoint to synthesise speech in the avatar's voice without generating a full video.
Errors
| Code | Status | Description |
|---|---|---|
| NOT_FOUND | 404 | Avatar not found or not accessible |
| STORAGE_ERROR | 502 | Portrait key exists but R2 presigning failed |
| FEATURE_DISABLED | 503 | Avatar API disabled |
DELETE /avatars/:id
Delete an owned avatar and all associated assets (portrait, videos, poses, TTS voice). System avatars cannot be deleted via API.
Response
Errors
| Code | Status | Description |
|---|---|---|
| NOT_FOUND | 404 | Avatar not found |
| FORBIDDEN | 403 | Avatar not owned by caller; system avatars are not deletable |
| FEATURE_DISABLED | 503 | Avatar API disabled |
API Reference — Avatar Poses
Poses are i2i variations of an avatar's base portrait, usable as the starting frame for video generation. Each pose costs $0.013 (the standard image generation rate). The base portrait is always available as a free pose.
POST /avatars/:id/poses
Submit a pose creation job (i2i variation of the avatar's portrait). Avatar must be in 'ready' status.
Request Body
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Required | — | Pose description for i2i variation (1–600 chars). |
| name | string | Optional | — | Display label (max 60 chars). Defaults to "Pose" if omitted. |
Response (201 Created)
```json
{
  "id": "ap_xyz789abc123",
  "avatar_id": "av_abc123def456",
  "prompt": "side profile, arms crossed, confident posture, business casual",
  "name": "Side Profile",
  "image_key": "avatars/user-id/pose-id.jpg",
  "image_url": "https://r2.getvrex.com/avatars/...",
  "status": "ready",
  "is_base": false,
  "cost_usd": 0.013
}
```
Errors
| Code | Status | Description |
|---|---|---|
| INVALID_JSON | 400 | Request body is not valid JSON |
| VALIDATION_ERROR | 400 | Prompt missing or too long (max 600 chars) |
| NOT_FOUND | 404 | Avatar not found or not accessible |
| WALLET_NOT_FOUND | 404 | No billing wallet exists yet (top up to initialise) |
| AVATAR_NOT_READY | 409 | Avatar not in ready status (still processing) |
| INSUFFICIENT_BALANCE_FOR_POSE | 402 | Insufficient balance to create a pose |
| DAILY_CAP_REACHED | 429 | Per-tier daily cap reached (resets midnight UTC) |
| RATE_LIMITED | 429 | Too many requests — see Retry-After header |
| FEATURE_DISABLED | 503 | Avatar API disabled |
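For DAILY_CAP_REACHED, the earliest useful retry is the next midnight UTC. A minimal sketch of that calculation follows; `msUntilUtcMidnight` is an illustrative helper, not an SDK function.

```javascript
// Milliseconds from `now` until the next 00:00:00 UTC, when the daily cap resets.
function msUntilUtcMidnight(now = new Date()) {
  // Date.UTC normalises day overflow, so "date + 1" is safe at month end.
  const nextMidnight = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return nextMidnight - now.getTime();
}

const waitMs = msUntilUtcMidnight();
console.log(`Daily cap resets in about ${Math.ceil(waitMs / 60000)} minutes`);
```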
GET /avatars/:id/poses
List all poses for an avatar: the base portrait plus any custom poses created by the caller.
Response (200 OK)
```json
{
  "data": [
    {
      "id": "ap_base-portrait-id",
      "avatar_id": "av_abc123def456",
      "name": "Base Portrait",
      "prompt": null,
      "image_key": null,
      "image_url": "https://r2.getvrex.com/avatars/...",
      "status": "ready",
      "is_base": true,
      "cost_usd": 0.0,
      "created_at": "2025-01-01T00:00:00.000Z"
    },
    {
      "id": "ap_xyz789abc123",
      "avatar_id": "av_abc123def456",
      "name": "Side Profile",
      "prompt": "side profile, arms crossed...",
      "image_key": "avatars/user-id/pose-id.jpg",
      "image_url": "https://r2.getvrex.com/avatars/...",
      "status": "ready",
      "is_base": false,
      "cost_usd": 0.013,
      "created_at": "2025-01-01T00:10:00.000Z"
    }
  ]
}
```
is_base: true for the avatar's default portrait (free; always present)
prompt / image_key: null for base portrait, populated for custom poses
Errors
| Code | Status | Description |
|---|---|---|
| NOT_FOUND | 404 | Avatar not found or not accessible |
| FEATURE_DISABLED | 503 | Avatar API disabled |
API Reference — Avatar Videos
Generate a lipsync talking video from any ready avatar and a text script. Cost is estimated at request time from text length, then reconciled to actual audio duration after completion — you are only charged for what was produced.
Text limit: 600 characters (~30 seconds of audio).
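The request-time estimate can be approximated locally. The sketch below assumes roughly 20 characters per second, a figure inferred from "600 characters (~30 seconds)"; the server's actual estimator is not documented, and `estimateVideo` is a hypothetical helper. For the 60-character sample text used in this section it yields 3 seconds and $0.15 on the avatar tier, which matches the sample response but does not confirm the rule.

```javascript
// Assumption: ~20 characters per second, derived from the documented text limit.
const CHARS_PER_SECOND = 20;
const MAX_TEXT_CHARS = 600;
const CENTS_PER_SECOND = { avatar: 5, avatar_pro: 8 }; // $0.05/s and $0.08/s

// Hypothetical local estimate of duration and cost for a script.
function estimateVideo(text, tier = "avatar") {
  if (text.length < 1 || text.length > MAX_TEXT_CHARS) {
    throw new RangeError(`text must be 1-${MAX_TEXT_CHARS} characters`);
  }
  if (!Object.hasOwn(CENTS_PER_SECOND, tier)) {
    throw new Error(`Unknown tier: ${tier}`);
  }
  const seconds = Math.ceil(text.length / CHARS_PER_SECOND);
  // Work in whole cents to avoid floating-point drift.
  return { seconds, costUsd: (seconds * CENTS_PER_SECOND[tier]) / 100 };
}

const sample = "Welcome to our product demo. Let me show you what we can do.";
console.log(estimateVideo(sample, "avatar"));
```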
```bash
# Generate an avatar lipsync video
curl -X POST https://getvrex.com/api/v1/avatar-videos \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "avatar_id": "av_abc123def456",
    "text": "Welcome to our product demo. Let me show you what we can do.",
    "tier": "avatar",
    "video_prompt": "professional confidence, relaxed posture",
    "webhook_url": "https://your-server.com/webhook"
  }'

# Poll for completion
curl https://getvrex.com/api/v1/avatar-videos/vg_abc123def456 \
  -H "Authorization: Bearer sk-your-api-key"
```
POST /avatar-videos
Submit an async lipsync video generation job.
Request Body
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| avatar_id | string | Required | — | Avatar ID to use (av_...). |
| text | string | Required | — | Speech text (1–600 chars, ~30 s max). |
| tier | string | Optional | avatar | Quality tier: "avatar" ($0.05/s, fast) or "avatar_pro" ($0.08/s, premium). |
| pose_id | string | Optional | — | Custom pose ID (ap_...); uses base portrait if omitted. |
| video_prompt | string | Optional | — | Behavior/movement hint for the lipsync model (max 300 chars). |
| webhook_url | string | Optional | — | HTTPS URL to receive video.completed or video.failed events. |
| output_format | string | Optional | url | "url" (presigned, 1-hour expiry) or "raw_key" (R2 object key). Affects the video_url field in GET /avatar-videos/:id — no effect on the 202 create response. |
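Scripts longer than the 600-character `text` limit have to be sent as several jobs. One possible approach, sketched below, is to split at sentence boundaries and submit one request per chunk. `splitScript` is a hypothetical helper; the documentation does not describe any server-side way to join the resulting clips, so stitching them is left to the caller.

```javascript
// Hypothetical helper: split a long script into chunks of at most `maxLen`
// characters, breaking only after sentence-ending punctuation (. ! ?).
function splitScript(text, maxLen = 600) {
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (sentence.trim().length > maxLen) {
      throw new Error("A single sentence exceeds the per-request limit");
    }
    if ((current + sentence).trim().length > maxLen) {
      // Adding this sentence would overflow: close the current chunk.
      chunks.push(current.trim());
      current = sentence;
    } else {
      current += sentence;
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

console.log(splitScript("One two. Three four. Five.", 12));
```

Each chunk can then be used as the `text` of its own POST /avatar-videos request with the same `avatar_id` and `tier`.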
Response (202 Accepted)
```json
{
  "id": "vg_abc123def456",
  "status": "queued",
  "tier": "avatar",
  "estimated_cost_usd": 0.15,
  "poll_url": "/api/v1/avatar-videos/vg_abc123def456"
}
```
Errors
| Code | Status | Description |
|---|---|---|
| INVALID_JSON | 400 | Request body is not valid JSON |
| VALIDATION_ERROR | 400 | Invalid request (text missing, too long, tier invalid, etc.) |
| POLICY_VIOLATION | 400 | Text fails safety filter |
| INSUFFICIENT_BALANCE | 402 | Insufficient balance |
| NOT_FOUND | 404 | Avatar or pose not found |
| AVATAR_NOT_READY | 409 | Avatar not in ready status, or avatar lacks portrait/voice |
| QUEUE_FULL | 429 | Avatar video queue at capacity (sends Retry-After: 60) |
| RATE_LIMITED | 429 | Rate limit exceeded |
| FEATURE_DISABLED | 503 | Avatar API disabled |
GET /avatar-videos/:id
Poll the status and download link of an avatar video generation job.
Query Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| output_format | string | Optional | url | "url" (presigned, 1-hour expiry) or "raw_key" (R2 object key). |
Response (200 OK)
```json
{
  "id": "vg_abc123def456",
  "status": "completed",
  "tier": "avatar",
  "seconds": 3,
  "actual_seconds": 3,
  "avatar_id": "av_abc123def456",
  "pose_id": "ap_xyz789abc123",
  "video_url": "https://r2.getvrex.com/avatar-videos/...",
  "cost_usd": 0.15,
  "actual_cost_usd": 0.15,
  "error": null,
  "created_at": "2025-01-01T12:00:00.000Z",
  "completed_at": "2025-01-01T12:00:30.000Z"
}
```
status: queued, processing, completed, failed
seconds: Estimated duration (from text length at request time)
actual_seconds: Actual measured audio duration (populated only after completion)
cost_usd: Estimated charge held at submission
actual_cost_usd: Final reconciled cost; reduced if actual duration < estimate; null on failure (refunded in full)
video_url: Presigned URL (1-hour expiry) or R2 key per output_format; null until completed
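The relationship between `cost_usd` and `actual_cost_usd` can be summarised with a small helper. This is an illustrative sketch: `summariseBilling` and its return shape are hypothetical, and because the documentation does not say whether the final cost can exceed the estimate, the released amount is clamped at zero.

```javascript
// Hypothetical helper: summarise billing for a job from GET /avatar-videos/:id.
function summariseBilling(job) {
  const toCents = (usd) => Math.round(usd * 100);
  const heldCents = toCents(job.cost_usd);
  if (job.status === "failed") {
    // Failed jobs are refunded in full.
    return { settled: true, chargedUsd: 0, releasedUsd: heldCents / 100 };
  }
  if (job.status !== "completed" || job.actual_cost_usd == null) {
    // Still queued or processing, or reconciliation has not finished.
    return { settled: false, chargedUsd: null, releasedUsd: null };
  }
  const chargedCents = toCents(job.actual_cost_usd);
  return {
    settled: true,
    chargedUsd: chargedCents / 100,
    // Clamp at zero: a final cost above the estimate is not documented.
    releasedUsd: Math.max(heldCents - chargedCents, 0) / 100,
  };
}

console.log(summariseBilling({ status: "completed", cost_usd: 0.25, actual_cost_usd: 0.15 }));
```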
Errors
| Code | Status | Description |
|---|---|---|
| UNAUTHORIZED | 401 | Missing or invalid API key |
| VALIDATION_ERROR | 400 | output_format is not url or raw_key |
| NOT_FOUND | 404 | Video not found or does not belong to caller |
| STORAGE_ERROR | 502 | Failed to generate presigned URL |
| FEATURE_DISABLED | 503 | Avatar API disabled |
Webhooks
Set webhook_url in the POST /avatar-videos request to receive events when the job reaches a terminal state. Your server must respond with 2xx within 10 seconds.
```json
{
  "event": "video.completed",
  "video_gen_id": "vg_abc123def456",
  "video_url": "https://r2.getvrex.com/avatar-videos/...",
  "duration_sec": 3,
  "tier": "avatar",
  "actual_cost_usd": 0.15,
  "timestamp": "2025-01-01T12:00:30.000Z"
}
```
tier: Lipsync tier used — "avatar" or "avatar_pro"
actual_cost_usd: Present once cost reconciliation completes; may be omitted if reconciliation has not yet finished at webhook dispatch time
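A receiver has to handle both terminal events and tolerate a missing `actual_cost_usd`. The sketch below covers only the payload handling; `handleVideoWebhook` and its return shape are hypothetical, and how the function is mounted in an HTTP server is left open.

```javascript
// Hypothetical handler: turn a webhook payload into a small, uniform record.
function handleVideoWebhook(payload) {
  switch (payload.event) {
    case "video.completed":
      return {
        id: payload.video_gen_id,
        ok: true,
        videoUrl: payload.video_url,
        // May be absent if reconciliation had not finished at dispatch time.
        actualCostUsd: payload.actual_cost_usd ?? null,
      };
    case "video.failed":
      return { id: payload.video_gen_id, ok: false, error: payload.error };
    default:
      throw new Error(`Unexpected event type: ${payload.event}`);
  }
}

console.log(handleVideoWebhook({ event: "video.failed", video_gen_id: "vg_abc123def456", error: "example failure" }));
```

Since the endpoint must answer with 2xx within 10 seconds, it is safer to acknowledge first and do slow work (downloads, database writes) afterwards. When `actualCostUsd` comes back as null, the final figure can be read later from GET /avatar-videos/:id.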
```json
{
  "event": "video.failed",
  "video_gen_id": "vg_abc123def456",
  "error": "wavespeed API returned 502: service unavailable",
  "timestamp": "2025-01-01T12:00:30.000Z"
}
```
SDK
The official JavaScript SDK exposes all avatar operations under client.avatar.*:
create · list · get · delete · createPose · listPoses · generateVideo · videoStatus · waitForVideo
```javascript
import { VrexClient } from "@vrex/sdk";

const client = new VrexClient({ apiKey: "sk-your-api-key" });

// Create an avatar (async; polls automatically)
const avatar = await client.avatar.create({
  name: "Aria",
  description: "Professional female presenter, confident and warm",
  language: "en",
  nationality: "American",
  gender: "female",
});

// List avatars
const { data } = await client.avatar.list({ page: 1, limit: 20 });

// Create a custom pose
const pose = await client.avatar.createPose(avatar.id, {
  prompt: "side profile, arms crossed, confident posture",
  name: "Side Profile",
});

// Generate a lipsync video (polls until complete)
const video = await client.avatar.waitForVideo({
  avatar_id: avatar.id,
  pose_id: pose.id,
  text: "Hello! Welcome to the demo.",
  tier: "avatar_pro",
});
console.log("Video URL:", video.video_url);
```