SkyReels V4 Video Generation
- Two model tiers: Fast (speed-optimized) and Std (quality-optimized)
- Three modes auto-routed by request fields: Text-to-Video (T2V), Image-to-Video (I2V), Multimodal Reference (Omni)
- 480p / 720p / 1080p resolution, 3 ~ 15 seconds duration
- Advanced features: first/end/key frame, reference images, reference videos, grid collage, video extension, audio sync
- Async processing mode, returns a task ID for later query
POST
Authorization
All API endpoints require Bearer Token authentication.
Get your API Key: visit the API Key Management Page to get your API Key.
Add it to the Authorization request header as a Bearer token.
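A minimal Python sketch of building that header, assuming the standard Bearer scheme (Authorization: Bearer followed by the key). The auth_headers helper, the placeholder key, and the JSON content type are illustrative assumptions, not part of the documented API.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers for Bearer Token authentication."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is JSON; this page does not state the content type.
        "Content-Type": "application/json",
    }


headers = auth_headers("YOUR_API_KEY")
```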
Generation Modes
SkyReels V4 auto-routes to the correct mode based on request fields; no mode field is needed:
| Mode | Trigger | Capability |
|---|---|---|
| T2V (Text-to-Video) | Only prompt + general fields | Pure text-driven generation |
| I2V (Image-to-Video) | Any of first_frame_image / end_frame_image / mid_frame_images | First/end/key frame control |
| Omni (Multimodal Reference) | Any of ref_images / ref_videos | Subject reference, grid collage, motion reference, video extension, audio sync |
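The routing in the table can be mirrored on the client to predict which mode a request will trigger. This is only a sketch of the documented rules; the server's actual routing code is not shown on this page, and infer_mode is a hypothetical helper name.

```python
I2V_FIELDS = ("first_frame_image", "end_frame_image", "mid_frame_images")
OMNI_FIELDS = ("ref_images", "ref_videos")


def infer_mode(payload: dict) -> str:
    """Predict the generation mode from the fields present in a request body."""
    has_i2v = any(payload.get(field) for field in I2V_FIELDS)
    has_omni = any(payload.get(field) for field in OMNI_FIELDS)
    if has_i2v and has_omni:
        # The documentation states that I2V and Omni fields cannot be combined.
        raise ValueError("I2V fields and Omni fields cannot be used together")
    if has_omni:
        return "Omni"
    if has_i2v:
        return "I2V"
    return "T2V"
```

For example, a body with only a prompt maps to T2V, while adding first_frame_image switches it to I2V.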
@tag mechanism: When using mid_frame_images / ref_images / ref_videos, each element must declare a tag starting with @ (e.g., @image1, @Actor-1, @video1), and the tag must appear in the prompt.
Think of the prompt as the “script” and each tag as a “character pointer” to a specific asset (image or video). For example, a prompt like "@Actor-1 walks into the scene of @video1" instructs the system to inject the reference image subject tied to @Actor-1 and the motion reference tied to @video1 into the generation process.
Request Parameters
General Fields
Two model tiers are available:
| Model | Positioning | Use Cases |
|---|---|---|
| skyreels-v4-fast | Speed-first | Quick previews, batch generation, daily content |
| skyreels-v4-std | Quality-first (25~30% higher price than Fast) | Key shots, high-detail requirements, formal delivery |
Text prompt (prompt), max 1280 tokens. Describe scenes, subjects, actions, and styles in detail for better generation results.
When using ref_images / ref_videos / mid_frame_images, the prompt must contain the corresponding @tag (e.g., @Actor-1, @video1, @image1).
Example: "@Actor-1 walks through a neon-lit street at night."
Output video duration in seconds (duration)
- Range: [3, 15]
- Default: 5
Video resolution (resolution). Options:
- 480p
- 720p
- 1080p (default)
Aspect ratio (aspect_ratio). Options:
- 16:9 (default)
- 4:3
- 1:1
- 9:16
- 3:4
Whether to auto-optimize the prompt. When enabled, the system automatically optimizes your prompt for better generation results.
I2V-Specific Fields
First frame image URL (first_frame_image); supported formats: jpg / jpeg / png / gif / bmp. When provided, this image is used as the starting frame of the video.
End frame image URL (end_frame_image); supported formats: jpg / jpeg / png / gif / bmp. When provided, this image is used as the ending frame of the video. Can be combined with first_frame_image for first-and-last-frame control.
Mid keyframe list (mid_frame_images), up to 6. Each element declares, among its fields, a tag and a time_stamp; see Parameter Constraints for the allowed values.
Omni-Specific Fields
Reference image list (ref_images); all elements must share the same type. Each element's fields include type, tag, image_urls, and, for type=image, an optional audio_url.
Reference video list (ref_videos), up to 1. Each element's fields include type, tag, and video_url.
Supported Scenarios
The following scenarios are supported by both skyreels-v4-fast and skyreels-v4-std:
| Scenario | Mode | Required Fields | Typical Use Case |
|---|---|---|---|
| Text-to-Video | T2V | prompt | Pure text-driven, rapid concept shots |
| Image-to-Video - First Frame | I2V | first_frame_image | Still-to-video with a specified starting frame |
| Image-to-Video - End Frame | I2V | end_frame_image | Specifies the closing frame |
| Image-to-Video - Keyframes | I2V | mid_frame_images (1 ~ 6) | First + end + mid keyframes for precise pacing |
| Omni Single/Multi-Subject | Omni | ref_images (type=image) | Character consistency, multi-subject framing |
| Omni Grid Collage | Omni | ref_images (type=grid, 1 image) | Step-by-step process videos (tutorials, recipes, demos) |
| Omni Motion Reference | Omni | ref_videos (type=reference) | Replicate the motion, subject, or style of a reference video |
| Omni Video Extension | Omni | ref_videos (type=extend) | Continue an existing video with new content |
| Omni Audio Sync | Omni | ref_images (type=image) + audio_url | Digital human narration, audio-driven lip-sync |
Parameter Constraints
Violating any of the following causes the request to be rejected with a 422 response; no billing occurs:
| Parameter | Constraint |
|---|---|
| prompt | Max 1280 tokens |
| duration | [3, 15] seconds; overridden by reference video length (max 10s) when ref_videos.type=reference |
| resolution | Only 480p / 720p / 1080p |
| aspect_ratio | 16:9 / 4:3 / 1:1 / 9:16 / 3:4; ignored in I2V; ignored when Omni carries ref_videos |
| mid_frame_images | Up to 6; time_stamp must be -1 or within (0, duration) |
| ref_images overall | All elements must share the same type; cannot coexist with I2V fields |
| ref_images.type=grid | List length must be 1; image_urls must contain exactly 1 image |
| ref_images.type=image | List length 1 ~ 3; each image_urls length 1 ~ 5 |
| ref_images.audio_url | Only supported when type=image; audio ≤ 15 seconds |
| ref_videos | Up to 1; video_url MP4 / MOV, ≤ 15 seconds |
| ref_videos.type=reference | Overrides requested duration (max 10s); can combine with ref_images.type=image; carries input video audio by default |
| ref_videos.type=extend | Billed by requested duration; cannot combine with ref_images |
| tag field | Must start with @ and appear in the prompt |
| I2V / Omni exclusion | I2V fields and Omni fields cannot be used together |
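Several of these constraints can be checked locally before submitting, which avoids a round trip that ends in a 422. The sketch below is a hypothetical client-side pre-check, not the server's validator. It skips the rules that need external information (prompt token count, audio and video length), checks time_stamp only when it is present, and uses a plain substring test for tags, so @image1 would also match inside @image10.

```python
RESOLUTIONS = {"480p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "4:3", "1:1", "9:16", "3:4"}
I2V_FIELDS = ("first_frame_image", "end_frame_image", "mid_frame_images")


def validate(payload: dict) -> list[str]:
    """Return a list of constraint violations; an empty list means none were found."""
    errors = []
    prompt = payload.get("prompt", "")
    duration = payload.get("duration", 5)  # documented default
    mids = payload.get("mid_frame_images") or []
    ref_images = payload.get("ref_images") or []
    ref_videos = payload.get("ref_videos") or []

    if not 3 <= duration <= 15:
        errors.append("duration must be within [3, 15]")
    if payload.get("resolution", "1080p") not in RESOLUTIONS:
        errors.append("resolution must be 480p, 720p, or 1080p")
    if payload.get("aspect_ratio", "16:9") not in ASPECT_RATIOS:
        errors.append("unsupported aspect_ratio")

    if len(mids) > 6:
        errors.append("mid_frame_images allows up to 6 elements")
    for frame in mids:
        if "time_stamp" in frame:
            ts = frame["time_stamp"]
            if ts != -1 and not 0 < ts < duration:
                errors.append("time_stamp must be -1 or within (0, duration)")

    types = {img.get("type") for img in ref_images}
    if len(types) > 1:
        errors.append("all ref_images elements must share the same type")
    if types == {"grid"}:
        if len(ref_images) != 1 or len(ref_images[0].get("image_urls", [])) != 1:
            errors.append("type=grid requires exactly 1 element with 1 image")
    if types == {"image"}:
        if not 1 <= len(ref_images) <= 3:
            errors.append("type=image allows 1 to 3 elements")
        if any(not 1 <= len(img.get("image_urls", [])) <= 5 for img in ref_images):
            errors.append("each image_urls list must hold 1 to 5 images")
    if any(img.get("audio_url") and img.get("type") != "image" for img in ref_images):
        errors.append("audio_url is only supported when type=image")

    if len(ref_videos) > 1:
        errors.append("ref_videos allows up to 1 element")
    if ref_images and any(v.get("type") == "extend" for v in ref_videos):
        errors.append("type=extend cannot combine with ref_images")
    if (ref_images or ref_videos) and any(payload.get(f) for f in I2V_FIELDS):
        errors.append("I2V fields and Omni fields cannot be used together")

    for item in [*mids, *ref_images, *ref_videos]:
        tag = item.get("tag", "")
        if not tag.startswith("@") or tag not in prompt:
            errors.append(f"tag {tag!r} must start with @ and appear in the prompt")
    return errors
```

Returning every violation at once, instead of raising on the first, lets a caller fix a request in one pass.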
Response
Response status code, 200 on success
Response data array
Request Examples
Case 1: Text-to-Video (Minimal)
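A hedged sketch of a minimal Text-to-Video request body in Python. The key name model is an assumption (this page lists the tier values but not the field name), the prompt text is illustrative, and the endpoint URL is not shown here, so only the body is built. Omitted fields fall back to the documented defaults: 5 seconds, 1080p, 16:9.

```python
import json

payload = {
    "model": "skyreels-v4-fast",  # assumption: the tier is selected via a "model" key
    "prompt": "A paper boat drifts down a rain-soaked street at dusk.",
}

# Serialize for use as the POST body.
body = json.dumps(payload)
```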
Case 2: Text-to-Video (Full Parameters)
Case 3: Image-to-Video - First Frame
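A possible first-frame request body, sketched in Python. The first_frame_payload helper, the model key name, and the example URL are assumptions. The extension check is only a client-side heuristic based on the documented formats; a URL without a file extension may still point to a valid image.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_IMAGE_TYPES = {"jpg", "jpeg", "png", "gif", "bmp"}


def first_frame_payload(prompt: str, image_url: str, duration: int = 5) -> dict:
    """Build an I2V body that uses image_url as the starting frame."""
    # Look only at the URL path so query strings do not affect the check.
    suffix = PurePosixPath(urlparse(image_url).path).suffix.lower().lstrip(".")
    if suffix not in ALLOWED_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {suffix or 'none'}")
    return {
        "model": "skyreels-v4-std",  # assumption: key name "model"
        "prompt": prompt,
        "first_frame_image": image_url,
        "duration": duration,
        # aspect_ratio is left out because it is ignored in I2V.
    }


payload = first_frame_payload(
    "The cat stretches and jumps off the sofa.",
    "https://example.com/cat.PNG?sig=1",
)
```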
Case 4: Image-to-Video - First/End Frame + Mid Keyframes
Case 5: Omni - Single Subject Reference
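A sketch of a single-subject Omni body in Python. The element fields (type, tag, image_urls) follow the Parameter Constraints table; the model key name and the example.com URLs are assumptions for illustration.

```python
payload = {
    "model": "skyreels-v4-std",  # assumption: key name "model"
    "prompt": "@Actor-1 walks through a neon-lit street at night.",
    "ref_images": [
        {
            "type": "image",
            "tag": "@Actor-1",  # must start with @ and appear in the prompt
            "image_urls": [  # 1 to 5 images per element
                "https://example.com/actor-front.png",
                "https://example.com/actor-side.png",
            ],
        }
    ],
    "resolution": "720p",
    "aspect_ratio": "16:9",
}

# True when every declared tag is referenced by the prompt.
tags_in_prompt = all(ref["tag"] in payload["prompt"] for ref in payload["ref_images"])
```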
Case 6: Omni - Multi-Subject + Video Motion Reference
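One way such a body could look, sketched in Python: two type=image subjects combined with one type=reference video. The model key name, the prompt, and the URLs are illustrative assumptions; the element fields come from the Parameter Constraints table.

```python
payload = {
    "model": "skyreels-v4-std",  # assumption: key name "model"
    "prompt": "@Actor-1 and @Actor-2 dance together, following the motion of @video1.",
    "ref_images": [
        {"type": "image", "tag": "@Actor-1", "image_urls": ["https://example.com/actor1.png"]},
        {"type": "image", "tag": "@Actor-2", "image_urls": ["https://example.com/actor2.png"]},
    ],
    "ref_videos": [
        {"type": "reference", "tag": "@video1", "video_url": "https://example.com/dance.mp4"},
    ],
    # duration and aspect_ratio are omitted: with type=reference the reference
    # video length overrides the requested duration (max 10s), and aspect_ratio
    # is ignored when ref_videos is present.
}

# Collect any declared tag that the prompt forgets to mention.
tags = [item["tag"] for item in payload["ref_images"] + payload["ref_videos"]]
missing = [tag for tag in tags if tag not in payload["prompt"]]
```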
Case 7: Omni - Grid Collage
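A sketch of a grid collage body in Python, assuming a hypothetical grid_collage_payload helper and a model key name not confirmed by this page. It encodes the two documented grid rules: exactly one ref_images element, holding exactly one image.

```python
def grid_collage_payload(prompt: str, grid_image_url: str, tag: str = "@image1") -> dict:
    """Build an Omni body that references a single grid collage image."""
    if not tag.startswith("@") or tag not in prompt:
        raise ValueError("tag must start with @ and appear in the prompt")
    return {
        "model": "skyreels-v4-fast",  # assumption: key name "model"
        "prompt": prompt,
        # type=grid: the list holds exactly 1 element with exactly 1 image.
        "ref_images": [{"type": "grid", "tag": tag, "image_urls": [grid_image_url]}],
    }


payload = grid_collage_payload(
    "Follow the steps shown in @image1 to fold a paper crane.",
    "https://example.com/crane-steps.jpg",
)
```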
Case 8: Omni - Video Extension (extend)
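A sketch of a video extension body in Python. The model key name, the prompt, and the URL are assumptions; the ref_videos element fields follow the Parameter Constraints table. Because type=extend cannot combine with ref_images, the body carries no reference images.

```python
payload = {
    "model": "skyreels-v4-fast",  # assumption: key name "model"
    "prompt": "Continue @video1: the camera pulls back to reveal the whole city.",
    "ref_videos": [
        {
            "type": "extend",
            "tag": "@video1",
            "video_url": "https://example.com/clip.mov",  # MP4 or MOV, at most 15 seconds
        }
    ],
    "duration": 8,  # type=extend is billed by the requested duration
}
```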
Case 9: Omni - Audio Sync (Voice-Driven)
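A sketch of an audio-driven body in Python. Per the Parameter Constraints table, audio_url sits on a ref_images element with type=image and the audio must not exceed 15 seconds. The model key name, the URLs, and the MP3 format are assumptions; this page does not list the accepted audio formats.

```python
payload = {
    "model": "skyreels-v4-std",  # assumption: key name "model"
    "prompt": "@Actor-1 faces the camera and narrates the product introduction.",
    "ref_images": [
        {
            "type": "image",  # audio_url is only supported when type=image
            "tag": "@Actor-1",
            "image_urls": ["https://example.com/presenter.png"],
            "audio_url": "https://example.com/narration.mp3",  # assumption: MP3 is accepted
        }
    ],
}
```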
Query Task Results
Video generation is an async task that returns a task_id upon submission. Use the Get Task Status endpoint to query generation progress and results.
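Polling for the result can be sketched as a small loop. The Get Task Status endpoint and its response shape are not described on this page, so the sketch takes a caller-supplied fetch_status function. The status key and the "succeeded" / "failed" values are hypothetical placeholders to adapt to the real response.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_status: Callable[[str], dict],
    task_id: str,
    interval: float = 5.0,
    timeout: float = 600.0,
    done_states: tuple = ("succeeded", "failed"),  # hypothetical status values
) -> dict:
    """Call fetch_status(task_id) until a terminal state or the timeout is reached."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status(task_id)
        if result.get("status") in done_states:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        time.sleep(interval)
```

Passing the fetch function in keeps the loop independent of any HTTP client and makes it easy to exercise with a stub.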