Vidu API Guide: MoE Model & Q-Series Access

Access Vidu MoE, Q3 Pro, and Q3 Turbo through one APIMart key. Compare the models, pricing from $0.048/sec, and the async API flow for text and image-to-video.

If I had to sum it up in one line: use Vidu MoE for harder prompt logic, use Q3 Pro for final output, and use Q3 Turbo for lower-cost testing through one APIMart setup.

Here’s the short version you can act on right away:

I can access Vidu MoE, Vidu Q3 Pro, and Vidu Q3 Turbo through APIMart with one API key and one main request flow.
The core endpoint is POST https://api.apimart.ai/v1/videos/generations.
Video jobs are async, so I get a task_id first, then I poll GET /v1/tasks/{task_id} or use callback_url.
Vidu supports:
- text-to-video
- image-to-video
- reference-based video
- first-last frame transitions
Q3 models add built-in audio like dialogue, sound effects, and music.
Clips can run up to 16 seconds, with 540p, 720p, or 1080p output.
APIMart lists these prices:
- Q3 Pro: about $0.12/sec at 720p
- Q3 Turbo: about $0.048/sec at 720p
Output links expire after 24 hours, so I should download files soon after success.

Vidu API Models Compared: MoE vs Q3 Pro vs Q3 Turbo

Quick comparison

Model	Best use	Main upside	Main tradeoff	Price on APIMart (USD/sec at 720p)
Vidu MoE	Harder multi-scene prompts	Better prompt control and scene logic	Slower and higher cost	Premium
Vidu Q3 Pro	Final videos	Higher-quality output, 1080p, audio-video sync	Costs more than Turbo	$0.12/sec
Vidu Q3 Turbo	Tests, drafts, batch work	Lower cost and lower wait time	Less visual detail than Pro	$0.048/sec

What stands out to me is how simple the switch is: in most cases, I just change the model field and keep the rest of the setup the same. That makes the decision less about setup work and more about picking the right model for cost, wait time, and output quality.

Vidu Models Explained: MoE vs. Q-Series

Vidu's MoE model: what it is and when to use it

The MoE (Mixture of Experts) model sends different parts of a generation task to specialized experts for motion, scene consistency, and prompt control. It makes the most sense for multi-scene or longer prompts where consistency matters more than raw speed.

There’s a catch, though. MoE takes more compute and has slower turnaround than the Q-Series ^[7]. For simple prompts, it’s often more than you need.

Vidu Q-Series and Vidu Q3 Pro: performance for production use

If MoE is the specialist, Q-Series is the option built for production work. Vidu Q3 Pro is designed for polished cinematic output and storyboard-driven videos ^[7]. It supports 1080p video, clips up to 16 seconds, and audio-video generation with synchronized dialogue and sound effects ^[1]^[2]^[4]. On APIMart, Q3 Pro starts at $0.12 per second ^[2]^[3].

Vidu Q3 Turbo leans more toward speed and lower cost, with faster scene switching ^[6]^[7]. On APIMart, Q3 Turbo starts at $0.048 per second ^[3].

How to choose between MoE and Q-Series for your workflow

This choice mostly comes down to prompt complexity, turnaround time, and budget. If your workflow depends on strict instruction-following and multi-scene logic, go with MoE. If you need polished output with audio-visual sync, Q3 Pro is the better fit. Alternatively, Kling V3 provides another high-fidelity option for cinematic AI video. If your main goal is fast iteration or lower cost per clip, Q3 Turbo is the practical pick.

The table below maps each model to the kind of work it handles best. For those comparing high-end options, Sora 2 offers similar cinematic capabilities with synchronized audio.

Model	Best For	Strengths	Tradeoffs	Latency	Pricing (USD/sec)
Vidu MoE	Complex multi-scene narratives	Instruction-following, scene logic, consistency	Higher compute cost, slower turnaround	High	Premium
Vidu Q3 Pro	Cinematic production	Visual quality, audio-visual sync, storyboard generation	Higher cost than Turbo	Medium	$0.12 ^[2]
Vidu Q3 Turbo	Rapid iteration & batching	Generation speed, cost-efficiency, faster scene switching	Slightly lower visual detail	Low	$0.048 ^[3]

Next, see how to select a model, authenticate, and send the request through APIMart.

How to Access Vidu Through APIMart

Account setup, authentication, and API key handling

After you pick a model, you can send jobs through APIMart with one API key. First, create an APIMart account and generate your key from the API key management page in the dashboard ^[2]^[3].

Send each request with a Bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

For storage, keep keys in environment variables or a secret manager like AWS Secrets Manager or GCP Secret Manager. It also helps to use separate keys for development, staging, and production. If a key gets exposed, rotate it right away. Do the same on a set schedule. And when you log requests, save only the task_id - never the token itself ^[5].
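
As a minimal sketch of the key handling described above, the helper below reads the key from an environment variable and masks it before anything is logged. The variable name APIMART_API_KEY and both function names are assumptions made for illustration, not names defined by APIMart.

```python
import os


def auth_headers(env_var: str = "APIMART_API_KEY") -> dict:
    """Build request headers from a key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before sending requests")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def redact(headers: dict) -> dict:
    """Return a copy of the headers that is safe to log (token masked)."""
    return {
        name: ("Bearer ***" if name.lower() == "authorization" else value)
        for name, value in headers.items()
    }
```

Logging redact(headers) instead of the raw headers keeps the token out of log files even when a request is dumped for debugging.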

Finding Vidu models, pricing, and input schema in APIMart

Once you're signed in, check the catalog before you send anything. That's where you can confirm model names, supported inputs, and current pricing. In APIMart's catalog, Vidu models are listed under Video Generation. You can also find other high-performance models like MiniMax-Hailuo-02 in the same category. Use that page to compare input schema, resolution, and per-second cost across MoE, Q3 Pro, and Q3 Turbo ^[2]^[3].

The main fields to watch are:

model
prompt
duration
resolution
aspect_ratio

For text-to-video jobs, use aspect_ratio. For image-based jobs, the system uses the source image's ratio instead ^[2]. Text prompts are limited to 2,000 characters ^[2]^[3].

Endpoints, request structure, and async job handling

After you choose the model, submit the generation request and track the async job with the returned task_id. Send a POST request to https://api.apimart.ai/v1/videos/generations, then poll job status with GET https://api.apimart.ai/v1/tasks/{task_id} ^[2]^[5].

Jobs move through these states:

submitted
queueing
processing
success or failed

If you want APIMart to notify your app when the job is done, add callback_url and receive the result by webhook ^[5]. Once the job reaches success, download the file right away. From there, you can map the request fields to either a text-to-video flow or a reference-based flow.
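
The polling side of this flow could look roughly like the sketch below. It assumes the task response is JSON with a top-level status field holding one of the states listed above; the article does not show the response body, so check the real shape before relying on it. The function names, interval, and timeout are illustrative choices.

```python
import json
import time
import urllib.request

TASK_URL = "https://api.apimart.ai/v1/tasks/{task_id}"
TERMINAL_STATES = {"success", "failed"}


def fetch_task(task_id: str, api_key: str) -> dict:
    """GET the task record once and return the parsed JSON body."""
    request = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, api_key, fetch=fetch_task,
                  interval=5.0, timeout=900.0, sleep=time.sleep):
    """Poll until the task reaches success or failed, or the timeout passes.

    Assumes a top-level "status" field in the response (not confirmed
    by the article).
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id, api_key)
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(interval)
```

Passing fetch and sleep as parameters keeps the loop testable without network access or real waiting.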

Step-by-Step Integration for Text-to-Video and Reference-Based Video

Basic text-to-video flow with model selection

After you pick a model from the catalog, the text-to-video flow is pretty simple. Send your API key from the server side in the Authorization header as Bearer {your_api_key}.

Here’s a minimal payload for a text-to-video job with viduq3-pro:

{
  "model": "viduq3-pro",
  "prompt": "A red fox running through a snowy forest at dusk, cinematic slow motion",
  "duration": 8,
  "resolution": "720p",
  "aspect_ratio": "16:9",
  "audio": true
}

The response includes a task_id and a status like submitted, queueing, or processing. After that, you can either poll GET /v1/tasks/{task_id} with the returned task_id, or pass a callback_url in the request so the platform can notify your app when the job reaches success or failed ^[1]^[7]^[10]. If you want to switch to viduq3-turbo, you mostly just change the model field.
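
Here is one way the submission step might be sketched with Python's standard library. The endpoint and the Bearer header come from this guide; the function names are illustrative, and because the article does not show where task_id sits in the response body, the sketch simply returns the parsed JSON.

```python
import json
import urllib.request

GENERATIONS_URL = "https://api.apimart.ai/v1/videos/generations"


def build_generation_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        GENERATIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(payload: dict, api_key: str) -> dict:
    """Send the job and return the parsed response.

    Read task_id from the returned dict according to the live API docs;
    its exact location is not shown in this article.
    """
    request = build_generation_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Switching to viduq3-turbo means changing only the model value in the payload passed to submit.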

The async pattern stays the same across modes. What changes are the input fields.

Adding image or reference inputs and advanced controls

For image-to-video, pass one image URL in the image_urls array. Use 0 images for text-to-video, 1 for image-to-video, and 2 for first-last-frame mode ^[2]. In image-based modes, the output aspect ratio comes from the source image, so you can leave out aspect_ratio ^[2]. If you upload files directly instead of using URLs, keep each image in PNG, JPEG, or WebP format, under 50 MB, and keep the total HTTP body under 20 MB ^[9]^[8].
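
The image-count rule above is easy to encode as a guard before a request goes out. This is a small sketch; the function name and the mode labels are illustrative, not API values.

```python
def generation_mode(image_urls=None) -> str:
    """Map the number of entries in image_urls to the mode described above."""
    modes = {0: "text-to-video", 1: "image-to-video", 2: "first-last-frame"}
    count = len(image_urls or [])
    if count not in modes:
        raise ValueError(f"expected 0, 1, or 2 images, got {count}")
    return modes[count]
```

Calling this before submission catches a third image or a stray empty entry early, instead of waiting for the job to fail.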

For reference-based generation, use the /reference2video endpoint with a subjects array. Define each subject with a name and its images, then call it in the prompt with @subjectname. Q3 models allow up to 7 reference images or text descriptions in the subjects feature ^[6]. If you’re using first-last frame mode, keep the two images close in aspect ratio: dividing the first frame's aspect ratio by the last frame's should ideally give a value between 0.8 and 1.25, which reduces failures ^[8]. When faces or hands are involved, keep motion prompts subtle to cut down on distortion artifacts ^[5].
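
The first-last frame guideline can be checked locally before upload. The sketch below assumes the 0.8 to 1.25 range applies to the first image's aspect ratio divided by the last image's; that reading is an interpretation of the guideline, and the function name is illustrative.

```python
def frames_compatible(first_size, last_size, low=0.8, high=1.25) -> bool:
    """Check that two (width, height) frames have similar aspect ratios.

    Assumption: the range applies to first_ratio / last_ratio.
    """
    first_ratio = first_size[0] / first_size[1]
    last_ratio = last_size[0] / last_size[1]
    return low <= first_ratio / last_ratio <= high
```

For example, a 1920×1080 first frame and a 1280×720 last frame pass, while pairing a landscape frame with a portrait one does not.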

The table below shows the main parameters across both flows:

Parameter	Type	Valid Range / Options	Applies To
`model`	String	`viduq3-pro`, `viduq3-turbo`, `viduq3-mix` (MoE)	All
`prompt`	String	Max 2,000 characters	All (required for text-to-video; optional for image-to-video)
`duration`	Integer	1–16s	All
`resolution`	String	`540p`, `720p`, `1080p`	All
`aspect_ratio`	String	`16:9`, `9:16`, `4:3`, `3:4`, `1:1`	Text-to-video only
`audio`	Boolean	`true`, `false` (default `true` on Q3)	Q3 models
`seed`	Integer	`-1` to `4,294,967,295`	All
`off_peak`	Boolean	`true`, `false`	All
`callback_url`	String	Optional webhook URL for status updates	All

Set a fixed seed while testing if you want the same visual result across runs ^[2]^[9]. For batch jobs that aren’t urgent, set off_peak to true. Those tasks are usually completed within 48 hours and use fewer credits ^[1]^[6].
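
The ranges in the parameter table above can be checked client-side before a job is submitted. This is a sketch under the limits quoted in this guide; the function name is illustrative, and it only checks fields that are present, since the article does not say which fields are mandatory.

```python
MAX_PROMPT_CHARS = 2000
RESOLUTIONS = {"540p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1"}
MAX_SEED = 4_294_967_295


def validate_params(payload: dict) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    errors = []
    if len(payload.get("prompt", "")) > MAX_PROMPT_CHARS:
        errors.append("prompt exceeds 2,000 characters")
    duration = payload.get("duration")
    if duration is not None and not (isinstance(duration, int) and 1 <= duration <= 16):
        errors.append("duration must be an integer from 1 to 16")
    resolution = payload.get("resolution")
    if resolution is not None and resolution not in RESOLUTIONS:
        errors.append(f"unsupported resolution: {resolution}")
    aspect_ratio = payload.get("aspect_ratio")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        errors.append(f"unsupported aspect_ratio: {aspect_ratio}")
    seed = payload.get("seed")
    if seed is not None and not (isinstance(seed, int) and -1 <= seed <= MAX_SEED):
        errors.append("seed must be between -1 and 4,294,967,295")
    return errors
```

Returning every problem at once, rather than raising on the first, makes it easier to fix a batch of payloads in one pass.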

Tracking usage, cost, and production reliability

Once your request is working, the next job is keeping cost and reliability under control in production.

Log the task_id and timestamp for every request. That gives you a safe way to debug without storing sensitive credentials ^[5]. It also helps to track queue time and generation time separately, so you can tell the difference between platform delay and model latency.

For cost estimation, Vidu Q3 Pro at 720p costs about $0.12 per second on APIMart, and Q3 Turbo costs about $0.048 per second ^[3]. Set automated alerts at 50%, 80%, and 100% of your monthly budget cap so spending doesn’t get away from you ^[5].
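
A quick way to turn those per-second prices into a budget check is sketched below. The prices are the 720p figures quoted in this guide and may change; the function names are illustrative.

```python
# 720p prices per second as quoted in this guide (USD); confirm in the catalog.
PRICE_PER_SECOND_720P = {"viduq3-pro": 0.12, "viduq3-turbo": 0.048}


def estimate_cost(model: str, duration_seconds: int, clips: int = 1) -> float:
    """Estimate spend for a number of clips of equal length."""
    return round(PRICE_PER_SECOND_720P[model] * duration_seconds * clips, 4)


def crossed_alerts(spent: float, budget: float, thresholds=(0.5, 0.8, 1.0)) -> list:
    """Return the budget thresholds that current spend has reached."""
    return [level for level in thresholds if spent >= budget * level]
```

An 8-second clip works out to roughly $0.96 on Q3 Pro and $0.38 on Q3 Turbo at these rates, which is why drafting on Turbo adds up quickly across a batch.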

Retries matter too. On 5xx errors, use exponential backoff: retry at 2 seconds, then 5 seconds, then 15 seconds before showing an error to the user ^[5]. Vidu Q3 series models come with a 99.9% SLA for production workloads ^[3], but short-lived failures still happen, so retries should be part of any shipping build.
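
The retry schedule above might be sketched like this. It assumes a caller-supplied send function that returns a (status_code, body) pair; that signature and the function name are choices made for this example, not part of any APIMart client.

```python
import time

RETRY_DELAYS = (2, 5, 15)  # seconds, the schedule quoted above


def call_with_retries(send, delays=RETRY_DELAYS, sleep=time.sleep):
    """Call send() and retry on 5xx responses using the given delays.

    send is a hypothetical callable returning (status_code, body).
    """
    status, body = send()
    for delay in delays:
        if status < 500:
            break
        sleep(delay)
        status, body = send()
    if status >= 500:
        raise RuntimeError(f"gave up after {len(delays) + 1} attempts (last status {status})")
    return status, body
```

Responses below 500 are returned immediately, so client errors such as a bad payload are not retried.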

Model Selection Checklist and Key Takeaways

Use-case checklist for developers, creators, and product teams

Pick based on three things: prompt complexity, speed, and output quality. The table below turns the model comparison into a practical shipping choice.

Scenario	Best Model	Why
Multi-scene ads, storyboards, complex prompts	Vidu MoE (`viduq3-mix`)	Best for instruction-heavy prompts and smart scene transitions
Final brand promos, polished product visuals	Vidu Q3 Pro (`viduq3-pro`)	High-fidelity, cinematic 1080p output; ~$0.12/sec at 720p ^[3]
Rapid prototyping, drafts, and short-form clips	Vidu Q3 Turbo (`viduq3-turbo`)	Best for fast, high-volume iteration; ~$0.048/sec at 720p ^[3]
Character consistency across references	Vidu Q3 Pro (`viduq3-pro`)	Supports up to 7 reference images and requires image input ^[6]^[8]

Once you’ve picked a row, keep the same request schema from the integration section. In plain English: start ideas in Q3 Turbo, then move the final 1080p render to Q3 Pro. It’s a simple workflow, and it helps you move fast without spending more than you need to.
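
The draft-then-final workflow can be reduced to a small payload transform. This sketch assumes the draft was sent with the request fields shown earlier in this guide; the function name is illustrative.

```python
def promote_to_final(draft_payload: dict) -> dict:
    """Copy a draft payload, switching it to Q3 Pro at 1080p."""
    return {**draft_payload, "model": "viduq3-pro", "resolution": "1080p"}
```

Keeping the same prompt, and the same seed if one was set, in the copied payload helps the final render stay close to the approved draft, although output across different models is not guaranteed to match.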

For clips where motion fidelity matters most, aim for 5–10 seconds instead of stretching to the 16-second maximum. Shorter clips often give you tighter motion and fewer headaches.

Key points to remember before shipping

MoE is the pick for complex, multi-scene logic. Q3 Pro gives you high-fidelity, cinematic 1080p output ^[3]. Q3 Turbo is the lower-cost option, at about $0.048/sec for 720p ^[3].

On APIMart, switching between these models is just a single model parameter change. Everything else in the request stays the same ^[3]. That means you can test one model, swap to another, and keep your integration work steady.

Use the same async flow each time:

Submit the request
Capture task_id
Poll for status or use callback_url

Also, download generated videos soon after they’re ready. Output links expire after 24 hours ^[3]^[11].
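
A download step can be as small as the sketch below. It assumes the 24-hour window starts when the task completes, which the article does not spell out, and the function names are illustrative.

```python
import shutil
import urllib.request
from datetime import datetime, timedelta, timezone

LINK_LIFETIME = timedelta(hours=24)  # expiry window quoted in this guide


def link_deadline(completed_at: datetime) -> datetime:
    """Latest time to fetch the file, assuming the clock starts at completion."""
    return completed_at + LINK_LIFETIME


def download_video(url: str, dest_path: str) -> None:
    """Stream the finished video to disk without loading it all into memory."""
    with urllib.request.urlopen(url, timeout=60) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Running download_video as soon as the task reports success, and copying the file to your own storage, avoids depending on the temporary link at all.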

FAQs

Which Vidu model should I start with?

Start with the model that fits your needs for speed, audio, and visual control.

viduq3-pro: best for audio-visual sync and shot segmentation
viduq3-turbo: faster generation than the pro version
viduq1 or viduq2: solid picks for stable video production and reliable camera movement

How do I track a video job after submitting it?

You can track your video generation task in two ways.

For production use, the best option is to include a callback_url in your initial request. When you do that, the Vidu API sends task updates and result metadata straight to your URL automatically. That means you don't need to keep checking the task status yourself.

The other option is to poll the status query API with the task_id you get after submission. Once the task state changes to success, the response will include the video download URL and other related metadata.
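
For the callback route, a receiver might look like the sketch below, built on Python's standard http.server. The callback body format is not shown in this article, so the task_id and status field names are assumptions to verify against a real callback; the class and function names are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler

TERMINAL_STATES = {"success", "failed"}


def summarize_callback(body: bytes) -> dict:
    """Extract the fields this sketch cares about from a callback body.

    Assumes JSON with "task_id" and "status" keys (not confirmed here).
    """
    event = json.loads(body)
    status = event.get("status")
    return {
        "task_id": event.get("task_id"),
        "status": status,
        "done": status in TERMINAL_STATES,
    }


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        summary = summarize_callback(self.rfile.read(length))
        print(summary)  # replace with a queue or database write in production
        self.send_response(200)
        self.end_headers()

# To run locally (port 8000 is an arbitrary choice):
#   from http.server import HTTPServer
#   HTTPServer(("0.0.0.0", 8000), CallbackHandler).serve_forever()
```

Answering with a 200 quickly and doing the download in a background job keeps the webhook from timing out on large files.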

What inputs and limits should I know before integrating?

Before you integrate the Vidu API, make sure your inputs stay within these limits:

Images: PNG, JPEG, JPG, or WebP only; each file must be under 50 MB and at least 128×128 pixels
Total HTTP request body: 20 MB max
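Text prompts: up to 5,000 characters is the figure quoted here, but the integration section above cites a 2,000-character limit on APIMart ^[2]^[3]; treat 2,000 as the safe ceiling until you confirm the limit for your endpoint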
Payload passthrough data: up to 1,048,576 characters

Duration limits depend on the model you use. Q3 supports 1–16 seconds, Q2 supports 1–10 seconds, and Q1 supports 5 seconds.
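
Those per-family duration limits can be captured in a lookup like the sketch below. Matching on the model id prefix is an assumption made for this example, based on the ids used in this guide (viduq1, viduq2, viduq3-pro, viduq3-turbo).

```python
# (min_seconds, max_seconds) per model family, from the limits listed above.
DURATION_LIMITS = {"viduq3": (1, 16), "viduq2": (1, 10), "viduq1": (5, 5)}


def duration_allowed(model: str, seconds: int) -> bool:
    """Check a requested duration against the family limit for the model id."""
    for prefix, (low, high) in DURATION_LIMITS.items():
        if model.startswith(prefix):
            return low <= seconds <= high
    raise ValueError(f"unknown model family: {model}")
```

A check like this is most useful when one service routes jobs to several model generations and a duration that is valid for Q3 would be rejected by Q1 or Q2.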

Also, keep your API keys secure. Don’t expose them in client-side code. Send requests through a server-side intermediary instead.