AI Inference faster than you can type
Build real-time AI applications with lightning-fast inference (under ~120 ms).
No cold starts. Pay only for what you use.
Model Gallery
Check out some of our ready-to-use models. Each has a simple API endpoint, ready for you to build your own AI-powered applications.
Stable Diffusion with LoRAs
Run Stable Diffusion with customizable LoRA weights.
text-to-image
inference
stylized
Stable Diffusion XL
Run SDXL at the speed of light.
text-to-image
inference
Whisper
Whisper is a model for speech transcription and translation.
speech-to-text
inference
speech
Latent Consistency (SDXL & SDv1.5)
Produce high-quality images with minimal inference steps.
text-to-image
inference
Illusion Diffusion
Create illusions conditioned on an input image.
text-to-image
inference
stylized
Upscale Images
Upscale images by a given factor.
image-to-image
inference
utility
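As a sketch of how a model endpoint like the ones above might be called over HTTP: the URL, payload field names, and response shape below are hypothetical placeholders, not the documented API.

```python
# Hypothetical sketch of calling a text-to-image endpoint.
# The URL and payload field names are placeholders, not the real API.
import json
import urllib.request

ENDPOINT = "https://example.com/models/stable-diffusion-xl"  # placeholder URL


def build_payload(prompt: str, steps: int = 25) -> dict:
    # Field names are assumptions for illustration only.
    return {"prompt": prompt, "num_inference_steps": steps}


def generate_image(prompt: str) -> bytes:
    """POST a JSON request and return the raw response bytes."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()  # assumed to be raw image bytes


# image = generate_image("a watercolor fox")  # requires a live endpoint
```

A real client would also handle authentication and error responses; consult the endpoint's documentation for the actual request schema.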
Pricing
Ship custom model endpoints with fine-grained control over idle timeout, max concurrency, and autoscaling. Example configuration:
- CPU: 10
- Memory: 64 GB
- GPU: A100 (40 GB VRAM)
- Total price: $0.00111/s
Unit Price
- CPU: $0.00003/s
- Memory: $0.000004/s
- GPU A100: $0.001/s
- GPU A10G: $0.0002/s
- GPU T4: $0.00009/s
- Storage: $1/GB/month
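Because billing is per second, the cost of a single run is just the unit price times its duration. A minimal sketch, assuming each listed price applies per second of use (billing granularity, e.g. per core or per GB, is not specified here):

```python
# Per-second unit prices in USD, taken from the table above.
# (Storage is billed monthly and is not included.)
UNIT_PRICE_PER_S = {
    "gpu_a100": 0.001,
    "gpu_a10g": 0.0002,
    "gpu_t4": 0.00009,
    "cpu": 0.00003,
    "memory": 0.000004,
}


def run_cost(resource: str, seconds: float) -> float:
    """USD cost of holding one unit of `resource` for `seconds`."""
    return UNIT_PRICE_PER_S[resource] * seconds


# A 5-second generation on an A100 costs 5 * $0.001 = $0.005.
print(run_cost("gpu_a100", 5.0))
```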
Join our community
Join the discussion around our product and help shape the future of AI.