fal is a serverless Python runtime that lets you scale your code in the cloud with no infra management.
With fal, you can:
- Serve ML models such as Stable Diffusion and Llama on fast GPUs like NVIDIA A100s and T4s with minimal cold-start times.
- Automatically scale up to hundreds of GPUs and back down to 0 GPUs when idle.
- Pay by the second only when your code is running.
Let's discover fal in less than 2 minutes.
Get started by installing fal:
$ pip install fal
Log in using the CLI:
$ fal auth login
You can start using fal on any Python project by just importing fal and wrapping existing functions with the @fal.function decorator. All serverless functions behave like regular functions, with the caveat that they run on fal's cloud runtime.
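import fal
import math

# Wrap an ordinary function so that it runs on fal's cloud runtime.
@fal.function()
def calculate_sqrt(n: int) -> float:
    # math.sqrt always returns a float, so the return type is float.
    return math.sqrt(n)

# Call it like a regular function; the work happens remotely.
root = calculate_sqrt(9)
print("Square root of 9 is:", root)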
Run your script with Python as you normally would.
This should print out the square root of 9, which is (spoiler alert) 3. Congratulations! You have successfully run a function on fal.
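To make the "behaves like a regular function" idea concrete, here is a minimal local sketch of the decorator pattern. It is not fal's implementation: `remote_function` is a hypothetical stand-in that runs the wrapped function in-process, so the example works without a fal account. It only illustrates that a wrapped function keeps its name and call signature and can be used anywhere a normal function can.

```python
import functools
import math


def remote_function(func):
    """Hypothetical stand-in for a decorator such as @fal.function().

    It runs the wrapped function locally; a real serverless decorator
    would send the call to a remote runtime instead.
    """

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        # Assumption: a real runtime would serialize the call, execute it
        # remotely, and return the result. Here we just call it in-process.
        return func(*args, **kwargs)

    return wrapper


@remote_function
def calculate_sqrt(n: int) -> float:
    return math.sqrt(n)


# The wrapped function is called exactly like a plain Python function.
roots = [calculate_sqrt(n) for n in (4, 9, 16)]
print(roots)  # [2.0, 3.0, 4.0]
```

Because the call site does not change, swapping the stand-in for the real decorator should, in principle, leave the rest of the script untouched.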
fal is currently in beta, and we are letting people in slowly to ensure good performance. When you run the login command, you will get an error asking you to reach out to email@example.com. Send us an email describing how you plan to use fal, and we will get you access as soon as possible.