fal is a serverless Python runtime that lets you scale your code in the cloud with no infra management.

With fal, you can:

  • Serve ML models such as Stable Diffusion and Llama on fast GPUs like NVIDIA A100s and T4s with minimal cold start times.
  • Automatically scale up to hundreds of GPUs and back down to zero when idle.
  • Pay by the second only when your code is running.

Let's discover fal in less than 2 minutes.

Install fal

Get started by installing fal:

$ pip install fal

Log in using the auth command:

$ fal auth login

Create your first serverless fal function

You can start using fal on any Python project by importing fal and wrapping existing functions with the @fal.function decorator. All serverless functions behave like regular functions, with the caveat that they run on fal's cloud runtime.

import fal
import math

# Wrap the function so it runs on fal; the "virtualenv" environment kind is an assumption here.
@fal.function("virtualenv")
def calculate_sqrt(n: int) -> float:  # math.sqrt returns a float
    return math.sqrt(n)

root = calculate_sqrt(9)
print("Square root of 9 is:", root)
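
To make the "behaves like a regular function" idea concrete, here is a minimal local sketch that does not use fal at all. It assumes a hypothetical decorator, runs_elsewhere, that hands each call to a worker thread and waits for the result. The worker thread only stands in for a remote runtime; this is not how fal is implemented, just an illustration of why the call site looks unchanged.

```python
import functools
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a remote runtime: a single worker thread.
_executor = ThreadPoolExecutor(max_workers=1)


def runs_elsewhere(func):
    """Illustrative decorator (not part of fal): run `func` on the worker
    and block until its result is available."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        future = _executor.submit(func, *args, **kwargs)
        return future.result()  # re-raises any exception raised on the worker

    return wrapper


@runs_elsewhere
def calculate_sqrt(n: int) -> float:
    return math.sqrt(n)


# The caller uses the function exactly as if it ran in place.
root = calculate_sqrt(9)
print("Square root of 9 is:", root)
```

The caller neither knows nor cares where the body executed. Arguments go in, a return value (or an exception) comes back, and functools.wraps keeps the function's name and docstring intact.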

Run your script

Run your script with python, pointing it at the file where you saved your fal function.


This should print out the square root of 9, which is (spoiler alert) 3. Congratulations! You have successfully run a function on fal!

Ready for more?

Start reading more about fal serverless functions or check out our examples.

2023 © Features and Labels Inc.