Production Ready Applications

A function is the simplest unit of code that can be deployed to the fal runtime. It is best suited to APIs that are simple enough to require no state and can be fully defined by a single endpoint.
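For instance, a standalone function might look like the following. This is a minimal sketch, assuming the @fal.function decorator accepts the same machine_type and requirements options used by the application class further below; the model module and requirement name are placeholders:

import fal
 
# A hypothetical single-purpose function; the decorator arguments mirror
# the class attributes used by fal.App below.
@fal.function(
    machine_type="GPU-A100",
    requirements=["my_requirements"],
)
def complete_chat(prompt: str) -> str:
    # Import heavyweight dependencies inside the function so they are
    # resolved in the isolated environment, not on the client.
    from chat_completion_model import ChatCompletionModel
 
    return ChatCompletionModel()(prompt)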

Applications, in contrast to functions, are more complex, stateful interfaces. They give developers greater control over the application lifecycle by allowing them to override the setup() and teardown() methods. Developers can also use the provide_hints() method to customize routing, enabling multiple endpoints to be served from the same machine.

Here's an example of a simple application:

import fal
from pydantic import BaseModel, Field
 
class Input(BaseModel):
    prompt: str = Field()
 
class Output(BaseModel):
    output: str = Field()
 
 
class FalMultipleEndpoints(fal.App):
    machine_type = "GPU-A100"
    requirements = ["my_requirements"]
 
    def setup(self) -> None:
        # Load the models once, when the application starts.
        from chat_completion_model import ChatCompletionModel
        from sentiment_analysis import SentimentAnalyzer
 
        self.model = ChatCompletionModel()
        self.sentiment_analyzer = SentimentAnalyzer()
 
    @fal.endpoint("/complete-chat")
    def complete_chat(self, input: Input) -> Output:
        return Output(output=self.model(input.prompt))
 
    @fal.endpoint("/analyze-sentiments")
    def analyze_sentiments(self, input: Input) -> Output:
        return Output(output=self.sentiment_analyzer(input.prompt))

In this code:

  • FalMultipleEndpoints is a class that inherits from fal.App. This structure allows the creation of a complex application with multiple endpoints, which are defined using the @fal.endpoint decorator.

  • machine_type is a class attribute that specifies the type of machine on which this application will run. Here, "GPU-A100" is specified.

  • requirements is another class attribute that lists the dependencies needed for the application to run. In this case, "my_requirements" is a placeholder for actual dependencies.

  • The setup() method is overridden to initialize the models used in the application. This method is executed once when the application is started. In this example, ChatCompletionModel and SentimentAnalyzer are imported and instantiated.

  • The @fal.endpoint decorator is used to define the routes or endpoints of the application. In this example, two endpoints are defined: "/complete-chat" and "/analyze-sentiments".
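The example above does not override teardown(). If your models hold resources that should be freed when the application shuts down, a cleanup hook could be added to the class. This is a hedged sketch; release() is a hypothetical method on the model objects, not part of the fal API:

    def teardown(self) -> None:
        # Called when the application is shutting down; release any
        # resources the models may still be holding.
        self.model.release()
        self.sentiment_analyzer.release()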

Deploying your application

Once your application is prepared for deployment, you can use the fal CLI to deploy it:

fal deploy my_app.py::FalMultipleEndpoints --app-name my_app

In this command, we instruct fal to deploy the FalMultipleEndpoints class from my_app.py as an API application. We also assign the name my_app to this application for easier identification and management.

Upon successful deployment, fal will provide a URL, for example, https://fal.run/777/my_app. This URL is the public access point to your deployed application, allowing you to interact with the API endpoints defined within your FalMultipleEndpoints class.
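Once deployed, the endpoints can be called over HTTP. Below is a minimal sketch using Python's requests library, assuming the example URL above and that requests are authenticated with a fal API key stored in the FAL_KEY environment variable and passed as an Authorization header:

import os
import requests
 
# Example URL from the deployment step; replace with the URL fal prints for you.
BASE_URL = "https://fal.run/777/my_app"
 
response = requests.post(
    f"{BASE_URL}/complete-chat",
    headers={"Authorization": f"Key {os.environ['FAL_KEY']}"},
    json={"prompt": "Hello, how are you?"},
)
print(response.json()["output"])

The "/analyze-sentiments" endpoint can be called the same way, since both endpoints accept the same Input model.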

Testing fal deployment

During the development process, it's often necessary to perform test deployments. The fal run command is designed for this purpose:

fal run my_app.py::FalMultipleEndpoints

Executing this command initiates an HTTP server hosting your application on fal infrastructure. A URL is then provided for you to conduct your tests. This approach enables you to validate your applications on the intended hardware, ensuring optimal performance and compatibility.

