Deploying Agents to Google Cloud

Jneid Jneid

@jjneid94

Published on Jan 23, 2025

This guide explains how to deploy to Google Cloud Run an AutoGen agent that uses OpenAI's GPT models to generate and execute code.

The agent takes a task description, such as:

Analyze American Airlines (AAL) stock, include last 2 years use scikit learn

It then generates code to perform the task, executes that code, and analyzes the results.

The full code implementation on GitHub is linked at the end of this page.

Architecture Overview

Our setup consists of:

FastAPI application wrapping the AutoGen agent
Two endpoints: /process for task execution and /code/{filename} for accessing generated code
Google Cloud Run for serverless deployment

Prerequisites

Google Cloud account
gcloud CLI installed
OpenAI API key
Python dependencies: autogen, fastapi, uvicorn

Implementation Steps

1. FastAPI Application Structure

Our FastAPI app exposes two endpoints:

POST /process: Takes a task description and returns execution results
GET /code/{filename}: Retrieves generated code files

2. Google Cloud Deployment

First, create these files in your project directory:

asgi.py:
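
A minimal sketch of what such an entry point might contain, assuming the FastAPI instance is named `app` and lives in a module called `main` (both names are hypothetical). Cloud Run passes the port to listen on through the `PORT` environment variable, so the sketch reads it instead of hard-coding one.

```python
import os

# Assumption: the FastAPI instance is named `app` and lives in main.py.
APP_IMPORT_PATH = "main:app"


def get_port(default: int = 8080) -> int:
    """Return the port Cloud Run asks the container to listen on."""
    return int(os.environ.get("PORT", default))


def serve() -> None:
    """Start uvicorn on all interfaces so Cloud Run can reach the app."""
    import uvicorn  # imported lazily so get_port() has no third-party dependency

    uvicorn.run(APP_IMPORT_PATH, host="0.0.0.0", port=get_port())
```

Calling `serve()` starts the server; the same import path can also be passed directly to the `uvicorn` command line.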

Procfile:

After creating the files, initialize your Google Cloud project:

Deploy to Cloud Run:

Set OpenAI API key as environment variable:
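
The three steps above can be sketched as a small Python helper that assembles the corresponding `gcloud` commands. This is an illustration under stated assumptions, not the author's exact commands: the project ID, service name, and region are hypothetical placeholders, `OPENAI_API_KEY` is assumed to be the variable name the app reads, and `gcloud config set project` stands in for project initialization.

```python
import os
import subprocess

# Hypothetical values; substitute your own project, service name, and region.
PROJECT_ID = "my-project"
SERVICE = "autogen-agent"
REGION = "us-central1"


def set_project_cmd(project_id):
    # Point the gcloud CLI at the project that will host the service.
    return ["gcloud", "config", "set", "project", project_id]


def deploy_cmd(service, region):
    # --source builds a container from the current directory and deploys it.
    return ["gcloud", "run", "deploy", service, "--source", ".", "--region", region]


def set_api_key_cmd(service, region, api_key):
    # Assumption: the app reads the key from OPENAI_API_KEY.
    return [
        "gcloud", "run", "services", "update", service,
        "--region", region,
        "--update-env-vars", f"OPENAI_API_KEY={api_key}",
    ]


def deploy_all():
    """Run the three commands in order, stopping at the first failure."""
    api_key = os.environ["OPENAI_API_KEY"]  # read locally, never hard-coded
    commands = [
        set_project_cmd(PROJECT_ID),
        deploy_cmd(SERVICE, REGION),
        set_api_key_cmd(SERVICE, REGION, api_key),
    ]
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `deploy_all()` executes the commands; each list can also be joined with spaces and run by hand in a terminal.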

3. Configuring Public Access

To make your API endpoints publicly accessible:
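
One way to do this is to grant the `roles/run.invoker` role to `allUsers` on the service. The sketch below only builds the `gcloud` command; the service name and region are hypothetical placeholders, and the author's original command may have differed.

```python
import subprocess


def allow_public_cmd(service, region):
    # Grant the Cloud Run invoker role to everyone (unauthenticated access).
    return [
        "gcloud", "run", "services", "add-iam-policy-binding", service,
        "--region", region,
        "--member=allUsers",
        "--role=roles/run.invoker",
    ]


def allow_public(service, region):
    """Apply the binding; raises CalledProcessError if gcloud fails."""
    subprocess.run(allow_public_cmd(service, region), check=True)
```

For example, `allow_public("autogen-agent", "us-central1")` would open the hypothetical service to unauthenticated requests.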

Security Note: These settings allow public access. For production environments, implement proper authentication and authorization.

4. Testing the Deployment

Test your deployed endpoint:
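
A minimal client sketch using only the Python standard library. The base URL is whatever Cloud Run prints after deployment, and the `task` JSON field is an assumption about the request schema; check the repository for the actual field names.

```python
import json
import urllib.request


def build_process_request(base_url, task):
    """Build a POST request for the /process endpoint."""
    payload = json.dumps({"task": task}).encode("utf-8")  # assumed schema
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_process(base_url, task, timeout=300):
    """Send the task and return the decoded JSON response."""
    request = build_process_request(base_url, task)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The generous timeout reflects that the agent has to generate and run code before it can respond; adjust it to the tasks you send.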

Important Considerations

Cost Management: Cloud Run charges based on request count and compute time. Monitor usage to optimize costs.
Security: Implement authentication for production deployments. The setup described here allows public access.
Scaling: Cloud Run automatically scales based on traffic. Set appropriate resource limits.
Environment Variables: Securely manage API keys and other sensitive data using Cloud Run's environment variables.

The full code implementation is available at https://github.com/JJneid/fastapi_coder

Next steps

Add streaming so that the user can see the progress of the task as it executes.

Happy building 🚀

If you have any questions, please do not hesitate to ask faizan|jneid@slashml.com.
