Jneid Jneid

@jjneid94

Published on Jan 8, 2025

Deploying Qwen2.5 on AWS

Deploy Qwen 2.5 to AWS in 10 Seconds

Today I want to share a quick guide on deploying Qwen 2.5 to AWS SageMaker. While traditional deployment might take hours of configuration, I'll show you how to get it running in seconds using Magemaker.

Prerequisites

Before we start, you'll need:

  1. Python 3.11+

  2. Magemaker

  3. AWS account with SageMaker access

  4. AWS credentials configured

Setting Up AWS Credentials

First, let's set up your AWS credentials properly, using this guide.
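Once the credentials are in place, it's worth confirming that they actually resolve to the account you expect before deploying anything. Here's a minimal sketch using boto3's STS client. The helper name `describe_identity` is my own, and it assumes boto3 is installed (`pip install boto3`):

```python
def describe_identity(sts_client):
    """Return the account ID and ARN that the given credentials resolve to."""
    # get_caller_identity needs no special IAM permissions, so it makes
    # a cheap smoke test for "are my credentials picked up at all?"
    identity = sts_client.get_caller_identity()
    return {"account": identity["Account"], "arn": identity["Arn"]}
```

To use it, pass in a real client: `describe_identity(boto3.client("sts"))`. If this raises a credentials error, fix that first, since the deployment will fail for the same reason.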

Deployment Steps

Here's the entire process:

# Install the tool
pip install magemaker

# Start deployment
magemaker --cloud aws

When prompted:

  1. Choose "Deploy a model endpoint"

  2. Select "Deploy a Hugging Face model"

  3. Enter the model name: "Qwen/Qwen2.5-1.5B-Instruct"

  4. Select instance type: ml.m5.2xlarge

That's it. The deployment will start automatically and you'll see a progress bar showing the deployment status.

What's Happening Behind the Scenes

During the wait time, several things are happening:

  • Creating a SageMaker model with a PyTorch environment

  • Setting up an endpoint configuration

  • Launching the endpoint with the specified instance type

  • Configuring the model for text generation
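If you're curious what those steps look like at the API level, here's a rough equivalent in boto3. To be clear, this is not Magemaker's actual code, just a sketch of the three SageMaker calls that stand behind any endpoint. The function names are mine, the image URI and role ARN are values you would have to supply, and I'm assuming a Hugging Face inference container that reads the `HF_MODEL_ID` and `HF_TASK` environment variables:

```python
def build_deployment_requests(name, model_id, instance_type, role_arn, image_uri):
    """Build keyword arguments for the three SageMaker calls behind an endpoint."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            # Assumption: the container loads the model named here from the Hub
            "Environment": {"HF_MODEL_ID": model_id, "HF_TASK": "text-generation"},
        },
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def deploy(sm_client, requests):
    """Issue the three calls in order using a boto3 'sagemaker' client."""
    model, endpoint_config, endpoint = requests
    sm_client.create_model(**model)
    sm_client.create_endpoint_config(**endpoint_config)
    sm_client.create_endpoint(**endpoint)
```

Separating the request building from the API calls makes it easy to inspect exactly what would be sent before anything is created (and billed).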

Verifying the Deployment

Once deployed, you can verify your endpoint from Magemaker's drop-down menu or in the AWS console:

  1. Go to Amazon SageMaker

  2. Click "Endpoints" in the left sidebar

  3. You should see your Qwen endpoint listed as "InService"
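If you'd rather check from a script than click through the console, the same status is available through `describe_endpoint`. Here's a small polling sketch. The helper name and the polling defaults are my own choices, and `sm_client` is assumed to be a boto3 `sagemaker` client:

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll the endpoint until it is InService, has Failed, or polls run out.

    Returns True only if the endpoint reached InService.
    """
    for _ in range(max_polls):
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return True
        if status == "Failed":
            return False
        time.sleep(poll_seconds)
    return False
```

boto3 also ships a built-in waiter for this (`sm_client.get_waiter("endpoint_in_service")`), but the explicit loop makes it easier to log progress or bail out early.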

Using Your Deployed Model

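You can now query your model through the endpoint you created. Here's a simple curl example (requests to SageMaker must be signed with AWS Signature Version 4, so the Authorization header is abbreviated here):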

curl -X POST \
  https://runtime.sagemaker.[REGION].amazonaws.com/endpoints/[ENDPOINT_NAME]/invocations \
  -H "Content-Type: application/json" \
  -H "Authorization: AWS4-HMAC-SHA256 ..." \
  -d '{"inputs": "Tell me a joke about programming"}'
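From Python, boto3 takes care of the request signing for you, which is usually more convenient than building the Authorization header by hand. Here's a minimal sketch of the same request. The helper name `generate` is mine, `runtime_client` is assumed to be a boto3 `sagemaker-runtime` client, and the exact shape of the JSON that comes back depends on the serving container:

```python
import json


def generate(runtime_client, endpoint_name, prompt):
    """Send a prompt to the endpoint and return the decoded JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}).encode("utf-8"),
    )
    # The response body is a stream, so read it fully before decoding
    return json.loads(response["Body"].read())
```

Call it with `generate(boto3.client("sagemaker-runtime"), "<your endpoint name>", "Tell me a joke about programming")`. Hugging Face text-generation containers typically return a list like `[{"generated_text": "..."}]`, but print the raw result first rather than relying on that.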

Cost Considerations

Remember that SageMaker bills for every hour an endpoint is running, whether or not it receives any traffic. The ml.m5.2xlarge instance costs around $0.50 per hour (exact pricing varies by region). Make sure to delete endpoints when not in use:

magemaker --cloud aws
# Select "Delete a model endpoint"
# Choose your endpoint from the list
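To put the hourly rate in perspective, here's a quick back-of-the-envelope calculation. It uses the approximate $0.50 figure from above, so treat the results as rough estimates rather than a quote. The helper names are my own:

```python
from datetime import datetime, timezone


def estimate_cost(hourly_rate_usd, hours):
    """Estimated charge in USD for running one instance for the given hours."""
    return round(hourly_rate_usd * hours, 2)


def hours_running(creation_time, now=None):
    """Hours elapsed since a timezone-aware endpoint creation time."""
    now = now or datetime.now(timezone.utc)
    return (now - creation_time).total_seconds() / 3600


print(estimate_cost(0.50, 8))        # one working day of testing
print(estimate_cost(0.50, 24 * 30))  # forgotten for 30 days
```

That works out to about $4 for a day of testing versus about $360 if the endpoint is left running for a month, which is why deleting it matters.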

Common Issues

If you encounter any issues:

  • Ensure your AWS credentials are correctly configured

  • Check that you have sufficient quota for the instance type

  • Verify your IAM user has the necessary permissions
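The quota check is the one people most often skip, so here's a sketch of how to look it up with the Service Quotas API instead of hunting through the console. The helper names are mine, `quotas_client` is assumed to be a boto3 `service-quotas` client, and I'm assuming the quota is named in the form "ml.m5.2xlarge for endpoint usage", which is worth confirming against your own account:

```python
def list_sagemaker_quotas(quotas_client):
    """Collect all SageMaker quotas using the list_service_quotas paginator."""
    paginator = quotas_client.get_paginator("list_service_quotas")
    quotas = []
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])
    return quotas


def find_endpoint_quota(quotas, instance_type):
    """Return the quota value for endpoint usage of an instance type, or None."""
    # Assumed naming convention for SageMaker endpoint quotas
    wanted = f"{instance_type} for endpoint usage"
    for quota in quotas:
        if quota["QuotaName"] == wanted:
            return quota["Value"]
    return None
```

A value of 0 means you need to request a quota increase before the deployment can succeed. A result of `None` means no quota with that name was found, in which case check the quota names in the console.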

Next Steps

Now that you have Qwen 2.5 running on AWS, you might want to:

  • Test different prompt templates

  • Monitor the endpoint's performance

  • Set up auto-scaling for production use

  • Try deploying with different instance types
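As a starting point for the auto-scaling item, here's a sketch of the two Application Auto Scaling requests involved: registering the endpoint variant as a scalable target, then attaching a target-tracking policy. The function name, the capacity limits, and the target of 70 invocations per instance are illustrative values of mine, and the variant name "AllTraffic" is an assumption you should check against your endpoint configuration:

```python
def build_autoscaling_requests(endpoint_name, variant_name="AllTraffic",
                               min_instances=1, max_instances=2,
                               invocations_per_instance=70.0):
    """Build kwargs for register_scalable_target and put_scaling_policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    policy = {
        "PolicyName": f"{endpoint_name}-invocations",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Scale so each instance handles roughly this many invocations per minute
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return target, policy
```

With a boto3 `application-autoscaling` client, you would then call `client.register_scalable_target(**target)` followed by `client.put_scaling_policy(**policy)`. Keep in mind that a higher `max_instances` also raises the ceiling on your bill.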

Remember to watch your AWS costs and shut down endpoints when you're done testing!

Happy deploying! 🚀

©2024 – Made with ❤️ & ☕️ in Montreal