For this tutorial, we are using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
This step-by-step guide covers both interactive and YAML-based deployment options.
We will be using Magemaker, a Python tool that simplifies deploying open-source AI models to cloud providers like AWS, GCP, and Azure.
Step 1: GCP Setup
1. Create a Google Cloud account if you haven't already
2. Install the gcloud CLI (part of the Google Cloud SDK) for your operating system
3. Enable Vertex AI API in your project:
Go to Google Cloud Console
Search for "Vertex AI API"
Click "Enable"
For a more detailed step-by-step configuration guide, use this.
Step 2: Authentication
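Before deploying, it helps to confirm that Application Default Credentials (ADC) exist on your machine, since Google client libraries look for them. The usual way to create them is `gcloud auth application-default login`. The check below is a minimal sketch: the helper names `adc_file_path` and `has_adc` are ours, and it ignores less common setups such as a custom `CLOUDSDK_CONFIG` directory.

```python
import os
from pathlib import Path


def adc_file_path(env=None, home=None):
    """Return the path where Application Default Credentials are expected."""
    env = os.environ if env is None else env
    # An explicit key file takes precedence over the gcloud default location.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)
    # Windows keeps the gcloud config under %APPDATA%; other systems use ~/.config.
    if os.name == "nt" and env.get("APPDATA"):
        return Path(env["APPDATA"]) / "gcloud" / "application_default_credentials.json"
    home = Path.home() if home is None else Path(home)
    return home / ".config" / "gcloud" / "application_default_credentials.json"


def has_adc(env=None, home=None):
    """True if a credentials file exists at the expected location."""
    return adc_file_path(env, home).is_file()


if __name__ == "__main__":
    path = adc_file_path()
    if has_adc():
        print(f"Found credentials at {path}")
    else:
        print(f"No credentials at {path}; run: gcloud auth application-default login")
```

If the script reports no credentials, run the login command and try again before moving on to the next step.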
Step 3: Create YAML Configuration
Create a file named `deploy-deepseek-gcp.yaml`:
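As a rough sketch, the file can be generated with a few lines of Python. The `!Deployment` and `!Model` tags and the field names are assumptions based on Magemaker's published examples, so check them against the version you install. The endpoint name `deepseek-r1-distill` is a placeholder of ours; the model ID and the g2-standard-12 / NVIDIA L4 instance come from this tutorial.

```python
from pathlib import Path

# Assumed Magemaker schema: tags and field names are recalled from
# Magemaker's examples and may differ in your installed version.
CONFIG_TEMPLATE = """\
deployment: !Deployment
  destination: gcp
  endpoint_name: {endpoint_name}
  accelerator_count: 1
  instance_type: g2-standard-12
  accelerator_type: NVIDIA_L4
  num_gpus: 1
  quantization: null

models:
- !Model
  id: {model_id}
  source: huggingface
"""


def write_config(path="deploy-deepseek-gcp.yaml",
                 model_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
                 endpoint_name="deepseek-r1-distill"):  # placeholder name
    """Render the template, write it to `path`, and return the rendered text."""
    text = CONFIG_TEMPLATE.format(endpoint_name=endpoint_name, model_id=model_id)
    Path(path).write_text(text)
    return text


if __name__ == "__main__":
    print(write_config())
```

Running the script writes `deploy-deepseek-gcp.yaml` to the current directory and prints its contents so you can review them before deploying.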
Step 4: Deploy
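Deployment is driven by the Magemaker CLI. The wrapper below is a small sketch that assumes the `magemaker --deploy <file>` form shown in Magemaker's README and that the tool was installed with `pip install magemaker`; the function names are ours.

```python
import shutil
import subprocess


def deploy_command(config_path="deploy-deepseek-gcp.yaml"):
    """Build the CLI invocation; the `--deploy` flag is assumed from the README."""
    return ["magemaker", "--deploy", config_path]


def deploy(config_path="deploy-deepseek-gcp.yaml"):
    """Run the deployment; raise if the CLI is missing or exits non-zero."""
    if shutil.which("magemaker") is None:
        raise RuntimeError("magemaker not found on PATH; try `pip install magemaker`")
    subprocess.run(deploy_command(config_path), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is deployed by accident.
    print(" ".join(deploy_command()))
```

Call `deploy()` once you are happy with the YAML file. Provisioning a GPU endpoint can take several minutes.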
Step 5: Verify Deployment
Go to Google Cloud Console
Navigate to Vertex AI → Model Registry to confirm the model was uploaded, then open Vertex AI → Endpoints
Check your endpoint status
Step 6: Test the Endpoint
Use this Python code to test your deployment:
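A minimal sketch of such a test, using only the standard library and the Vertex AI REST `predict` method. The project, region, and endpoint ID are placeholders, and the `{"inputs": ...}` instance format is an assumption about the serving container, so adjust it if your endpoint rejects the payload. The access token comes from your local `gcloud` login.

```python
import json
import subprocess
import urllib.request


def predict_url(project, region, endpoint_id):
    """Vertex AI REST URL for online prediction on one endpoint."""
    return (f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
            f"/locations/{region}/endpoints/{endpoint_id}:predict")


def build_request(project, region, endpoint_id, prompt, token):
    """Prepare (but do not send) the prediction request."""
    # Assumption: the serving container accepts {"inputs": "<prompt>"} instances.
    body = json.dumps({"instances": [{"inputs": prompt}]}).encode("utf-8")
    return urllib.request.Request(
        predict_url(project, region, endpoint_id),
        data=body,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def access_token():
    """Fetch a short-lived token from the local gcloud login."""
    result = subprocess.run(["gcloud", "auth", "print-access-token"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def query(project, region, endpoint_id, prompt):
    """Send the prompt to the endpoint and return the parsed JSON response."""
    request = build_request(project, region, endpoint_id, prompt, access_token())
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


# Example call (placeholders, replace with your own values):
# print(query("my-project", "us-central1", "1234567890", "What is 2 + 2?"))
```

The endpoint ID is the numeric ID shown for your endpoint in the Vertex AI console.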
Common Issues and Solutions
Quota Issues
If you encounter quota errors:
Go to IAM & Admin → Quotas
Search for "NVIDIA L4 GPUs"
Request quota increase
Authentication Issues
Instance Availability
Check if g2-standard-12 is available in your region
Try different regions if needed
Monitoring Your Deployment
Monitor through Google Cloud Console:
Vertex AI → Endpoints
Cloud Monitoring
Cloud Logging
Cost Management
Pricing Breakdown
g2-standard-12 with NVIDIA L4: ~$1 per hour
Additional costs:
Network egress
API calls
Storage for model artifacts
Cost Optimization Tips
1. Delete endpoints when not in use
2. Use batch processing when possible
3. Monitor usage patterns
4. Set up billing alerts
5. Consider scheduled shutdowns for non-critical workloads
Monthly Cost Estimates
24/7 running: ~$720/month
8 hours/day: ~$240/month
4 hours/day: ~$120/month
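These figures follow from the ~$1 per hour rate and a 30-day month. The short sketch below reproduces them so you can plug in your own rate or schedule. The constants are assumptions taken from the estimates above, not official pricing, and the extra costs listed earlier are not included.

```python
HOURLY_RATE_USD = 1.0   # approximate g2-standard-12 + NVIDIA L4 rate quoted above
DAYS_PER_MONTH = 30     # assumption behind the monthly estimates


def monthly_cost(hours_per_day, hourly_rate=HOURLY_RATE_USD, days=DAYS_PER_MONTH):
    """Cost of running the endpoint `hours_per_day` hours every day of the month."""
    return hours_per_day * days * hourly_rate


for hours in (24, 8, 4):
    print(f"{hours:>2} h/day: ~${monthly_cost(hours):.0f}/month")
```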
Next Steps
Set up monitoring alerts
Configure auto-scaling if needed
Implement proper error handling
Test with different prompts
We are open-sourcing Magemaker!! Stay tuned!!!
As always, happy coding!!!
If you have any questions, please do not hesitate to ask faizan|jneid@slashml.com.
If you want to chat with any GitHub codebase, please visit CodesALot
If you want to chat with data and generate visualizations, please visit SirPlotsAlot