Deploying NudgeBee AI on AWS SageMaker

Overview

This guide walks you through deploying the NudgeBee AI model on AWS SageMaker.

Prerequisites

  • An AWS account with access to SageMaker, S3, IAM, and ECR
  • A trained NudgeBee model packaged as a .tar.gz archive
  • An S3 bucket to hold the model artifact (uploaded in Step 1)
  • An IAM role with the required SageMaker and S3 permissions
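If you need to create the execution role programmatically, the sketch below shows the trust policy SageMaker requires and attaches the AWS-managed AmazonSageMakerFullAccess policy. The role name is a placeholder, and the managed policy is broader than necessary; scope permissions down for production.

```python
import json

# Trust policy that lets the SageMaker service assume the execution role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

def create_execution_role(role_name="nudgebee-sagemaker-role"):
    """Create the role and attach a managed policy; returns the role ARN."""
    import boto3  # imported lazily so the module loads without boto3 installed

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    # Broad managed policy for convenience; replace with a scoped policy in production.
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    )
    return role["Role"]["Arn"]
```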

Step-by-Step Guide

Step 1: Upload Model to S3

  1. Log in to the AWS Console
  2. Go to Amazon S3
  3. Create a new bucket or open an existing one
  4. Upload nudgebee_model.tar.gz to the bucket
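If you prefer to script this step, the upload can be done with boto3; the bucket and key names below are placeholders for your own values.

```python
def s3_model_uri(bucket, key):
    """Build the s3:// URI that SageMaker expects for the model artifact."""
    return f"s3://{bucket}/{key}"

def upload_model(local_path="nudgebee_model.tar.gz",
                 bucket="my-nudgebee-bucket",
                 key="models/nudgebee_model.tar.gz"):
    """Upload the packaged model and return its S3 URI."""
    import boto3  # imported lazily so s3_model_uri works without AWS access

    boto3.client("s3").upload_file(local_path, bucket, key)
    return s3_model_uri(bucket, key)
```

The returned URI is what you paste into the "model artifact" field in Step 2.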

Step 2: Create SageMaker Model

  1. Open Amazon SageMaker
  2. Go to Inference > Models > Create model
  3. Name the model (e.g., nudgebee-ai-model)
  4. Provide the ECR container image URI
  5. Provide the S3 path to the model artifact
  6. Choose an IAM role with the required permissions
  7. Click Create model
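The same model can be registered with the boto3 `create_model` API. The image URI, S3 path, and role ARN below are placeholders; substitute the values from the previous steps.

```python
def container_definition(image_uri, model_data_url):
    """Container spec for the model: ECR image plus S3 model artifact."""
    return {"Image": image_uri, "ModelDataUrl": model_data_url}

def create_model(model_name="nudgebee-ai-model",
                 image_uri="<account>.dkr.ecr.<region>.amazonaws.com/nudgebee:latest",
                 model_data_url="s3://my-nudgebee-bucket/models/nudgebee_model.tar.gz",
                 role_arn="arn:aws:iam::<account>:role/nudgebee-sagemaker-role"):
    """Register the model with SageMaker; returns the model ARN."""
    import boto3

    sm = boto3.client("sagemaker")
    resp = sm.create_model(
        ModelName=model_name,
        PrimaryContainer=container_definition(image_uri, model_data_url),
        ExecutionRoleArn=role_arn,
    )
    return resp["ModelArn"]
```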

Step 3: Deploy Endpoint

  1. Go to Inference > Endpoint configurations
  2. Create a new endpoint configuration and add the model
  3. Choose an instance type (e.g., ml.m5.large)
  4. Create an endpoint from the configuration and wait for its status to reach InService
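The two console steps above map onto the `create_endpoint_config` and `create_endpoint` API calls; a minimal sketch, with the endpoint name as a placeholder:

```python
def endpoint_config_variant(model_name, instance_type="ml.m5.large", instance_count=1):
    """Production variant: which model to serve, and on what hardware."""
    return {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "InstanceType": instance_type,
        "InitialInstanceCount": instance_count,
    }

def deploy_endpoint(model_name="nudgebee-ai-model",
                    endpoint_name="nudgebee-ai-endpoint"):
    """Create an endpoint configuration, deploy it, and wait for InService."""
    import boto3

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        EndpointConfigName=f"{endpoint_name}-config",
        ProductionVariants=[endpoint_config_variant(model_name)],
    )
    sm.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=f"{endpoint_name}-config",
    )
    # Blocks until the endpoint is InService; deployment can take several minutes.
    sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
```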

RAG and LLM Server Configuration

RAG Server (SageMaker)

EMBEDDINGS_PROVIDER=sagemaker
EMBEDDINGS_PROVIDER_REGION=<AWS_SageMaker_Region>
EMBEDDINGS_PROVIDER_API_ENDPOINT=<SageMaker_Endpoint_URL>
EMBEDDINGS_MODEL_NAME=<Model_Name>

LLM Server (SageMaker)

LLM_PROVIDER=sagemaker
LLM_PROVIDER_API_ENDPOINT=<SageMaker_Endpoint_URL>
LLM_PROVIDER_REGION=<AWS_SageMaker_Region>

Testing

  1. In the SageMaker Console, open Endpoints and select the endpoint (or use the test inference option in SageMaker Studio)
  2. Send a JSON request and validate the response
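You can also test the endpoint programmatically with the SageMaker runtime client. The `{"inputs": ...}` request shape below is an assumption for illustration; the actual schema depends on the NudgeBee inference container.

```python
import json

def build_request(prompt):
    """Encode an example JSON request body (schema is container-specific)."""
    return json.dumps({"inputs": prompt}).encode("utf-8")

def invoke(endpoint_name="nudgebee-ai-endpoint", prompt="Hello"):
    """Send a test request to the endpoint and decode the JSON response."""
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    resp = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_request(prompt),
    )
    return json.loads(resp["Body"].read())
```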