This document explains the streaming capabilities in the AWS Bedrock Lambda Proxy (`bedrock-lambda-proxy-streaming.py`), including implementation details and deployment notes.
The Lambda proxy supports both streaming and non-streaming requests to AWS Bedrock models. The implementation provides two streaming response formats, described below.
To use streaming, simply add `"stream": true` to your request JSON:
```json
{
  "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Your prompt here"
        }
      ]
    }
  ],
  "anthropic_version": "bedrock-2023-05-31",
  "max_tokens": 1000
}
```
For non-streaming requests (`"stream": false` or omitted), the Lambda returns the complete model response in a single JSON object, exactly as provided by AWS Bedrock.
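As a rough illustration, a non-streaming call from Python might look like the following. The function URL is a placeholder, and depending on your deployment the URL may also require IAM authentication:

```python
import json
import requests

# Hypothetical endpoint; replace with your deployment's Function URL.
FUNCTION_URL = "https://<your-function-url>.lambda-url.us-east-1.on.aws/"

payload = {
    "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    # "stream" omitted, so the proxy returns one complete JSON object
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Your prompt here"}]}
    ],
}

resp = requests.post(FUNCTION_URL, json=payload, timeout=120)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))  # full Bedrock response, unmodified
```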
Depending on the deployment method, streaming responses come in one of two formats:
When using Lambda Function URLs with streaming enabled, the response:

- uses the `text/event-stream` content type (Server-Sent Events)
- sends each chunk as `data: {"content": "chunk text"}\n\n`
- terminates with `data: [DONE]\n\n`
This format is ideal for real-time display of responses as they’re generated.
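A minimal client sketch for consuming this SSE format, assuming the placeholder Function URL above, could look like this:

```python
import json
import requests

# Hypothetical endpoint; replace with your deployment's Function URL.
FUNCTION_URL = "https://<your-function-url>.lambda-url.us-east-1.on.aws/"

payload = {
    "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "stream": True,
    "messages": [{"role": "user", "content": [{"type": "text", "text": "Hello"}]}],
}

with requests.post(FUNCTION_URL, json=payload, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip the blank lines that separate SSE events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        print(json.loads(data)["content"], end="", flush=True)
```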
When using API Gateway with streaming enabled, the response:

- returns newline-delimited JSON, one chunk per line: `{"content": "chunk text"}`
- terminates with a final `{"done": true}` marker
This format is used because API Gateway doesn’t natively support streaming responses.
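Because API Gateway buffers the body, the client receives all chunks in a single response and splits them afterwards. A sketch, with a placeholder endpoint:

```python
import json
import requests

# Hypothetical API Gateway endpoint; replace with your stage URL.
API_URL = "https://<api-id>.execute-api.us-east-1.amazonaws.com/prod/invoke"

payload = {
    "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "stream": True,
    "messages": [{"role": "user", "content": [{"type": "text", "text": "Hello"}]}],
}

# The whole body arrives at once; it still needs to be split
# into per-chunk JSON lines.
resp = requests.post(API_URL, json=payload, timeout=300)
resp.raise_for_status()
for line in resp.text.splitlines():
    if not line.strip():
        continue
    chunk = json.loads(line)
    if chunk.get("done"):
        break  # final marker
    print(chunk["content"], end="")
```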
Add this permission to your Lambda’s IAM role:
```json
{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "*"
}
```
The Lambda proxy maintains compatibility with the existing usage tracking system; see `reporting-setup.yml` for the reporting configuration.
See the `client-integration.js` file for client-side integration examples.
For the best streaming experience, prefer Lambda Function URLs with streaming enabled, since they deliver chunks to the client in real time.
The Lambda proxy chooses a Bedrock API based on the `stream` parameter and deployment type:

- non-streaming requests call `invoke_model` and return the complete response
- streaming via a Function URL calls `invoke_model_with_response_stream` and returns SSE
- streaming via API Gateway calls `invoke_model_with_response_stream` but collects chunks into a single newline-delimited response

A sketch of this dispatch follows.
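This is a simplified boto3 sketch of that branch, not the actual proxy source; the function names are illustrative, and the exact shape of each Bedrock chunk event depends on the model:

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def invoke_non_streaming(model_id: str, body: dict) -> dict:
    # Single round trip: Bedrock returns the full response at once.
    resp = bedrock.invoke_model(modelId=model_id, body=json.dumps(body))
    return json.loads(resp["body"].read())

def invoke_streaming(model_id: str, body: dict):
    # Bedrock returns an event stream; each "chunk" event carries a
    # JSON-encoded payload in its "bytes" field.
    resp = bedrock.invoke_model_with_response_stream(
        modelId=model_id, body=json.dumps(body)
    )
    for event in resp["body"]:
        chunk = event.get("chunk")
        if chunk:
            # Raw Bedrock event; the proxy reshapes these into
            # {"content": "..."} chunks before sending them to clients.
            yield json.loads(chunk["bytes"])
```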