Architecture Patterns
Production-ready patterns for building Python applications on Cloudflare's edge. Each pattern combines multiple Cloudflare services into a cohesive architecture.
Pattern 1: AI-Powered API
Combine Workers AI and Vectorize to build a complete RAG (Retrieval-Augmented Generation) API. Users submit a query, the system retrieves relevant context via semantic search, then the LLM generates a response grounded in that context.
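    from js import Object
    from pyodide.ffi import to_js as _to_js
    from workers import Response


    def to_js(obj):
        # Convert Python dicts/lists into plain JS objects before handing them to bindings
        return _to_js(obj, dict_converter=Object.fromEntries)


    async def on_fetch(request, env):
        # Bindings arrive on the env argument; they are not importable from js
        if request.method != "POST":
            return Response("Method Not Allowed", status=405)

        data = await request.json()

        # Generate embedding for the query
        embedding = await env.AI.run(
            "@cf/baai/bge-base-en-v1.5",
            to_js({"text": data["query"]}),
        )

        # Semantic search for relevant context (metadata is only returned when requested)
        results = await env.VECTORIZE_INDEX.query(
            embedding.data[0], to_js({"topK": 5, "returnMetadata": "all"})
        )

        # Build context from search results
        # (assumes each vector was stored with a "content" metadata field)
        context = "\n".join(r.metadata.content for r in results.matches)

        # Generate response using LLM with context
        response = await env.AI.run(
            "@cf/meta/llama-3-8b-instruct",
            to_js({
                "messages": [
                    {"role": "system", "content": f"Context: {context}"},
                    {"role": "user", "content": data["query"]},
                ]
            }),
        )
        return Response.json({"answer": response.response})

Services used: Workers AI (embeddings + LLM), Vectorize (vector search), Python Workers (API handler)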
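The context-building step in this pattern can be isolated into a plain function that is easy to test outside the Workers runtime. The sketch below is one possible approach, not part of any Cloudflare API: the name `build_rag_messages`, the `max_context_chars` cap, and the dict shape used for matches are all assumptions made for illustration.

```python
def build_rag_messages(query, matches, max_context_chars=4000):
    """Assemble chat messages from a query and search matches.

    `matches` is assumed to be a list of dicts shaped like
    {"score": float, "metadata": {"content": str}} (illustrative shape).
    """
    chunks = []
    used = 0
    # Highest-scoring matches first, so the cap drops the least relevant text.
    for match in sorted(matches, key=lambda m: m.get("score", 0.0), reverse=True):
        content = (match.get("metadata") or {}).get("content")
        if not content:
            continue  # skip vectors that were stored without text metadata
        if used + len(content) > max_context_chars:
            break  # stop once the approximate character budget is spent
        chunks.append(content)
        used += len(content)

    context = "\n".join(chunks)
    system = (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": query},
    ]


if __name__ == "__main__":
    demo = [
        {"score": 0.71, "metadata": {"content": "R2 stores objects."}},
        {"score": 0.93, "metadata": {"content": "D1 is a SQL database."}},
        {"score": 0.50, "metadata": {}},
    ]
    for message in build_rag_messages("What is D1?", demo):
        print(message["role"], "->", message["content"])
```

Capping the context keeps the prompt within the model's input limit, and sorting by score means the cap discards the weakest matches first.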
Pattern 2: Async Data Pipeline
Accept a file URL via an API, copy the file into R2, queue it for background processing with AI, and persist the results in D1. The API returns immediately while processing happens asynchronously.
    from uuid import uuid4

    from js import Object, fetch
    from pyodide.ffi import to_js as _to_js
    from workers import Response


    def to_js(obj):
        # Convert Python dicts into plain JS objects before handing them to bindings
        return _to_js(obj, dict_converter=Object.fromEntries)


    # API endpoint triggers processing
    async def on_fetch(request, env):
        file_url = (await request.json())["file_url"]

        # Fetch the file and store it in R2
        file_data = await fetch(file_url)
        key = f"uploads/{uuid4()}"
        await env.BUCKET.put(key, file_data.body)

        # Queue for background processing
        await env.PROCESS_QUEUE.send(to_js({
            "key": key,
            "user_id": request.headers.get("X-User-ID"),
        }))
        return Response.json({"status": "processing"})


    # Queue consumer processes files asynchronously
    async def queue_handler(batch, env):
        for msg in batch.messages:
            # Assumes message bodies arrive as JS objects, so fields are read as attributes
            obj = await env.BUCKET.get(msg.body.key)
            if obj is None:
                msg.ack()  # object no longer exists; nothing to process
                continue
            data = await obj.text()

            # Process with AI (summarization); this model expects input_text
            summary = await env.AI.run(
                "@cf/facebook/bart-large-cnn", to_js({"input_text": data})
            )

            # Store results in D1
            await env.DB.prepare("""
                INSERT INTO summaries (user_id, summary, created_at)
                VALUES (?, ?, datetime('now'))
            """).bind(msg.body.user_id, summary.summary).run()
            msg.ack()

Services used: R2 (file storage), Queues (async processing), Workers AI (summarization), D1 (results database)
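A consumer like the one in this pattern usually needs to decide, per message, whether to acknowledge or ask for redelivery. The following is a minimal sketch of that decision using in-memory stand-ins so it runs locally. `process_batch`, `FakeMessage`, `FakeBatch`, and `flaky_handler` are hypothetical names; the sketch assumes only that a message exposes `ack()` and `retry()`, as Cloudflare Queues messages do.

```python
import asyncio


async def process_batch(batch, handle):
    """Run `handle` on each message; ack on success, retry on failure.

    `handle` is a hypothetical coroutine standing in for the R2 -> AI -> D1 steps.
    """
    outcomes = {"acked": 0, "retried": 0}
    for msg in batch.messages:
        try:
            await handle(msg.body)
        except Exception:
            # Ask for redelivery of this message only, not the whole batch.
            msg.retry()
            outcomes["retried"] += 1
        else:
            msg.ack()
            outcomes["acked"] += 1
    return outcomes


class FakeMessage:
    """In-memory stand-in for a queue message, for local testing only."""

    def __init__(self, body):
        self.body = body
        self.state = "pending"

    def ack(self):
        self.state = "acked"

    def retry(self):
        self.state = "retried"


class FakeBatch:
    """In-memory stand-in for a message batch."""

    def __init__(self, bodies):
        self.messages = [FakeMessage(b) for b in bodies]


async def flaky_handler(body):
    # Simulate a missing object for one of the keys.
    if body["key"].endswith("missing"):
        raise LookupError(body["key"])


if __name__ == "__main__":
    batch = FakeBatch([{"key": "uploads/a"}, {"key": "uploads/missing"}])
    print(asyncio.run(process_batch(batch, flaky_handler)))
    print([m.state for m in batch.messages])
```

Acknowledging each message individually means one failing file does not force the rest of the batch to be reprocessed.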
Pattern 3: Full-Stack Application
Replace a typical Docker Compose stack (nginx, gunicorn, Redis, Postgres, Celery) with a single Python file using FastAPI:
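    # main.py - Your entire "stack" in one file
    import json

    from fastapi import FastAPI, HTTPException, Request, Response
    from js import Object
    from pyodide.ffi import to_js as _to_js

    app = FastAPI()

    def to_js(obj):
        # Convert Python dicts into plain JS objects before handing them to bindings
        return _to_js(obj, dict_converter=Object.fromEntries)

    # Worker entry point: hand each request to FastAPI through the ASGI adapter
    async def on_fetch(request, env):
        import asgi
        return await asgi.fetch(app, request, env)

    # Serve static assets directly
    @app.get("/static/{path:path}")
    async def static(path: str, req: Request):
        env = req.scope["env"]  # bindings are exposed on the ASGI scope
        # Workers can serve from KV or R2; ASSETS is assumed to be an R2 bucket here
        asset = await env.ASSETS.get(path)
        if asset is None:
            raise HTTPException(status_code=404)
        body = (await asset.arrayBuffer()).to_bytes()
        return Response(content=body, media_type=asset.httpMetadata.contentType)

    # Your app logic
    @app.post("/api/process")
    async def process(data: dict, req: Request):
        env = req.scope["env"]
        payload = json.dumps(data)
        # Cache in KV (replaces Redis); KV values must be strings or bytes
        await env.KV.put(f"cache:{data['id']}", payload)
        # Queue background work (replaces Celery)
        await env.QUEUE.send(to_js(data))
        # Store in D1 (replaces Postgres)
        await env.DB.prepare(
            "INSERT INTO items (data) VALUES (?)"
        ).bind(payload).run()
        # Track metrics (replaces Prometheus); writeDataPoint returns nothing to await
        env.ANALYTICS.writeDataPoint(to_js({
            "blobs": ["api_call"],
            "doubles": [1],
        }))
        return {"status": "ok"}

Services used: Workers KV (caching), Queues (background tasks), D1 (database), R2/KV (static assets), Analytics Engine (metrics)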
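When KV stands in for Redis, the usual access pattern is cache-aside: read the key, and only compute and store the value on a miss. The sketch below illustrates that flow against an in-memory stand-in so it can run locally. `FakeKV` and `get_or_compute` are hypothetical names, and the stand-in mimics only async `get` and `put`, not the full KV binding.

```python
import asyncio
import json


class FakeKV:
    """In-memory stand-in exposing async get/put, loosely modeled on a KV binding."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def put(self, key, value):
        self.store[key] = value


async def get_or_compute(kv, key, compute):
    """Return (value, was_cached), computing and storing the value on a miss."""
    cached = await kv.get(key)
    if cached is not None:
        return json.loads(cached), True
    value = await compute()
    # Values are stored as strings, so serialize before writing.
    await kv.put(key, json.dumps(value))
    return value, False


async def demo():
    kv = FakeKV()
    calls = 0

    async def load_item():
        # Stands in for an expensive lookup, e.g. a database query.
        nonlocal calls
        calls += 1
        return {"id": 42, "name": "widget"}

    first = await get_or_compute(kv, "cache:42", load_item)
    second = await get_or_compute(kv, "cache:42", load_item)
    return first, second, calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because KV is eventually consistent, this pattern suits data that tolerates slightly stale reads.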
Cost Comparison
A typical production stack on Cloudflare vs traditional cloud:
| Service | Cost |
|---|---|
| Workers (10M requests) | $5/month |
| D1 (1GB) | $5/month |
| KV (1M reads) | $0.50/month |
| R2 (100GB) | $1.50/month |
| Queues (1M messages) | $0 |
| Total | ~$12/month, no egress fees |
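The total can be checked by summing the line items. The snippet below copies the figures from the table as they appear; it does not model Cloudflare's actual pricing tiers.

```python
# Monthly line items copied from the table (USD).
monthly_costs = {
    "Workers (10M requests)": 5.00,
    "D1 (1GB)": 5.00,
    "KV (1M reads)": 0.50,
    "R2 (100GB)": 1.50,
    "Queues (1M messages)": 0.00,
}

total = sum(monthly_costs.values())
print(f"Total: ${total:.2f}/month")  # prints "Total: $12.00/month"
```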