Architecture Patterns

Production-ready patterns for building Python applications on Cloudflare's edge. Each pattern combines multiple Cloudflare services into a cohesive architecture.

Pattern 1: AI-Powered API

Combine Workers AI and Vectorize to build a RAG (Retrieval-Augmented Generation) API. A user submits a query; the Worker embeds it, retrieves the closest stored passages from Vectorize via semantic search, and passes those passages to an LLM as context for the answer.

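import json

from js import Headers, Object, Response
from pyodide.ffi import to_js

# Bindings (AI, VECTORIZE_INDEX) arrive on `env`, the handler's second argument
async def on_fetch(request, env):
    if request.method != "POST":
        return Response.new("Method Not Allowed", status=405)

    data = await request.json()

    # Generate embedding for the query
    embedding = await env.AI.run(
        "@cf/baai/bge-base-en-v1.5",
        text=data["query"]
    )

    # Semantic search for relevant context; request metadata so that
    # each match carries the stored `content` field
    results = await env.VECTORIZE_INDEX.query(
        embedding.data[0], topK=5, returnMetadata="all"
    )

    # Build context from search results
    context = "\n".join(
        r.metadata.content for r in results.matches
    )

    # Generate response using LLM with context. The nested list of dicts is
    # converted to plain JavaScript values before it crosses the FFI boundary.
    llm_input = to_js(
        {"messages": [
            {"role": "system", "content": f"Context: {context}"},
            {"role": "user", "content": data["query"]},
        ]},
        dict_converter=Object.fromEntries,
    )
    response = await env.AI.run("@cf/meta/llama-3-8b-instruct", llm_input)

    # Serialize in Python and return an explicit JSON response
    headers = Headers.new({"content-type": "application/json"}.items())
    return Response.new(json.dumps({"answer": response.response}), headers=headers)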

Services used: Workers AI (embeddings + LLM), Vectorize (vector search), Python Workers (API handler)
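
The context-building step in this pattern joins every match verbatim, which can exceed the model's context window when stored passages are long. A minimal sketch of a budgeted variant follows. The helper name `build_context`, the `max_chars` parameter, and the plain list of strings it takes are illustrative assumptions, not part of the Vectorize or Workers AI APIs.

```python
def build_context(passages, max_chars=2000):
    """Join passages in the given (already ranked) order within a character budget.

    `passages` is assumed to be a list of strings, best match first,
    for example [r.metadata.content for r in results.matches].
    """
    parts = []
    used = 0
    for passage in passages:
        text = passage.strip()
        if not text:
            continue  # skip empty passages
        # One extra character pays for the newline separator between passages
        cost = len(text) + (1 if parts else 0)
        if used + cost > max_chars:
            break  # stop at the first passage that does not fit
        parts.append(text)
        used += cost
    return "\n".join(parts)
```

Stopping at the first passage that does not fit keeps the highest-ranked passages intact rather than truncating one mid-sentence. A token-based budget would be more precise than a character count, if a tokenizer is available.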

Pattern 2: Async Data Pipeline

Accept a file URL via an API, copy the file into R2, queue it for background processing with AI, and persist the result in D1. The API returns immediately while processing happens asynchronously.

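import json
from uuid import uuid4

from js import Headers, Object, Response, fetch
from pyodide.ffi import to_js

# API endpoint triggers processing
async def on_fetch(request, env):
    file_url = (await request.json())["file_url"]

    # Store in R2
    file_data = await fetch(file_url)
    key = f"uploads/{uuid4()}"
    await env.BUCKET.put(key, file_data.body)

    # Queue for background processing (message converted to a plain JS object)
    await env.PROCESS_QUEUE.send(to_js({
        "key": key,
        "user_id": request.headers.get("X-User-ID")
    }, dict_converter=Object.fromEntries))

    headers = Headers.new({"content-type": "application/json"}.items())
    return Response.new(json.dumps({"status": "processing"}), headers=headers)

# Queue consumer processes files asynchronously
async def queue_handler(batch, env):
    for msg in batch.messages:
        # Get file from R2; the message body is a JS object, so use attribute access
        obj = await env.BUCKET.get(msg.body.key)
        data = await obj.text()

        # Process with AI (summarization); this model reads `input_text`
        summary = await env.AI.run(
            "@cf/facebook/bart-large-cnn", input_text=data
        )

        # Store results in D1
        await env.DB.prepare("""
            INSERT INTO summaries (user_id, summary, created_at)
            VALUES (?, ?, datetime('now'))
        """).bind(msg.body.user_id, summary.summary).run()

        # Acknowledge only after the result has been persisted
        msg.ack()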

Services used: R2 (file storage), Queues (async processing), Workers AI (summarization), D1 (results database)
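
In the consumer for this pattern, an exception raised while handling one message stops the loop before the remaining messages are acknowledged. One way to isolate failures is to acknowledge or retry each message individually. The sketch below assumes only the `ack()` and `retry()` methods of a queue message. `process_batch`, `FakeMessage`, and `demo` are hypothetical names, and `FakeMessage` is a local stand-in used to exercise the loop outside the Workers runtime.

```python
import asyncio


async def process_batch(messages, handle):
    """Run `handle` on each message body; ack on success, retry on failure."""
    for msg in messages:
        try:
            await handle(msg.body)
        except Exception:
            msg.retry()  # ask the queue to redeliver only this message
        else:
            msg.ack()  # mark this message as done


class FakeMessage:
    """In-memory stand-in for a queue message (not the real binding)."""

    def __init__(self, body):
        self.body = body
        self.state = "pending"

    def ack(self):
        self.state = "acked"

    def retry(self):
        self.state = "retried"


async def demo():
    async def handle(body):
        # Hypothetical validation: every message must name an R2 key
        if body.get("key") is None:
            raise ValueError("missing key")

    msgs = [FakeMessage({"key": "uploads/a"}), FakeMessage({})]
    await process_batch(msgs, handle)
    return [m.state for m in msgs]
```

Here the second message fails validation and is retried while the first is still acknowledged. In production, a retry limit or dead-letter queue would likely be needed so that a permanently invalid message is not redelivered indefinitely.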

Pattern 3: Full-Stack Application

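Replace a typical Docker Compose stack (nginx, gunicorn, Redis, Postgres, Celery) with a single Python file using FastAPI. The Worker forwards each request to the FastAPI app through an ASGI adapter, and route handlers read their bindings from the request scope: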

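# main.py - Your entire "stack" in one file
import json

from fastapi import FastAPI, Request, Response
from js import Object
from pyodide.ffi import to_js

app = FastAPI()

# Worker entry point: hand each request to FastAPI through the ASGI adapter
async def on_fetch(request, env):
    import asgi
    return await asgi.fetch(app, request, env)

# Serve static assets directly
@app.get("/static/{path:path}")
async def static(path: str, req: Request):
    env = req.scope["env"]  # bindings are exposed on the ASGI scope
    # ASSETS is an R2 bucket binding here (httpMetadata is R2-specific)
    asset = await env.ASSETS.get(path)
    body = (await asset.arrayBuffer()).to_bytes()
    return Response(content=body, media_type=asset.httpMetadata.contentType)

# Your app logic
@app.post("/api/process")
async def process(data: dict, req: Request):
    env = req.scope["env"]
    payload = json.dumps(data)

    # Cache in KV (replaces Redis); KV stores strings or binary data, not dicts
    await env.KV.put(f"cache:{data['id']}", payload)

    # Queue background work (replaces Celery)
    await env.QUEUE.send(to_js(data, dict_converter=Object.fromEntries))

    # Store in D1 (replaces Postgres); bind the JSON string, not the dict
    await env.DB.prepare(
        "INSERT INTO items (data) VALUES (?)"
    ).bind(payload).run()

    # Track metrics (replaces Prometheus); writeDataPoint returns nothing to await
    env.ANALYTICS.writeDataPoint(to_js({
        "blobs": ["api_call"],
        "doubles": [1],
    }, dict_converter=Object.fromEntries))

    return {"status": "ok"}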

Services used: Workers KV (caching), Queues (background tasks), D1 (database), R2/KV (static assets), Analytics Engine (metrics)
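
The handler in this pattern only writes to the cache. To stand in for Redis, KV is usually read first and written on a miss (the cache-aside approach). The sketch below illustrates that flow under stated assumptions: `cached`, `FakeKV`, and `demo` are hypothetical names, and `FakeKV` is an in-memory stand-in that mimics only `get` and `put`. How a real binding reports a miss to Python should be verified against your runtime version.

```python
import asyncio
import json


async def cached(kv, key, compute):
    """Return the cached JSON value for `key`; compute and store it on a miss."""
    raw = await kv.get(key)
    if raw is not None:
        return json.loads(raw)  # cache hit: decode the stored JSON string
    value = await compute()
    await kv.put(key, json.dumps(value))  # store as a string
    return value


class FakeKV:
    """In-memory stand-in for a KV namespace (not the real binding)."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def put(self, key, value):
        self.store[key] = value


async def demo():
    kv = FakeKV()
    calls = 0

    async def compute():
        nonlocal calls
        calls += 1  # count how often the expensive path runs
        return {"id": 7, "status": "ok"}

    first = await cached(kv, "cache:7", compute)
    second = await cached(kv, "cache:7", compute)
    return first, second, calls
```

The second lookup is served from the cache, so `compute` runs once. KV is eventually consistent, so this suits data that tolerates briefly stale reads.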

Cost Comparison

Approximate monthly cost of a typical production stack on Cloudflare, for comparison against a traditional cloud bill:

Service	Cost
Workers (10M requests)	$5/month
D1 (1GB)	$5/month
KV (1M reads)	$0.50/month
R2 (100GB)	$1.50/month
Queues (1M messages)	$0
Total	~$12/month, no egress fees
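
The total can be checked by summing the line items. The snippet below copies the figures from the table as listed; it does not query live pricing, and the dictionary name is illustrative.

```python
# Monthly line items in USD, copied from the table above
monthly_costs = {
    "Workers (10M requests)": 5.00,
    "D1 (1GB)": 5.00,
    "KV (1M reads)": 0.50,
    "R2 (100GB)": 1.50,
    "Queues (1M messages)": 0.00,
}

total = sum(monthly_costs.values())
print(f"Total: ${total:.2f}/month")  # prints "Total: $12.00/month"
```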