
Module 3.8: Azure Functions & Serverless

Complexity: [MEDIUM] | Time to Complete: 2h | Prerequisites: Module 3.4 (Blob Storage), Module 3.1 (Entra ID)

After completing this module, you will be able to:

  • Deploy Azure Functions with HTTP, Timer, Blob, and Service Bus triggers using the Flex Consumption plan
  • Configure Durable Functions for stateful orchestration patterns (fan-out/fan-in, human interaction, chaining)
  • Implement Azure Functions with VNet integration, managed identity, and Key Vault secret references
  • Optimize Azure Functions performance by configuring host.json settings, concurrency limits, and scale-out rules

In 2022, an e-commerce company was processing product image uploads. Their workflow was simple: a customer uploads an image, the application resizes it into three formats (thumbnail, medium, large), and stores the results. They ran this on a pair of D4s_v5 VMs behind a load balancer, running a Node.js process that polled an upload queue every 500 milliseconds. The two VMs cost $280/month. During business hours, they processed about 200 images per hour. Between midnight and 6 AM, they processed zero. On Black Friday, they processed 15,000 images per hour and the VMs could not keep up, causing a 3-hour backlog of unprocessed images. After migrating to Azure Functions with a Blob trigger, images were processed within 2 seconds of upload, regardless of volume. The Functions scaled automatically from zero to hundreds of concurrent executions during Black Friday, and back to zero at night. Their monthly bill dropped from $280 to $23---and the image processing backlog disappeared permanently.

Azure Functions is Microsoft’s function-as-a-service (FaaS) platform. You write a small piece of code---a function---that is triggered by an event (an HTTP request, a new blob, a queue message, a timer). Azure handles everything else: provisioning infrastructure, scaling, patching, and monitoring. You pay only for the compute time your code consumes, measured in gigabyte-seconds.

In this module, you will learn the three hosting plans for Functions, how triggers and bindings eliminate boilerplate integration code, and how Durable Functions orchestrate multi-step workflows. By the end, you will build a function triggered by a blob upload that processes data and writes results to Cosmos DB using output bindings.


The hosting plan determines the scaling behavior, available resources, and pricing model for your Functions. Choosing the right plan is one of the most consequential decisions.

| Feature | Consumption | Flex Consumption | Premium (EP) | Dedicated (ASP) |
|---|---|---|---|---|
| Scaling | 0 to 200 instances | 0 to 1000 instances | 1 to 100 instances | Manual or autoscale |
| Scale to zero | Yes | Yes | Optional (min 1) | No |
| Cold start | Yes (1-10 seconds) | Reduced (pre-warmed) | No (always warm) | No (always running) |
| Max execution | 5 min (default, 10 max) | 30 min | Unlimited | Unlimited |
| Memory | 1.5 GB | Up to 4 GB | 3.5-14 GB | Plan-dependent |
| VNet integration | No | Yes | Yes | Yes |
| Cost model | Per execution + GB-s | Per execution + GB-s | Per instance hour | Per App Service Plan |
| Free grant | 1M executions + 400K GB-s/month | Similar | None | None |
| Best for | Event-driven, sporadic | Event-driven, predictable | Low latency, always ready | Existing ASP, long jobs |

Stop and think: If your company has a strict policy that all database traffic must route through a private VNet, but you want to avoid paying for instances when no traffic is hitting your function at night, which hosting plan is your only viable option?

```mermaid
flowchart TD
    VNet{Does your function need VNet access?}
    VNet -- YES --> Cost{Is cost a primary concern?}
    VNet -- NO --> Duration{Is execution time always under 5 minutes?}
    Cost -- YES --> Flex[Flex Consumption<br/>scales to zero, VNet support]
    Cost -- NO --> Premium[Premium EP1<br/>no cold start, VNet, unlimited duration]
    Duration -- YES --> Consumption[Consumption<br/>cheapest, simplest]
    Duration -- NO --> PremiumDedicated[Premium or Dedicated]
```
```bash
# Create a Consumption plan Function App (Python requires Linux)
az functionapp create \
  --resource-group myRG \
  --consumption-plan-location eastus2 \
  --runtime python \
  --runtime-version 3.11 \
  --os-type Linux \
  --functions-version 4 \
  --name kubedojo-func-$(openssl rand -hex 4) \
  --storage-account "$STORAGE_NAME"

# Create a Premium plan Function App
az functionapp plan create \
  --resource-group myRG \
  --name kubedojo-premium-plan \
  --location eastus2 \
  --sku EP1 \
  --min-instances 1 \
  --max-burst 20

az functionapp create \
  --resource-group myRG \
  --plan kubedojo-premium-plan \
  --runtime python \
  --runtime-version 3.11 \
  --functions-version 4 \
  --name kubedojo-premium-func \
  --storage-account "$STORAGE_NAME"
```

Cold start is the latency added when a function executes for the first time (or after an idle period). It happens because Azure needs to allocate a worker, load the runtime, and initialize your code.

```mermaid
flowchart LR
    A["Worker Alloc<br/>(~1-2s)"] --> B["Runtime Init<br/>(~0.5-1s)"]
    B --> C["Dep Load (pip)<br/>(~1-4s)"]
    C --> D["Your Code Init<br/>(~0.2-1s)"]
```

Total: 3-8 seconds

Mitigation strategies:

  1. Keep dependencies minimal (fewer pip packages)
  2. Use Premium plan (always-warm instances)
  3. Use Flex Consumption (pre-provisioned instances)
  4. Avoid heavy initialization in function startup

War Story: A payment processing company chose the Consumption plan for their webhook handler. Most requests completed in 200ms. But once every 15-20 minutes, the function would cold start, adding 6 seconds of latency. Their payment provider interpreted these 6-second responses as timeouts and marked them as failed, triggering retry logic that created duplicate transactions. Switching to Premium plan with 1 minimum instance eliminated cold starts entirely, and the duplicate transaction problem vanished overnight.


Triggers and Bindings: The Power of Declarative Integration


Triggers and bindings are what make Azure Functions genuinely productive. A trigger is the event that causes a function to execute. A binding is a declarative connection to another Azure service that handles the boilerplate of reading from or writing to that service.

| Trigger | Event Source | Common Use Case |
|---|---|---|
| HTTP | HTTP request | REST APIs, webhooks |
| Timer | Cron schedule | Scheduled tasks, cleanup jobs |
| Blob | New/modified blob | Image processing, file transformation |
| Queue | Queue message | Async task processing |
| Service Bus | SB message | Enterprise messaging |
| Event Hub | Streaming events | IoT, telemetry processing |
| Cosmos DB | Document change feed | Real-time data synchronization |
| Event Grid | Azure events | Resource change reactions |
```mermaid
flowchart LR
    subgraph Triggers
        QueueTrig[Queue Message]
    end
    subgraph Input Bindings
        Blob[Blob Storage]
        Table[Table Storage]
    end
    subgraph Azure Function
        Code((Your Code<br/>just the logic))
    end
    subgraph Output Bindings
        Cosmos[Cosmos DB Document]
        QueueOut[Queue Message]
        Email[SendGrid Email]
    end
    Blob --> Code
    Table --> Code
    QueueTrig --> Code
    Code --> Cosmos
    Code --> QueueOut
    Code --> Email
```

Without bindings: 50+ lines of SDK setup code. With bindings: 0 lines of SDK code (declarative).

Pause and predict: If you use an output binding to write a document to Cosmos DB, but the Cosmos DB service experiences a brief 2-second network blip while the function runs, do you need to write custom retry logic in your Python code?

```python
# function_app.py - Timer trigger (run every 30 minutes)
import json
import logging
import uuid
from datetime import datetime

import azure.functions as func

app = func.FunctionApp()

@app.timer_trigger(
    schedule="0 */30 * * * *",  # Every 30 minutes
    arg_name="timer",
    run_on_startup=False
)
def cleanup_job(timer: func.TimerRequest) -> None:
    if timer.past_due:
        logging.warning("Timer is past due!")
    logging.info("Running scheduled cleanup...")
    # Your cleanup logic here

# HTTP trigger with Cosmos DB output binding
@app.route(route="orders", methods=["POST"], auth_level=func.AuthLevel.FUNCTION)
@app.cosmos_db_output(
    arg_name="outputDocument",
    database_name="OrdersDB",
    container_name="orders",
    connection="CosmosDBConnection"
)
def create_order(req: func.HttpRequest, outputDocument: func.Out[func.Document]) -> func.HttpResponse:
    order = req.get_json()
    order["id"] = str(uuid.uuid4())
    order["createdAt"] = datetime.utcnow().isoformat()
    # Write to Cosmos DB via output binding (no SDK code needed!)
    outputDocument.set(func.Document.from_dict(order))
    return func.HttpResponse(
        json.dumps({"orderId": order["id"]}),
        status_code=201,
        mimetype="application/json"
    )

# Blob trigger with Queue output binding
@app.blob_trigger(
    arg_name="inputBlob",
    path="uploads/{name}",
    connection="StorageConnection"
)
@app.queue_output(
    arg_name="outputQueue",
    queue_name="processing-results",
    connection="StorageConnection"
)
def process_upload(inputBlob: func.InputStream, outputQueue: func.Out[str]) -> None:
    logging.info(f"Processing blob: {inputBlob.name}, Size: {inputBlob.length} bytes")
    # Process the blob content
    content = inputBlob.read()
    result = {
        "blobName": inputBlob.name,
        "size": inputBlob.length,
        "processedAt": datetime.utcnow().isoformat()
    }
    # Send result to queue via output binding
    outputQueue.set(json.dumps(result))
    logging.info(f"Result queued for {inputBlob.name}")
```
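The timer schedule above uses NCRONTAB syntax, which has six fields rather than classic cron's five: the extra leading field is seconds. A quick sketch of how the expression breaks down (`labels` and `parsed` are just illustrative names):

```python
# NCRONTAB has six space-separated fields: the leading one is "second",
# which classic five-field cron lacks.
schedule = "0 */30 * * * *"
labels = ["second", "minute", "hour", "day", "month", "day-of-week"]
parsed = dict(zip(labels, schedule.split()))
# parsed["minute"] == "*/30": fire at minute 0 and 30 of every hour,
# at second 0, on every day of every month.
```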
```bash
# Initialize a new Function project locally
func init MyFunctionProject --python

# Create a new function
cd MyFunctionProject
func new --name ProcessBlob --template "Azure Blob Storage trigger"

# Run locally
func start

# Deploy to Azure
func azure functionapp publish kubedojo-func-xxxx

# Or deploy via Azure CLI with zip deploy
cd MyFunctionProject
zip -r function.zip . -x ".venv/*"
az functionapp deployment source config-zip \
  --resource-group myRG \
  --name kubedojo-func-xxxx \
  --src function.zip
```

Durable Functions: Orchestrating Complex Workflows


Regular Azure Functions are stateless---each execution is independent. Durable Functions add state management, enabling you to write multi-step workflows, fan-out/fan-in patterns, and human interaction patterns.

Pattern 1: Function Chaining

```mermaid
flowchart LR
    P1S1[Step 1: Validate] --> P1S2[Step 2: Enrich]
    P1S2 --> P1S3[Step 3: Process]
    P1S3 --> P1S4[Step 4: Notify]
```

Pattern 2: Fan-Out / Fan-In

```mermaid
flowchart LR
    Start[Start] --> TaskA[Task A]
    Start --> TaskB[Task B]
    Start --> TaskC[Task C]
    TaskA --> Aggregate[Aggregate]
    TaskB --> Aggregate
    TaskC --> Aggregate
```
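The fan-out/fan-in shape maps closely onto ordinary Python concurrency. As a plain-asyncio analogy (deliberately independent of the Durable Functions SDK; `worker` is a hypothetical stand-in for an activity), `asyncio.gather` plays the role that `task_all` plays in a Durable orchestrator:

```python
import asyncio

# Plain-asyncio analogy of fan-out/fan-in: launch tasks in parallel,
# then aggregate once all of them complete.
async def worker(name: str) -> str:
    await asyncio.sleep(0)          # stand-in for real work
    return f"done-{name}"

async def orchestrate() -> list:
    tasks = [worker(n) for n in ("A", "B", "C")]   # fan-out
    return await asyncio.gather(*tasks)            # fan-in: wait for all

results = asyncio.run(orchestrate())
print(results)   # ['done-A', 'done-B', 'done-C']
```

The key difference: a Durable orchestrator checkpoints its progress to storage between steps, so the "fan-in" survives process restarts, which plain asyncio cannot do.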

Pattern 3: Async HTTP API (Long-Running)

```mermaid
sequenceDiagram
    participant Client
    participant API as Durable Function API
    Client->>API: Start Long-Running Task
    API-->>Client: Return Status URL (202 Accepted)
    loop Until Complete
        Client->>API: Poll Status URL
        API-->>Client: Status: Running
    end
    Client->>API: Poll Status URL
    API-->>Client: Status: Complete + Result
```
```python
# Durable Functions example: Image processing pipeline
import azure.functions as func
import azure.durable_functions as df

app = func.FunctionApp()

# Orchestrator function (coordinates the workflow)
@app.orchestration_trigger(context_name="context")
def image_pipeline(context: df.DurableOrchestrationContext):
    # Input: blob URL from the trigger
    blob_url = context.get_input()

    # Step 1: Validate the image
    validation = yield context.call_activity("validate_image", blob_url)
    if not validation["valid"]:
        return {"status": "rejected", "reason": validation["reason"]}

    # Step 2: Fan-out - create multiple sizes in parallel
    parallel_tasks = [
        context.call_activity("resize_image", {"url": blob_url, "size": "thumbnail"}),
        context.call_activity("resize_image", {"url": blob_url, "size": "medium"}),
        context.call_activity("resize_image", {"url": blob_url, "size": "large"}),
    ]
    results = yield context.task_all(parallel_tasks)

    # Step 3: Save metadata to database
    metadata = yield context.call_activity("save_metadata", {
        "original": blob_url,
        "variants": results
    })
    return {"status": "completed", "metadata": metadata}

# Activity functions (do the actual work)
@app.activity_trigger(input_name="blobUrl")
def validate_image(blobUrl: str) -> dict:
    # Validate image format, size, etc.
    return {"valid": True, "format": "jpeg", "dimensions": "3024x4032"}

@app.activity_trigger(input_name="params")
def resize_image(params: dict) -> str:
    # Resize the image (actual PIL/Pillow code would go here)
    return f"resized/{params['size']}/{params['url'].split('/')[-1]}"

@app.activity_trigger(input_name="data")
def save_metadata(data: dict) -> dict:
    # Save to Cosmos DB or Table Storage
    return {"id": "img-12345", "saved": True}

# HTTP trigger to start the orchestration
@app.route(route="start-pipeline", methods=["POST"])
@app.durable_client_input(client_name="client")
async def start_pipeline(req: func.HttpRequest, client) -> func.HttpResponse:
    blob_url = req.get_json().get("blobUrl")
    instance_id = await client.start_new("image_pipeline", client_input=blob_url)
    return client.create_check_status_response(req, instance_id)
```

Durable Functions store their state in Azure Storage (tables and queues), enabling them to run for days, weeks, or even months. An orchestration can be paused (waiting for a human approval, for example) and resumed without consuming any compute.
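The pause-for-approval behavior is exposed through the external-event pattern. The following is an illustrative sketch in the style of the Python Durable SDK, not a drop-in implementation: the activity name `send_approval_email` and the event name `ApprovalDecision` are hypothetical, and error/timeout handling is omitted.

```python
# Illustrative sketch: an orchestrator that waits for a human decision.
@app.orchestration_trigger(context_name="context")
def approval_workflow(context):
    request = context.get_input()
    # Hypothetical activity that emails an approver a decision link
    yield context.call_activity("send_approval_email", request)
    # The orchestrator checkpoints and sleeps here -- no compute is billed
    # until a client raises the event (e.g., when the link is clicked).
    approved = yield context.wait_for_external_event("ApprovalDecision")
    if approved:
        return {"status": "approved"}
    return {"status": "rejected"}
```

In production you would typically race the external event against a durable timer so the workflow cannot wait forever.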


```bash
# Set an application setting (becomes an environment variable)
az functionapp config appsettings set \
  --resource-group myRG \
  --name kubedojo-func-xxxx \
  --settings "COSMOS_DB_ENDPOINT=https://mydb.documents.azure.com:443/"

# Reference Key Vault secrets (recommended over plain text)
az functionapp config appsettings set \
  --resource-group myRG \
  --name kubedojo-func-xxxx \
  --settings "CosmosDBConnection=@Microsoft.KeyVault(SecretUri=https://myvault.vault.azure.net/secrets/cosmos-connection/)"

# Enable managed identity for Key Vault access
az functionapp identity assign --resource-group myRG --name kubedojo-func-xxxx
```
```bash
# Function-level auth (API key in header or query string)
# Configured per-function via authLevel in the trigger decorator

# App-level auth with Entra ID (Easy Auth)
az webapp auth microsoft update \
  --resource-group myRG \
  --name kubedojo-func-xxxx \
  --client-id "$APP_CLIENT_ID" \
  --issuer "https://login.microsoftonline.com/$TENANT_ID/v2.0"
```

  1. Azure Functions Consumption plan has processed trillions of executions since its launch. The free grant of 1 million executions and 400,000 GB-seconds per month means that many small-to-medium applications run entirely for free. A function that executes 100,000 times per month at 128 MB memory and 200ms average duration uses only 2,560 GB-seconds---well within the free tier.

  2. Durable Functions can run for up to 7 days on the Consumption plan (the orchestrator itself; individual activity functions still have the 5-10 minute limit). On Premium and Dedicated plans, they can run indefinitely. One retail company uses a Durable Function orchestration that runs for 30 days, managing a month-long A/B test lifecycle with periodic check-ins and automatic completion.

  3. Blob triggers use a polling mechanism, not events. When you use a Blob trigger, Azure Functions scans the blob container for changes every few seconds. This means there can be a delay of up to 60 seconds between a blob being uploaded and the function executing. For real-time processing, use an Event Grid trigger (which is event-driven and near-instant) that subscribes to the blob storage account’s BlobCreated events.

  4. Azure Functions supports custom handlers, meaning you can write functions in any language that can listen on an HTTP port---Rust, Go, Ruby, or even Bash scripts. The Functions runtime sends trigger data to your custom handler via HTTP, and your handler sends back the response. This opens Azure Functions to languages that are not natively supported.
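The arithmetic behind fact 1 is easy to reproduce. A back-of-envelope sketch, assuming decimal units (1 GB = 1,000 MB) and ignoring any per-execution rounding Azure applies to duration and memory:

```python
# Consumption-plan usage estimate for the workload in fact 1:
# 100,000 executions/month, 128 MB memory, 200 ms average duration.
executions = 100_000
memory_gb = 128 / 1000      # 128 MB in decimal GB
avg_duration_s = 0.2        # 200 ms

gb_seconds = executions * memory_gb * avg_duration_s
# Compare against the monthly free grant (1M executions, 400K GB-s)
within_free_grant = executions <= 1_000_000 and gb_seconds <= 400_000
print(gb_seconds, within_free_grant)   # 2560.0 True
```

The workload lands at 2,560 GB-seconds, well under the 400,000 GB-second grant, so it would run entirely for free.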


| Mistake | Why It Happens | How to Fix It |
|---|---|---|
| Choosing Consumption plan for latency-sensitive APIs | Consumption is the default and cheapest | Use Premium or Flex Consumption for APIs where cold start latency is unacceptable. The extra cost of keeping 1 instance warm is usually justified. |
| Writing functions that take longer than 5 minutes on Consumption | Developers do not check the timeout limit during development | Move long-running work to Durable Functions (activity functions can run up to 5 min each, but the orchestration chains them). Or switch to Premium plan. |
| Using Blob triggers when Event Grid triggers are more appropriate | Blob trigger is the first result in documentation | Blob triggers poll (up to 60s delay). Event Grid triggers are event-driven (sub-second). Use Event Grid for time-sensitive processing. |
| Storing connection strings in Application Settings as plain text | It is the quickest way to configure bindings | Use Key Vault references: @Microsoft.KeyVault(SecretUri=...). Enable managed identity on the Function App and grant it the Key Vault Secrets User role. |
| Not configuring retry policies for queue/event triggers | The default retry behavior is not always appropriate | Configure maxRetryCount and retryStrategy (fixed or exponential backoff) in host.json. Without retries, transient failures cause permanent message loss. |
| Writing large, monolithic functions instead of chaining small ones | It seems simpler to put all logic in one function | Break complex workflows into small, focused functions. Use Durable Functions for orchestration. Small functions are easier to test, debug, and scale independently. |
| Not setting FUNCTIONS_WORKER_PROCESS_COUNT for CPU-intensive work | The default is 1 worker process | For Python functions (GIL limitation), increase FUNCTIONS_WORKER_PROCESS_COUNT to utilize multiple cores. Each process handles requests independently. |
| Ignoring the cold start impact of large dependency packages | Adding dependencies is easy; understanding their boot cost is not | Profile your cold start time. Remove unused packages. Use slim base images. For Python, consider using layers or pre-compiled wheels. |

1. Your team is migrating a legacy e-commerce API to Azure Functions. The API must connect to a secure on-premises inventory database via a VNet, and it consistently receives traffic 24/7 without significant idle periods. Why would you choose the Premium plan over the Consumption plan for this workload?

You would choose the Premium plan because the Consumption plan does not support VNet integration, making it impossible to connect to your secure on-premises database. Additionally, since the API receives constant traffic, the per-execution pricing of the Consumption plan would likely become more expensive than the flat, instance-based pricing of the Premium plan. The Premium plan also keeps instances pre-warmed, eliminating cold starts and ensuring the latency-sensitive e-commerce API remains responsive to customers at all times.

2. You are writing an Azure Function that processes uploaded CSV files and writes the results to both a Cosmos DB database and an Azure Service Bus queue. How would you divide the responsibilities between triggers and bindings to implement this without writing any connection management code?

You would configure a single Blob Storage trigger to initiate the function execution whenever a new CSV file is uploaded, as every function requires exactly one trigger to define when it runs. To handle the outputs, you would configure two output bindings: one for Cosmos DB and one for Service Bus. The bindings declaratively handle the connection, authentication, and data formatting for the external services. By using output bindings, your Python code simply returns or sets the data objects, and the Azure Functions runtime automatically manages the complex SDK operations to write the data to the respective destinations.

3. Your media processing application uses a Blob trigger function to resize user profile pictures. Users are complaining that it takes up to a minute for their new profile picture to appear after uploading. What is causing this delay, and how can you re-architect the trigger to solve it?

The delay is caused by the Blob trigger’s internal polling mechanism, which periodically scans the storage container for new or modified blobs. Because it operates on a polling schedule (often every 10 to 60 seconds), it cannot provide immediate execution upon upload. To solve this and achieve real-time processing, you should switch from a Blob trigger to an Event Grid trigger. By configuring the storage account to emit an event to Event Grid the moment a blob is created, the function will be triggered instantly, eliminating the polling delay and ensuring profile pictures update immediately.

4. You are tasked with building an onboarding system. When a new user signs up, the system must create an Active Directory account, assign licenses, and then wait for an HR manager to manually approve the access via an email link. This approval could take up to 5 days. Why would you choose Durable Functions over regular Azure Functions for this workflow?

Regular Azure Functions are stateless and have a maximum execution time limit (up to 10 minutes on Consumption, or 30 on Flex), meaning they cannot natively wait 5 days for a manual approval without timing out or running continuously. Durable Functions solve this by using stateful orchestration. When the workflow reaches the manual approval step, the orchestrator function goes to sleep, saving its state to Azure Storage and stopping compute consumption entirely. Once the HR manager clicks the approval link, an external event wakes the orchestrator back up, restoring its state, and allowing the workflow to continue assigning licenses.

5. Your web application experiences massive, unpredictable traffic spikes on Black Friday, jumping from zero requests to thousands of requests per second instantly. You are debating between using the Consumption plan and the Premium plan to handle this API. How would the scaling behavior differ between the two plans during this spike?

On the Consumption plan, Azure manages scaling automatically by rapidly allocating new workers from a shared pool to handle the incoming requests, allowing it to scale from zero to hundreds of instances. However, this reactive scaling introduces cold starts, meaning the first wave of users will experience significant latency while new instances boot up. On the Premium plan, you configure a minimum number of always-warm instances and a maximum burst count. When the spike hits, the pre-warmed instances handle the initial load instantly without cold starts, while Azure rapidly scales out additional instances up to your defined maximum limit, providing a much smoother and faster scaling response.

6. You have a function that processes orders and needs to: validate the order, charge the credit card, update inventory, and send a confirmation email. How would you architect this with Azure Functions?

You should use a Durable Functions orchestrator with four separate activity functions for validation, charging, inventory, and emailing. The orchestrator function calls each activity sequentially, maintaining the state of the entire workflow. If a step fails, such as the inventory update, the orchestrator can natively implement compensation logic to call a refund activity and reverse the credit card charge. By using Durable Functions instead of chaining multiple independent functions via queue messages, you gain a single, readable source of truth for the workflow and ensure that partial failures do not leave your system in an inconsistent state.


Hands-On Exercise: Blob Trigger to Process and Store in Cosmos DB


In this exercise, you will create an Azure Function triggered by blob uploads that processes the file metadata and stores the result in Cosmos DB via an output binding.

Prerequisites: Azure CLI, Azure Functions Core Tools (func), Python 3.11+.

```bash
RG="kubedojo-functions-lab"
LOCATION="eastus2"
STORAGE_NAME="kubedojofunc$(openssl rand -hex 4)"
FUNC_NAME="kubedojofunc$(openssl rand -hex 4)"
COSMOS_NAME="kubedojocosmos$(openssl rand -hex 4)"

az group create --name "$RG" --location "$LOCATION"

# Create storage account for the Function App
az storage account create \
  --name "$STORAGE_NAME" \
  --resource-group "$RG" \
  --location "$LOCATION" \
  --sku Standard_LRS

# Create a blob container for uploads
az storage container create \
  --name "uploads" \
  --account-name "$STORAGE_NAME"

# Create Cosmos DB account
az cosmosdb create \
  --name "$COSMOS_NAME" \
  --resource-group "$RG" \
  --kind GlobalDocumentDB \
  --default-consistency-level Session \
  --locations regionName="$LOCATION" failoverPriority=0

# Create Cosmos DB database and container
az cosmosdb sql database create \
  --account-name "$COSMOS_NAME" \
  --resource-group "$RG" \
  --name "ProcessingDB"

az cosmosdb sql container create \
  --account-name "$COSMOS_NAME" \
  --resource-group "$RG" \
  --database-name "ProcessingDB" \
  --name "results" \
  --partition-key-path "/blobName"
```
Verify Task 1

```bash
az cosmosdb sql container show \
  --account-name "$COSMOS_NAME" -g "$RG" \
  --database-name "ProcessingDB" --name "results" \
  --query '{Name:name, PartitionKey:resource.partitionKey.paths[0]}' -o table
```
```bash
# Create a Consumption plan Function App (Python requires Linux)
az functionapp create \
  --resource-group "$RG" \
  --consumption-plan-location "$LOCATION" \
  --runtime python \
  --runtime-version 3.11 \
  --os-type Linux \
  --functions-version 4 \
  --name "$FUNC_NAME" \
  --storage-account "$STORAGE_NAME"

# Get connection strings
STORAGE_CONN=$(az storage account show-connection-string \
  -n "$STORAGE_NAME" -g "$RG" --query connectionString -o tsv)
COSMOS_CONN=$(az cosmosdb keys list \
  --name "$COSMOS_NAME" -g "$RG" --type connection-strings \
  --query 'connectionStrings[0].connectionString' -o tsv)

# Configure app settings
az functionapp config appsettings set \
  --resource-group "$RG" \
  --name "$FUNC_NAME" \
  --settings \
    "StorageConnection=$STORAGE_CONN" \
    "CosmosDBConnection=$COSMOS_CONN"
```
Verify Task 2

```bash
az functionapp show -g "$RG" -n "$FUNC_NAME" \
  --query '{Name:name, State:state, Runtime:siteConfig.linuxFxVersion}' -o table
```
```bash
# Create project directory
mkdir -p /tmp/functions-lab && cd /tmp/functions-lab

# Initialize the project
func init --python --model V2
```

Now create the function code:

```bash
cat > /tmp/functions-lab/function_app.py << 'PYEOF'
import azure.functions as func
import json
import uuid
import logging
from datetime import datetime

app = func.FunctionApp()

@app.blob_trigger(
    arg_name="inputBlob",
    path="uploads/{name}",
    connection="StorageConnection"
)
@app.cosmos_db_output(
    arg_name="outputDocument",
    database_name="ProcessingDB",
    container_name="results",
    connection="CosmosDBConnection"
)
def process_upload(inputBlob: func.InputStream, outputDocument: func.Out[func.Document]):
    """Process uploaded blobs and store metadata in Cosmos DB."""
    blob_name = inputBlob.name
    blob_size = inputBlob.length
    content = inputBlob.read()
    logging.info(f"Processing blob: {blob_name}, Size: {blob_size} bytes")

    # Determine content type based on extension
    extension = blob_name.rsplit(".", 1)[-1].lower() if "." in blob_name else "unknown"
    content_types = {
        "json": "application/json",
        "csv": "text/csv",
        "txt": "text/plain",
        "jpg": "image/jpeg",
        "png": "image/png",
        "pdf": "application/pdf"
    }

    # Build the metadata document
    result = {
        "id": str(uuid.uuid4()),
        "blobName": blob_name.split("/")[-1],
        "fullPath": blob_name,
        "sizeBytes": blob_size,
        "extension": extension,
        "contentType": content_types.get(extension, "application/octet-stream"),
        "processedAt": datetime.utcnow().isoformat() + "Z",
        "status": "processed"
    }

    # If it is a JSON file, count the records
    if extension == "json":
        try:
            data = json.loads(content)
            if isinstance(data, list):
                result["recordCount"] = len(data)
                result["status"] = "processed_with_analysis"
        except json.JSONDecodeError:
            result["status"] = "invalid_json"

    # Write to Cosmos DB via output binding
    outputDocument.set(func.Document.from_dict(result))
    logging.info(f"Stored metadata for {blob_name} in Cosmos DB with id {result['id']}")
PYEOF

# Update requirements.txt
cat > /tmp/functions-lab/requirements.txt << 'EOF'
azure-functions
azure-cosmos
EOF
```
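Before deploying, you can sanity-check the JSON-analysis branch locally. A minimal sketch that mirrors the counting logic inside `process_upload` (`analyze` is a hypothetical helper for local testing, not part of the deployed function):

```python
import json

def analyze(content: bytes, extension: str) -> dict:
    """Mirror of the JSON-analysis branch in process_upload."""
    result = {"status": "processed"}
    if extension == "json":
        try:
            data = json.loads(content)
            if isinstance(data, list):
                result["recordCount"] = len(data)
                result["status"] = "processed_with_analysis"
        except json.JSONDecodeError:
            result["status"] = "invalid_json"
    return result

# A two-element JSON array should be counted as two records
print(analyze(b'[{"a": 1}, {"b": 2}]', "json"))
```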
Verify Task 3

```bash
ls -la /tmp/functions-lab/function_app.py
cat /tmp/functions-lab/requirements.txt
```

You should see the function_app.py file and requirements.txt.

```bash
cd /tmp/functions-lab
func azure functionapp publish "$FUNC_NAME"
```
Verify Task 4

```bash
az functionapp function list -g "$RG" -n "$FUNC_NAME" \
  --query '[].{Name:name, Trigger:config.bindings[0].type}' -o table
```

You should see the process_upload function with a blobTrigger.

Task 5: Test the Function by Uploading Blobs

```bash
# Upload a JSON file
echo '[{"user": "alice", "action": "login"}, {"user": "bob", "action": "purchase"}]' > /tmp/test-data.json
az storage blob upload \
  --container-name "uploads" \
  --file /tmp/test-data.json \
  --name "test-data.json" \
  --account-name "$STORAGE_NAME" \
  --connection-string "$STORAGE_CONN"

# Upload a text file
echo "Hello from KubeDojo Functions lab" > /tmp/readme.txt
az storage blob upload \
  --container-name "uploads" \
  --file /tmp/readme.txt \
  --name "readme.txt" \
  --account-name "$STORAGE_NAME" \
  --connection-string "$STORAGE_CONN"

# Wait for processing (blob trigger polling delay)
echo "Waiting 60 seconds for blob trigger to process..."
sleep 60

# Check function execution logs
az functionapp log tail -g "$RG" -n "$FUNC_NAME" --timeout 10 2>/dev/null || \
  echo "Check logs in the Azure portal: Function App > Functions > process_upload > Monitor"
```
Verify Task 5

```bash
# Query Cosmos DB for the processed results
az cosmosdb sql query \
  --account-name "$COSMOS_NAME" \
  --resource-group "$RG" \
  --database-name "ProcessingDB" \
  --container-name "results" \
  --query-text "SELECT c.blobName, c.sizeBytes, c.extension, c.status, c.processedAt FROM c" \
  -o table 2>/dev/null || \
az cosmosdb sql container show \
  --account-name "$COSMOS_NAME" -g "$RG" \
  --database-name "ProcessingDB" --name "results" \
  --query resource.id -o tsv
```

You should see two documents in Cosmos DB: one for test-data.json (with recordCount=2 and status=processed_with_analysis) and one for readme.txt (with status=processed).

```bash
az group delete --name "$RG" --yes --no-wait
rm -rf /tmp/functions-lab /tmp/test-data.json /tmp/readme.txt
```
  • Storage account with uploads container created
  • Cosmos DB account with ProcessingDB database and results container created
  • Function App deployed with blob trigger and Cosmos DB output binding
  • JSON file uploaded and processed (metadata stored in Cosmos DB with record count)
  • Text file uploaded and processed (metadata stored in Cosmos DB)
  • Function execution visible in logs or monitor

Module 3.9: Azure Key Vault --- Learn how to securely manage secrets, encryption keys, and certificates with Azure Key Vault, and integrate it with your applications using Managed Identities.