Module 7.4: Ansible for Infrastructure

Complexity: [COMPLEX]

Time to Complete: 90 minutes

Prerequisites

Before starting this module, you should have completed:

Module 6.1: IaC Fundamentals
Basic SSH and Linux administration
Understanding of YAML syntax

What You’ll Be Able to Do

After completing this module, you will be able to:

Configure Ansible playbooks with roles and collections for Kubernetes node provisioning
Implement Ansible’s kubernetes.core collection for declarative cluster resource management
Deploy multi-tier applications using Ansible with inventory management and vault-encrypted secrets
Integrate Ansible with Terraform for infrastructure provisioning and configuration management workflows

Why This Module Matters

The configuration management console showed 2,847 servers in “unknown” state.

When the e-commerce company’s Black Friday traffic projections came in at 400% above normal, the infrastructure team faced a nightmare scenario: their servers had been configured manually over three years. No one knew the exact state of any machine. Some had been patched, others hadn’t. Some ran different application versions. Some had configuration drift that would cause failures under load.

They had six weeks to achieve configuration parity across nearly 3,000 servers. Doing it by hand was not realistic: 2,847 servers over 42 days is about 68 servers per day, with zero errors allowed.

Ansible became their salvation. In 23 days, they wrote playbooks that audited every server’s state, documented all drift, and enforced consistent configuration. On Black Friday, zero servers failed due to configuration issues. Revenue: $127 million in 24 hours.

This module teaches you Ansible’s agentless architecture, playbook development, inventory management, and integration with Terraform for complete infrastructure automation. You’ll learn when to use Ansible versus Terraform—and how to use them together.

War Story: The Patch That Broke Production

Characters:

Marcus: Senior SRE (8 years experience)
Team: 6 engineers managing 1,200 servers
Infrastructure: Mix of bare metal and cloud VMs

The Incident:

A critical OpenSSL vulnerability (CVE-2024-XXXX) was announced. Marcus had 72 hours to patch all 1,200 servers before the exploit went public.

Timeline:

Hour 0: CVE announced, CVSS 9.8 (Critical)
        Team: "How do we patch 1,200 servers in 72 hours?"

Hour 1: Marcus starts writing bash scripts
        Team realizes: different OS versions need different patches
        Ubuntu 20.04, 22.04, RHEL 7, 8, 9 all in production

Hour 4: Scripts getting complex
        "How do we track which servers are patched?"
        "How do we rollback if something breaks?"

Hour 6: Marcus: "We need Ansible. Now."
        Team starts learning Ansible

Hour 12: First playbook complete
         inventory file lists all 1,200 servers
         Grouped by OS version

Hour 18: Dry run on 10 servers
         Found: 3 servers had customized OpenSSL
         Would have broken in production

Hour 24: Playbook handles edge cases
         Added check mode (--check) for verification
         Added handlers for service restarts

Hour 36: Rolling deployment begins
         50 servers at a time
         Automatic rollback on failure

Hour 48: 847 servers patched, zero failures

Hour 60: All 1,200 servers patched
         12 hours ahead of deadline

Hour 72: Exploit released
         Security scan: 0 vulnerable servers

What Ansible Provided:

Idempotency: Running playbook twice = same result
Check mode: See what would change without changing
Inventory grouping: Different plays for different OS versions
Handlers: Restart services only when needed
Rolling updates: Control blast radius
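
The idempotency and check-mode ideas above can be illustrated outside Ansible. The following is a minimal Python sketch, not Ansible's implementation: `ensure_line` is a hypothetical helper that mimics how a module reports `changed`, and the file name and setting are purely illustrative.

```python
from pathlib import Path
import tempfile


def ensure_line(path: Path, line: str, check_mode: bool = False) -> bool:
    """Ensure `line` is present in the file; return True if a change was (or would be) made."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False          # desired state already holds: nothing to do
    if not check_mode:        # check mode only reports, it never writes
        path.write_text("\n".join(existing + [line]) + "\n")
    return True


def demo() -> list:
    """Run the same 'task' three times and collect the changed flags."""
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "sshd_config"
        return [
            ensure_line(conf, "PermitRootLogin no", check_mode=True),  # dry run
            ensure_line(conf, "PermitRootLogin no"),                   # first real run
            ensure_line(conf, "PermitRootLogin no"),                   # second real run
        ]


if __name__ == "__main__":
    print(demo())
```

The dry run reports a pending change without touching the file, the first real run applies it, and the second real run reports nothing to do. That last property is what makes re-running a playbook safe.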

Lessons Learned:

Manual operations don’t scale under pressure
Ansible’s agentless model meant no agent rollout was needed: any server reachable over SSH could be patched immediately
Check mode prevented three potential outages
Inventory groups handle heterogeneous environments

Ansible vs. Terraform: Complementary Tools

Understanding the Difference

┌─────────────────────────────────────────────────────────────────┐
│                    INFRASTRUCTURE LIFECYCLE                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   TERRAFORM                        ANSIBLE                       │
│   ══════════                       ═══════                       │
│                                                                  │
│   ┌──────────────┐                 ┌──────────────┐             │
│   │  Provision   │                 │  Configure   │             │
│   │              │                 │              │             │
│   │  • Create VM │───────────────▶ │  • Install   │             │
│   │  • Networks  │                 │    packages  │             │
│   │  • Storage   │                 │  • Configure │             │
│   │  • IAM       │                 │    services  │             │
│   └──────────────┘                 │  • Deploy    │             │
│                                    │    apps      │             │
│   Declarative                      └──────────────┘             │
│   State-managed                                                  │
│   API-driven                       Procedural                    │
│                                    Agentless                     │
│                                    SSH-driven                    │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

When to Use Each

Use Case	Terraform	Ansible	Both
Create cloud resources	✅	❌	-
Install packages	❌	✅	-
Configure services	❌	✅	-
Manage Kubernetes resources	✅	✅	-
Complete server provisioning	-	-	✅
Database schema migrations	❌	✅	-
Network infrastructure	✅	❌	-
Secret rotation	❌	✅	-
Application deployment	❌	✅	-

The Golden Pattern: Terraform + Ansible

┌─────────────────────────────────────────────────────────────────┐
│                     INFRASTRUCTURE PIPELINE                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   TERRAFORM                    ANSIBLE                          │
│   ═════════                    ═══════                          │
│                                                                  │
│   1. terraform apply           3. ansible-playbook              │
│      │                            │                             │
│      ├── Create VPC               ├── Install packages          │
│      ├── Create subnets           ├── Configure services        │
│      ├── Create EC2 instances     ├── Deploy application        │
│      ├── Create RDS               ├── Setup monitoring          │
│      └── Output: inventory.ini    └── Run health checks         │
│                    │                                            │
│   2. Dynamic inventory ◄──────────┘                             │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
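
One way to picture the hand-off in the diagram is a small script that turns Terraform outputs into a static inventory. This is a sketch under assumptions: the output names `web_private_ips` and `db_private_ips` are hypothetical, the input is expected in the shape printed by `terraform output -json` (each output wrapped in a `value` key), and the generated host aliases are illustrative.

```python
import json

# Hypothetical mapping from Terraform output names to Ansible group names.
OUTPUT_TO_GROUP = (
    ("web_private_ips", "webservers"),
    ("db_private_ips", "databases"),
)


def render_inventory(tf_output_json: str) -> str:
    """Build an INI inventory string from `terraform output -json` text."""
    outputs = json.loads(tf_output_json)
    lines = []
    for output_name, group in OUTPUT_TO_GROUP:
        ips = outputs.get(output_name, {}).get("value", [])
        lines.append(f"[{group}]")
        for index, ip in enumerate(ips, start=1):
            # Alias such as webservers-1 is made up here; ansible_host carries the real address.
            lines.append(f"{group}-{index} ansible_host={ip}")
        lines.append("")  # blank line between groups
    return "\n".join(lines)


SAMPLE = json.dumps({
    "web_private_ips": {"value": ["10.0.1.10", "10.0.1.11"]},
    "db_private_ips": {"value": ["10.0.2.10"]},
})

if __name__ == "__main__":
    print(render_inventory(SAMPLE))
```

In practice a dynamic inventory plugin often replaces this step entirely, since it queries the cloud API directly instead of relying on a generated file.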

Ansible Architecture

Agentless Design

┌─────────────────────────────────────────────────────────────────┐
│                      ANSIBLE ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────┐                                               │
│  │   Control    │                                               │
│  │    Node      │                                               │
│  │              │                                               │
│  │  • Ansible   │                                               │
│  │  • Playbooks │                                               │
│  │  • Inventory │                                               │
│  └──────┬───────┘                                               │
│         │                                                        │
│         │ SSH / WinRM                                           │
│         │ (No agents required)                                  │
│         │                                                        │
│    ┌────┴────┬────────────┬────────────┐                        │
│    ▼         ▼            ▼            ▼                        │
│ ┌──────┐ ┌──────┐    ┌──────┐    ┌──────┐                      │
│ │ Host │ │ Host │    │ Host │    │ Host │                      │
│ │  1   │ │  2   │    │  3   │    │  N   │                      │
│ │      │ │      │    │      │    │      │                      │
│ │Python│ │Python│    │Python│    │Python│                      │
│ │only  │ │only  │    │only  │    │only  │                      │
│ └──────┘ └──────┘    └──────┘    └──────┘                      │
│                                                                  │
│  Requirements:                                                   │
│  • SSH access (or WinRM for Windows)                            │
│  • Python on managed nodes                                       │
│  • No daemon, no agent installation                             │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Key Components

# ansible.cfg - Control node configuration
[defaults]
inventory = ./inventory
remote_user = ansible
private_key_file = ~/.ssh/ansible_key
# Lab convenience only: leave host key checking enabled (True) in production
host_key_checking = False
retry_files_enabled = False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts

[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s

Inventory Management

Static Inventory

[webservers]
web1.example.com ansible_host=10.0.1.10
web2.example.com ansible_host=10.0.1.11
web3.example.com ansible_host=10.0.1.12

[databases]
db1.example.com ansible_host=10.0.2.10
db2.example.com ansible_host=10.0.2.11

[loadbalancers]
lb1.example.com ansible_host=10.0.0.10

# Group variables
[webservers:vars]
http_port=8080
max_connections=1000

[databases:vars]
db_port=5432
max_connections=500

# Group of groups
[production:children]
webservers
databases
loadbalancers

[production:vars]
env=production
monitoring=enabled

YAML Inventory (Preferred)

all:
  children:
    webservers:
      hosts:
        web1.example.com:
          ansible_host: 10.0.1.10
          nginx_worker_processes: 4
        web2.example.com:
          ansible_host: 10.0.1.11
          nginx_worker_processes: 8
        web3.example.com:
          ansible_host: 10.0.1.12
          nginx_worker_processes: 8
      vars:
        http_port: 8080
        max_connections: 1000

    databases:
      hosts:
        db1.example.com:
          ansible_host: 10.0.2.10
          postgresql_version: "15"
          role: primary
        db2.example.com:
          ansible_host: 10.0.2.11
          postgresql_version: "15"
          role: replica
      vars:
        db_port: 5432
        backup_enabled: true

    loadbalancers:
      hosts:
        lb1.example.com:
          ansible_host: 10.0.0.10
      vars:
        haproxy_maxconn: 50000

  vars:
    ansible_user: ansible
    ansible_become: true
    env: production
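
The layering in this inventory (variables on `all`, then on the group, then on the host) can be sketched as a simple merge where the more specific level wins. This is a simplified model, assuming a single level of child groups and a trimmed copy of the inventory above; `resolve_host_vars` is a hypothetical helper and does not cover Ansible's full precedence rules.

```python
# Trimmed, dict-shaped copy of the YAML inventory for illustration.
INVENTORY = {
    "all": {
        "vars": {"ansible_user": "ansible", "env": "production"},
        "children": {
            "webservers": {
                "vars": {"http_port": 8080, "max_connections": 1000},
                "hosts": {
                    "web1.example.com": {"ansible_host": "10.0.1.10", "nginx_worker_processes": 4},
                },
            },
            "databases": {
                "vars": {"db_port": 5432, "backup_enabled": True},
                "hosts": {
                    "db1.example.com": {"ansible_host": "10.0.2.10", "role": "primary"},
                },
            },
        },
    }
}


def resolve_host_vars(inventory: dict, hostname: str) -> dict:
    """Merge all-level, group-level, and host-level variables (most specific wins)."""
    root = inventory["all"]
    for group in root.get("children", {}).values():
        hosts = group.get("hosts", {})
        if hostname in hosts:
            merged = dict(root.get("vars", {}))   # broadest scope first
            merged.update(group.get("vars", {}))  # group vars override all-level vars
            merged.update(hosts[hostname] or {})  # host vars override both
            return merged
    raise KeyError(f"unknown host: {hostname}")


if __name__ == "__main__":
    print(resolve_host_vars(INVENTORY, "web1.example.com"))
```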

Dynamic Inventory with AWS

#!/usr/bin/env python3
"""
AWS EC2 Dynamic Inventory Script
Generates inventory from EC2 instances with specific tags
"""

import boto3
import json
import argparse

def get_inventory():
    ec2 = boto3.client('ec2')

    inventory = {
        '_meta': {'hostvars': {}},
        'all': {'children': ['ungrouped']},
        'ungrouped': {'hosts': []}
    }

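    # Get all running instances; paginate so large fleets are not truncated
    paginator = ec2.get_paginator('describe_instances')
    pages = paginator.paginate(
        Filters=[
            {'Name': 'instance-state-name', 'Values': ['running']},
            {'Name': 'tag:ManagedBy', 'Values': ['ansible']}
        ]
    )
    response = {'Reservations': [r for page in pages for r in page['Reservations']]}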

    for reservation in response['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            private_ip = instance.get('PrivateIpAddress')

            if not private_ip:
                continue

            # Get tags
            tags = {t['Key']: t['Value'] for t in instance.get('Tags', [])}

            # Add to hostvars
            inventory['_meta']['hostvars'][instance_id] = {
                'ansible_host': private_ip,
                'instance_type': instance['InstanceType'],
                'availability_zone': instance['Placement']['AvailabilityZone'],
                **tags
            }

            # Group by Environment and Role tags; an instance missing both
            # tags is added to 'ungrouped' only once
            for group in (tags.get('Environment', 'ungrouped'),
                          tags.get('Role', 'ungrouped')):
                if group not in inventory:
                    inventory[group] = {'hosts': [], 'children': []}
                    inventory['all']['children'].append(group)
                members = inventory[group].setdefault('hosts', [])
                if instance_id not in members:
                    members.append(instance_id)

    return inventory

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--list', action='store_true')
    parser.add_argument('--host', type=str)
    args = parser.parse_args()

    inventory = get_inventory()

    if args.list:
        print(json.dumps(inventory, indent=2))
    elif args.host:
        hostvars = inventory['_meta']['hostvars'].get(args.host, {})
        print(json.dumps(hostvars, indent=2))

if __name__ == '__main__':
    main()

AWS EC2 Plugin (Recommended)

plugin: amazon.aws.aws_ec2

regions:
  - us-east-1
  - us-west-2

filters:
  instance-state-name: running
  "tag:ManagedBy": ansible

keyed_groups:
  # Group by environment tag
  - key: tags.Environment
    prefix: env
    separator: "_"
  # Group by role tag
  - key: tags.Role
    prefix: role
    separator: "_"
  # Group by instance type
  - key: instance_type
    prefix: type
    separator: "_"

hostnames:
  - tag:Name
  - private-ip-address

compose:
  ansible_host: private_ip_address
  ansible_user: "'ec2-user'"
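
To make the `keyed_groups` naming concrete, here is a rough Python approximation of how a group name is assembled from prefix, separator, and key value. The sanitization rule used here (replace anything other than letters, digits, and underscores) is an assumption that approximates the plugin's behavior, and `keyed_group_name` and `groups_for_instance` are hypothetical helpers.

```python
import re


def keyed_group_name(prefix: str, value: str, separator: str = "_") -> str:
    """Join prefix and value, then replace characters that are unsafe in group names."""
    raw = f"{prefix}{separator}{value}"
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def groups_for_instance(tags: dict, instance_type: str) -> list:
    """Mirror the three keyed_groups entries from the plugin configuration above."""
    names = []
    if "Environment" in tags:
        names.append(keyed_group_name("env", tags["Environment"]))
    if "Role" in tags:
        names.append(keyed_group_name("role", tags["Role"]))
    names.append(keyed_group_name("type", instance_type))
    return names


if __name__ == "__main__":
    print(groups_for_instance({"Environment": "production", "Role": "web"}, "t3.medium"))
```

Under these assumptions, a playbook would target such hosts with patterns like `hosts: env_production` or `hosts: role_web`.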

Playbook Development

Basic Playbook Structure

---
- name: Configure Web Servers
  hosts: webservers
  become: true
  gather_facts: true

  vars:
    http_port: 80
    nginx_worker_processes: auto
    nginx_worker_connections: 1024

  vars_files:
    - vars/common.yml
    - "vars/{{ env }}.yml"

  pre_tasks:
    - name: Update apt cache
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: 3600
      when: ansible_os_family == "Debian"

    - name: Verify connectivity
      ansible.builtin.ping:

  roles:
    - common
    - nginx
    - monitoring

  tasks:
    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

    - name: Deploy application configuration
      ansible.builtin.template:
        src: app.conf.j2
        dest: /etc/nginx/conf.d/app.conf
        owner: root
        group: root
        mode: '0644'
      notify: Reload nginx

  post_tasks:
    - name: Verify web server is responding
      ansible.builtin.uri:
        url: "http://localhost:{{ http_port }}/health"
        return_content: true
      register: health_check
      failed_when: "'healthy' not in health_check.content"

  handlers:
    - name: Reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded

Advanced Playbook Patterns

---
- name: Rolling Deployment
  hosts: webservers
  become: true
  serial: "{{ serial_count | default('25%') }}"
  max_fail_percentage: 10

  pre_tasks:
    - name: Disable in load balancer
      ansible.builtin.uri:
        url: "{{ lb_api }}/servers/{{ inventory_hostname }}/disable"
        method: POST
        headers:
          Authorization: "Bearer {{ lb_token }}"
      delegate_to: localhost

    - name: Wait for connections to drain
      ansible.builtin.pause:
        seconds: 30

  tasks:
    - name: Stop application
      ansible.builtin.systemd:
        name: myapp
        state: stopped

    - name: Deploy new version
      ansible.builtin.unarchive:
        src: "{{ artifact_url }}"
        dest: /opt/myapp
        remote_src: true
        owner: myapp
        group: myapp

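    # Note: combined with serial, run_once fires once per batch,
    # so the migration command must be safe to re-run
    - name: Apply database migrations
      ansible.builtin.command:
        cmd: /opt/myapp/bin/migrate
      run_once: true
      delegate_to: "{{ groups['databases'][0] }}"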

    - name: Start application
      ansible.builtin.systemd:
        name: myapp
        state: started

    - name: Wait for application to be ready
      ansible.builtin.uri:
        url: "http://localhost:8080/ready"
        status_code: 200
      register: result
      until: result.status == 200
      retries: 30
      delay: 5

  post_tasks:
    - name: Re-enable in load balancer
      ansible.builtin.uri:
        url: "{{ lb_api }}/servers/{{ inventory_hostname }}/enable"
        method: POST
        headers:
          Authorization: "Bearer {{ lb_token }}"
      delegate_to: localhost

    - name: Verify health
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}/health"
      delegate_to: localhost
      register: health
      failed_when: health.status != 200
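
The interaction of `serial` and `max_fail_percentage` is easier to reason about with numbers. The sketch below models it in Python under stated assumptions: percentages are rounded down with a minimum batch size of one host, and a batch aborts only when the failure percentage is exceeded, not merely equaled. The helper names are hypothetical.

```python
def batch_size(serial: str, host_count: int) -> int:
    """Translate a serial value such as '25%' or '2' into hosts per batch."""
    if serial.endswith("%"):
        size = host_count * int(serial[:-1]) // 100  # round down
    else:
        size = int(serial)
    return max(size, 1)  # never allow an empty batch


def batches(hosts: list, serial: str) -> list:
    """Split hosts into consecutive batches of the computed size."""
    size = batch_size(serial, len(hosts))
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def batch_aborts(failed: int, batch_len: int, max_fail_percentage: int) -> bool:
    """True when failures in a batch exceed the allowed percentage."""
    return failed * 100 > max_fail_percentage * batch_len


if __name__ == "__main__":
    hosts = [f"web{i}" for i in range(1, 9)]
    print(batches(hosts, "25%"))
    print(batch_aborts(failed=1, batch_len=10, max_fail_percentage=10))
```

With eight web servers and `25%`, the play proceeds two hosts at a time. With only three servers the same setting falls back to one host per batch.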

Error Handling and Recovery

---
- name: Resilient Deployment with Recovery
  hosts: webservers
  become: true

  vars:
    deployment_version: "{{ version | mandatory }}"
    rollback_version: "{{ previous_version | default('latest') }}"

  tasks:
    - name: Create deployment checkpoint
      block:
        - name: Backup current configuration
          ansible.builtin.archive:
            path:
              - /etc/myapp/
              - /opt/myapp/current
            dest: "/var/backups/myapp-{{ ansible_date_time.epoch }}.tar.gz"

        - name: Record current version
          ansible.builtin.shell: |
            cat /opt/myapp/current/VERSION
          register: current_version
          changed_when: false

        - name: Store rollback info
          ansible.builtin.set_fact:
            rollback_info:
              version: "{{ current_version.stdout }}"
              backup: "/var/backups/myapp-{{ ansible_date_time.epoch }}.tar.gz"

    - name: Deploy new version
      block:
        - name: Download artifact
          ansible.builtin.get_url:
            url: "{{ artifact_base_url }}/{{ deployment_version }}.tar.gz"
            dest: /tmp/deployment.tar.gz
            checksum: "sha256:{{ artifact_checksum }}"

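        # unarchive requires the destination directory to exist
        - name: Create release directory
          ansible.builtin.file:
            path: "/opt/myapp/releases/{{ deployment_version }}"
            state: directory
            mode: '0755'

        - name: Extract artifact
          ansible.builtin.unarchive:
            src: /tmp/deployment.tar.gz
            dest: "/opt/myapp/releases/{{ deployment_version }}"
            remote_src: true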

        - name: Update symlink
          ansible.builtin.file:
            src: /opt/myapp/releases/{{ deployment_version }}
            dest: /opt/myapp/current
            state: link
            force: true

        - name: Restart application
          ansible.builtin.systemd:
            name: myapp
            state: restarted

        - name: Verify deployment
          ansible.builtin.uri:
            url: http://localhost:8080/version
            return_content: true
          register: version_check
          until: deployment_version in version_check.content
          retries: 12
          delay: 5

      rescue:
        - name: Deployment failed - initiating rollback
          ansible.builtin.debug:
            msg: "Deployment of {{ deployment_version }} failed, rolling back to {{ rollback_info.version }}"

        - name: Restore previous symlink
          ansible.builtin.file:
            src: "/opt/myapp/releases/{{ rollback_info.version }}"
            dest: /opt/myapp/current
            state: link
            force: true

        - name: Restart application with previous version
          ansible.builtin.systemd:
            name: myapp
            state: restarted

        - name: Verify rollback
          ansible.builtin.uri:
            url: http://localhost:8080/health
            status_code: 200
          register: rollback_health

        - name: Notify on rollback
          community.general.slack:
            token: "{{ slack_token }}"
            channel: "#deployments"
            msg: "ROLLBACK: {{ deployment_version }} failed on {{ inventory_hostname }}, reverted to {{ rollback_info.version }}"
          delegate_to: localhost

        - name: Fail playbook after rollback
          ansible.builtin.fail:
            msg: "Deployment failed and was rolled back"

      always:
        - name: Clean up temporary files
          ansible.builtin.file:
            path: /tmp/deployment.tar.gz
            state: absent
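
The `block` / `rescue` / `always` structure in this playbook maps closely onto `try` / `except` / `finally`. The following Python sketch shows the same control flow, assuming a hypothetical `deploy` function whose `healthy_versions` argument stands in for the verification step. It is an analogy, not how Ansible executes blocks internally.

```python
def deploy(version: str, previous: str, healthy_versions: set) -> tuple:
    """Return (log, succeeded) for a deployment attempt with rollback."""
    log = []
    succeeded = False
    try:                                  # block: the happy path
        log.append(f"deploy {version}")
        if version not in healthy_versions:
            raise RuntimeError(f"{version} failed verification")
        log.append("verified")
        succeeded = True
    except RuntimeError:                  # rescue: runs only if the block failed
        log.append(f"rollback to {previous}")
    finally:                              # always: runs in both cases
        log.append("cleanup")
    return log, succeeded


if __name__ == "__main__":
    print(deploy("v2", "v1", healthy_versions={"v2"}))
    print(deploy("v3", "v1", healthy_versions={"v2"}))
```

One difference worth noting: a successful `rescue` normally clears the failure for that host, which is why the playbook ends its rescue section with an explicit fail task. The sketch models that by returning `succeeded` as false after a rollback.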

Ansible Roles

Role Structure

roles/
└── nginx/
    ├── defaults/
    │   └── main.yml          # Default variables (lowest precedence)
    ├── vars/
    │   └── main.yml          # Role variables (higher precedence)
    ├── tasks/
    │   ├── main.yml          # Main task entry point
    │   ├── install.yml       # Installation tasks
    │   ├── configure.yml     # Configuration tasks
    │   └── service.yml       # Service management
    ├── handlers/
    │   └── main.yml          # Handlers for notifications
    ├── templates/
    │   ├── nginx.conf.j2     # Jinja2 templates
    │   └── vhost.conf.j2
    ├── files/
    │   └── ssl/              # Static files
    ├── meta/
    │   └── main.yml          # Role metadata and dependencies
    └── molecule/
        └── default/
            └── molecule.yml  # Testing configuration

Role Implementation

---
# roles/nginx/defaults/main.yml
nginx_worker_processes: auto
nginx_worker_connections: 1024
nginx_keepalive_timeout: 65
nginx_client_max_body_size: 64m

nginx_user: "{{ 'www-data' if ansible_os_family == 'Debian' else 'nginx' }}"
nginx_group: "{{ nginx_user }}"

nginx_extra_configs: []
nginx_vhosts: []

nginx_remove_default_vhost: true
nginx_access_log: /var/log/nginx/access.log
nginx_error_log: /var/log/nginx/error.log

# SSL defaults
nginx_ssl_protocols: "TLSv1.2 TLSv1.3"
nginx_ssl_ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256"
nginx_ssl_prefer_server_ciphers: true
nginx_ssl_session_cache: "shared:SSL:10m"
nginx_ssl_session_timeout: "1d"

---
# roles/nginx/tasks/main.yml
- name: Include OS-specific variables
  ansible.builtin.include_vars: "{{ item }}"
  with_first_found:
    - "{{ ansible_distribution }}-{{ ansible_distribution_major_version }}.yml"
    - "{{ ansible_distribution }}.yml"
    - "{{ ansible_os_family }}.yml"
    - default.yml

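# import_tasks is static, so the tags below are inherited by every imported task
# (tags on include_tasks would apply only to the include statement itself)
- name: Install nginx
  ansible.builtin.import_tasks: install.yml
  tags:
    - nginx
    - install

- name: Configure nginx
  ansible.builtin.import_tasks: configure.yml
  tags:
    - nginx
    - configure

- name: Manage nginx service
  ansible.builtin.import_tasks: service.yml
  tags:
    - nginx
    - service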

---
# roles/nginx/tasks/install.yml
- name: Install nginx (Debian/Ubuntu)
  ansible.builtin.apt:
    name: nginx
    state: present
    update_cache: true
  when: ansible_os_family == "Debian"

- name: Install nginx (RedHat/CentOS)
  ansible.builtin.yum:
    name: nginx
    state: present
    enablerepo: epel
  when: ansible_os_family == "RedHat"

- name: Ensure nginx directories exist
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: root
    group: root
    mode: '0755'
  loop:
    - /etc/nginx/conf.d
    - /etc/nginx/sites-available
    - /etc/nginx/sites-enabled
    - /etc/nginx/ssl
    - /var/www/html

---
# roles/nginx/tasks/configure.yml
- name: Deploy main nginx configuration
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    validate: nginx -t -c %s
  notify: Reload nginx

- name: Remove default vhost
  ansible.builtin.file:
    path: "{{ item }}"
    state: absent
  loop:
    - /etc/nginx/sites-enabled/default
    - /etc/nginx/conf.d/default.conf
  when: nginx_remove_default_vhost
  notify: Reload nginx

- name: Deploy virtual hosts
  ansible.builtin.template:
    src: vhost.conf.j2
    dest: "/etc/nginx/sites-available/{{ item.name }}.conf"
    owner: root
    group: root
    mode: '0644'
  loop: "{{ nginx_vhosts }}"
  loop_control:
    label: "{{ item.name }}"
  notify: Reload nginx

- name: Enable virtual hosts
  ansible.builtin.file:
    src: "/etc/nginx/sites-available/{{ item.name }}.conf"
    dest: "/etc/nginx/sites-enabled/{{ item.name }}.conf"
    state: link
  loop: "{{ nginx_vhosts }}"
  loop_control:
    label: "{{ item.name }}"
  when: item.enabled | default(true)
  notify: Reload nginx

---
# roles/nginx/handlers/main.yml
- name: Reload nginx
  ansible.builtin.service:
    name: nginx
    state: reloaded
  when: nginx_service_state | default('started') != 'stopped'

- name: Restart nginx
  ansible.builtin.service:
    name: nginx
    state: restarted

- name: Test nginx configuration
  ansible.builtin.command: nginx -t
  changed_when: false
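
Handler behavior is a common source of confusion, so here is a small Python model of it. The assumptions are: a handler is queued only when a notifying task reports a change, each queued handler runs once after the tasks finish, and handlers run in the order they are defined. `run_play` is a hypothetical helper, not Ansible code.

```python
def run_play(tasks: list, handlers: list) -> list:
    """Return the execution order for tasks plus any notified handlers.

    tasks    -- list of (name, changed, notify) tuples; notify may be None
    handlers -- handler names in definition order
    """
    executed = []
    notified = set()
    for name, changed, notify in tasks:
        executed.append(name)
        if changed and notify:
            notified.add(notify)      # duplicates collapse: one run per handler
    for handler in handlers:          # definition order, not notification order
        if handler in notified:
            executed.append(handler)
    return executed


TASKS = [
    ("Deploy main nginx configuration", True, "Reload nginx"),
    ("Remove default vhost", False, "Reload nginx"),
    ("Deploy virtual hosts", True, "Reload nginx"),
]

if __name__ == "__main__":
    print(run_play(TASKS, ["Reload nginx", "Restart nginx"]))
```

Even though two tasks changed and both notified `Reload nginx`, the reload appears once at the end, and `Restart nginx` never runs because nothing notified it.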

{# roles/nginx/templates/nginx.conf.j2 #}
# Ansible managed - do not edit manually

user {{ nginx_user }};
worker_processes {{ nginx_worker_processes }};
pid /run/nginx.pid;

events {
    worker_connections {{ nginx_worker_connections }};
    multi_accept on;
    use epoll;
}

http {
    # Basic settings
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout {{ nginx_keepalive_timeout }};
    types_hash_max_size 2048;
    server_tokens off;

    # MIME types
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging
    access_log {{ nginx_access_log }};
    error_log {{ nginx_error_log }};

    # Gzip
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml application/json application/javascript application/xml;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # Client settings
    client_max_body_size {{ nginx_client_max_body_size }};

{% if nginx_ssl_enabled | default(false) %}
    # SSL configuration
    ssl_protocols {{ nginx_ssl_protocols }};
    ssl_ciphers {{ nginx_ssl_ciphers }};
    ssl_prefer_server_ciphers {{ 'on' if nginx_ssl_prefer_server_ciphers else 'off' }};
    ssl_session_cache {{ nginx_ssl_session_cache }};
    ssl_session_timeout {{ nginx_ssl_session_timeout }};
{% endif %}

    # Virtual host configs
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

Ansible for Kubernetes

Kubernetes Collection Setup

# Install Kubernetes collection
ansible-galaxy collection install kubernetes.core

# Required Python package (kubernetes.core 2.x and later no longer need openshift)
pip install kubernetes

Managing Kubernetes Resources

---
- name: Deploy Application to Kubernetes
  hosts: localhost
  gather_facts: false

  vars:
    kubeconfig: "{{ lookup('env', 'KUBECONFIG') }}"
    namespace: myapp
    app_name: myapp
    image: myregistry/myapp:v1.0.0
    replicas: 3

  tasks:
    - name: Create namespace
      kubernetes.core.k8s:
        kubeconfig: "{{ kubeconfig }}"
        state: present
        definition:
          apiVersion: v1
          kind: Namespace
          metadata:
            name: "{{ namespace }}"
            labels:
              app.kubernetes.io/managed-by: ansible

    - name: Deploy application
      kubernetes.core.k8s:
        kubeconfig: "{{ kubeconfig }}"
        state: present
        definition:
          apiVersion: apps/v1
          kind: Deployment
          metadata:
            name: "{{ app_name }}"
            namespace: "{{ namespace }}"
            labels:
              app: "{{ app_name }}"
          spec:
            replicas: "{{ replicas }}"
            selector:
              matchLabels:
                app: "{{ app_name }}"
            template:
              metadata:
                labels:
                  app: "{{ app_name }}"
              spec:
                containers:
                  - name: "{{ app_name }}"
                    image: "{{ image }}"
                    ports:
                      - containerPort: 8080
                    resources:
                      requests:
                        memory: "256Mi"
                        cpu: "100m"
                      limits:
                        memory: "512Mi"
                        cpu: "500m"
                    readinessProbe:
                      httpGet:
                        path: /ready
                        port: 8080
                      initialDelaySeconds: 5
                      periodSeconds: 10
                    livenessProbe:
                      httpGet:
                        path: /health
                        port: 8080
                      initialDelaySeconds: 15
                      periodSeconds: 20

    - name: Create service
      kubernetes.core.k8s:
        kubeconfig: "{{ kubeconfig }}"
        state: present
        definition:
          apiVersion: v1
          kind: Service
          metadata:
            name: "{{ app_name }}"
            namespace: "{{ namespace }}"
          spec:
            selector:
              app: "{{ app_name }}"
            ports:
              - port: 80
                targetPort: 8080
            type: ClusterIP

    - name: Wait for deployment to be ready
      kubernetes.core.k8s_info:
        kubeconfig: "{{ kubeconfig }}"
        kind: Deployment
        name: "{{ app_name }}"
        namespace: "{{ namespace }}"
      register: deployment_info
      until: >
        deployment_info.resources[0].status.readyReplicas is defined and
        deployment_info.resources[0].status.readyReplicas == replicas
      retries: 30
      delay: 10
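
The `until` / `retries` / `delay` combination in the last task is a polling loop. Below is a minimal Python sketch of that pattern. It treats the limit as a maximum number of attempts, which is a simplification (exact attempt counting can differ between Ansible versions), and `fake_deployment_status` is a hypothetical stand-in for the `k8s_info` call so the example runs without a cluster.

```python
import time
from typing import Callable


def retry_until(probe: Callable[[], dict], is_ready: Callable[[dict], bool],
                max_attempts: int, delay: float,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Call probe() until is_ready(result) holds; return the successful attempt number."""
    for attempt in range(1, max_attempts + 1):
        if is_ready(probe()):
            return attempt
        if attempt < max_attempts:
            sleep(delay)              # wait before the next poll
    raise TimeoutError(f"not ready after {max_attempts} attempts")


def fake_deployment_status(ready_counts: list) -> Callable[[], dict]:
    """Return a probe that yields one status dict per call (illustrative only)."""
    remaining = iter(ready_counts)
    return lambda: {"readyReplicas": next(remaining)}


if __name__ == "__main__":
    probe = fake_deployment_status([None, 1, 3])
    attempt = retry_until(probe, lambda s: s.get("readyReplicas") == 3,
                          max_attempts=30, delay=0.0)
    print(f"ready after attempt {attempt}")
```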

Helm with Ansible

---
- name: Deploy Application via Helm
  hosts: localhost
  gather_facts: false

  vars:
    kubeconfig: "{{ lookup('env', 'KUBECONFIG') }}"

  tasks:
    - name: Add Helm repositories
      kubernetes.core.helm_repository:
        name: "{{ item.name }}"
        repo_url: "{{ item.url }}"
      loop:
        - name: ingress-nginx
          url: https://kubernetes.github.io/ingress-nginx
        - name: cert-manager
          url: https://charts.jetstack.io
        - name: prometheus-community
          url: https://prometheus-community.github.io/helm-charts

    - name: Deploy ingress-nginx
      kubernetes.core.helm:
        kubeconfig: "{{ kubeconfig }}"
        name: ingress-nginx
        chart_ref: ingress-nginx/ingress-nginx
        chart_version: "4.8.3"
        release_namespace: ingress-nginx
        create_namespace: true
        values:
          controller:
            replicaCount: 2
            service:
              type: LoadBalancer
            metrics:
              enabled: true

    - name: Deploy cert-manager
      kubernetes.core.helm:
        kubeconfig: "{{ kubeconfig }}"
        name: cert-manager
        chart_ref: cert-manager/cert-manager
        chart_version: "v1.13.2"
        release_namespace: cert-manager
        create_namespace: true
        values:
          installCRDs: true

    - name: Deploy Prometheus stack
      kubernetes.core.helm:
        kubeconfig: "{{ kubeconfig }}"
        name: kube-prometheus-stack
        chart_ref: prometheus-community/kube-prometheus-stack
        chart_version: "54.0.0"
        release_namespace: monitoring
        create_namespace: true
        values:
          grafana:
            adminPassword: "{{ grafana_password }}"
            ingress:
              enabled: true
              hosts:
                - grafana.example.com

Testing Ansible with Molecule

Molecule Setup

---
dependency:
  name: galaxy

driver:
  name: docker

platforms:
  - name: ubuntu2204
    image: ubuntu:22.04
    pre_build_image: false
    dockerfile: ../resources/Dockerfile.ubuntu.j2
    privileged: true
    command: /sbin/init

  - name: rocky9
    image: rockylinux:9
    pre_build_image: false
    dockerfile: ../resources/Dockerfile.rocky.j2
    privileged: true
    command: /sbin/init

provisioner:
  name: ansible
  config_options:
    defaults:
      callbacks_enabled: profile_tasks
  inventory:
    host_vars:
      ubuntu2204:
        nginx_vhosts:
          - name: test
            server_name: test.local
            root: /var/www/test
      rocky9:
        nginx_vhosts:
          - name: test
            server_name: test.local
            root: /var/www/test

verifier:
  name: ansible

scenario:
  name: default
  test_sequence:
    - dependency
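    - lint  # Molecule 4 and earlier only: the lint stage was removed in Molecule 5, so drop this line there and run ansible-lint separately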
    - cleanup
    - destroy
    - syntax
    - create
    - prepare
    - converge
    - idempotence
    - side_effect
    - verify
    - cleanup
    - destroy
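
The `idempotence` step in this sequence runs the converge playbook a second time and fails the scenario if that second run reports any change. The sketch below imitates that check in Python. It is not Molecule's own code: `hosts_with_changes` is a hypothetical helper, and it assumes PLAY RECAP lines in the `host : ok=N changed=N ...` layout printed by Ansible's default output.

```python
import re

# Matches recap lines such as "rocky9 : ok=7 changed=1 unreachable=0 failed=0"
RECAP_LINE = re.compile(
    r"^(?P<host>\S+)\s*:\s.*\bchanged=(?P<changed>\d+)", re.MULTILINE
)


def hosts_with_changes(recap_text):
    """Return {host: changed_count} for hosts whose run changed something."""
    return {
        match["host"]: int(match["changed"])
        for match in RECAP_LINE.finditer(recap_text)
        if int(match["changed"]) > 0
    }


if __name__ == "__main__":
    # Made-up output of a second converge run, for illustration only
    second_run = (
        "PLAY RECAP ****************************************\n"
        "ubuntu2204 : ok=7 changed=0 unreachable=0 failed=0\n"
        "rocky9     : ok=7 changed=1 unreachable=0 failed=0\n"
    )
    offenders = hosts_with_changes(second_run)
    print("idempotent" if not offenders else f"not idempotent: {offenders}")
```

An empty result means the role left every host untouched on the second pass, which is the condition the idempotence step enforces.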

Molecule Verification

---
- name: Verify nginx installation
  hosts: all
  gather_facts: true

  tasks:
    - name: Verify nginx package is installed
      ansible.builtin.package:
        name: nginx
        state: present
      check_mode: true
      register: pkg_check
      failed_when: pkg_check.changed

    - name: Verify nginx service is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
      check_mode: true
      register: svc_check
      failed_when: svc_check.changed

    - name: Verify nginx is listening on port 80
      ansible.builtin.wait_for:
        port: 80
        timeout: 5

    - name: Test HTTP response
      ansible.builtin.uri:
        url: http://localhost/
        return_content: true
      register: http_response
      failed_when: http_response.status != 200

    - name: Verify configuration syntax
      ansible.builtin.command: nginx -t
      changed_when: false

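    - name: Check log files exist
      ansible.builtin.stat:
        path: "{{ item }}"
      loop:
        - /var/log/nginx/access.log
        - /var/log/nginx/error.log
      register: log_files
      # Inside a loop, the registered variable holds the current item's result,
      # so the check reads log_files.stat, not item.stat (item is just the path)
      failed_when: not log_files.stat.exists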

Terraform + Ansible Integration

Generating Ansible Inventory from Terraform

output "ansible_inventory" {
  value = templatefile("${path.module}/templates/inventory.tpl", {
    webservers = aws_instance.web[*]
    databases  = aws_instance.db[*]
    bastion    = aws_instance.bastion
  })
  sensitive = true
}

# Write inventory file
resource "local_file" "ansible_inventory" {
  content  = templatefile("${path.module}/templates/inventory.tpl", {
    webservers = aws_instance.web[*]
    databases  = aws_instance.db[*]
    bastion    = aws_instance.bastion
  })
  filename = "${path.module}/../ansible/inventory/aws_hosts.yml"
}

# Rendered by Terraform. Strip markers sit on the left of each directive so the
# indentation of the line that follows is preserved in the generated YAML.
all:
  children:
    webservers:
      hosts:
%{~ for instance in webservers }
        ${instance.tags["Name"]}:
          ansible_host: ${instance.private_ip}
          instance_id: ${instance.id}
          instance_type: ${instance.instance_type}
          availability_zone: ${instance.availability_zone}
%{~ endfor }

    databases:
      hosts:
%{~ for instance in databases }
        ${instance.tags["Name"]}:
          ansible_host: ${instance.private_ip}
          instance_id: ${instance.id}
          instance_type: ${instance.instance_type}
%{~ endfor }

    bastion:
      hosts:
        ${bastion.tags["Name"]}:
          ansible_host: ${bastion.public_ip}

  vars:
    ansible_user: ec2-user
    ansible_ssh_private_key_file: ~/.ssh/aws-key.pem
    ansible_ssh_common_args: '-o ProxyJump=ec2-user@${bastion.public_ip}'
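
To make the shape of the rendered inventory concrete, here is a Python sketch that builds the same nested structure (groups, then hosts, then host variables) from plain dictionaries. It is only an illustration of the data layout: `build_inventory`, the dictionary keys, and the sample addresses are made up for this example, and only a subset of the template's host variables is included.

```python
import json


def build_inventory(webservers, databases, bastion, user="ec2-user"):
    """Build the group -> hosts -> vars structure that the template renders."""

    def host_vars(instance):
        # Internal hosts are reached on their private address
        return {"ansible_host": instance["private_ip"], "instance_id": instance["id"]}

    return {
        "all": {
            "children": {
                "webservers": {"hosts": {i["name"]: host_vars(i) for i in webservers}},
                "databases": {"hosts": {i["name"]: host_vars(i) for i in databases}},
                "bastion": {
                    "hosts": {bastion["name"]: {"ansible_host": bastion["public_ip"]}}
                },
            },
            "vars": {
                "ansible_user": user,
                "ansible_ssh_common_args": f"-o ProxyJump={user}@{bastion['public_ip']}",
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical instances using documentation-range addresses
    inventory = build_inventory(
        webservers=[{"name": "web-1", "private_ip": "10.0.1.10", "id": "i-0aaa"}],
        databases=[{"name": "db-1", "private_ip": "10.0.2.10", "id": "i-0bbb"}],
        bastion={"name": "bastion", "public_ip": "203.0.113.10"},
    )
    print(json.dumps(inventory, indent=2))
```

Comparing this structure with the file Terraform writes is a quick way to spot indentation problems in the template.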

Complete Infrastructure Pipeline

name: Infrastructure Deployment

on:
  push:
    branches: [main]
    paths:
      - 'terraform/**'
      - 'ansible/**'

jobs:
  terraform:
    runs-on: ubuntu-latest
    outputs:
      inventory_updated: ${{ steps.apply.outputs.inventory_changed }}

    steps:
      - uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        working-directory: terraform
        run: terraform init

      - name: Terraform Plan
        working-directory: terraform
        run: terraform plan -out=tfplan

      - name: Terraform Apply
        id: apply
        working-directory: terraform
        run: |
          terraform apply -auto-approve tfplan
          echo "inventory_changed=true" >> $GITHUB_OUTPUT

      - name: Upload inventory
        uses: actions/upload-artifact@v4
        with:
          name: ansible-inventory
          path: ansible/inventory/aws_hosts.yml

  ansible:
    runs-on: ubuntu-latest
    needs: terraform
    if: needs.terraform.outputs.inventory_updated == 'true'

    steps:
      - uses: actions/checkout@v4

      - name: Download inventory
        uses: actions/download-artifact@v4
        with:
          name: ansible-inventory
          path: ansible/inventory/

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Ansible
        run: pip install ansible boto3

      - name: Configure SSH
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/aws-key.pem
          chmod 600 ~/.ssh/aws-key.pem

      - name: Run Ansible Playbook
        working-directory: ansible
        run: |
          ansible-playbook \
            -i inventory/aws_hosts.yml \
            playbooks/site.yml \
            --extra-vars "env=production"

Common Mistakes

Mistake	Problem	Solution
Not using `become`	Tasks requiring root fail	Set `become: true` for privileged operations
Hardcoded hosts	Inventory becomes stale	Use dynamic inventory or Terraform integration
No idempotency	Re-runs cause errors	Use modules that check state before acting
Missing handlers	Services don’t restart	Always notify handlers on configuration changes
No check mode testing	Unexpected production changes	Run `--check --diff` before applying
Ignoring return codes	Failures go unnoticed	Register results and use `failed_when`
Passwords in playbooks	Credentials exposed	Use Ansible Vault or an external secrets manager
No tags	Can’t run partial playbooks	Tag tasks for selective execution
`serial: 1` on everything	Deployments take forever	Use an appropriate `serial` value (e.g., `25%`)
Not validating templates	Broken configs deployed	Use `validate` parameter in template task

Quiz

Test your Ansible knowledge:

1. What is the key difference between Terraform and Ansible?

Answer:

Terraform: Declarative, state-managed infrastructure provisioning via APIs. Best for creating cloud resources (VMs, networks, storage).
Ansible: Procedural configuration management via SSH. Best for configuring servers, installing packages, and deploying applications.

They’re complementary: Terraform creates infrastructure, Ansible configures it.

2. Why is Ansible called "agentless"?

Answer: Ansible doesn’t require any software to be installed on managed nodes (except Python). It connects via SSH (or WinRM for Windows) and executes modules remotely. This contrasts with tools like Puppet or Chef that require agents running on each node.

Benefits:

Instant setup—no agent deployment
No agent maintenance or upgrades
No additional attack surface
Works anywhere SSH works

3. What does "idempotent" mean in Ansible, and why is it important?

Answer: Idempotent means running a playbook multiple times produces the same end result. If nginx is already installed, the apt module won’t reinstall it.

Why it matters:

Safe to re-run playbooks
Recovers from partial failures
Validates current state matches desired state
Essential for configuration drift correction
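
The check-then-act pattern behind idempotent modules can be sketched in a few lines of Python. This is an illustration only, not how the apt module or any other Ansible module is implemented; `ensure_line` and the file name are hypothetical.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is in the file; return True only if the file changed."""
    content = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # desired state already met: do nothing, report "ok"
    # Start on a fresh line if the file does not end with a newline
    prefix = "" if not content or content.endswith("\n") else "\n"
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(prefix + line + "\n")
    return True  # state was corrected: report "changed"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "example.conf")  # hypothetical file
        print(ensure_line(target, "vm.swappiness = 10"))  # True: line was added
        print(ensure_line(target, "vm.swappiness = 10"))  # False: nothing to do
```

The second call changes nothing and says so, which is the property that makes re-running a playbook safe.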

4. What is the purpose of handlers in Ansible?

Answer: Handlers are tasks that run only when another task notifies them, and a notification is sent only if the notifying task reports a change. They’re typically used for operations that should happen once, even if notified multiple times.

Example: If 5 tasks modify nginx configuration, you only want to reload nginx once at the end, not 5 times. Handlers accumulate notifications and, by default, run once after the play’s tasks finish (or earlier if you call `meta: flush_handlers`).

tasks:
  - name: Update config A
    ansible.builtin.template: ...
    notify: Reload nginx

  - name: Update config B
    ansible.builtin.template: ...
    notify: Reload nginx

handlers:
  - name: Reload nginx
    ansible.builtin.service:
      name: nginx
      state: reloaded
# nginx reloads only ONCE
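
The bookkeeping behind this behavior can be modeled in Python. The sketch below is a simplified model, not Ansible's implementation: `run_play` is a hypothetical function, and each task's `changed` flag stands in for the result a real module would report.

```python
def run_play(tasks, handlers):
    """Run tasks in order, then run each notified handler exactly once.

    tasks: list of (name, changed, notify) tuples.
    handlers: dict mapping handler name to a zero-argument callable.
    Returns the names of the handlers that ran.
    """
    notified = set()
    for _name, changed, notify in tasks:
        # Only a task that reports a change sends its notification
        if changed and notify:
            notified.add(notify)

    executed = []
    for handler_name, action in handlers.items():  # definition order
        if handler_name in notified:
            action()
            executed.append(handler_name)
    return executed


if __name__ == "__main__":
    reloads = []
    tasks = [
        ("Update config A", True, "Reload nginx"),
        ("Update config B", True, "Reload nginx"),
        ("Update config C", False, "Reload nginx"),  # unchanged: no notification
    ]
    ran = run_play(tasks, {"Reload nginx": lambda: reloads.append("reload")})
    print(ran, len(reloads))  # ['Reload nginx'] 1
```

Two changed tasks notify the same handler, yet the reload happens once. If no task had changed, it would not happen at all.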

5. What is the difference between `serial` and `forks` in Ansible?

Answer:

forks (default: 5): How many hosts Ansible connects to simultaneously within a batch. Affects parallelism.
serial: How many hosts to process in each batch before moving to the next batch. Affects rolling deployments.

Example: 100 servers, serial: 25, forks: 10

First batch: 25 servers (10 at a time)
If batch succeeds: next 25 servers
Allows rolling updates with controlled blast radius
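
The arithmetic of that example can be checked with a short Python sketch. It is a simplification with a hypothetical `plan_rollout` function: real Ansible hands the next host to a worker as soon as one is free rather than moving in fixed waves, but the limits are the same (at most `serial` hosts per batch, at most `forks` hosts in flight).

```python
def plan_rollout(hosts, serial, forks):
    """Split hosts into serial-sized batches, then each batch into fork-sized waves."""
    if serial < 1 or forks < 1:
        raise ValueError("serial and forks must be positive integers")
    batches = [hosts[i:i + serial] for i in range(0, len(hosts), serial)]
    return [
        [batch[j:j + forks] for j in range(0, len(batch), forks)]
        for batch in batches
    ]


if __name__ == "__main__":
    hosts = [f"web{n:03d}" for n in range(1, 101)]  # 100 made-up host names
    plan = plan_rollout(hosts, serial=25, forks=10)
    print(len(plan))                        # 4 batches
    print([len(wave) for wave in plan[0]])  # [10, 10, 5] within the first batch
```

Lowering `serial` shrinks the blast radius of a bad change, while raising `forks` only speeds up the work inside each batch.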

6. How do you securely store passwords and secrets in Ansible?

Answer: Use Ansible Vault to encrypt sensitive data:

# Create encrypted file
ansible-vault create secrets.yml

# Encrypt existing file
ansible-vault encrypt secrets.yml

# Edit encrypted file
ansible-vault edit secrets.yml

# Run playbook with vault
ansible-playbook site.yml --ask-vault-pass
ansible-playbook site.yml --vault-password-file=~/.vault_pass

For external secrets:

HashiCorp Vault lookup plugin
AWS Secrets Manager lookup
Environment variables for CI/CD

7. What is the purpose of `block`, `rescue`, and `always` in Ansible?

Answer: Error handling similar to try/catch/finally:

block: Group of tasks to execute
rescue: Tasks to run if block fails (like catch)
always: Tasks that always run (like finally)

- block:
    - name: Attempt deployment
      # ... deployment tasks
  rescue:
    - name: Rollback on failure
      # ... rollback tasks
  always:
    - name: Cleanup temp files
      # ... always runs
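
For readers who know try/except/finally from Python, the same control flow looks like this. It is a loose analogy with hypothetical task functions, not a description of Ansible internals.

```python
def run_block(block, rescue, always):
    """Mimic block/rescue/always: rescue runs only on failure, always runs every time."""
    log = []
    try:
        for task in block:
            task(log)
    except Exception as error:  # any failing task in the block lands here
        log.append(f"block failed: {error}")
        for task in rescue:
            task(log)
    finally:
        for task in always:
            task(log)
    return log


def deploy(log):
    log.append("deploy")
    raise RuntimeError("health check failed")  # simulated failure


def rollback(log):
    log.append("rollback")


def cleanup(log):
    log.append("cleanup")


if __name__ == "__main__":
    print(run_block([deploy], [rollback], [cleanup]))
    # ['deploy', 'block failed: health check failed', 'rollback', 'cleanup']
```

As in Ansible, a rescue section that succeeds means the failure counts as handled, and the cleanup runs whether or not anything failed.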

8. How does Ansible integrate with Kubernetes?

Answer: Via the kubernetes.core collection:

- kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: apps/v1
      kind: Deployment
      # ...

- kubernetes.core.helm:
    name: nginx
    chart_ref: ingress-nginx/ingress-nginx
    release_namespace: ingress

Use cases:

Deploy Kubernetes resources (alternative to kubectl apply)
Manage Helm releases
Run playbooks as Kubernetes Jobs
Bootstrap cluster applications after Terraform creates the cluster

Key Takeaways

Ansible complements Terraform: Terraform provisions infrastructure; Ansible configures it
Agentless architecture: SSH-based, no daemon required on managed nodes
Idempotency is key: Playbooks should be safe to run multiple times
Use roles for reusability: Package related tasks, handlers, templates, and variables
Dynamic inventory: Generate inventory from Terraform or cloud APIs
Test with Molecule: Verify roles work across different OS versions
Handlers for efficiency: Restart services only when configuration actually changes
Vault for secrets: Never commit unencrypted passwords
Check mode first: Always dry-run before production changes
Rolling deployments: Use serial to control blast radius

Did You Know?

NASA uses Ansible to manage their High-End Computing infrastructure. The agentless architecture was crucial for their security requirements—no additional attack surface on compute nodes.
The name “Ansible” comes from Ursula K. Le Guin’s science fiction novels, where it’s a device for instantaneous communication across any distance. In the tool, it represents instant configuration without waiting for agent check-ins.
Ansible Galaxy has over 30,000 roles shared by the community. Before writing a role from scratch, check Galaxy—there’s probably a well-tested role for your use case.
Red Hat acquired Ansible in 2015 for a reported $150 million. Since then, it’s become the foundation of their automation strategy, including Ansible Tower (now part of Ansible Automation Platform, AAP).

Hands-On Exercise

Exercise: Complete Server Configuration Pipeline

Objective: Create an Ansible playbook that configures a web server with nginx, SSL, and basic security hardening.

Setup:

# Create project structure
mkdir -p ansible-lab/{inventory,playbooks,roles,group_vars}
cd ansible-lab

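# Create inventory with local container
# (assumes a running container named "webserver" with Python 3 installed)
cat > inventory/local.yml << 'EOF'
all:
  hosts:
    webserver:
      ansible_connection: docker
      ansible_python_interpreter: /usr/bin/python3
EOF

# The playbook lives in playbooks/, so point Ansible at the project-level roles directory
cat > ansible.cfg << 'EOF'
[defaults]
roles_path = roles
EOF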

Tasks:

Create a hardening role, with its tasks in roles/hardening/tasks/main.yml:

mkdir -p roles/hardening/{tasks,handlers,defaults}

---
- name: Update all packages
  ansible.builtin.apt:
    upgrade: safe
    update_cache: true

- name: Install security packages
  ansible.builtin.apt:
    name:
      - fail2ban
      - ufw
      - unattended-upgrades
    state: present

# Open the required ports before enabling the firewall so SSH is never locked out
- name: Allow SSH
  community.general.ufw:
    rule: allow
    port: "22"
    proto: tcp

- name: Allow HTTP/HTTPS
  community.general.ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop:
    - "80"
    - "443"

- name: Enable UFW with default-deny for incoming traffic
  community.general.ufw:
    state: enabled
    policy: deny
    direction: incoming

Create the main playbook as playbooks/site.yml:

---
- name: Configure Web Server
  hosts: webserver
  become: true

  roles:
    - hardening
    - nginx

  tasks:
    - name: Verify configuration
      ansible.builtin.uri:
        url: http://localhost/
        return_content: true
      register: response

    - name: Display result
      ansible.builtin.debug:
        msg: "Web server is responding: {{ response.status }}"

Run with check mode first:

ansible-playbook -i inventory/local.yml playbooks/site.yml --check --diff

Apply configuration:

ansible-playbook -i inventory/local.yml playbooks/site.yml

Success Criteria:

All tasks complete without errors
Idempotent (second run shows no changes)
UFW enabled with correct rules
Nginx serving content
Check mode accurately predicts changes

Next Module

Continue to Module 7.5: AWS CloudFormation to learn AWS-native infrastructure as code with CloudFormation templates and stacks.