Mastering Terraform State Management and Workspaces Through Real Projects

Building dev/prod environments and a 2-tier AWS infrastructure the right way

Driven by curiosity and a continuous learning mindset, always exploring and building new ideas.

Introduction

Hello readers 👋! The past three weeks have been hectic, to be honest, and I wasn't able to get a lot done, but I still managed to take some time to learn and build something. I continued where I left off in my previous Terraform article and dove deep into two crucial concepts that every Terraform practitioner needs to master: State Management and Workspaces. To get some hands-on practice, I built a mini project and a larger project modeled on real-world industry practice to improve my understanding of Terraform and its workflow.

After understanding the basics of Terraform in my previous article, I realized that knowing how to write .tf files is just the beginning. The real challenge comes when you need to manage infrastructure across different environments, collaborate with teams, and ensure that your infrastructure state is consistent and secure.

State Management and Workspaces

Understanding Terraform State

Before diving into advanced state management, I had to understand what exactly the state is in Terraform. The state is basically Terraform's way of keeping track of the real-world resources it manages. When you run terraform apply, Terraform doesn't just create resources - it also records information about those resources in a state file (terraform.tfstate).

This state file serves three purposes:

  • Resource metadata - It records IDs, current configuration, and dependencies

  • Resource mappings - It records how your .tf configuration maps to real-world resources

  • Performance optimization - Terraform can read cached attributes from state instead of querying every resource from scratch on each operation

The problem I learned about is that by default, Terraform stores state locally, which creates several challenges:

  • Collaboration issues - Multiple team members can't safely work on the same infrastructure, because each person has their own copy of the state

  • State loss - If your local machine crashes, you lose track of your infrastructure

  • Security risks - State files contain sensitive information and shouldn't be stored in version control

Remote State Management

To solve these problems, I learned about remote state backends. A backend in Terraform determines where and how state is stored and accessed. The most commonly used backend for AWS environments is S3 with DynamoDB.

Here's how I configured remote state in my projects:

terraform {
  backend "s3" {
    bucket         = "terraform-state-bucket-akshansh029"
    key            = "dev/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}

The S3 bucket stores the actual state file, while DynamoDB provides state locking to prevent multiple people from running Terraform simultaneously on the same infrastructure. The encrypt = true ensures that the state file is encrypted at rest.
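
The bucket and lock table have to exist before terraform init can use them. Below is a minimal sketch of how they could be created, assuming a separate bootstrap configuration that is applied once with local state and an AWS provider that is already configured. The resource labels tf_state and tf_lock are hypothetical; the bucket and table names are taken from the backend block above.

```hcl
# Hypothetical bootstrap configuration, applied once with local state
# before the S3 backend is enabled.
resource "aws_s3_bucket" "tf_state" {
  bucket = "terraform-state-bucket-akshansh029" # bucket name from the backend block
}

# Versioning keeps earlier copies of the state file for recovery.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string partition key named exactly "LockID".
resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock" # table name from the backend block
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Enabling versioning is an optional extra here, not something the backend requires; it simply makes it possible to roll back to an earlier state file.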

Terraform Workspaces

After understanding state management, I learned about Workspaces - a feature that allows you to manage multiple environments (dev, staging, prod) using the same Terraform configuration.

Think of workspaces as separate "instances" of your infrastructure. Each workspace has its own state file, so you can have identical infrastructure setups for different environments without them interfering with each other.

Key workspace commands I learned:

  • terraform workspace list - Shows all available workspaces

  • terraform workspace new <name> - Creates a new workspace

  • terraform workspace select <name> - Switches to a specific workspace

  • terraform workspace show - Shows current workspace

What's really cool about workspaces is that you can use the terraform.workspace variable in your configurations to make environment-specific decisions:

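resource "aws_instance" "example" {
  # ami is a required argument; var.ami_id is assumed to be declared elsewhere
  ami           = var.ami_id
  # Larger instance type in prod, smaller in every other workspace
  instance_type = terraform.workspace == "prod" ? "t3.medium" : "t2.micro"

  tags = {
    Name        = "${terraform.workspace}-server"
    Environment = terraform.workspace
  }
}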
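
When more than one setting varies per environment, ternaries start to pile up. One possible alternative, sketched here under the assumption that only dev and prod exist, is a map keyed by workspace name. The names env_settings, settings, and selected_instance_type are hypothetical; the instance types mirror the ternary above.

```hcl
locals {
  # Hypothetical per-environment settings, keyed by workspace name.
  env_settings = {
    dev  = { instance_type = "t2.micro" }
    prod = { instance_type = "t3.medium" }
  }

  # Fall back to the dev settings for any other workspace, including "default".
  settings = try(local.env_settings[terraform.workspace], local.env_settings["dev"])
}

output "selected_instance_type" {
  value = local.settings.instance_type
}
```

A resource would then read local.settings.instance_type instead of repeating the comparison against "prod".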

Mini Project: Dev and Prod Infrastructure

To practice these concepts, I created a mini project that demonstrates how to use workspaces to manage identical infrastructure across different environments.

Project Structure

I organized my project with the following structure:

DevOps-Learning/terraform-modules-app/
├── main.tf
├── provider.tf
├── terraform.tf
└── app
    ├── dynamo.tf
    ├── ec2.tf
    ├── s3.tf
    └── variables.tf

Environment-Specific Configuration

The key takeaway from this project was seeing how the same Terraform configuration could create different infrastructure based on the workspace. Here's how I implemented environment-specific logic:

# Different instance types for different environments
resource "aws_instance" "web_server" {
  ami           = var.ami_id
  instance_type = terraform.workspace == "prod" ? var.prod_instance_type : var.dev_instance_type
  key_name      = aws_key_pair.deployer.key_name

  # Different storage sizes
  root_block_device {
    volume_size = terraform.workspace == "prod" ? 20 : 8
    volume_type = "gp3"
  }

  tags = {
    Name = "${terraform.workspace}-web-server"
    Environment = terraform.workspace
  }
}
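
The resource above depends on three input variables and a key pair that are not shown. A sketch of what those declarations might look like follows; the defaults and the deployer_public_key variable are assumptions for illustration, not the values from the actual variables.tf.

```hcl
variable "ami_id" {
  description = "AMI for the web server (no default, since AMI IDs are region-specific)"
  type        = string
}

variable "dev_instance_type" {
  description = "Instance type used outside prod (assumed default)"
  type        = string
  default     = "t2.micro"
}

variable "prod_instance_type" {
  description = "Instance type used in the prod workspace (assumed default)"
  type        = string
  default     = "t3.medium"
}

# Hypothetical input: the contents of an SSH public key.
variable "deployer_public_key" {
  description = "SSH public key material for the deployer key pair"
  type        = string
}

# Key pair names must be unique per region, so the workspace name is included.
resource "aws_key_pair" "deployer" {
  key_name   = "${terraform.workspace}-deployer-key"
  public_key = var.deployer_public_key
}
```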

Workflow Implementation

The workflow I followed was:

  1. Create Dev Environment:
terraform workspace new dev
terraform plan
terraform apply
  2. Create Prod Environment:
terraform workspace new prod
terraform plan
terraform apply

Each environment got its own EC2 instance with appropriate sizing and its own security groups, and because each workspace keeps a separate state file, changes in one environment never touched the other.
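
One risk with this workflow is running apply while the default workspace is still selected. A possible guard, sketched here assuming Terraform 1.4 or later (for the built-in terraform_data resource), fails the plan unless the workspace is dev or prod. The resource label workspace_guard is hypothetical.

```hcl
# Fails terraform plan/apply when run from an unexpected workspace.
resource "terraform_data" "workspace_guard" {
  lifecycle {
    precondition {
      condition     = contains(["dev", "prod"], terraform.workspace)
      error_message = "Select the dev or prod workspace before running plan or apply."
    }
  }
}
```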

2-Tier-AWS-Infrastructure-Terraform Project

After getting comfortable with workspaces, I decided to tackle a more comprehensive project - a 2-tier AWS infrastructure that follows industry standards and best practices.

Project Architecture

This project implements a typical web application architecture with:

Presentation Tier (Web Layer):

  • Application Load Balancer for traffic distribution

  • Auto Scaling Group with EC2 instances in multiple Availability Zones

  • Launch Template with user data for automatic application setup

Data Tier (Database Layer):

  • RDS MySQL database with Multi-AZ deployment

  • Private subnets for database security

  • Database subnet group for proper placement

Infrastructure Deep Dive

The 2-tier architecture I implemented consists of two main layers that work together to create a scalable and secure web application infrastructure:

Web Tier (Presentation Layer): This is the front-facing layer that handles all user requests. I created an Application Load Balancer that distributes incoming traffic across multiple EC2 instances running in different Availability Zones. The EC2 instances are part of an Auto Scaling Group, which means they can automatically scale up or down based on traffic demand. Each instance runs a web server (I used Apache with a simple HTML page) that serves the application content to users.

Database Tier (Data Layer): The second tier consists of an RDS MySQL database that stores all the application data. What's really important here is that the database is placed in private subnets, meaning it's not directly accessible from the internet. Only the web servers can communicate with the database through internal network routing.

Implementation Process and Workflow

Phase 1: Network Foundation

I started by building the network infrastructure - creating a custom VPC with both public and private subnets across multiple Availability Zones. The public subnets host the web servers and load balancer, while the private subnets contain the database. An Internet Gateway provides internet access to the public subnets, and a NAT Gateway allows the private subnets to download updates and patches while remaining secure.
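
The network layout described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the repository: the CIDR ranges, the Availability Zone names, and the single shared NAT Gateway are all assumed, and domain = "vpc" on aws_eip assumes AWS provider v5 or later.

```hcl
variable "azs" {
  description = "Availability Zones to spread subnets across (assumed values)"
  type        = list(string)
  default     = ["ap-south-1a", "ap-south-1b"]
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

# Public subnets: 10.0.0.0/24, 10.0.1.0/24, ...
resource "aws_subnet" "public" {
  count                   = length(var.azs)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone       = var.azs[count.index]
  map_public_ip_on_launch = true
}

# Private subnets: 10.0.10.0/24, 10.0.11.0/24, ...
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 10)
  availability_zone = var.azs[count.index]
}

resource "aws_eip" "nat" {
  domain = "vpc"
}

# The NAT Gateway lives in a public subnet and serves the private ones.
resource "aws_nat_gateway" "nat" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
  depends_on    = [aws_internet_gateway.igw]
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat.id
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
```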

Phase 2: Compute Layer Setup

Next, I implemented the compute resources using a Launch Template that defines the EC2 instance configuration. The Launch Template includes the AMI, instance type, security groups, and a user data script that automatically installs and configures Apache web server when instances launch. The Auto Scaling Group uses this template to maintain the desired number of healthy instances and can automatically replace failed instances.
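
A minimal sketch of a Launch Template and Auto Scaling Group along these lines is shown below. The input variables, the instance type, the group sizes, and the yum-based user data (which assumes an Amazon Linux AMI) are all assumptions made for illustration.

```hcl
# Hypothetical inputs; in a full configuration these would come from other resources.
variable "ami_id" { type = string }
variable "web_sg_id" { type = string }
variable "public_subnet_ids" { type = list(string) }
variable "target_group_arn" { type = string }

resource "aws_launch_template" "web" {
  name_prefix            = "${terraform.workspace}-web-"
  image_id               = var.ami_id
  instance_type          = "t2.micro"
  vpc_security_group_ids = [var.web_sg_id]

  # Launch templates expect user data to be base64-encoded.
  user_data = base64encode(<<-EOT
    #!/bin/bash
    yum install -y httpd
    systemctl enable --now httpd
    echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html
  EOT
  )
}

resource "aws_autoscaling_group" "web" {
  name_prefix         = "${terraform.workspace}-web-"
  min_size            = 2
  max_size            = 4
  desired_capacity    = 2
  vpc_zone_identifier = var.public_subnet_ids
  target_group_arns   = [var.target_group_arn]

  # Use the load balancer's health checks to decide when to replace an instance.
  health_check_type = "ELB"

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}
```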

Phase 3: Database Implementation

For the database layer, I created an RDS MySQL instance with Multi-AZ deployment for high availability. The database is configured with automated backups, encryption at rest, and is placed in a database subnet group that spans multiple Availability Zones. I also implemented parameter groups to optimize database performance.
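
The database settings described above map onto a handful of aws_db_instance arguments. The sketch below is an approximation under assumptions: the instance class, storage size, retention period, and input variables are illustrative, and skip_final_snapshot = true is only appropriate for a practice environment.

```hcl
# Hypothetical inputs; in a full configuration these would come from other resources.
variable "private_subnet_ids" { type = list(string) }
variable "db_sg_id" { type = string }
variable "db_username" { type = string }

variable "db_password" {
  type      = string
  sensitive = true
}

# Subnet group spanning the private subnets in multiple Availability Zones.
resource "aws_db_subnet_group" "db" {
  name       = "${terraform.workspace}-db-subnets"
  subnet_ids = var.private_subnet_ids
}

resource "aws_db_instance" "mysql" {
  identifier              = "${terraform.workspace}-mysql"
  engine                  = "mysql"
  engine_version          = "8.0.35" # example version; availability varies by region
  instance_class          = "db.t3.micro"
  allocated_storage       = 20
  db_subnet_group_name    = aws_db_subnet_group.db.name
  vpc_security_group_ids  = [var.db_sg_id]
  username                = var.db_username
  password                = var.db_password
  multi_az                = true  # standby replica in another Availability Zone
  storage_encrypted       = true  # encryption at rest
  backup_retention_period = 7     # enables automated backups
  publicly_accessible     = false
  skip_final_snapshot     = true  # practice environments only
}
```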

Phase 4: Load Balancing and Traffic Management

The Application Load Balancer sits in the public subnets and routes traffic to healthy web server instances based on configured health checks. I set up target groups that define which instances should receive traffic and configured the load balancer to distribute requests evenly across all available instances.
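
As a rough illustration of this phase, the sketch below wires an Application Load Balancer to a target group through an HTTP listener. The input variables and the health check thresholds are assumptions, not the values used in the project.

```hcl
# Hypothetical inputs; in a full configuration these would come from other resources.
variable "vpc_id" { type = string }
variable "public_subnet_ids" { type = list(string) }
variable "alb_sg_id" { type = string }

resource "aws_lb" "web" {
  name               = "${terraform.workspace}-web-alb"
  load_balancer_type = "application"
  security_groups    = [var.alb_sg_id]
  subnets            = var.public_subnet_ids
}

resource "aws_lb_target_group" "web" {
  name     = "${terraform.workspace}-web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # Instances only receive traffic while these checks pass.
  health_check {
    path                = "/"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    interval            = 30
  }
}

# Forward all HTTP requests to the target group.
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}
```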

Security Architecture

Security was implemented at multiple layers throughout the infrastructure:

Network Security: I created separate security groups for each tier - the load balancer security group allows HTTP/HTTPS traffic from the internet, the web server security group only accepts traffic from the load balancer, and the database security group only allows MySQL connections from the web servers. This creates a secure communication path where each layer only accepts traffic from the layer above it.
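
The chained security groups described above can be sketched as follows, assuming HTTP on port 80 between the load balancer and the web servers. The resource labels and the vpc_id variable are hypothetical. The key idea is that the web and database groups reference another security group as their source instead of a CIDR range.

```hcl
variable "vpc_id" { type = string } # hypothetical input

# Load balancer: open to the internet on HTTP and HTTPS.
resource "aws_security_group" "alb" {
  name_prefix = "${terraform.workspace}-alb-"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Web servers: accept HTTP only from the load balancer's security group.
resource "aws_security_group" "web" {
  name_prefix = "${terraform.workspace}-web-"
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Database: accept MySQL (3306) only from the web servers' security group.
resource "aws_security_group" "db" {
  name_prefix = "${terraform.workspace}-db-"
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 3306
    to_port         = 3306
    protocol        = "tcp"
    security_groups = [aws_security_group.web.id]
  }
}
```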

Database Security: The RDS instance is completely isolated in private subnets with no direct internet access. Database credentials are managed securely, and I enabled encryption both at rest and in transit. The database is also configured with automated backups and point-in-time recovery.

Access Control: I used IAM roles instead of hardcoded credentials, and implemented the principle of least privilege throughout the infrastructure. The EC2 instances have just enough permissions to function but can't access other AWS services unnecessarily.
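
A minimal sketch of an instance role is shown below. The article does not say which permissions the instances were given, so the attached AmazonSSMManagedInstanceCore managed policy is purely an assumed example of a narrowly scoped permission set; the resource labels are hypothetical as well.

```hcl
# Trust policy: only the EC2 service may assume this role.
data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "web" {
  name               = "${terraform.workspace}-web-role"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
}

# Assumed example policy; attach only what the instances actually need.
resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.web.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# The instance profile is what a launch template or instance references.
resource "aws_iam_instance_profile" "web" {
  name = "${terraform.workspace}-web-profile"
  role = aws_iam_role.web.name
}
```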

Project Flow

The beauty of this infrastructure is how all the components work together automatically:

  1. Traffic Flow: When users access the application, their requests hit the Application Load Balancer, which checks the health of all web server instances and routes the request to a healthy server.

  2. Auto Scaling: If traffic increases, the Auto Scaling Group automatically launches new EC2 instances using the Launch Template. These new instances automatically register with the load balancer target group and start receiving traffic.

  3. Database Connectivity: The web servers connect to the RDS database using the internal DNS endpoint, ensuring all data operations happen securely within the private network.

  4. Failure Recovery: If an EC2 instance fails, the Auto Scaling Group automatically terminates it and launches a replacement. If the primary database fails, RDS automatically fails over to the standby instance in another Availability Zone.
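
The scaling behavior in step 2 needs a scaling policy attached to the Auto Scaling Group. One common option, sketched here as an assumption since the article does not name the policy used, is target tracking on average CPU. The asg_name variable and the 60 percent target are hypothetical.

```hcl
variable "asg_name" { type = string } # hypothetical: name of the Auto Scaling Group

# Adds or removes instances to keep average CPU near the target value.
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "${terraform.workspace}-cpu-target"
  autoscaling_group_name = var.asg_name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }

    target_value = 60
  }
}
```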

The best thing was seeing this entire infrastructure come to life with just a few Terraform commands. Running terraform apply creates dozens of resources in the correct order, with all the dependencies handled automatically by Terraform's dependency graph.

GitHub repo for this project: https://github.com/Akshansh029/2-Tier-AWS-Infrastructure-Terraform

Challenges I Faced

1️⃣ RDS Cluster Engine Version Problem

When I initially tried to create an RDS cluster, I encountered an error related to the engine version. I had specified engine_version = "8.0" but AWS was expecting a more specific version like 8.0.35.

Solution: I learned to check available engine versions using the AWS CLI. Then I updated my configuration to use a specific version that was available in my region.
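
On the CLI side, aws rds describe-db-engine-versions --engine mysql lists the versions offered in a region. The same lookup can also be done inside Terraform with the aws_rds_engine_version data source, as in this sketch. The candidate list is an assumption for illustration; the data source picks the first entry that is actually offered.

```hcl
# Resolves to the first listed version that is available in the current region.
data "aws_rds_engine_version" "mysql" {
  engine             = "mysql"
  preferred_versions = ["8.0.35", "8.0.36"] # assumed candidates
}

output "mysql_engine_version" {
  value = data.aws_rds_engine_version.mysql.version
}
```

The engine_version argument of the database resource could then reference data.aws_rds_engine_version.mysql.version instead of a hardcoded string.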

2️⃣ State Lock Already Engaged

This was frustrating! I was working on the project and my terraform apply command got interrupted. When I tried to run it again, I got a "state lock" error saying the state was already locked by a previous operation.

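Solution: After making sure no other Terraform operation was still running, I released the lock with the force-unlock command, using the lock ID shown in the error message: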

terraform force-unlock <LOCK_ID>

3️⃣ Unclear Which Arguments Are Required When Creating AWS Resources

While creating AWS resources, I was confused about which arguments were required and which were optional. The Terraform documentation sometimes wasn't clear, and I kept getting errors about missing required arguments.

Solution: I had to rely more heavily on the official Terraform AWS provider documentation, search for better examples, and run terraform plan extensively to catch missing required arguments before applying.

Resources I Used

  1. Terraform Official Documentation

  2. DevOps Projects | NotHarshhaa

What's Next

Now that I understand how to create cloud infrastructure with Terraform, my next plan is to learn configuration management of that infrastructure using Ansible. I want to understand how Terraform and Ansible work together.

Let's Connect!

🔗 My LinkedIn

🔗 My GitHub

If you have any recommended resources, better approaches to my challenges, or insights, I'd love to hear them! Drop your thoughts in the comments.

Have a wonderful day!