As applications scale across multiple environments and microservices, managing logs becomes a significant challenge. The ELK Stack (Elasticsearch, Logstash, and Kibana) offers a powerful solution for centralized logging, enabling real-time analysis, search, and visualization of logs.
In this guide, we’ll walk through setting up a cost-effective, secure, and scalable ELK logging platform using Infrastructure as Code (IaC) principles, leveraging Terraform for provisioning infrastructure and Ansible for configuration management.
Why ELK, IaC, Terraform, and Ansible?
- ELK Stack: A robust logging system to aggregate, parse, and visualize logs.
- Terraform: An open-source IaC tool that helps manage cloud resources declaratively.
- Ansible: A configuration management tool used to automate the setup of software and system configurations.
- IaC approach: Brings repeatability, scalability, and versioning to infrastructure deployment.
Architecture Overview
Here’s what we’ll build:
- Elasticsearch cluster for storing and querying logs.
- Logstash for collecting and processing logs.
- Kibana for log visualization.
- Beats agents for lightweight log shipping.
- Terraform for provisioning VMs, networking, and security groups.
- Ansible for installing and configuring ELK stack components on provisioned hosts.
Step 1: Define Infrastructure Using Terraform
We will deploy ELK on AWS (you can adapt to Azure, GCP, or on-prem). Let’s start with Terraform.
1.1 Terraform Directory Structure
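A simple layout keeps the provider, variables, and resources in separate files. A minimal structure for this walkthrough might look like the following (only the three files used below):

```
terraform/
├── provider.tf    # provider and version configuration
├── variables.tf   # input variables
└── main.tf        # security group and EC2 instances
```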
1.2 provider.tf
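A minimal `provider.tf` for an AWS deployment might look like this (the provider version constraint and region variable name are assumptions):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}
```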
1.3 variables.tf
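A sketch of `variables.tf` with a few parameters the rest of the configuration can reference (names and defaults are illustrative — size instances to your log volume):

```hcl
variable "aws_region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "us-east-1"
}

variable "instance_type" {
  description = "EC2 instance type for ELK nodes"
  type        = string
  default     = "t3.large"
}

variable "key_name" {
  description = "Name of an existing EC2 key pair for SSH access"
  type        = string
}
```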
1.4 main.tf
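A hedged sketch of `main.tf` that provisions one instance per ELK component plus a security group for the relevant ports. The AMI filter, CIDR ranges, and use of the default VPC are assumptions — adapt them to your network layout:

```hcl
# Look up a recent Ubuntu 22.04 AMI published by Canonical
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# Security group for ELK ports; created in the default VPC here.
# Narrow the CIDR blocks to trusted ranges in production.
resource "aws_security_group" "elk" {
  name        = "elk-sg"
  description = "Elasticsearch, Logstash, and Kibana ports"

  ingress {
    description = "Elasticsearch HTTP (internal only)"
    from_port   = 9200
    to_port     = 9200
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }

  ingress {
    description = "Logstash Beats input (internal only)"
    from_port   = 5044
    to_port     = 5044
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }

  ingress {
    description = "Kibana UI - restrict to trusted IPs"
    from_port   = 5601
    to_port     = 5601
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# One instance per component, named by role
resource "aws_instance" "elk" {
  for_each               = toset(["elasticsearch", "logstash", "kibana"])
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = var.instance_type
  key_name               = var.key_name
  vpc_security_group_ids = [aws_security_group.elk.id]

  tags = {
    Name = each.key
  }
}
```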
1.5 Apply Terraform
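With the files in place, the standard Terraform workflow provisions everything:

```
terraform init    # download the AWS provider
terraform plan    # review the execution plan
terraform apply   # provision the instances and security group
```

Note the public and private IPs of the new instances — the Ansible inventory in the next step needs them.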
Step 2: Configure ELK Stack With Ansible
With our infrastructure up, we’ll use Ansible to configure the ELK Stack.
2.1 Ansible Directory Structure
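One plausible layout, with a role per ELK component (matching the role paths referenced below):

```
ansible/
├── inventory.ini
├── playbook.yml
└── roles/
    ├── elasticsearch/
    │   └── tasks/
    │       ├── main.yml
    │       └── setup.yml
    ├── logstash/
    │   └── tasks/
    │       └── main.yml
    └── kibana/
        └── tasks/
            └── main.yml
```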
2.2 inventory.ini
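A minimal inventory grouping one host per component. The IP addresses below are placeholders — substitute the addresses of the instances Terraform created:

```ini
[elasticsearch]
10.0.1.10 ansible_user=ubuntu

[logstash]
10.0.1.11 ansible_user=ubuntu

[kibana]
10.0.1.12 ansible_user=ubuntu
```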
2.3 Elasticsearch Role: roles/elasticsearch/tasks/main.yml
Install the Elasticsearch GPG key and repository in roles/elasticsearch/tasks/setup.yml:
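A sketch of both task files, assuming Debian/Ubuntu hosts and the Elastic 8.x APT repository (adjust the repo version and `network.host` setting to your environment):

```yaml
# --- roles/elasticsearch/tasks/setup.yml ---
- name: Add the Elastic GPG key
  ansible.builtin.apt_key:
    url: https://artifacts.elastic.co/GPG-KEY-elasticsearch
    state: present

- name: Add the Elastic APT repository
  ansible.builtin.apt_repository:
    repo: "deb https://artifacts.elastic.co/packages/8.x/apt stable main"
    state: present

# --- roles/elasticsearch/tasks/main.yml ---
- name: Set up the Elastic repository
  ansible.builtin.include_tasks: setup.yml

- name: Install Elasticsearch
  ansible.builtin.apt:
    name: elasticsearch
    update_cache: yes

- name: Allow connections from other ELK nodes
  ansible.builtin.lineinfile:
    path: /etc/elasticsearch/elasticsearch.yml
    regexp: '^#?network.host:'
    line: "network.host: 0.0.0.0"

- name: Enable and start Elasticsearch
  ansible.builtin.systemd:
    name: elasticsearch
    enabled: yes
    state: started
```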
2.4 Logstash Role: roles/logstash/tasks/main.yml
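A minimal Logstash role that installs the package and drops in a Beats-to-Elasticsearch pipeline. The Elasticsearch host IP is a placeholder for your cluster address:

```yaml
# roles/logstash/tasks/main.yml
- name: Install Logstash
  ansible.builtin.apt:
    name: logstash
    update_cache: yes

- name: Deploy a Beats-to-Elasticsearch pipeline
  ansible.builtin.copy:
    dest: /etc/logstash/conf.d/beats.conf
    content: |
      input {
        beats { port => 5044 }
      }
      output {
        elasticsearch {
          hosts => ["http://10.0.1.10:9200"]
          index => "logstash-%{+YYYY.MM.dd}"
        }
      }

- name: Enable and start Logstash
  ansible.builtin.systemd:
    name: logstash
    enabled: yes
    state: started
```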
2.5 Kibana Role: roles/kibana/tasks/main.yml
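The Kibana role follows the same pattern — install, configure the listen address, start the service (the `server.host` change makes the UI reachable from outside localhost):

```yaml
# roles/kibana/tasks/main.yml
- name: Install Kibana
  ansible.builtin.apt:
    name: kibana
    update_cache: yes

- name: Listen on all interfaces so the UI is reachable
  ansible.builtin.lineinfile:
    path: /etc/kibana/kibana.yml
    regexp: '^#?server.host:'
    line: 'server.host: "0.0.0.0"'

- name: Enable and start Kibana
  ansible.builtin.systemd:
    name: kibana
    enabled: yes
    state: started
```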
2.6 Master Playbook: playbook.yml
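The master playbook simply maps each inventory group to its role:

```yaml
# playbook.yml
- hosts: elasticsearch
  become: yes
  roles:
    - elasticsearch

- hosts: logstash
  become: yes
  roles:
    - logstash

- hosts: kibana
  become: yes
  roles:
    - kibana
```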
2.7 Run the Playbook
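```
ansible-playbook -i inventory.ini playbook.yml
```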
Step 3: Send Logs Using Filebeat
Install Filebeat on your application servers to ship logs to Logstash:
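Assuming Debian/Ubuntu application servers and the same Elastic 8.x repository used above:

```
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
  sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | \
  sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update && sudo apt-get install -y filebeat
```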
Configure the output in /etc/filebeat/filebeat.yml:
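A minimal configuration that ships local log files to Logstash (the input paths and the Logstash IP are placeholders — use the private address of the instance Terraform created, and make sure the default `output.elasticsearch` section is commented out):

```yaml
filebeat.inputs:
  - type: filestream
    paths:
      - /var/log/*.log

output.logstash:
  hosts: ["10.0.1.11:5044"]
```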
Enable and start Filebeat:
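```
sudo systemctl enable filebeat
sudo systemctl start filebeat
```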
Step 4: Access Kibana Dashboard
Navigate to http://&lt;your_instance_ip&gt;:5601 in your browser. Set up your index pattern (logstash-*) and start visualizing logs.
Security Considerations
- Use TLS/SSL: Secure communication between Beats, Logstash, and Elasticsearch.
- Enable authentication: Use X-Pack or Open Distro to enable basic auth.
- Restrict access: Update security groups or firewalls to restrict IP ranges.
- Encrypt secrets: Store secrets with Ansible Vault or AWS Secrets Manager.
Scaling Recommendations
- Elasticsearch: Run in a cluster with 3+ nodes for high availability.
- Logstash: Use queueing (such as Redis) and load balance across instances.
- Kibana: Can be fronted with Nginx and scaled horizontally if needed.
- Storage: Use EBS with high IOPS, or S3 snapshots for archived logs.
Cost Optimization Tips
- Use spot instances for non-critical Logstash/Kibana servers.
- Store logs for a limited retention period.
- Offload cold logs to S3 using Elasticsearch Curator or lifecycle policies.
- Monitor resource usage and right-size VMs regularly.
Conclusion
Building a centralized logging platform with the ELK Stack using Infrastructure as Code (IaC) principles is not just a technical improvement—it’s a strategic one. As organizations increasingly adopt microservices, containerized environments, and distributed systems, log data becomes the lifeline for debugging, monitoring, security auditing, and performance optimization. Yet, managing this critical stream of operational data without proper structure can lead to chaos, inefficiencies, and missed opportunities.
In this article, we walked through how to build a cost-effective, secure, and scalable ELK stack using Terraform and Ansible. Terraform provided the foundation by provisioning reliable, repeatable infrastructure on the cloud (in our case, AWS), while Ansible automated the complex and often error-prone software installation and configuration tasks. By decoupling infrastructure provisioning from configuration management, you achieve better modularity, version control, and disaster recovery readiness.
From a cost perspective, we discussed the importance of instance right-sizing, storage optimization, short retention policies, and the use of spot instances where applicable. These optimizations make a big difference in high-throughput environments where logs can quickly accumulate into terabytes of data. Using ELK’s native features like rollover indices, ILM (Index Lifecycle Management), and S3-based cold storage options further ensures long-term sustainability and storage efficiency.
On the security side, the ELK stack supports a wide range of enterprise-grade features including TLS encryption, user authentication, and role-based access control—especially when enhanced with Elastic’s commercial X-Pack or the open-source Open Distro for Elasticsearch. Infrastructure security is further improved through Terraform’s ability to manage firewall rules, IAM policies, and security groups, and Ansible’s support for encrypted secrets using Vault.
When it comes to scalability, the solution we outlined can scale horizontally by simply adding nodes to your Elasticsearch cluster, load balancing Logstash instances, and deploying multiple Filebeat agents on different application servers. Elastic’s architecture is designed for scale, and with IaC, provisioning and configuring new nodes becomes a matter of running a script rather than manually setting up each server.
This IaC-based ELK platform is also cloud-agnostic, meaning it can be adapted for Azure, GCP, on-premise, or hybrid environments. That flexibility is vital for teams that need to support compliance, regional hosting requirements, or disaster recovery strategies. Moreover, with version-controlled IaC repositories, infrastructure becomes part of your codebase—reviewed, tested, and deployed just like application logic.
Beyond operational benefits, this setup also fosters collaboration. Teams can define infrastructure and configuration standards through code, enforce them via CI/CD pipelines, and minimize the “it works on my machine” problem. The clear separation of concerns (provisioning vs configuration) enables DevOps teams to specialize and move faster without stepping on each other’s toes.
In real-world scenarios, an IaC-driven ELK stack is often used for:
- Real-time application log monitoring and error detection
- Security event logging and alerting (SIEM)
- Performance analysis and SLA reporting
- Infrastructure health visualization and anomaly detection
- Compliance auditing and forensics
Ultimately, implementing the ELK Stack with Terraform and Ansible is more than a technical deployment—it’s an investment in your organization’s observability, resilience, and efficiency. With a small upfront investment in time and structure, you gain a logging platform that evolves with your needs, scales with your growth, and delivers actionable insights from your data streams.