Site Reliability Engineer (SRE) – Azure AKS, Observability & Terraform
Astra North Infoteck Inc.
Job Description
Remote Role

Key Responsibilities
- Observability, SRE, DevOps roles with expertise in infrastructure and application reliability
- Dynatrace, ELK, Splunk, PagerDuty
- SLI/SLO frameworks
- Azure Kubernetes Service (AKS), Terraform, Azure managed services

What will you do
- Design and implement observability-as-code solutions using Terraform for monitoring pipelines, dashboards, and alerting across distributed systems
- Drive observability improvements using Dynatrace, ELK, Splunk, PagerDuty for real-time performance insights and system visibility
- Instrument applications for end-to-end observability including distributed tracing, metrics collection, and log aggregation across Node.js and .NET microservices and event-driven architectures
- Troubleshoot complex production incidents across service layers, databases, caches, and APIs using SLI/SLO frameworks
- Investigate and resolve Azure Kubernetes Service (AKS) infrastructure issues en...