Self-Healing Infrastructure: Autonomous LLM Agents for Real-Time Remediation of Configuration Drift and Security Misconfigurations in IaC Deployments
Harish Apuri1, Madhan Mohan Reddy Chinthala2, Shikher Goel3, Mukesh Aurangabadkar4, Charani Yepuri5
1Harish Apuri, Department of IT, IT Induct Inc, Charlotte (NC), United States of America (USA).
2Madhan Mohan Reddy Chinthala, Department of IT, Franklin Info Tech, Charlotte (NC), United States of America (USA).
3Shikher Goel, Department of IT, JPMorgan Chase, Jersey (New Jersey), United States of America (USA).
4Mukesh Aurangabadkar, Department of IT, Spectrum, Denver (Colorado), United States of America (USA).
5Charani Yepuri, Independent Researcher, Department of IT, Hyderabad (Telangana), India.
Manuscript received on 01 February 2026 | First Revised Manuscript received on 27 February 2026 | Second Revised Manuscript received on 05 March 2026 | Manuscript Accepted on 15 March 2026 | Manuscript published on 30 March 2026 | PP: 25-32 | Volume-15 Issue-4, March 2026 | Retrieval Number: 100.1/ijitee.D475715040426 | DOI: 10.35940/ijitee.D4757.15040326
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The adoption of Infrastructure as Code (IaC) has improved the scalability and automation of cloud environments, but configuration drift and security misconfigurations remain critical operational and security issues. Current drift detection and remediation solutions rely largely on reactive, rule-based checks and human intervention, which makes them ineffective in dynamic, multi-cloud environments. This research aims to develop and deploy a self-healing infrastructure architecture that autonomously identifies and recovers from configuration drift and security misconfigurations in real time. To this end, the paper proposes a new multi-agent architecture based on Large Language Models (LLMs), in which drift detectors, security reasoners, root-cause analysers, remediation generators, and post-remediation validators operate within a closed-loop pipeline. The framework is evaluated on a publicly available IaC dataset (written in Terraform) of simulated drift scenarios. According to the experimental results, the proposed LLM-agent system outperforms rule-based and semi-automated systems, with a drift detection rate of 96.8%, a security misconfiguration detection rate of 95.2%, and a mean time to remediation of 6.9 minutes. The framework also reduces false positives and manual intervention while achieving high policy compliance. These findings support the usefulness of autonomous LLM agents in enabling proactive, intelligent, and scalable self-healing infrastructure management in contemporary cloud systems.
Keywords: Self-Healing Infrastructure, Infrastructure as Code (IaC), Large Language Models (LLMs), Configuration Drift Remediation, Cloud Security Automation
Scope of the Article: Computer Science and Engineering
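Illustrative sketch (not the authors' implementation): the closed-loop pipeline named in the abstract can be pictured as five stages that run in sequence and repeat until the validator finds no remaining drift. The minimal Python sketch below assumes plain dictionaries stand in for Terraform state and simple functions stand in for the LLM agents. Every name in it (`Finding`, `SECURITY_KEYS`, `self_heal`, and so on) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Config = Dict[str, object]  # stand-in for a resource's attributes

# Hypothetical policy: attributes whose drift counts as a security issue.
SECURITY_KEYS = {"public_access", "encryption"}


@dataclass
class Finding:
    key: str
    declared: object          # value in the IaC definition (None if absent)
    actual: object            # value observed in the live environment
    security_relevant: bool = False
    root_cause: str = ""


def detect_drift(declared: Config, actual: Config) -> List[Finding]:
    """Drift detector: report every attribute whose live value differs."""
    keys = sorted(set(declared) | set(actual))
    return [Finding(k, declared.get(k), actual.get(k))
            for k in keys if declared.get(k) != actual.get(k)]


def assess_security(findings: List[Finding]) -> List[Finding]:
    """Security reasoner: flag drift that touches a security attribute."""
    for f in findings:
        f.security_relevant = f.key in SECURITY_KEYS
    return findings


def analyse_root_cause(findings: List[Finding]) -> List[Finding]:
    """Root-cause analyser: a coarse label; an LLM agent would explain more."""
    for f in findings:
        if f.declared is None:
            f.root_cause = "attribute added outside IaC"
        elif f.actual is None:
            f.root_cause = "attribute removed outside IaC"
        else:
            f.root_cause = "attribute changed outside IaC"
    return findings


def generate_remediation(findings: List[Finding]) -> Config:
    """Remediation generator: a patch that restores the declared values."""
    return {f.key: f.declared for f in findings}


def apply_patch(actual: Config, patch: Config) -> Config:
    """Apply the patch to a copy; a None value means 'remove the attribute'."""
    healed = dict(actual)
    for key, value in patch.items():
        if value is None:
            healed.pop(key, None)
        else:
            healed[key] = value
    return healed


def validate(declared: Config, actual: Config) -> bool:
    """Post-remediation validator: passes when no drift remains."""
    return not detect_drift(declared, actual)


def self_heal(declared: Config, actual: Config,
              max_rounds: int = 3) -> Tuple[Config, List[Finding], bool]:
    """Closed loop: detect, reason, analyse, remediate, then re-check."""
    log: List[Finding] = []
    for _ in range(max_rounds):
        findings = detect_drift(declared, actual)
        if not findings:
            break
        findings = analyse_root_cause(assess_security(findings))
        log.extend(findings)
        actual = apply_patch(actual, generate_remediation(findings))
    return actual, log, validate(declared, actual)
```

Calling `self_heal(declared, actual)` returns the healed state, the list of findings that were remediated, and the validator's verdict. Bounding the loop with `max_rounds` is a design choice of this sketch, so that a remediation which keeps failing validation cannot loop indefinitely.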
