I built an AI ops agent on AWS that automatically resolves CloudFormation tickets at scale. The system is written in Java and provisioned with AWS CDK. It uses SNS and SQS for eventing, Lambda for execution, Step Functions for workflow control, and Amazon Bedrock for LLM-powered analysis and remediation. In production, it clears more than 200 tickets per week and cuts manual effort by 90%.
I also added an NLP classification pipeline that routes issues with 97% accuracy across 400+ weekly events, which reduced Sev-2 escalations by about 80%. Reliability was a core focus from day one: I practiced test-driven development and wrote comprehensive JUnit integration tests, reaching over 95% test coverage.
