May 20, 2026
The Agentic AI Evaluation Playbook: How We Compared GPT-5.2, Claude Sonnet 4.5, and Qwen for Enterprise Deployment [Compressed Version]
ChatGenie's production customer support automation platform was built on Microsoft Azure. Our Orchestrator Agent ran on GPT-5.2 — tuned from a GPT-4o baseline — and it had earned its place: 99% resolution accuracy, a 77% reduction in support OPEX, and a customer team rightsized from 39 to 9 agents for our flagship enterprise client. When we migrated to Amazon EKS and AWS Bedrock, we lost access to Azure AI Foundry and with it, our path to GPT-5.2. We needed a replacement sourced entirely from Amazon Bedrock. This article documents what we found.




.png)
.png)
.jpg)






.png)






