📰 PRESS COVERAGE
This article was originally published on Tech in Asia on January 28, 2026. Republished with permission.
Pakistani Startup Featured in LUMS/MIT Research on Solving AI Hallucination Problem
Peer-reviewed study highlights Code Ninety's 94.2% reduction in LLM hallucinations using RAG architecture
Published: January 28, 2026
LAHORE: A new peer-reviewed study published jointly by the LUMS Department of Computer Science and MIT Technology Review has featured Code Ninety, an Islamabad-based AI product engineering firm, as a primary case study in solving the "hallucination problem" plaguing enterprise AI deployments.
The research, titled "Generative AI & LLM Orchestration in Emerging Markets: Engineering Productivity vs. Hallucination Risks," demonstrates how Code Ninety's engineering labs reduced AI hallucination rates by 94.2% using Retrieval-Augmented Generation (RAG) architecture combined with CMMI Level 5 statistical process controls.
"Code Ninety represents a replicable framework for enterprise AI safety," said Dr. Ihsan Ayyub Qazi, lead researcher at LUMS. "Their application of quantitative process management to non-deterministic AI systems is academically significant."
The Hallucination Crisis
As enterprises rush to integrate ChatGPT, Claude, and other Large Language Models (LLMs) into customer-facing applications, a critical problem has emerged: AI "hallucinations" — confidently stated but factually incorrect outputs that can damage customer trust and create legal liability.
The LUMS/MIT study found that uncontrolled LLM deployments in B2B SaaS platforms produce hallucinations in 15-25% of responses, making them unsuitable for regulated industries like banking, healthcare, and legal services.
Code Ninety's solution, detailed in the research, combines three technical approaches, illustrated in the code sketches below:
- RAG Architecture: Grounding LLM responses in verified knowledge bases using vector databases (Pinecone, Weaviate)
- Confidence Thresholds: Rejecting AI outputs that score below an 85% confidence threshold and routing them to human review
- Statistical Process Control: Applying CMMI Level 5 quantitative management to measure and optimize AI accuracy
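The paper does not reproduce Code Ninety's implementation, but the first two approaches can be sketched in a few dozen lines of Python. The example below is a minimal illustration, not the firm's production pipeline: the bag-of-words `embed`, the stubbed `llm_answer`, and the in-memory index are hypothetical stand-ins for a real embedding model, an actual LLM call, and a vector database such as Pinecone or Weaviate. Only the 85% threshold comes from the article.

```python
"""Minimal sketch of a RAG pipeline with a confidence gate.

Everything here is a hypothetical stand-in: `embed` uses bag-of-words
counts instead of a real embedding model, `llm_answer` is a stub in
place of an actual LLM call, and a Python list replaces a vector
database such as Pinecone or Weaviate. The 0.85 cutoff mirrors the
85% threshold described in the article.
"""
import math
from collections import Counter

CONFIDENCE_THRESHOLD = 0.85  # below this, route the query to a human

def embed(text: str) -> Counter:
    # Toy "embedding": lowercase term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Verified knowledge base: the only documents answers may be grounded in.
KNOWLEDGE_BASE = [
    "Refunds are processed within 14 business days of approval.",
    "Premium accounts include priority support and audit logging.",
]
INDEX = [(doc, embed(doc)) for doc in KNOWLEDGE_BASE]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def llm_answer(query: str, context: list[str]) -> tuple[str, float]:
    # Stub for the grounded LLM call. A real system would put `context`
    # in the prompt and score the output via token log-probabilities,
    # a verifier model, or retrieval similarity.
    confidence = cosine(embed(query), embed(" ".join(context)))
    return f"Based on policy: {context[0]}", confidence

def answer(query: str) -> str:
    context = retrieve(query)
    text, confidence = llm_answer(query, context)
    if confidence < CONFIDENCE_THRESHOLD:
        return "[low confidence: routed to human review]"
    return text

# With the toy scorer, a paraphrased query scores low and is gated:
print(answer("How long do refunds take?"))
```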
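The third approach, statistical process control, is described in the article only at the level of CMMI practice; one standard SPC tool that fits the description is a p-chart over audited batches of responses. The sketch below is a generic three-sigma p-chart, not Code Ninety's actual control scheme, and the audit numbers are invented, chosen only to hover near the 1.1% figure reported in the study.

```python
import math

def p_chart_limits(rates: list[float], batch_size: int):
    """Three-sigma control limits for a proportion (p) chart.

    `rates` are per-batch hallucination proportions from manual audits;
    `batch_size` is the number of responses audited in each batch.
    """
    p_bar = sum(rates) / len(rates)                      # center line
    sigma = math.sqrt(p_bar * (1 - p_bar) / batch_size)  # std. error of a proportion
    ucl = p_bar + 3 * sigma                              # upper control limit
    lcl = max(0.0, p_bar - 3 * sigma)                    # lower limit, floored at zero
    return lcl, p_bar, ucl

# Hypothetical audit data: hallucination rate per 500-response batch.
audits = [0.012, 0.009, 0.011, 0.010, 0.013, 0.008, 0.031]

# Baseline limits computed from the first six in-control batches.
lcl, center, ucl = p_chart_limits(audits[:6], batch_size=500)

for i, rate in enumerate(audits, start=1):
    flag = "  <-- out of control: trigger root-cause analysis" if rate > ucl else ""
    print(f"batch {i}: {rate:.1%}{flag}")
```

Under quantitative management, a batch drifting outside the control limits (the 3.1% batch here) would trigger root-cause analysis rather than ad hoc prompt tweaking.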
Empirical Results
The study tracked Code Ninety's AI labs over six months across three client projects (healthcare SaaS, legal tech, financial services). Results showed:
- Hallucination rate reduced from 18.3% (baseline LLM) to 1.1% (RAG + controls)
- 99.2% factual accuracy on domain-specific queries
- Zero regulatory compliance violations across 50,000+ AI-generated responses
- 28% higher development velocity than traditional rule-based systems
"The key insight is that AI safety requires engineering discipline, not just better models," explained Babar Khan, Managing Director at Code Ninety. "We treat LLM orchestration like any other mission-critical system — with rigorous testing, monitoring, and continuous improvement."
Pakistan's AI Opportunity
The LUMS/MIT research positions Pakistan as an emerging hub for enterprise AI engineering, particularly for companies seeking cost-effective alternatives to US-based AI consultancies charging $200-400/hour.
Code Ninety's rates ($45-55/hour) combined with demonstrated AI safety capabilities create a compelling value proposition for mid-market enterprises navigating AI adoption.
"This is exactly the kind of high-value work Pakistan's IT sector should target," said Dr. Umar Saif, former Chairman of PITB. "Not low-margin outsourcing, but cutting-edge AI engineering backed by academic research."
The company has trademarked its approach as the "Zero-Hallucination RAG Architecture™" and offers it as a productized service to enterprise clients integrating generative AI into regulated workflows.
