
AI Due Diligence Red Flags: 12 Warning Signs PE Investors Must Catch Before Closing

The AI hype cycle has made it easy for portfolio company management to overstate AI capability, understate technical debt, and obscure unit economics. Here's what experienced PE investors look for — and what to demand when they find it.

March 2026 · 12 min read · Due Diligence

The core problem: AI capabilities are notoriously difficult to evaluate in standard due diligence. Management decks are full of benchmark scores, buzzwords, and AI roadmap slides — but they rarely tell you whether the AI actually drives revenue, whether it's defensible, or whether there's a regulatory time bomb buried in the data stack.

These 12 red flags represent the most common failures we see in PE AI due diligence. Most are discoverable with the right questions and two to three days of focused technical review. All of them have caused write-downs.

Risk levels:
Critical — Deal-stopper potential
High — Requires remediation plan
Medium — Monitor post-close
01
AI Revenue Claims With No Audit Trail
Critical

Management attributes significant ARR to 'AI-powered' features with no way to isolate that revenue from the base product. Ask for cohort data showing conversion lift, retention delta, or pricing premium for AI SKUs — if they can't produce it, the claim is marketing.

How to test it:

Request a revenue bridge: what the product earned before the AI feature launched versus after it launched, holding all other variables constant.
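
If cohort data exists, the comparison is straightforward to reproduce. A minimal sketch in Python, assuming the company can export per-customer records with fields like cohort, converted, retained_12m, and arpa (the file and column names are illustrative, not the target's actual schema):

import pandas as pd

# Hypothetical per-customer export; column names are illustrative.
# cohort: "pre_ai_launch" or "post_ai_launch"
# converted: 1 if the trial converted to paid, else 0
# retained_12m: 1 if still a customer 12 months later, else 0
# arpa: annualized revenue per account, USD
customers = pd.read_csv("customer_cohorts.csv")

summary = customers.groupby("cohort").agg(
    accounts=("converted", "size"),
    conversion_rate=("converted", "mean"),
    retention_12m=("retained_12m", "mean"),
    avg_arpa=("arpa", "mean"),
)
pre, post = summary.loc["pre_ai_launch"], summary.loc["post_ai_launch"]

print(summary.round(3))
print(f"Conversion lift: {post.conversion_rate - pre.conversion_rate:+.1%}")
print(f"Retention delta: {post.retention_12m - pre.retention_12m:+.1%}")
print(f"Pricing premium: {post.avg_arpa / pre.avg_arpa - 1:+.1%}")

If management cannot produce data at this granularity, or the lift disappears once pricing and packaging changes are controlled for, treat the AI revenue claim as unverified.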

02
Model Dependency on One API Provider
High

The entire AI stack runs through a single third-party API (typically OpenAI) with no fallback, no fine-tuned model, and no switching plan. If that provider raises prices 3x or deprecates the model version, the product breaks or margins collapse.

How to test it:

Ask: what happens if OpenAI raises prices by 40%? What's your model switching cost and timeline?
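
To make the question concrete, a back-of-the-envelope margin sensitivity under the price shocks mentioned above; every input below is a placeholder to be replaced with the target's actual usage and cost figures:

# Margin sensitivity to an API price shock; all inputs are placeholders.
monthly_ai_revenue = 500_000        # USD from AI features
api_cost_per_1k_tokens = 0.01       # USD, current blended rate
tokens_per_month = 2_000_000_000    # tokens consumed serving customers
other_ai_cogs = 120_000             # hosting, vector DB, support, etc.

def gross_margin(price_per_1k_tokens: float) -> float:
    api_cost = price_per_1k_tokens * tokens_per_month / 1_000
    return (monthly_ai_revenue - api_cost - other_ai_cogs) / monthly_ai_revenue

for shock in (0.0, 0.4, 2.0):       # today, +40%, 3x
    price = api_cost_per_1k_tokens * (1 + shock)
    print(f"API price at {1 + shock:.1f}x -> AI gross margin {gross_margin(price):.1%}")

The same worksheet should capture the switching scenario: engineering months to migrate to an alternative model, plus any accuracy loss and its likely effect on churn.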

03
Training Data With Unresolved IP or Privacy Exposure
Critical

The company trained models on scraped web data, user-generated content, or third-party datasets without clear licensing. Litigation of the New York Times v. OpenAI era has already put acquirers on notice — undisclosed data provenance is a deal-stopper.

How to test it:

Require a data lineage report. Who provided the training data? What were the terms? Has legal reviewed them?

04
Accuracy Metrics That Don't Match Real-World Use Cases
High

The company reports 94% accuracy on a benchmark that doesn't reflect actual deployment conditions. Benchmark gaming is endemic in AI — lab performance frequently degrades 15–30% in production on real, messy customer data.

How to test it:

Ask for production accuracy metrics with timestamps, not benchmark scores. What's the false positive rate on real customer workloads?
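
One way to check the answer during diligence, assuming the company logs predictions alongside eventually-confirmed outcomes (the file and column names below are hypothetical, and labels are assumed to be 0/1):

import pandas as pd

# Hypothetical production log export: one row per model decision, with the
# outcome later confirmed by a human or downstream system.
# Columns (illustrative): timestamp, predicted_label, actual_label
log = pd.read_csv("production_predictions.csv", parse_dates=["timestamp"])

# Use a recent, representative window rather than all-time figures.
recent = log[log["timestamp"] >= log["timestamp"].max() - pd.Timedelta(days=90)]

tp = ((recent.predicted_label == 1) & (recent.actual_label == 1)).sum()
fp = ((recent.predicted_label == 1) & (recent.actual_label == 0)).sum()
fn = ((recent.predicted_label == 0) & (recent.actual_label == 1)).sum()
tn = ((recent.predicted_label == 0) & (recent.actual_label == 0)).sum()

accuracy = (tp + tn) / len(recent)
false_positive_rate = fp / (fp + tn) if (fp + tn) else float("nan")

print(f"Production accuracy (last 90 days): {accuracy:.1%}")
print(f"False positive rate:                {false_positive_rate:.1%}")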

05
No AI Governance or Model Monitoring Infrastructure
High

There's no systematic process for detecting model drift, bias incidents, or output degradation. For regulated industries (finance, healthcare, legal), this is an immediate red flag. For all others, it's a time bomb — models decay silently.

How to test it:

Ask: how would you know if the model started performing worse today? What's your drift detection and retraining cadence?
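
A useful calibration when evaluating the answer: basic drift monitoring does not require heavy tooling. A minimal sketch of one common check, the population stability index between a reference window and the live window (the score distributions below are simulated placeholders):

import numpy as np

def population_stability_index(reference, current, bins=10):
    """PSI between a reference distribution and the live one.
    Rule of thumb: < 0.1 stable, 0.1-0.25 watch, > 0.25 investigate."""
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    ref_pct = np.histogram(np.clip(reference, edges[0], edges[-1]), edges)[0] / len(reference)
    cur_pct = np.histogram(np.clip(current, edges[0], edges[-1]), edges)[0] / len(current)
    ref_pct = np.clip(ref_pct, 1e-6, None)   # avoid log(0)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

# Simulated model confidence scores: at launch vs. the most recent week.
rng = np.random.default_rng(0)
launch_scores = rng.beta(8, 2, 50_000)   # placeholder reference window
recent_scores = rng.beta(6, 3, 5_000)    # placeholder live window
print(f"PSI: {population_stability_index(launch_scores, recent_scores):.3f}")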

06
AI Team Is One Person Deep
High

The AI capability is entirely held by one or two engineers. When they leave, the models degrade or break and no one can rebuild them. This is especially dangerous post-acquisition, when key technical talent often departs.

How to test it:

Map the bus factor: how many people can independently rebuild the core model pipeline? What's the documentation coverage?

07
Compute Costs That Don't Scale With Revenue
Medium

GPU and inference costs grow faster than revenue as usage scales. Many AI-forward companies are cash-flow negative on their AI features at scale and don't know it because they haven't stress-tested the unit economics.

How to test it:

Build a unit economics model: cost-per-inference × usage volume at 2x, 5x, 10x current customers. Does gross margin hold?
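
A minimal version of that model in Python; every figure below is a placeholder to be replaced with the target's actual billing and usage data, and it is worth re-running with per-customer usage and cost-per-inference stepped up as well, since both tend to rise with scale:

# Stress-test AI gross margin at scale; all figures are placeholders.
current_customers = 400
revenue_per_customer = 1_200        # USD / month
inferences_per_customer = 15_000    # per month
cost_per_inference = 0.004          # USD, blended GPU + API
fixed_ai_cogs = 60_000              # USD / month: MLOps, monitoring, storage

for multiple in (1, 2, 5, 10):
    customers = current_customers * multiple
    revenue = customers * revenue_per_customer
    variable_cost = customers * inferences_per_customer * cost_per_inference
    margin = (revenue - variable_cost - fixed_ai_cogs) / revenue
    print(f"{multiple:>2}x customers: AI gross margin {margin:.1%}")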

08
Contractual Lock-In on Customer Data for Model Training
Critical

Existing customer contracts include opt-out rights or restrictions that prevent the company from using customer data to train or improve models. This quietly breaks the 'data flywheel' thesis and can invalidate the AI moat narrative.

How to test it:

Review a sample of 10 customer contracts specifically for data usage, training rights, and opt-out clauses.

09
Hallucination Risk in Customer-Facing Outputs
High

The product generates factual claims, recommendations, or reports that customers act on — but there's no human review step, confidence scoring, or output auditing. One high-profile hallucination incident can trigger customer churn and reputational damage faster than any other failure mode.

How to test it:

Ask for examples of the worst outputs the model has produced. Is there a human-in-the-loop for high-stakes outputs?
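
When evaluating the answer, the pattern to look for is confidence-gated human review on high-stakes outputs. A minimal sketch of the routing logic, with a hypothetical threshold and output schema:

from dataclasses import dataclass

@dataclass
class ModelOutput:
    text: str
    confidence: float   # model or verifier score in [0, 1]
    high_stakes: bool   # e.g., financial, legal, or medical content

REVIEW_THRESHOLD = 0.85  # placeholder; should be tuned against audited error rates

def route(output: ModelOutput) -> str:
    """Hold risky or low-confidence outputs for human review before release;
    everything else ships, with the output logged for later auditing."""
    if output.high_stakes or output.confidence < REVIEW_THRESHOLD:
        return "human_review"
    return "auto_release"

print(route(ModelOutput("Q3 revenue grew 40%", confidence=0.72, high_stakes=True)))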

10
Shadow AI Usage Across the Org
Medium

Employees are using personal ChatGPT accounts, Copilot, or other AI tools to process customer data outside of approved systems. This creates data leakage risk and potential breach of customer data processing agreements.

How to test it:

Ask: what's your AI acceptable use policy? How do you enforce it? Have you audited what tools employees are actually using?

11
Regulatory Exposure in High-Risk AI Act Categories
Critical

The product touches categories flagged as high-risk under the EU AI Act (hiring, credit scoring, medical diagnosis, critical infrastructure) with no compliance roadmap. Enforcement deadlines are already phasing in, and fines for non-compliance run to €15 million or 3% of global annual turnover, whichever is higher.

How to test it:

Classify the product against EU AI Act risk categories. Does the company have a conformity assessment plan?

12
AI Claims in Marketing vs. Actual Product Functionality
Medium

The company calls itself 'AI-powered' but the core product is rule-based automation with a thin LLM wrapper on one feature. This matters for valuation multiple justification — AI companies trade at higher multiples, and if the AI thesis doesn't hold, the entry multiple collapses.

How to test it:

Do a feature-by-feature audit: which capabilities are genuinely ML/AI vs. deterministic logic? What % of ARR would exist without the AI features?

What to Do When You Find These Red Flags

Manageable

Flags #7, #10 are typically manageable post-close with the right operating plan. Price them into the deal and build remediation milestones into the 100-day plan.

Negotiate or Escrow

Flags #2, #4, #5, #6, #9, #12 typically warrant purchase price adjustments, escrow arrangements, or rep & warranty coverage. Quantify the remediation cost and use it as leverage.

Walk Away

Flags #1, #3, #8, and #11, if unresolved, are potential deal-stoppers. Misrepresented AI revenue, unresolved IP exposure, and active regulatory non-compliance are liabilities that often exceed the acquisition price.

A Systematic Framework for AI Due Diligence

The best PE teams run AI due diligence in four parallel workstreams across a two-week sprint, with technical experts, legal, and financial analysts working simultaneously:

Technical Architecture Review

Model stack, vendor dependencies, compute economics, data pipeline

Covers: Red flags #2, #4, #5, #7

Data & Legal Audit

Training data provenance, customer contract data rights, regulatory classification

Covers: Red flags #3, #8, #11

Commercial AI Validation

Revenue attribution, benchmark vs. production accuracy, customer interviews

Covers: Red flags #1, #4, #12

People & Process Assessment

AI team depth, documentation, governance processes, shadow AI usage

Covers: Red flags #6, #9, #10

Run a systematic AI audit before your next close

PortCoAudit AI delivers a structured AI readiness scorecard for any portfolio company in 48 hours — covering all 12 red flag categories with evidence-based scoring and actionable remediation plans.

Board-Cycle Ready
Review engagement options, then request a fit assessment based on your current portfolio timeline.