====== AI Service Level Agreement (AI-SLA) ======

An AI Service Level Agreement (AI-SLA) is a contractual document that defines the performance standards, metrics, and remedies applicable to AI-powered services. ((Source: [[https://sparkco.ai/blog/mastering-slos-and-slas-for-ai-agents-in-2025|Sparkco — Mastering SLOs and SLAs for AI Agents]])) Unlike traditional SLAs, which focus primarily on uptime and response time, AI-SLAs must also address characteristics specific to AI systems, including model accuracy, inference latency, output quality, fairness, and drift monitoring.

===== Why AI-SLAs Differ from Traditional SLAs =====

Traditional SLAs measure deterministic software behavior: the service is either available or it is not, and response times are predictable. AI systems introduce stochastic behavior where outputs can vary, models can degrade over time, and quality metrics extend beyond simple availability. ((Source: [[https://sparkco.ai/blog/mastering-slos-and-slas-for-ai-agents-in-2025|Sparkco — Mastering SLOs and SLAs for AI Agents]]))

AI-SLAs must account for:

  * Model accuracy that may change as data distributions shift
  * Inference latency that varies with input complexity
  * Output quality that requires domain-specific evaluation metrics
  * Fairness and bias measurements across protected groups
  * Model versioning and update procedures

===== Key Metrics =====

=== Availability and Uptime ===

Standard uptime commitments remain foundational. Cloud providers typically offer 99.5 to 99.9 percent monthly uptime for AI services, with service credits for breaches. ((Source: [[https://cloud.google.com/document-ai/sla|Google Cloud — Document AI SLA]])) Downtime calculations exclude planned maintenance windows with advance notice.
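The arithmetic behind an uptime commitment can be sketched as follows. This is a minimal illustration, assuming planned maintenance is removed from the measurement window instead of being counted as uptime. The function names and the 30-day period are hypothetical, and a real contract defines its own measurement rules.

```python
def monthly_uptime_percent(total_minutes, downtime_minutes,
                           planned_maintenance_minutes=0.0):
    """Uptime as a percentage of the minutes that count toward the SLA."""
    # Assumption: planned maintenance is excluded from the window entirely.
    eligible = total_minutes - planned_maintenance_minutes
    if eligible <= 0:
        raise ValueError("no eligible minutes in the measurement period")
    if not 0 <= downtime_minutes <= eligible:
        raise ValueError("downtime must lie between 0 and the eligible minutes")
    return 100.0 * (eligible - downtime_minutes) / eligible


def downtime_budget_minutes(target_percent, eligible_minutes):
    """Minutes of unplanned downtime a target allows before it is breached."""
    return eligible_minutes * (100.0 - target_percent) / 100.0


MINUTES_IN_30_DAYS = 30 * 24 * 60  # 43,200 minutes

for target in (99.5, 99.9):
    budget = downtime_budget_minutes(target, MINUTES_IN_30_DAYS)
    print(f"{target}% allows about {budget:.1f} minutes of downtime per 30 days")
```

Under these assumptions, a 99.9 percent target leaves roughly 43 minutes of unplanned downtime in a 30-day period, while a 99.5 percent target leaves 216 minutes.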

=== Model Performance ===

  * **Accuracy Rate**: Minimum acceptable accuracy for classifications, predictions, or generations (e.g., 95 percent accuracy for natural language processing tasks) ((Source: [[https://sparkco.ai/blog/mastering-slos-and-slas-for-ai-agents-in-2025|Sparkco — Mastering SLOs and SLAs for AI Agents]]))
  * **Inference Latency**: Maximum response time for model predictions, typically measured at the 50th and 99th percentiles
  * **Throughput**: Minimum requests per second the service must sustain
  * **Error Rate**: Maximum acceptable percentage of failed or invalid responses
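The latency, throughput, and error-rate metrics above could be computed from one measurement window roughly as shown here. This is a sketch only: it assumes one latency sample per request (failed requests included), uses the nearest-rank definition of a percentile, and the function names are hypothetical. A production monitoring stack would normally supply these figures.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_window(latencies_ms, failed_requests, window_seconds):
    """Summarize one measurement window of request data."""
    total = len(latencies_ms)  # assumption: one latency sample per request
    return {
        "p50_latency_ms": percentile(latencies_ms, 50),
        "p99_latency_ms": percentile(latencies_ms, 99),
        "throughput_rps": total / window_seconds,
        "error_rate": failed_requests / total,
    }


# Made-up sample: 100 requests in 10 seconds, 2 of which failed.
latencies = [float(ms) for ms in range(1, 101)]
summary = summarize_window(latencies, failed_requests=2, window_seconds=10)
print(summary)
```

Reporting the 99th percentile alongside the median matters because a service can have an acceptable median while a small share of requests is far slower.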

=== AI-Specific Metrics ===

  * **Model Drift Monitoring**: Regular measurement of performance degradation as input data distributions change over time
  * **Fairness Metrics**: Quantified bias measurements across demographic groups to ensure equitable outcomes
  * **Hallucination Rate**: For generative AI services, the maximum acceptable frequency of factually incorrect outputs
  * **Data Freshness**: Maximum age of the training data, or of the model's knowledge cutoff date
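Two of the metrics above lend themselves to a short worked example. The sketch below uses the Population Stability Index (PSI) as one possible drift measure and the demographic parity difference as one possible fairness measure. Both are illustrative choices, not measures prescribed by the sources cited here, and the function names, the ''eps'' smoothing value, and the sample counts are assumptions.

```python
import math


def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """PSI between a baseline histogram and a current histogram that use
    the same bins. A value of 0 means the two distributions are identical."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("histograms must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Clamp empty bins to eps so the logarithm stays defined.
        e_share = max(expected / expected_total, eps)
        a_share = max(actual / actual_total, eps)
        psi += (a_share - e_share) * math.log(a_share / e_share)
    return psi


def demographic_parity_difference(outcomes_by_group):
    """Gap between the highest and lowest positive-outcome rate.
    outcomes_by_group maps a group name to (positive_count, total_count)."""
    rates = [pos / total for pos, total in outcomes_by_group.values()]
    return max(rates) - min(rates)


# Made-up numbers for illustration only.
baseline_bins = [50, 50]
current_bins = [90, 10]
print("PSI:", round(population_stability_index(baseline_bins, current_bins), 3))

outcomes = {"group_a": (50, 100), "group_b": (30, 100)}
print("Parity gap:", round(demographic_parity_difference(outcomes), 3))
```

An AI-SLA would then state the threshold each value must stay under and how often it is measured. Those thresholds are contract-specific and are not implied by the code.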

===== Service Tiers =====

AI-SLAs commonly define tiered service levels:

  * **Standard**: Basic availability guarantees, best-effort model performance, standard support response times
  * **Premium**: Enhanced uptime commitments, guaranteed model performance thresholds, priority support
  * **Enterprise**: Custom SLOs, dedicated model instances, guaranteed retraining schedules, and named support contacts

((Source: [[https://ezel.ai/templates/ai-performance-sla-agreement|Ezel AI — AI Performance SLA Agreement Template]]))
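The tiers above can be represented as a small configuration structure. The sketch below is hypothetical throughout: the field names and every numeric value are placeholders chosen for illustration, not figures taken from the cited template. ''None'' stands for a best-effort commitment with no guaranteed threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceTier:
    name: str
    uptime_target_percent: float
    min_accuracy: Optional[float]  # None means best-effort, no guarantee
    dedicated_instances: bool
    named_support_contact: bool


# Placeholder values; an Enterprise tier would normally be negotiated.
TIERS = {
    "standard": ServiceTier("Standard", 99.5, None, False, False),
    "premium": ServiceTier("Premium", 99.9, 0.95, False, False),
    "enterprise": ServiceTier("Enterprise", 99.9, 0.97, True, True),
}


def guarantees_model_performance(tier):
    """True when the tier commits to a model performance threshold."""
    return tier.min_accuracy is not None


for key, tier in TIERS.items():
    print(key, "guarantees model performance:", guarantees_model_performance(tier))
```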

===== Remedies and Credits =====

When service levels are not met, AI-SLAs typically provide financial credits as the sole remedy. Credit structures are usually tiered based on the severity and duration of the breach. For example, falling below 99.9 percent uptime may trigger a 10 percent credit, while falling below 99.0 percent may trigger a 30 percent credit. ((Source: [[https://cloud.google.com/document-ai/sla|Google Cloud — Document AI SLA]]))
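The tiered credit structure in the example above can be expressed as a simple lookup. This is a sketch that uses only the two illustrative thresholds from this section; the names ''CREDIT_TIERS'' and ''service_credit_percent'' are hypothetical, and an actual agreement defines its own schedule.

```python
# (uptime threshold in percent, credit in percent of the monthly fee),
# taken from the illustrative example in the text.
CREDIT_TIERS = [(99.9, 10), (99.0, 30)]


def service_credit_percent(uptime_percent, tiers=CREDIT_TIERS):
    """Return the credit for the lowest threshold the uptime fell below."""
    # Sorting ascending checks the most severe breach first.
    for threshold, credit in sorted(tiers):
        if uptime_percent < threshold:
            return credit
    return 0  # commitment met, no credit owed


for uptime in (99.95, 99.5, 98.5):
    print(f"{uptime}% uptime -> {service_credit_percent(uptime)}% credit")
```

Checking the lowest threshold first ensures that a severe breach receives the larger credit instead of matching the milder tier.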

===== Exclusions =====

Common exclusions from AI-SLA calculations include:

  * Planned maintenance with advance notice
  * Force majeure events
  * Customer-caused issues (malformed inputs, exceeding rate limits)
  * Third-party service outages beyond provider control
  * Beta or preview features not covered by production SLAs
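One way to apply the exclusions listed above is to filter incidents before downtime is totaled. The sketch below assumes each incident has already been labeled with a cause. The cause labels, the ''Incident'' type, and the sample data are hypothetical, and who assigns the cause is itself something a contract needs to settle.

```python
from dataclasses import dataclass

# Hypothetical cause labels that never count toward SLA downtime.
ALWAYS_EXCLUDED = {
    "force_majeure",
    "customer_caused",
    "third_party_outage",
    "beta_feature",
}


@dataclass(frozen=True)
class Incident:
    cause: str
    minutes: float
    advance_notice: bool = False


def counts_toward_sla(incident):
    """Decide whether an incident's downtime is charged against the SLA."""
    if incident.cause in ALWAYS_EXCLUDED:
        return False
    if incident.cause == "planned_maintenance":
        # Maintenance is excluded only when advance notice was given.
        return not incident.advance_notice
    return True


def countable_downtime_minutes(incidents):
    return sum(i.minutes for i in incidents if counts_toward_sla(i))


incidents = [
    Incident("planned_maintenance", 60, advance_notice=True),
    Incident("planned_maintenance", 20),  # no notice, so it counts
    Incident("customer_caused", 15),
    Incident("model_server_crash", 12),
]
print(countable_downtime_minutes(incidents))
```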

===== Best Practices =====

  * Define SLOs (Service Level Objectives) before formalizing SLAs to ensure targets are both ambitious and attainable ((Source: [[https://sparkco.ai/blog/mastering-slos-and-slas-for-ai-agents-in-2025|Sparkco — Mastering SLOs and SLAs for AI Agents]]))
  * Embed SLO checks into CI/CD pipelines to catch regressions before deployment
  * Include model retraining and update schedules as contractual commitments
  * Specify data handling and privacy obligations within the SLA
  * Define escalation procedures for AI-specific incidents such as model poisoning or adversarial attacks
  * Align AI-SLA metrics with regulatory requirements such as the EU AI Act's accuracy and robustness mandates
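The practice of checking SLOs inside a CI/CD pipeline can be sketched as a gate that compares a candidate model's evaluation metrics with target values and reports a failing exit code on any violation. The metric names, the target values, and the shape of the evaluation report are all assumptions made for this example.

```python
# Hypothetical SLO targets: ("min", x) means the metric must be at least x,
# ("max", x) means it must not exceed x.
SLO_TARGETS = {
    "accuracy": ("min", 0.95),
    "p99_latency_ms": ("max", 800.0),
    "error_rate": ("max", 0.01),
}


def slo_violations(metrics, targets=SLO_TARGETS):
    """Return a human-readable message for every target that is not met."""
    violations = []
    for name, (kind, limit) in targets.items():
        value = metrics.get(name)
        if value is None:
            violations.append(f"{name}: metric missing from evaluation report")
        elif kind == "min" and value < limit:
            violations.append(f"{name}: {value} is below the minimum {limit}")
        elif kind == "max" and value > limit:
            violations.append(f"{name}: {value} exceeds the maximum {limit}")
    return violations


def gate_exit_code(metrics):
    """Return 0 when every SLO holds, 1 otherwise, printing each violation."""
    problems = slo_violations(metrics)
    for problem in problems:
        print("SLO violation:", problem)
    return 1 if problems else 0


# Made-up evaluation report for a candidate model.
candidate = {"accuracy": 0.93, "p99_latency_ms": 640.0, "error_rate": 0.004}
exit_code = gate_exit_code(candidate)  # a pipeline step could pass this to sys.exit()
```

Treating a missing metric as a violation is a deliberate choice in this sketch: a report that silently omits a metric would otherwise pass the gate.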

===== See Also =====

  * [[ai_accountability_mandates|AI Accountability Mandates]]
  * [[hitl_governance|Human-in-the-Loop (HITL) Governance]]
  * [[data_poisoning_insurance|What Is a Data-Poisoning Insurance Policy]]

===== References =====