Using Azure API Management Circuit Breaker and Load Balancing with Azure OpenAI Service
Load balance requests across multiple instances of the Azure OpenAI Service using round-robin load balancing. This distributes the load across different resources and regions and improves the availability and performance of your service.
Implement the circuit breaker pattern to protect your backend service from being overwhelmed by excessive requests. This helps prevent cascading failures and improves the stability and resiliency of your service. You configure the circuitBreaker property on the backend resource and define rules for tripping the breaker: the number or percentage of failure conditions within a defined time interval, and the range of status codes that count as failures.
Circuit breaker pattern
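// The symbolic names (aoaiBackend1, apimService) and the API version are assumptions.
resource aoaiBackend1 'Microsoft.ApiManagement/service/backends@2023-05-01-preview' = {
  parent: apimService // assumed: an existing API Management service resource
  name: 'aoai-backend-1'
  properties: {
    url: 'https://<openai endpoint>'
    protocol: 'http'
    circuitBreaker: {
      rules: [
        {
          failureCondition: {
            count: 1 // number of failures that trips the breaker
            errorReasons: [
              'Server errors'
            ]
            interval: 'PT10S' // ISO 8601 duration: 10-second window
            statusCodeRanges: [
              {
                min: 429 // count HTTP 429 (Too Many Requests) as a failure
                max: 429
              }
            ]
          }
          name: 'myBreakerRule'
          tripDuration: 'PT10S' // keep the breaker tripped for 10 seconds
        }
      ]
    }
  }
}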
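To make the rule's behavior concrete, here is a minimal Python sketch of a breaker using the same numbers as the configuration above (one 429 within 10 seconds trips the breaker for 10 seconds). It is a toy model and not how API Management is implemented: the class name, the sliding-window bookkeeping, and the "at least `count` failures" comparison are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CircuitBreaker:
    """Toy model of one rule: trip after `count` failures within `interval` seconds."""

    count: int = 1               # mirrors failureCondition.count
    interval: float = 10.0       # mirrors failureCondition.interval (PT10S)
    trip_duration: float = 10.0  # mirrors tripDuration (PT10S)
    status_min: int = 429        # mirrors statusCodeRanges[0].min
    status_max: int = 429        # mirrors statusCodeRanges[0].max
    _failures: deque = field(default_factory=deque)
    _open_until: float = float("-inf")

    def is_open(self, now: float) -> bool:
        """True while the breaker is tripped and the backend should be skipped."""
        return now < self._open_until

    def record(self, status: int, now: float) -> None:
        """Record a response; trip when enough failures fall inside the window."""
        if not (self.status_min <= status <= self.status_max):
            return  # status is outside the failure range, nothing to count
        self._failures.append(now)
        # Drop failures that are older than the window (assumed to be sliding).
        while self._failures and now - self._failures[0] > self.interval:
            self._failures.popleft()
        if len(self._failures) >= self.count:
            self._open_until = now + self.trip_duration
            self._failures.clear()


breaker = CircuitBreaker()
breaker.record(200, now=0.0)            # success: breaker stays closed
closed_after_ok = breaker.is_open(0.0)
breaker.record(429, now=1.0)            # a single 429 is enough with count = 1
open_after_429 = breaker.is_open(5.0)   # still inside the 10-second trip
closed_again = breaker.is_open(11.0)    # trip ended at t = 11
```

With `count: 1`, a single throttled response takes the backend out of rotation for the full trip duration. That is aggressive, but it suits a setup where another Azure OpenAI instance can absorb the traffic.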
Load balance requests to multiple instances of the Azure OpenAI Service
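// The symbolic names (aoaiLbPool, apimService) and the API version are assumptions.
resource aoaiLbPool 'Microsoft.ApiManagement/service/backends@2023-05-01-preview' = {
  parent: apimService // assumed: an existing API Management service resource
  name: 'aoai-lb-pool'
  properties: {
    description: 'Load balance openai instances'
    type: 'Pool'
    protocol: 'https'
    url: 'https://does-not-matter' // placeholder; requests go to the pool members
    pool: {
      services: [
        {
          id: '/backends/aoai-backend-1'
        }
        {
          id: '/backends/aoai-backend-2'
        }
      ]
    }
  }
}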
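The effect of the pool can be pictured with a small Python sketch of round-robin selection that skips backends whose circuit breaker is currently tripped. This is an illustrative model only: the `RoundRobinPool` class and its `unavailable` parameter are hypothetical and do not correspond to any API Management API.

```python
class RoundRobinPool:
    """Toy round-robin selector over backend ids, skipping unavailable ones."""

    def __init__(self, backend_ids):
        self.backend_ids = list(backend_ids)
        self._next = 0  # index of the backend to try first on the next pick

    def pick(self, unavailable=frozenset()):
        """Return the next backend id in rotation, or None if none is available."""
        for _ in range(len(self.backend_ids)):
            candidate = self.backend_ids[self._next]
            self._next = (self._next + 1) % len(self.backend_ids)
            if candidate not in unavailable:
                return candidate
        return None


pool = RoundRobinPool(["/backends/aoai-backend-1", "/backends/aoai-backend-2"])

# With both backends healthy, requests alternate between them.
rotation = [pool.pick() for _ in range(4)]

# While backend 1 is tripped, every request goes to backend 2.
degraded = [pool.pick(unavailable={"/backends/aoai-backend-1"}) for _ in range(2)]

# If every backend is tripped, there is nothing left to route to.
none_left = pool.pick(unavailable=set(pool.backend_ids))
```

The two features work together here: the breaker decides which backends are unavailable, and the pool spreads requests over whatever remains.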
Limitations
Microsoft Tech Community – Latest Blogs