Azure OpenAI Insights: Monitoring AI with Confidence
Getting Started: Step by Step
Step 1: Download the workbook from here.
Step 2: Import the workbook into your Azure Monitor workspace. Here is an external guide on how to import workbooks into Azure Monitor. Alternatively, you can use this repo for additional instructions.
Step 3 (optional): Enable diagnostic settings for your Azure OpenAI resource. This allows you to view additional dimensions and logs in the workbook. More on the level of detail available later in this post.
Step 4: Explore the workbook.
Please check our repository for further enhancements, known issues, and more. We hope to hear from you via issues and stars.
Workbook Overview
Monitor
HTTP requests – by multiple dimensions: model name & version, status code, model deployment name, operation name, API name, and region.
Token-based usage – multiple metrics: Processed Inference Tokens, Processed Prompt Tokens, Generated Completion Tokens, and Active Tokens, displayed by dimensions such as model name and model deployment name.
PTU Utilization – by multiple dimensions: model name & version, streaming type, and model deployment name.
Fine-tuning – the ‘Processed FineTuned Training Hours’ metric by two dimensions: model name and model deployment name.
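To make the dimensional split concrete, here is a minimal Python sketch of how such metric values roll up by a single dimension, the way the workbook charts do. The metric records, deployment names, and dimension keys below are hypothetical stand-ins for what Azure Monitor actually returns:

```python
from collections import defaultdict

# Hypothetical metric records: (metric name, dimensions, value).
records = [
    ("Processed Prompt Tokens", {"ModelDeploymentName": "gpt4-prod", "ModelName": "gpt-4"}, 1200),
    ("Processed Prompt Tokens", {"ModelDeploymentName": "gpt35-dev", "ModelName": "gpt-35-turbo"}, 800),
    ("Processed Prompt Tokens", {"ModelDeploymentName": "gpt4-prod", "ModelName": "gpt-4"}, 300),
]

def sum_by_dimension(records, metric, dimension):
    """Sum one metric's values, split by a single dimension."""
    totals = defaultdict(int)
    for name, dims, value in records:
        if name == metric:
            totals[dims[dimension]] += value
    return dict(totals)

usage = sum_by_dimension(records, "Processed Prompt Tokens", "ModelDeploymentName")
```

The same grouping applied to a different dimension key (e.g. ModelName) yields the per-model view instead of the per-deployment view.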
Insights
Insights Overview – an aggregated view of requests by:
Model name
Model deployment name
Average duration (in milliseconds)
API operation name
Figure 5: Insights Overview – a more aggregated view
By Caller IP – request/response counts (by model name, model deployment name & operation name) and average duration.
All Logs
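The average-duration breakdown above amounts to grouping request logs by (model, deployment, operation) and averaging the duration column. A minimal sketch, with hypothetical log rows loosely modeled on what AOAI diagnostic logs contain:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical request-log rows; field names are illustrative.
logs = [
    {"model": "gpt-4", "deployment": "gpt4-prod", "operation": "ChatCompletions_Create", "duration_ms": 900},
    {"model": "gpt-4", "deployment": "gpt4-prod", "operation": "ChatCompletions_Create", "duration_ms": 1100},
    {"model": "gpt-35-turbo", "deployment": "gpt35-dev", "operation": "Completions_Create", "duration_ms": 400},
]

def avg_duration(logs):
    """Average duration per (model, deployment, operation) group."""
    grouped = defaultdict(list)
    for row in logs:
        key = (row["model"], row["deployment"], row["operation"])
        grouped[key].append(row["duration_ms"])
    return {key: mean(values) for key, values in grouped.items()}
```

In the workbook itself this grouping is done with a Kusto query in Log Analytics rather than in application code.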
Why? Activating Azure OpenAI Monitoring: Cognitive Services, Metrics, and Diagnostics
Resource Allocation: As an ISV operator, monitoring and controlling the usage of cognitive services across tenants is vital for fair resource distribution.
Billing Accuracy: Keeping track of each tenant’s service consumption is crucial for accurate billing and service verification.
Monetization Strategy: For ISVs, monetizing cognitive service usage is key to recovering operational costs and maintaining profitability.
Usage Limits: Setting limits on service access for each tenant helps in preventing resource monopolization and ensuring service availability for all.
Data Segregation: Ensuring strict data segregation between tenants is paramount for maintaining privacy and preventing data leakage.
Metrics and Documentation: Having access to detailed documentation on AOAI metrics, error codes, and rate limits is essential for effective system integration.
Comprehensive Metrics: Access to extensive metrics like deployment names and hosting hours is crucial for managing usage and performance of cognitive services effectively.
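The billing-accuracy and usage-limit points above boil down to per-tenant token accounting with a cap. A minimal sketch of that bookkeeping, where the cap value and tenant IDs are illustrative (real limits would map to the TPM/RPM quotas on the Azure OpenAI deployment):

```python
from collections import defaultdict

class TenantUsageTracker:
    """Per-tenant token accounting with a simple usage cap (illustrative)."""

    def __init__(self, token_cap):
        self.token_cap = token_cap
        self.used = defaultdict(int)  # tokens consumed per tenant

    def record(self, tenant_id, tokens):
        """Record usage; reject requests that would exceed the tenant's cap."""
        if self.used[tenant_id] + tokens > self.token_cap:
            return False  # over the cap: reject, queue, or bill at a higher tier
        self.used[tenant_id] += tokens
        return True
```

Rejected requests could instead be queued or surfaced to the tenant as a quota error, depending on the ISV's monetization strategy.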
Azure OpenAI Service Monitoring: Azure OpenAI Service Monitoring Guide details how to use Azure Monitor tools for tracking the availability, performance, and operation of Azure OpenAI Service resources. It covers different monitoring data types such as platform metrics, resource logs, and activity logs, explaining their collection and storage via diagnostic settings. The guide highlights out-of-box dashboards with categories like HTTP Requests and PTU Utilization, and delves into using the Kusto query language in Log Analytics for complex data queries. Additionally, it provides insights into creating alerts based on various monitoring data and outlines best practices and use cases for proactive notification, making it an essential resource for efficient Azure OpenAI Service management.
Azure OpenAI Service Overview: Understanding Azure OpenAI Service offers a comprehensive look at Microsoft’s Azure OpenAI Service, which grants access to OpenAI’s advanced language models like GPT-4, GPT-4 Turbo with Vision, and GPT-3.5-Turbo. The service is accessible via REST APIs, Python SDK, or a web-based interface and is tailored for customers with established partnerships with Microsoft, focusing on lower-risk applications and adherence to responsible AI principles. Key features include the Completions Endpoint for generating text completions from prompts, and the introduction of the DALL-E and Whisper models, which are in preview for generating images from text and transcribing or translating speech. The page also guides new users on starting with Azure OpenAI, including creating an Azure OpenAI resource, deploying models, and crafting effective prompts, making it a vital resource for anyone looking to leverage these cutting-edge AI capabilities.
How? Approaches to Provision Azure OpenAI Services for ISVs and Enterprises
Reuse: Utilizing Existing Azure Monitoring Tools
Overview: The ‘Reuse’ strategy focuses on leveraging existing Azure tools for monitoring and diagnostics, such as Azure Monitor, Azure Metrics, and diagnostic logs.
Detailed View: These tools provide detailed insights into the usage of Azure OpenAI services. By reusing these tools, ISVs can gain a comprehensive view of service utilization, performance metrics, and operational health without the need for extensive custom development.
Integration and Customization: Azure Monitor and workbooks are built into the Azure portal, which makes this approach both cost-effective and time-saving.
Build: Crafting a Custom Solution
Overview: This approach involves ISVs developing their own custom tools tailored to their specific requirements for controlling and monitoring Azure OpenAI services.
Considerations: When building a custom solution, ISVs must consider the integration complexity, development cost, and the ongoing maintenance. This route offers maximum flexibility and control but requires significant investment in development resources.
Leveraging Existing Platforms/Tools: While the specifics of building custom tools are beyond the scope of this discussion, it’s worth noting that these tools can often be built on top of existing platforms or frameworks, enhancing efficiency and reducing development time.
Decision Factors
Balancing Flexibility and Resource Investment: The choice between building custom tools or reusing existing Azure tools depends on several factors, including the desired level of customization, available resources, and the specific needs of the ISV or enterprise.
Scalability and Future Growth: Considerations should also include scalability and the ability to adapt to future changes in Azure OpenAI services and the broader AI landscape.
Reuse Strategies in Azure OpenAI Provisioning
Unique Deployment Names for Each Customer
Overview: In this approach, ISVs assign a unique deployment name for each customer, with individualized settings including TPM (tokens per minute) and RPM (requests per minute). This customization allows for more precise control over how each customer can utilize the service.
Controlled Management: By having distinct deployment names, ISVs can fine-tune the service parameters per customer. This ensures that each customer’s usage stays within the prescribed limits, helping to manage resource allocation effectively and prevent overutilization.
Benefits: This method delegates significant control measures to the Azure platform, reducing the management burden on the ISV. It’s particularly suitable for scenarios where customer-specific data segregation and usage monitoring are critical, and where each customer’s capacity needs are within the overall model limits.
Considerations: While this setup simplifies management for the ISV, it requires careful planning and setup for each customer to ensure that their specific needs are met within the parameters of TPM and RPM.
Figure 10: Configuring deployment names
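The per-customer configuration described above can be sketched as a small data structure. The naming scheme, default model, and TPM/RPM values here are hypothetical; actual quotas are set on the deployment in Azure:

```python
from dataclasses import dataclass

@dataclass
class DeploymentConfig:
    """Illustrative per-customer deployment settings."""
    deployment_name: str
    model: str
    tpm: int  # tokens-per-minute quota
    rpm: int  # requests-per-minute quota

def deployment_for(customer_id, model="gpt-4", tpm=30_000, rpm=180):
    """Derive a unique, per-customer deployment configuration."""
    return DeploymentConfig(
        deployment_name=f"{model}-{customer_id}",  # unique name per customer
        model=model,
        tpm=tpm,
        rpm=rpm,
    )
```

Because each customer gets a distinct deployment name, Azure's own metrics (which carry the deployment name as a dimension) become per-customer metrics for free.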
Multiple Endpoints for Increased Capacity
Overview: Alternatively, ISVs can use multiple endpoints to enhance capacity. In this scenario, each ISV customer uses the same endpoint, and the ISV is responsible for load balancing and monitoring individual customer usage.
Challenges: This approach requires the ISV to actively manage load balancing and usage tracking, which can be complex but offers greater flexibility in resource allocation and scalability.
Usage Monitoring: The ISV must implement robust systems to accurately monitor and count usage per customer, ensuring fair billing and resource distribution.
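A minimal sketch of the load balancing and per-customer usage counting the ISV takes on in this model. The endpoint URLs are placeholders, and a production balancer would also track endpoint health and remaining capacity rather than cycling blindly:

```python
from collections import defaultdict
from itertools import cycle

class EndpointBalancer:
    """Round-robin over several endpoints while counting per-customer usage."""

    def __init__(self, endpoints):
        self._rr = cycle(endpoints)      # naive round-robin rotation
        self.usage = defaultdict(int)    # tokens consumed per customer

    def route(self, customer_id, tokens):
        """Attribute the request's tokens to the customer, pick an endpoint."""
        self.usage[customer_id] += tokens
        return next(self._rr)
```

The usage dictionary is what feeds the "fair billing and resource distribution" requirement: it must be accurate even though all customers share the same endpoints.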
Hybrid Approach
Possibility: A third alternative could be a hybrid approach, combining elements of both strategies. This could involve using unique deployment names for certain customers with specific needs while employing multiple endpoints for others to scale capacity.
Flexibility: This approach offers the greatest flexibility, allowing ISVs to tailor the provisioning strategy to the specific needs and usage patterns of each customer.
Management Complexity: While offering adaptability, this approach can increase management complexity and resource requirements for the ISV.
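The hybrid routing decision can be reduced to a simple policy function. The threshold and customer attributes below are illustrative assumptions, not prescribed values:

```python
def provisioning_strategy(customer):
    """Hybrid sketch: dedicated deployment for customers needing strict
    isolation or high capacity; shared, load-balanced endpoints otherwise.
    The 50k TPM threshold and attribute names are illustrative."""
    if customer.get("requires_isolation") or customer.get("expected_tpm", 0) > 50_000:
        return "dedicated-deployment"
    return "shared-endpoints"
```

In practice the policy would also weigh data-segregation requirements and the regional model quotas available to the ISV.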
Conclusion and Next Steps