Announcing key updates to Responsible AI features and content filters in Azure OpenAI Service
We’re excited to announce the release of new Responsible AI features and content filter improvements in Azure OpenAI Service (AOAI) and AI Studio, spanning from new unified content filters, to customizable content filters for DALL-E and GPT-4 Turbo Vision deployments, safety system message templates in the AOAI Studio, asynchronous filters now available for all AOAI customers, and updates to protected material and image generation features.
Unified content filters
We are excited to announce that a new unified content filter experience is coming soon to Azure AI. This update will streamline the process of setting up content filters across different deployments and various products such as Azure AI Studio, AOAI, and Azure AI Content Safety for a more uniform user experience. Content filters enable users to effectively block harmful content, whether it’s text, images, or multimodal forms. With this unified approach, users have the flexibility to establish a content filtering policy tailored to their particular needs and scenarios.
Configurable content filters for DALL-E and GPT-4 Turbo with Vision GA
The integrated content filtering system in AOAI provides Azure AI Content Safety content filters by default, and they detect and the output of harmful content. Furthermore, we also provide a range of different content safety customization options for the AOAI GPT model series. Today, we are releasing configurable content filters for DALL-E 2 and 3, and GPT-4 Turbo Vision GA deployments, enabling content filter customization based on specific use case needs. Customers can configure input and output filters, adjust severity levels for the content harms categories and add additional applicable RAI models and capabilities such as Prompt Shields and custom blocklists. Customers who have been approved for modified content filters can turn the content filters off or use annotate mode to return annotations via API response, without blocking content. Learn more.
Asynchronous Filters
In addition to the default streaming experience in AOAI – where completions are vetted before they are returned to the user, or blocked in case of a policy violation – we’re excited to announce that all customers now have access to the Asynchronous Filter feature. Content filters are run asynchronously, and completion content is returned immediately with a smooth and fast token-by-token streaming experience. No content is buffered, which allows for a faster streaming experience at zero latency associated with content safety. Customers must be aware that while the feature improves latency, it’s a trade-off against the safety and real-time vetting of smaller sections of model output. Because content filters are run asynchronously, content moderation messages and policy violation signals are delayed, which means some sections of harmful content that would otherwise have been filtered immediately could be displayed to the user. Content that is retroactively flagged as protected material may not be eligible for Customer Copyright Commitment coverage. Read more about Asynchronous Filter and how to enable it.
Safety System Messages
System messages for generative AI models are an effective strategy for additional AI content safety. The AOAI Studio and AI Studio are now supporting safety system message templates directly in the playground that can be quickly tested and deployed, covering a range of different safety related topics such as preventing harmful content, jailbreak attempts, as well as grounding instructions. Learn more.
Protected Materials
Protections for Azure OpenAI GPT-based models
In November 2023, Microsoft announced the release of Protected Material Detection for Text in AOAI and Azure AI Content Safety. Soon, this model will upgrade to version 2.0 and identifies content that highly resembles pre-existing content. This update also prevents attempts to subvert the filter by asking for known modifications of the original text, e.g. the original text with repeated characters or more whitespace. Soon, the Protected Material Detection for Code model version 2.0 will update its attribution feature to flag 2023 public GitHub repository code from flagging 2021 repository code.
Updated Features in Azure OpenAI Service DALL-E
AOAI now prevents DALL-E from generating works that closely resemble certain types of known creative content, such as studio characters and contemporary artwork. It does this by re-interpreting the text prompt to DALL-E, removing keywords or phrases associated with creative content categories. Below are examples showing image outputs before and after the modification is applied. Please note that the DALL-E model is non-deterministic and so is likely not going to generate the same image with the same prompt each time.
New Responsible AI features in Azure AI Content Safety & Azure AI Studio
Custom Categories
This week at Build 2024 we also previewed other important features for responsible AI, one of which will be coming soon to Azure OpenAI Service: Custom Categories. Learn more about Custom Categories.
Get started today
Visit Azure OpenAI Service Studio: oai.azure.com
Visit Azure AI Studio: ai.azure.com
Visit Azure AI Content Safety Studio: aka.ms/contentsafetystudio
Microsoft Tech Community – Latest Blogs –Read More