Securing Multi-Cloud Gen AI workloads using Azure Native Solutions
Note: This series is part of “Security using Azure Native services” series and assumes that you are or planning to leverage Defender for Cloud, Defender XDR Portal, and Azure Sentinel.
Introduction
AI Based Technology introduces a new set of security risks that may not be comprehensively covered by existing risk management frameworks. Based on our experience, customers often only consider the risks related to the Gen AI models like OpenAI or Anthropic. Thereby, not taking a holistic approach that cover all aspects of the workload.
This article will help you:
Understand a typical multi-cloud Gen AI workload pattern
Articulate the technical risks exists in the AI workload
Recommend security controls leveraging Azure Native services
We will not cover Data Security (cryptography, regulatory implications etc.), model specific issues like Hallucinations, deepfakes, privacy, toxicity, societal bias, supply chain security, attacks that leverage Gen AI capabilities to manifest such as Disinformation, Deepfakes, Financial Fraud etc. Instead, we aim to provide guidance on architectural security controls that will enable secure:
Configuration of AI workload
Operation of the workload
This is a two-part series:
Part 1: Provides a framework to understand the threats related to Gen AI workloads holistically and an easy reference to the native security solutions that help mitigate. We also provide sample controls using leading industry frameworks.
Part 2: Will dive deeper into the AI shared responsibility model and how that overlaps with your design choices
Threat Landscape
Let’s discuss some common threats:
Insider abuse: An insider (human or machine) sending sensitive / proprietary information to a third party GenAI model
Supply chain poisoning: Compromise of a third-party GenAI model (whether this is a SaaS or binary llm models developed by third party and downloaded by your organization)
System abuse: Manipulating the model prompts to mislead the end user of the model
Over privilege: Granting unrestricted permissions and capability to the model thereby allowing the model to perform unintentional actions
Data theft/exfiltration: Intentional or unintentional exfiltration of the proprietary models, prompts, and model outputs
Insecure configuration: Not following the leading practices when architecting and operating your AI workload
Model poisoning: Tampering with the model itself to affect the desired behavior of the model
Denial of Service: Impacting the performance of the model with resource intensive operations
We will discuss how these threats apply in a common architecture.
Reference architecture
Fig. Gen-AI cloud native workload
Let’s discuss each step so we can construct a layered defense:
Assuming you are following cloud native architecture patterns, your developer will publish all the application and infrastructure code in an Azure DevOps repo
The DevOps pipeline will then Create a container image
Pipeline will also set up respective API endpoints in Azure API management
Pipeline will deploy the image with Kubernetes manifests (note that he secrets will stored out of bound in Azure Key Vault)
User access an application that leverages GenAI (Open AI for Azure and Anthropic in AWS)
Depending on the API endpoint requested, APIM will direct the request to the containerized application running in cloud native Kubernetes platforms (AKS or EKS)
The application uses API credentials stored in KeyVault
The application makes requests to appropriate Gen AI service
The results are stored in a storage service and are reported back to the user who initiated step 5 above
Each cloud native service stores the diagnostic logs in a centralized Log Analytics Workspace (LAW)
Azure Sentinel is enabled on the LAW
The subscription where the workload is running is protected by Microsoft Defender for Cloud. Entra ID is the identity provider for this multi-cloud workload.
Layered defense using native services
If you were to follow the Kill Chain framework, in many cases the initial attack vectors may be what we have seen earlier like spear phishing, exploiting poorly written application (XSS etc.), privilege escalation. However, these attacks will lead to gen AI specific scenarios:
Injecting instructions to trick the application to query private datastores by removing the prompt and conversation history. The classic example would be Cross Site Request Forgery (CSRF) where the attacker will trick the browser (or client application) to perform this query. Without a Gen AI aware outbound filter and inbound checking this would hard to detect or prevent.
Privilege escalation, exploiting the privileges that user might have granted to a third-party app or plugin, which can query LLM model. In this scenario, an attacker can get hold of user’s browser running the plugin or the desktop client and then provide a prompt to exfiltrate the information. Like 1 above, in this case you will need Gen AI aware controls at different layers
Jailbreaking the AI safety mechanisms to trick the LLM to perform restricted capabilities like malware generation, phishing etc.
Distributed denial of service by making the model unavailable this can be via unrestricted queries
Supply chain poisoning by using unapproved images, SBOM components, in your application etc.
Hopefully, this helps you realize that you need a layered defense to mitigate, like so:
Microsoft Native Security Solutions and security controls
Microsoft’s security solutions provide deep capabilities across each of the above layers. Let’s look at some specific controls, where would they apply in the reference architecture, and the corresponding native solution. To provide a standardized framework, we leveraged leading practices provided by NIST, OWASP, ISACs and other bodies to derive the controls.
Please note that in this article we intend to provide a foundational knowledge as a result we are focusing on the high-level description of different native security components that you may be able to leverage. In subsequent articles we plan to focus specific use cases.
#
Control description
Ref architecture step
Native solution
Description
1.
Verify that systems properly handle queries that may give rise to inappropriate, malicious, or illegal usage, including facilitating manipulation, extortion, targeted impersonation, cyber-attacks, and weapons creation:
– Prompt injection (OWASP LLM01)
– Insecure Output Handling (OWASP LLM02)
– Sensitive Information Disclosure (LLM06)
– Insecure Plugin Design (LLM07)
5, 6, 8,9
Azure WAF, APIM, Defender for API, Defender for Containers,
Defender for Key Vault,
Defender for AI,
Defender for Database,
Microsoft Sentinel
Azure WAF may act as the first line of defense for injections made via API or HTTP requests like XSS reducing the likelihood and impact of LLM01 and LLM02
Defender for API can help identify risky APIs that also APIs that might respond with Sensitive Data released by the model in addition to alerting on anomalies. This is specifically useful for LLM06
Defender for Containers might detect if an attacker is trying to manipulate the running Gen AI application to change the behavior to get access to the Gen AI services
Defender for AI might detect several Gen AI specific anomalies related to the usage pattern like Jailbreak (LLM07), Unintended sensitive data disclosure (LLM06) etc.
Azure AI content safety has Prompt Shield for User Prompts this feature natively targets User Prompt injection attacks, where users deliberately exploit system vulnerabilities to elicit unauthorized behavior from the LLM (LLM01)
2.
Regularly assess and verify that security measures remain effective and have not been compromised:
– Prompt injection (OWASP LLM01)
– Insecure Output Handling (OWASP LLM02)
– Model Denial of Service (LLM04)
– Supply chain vulnerabilities (LLM05)
– Insecure Plugin Design (LLM07)
– Excessive agency (OWASP LLM08)
– Overreliance (LLM09)
– Model Theft (OWASP LLM10)
All
Defender CSPM, Defender for API, Defender for Containers,
Defender for Key Vault,
Defender for AI,
Microsoft Sentinel
DevOps Security, which is part of Defender CSPM, allows you to scan your Infrastructure as Code templates as well as your application code (via third party solution or GHAS). This enables you to prevent insecure deployment reducing the likelihood of LLM05 and LLM07
Defender CSPM has several leading practices driven recommendations for each step of the reference architecture to enable secure operations of AI workloads reducing the likelihood of LLM01
Defender for AI specifically have alerting targeted at LLM06, LLM07 (as described above) in addition to Model Theft (LLM10)
Defender for Database also has protections to prevent for LLM07 to notify of the harmful impacts of raw SQL or programming statements instead of Parameters
Monitoring the recommendations of Defender CSPM may reduce your chances of accidental exposure as well simulating the alerts from Workload protection capabilities will help you validate if you have appropriate response plans in place.
3.
Measure the rate at which recommendations from security checks and incidents are implemented. Assess how quickly the AI system can adapt and improve based on lessons learned from security incidents and feedback.
All
Defender CSPM, Microsoft Sentinel
Secure Score overtime and Governance workbook within Defender CSPM can help you identify how the security of your AI workload is trending.
In addition, you can run workbooks in Microsoft Sentinel to assess your response capabilities and specifically Azure Open AI related issues. There are also workbooks that help you identify visualize the events from the WAF
4.
Verify fine-tuning does not compromise safety and security controls.
– Prompt injection (OWASP LLM01)
– Insecure Output Handling (OWASP LLM02)
– Model Denial of Service (LLM04)
– Insecure Plugin Design (LLM07)
– Excessive agency (OWASP LLM08)
1,2,3,5,6,8
Defender CSPM, Defender for API, Defender for Containers,
Defender for AI,
Microsoft Sentinel
As mentioned in #1 above, there are native safeguards for each of these Gen AI specific threats. You may want to pre-emptively review the security alerts and recommendations to develop responses that you can deploy via Logic App and automate the response using Workflow Automation. There are several starter templates available in Defender for Cloud’s GitHub.
5.
Conduct adversarial testing at a regular cadence to map and measure GAI risks, including tests to address attempts to deceive or manipulate the application of provenance techniques or other misuses. Identify vulnerabilities and understand potential misuse scenarios and unintended outputs.
All
Defender CSPM, Defender for API, Defender for Containers, Defender for Key Vault
Defender for AI,
Microsoft Sentinel
As specified in #4, your Red Team can conduct test against your workloads and review the results in Defender for Cloud’s dashboards or in Microsoft Sentinel’s workbooks mentioned under #3
6.
Evaluate GAI system performance in real-world scenarios to observe its behavior in practical environments and reveal issues that might not surface in controlled and optimized testing environments.
11
Defender for AI, Microsoft Sentinel
You can review the current security state of the AI workload as mentioned in Microsoft Sentinel’s workbooks Azure Open AI related issues and in Defender for Cloud’s Secure Score overtime, Governance workbook, and current alerts.
7.
Establish and maintain procedures for the remediation of issues which trigger incident response processes for the use of a GAI system and provide stakeholders timelines associated with the remediation plan.
5,6,8,11
Defender CSPM, Defender for API, Defender for Containers, Defender for Key Vault,
Defender for AI,
Microsoft Sentinel
You can generate sample alerts in Defender for Cloud and review the recommended actions that are specific to each alert with each stakeholder. Customers often find it beneficial to do this exercise as part of their Cloud Center of Excellence (CCoE). You may then want to set up automation as suggested under #4 above. The Secure Score overtime and Governance workbook will help you benchmark your progress.
8.
Establish and regularly review specific criteria that warrants the deactivation of GAI systems in accordance with set risk tolerances and appetites.
5,6,11
Defender CSPM, Microsoft Sentinel
As you will use Defender for Cloud as a key component, you can leverage the alert management capabilities to manage the status, set up suppression rules, and risk exemptions for specific resources.
Similarly, in Microsoft Sentinel you can set up custom analytic rules and responses to correspond to your risk tolerance. For example, if you do not want to allow access from specific locations you can set up an analytic rule to monitor for that using Azure WAF events.
Call to action
As you saw above, the native services are more than capable to monitor the security of your Gen AI workloads. You should consider:
Discuss the reference architecture with your CCoE to see what mechanisms you currently have in place to transparently protect your Gen AI workloads
Review if you are already leveraging the security services mentioned in the table. If you are not, consider talking to your Azure Account team to see if you can do a deep dive in these services to explore the possibility of replacement
Work with your CCoE to review the alerts recommended above to determine your potential exposure and relevance. Subsequently develop a response plan.
Consider running an extended Proof of Concept of the Services mentioned above so you can evaluate the advantages of native over best of breed.
Understand that if you must, you can leverage specific capabilities in Defender for Cloud and other Azure Services to develop a comprehensive plan. For example, if you are already using a CSPM solution you might want to leverage foundational CSPM capabilities to detect AI configuration deviations along with Defender for AI workload protection. A natively protected reference architecture might look like so,
Fig. Natively protected Gen AI workload
Microsoft Tech Community – Latest Blogs –Read More