Month: September 2024
Please help, import from Excel
Hi!
I am trying to import from Excel to SQL Server with the wizard. Everything seems to go fine except for the article ID, which is a unique identifier, together with an ID that I leave empty. The Id column comes before RegisterDate.
I get the error below. Can someone please advise me?
The Excel file looks like this:
– Executing (Error)
Messages
Error 0xc02020c5: Data Flow Task 1: Data conversion failed while converting column “Article_Id” (33) to column “Article_Id” (80). The conversion returned status value 2 and status text “The value could not be converted because of a potential loss of data.”.
(SQL Server Import and Export Wizard)
Error 0xc0209029: Data Flow Task 1: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The “Data Conversion 0 – 0.Outputs[Data Conversion Output].Columns[Article_Id]” failed because error code 0xC020907F occurred, and the error row disposition on “Data Conversion 0 – 0.Outputs[Data Conversion Output].Columns[Article_Id]” specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure.
(SQL Server Import and Export Wizard)
Error 0xc0047022: Data Flow Task 1: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component “Data Conversion 0 – 0” (60) failed with error code 0xC0209029 while processing input “Data Conversion Input” (61). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
(SQL Server Import and Export Wizard)
Error message when required data is empty
Hello,
I have two required fields on my bookings page.
If those are empty and I click on “book meeting”, there is no error message.
The bookings page just reloads, and a potential customer is not informed that the meeting IS NOT booked.
Is there any way to show an error message if something isn’t filled out correctly?
Unity Editor Crashing When I Open Mesh Uploader
Hi,
I’ve run into a problem while uploading an environment.
When I click Mesh Toolkit > Environments, the Unity editor crashes (every time).
Resources / Info:
I’m using the versions of the Unity editor and Mesh toolkit listed below.
Unity Editor: v2022.3.34f1
Mesh Tool Kit: v5.2411.210
I’ve downloaded the mesh toolkit – Mesh 101 project from GitHub – microsoft/Mesh-Toolkit-Unity .
I’ve followed the Mesh 101 tutorial from Mesh 101 Overview and setup – Microsoft Mesh | Microsoft Learn .
* If you need the editor log, I can provide it.
Can we recover permanently deleted files in Windows 10?
Hey everyone,
I accidentally deleted some important files on my Windows 10 PC and couldn’t find them in the Recycle Bin. I’m wondering if there’s any way to recover them. I know Windows 10 doesn’t make this easy after files are permanently deleted, but I’ve read that there are a few methods and tools out there that might help.
Have any of you successfully recovered permanently deleted files in Windows 10 using these methods, or are there other solutions I should try? Let me know if you have any recommendations or tips!
Thanks
Account verification has not started and in “pending” status for over a month
Hi, I can’t pass verification; it doesn’t even start. I got an EV certificate and registered for the Windows Hardware Dev Center program.
I talked to support before creating the account, and they said that the EV certificate contact info and Partner Center contact info must match.
When verifying the company for the EV certificate, I used a private email everywhere, as the company does not have a domain or website.
I filled in all available fields in Partner Center to match the EV certificate, except for the company email, for which I had to take one of my old domains and set up mail service on it, because private emails are not accepted. (I did not use words like admin etc. in the email name, as Microsoft documentation requires.)
I received an OTP code to verify the email while editing the contact info, but the verification is still in “pending” status.
I filed 3 appeals with support. One appeal was immediately closed with the wording “Denied for internal reasons in registration in Microsoft AI Cloud Partner Program” – I didn’t even register for this program.
The other 2 appeals have just been hanging for a couple of weeks. I received a message saying “Your issue is under review currently. As soon as we have an update, we will inform you.” and that’s it. No one is responding to my repeated inquiries.
I’ve read all the documentation on registering an account, as well as all the discussions from 2023.
The only thing I can think of is to create a domain with open WHOIS; maybe after that, verification will at least start and reach email verification.
Does anyone have any ideas? Maybe I missed something?
Search Service Application stuck in Paused for: Backup/Restore
I have a test environment which contains two farms:
2× SharePoint Server 2019 Enterprise (Custom Role) + 2× SQL Server 2019. Both farms are located in the same domain, with farm URLs sp.domain.local and spdr.domain.local respectively. The firewall is disabled on all VMs.
I tried to restore the SSA from the SP farm to the SPDR farm using the standard SharePoint backup tools (via Central Administration). For some unknown reason, after the restore process completes, the SSA is stuck in the status “Paused for: Backup/Restore”. At the same time, the backup and restore processes themselves complete without errors or warnings, and the ULS logs are clean. From PowerShell, the Search Service Application itself is in the ‘Online’ status, and all search components are in the ‘Active’ status. However, any action on the SSA adds an ‘External request’ status to it, which remains until the SSA is deleted. If I try an Index Reset, the same status is also displayed for a while in the ‘Administrative Status’ field, while the index itself remains in place.
Over the past week, I’ve tried all the instructions I could find on the Internet:
$SSA.ForceResume($SSA.IsPaused())
#—
$SSA.Pause(); $SSA.Resume()
#—
Resume-SPEnterpriseSearchServiceApplication $ssa
#—
Foreach ($cs in $ContentSources) { $cs.ResumeCrawl() }
I tried stopping the Timer Service and clearing the configuration cache.
Following one article, I tried changing the value 130 to 0 in the ‘MSSStatusChange’ table of the search database. In this case, the SSA switched to the ‘Running’ status, and an Index Reset and Full Crawl were started but never completed (I waited almost a day).
When I created a new SSA, it took the ‘Running’ status and search worked as expected.
I’m not sure whether the July 2024 CU is to blame for the problem, but on test farms without this patch, the search service restores normally.
Any ideas, how I can restore SSA from backup?
How to install Windows 10 on new SSD drive on my PC?
The stock SSD on my Windows 7 PC is almost full, and it is quite small (a 128GB SATA SSD). So I bought a new, faster SSD to replace the old one, and I am planning to install Windows 10 on the new drive, which has 1TB of storage.
My question is: how can I install Windows 10 on the new SSD? I have never done this before and need some suggestions to get the job done. What kind of things should I pay attention to?
Manage multiple tenants
Hi There,
I have around 50 M365 customers to manage. Previously, in each tenant I created an unlicensed Global Admin account (with a .onmicrosoft.com address) that is shared by our team. When a customer needs help, we access it. I understand that Microsoft will force all Global Admins to use MFA, but we have a challenge: each tenant is accessed by multiple support staff from my team, and if we use MFA, it will affect our support efficiency.
Yes, we do have a partner portal, and we did use it to access our customers’ tenants. However, Microsoft now forces an expiry date on the GDAP relationship, which requires us to get consent from our customers by having them click a link.
I just wonder how you all manage this, or which approach you use to support multiple tenants. Any ideas?
Copilot for Outlook error – The thread is too long so we couldn’t consider all of the conversation
Hello Community,
While replying to a mail chain that has many messages in the thread, Draft with Copilot shows the error: “Copilot for Outlook error – The thread is too long so we couldn’t consider all of the conversation history. Make sure to check the draft.”
I tried to find out whether there is any known limitation with Copilot for Outlook, or any fix for it, but couldn’t find a working solution.
Please let me know if you have faced the same issue and found a solution or workaround.
Thank you
OpenAI at Scale: Maximizing API Management through Effective Service Utilization
Scenario
In this blog post, I will demonstrate how to leverage Azure API Management to enhance the resiliency and capacity of your OpenAI Service.
Azure API Management is a tool that assists in creating, publishing, managing, and securing APIs. It offers features like routing, caching, throttling, authentication, transformation, and more.
By utilizing Azure API Management, you can:
Distribute requests across multiple instances of the Azure OpenAI Service using priority-based load balancing, with weight distribution among the backends inside each priority group. This helps spread the load across various resources and regions, thereby enhancing the availability and performance of your service.
Implement the circuit breaker pattern to protect your backend service from being overwhelmed by excessive requests. This helps prevent cascading failures and improves the stability and resiliency of your service. You can configure the circuit breaker property in the backend resource and define rules for tripping the circuit breaker, such as the number or percentage of failure conditions within a specified time frame and a range of status codes indicating failures.
Diagram 1: API Management with circuit breaker implementation.
>Note: Backends in lower priority groups will only be used when all backends in higher priority groups are unavailable because circuit breaker rules are tripped.
Diagram 2: API Management load balancer with circuit breaker in action.
In the following sections, I will guide you through deploying the circuit breaker with API Management and Azure OpenAI services. You can use the same solution with the native OpenAI service.
The GitHub repository for this article can be found at github.com/eladtpro/api-management-ai-policies.
Prerequisites
If you don’t have an Azure subscription, create a free account before you begin.
Use the Bash environment in Azure Cloud Shell. For more information, see Quickstart for Bash in Azure Cloud Shell.
If you prefer to run CLI reference commands locally, install the Azure CLI.
If you’re using a local installation, sign in to the Azure CLI by using the az login command. To finish the authentication process, follow the steps displayed in your terminal. For other sign-in options, see Sign in with the Azure CLI.
If you don’t have an Azure API Management instance, create a new one.
Azure OpenAI services for the backend pool; each service should have the same model deployed with the same name and version across all the services.
Step I: Provision Azure API Management Backend Pool (bicep)
Bicep CLI
Install or Upgrade Bicep CLI.
# az bicep install
az bicep upgrade
Deploy the Backend Pool using Bicep
Login to Azure.
az login
>Important: Update the names of the backend services in the deploy.bicep file before running the next command.
Create a deployment at the resource group scope from the template file, after updating the parameters in the file.
az deployment group create --resource-group <resource-group-name> --template-file <path-to-your-bicep-file> --name apim-deployment
>Note: You can learn more about the Bicep backend resource in Microsoft.ApiManagement service/backends, and about the CircuitBreakerRule.
>Note: The following warning may be displayed when running the above command:
/path/to/deploy.bicep(102,3) : Warning BCP035: The specified "object" declaration is missing the following required properties: "protocol", "url". If this is an inaccuracy in the documentation, please report it to the Bicep Team. [https://aka.ms/bicep-type-issues]
Output:
{
  "id": "<deployment-id>",
  "location": null,
  "name": "apim-deployment",
  "properties": {
    "correlationId": "754b1f5b-323f-4d4d-99e0-7303d8f64695",
    .
    .
    .
    "provisioningState": "Succeeded",
    "templateHash": "8062591490292975426",
    "timestamp": "2024-09-07T06:54:37.490815+00:00",
  },
  "resourceGroup": "azure-apim",
  "type": "Microsoft.Resources/deployments"
}
>Note: To view failed operations, filter operations with the ‘Failed’ state.
az deployment operation group list --resource-group <resource-group-name> --name apim-deployment --query "[?properties.provisioningState=='Failed']"
The following is the deploy.bicep backend circuit breaker and load balancer configuration:
resource apiManagementService 'Microsoft.ApiManagement/service@2023-09-01-preview' existing = {
  name: apimName
}

resource backends 'Microsoft.ApiManagement/service/backends@2023-09-01-preview' = [for (name, i) in backendNames: {
  name: name
  parent: apiManagementService
  properties: {
    url: 'https://${name}.openai.azure.com/openai'
    protocol: 'http'
    description: 'Backend for ${name}'
    type: 'Single'
    circuitBreaker: {
      rules: [
        {
          acceptRetryAfter: true
          failureCondition: {
            count: 1
            interval: 'PT10S'
            statusCodeRanges: [
              {
                min: 429
                max: 429
              }
              {
                min: 500
                max: 503
              }
            ]
          }
          name: '${name}BreakerRule'
          tripDuration: 'PT10S'
        }
      ]
    }
  }
}]
And the part for the backend pool:
resource aoailbpool 'Microsoft.ApiManagement/service/backends@2023-09-01-preview' = {
  name: 'openaiopool'
  parent: apiManagementService
  properties: {
    description: 'Load balance openai instances'
    type: 'Pool'
    pool: {
      services: [
        {
          id: '/backends/${backendNames[0]}'
          priority: 1
          weight: 1
        }
        {
          id: '/backends/${backendNames[1]}'
          priority: 2
          weight: 1
        }
        {
          id: '/backends/${backendNames[2]}'
          priority: 2
          weight: 1
        }
      ]
    }
  }
}
Step II: Create the API Management API
>Note: The following policy can be used in existing APIs or new APIs. The important part is to set the backend service to the backend pool created in the previous step.
Option I: Add to existing API
All you need to do is add the following set-backend-service and retry policies to activate the load balancer with the circuit breaker module:
<set-backend-service id="lb-backend" backend-id="openaiopool" />
<retry condition="@(context.Response.StatusCode == 429)" count="3" interval="1" first-fast-retry="true">
    <forward-request buffer-request-body="true" />
</retry>
Option II: Create new API
Add new API
Go to your API Management instance.
Click on APIs.
Click on Add API.
Select ‘HTTP’ API.
Give it a name and set the URL suffix to ‘openai’.
>Note: The URL suffix is the path that will be appended to the API Management URL. For example, if the API Management URL is 'https://apim-ai-features.azure-api.net' and the URL suffix is 'openai', the full URL will be 'https://apim-ai-features.azure-api.net/openai'.
Add “catch all” operation
Click on the API you just created.
Click on the ‘Design’ tab.
Click on Add operation.
Set the method to ‘POST’.
Set the URL template to ‘/{*path}’.
Set the name.
Click on ‘Save’.
>Note: The ‘catch all’ operation is intended to match all OpenAI requests; we achieve this by setting the URL template to ‘/{*path}’. For example:
Base URL will be: https://my-apim.azure-api.net/openai
Postfix URL will be: /deployments/gpt-4o/chat/completions?api-version=2024-06-01
The full URL will be: https://my-apim.azure-api.net/openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01
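To make the routing concrete, here is a minimal Python sketch of a client request through the gateway. The gateway name, deployment name, and key are placeholder assumptions, not values from this deployment:

import requests

APIM_BASE = "https://my-apim.azure-api.net/openai"  # hypothetical gateway URL
SUBSCRIPTION_KEY = "<your-apim-subscription-key>"   # APIM subscription key

# The path after the base URL keeps the regular Azure OpenAI REST shape,
# so the catch-all operation forwards it unchanged to the backend pool.
response = requests.post(
    f"{APIM_BASE}/deployments/gpt-4o/chat/completions",
    params={"api-version": "2024-06-01"},
    headers={"api-key": SUBSCRIPTION_KEY},
    json={"messages": [{"role": "user", "content": "Hello!"}]},
)
print(response.status_code, response.json())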
Add the Load Balancer Policy
Select the operation you just created.
Click on the ‘Design’ tab.
Click on ‘Inbound processing’ policy button ‘</>’.
Replace the existing policy with the policy shown below.
Click on ‘Save’.
This policy is set up to distribute requests across the backend pool and retry requests if the backend service is unavailable:
<policies>
    <inbound>
        <base />
        <set-backend-service id="lb-backend" backend-id="openaiopool" />
        <azure-openai-token-limit tokens-per-minute="400000" counter-key="@(context.Subscription.Id)" estimate-prompt-tokens="true" tokens-consumed-header-name="consumed-tokens" remaining-tokens-header-name="remaining-tokens" />
        <authentication-managed-identity resource="https://cognitiveservices.azure.com/" />
        <azure-openai-emit-token-metric namespace="genaimetrics">
            <dimension name="Subscription ID" />
            <dimension name="Client IP" value="@(context.Request.IpAddress)" />
        </azure-openai-emit-token-metric>
        <set-variable name="traceId" value="@(Guid.NewGuid().ToString())" />
        <set-variable name="traceparentHeader" value="@("00" + context.Variables["traceId"] + "-0000000000000000-01")" />
        <set-header name="traceparent" exists-action="skip">
            <value>@((string)context.Variables["traceparentHeader"])</value>
        </set-header>
    </inbound>
    <backend>
        <retry condition="@(context.Response.StatusCode == 429)" count="3" interval="1" first-fast-retry="true">
            <forward-request buffer-request-body="true" />
        </retry>
    </backend>
    <outbound>
        <base />
        <set-header name="backend-host" exists-action="skip">
            <value>@(context.Request.Url.Host)</value>
        </set-header>
        <set-status code="@(context.Response.StatusCode)" reason="@(context.Response.StatusReason)" />
    </outbound>
    <on-error>
        <base />
        <set-header name="backend-host" exists-action="skip">
            <value>@(context.LastError.Reason)</value>
        </set-header>
        <set-status code="@(context.Response.StatusCode)" reason="@(context.LastError.Message)" />
    </on-error>
</policies>
>Important: The main policies performing the load balancing, distributing requests to the backend pool created in the previous step, are the following:
set-backend-service: This policy sets the backend service to the backend pool created in the previous step.
<set-backend-service id="lb-backend" backend-id="openaiopool" />
retry: This policy retries the request if the backend service is unavailable. If the circuit breaker trips, the request is immediately retried against the next available backend service.
>Important: The value of count should be equal to the number of backend services in the backend pool.
<retry condition="@(context.Response.StatusCode == 429)" count="3" interval="1" first-fast-retry="true">
    <forward-request buffer-request-body="true" />
</retry>
Step III: Configure Monitoring
Go to your API Management instance.
Click on ‘APIs’.
Click on the API you just created.
Click on ‘Settings’.
Scroll down to ‘Diagnostics Logs’.
Check the ‘Override global’ checkbox.
Add the ‘backend-host’ and ‘Retry-After’ headers to log.
Click on ‘Save’.
>Note: The ‘backend-host’ header is the host of the backend service that the request was actually sent to. The ‘Retry-After’ header, sent by the OpenAI service, is the time in seconds that the client should wait before retrying the request; it overrides the tripDuration of the backend circuit breaker setting.
>Note: You can also add the request and response bodies to the logged HTTP requests in the ‘Advanced Options’ section.
Step IV: Prepare the OpenAI Service
Deploy the model
>Important: In order to use the load balancer configuration seamlessly, all the OpenAI services should have the same model deployed. The model should be deployed with the same name and version across all the services.
Go to the OpenAI service.
Select the ‘Model deployments’ blade.
Click the ‘Manage Deployments’ button.
Configure the model.
Click on ‘Create’.
Repeat the above steps for all the OpenAI services, making sure that the model is deployed with the same name and version across all the services.
Set the Managed Identity
>Note: The API Management instance’s system- or user-assigned managed identity must be granted access to the OpenAI service.
Go to the OpenAI service.
Select the ‘Access control (IAM)’ blade.
Click on ‘Add role assignment’.
Select the role ‘Cognitive Services OpenAI User’.
Select the API Management managed identity.
Click on ‘Review + assign’.
Repeat the above steps for all the OpenAI services.
Step V: Test the Load Balancer
>Note: Calling the API Management API will require the ‘api-key’ header to be set to the subscription key of the API Management instance.
We are going to run the Chat Completion API from the OpenAI service through the API Management API. The API Management API will distribute the requests to the backend pool created in the previous steps.
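Before running the load test, a single request through the gateway is a useful sanity check. The sketch below uses the OpenAI Python SDK and assumes the hypothetical gateway and deployment names from the earlier example; the SDK sends the api_key value as the ‘api-key’ header, which is what API Management expects:

from openai import AzureOpenAI

# Point the SDK at the API Management gateway instead of an OpenAI service;
# APIM authenticates to the backends with its managed identity.
client = AzureOpenAI(
    azure_endpoint="https://my-apim.azure-api.net",  # hypothetical APIM gateway
    api_key="<your-apim-subscription-key>",
    api_version="2024-06-01",
)

completion = client.chat.completions.create(
    model="gpt-4o",  # the deployment name shared by all backends in the pool
    messages=[{"role": "user", "content": "Say hello from the backend pool."}],
)
print(completion.choices[0].message.content)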
Run the Python load-test script
Execute the test Python script main.py to test the load balancer and circuit breaker configuration.
python main.py --apim-name apim-ai-features --subscription-key APIM_SUBSCRIPTION_KEY --request-max-tokens 200 --workers 5 --total-requests 1000 --request-limit 30
Explanation
python main.py: This runs the main.py script.
--apim-name apim-ai-features: The name of the API Management instance.
--subscription-key APIM_SUBSCRIPTION_KEY: This passes the APIM subscription key.
--request-max-tokens 200: The maximum number of tokens to generate per request in the completion (optional, as it defaults to 200).
--workers 5: The number of parallel requests to send (optional, as it defaults to 20).
--total-requests 1000: This sets the total number of requests to 1000 (optional, as it defaults to 1000).
--request-limit 30: The number of requests to send per second (optional, as it defaults to 20).
>Note: You can adjust the values of --workers and --total-requests as needed. If you omit them, the script will use the default values specified in the argparse configuration.
Test Results
ApiManagementGatewayLogs
| where OperationId == "chat-completion"
| summarize CallCount = count() by BackendId, BackendUrl
| project BackendId, BackendUrl, CallCount
| order by CallCount desc
| render barchart
Conclusion
In conclusion, leveraging Azure API Management significantly enhances the resiliency and capacity of the Azure OpenAI service by distributing requests across multiple instances and implementing load balancing with retry and circuit breaker patterns.
These strategies improve service availability, performance, and stability. To read more, see Backends in API Management.
References
Azure API Management
Azure API Management terminology
API Management policy reference
API Management policy expressions
Backends in API Management
Error handling in API Management policies
Azure Tech Community
AI Hub Gateway Landing Zone accelerator
Unlocking the Power of Responsible AI with Microsoft Azure
Discover how Microsoft is leading the way in responsible AI development with our comprehensive resources on Azure AI products. Learn about the latest features in Azure AI Content Safety, including prompt shields, custom categories, and groundedness detection, and understand their importance and implementation.
Learn More about Responsible AI
Key Resources:
Responsible AI Learning Modules: Dive deep into responsible AI practices.
YouTube Playlist: Watch tutorials and demos on Azure AI Content Safety.
Azure AI Content Safety Workshops: Hands-on training available on MS Learn and GitHub.
The resources are a deep-dive into the latest features of Azure AI Content Safety, including prompt shields, custom categories, and groundedness detection, and offer guidance on their importance and implementation.
Collection of Responsible AI Learning Modules: Explore Here
Responsible AI YouTube Playlist: Watch Here
Learn Modules for Azure AI Content Safety:
Azure AI Content Safety Studio Workshop
Azure AI Content Safety Code Workshop
Fundamentals of Responsible Generative AI – Training | Microsoft Learn
Manage and review models in Azure Machine Learning – Training | Microsoft Learn
Discover Microsoft guidelines for responsible conversational AI development – Training | Microsoft Learn
Discover Microsoft safety guidelines for responsible conversational AI development – Training | Microsoft Learn
Train a model and debug it with Responsible AI dashboard – Training | Microsoft Learn
Responsible AI in AI Studio – How to safeguard your generative AI applications in Azure AI (youtube.com)
Azure AI Studio Evaluations – How to evaluate and improve generative AI responses with Azure AI Studio (youtube.com)
Content Safety – How to build safe and responsible AI applications with Azure AI Content Safety (youtube.com)
Prompt Shields – How to mitigate GenAI security threats with Azure AI Content Safety prompt shields (youtube.com)
Tech Community Prompt Shields GA Blog – https://aka.ms/PromptShieldsGA
Tech Community Protected Material Detection GA Blog – https://aka.ms/ProtectedMaterialGA
Customizing content safety (thresholds, custom categories, etc.) – How to customize generative AI guardrails with Azure AI Content Safety (youtube.com)
Groundedness detection – How to detect and mitigate generative AI hallucinations with Azure AI Content Safety (youtube.com)
YouTube: Azure AI Content Safety demo videos on Microsoft Developer channel for customer tutorials:
Visit the playlist here and check back for more updates: aka.ms/rai-playlist
GitHub: Azure AI Content Safety Workshop in Azure AI Studio
GitHub repo with train-the-trainer slide deck – GitHub – Azure-Samples/aacs-workshops: Samples for Azure AI Content Safety training modules
MS Learn: Azure AI Content Safety Workshop
Azure AI Content Safety Workshop on MS Learn (UI-based) aka.ms/aacs-studio-workshop
Azure AI Content Safety Workshop on MS Learn (Code) aka.ms/aacs-code-workshop
Integrating vision into RAG applications
Retrieval Augmented Generation (RAG) is a popular technique to get LLMs to provide answers that are grounded in a data source. What do you do when your knowledge base includes images, like graphs or photos? By adding multimodal models into your RAG flow, you can get answers based off image sources, too!
Our most popular RAG solution accelerator, azure-search-openai-demo, now has an optional feature for RAG on image sources. In the example question below, the app answers a question that requires correctly interpreting a bar graph:
This blog post will walk through the changes we made to enable multimodal RAG, both so that developers using the solution accelerator can understand how it works, and so that developers using other RAG solutions can bring in multimodal support.
First let’s talk about two essential ingredients: multimodal LLMs and multimodal embedding models.
Multimodal LLMs
Azure now offers multiple multimodal LLMs: gpt-4o and gpt-4o-mini, through the Azure OpenAI service, and Phi-3.5-vision-instruct, through the Azure AI Model Catalog. These models allow you to send in both images and text, and they return text responses. (In the future, we may have LLMs that take audio input and return non-text outputs!)
For example, an API call to the gpt-4o model can contain a question along with an image URL:
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What's in this image?"
    },
    {
      "type": "image_url",
      "image_url": { "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" }
    }
  ]
}
Those image URLs can be specified as full HTTP URLs, if the image happens to be available on the public web, or they can be specified as base-64 encoded Data URIs, which is particularly helpful for privately stored images.
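For example, here is a minimal sketch of building such a data URI in Python; the file name is hypothetical:

import base64

# Read a local image and wrap it in a base-64 data URI that can be passed
# as an image_url in a multimodal chat completion request.
with open("boardwalk.jpg", "rb") as f:  # hypothetical local image
    encoded = base64.b64encode(f.read()).decode("utf-8")
data_uri = f"data:image/jpeg;base64,{encoded}"

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": data_uri}},
    ],
}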
For more examples working with gpt-4o, check out openai-chat-vision-quickstart, a repo which can deploy a simple Chat+Vision app to Azure, plus includes Jupyter notebooks showcasing scenarios.
Multimodal embedding models
Azure also offers a multimodal embedding API, as part of the Azure AI Vision APIs, that can compute embeddings in a multimodal space for both text and images. The API uses the state-of-the-art Florence model from Microsoft Research.
For example, this API call returns the embedding vector for an image:
curl.exe -v -X POST "https://<endpoint>/computervision/retrieval:vectorizeImage?api-version=2024-02-01-preview&model-version=2023-04-15" \
  --data-ascii "{ 'url':'https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png' }"
Once we have the ability to embed both images and text in the same embedding space, we can use vector search to find images that are similar to a user’s query. For an example, check out this notebook that sets up a basic multimodal search of images using Azure AI Search.
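The companion retrieval:vectorizeText endpoint embeds a text query into the same space. A small helper for the query side might look like this sketch, with placeholder endpoint and key:

import requests

AI_VISION_ENDPOINT = "https://<endpoint>"  # placeholder, as in the curl example
AI_VISION_KEY = "<your-ai-vision-key>"     # placeholder resource key

def embed_multimodal(text: str) -> list[float]:
    # Embed a text query into the same multimodal space as the image embeddings.
    response = requests.post(
        f"{AI_VISION_ENDPOINT}/computervision/retrieval:vectorizeText",
        params={"api-version": "2024-02-01-preview", "model-version": "2023-04-15"},
        headers={"Ocp-Apim-Subscription-Key": AI_VISION_KEY},
        json={"text": text},
    )
    response.raise_for_status()
    return response.json()["vector"]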
Multimodal RAG
With those two multimodal models, we were able to give our RAG solution the ability to include image sources in both the retrieval and answering process.
At a high-level, we made the following changes:
Search index: We added a new field to the Azure AI Search index to store the embedding returned by the multimodal Azure AI Vision API (while keeping the existing field that stores the OpenAI text embeddings).
Data ingestion: In addition to our usual PDF ingestion flow, we also convert each PDF document page to an image, store that image with the filename rendered on top, and add the embedding to the index.
Question answering: We search the index using both the text and multimodal embeddings. We send both the text and the image to gpt-4o, and ask it to answer the question based on both kinds of sources.
Citations: The frontend displays both image sources and text sources, to help users understand how the answer was generated.
Let’s dive deeper into each of the changes above.
Search index
For our standard RAG on documents approach, we use an Azure AI search index that stores the following fields:
content: The extracted text content from Azure Document Intelligence, which can process a wide range of files and can even OCR images inside files.
sourcefile: The filename of the document
sourcepage: The filename with page number, for more precise citations
embedding: A vector field with 1536 dimensions, to store the embedding of the content field, computed using the text-only OpenAI ada-002 model.
For RAG on images, we add an additional field:
imageEmbedding: A vector field with 1024 dimensions, to store the embedding of the image version of the document page, computed using the AI Vision vectorizeImage API endpoint.
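With the azure-search-documents Python SDK, the new field can be declared roughly as below; the vector search profile name is an assumption for illustration:

from azure.search.documents.indexes.models import SearchField, SearchFieldDataType

# Vector field for the multimodal image embeddings, which sits alongside the
# existing 1536-dimension text embedding field in the index definition.
image_embedding_field = SearchField(
    name="imageEmbedding",
    type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
    searchable=True,
    vector_search_dimensions=1024,
    vector_search_profile_name="embedding-profile",  # hypothetical profile name
)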
Data ingestion
For our standard RAG approach, data ingestion involves these steps:
Use Azure Document Intelligence to extract text out of a document
Use a splitting strategy to chunk the text into sections. This is necessary to keep chunks at a reasonable size, as sending too much content to an LLM at once tends to reduce answer quality.
Upload the original file to Azure Blob storage.
Compute ada-002 embeddings for the content field.
Add each chunk to the Azure AI search index.
For RAG on images, we add two additional steps before indexing: uploading an image version of each document page to Blob Storage and computing multi-modal embeddings for each image.
Generating citable images
The images are not just a direct copy of the document page. Instead, they contain the original document filename written in the top left corner of the image, like so:
This crucial step will enable the GPT vision model to later provide citations in its answers. From a technical perspective, we achieved this by first using the PyMuPDF Python package to convert documents to images, then using the Pillow Python package to add a top border to the image and write the filename there.
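A condensed sketch of that step might look like the following; the border height and exact drawing parameters are illustrative, not the accelerator’s exact code:

import io

import fitz  # PyMuPDF
from PIL import Image, ImageDraw, ImageOps

def page_to_citable_image(pdf_path: str, page_num: int) -> Image.Image:
    # Render the PDF page to a PNG image.
    doc = fitz.open(pdf_path)
    pix = doc[page_num].get_pixmap()
    img = Image.open(io.BytesIO(pix.tobytes("png")))

    # Add a white border at the top and write the filename at (10,10),
    # matching the citation format described in the prompt later on.
    img = ImageOps.expand(img, border=(0, 40, 0, 0), fill="white")
    ImageDraw.Draw(img).text((10, 10), f"SourceFileName:{pdf_path}", fill="black")
    return img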
Question answering
Now that our Blob storage container has citable images and our AI search index has multi-modal embeddings, users can start to ask questions about images.
Our RAG app has two primary question-asking flows: one for “single-turn” questions, and the other for “multi-turn” questions, which incorporates as much conversation history as can fit in the context window. To simplify this explanation, we’ll focus on the single-turn flow.
Our single-turn RAG on documents flow looks like:
Receive a user question from the frontend.
Compute an embedding for the user question using the OpenAI ada-002 model.
Use the user question to fetch matching documents from the Azure AI search index, using a hybrid search that does a keyword search on the text and a vector search on the question embedding.
Pass the resulting document chunks and the original user question to the gpt-3.5 model, with a system prompt that instructs it to adhere to the sources and provide citations with a certain format.
Our single-turn RAG on documents-plus-images flow looks like this:
Receive a user question from the frontend.
Compute an embedding for the user question using the OpenAI ada-002 model AND an additional embedding using the AI Vision API multimodal model.
Use the user question to fetch matching documents from the Azure AI search index, using a hybrid multivector search that also searches on the imageEmbedding field using the additional embedding. This way, the underlying vector search algorithm will find results that are both semantically similar to the text of the document and semantically similar to any images in the document (e.g. “what trends are increasing?” could match a chart with a line going up and to the right). A code sketch of this query appears after this list.
For each document chunk returned in the search results, convert the Blob image URL into a base64-encoded data URI. Pass both the text content and the image URIs to a GPT vision model, with this prompt that describes how to find and format citations:
The documents contain text, graphs, tables and images.
Each image source has the file name in the top left corner of the image with coordinates (10,10) pixels and is in the format SourceFileName:<file_name>
Each text source starts in a new line and has the file name followed by colon and the actual information. Always include the source name from the image or text for each fact you use in the response in the format: [filename]
Answer the following question using only the data provided in the sources below.
The text and image source can be the same file name, don’t use the image title when citing the image source, only use the file name as mentioned.
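Here is a rough sketch of the multivector query from step 3, using the azure-search-documents SDK. The embed_text helper is an assumed wrapper around the ada-002 embeddings API, and embed_multimodal is the AI Vision helper sketched earlier:

from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

def search_chunks(search_client: SearchClient, question: str):
    # One vector query per embedding space: text (ada-002) and image (AI Vision).
    text_vector = VectorizedQuery(
        vector=embed_text(question),        # assumed ada-002 helper
        k_nearest_neighbors=50,
        fields="embedding",
    )
    image_vector = VectorizedQuery(
        vector=embed_multimodal(question),  # AI Vision helper from earlier
        k_nearest_neighbors=50,
        fields="imageEmbedding",
    )
    # Hybrid search: keyword search on the text plus both vector queries.
    return search_client.search(
        search_text=question,
        vector_queries=[text_vector, image_vector],
        top=3,
    )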
Now, users can ask questions where the answers are entirely contained in the images and get correct answers! This can be a great fit for diagram-heavy domains, like finance.
Considerations
We have seen some really exciting uses of this multimodal RAG approach, but there is much to explore to improve the experience.
More file types: Our repository only implements image generation for PDFs, but developers are now ingesting many more formats, both image files like PNG and JPEG as well as non-image files like HTML, docx, etc. We’d love help from the community in bringing support for multimodal RAG to more file formats.
More selective embeddings: Our ingestion flow uploads images for *every* PDF page, but many pages may be lacking in visual content, and that can negatively affect vector search results. For example, if your PDF contains completely blank pages, and the index stored the embeddings for those, we have found that vector searches often retrieve those blank pages. Perhaps in the multimodal space, “blankness” is considered similar to everything. We’ve considered approaches like using a vision model in the ingestion phase to decide whether an image is meaningful, or using that model to write a very descriptive caption for images instead of storing the image embeddings themselves.
Image extraction: Another approach would be to extract images from document pages, and store each image separately. That would be helpful for documents where the pages contain multiple distinct images with different purposes, since then the LLM would be able to focus more on only the most relevant image.
We would love your help in experimenting with RAG on images, sharing how it works for your domain, and suggesting what we can improve. Head over to our repo and follow the steps for deploying with the optional GPT vision feature enabled!
Microsoft 365 Admin Center to Support Continuous Access Evaluation
Continuous Access Evaluation Revokes Access Immediately
The announcement in message center notification MC884015 (5 Sept 2024) that the Microsoft 365 admin center (Figure 1) will implement continuous access evaluation (CAE) in September 2024 is very welcome. Microsoft implemented CAE for Exchange Online, SharePoint Online, and Teams in January 2022.
Implementing CAE means that the Microsoft 365 admin center can respond to critical events that occur such as user account password changes or if a connection originates from an unexpected IP address. If an administrator account is unfortunate enough to be compromised, CAE will ensure that the credentials used to access the admin center will expire immediately after the password is changed for the account or access is revoked for the account.
Speed is Key
Speed is of the essence when it comes to responding to attacks and making sure that credentials are invalidated and forcing reauthentication as soon as possible is helpful. CAE replaces older methods like waiting for an access token to expire. The problem with waiting for access tokens to age out is that unauthorized access could persist for up to an hour after the compromise occurs.
Of course, it’s even better to stop compromise by making sure that administrator accounts are protected by strong multifactor authentication, such as the Microsoft Authenticator app or passkeys. Even though we’ve known that this is true for years, the percentage of Microsoft 365 accounts protected by multifactor authentication is still disappointing (38% in February 2024). In that context, being able to revoke access to critical administrative tools like the Microsoft 365 admin center is important.
Other Microsoft 365 Administrative Portals
The Microsoft 365 Admin Center is a headline administrative portal and it’s important that Microsoft protects it with CAE. However, this step shouldn’t be seen as bulletproof protection for a tenant because it is not. There’s no news about support for CAE in other important administrative portals like the Purview compliance portal and the Defender portal.
Although it would be good for CAE to be supported in all Microsoft 365 admin centers, the fact remains that this might not be enough to stop an attacker. As noted above, speed is key after an attacker penetrates a tenant. Working through a GUI slows down an attacker, who can instead use automated scripting with PowerShell and Graph API requests to perform actions like creating new accounts and permissioned apps. Firing off some scripts to infect a tenant thoroughly is a lot more efficient than using an admin center. This underlines the need to stop attackers getting into a tenant. CAE is a kind of plaster that will heal some of the damage, but it can’t stop attackers wreaking havoc if they manage to compromise an account holding administrative roles.
Continuous Access Evaluation is a Good Thing
Don’t get me wrong. I strongly endorse the implementation of Continuous Access Evaluation across the administrative landscape of Microsoft 365 tenants. Anything that slows or obstructs attackers is a good thing. Everything that complicates the process of compromise is valued.
The sad thing is that 38% figure for accounts protected by multifactor authentication reported above. Taking Microsoft’s reported figure of 400 million paid Office 365 seats, that means only 152 million accounts use multifactor authentication and almost 250 million do not. That’s just too many lucrative targets for the bad guys to go after. We need to do better.
So much change, all the time. It’s a challenge to stay abreast of all the updates Microsoft makes across the Microsoft 365 ecosystem. Subscribe to the Office 365 for IT Pros eBook to receive monthly insights into what happens, why it happens, and what new features and capabilities mean for your tenant.
GoTo Group Collaborates with Microsoft to Boost Engineering Productivity with GitHub Copilot
Featured image: GoTo engineers work using GitHub Copilot to accelerate the pace of innovation (Photo by GoTo Group)
Jakarta, 10 September 2024 – GoTo Group, Indonesia’s leading digital ecosystem, has taken a significant step forward in enhancing productivity across its engineering teams by collaborating with Microsoft Indonesia to adopt GitHub Copilot, the world’s most widely adopted AI developer tool.
GitHub Copilot significantly boosts engineers’ productivity and happiness in daily coding tasks with its AI capabilities, from real-time code suggestions, chat assistance in the integrated development environment*, to breaking down complex coding concepts using daily conversational language.
Hans Patuwo, Chief Operating Officer, GoTo, said, “Since June 2024, almost a thousand GoTo engineers have adopted GitHub Copilot, with full rollout expected by mid-October 2024. The adoption of this AI-powered coding assistant has enabled our engineers to enhance code quality and to accomplish more in less time. GoTo engineers have reported significant time savings averaging over seven hours per week, allowing them to innovate with greater speed and to bring more value to our users.”
GoTo engineer accelerates coding process with GitHub Copilot (Photo by GoTo Group)
Aside from time savings, GoTo’s engineers are also already seeing an early code acceptance rate of 30% within their first month of adopting GitHub Copilot. This means 30% of code suggestions made by GitHub Copilot are accepted or used by GoTo engineers — a solid figure on the higher end of the average acceptance rate of 26-30% typically seen among engineers using GitHub Copilot.
Nayana Hodi, Engineering Manager, GoTo shared, “GitHub Copilot has significantly reduced syntax errors and provided helpful autocomplete features, eliminating repetitive tasks and making coding more efficient. This has allowed me to focus on the more complex elements in building great software.”
“We are thrilled to empower GoTo with GitHub Copilot, equipping their engineers with AI across the software development lifecycle. GoTo has implemented an impressive evaluation strategy, collaborating directly with engineers to collect first-hand measurements that showcase real impact. By quantifying the value of Copilot from the start, GoTo is building strong momentum for widespread adoption and accelerated learning across their engineering team. We’re excited to continue this journey, helping GoTo ship software ahead of the curve and accelerate the pace of innovation,” said Sharryn Napier, Vice President, APAC at GitHub.
GoTo’s move in integrating AI into the company’s workflow underscores its commitment to driving innovation while empowering its workforce to deliver impactful technology at scale.
“We are proud to support GoTo in their mission to empower progress by offering technology infrastructure and solutions that enable everyone to thrive in the digital economy. By integrating GitHub Copilot into their engineering processes, GoTo equips their teams with the tools to innovate faster, enhance productivity, and ultimately deliver greater user value. Trust and technology go hand-in-hand, and this collaboration underscores our shared commitment to harnessing AI technology, creating meaningful opportunities for all Indonesians, and accelerating an inclusive digital transformation agenda as part of Microsoft’s Empowering Indonesia initiative,” stated Andrew Boyd, General Manager, Digital Natives & Startups, Microsoft Asia.
###
* Software that provides various tools for software development in a single application.
Accelerating water wading simulation using Altair® nanoFluidX® on Azure Nvidia A100 and Nvidia H100
Over the last few weeks we have been working together with Altair engineers to verify and validate their nanoFluidX v2024 product on Azure. This software offers significant advantages for engineers tackling problems where traditional CFD technology requires significant manual time and heavy computational resources. Vehicle wading is an important durability attribute where engineers monitor water reach and accumulation, and assess the potential for damage caused by water impact.
nanoFluidX’s Lagrangian meshless approach was designed from inception for GPU compute using NVIDIA CUDA, making it one of the fastest SPH solvers on the market. Models can be setup incredibly quickly, giving engineers the power to iterate faster.
With this validation, the intention was to look at the GPU compute possibilities in two ways: how nanoFluidX performs on the Nvidia H100 series GPUs, and how it scales up to 8-way GPU virtual machines (VMs). Let’s look at the A100 and the H100 first.
The NC_A100_v4 comes in three flavors, with 1, 2, or 4 A100 80GB GPUs. At their base these are PCIe-based GPUs, but internally they are NVLink-connected in pairs. Per GPU, the rest of the system consists of 24 (non-multithreaded) AMD Milan CPU cores, 220GB of main memory, and a 960GB NVMe local scratch disk. When selecting a 2- or 4-GPU VM, these numbers are multiplied, up to a total of 880GB of main memory.
The NC_H100_v5 has grown along with the GPU’s capabilities. It is available in a 1- or 2-GPU configuration built around the Nvidia H100 NVL with 94GB of memory. While this GPU has a PCIe interface toward the main system, many of its capabilities are in line with the SXM H100 series. Per GPU, the CPU count increases to 40 (non-multithreaded) AMD Genoa cores with 320GB of main memory, together with an upgraded 3.5TB NVMe local scratch disk.
The benchmark run for this validation is the Altair CX-1 car model. It represents a production-scale model of a full-size vehicle traveling at 10 km/h through a 24-meter wading channel over 15 seconds.
“Collaborating with Microsoft and NVIDIA, we have successfully validated nanoFluidX v2024 on NVIDIA’s A100 and H100 GPUs. The latest release boasts a solver that is 1.5x faster than before and offers improved scaling on multiple GPUs. These benchmarks show the use of NVIDIA H100 enhances performance by up to 1.8x, cutting simulation times and accelerating design cycles. These advancements solidify nanoFluidX as one of the fastest Smoothed-Particle Hydrodynamics (SPH) GPU codes on the market.” – David Curry, Senior Vice President, CFD and EDEM, Altair.
As can be seen in the results below, the H100 delivers higher performance than the A100, in line with Nvidia’s published performance increase between the two generations. Both the software and the Azure VMs therefore allow these GPUs to reach their compute potential.
Since nanoFluidX supports multi-GPU systems, we also wanted to validate its scalability on the 8-way GPU ND series. Again we tested both generations: the Nvidia A100-based NDads_A100_v4 and its Nvidia H100-based successor, the NDisr_H100_v5. Both of these systems have all 8 GPUs interconnected through NVLink.
Chart showing performance increase for H100 (NCv5) over A100 (NCv4)
As shown in the results above, nanoFluidX effectively utilizes all available GPU power. On the NDisr_H100_v5 it brought the simulation down to a 1-hour duration, significantly improving turnaround and design cycle time.
While you can simply go to the portal, request quota, and spin up these VMs, we often see customers seeking an HPC environment that integrates better into their production workflow. Altair offers a way to run projects on Azure through its Altair One platform; work with your Altair representative to enable this Azure-based solution. Alternatively, you can use Altair's SaaS solution, Altair Unlimited, a virtual appliance available from the marketplace, to deploy and manage your own HPC cluster on Azure. To enable GPU quotas for HPC, coordinate with your Azure account manager.
#AzureHPCAI
Deploying .dacpacs to Azure SQL via Azure DevOps Pipelines
Introduction
This post is part of the SQL Database series that I am compiling. This specific topic assumes that you have already built a .dacpac file via an Azure DevOps YAML pipeline and are now ready to deploy that .dacpac to Azure. Congratulations! If you'd like to follow along, all source code is in my GitHub repository.
PreReqs
To be successful here, we need a few items set up:
An Azure SQL Server and Database already deployed in Azure
An Azure DevOps Service Connection that has access to deploy the database (I like to make the Service Connection a member of the Entra SQL Admin group; more on that later)
A .dacpac built by ADO and ready to publish. I used the SDK-style project to create mine; you could use other methods as long as you have the .dacpac file.
Network connectivity to the database. For this specific example we will be using an Azure SQL instance and leveraging Microsoft-hosted Azure DevOps agents. Variations of this process are possible using either Windows self-hosted build agents or the newer Managed DevOps Pools.
Deploy Steps
When writing one of these, I have found it helpful to write out the individual steps required for our deployment. In our case it will consist of:
Download the pipeline artifact
Open the Azure SQL Server firewall to the agent
Deploy .dacpac
Delete the Azure SQL Server Firewall rule
The good news here is twofold. First, the job will automatically download the pipeline artifact; to reiterate, this is the .dacpac built in the previous stage, made available so subsequent jobs can leverage it for deployments to one or more environments. Second, opening the Azure SQL Server firewall can be handled by the SqlAzureDacpacDeployment@1 task, which also offers the option to delete the firewall rule after the task has completed.
So, this means we effectively just need a single job in our deployment stage!
SqlAzureDacpacDeployment@1
Here is the YAML code for the job to handle the deployment:
jobs:
- deployment: sqlmoveemecicd_app_dev_eus
  environment:
    name: dev
  dependsOn: []
  strategy:
    runOnce:
      deploy:
        steps:
        - task: SqlAzureDacpacDeployment@1
          displayName: Publish sqlmoveme on sql-moveme-dev2-eus.database.windows.net
          inputs:
            DeploymentAction: Publish
            azureSubscription: [Insert Service Connection Name]
            AuthenticationType: servicePrincipal
            ServerName: [SQL Server Destination]
            DatabaseName: [SQL Database Destination]
            deployType: DacpacTask
            DacpacFile: $(Agent.BuildDirectory)/sqlmoveme/**/*.dacpac
            AdditionalArguments: ''
            DeleteFirewallRule: True
An item to discuss here, which is a bit of a prerequisite when discussing Azure DevOps YAML pipelines, is the deployment job concept. I cover this in a previous post in the YAML pipeline series, Azure DevOps Pipelines: Tasks, Jobs, Stages. Suffice it to say that deployment jobs are a special type of job in Azure DevOps meant for the actual deployment of artifacts, and one of their key capabilities is the ability to tie them to an environment. Environments can have gates: sets of criteria, including manual or automatic checks, that must pass prior to deployment.
Let's take a step back and talk a little more about some of the requirements. First, we need to establish authentication from Azure DevOps to our Azure SQL Server. The most secure way to do this is through Entra authentication. The credentials we will use belong to the service principal associated with an Azure DevOps Service Connection. This connection either has credentials stored as part of the App Registration or leverages workload identity federation.
Personally, I would recommend workload identity federation, as it eliminates the need for a secret. This Service Connection can be the same one used to deploy other resources in the environment, though I understand and respect the separation of data-plane and management-plane activities, so a separate connection specific to the database is acceptable. If you'd rather not use Entra authentication for the database, you can alternatively pass credentials stored in a variable group, though it's usually a good idea to avoid passwords when possible.
So now that we know how and with what we will authenticate, let's go over how access for the service account is provisioned. When configuring an Azure SQL Server, one can designate an Entra security group as the admin.
Below is a screenshot showing that the Microsoft Entra admin role has been granted to an Entra security group. If this is a new concept, see Using automation to set up the Microsoft Entra admin for SQL Server – SQL Server | Microsoft Learn. Additionally, MVP Stephan van Rooij has put together a great walkthrough: Azure SQL and Entra ID authentication, tips from the field.
The service principal being used for the deployment is in turn added to this group; thus, our deployment will have full access to deploy the .dacpac to the Azure SQL Server. Additionally, I have configured Microsoft Entra-only authentication; this is considered a best practice, as SQL user credentials expose a potential credential liability. If this concept is new to you, read more at Microsoft Entra-only authentication – Azure SQL Database & Azure SQL Managed Instance & Azure Synapse Analytics | Microsoft Learn.
Results
After we add this deployment stage to our previous build stage, our results in ADO will look like this:
A two-stage pipeline: the first stage generates our .dacpac artifact, and the second stage takes the .dacpac produced in the first stage and deploys it. A complete YAML definition of this pipeline can be found in my GitHub repository.
Next Steps
Now that we have covered how to effectively build a .sqlproj into a .dacpac and deploy said .dacpac to Azure, our next step will be to deploy to multiple environments via different configurations! Feel free to subscribe to this series on SQL databases; alternatively, if you like my posts, feel free to follow me.
Exploring an AI Agent-Driven Auto Insurance Claims RAG Pipeline
Introduction:
In this post, I explore a recent experiment aimed at creating a RAG pipeline tailored for the insurance industry, specifically for handling automobile insurance claims, with the goal of potentially reducing processing times.
I also showcase an implementation of Autogen AI agents that enhances search retrieval through agent interaction and function calls over sample auto insurance claim documents in a Q&A use case, and show how this workflow can substantially reduce the time required for claims processing.
RAG workflows, in my opinion, represent a novel data stack, distinct from traditional ETL processes. Although they encompass data ingestion and processing similar to traditional ETL in data engineering, they introduce additional pipeline stages, such as chunking, embedding, and loading data into vector databases, that diverge from standard Lakehouse or data warehouse pipelines.
Each stage of the RAG application workflow is pivotal to the accuracy and pertinence of the downstream LLM application. One of these stages is the chunking method; for this proof of concept, I chose to test a page-based chunking technique that leverages the document's layout without relying on third-party packages.
Key Services and Features:
By leveraging enterprise-grade features of Azure AI services, I can securely integrate Azure AI Document Intelligence, Azure AI Search, and Azure OpenAI through private endpoints. This integration ensures that the solution adheres to best practice cybersecurity standards. In addition, it offers secure network isolation and private connectivity to and from virtual networks and associated Azure services.
Some of these services are:
Azure AI Document Intelligence and the prebuilt-layout model.
Azure AI Search Index and Vector database configured with the HNSW search algorithm.
Azure OpenAI GPT-4o model.
Page-based Chunking technique.
Autogen AI Agents.
Azure OpenAI embedding model: text-ada-003.
Azure Key Vault.
Private Endpoints integration across all services.
Azure Blob Storage.
Azure Function App (this serverless compute platform can be replaced with Microsoft Fabric or Azure Databricks).
Document Extraction and Chunking:
The sample claim documents are form templates that capture the accident location, a description of the incident, vehicle information for the involved parties, and any injuries sustained. Thanks to the folks at LlamaIndex for providing the sample claims documents. Below is a sample of the form template.
The claim documents are PDF files housed in Azure Blob Storage. Data ingestion begins from the container URL of the blob storage using the Azure AI Document Intelligence Python SDK.
This implementation of page-based chunking utilizes the markdown output from the Azure AI Document Intelligence SDK. The SDK, set up with the prebuilt-layout extraction model, extracts the content of pages, including forms and text, into markdown format, preserving the document's specific structure, such as paragraphs and sections, and its context.
The SDK facilitates extracting documents page by page via the pages collection of the document, allowing for the sequential organization of the markdown output. Each page is preserved as an element within a list of pages, streamlining the process of extracting page numbers for each segment. More details about the Document Intelligence service and layout model can be found at this link.
The snippet below illustrates the process of page-based extraction, preprocessing of page elements, and their assignment to a Python list:
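What follows is a minimal sketch of what this step can look like with the azure-ai-documentintelligence package; the endpoint, key, and blob URL are placeholder assumptions rather than the original notebook's values.

# Sketch: page-based extraction with the Document Intelligence SDK.
# Endpoint, key, and document URL below are illustrative placeholders.
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest
from azure.core.credentials import AzureKeyCredential

client = DocumentIntelligenceClient(
    endpoint="https://<resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"),
)

# Analyze a claim PDF from Blob Storage, returning the layout as markdown.
poller = client.begin_analyze_document(
    "prebuilt-layout",
    AnalyzeDocumentRequest(url_source="https://<account>.blob.core.windows.net/claims/<file>.pdf"),
    output_content_format="markdown",
)
result = poller.result()

# Each page's span indexes into the full markdown string, so slicing the
# content by span yields one markdown chunk per page.
pages = []
for page in result.pages:
    span = page.spans[0]
    page_md = result.content[span.offset : span.offset + span.length]
    pages.append({"page_number": page.page_number, "content": page_md.strip()})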
Each page's content will be used as the value of the content field in the vector index, alongside other metadata fields. Each page's content is its own chunk and will be embedded before being loaded into the vector database. The following snippet demonstrates this operation.
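Here is a sketch of that step, reusing the pages list from the extraction sketch; the embedding deployment name, index name, and field names are illustrative assumptions.

# Sketch: embed each page chunk and upload it to the Azure AI Search index.
from openai import AzureOpenAI
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

aoai = AzureOpenAI(
    azure_endpoint="https://<aoai-resource>.openai.azure.com/",
    api_key="<key>",
    api_version="2024-02-01",
)
search_client = SearchClient(
    endpoint="https://<search-resource>.search.windows.net",
    index_name="claims-index",
    credential=AzureKeyCredential("<key>"),
)

documents = []
for page in pages:  # the list built in the extraction sketch above
    embedding = aoai.embeddings.create(
        model="<embedding-deployment-name>",
        input=page["content"],
    ).data[0].embedding
    documents.append({
        "id": f"claim-page-{page['page_number']}",
        "page_number": page["page_number"],
        "content": page["content"],
        "content_vector": embedding,
    })

search_client.upload_documents(documents=documents)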
Define Autogen AI Agents and Agent Tool/Function:
The concept of an AI agent is modeled after human reasoning and the question-and-answer process. The agent is driven by a large language model (its brain), which helps determine whether additional information is required to answer a question or whether a tool needs to be executed to complete a task.
In contrast, non-agentic RAG pipelines incorporate meticulously designed prompts that integrate context information (typically through a context variable within the prompt) sourced from the vector store before initiating a request to the LLM for a response. AI agents possess the autonomy to determine the “best” method for accomplishing a task or providing an answer. This experiment presents a straightforward agentic RAG workflow. In upcoming posts, I will delve into more complex, agent-driven RAG solutions. More details about Autogen Agents can be accessed here.
I set up two Autogen agent instances designed to engage in a question-and-answer chat conversation between themselves to carry out search tasks based on the input messages. To facilitate the agents' ability to search and fetch query results from the Azure AI Search vector store via function calls, I authored a Python function that is associated with these agents. The AssistantAgent, which is configured to suggest the function call, and the UserProxyAgent, which is tasked with executing the function, are both examples of the Autogen ConversableAgent class.
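For illustration, a retrieval function along these lines could serve as the agents' tool; it reuses the aoai and search_client objects from the earlier sketch, and the field names are assumptions rather than the original notebook's schema.

# Sketch: the tool the agents call to retrieve claim context.
from azure.search.documents.models import VectorizedQuery

def search_claims(query: str) -> str:
    """Vector search over the claims index; returns concatenated passages."""
    vector = aoai.embeddings.create(
        model="<embedding-deployment-name>", input=query
    ).data[0].embedding
    results = search_client.search(
        search_text=query,  # hybrid: keyword search plus vector similarity
        vector_queries=[VectorizedQuery(
            vector=vector, k_nearest_neighbors=3, fields="content_vector"
        )],
        select=["page_number", "content"],
    )
    return "\n\n".join(doc["content"] for doc in results)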
The user agent begins a dialogue with the assistant agent by asking a question about the search documents. The assistant agent then gathers and synthesizes the response according to the system message prompt instructions and the context data retrieved from the vector store.
The snippets below provide the definition of Autogen agents and a chat conversation between the agents. The complete notebook implementation is available in the linked GitHub repository.
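A simplified sketch of the two-agent setup with pyautogen, registering the search_claims function from above, might look like this; the model, endpoint, and key values are placeholders.

# Sketch: the AssistantAgent suggests the tool call, the UserProxyAgent executes it.
import autogen

llm_config = {
    "config_list": [{
        "model": "<gpt-4o-deployment-name>",
        "api_type": "azure",
        "base_url": "https://<aoai-resource>.openai.azure.com/",
        "api_key": "<key>",
        "api_version": "2024-02-01",
    }]
}

assistant = autogen.AssistantAgent(
    name="claims_assistant",
    system_message=(
        "Answer questions about auto insurance claims using only the context "
        "returned by the search_claims tool. Reply TERMINATE when finished."
    ),
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
    is_termination_msg=lambda m: "TERMINATE" in (m.get("content") or ""),
)

autogen.register_function(
    search_claims,
    caller=assistant,     # the assistant proposes the call
    executor=user_proxy,  # the user proxy runs it
    description="Search the claims vector index and return relevant passages.",
)

user_proxy.initiate_chat(
    assistant,
    message="Describe the damage reported in the sample claim.",
)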
Last Thoughts:
The assistant agent correctly answered all six questions, aligning with my assessment of the documents’ information and ground truth. This proof of concept demonstrates the integration of pertinent services into a RAG workflow to develop an LLM application, which aims to substantially decrease the time frame for processing claims in the auto insurance industry scenario.
As previously stated, each phase of the RAG workflow is crucial to the response quality. The system message prompt for the Assistant agent needs precise crafting, as it can alter the response outcomes based on the set instructions. Similarly, the custom retrieval function’s logic plays a significant role in the agent’s ability to locate and synthesize responses to the messages.
The accuracy of the responses has been assessed manually. Ideally, this process should be automated.
In an upcoming post, I intend to explore the automated evaluation of the RAG workflow. Which methods can be utilized to accurately assess and subsequently refine the RAG pipeline?
Both the retrieval and generative stages of the RAG process require thorough evaluation.
What tools can we use to accurately evaluate the end-to-end phases of a RAG workflow, including extraction, processing, and chunking strategies? How can we compare various chunking methods, such as the page-based chunking described in this article versus the recursive character text split chunking option?
How do we compare the retrieval results of an HNSW vector search algorithm against the KNN exhaustive algorithm?
What kind of evaluation tools are available and what metrics can be captured for agent-based systems?
Is a one-size-fits-all tool available to manage these? We will find answers to these questions.
Moreover, I would also like to examine how this and other RAG and generative AI workflows are reviewed to ensure alignment with the standards of fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability defined in the Responsible AI framework for building and developing these systems.
Microsoft Defender for Identity: the critical role of identities in automatic attack disruption
In today’s digital landscape, cyber-threats are becoming increasingly sophisticated and frequent. Advanced attacks are often multi-workload and cross-domain, requiring organizations to deploy robust security solutions to counter this complexity and protect their assets and data. Microsoft Defender XDR offers a comprehensive suite of tools designed to prevent, detect and respond to these threats. With speed and effectiveness being the two most important elements in incident response, Defender XDR tips the scale back to defenders with automatic attack disruption.
What is Automatic attack disruption?
Automatic attack disruption is an AI-powered capability that uses the correlated signals in Microsoft Defender XDR to stop in-progress attacks and prevent further damage. What makes this disruption technology so differentiated is our ability to recognize the intent of an attacker and accurately predict, then stop, their next move with an extremely high level of confidence. This includes automated response actions such as containing compromised devices, disabling compromised user accounts, or disabling malicious OAuth apps. The benefits of attack disruption include:
Disruption of attacks at machine speed: with an average time of 3 minutes to disrupt ransomware attacks, attack disruption changes the speed of response for most organizations.
Reduced Impact of Attacks: by minimizing the time attackers have to cause damage, attack disruption limits the lateral movement of threat actors within your network, reducing the overall impact of the threat. This means less downtime, fewer compromised systems, and lower recovery costs.
Enhanced Security Operations: attack disruption allows security operations teams to focus on investigating and remediating other potential threats, improving their efficiency and overall effectiveness.
The role of Defender for Identity
While attack disruption occurs at the Defender XDR level, it's important to note that Microsoft Defender for Identity delivers critical identity signals and response actions to the platform. At a high level, Defender for Identity helps customers better protect their identity fabric through identity-specific posture recommendations, detections, and response actions. These are correlated with the other workload signals in the Defender platform and attributed to a high-fidelity incident. Within the context of attack disruption, Defender for Identity enables user-specific response actions, including:
Disabling user accounts: When a user account is compromised, Defender for Identity can automatically disable the account to prevent further malicious activity. Whether the identity in question is managed in on-premises Active Directory or in Entra ID in the cloud, Defender can take immediate action to help contain the threat and protect your organization's assets.
Resetting passwords: In cases where a user's credentials have been compromised, Defender for Identity can force a password reset. This ensures that the attacker can no longer use the compromised credentials to access your systems.
Microsoft Defender XDR’s automatic disruption capability is a game-changer in the world of cybersecurity. Powered by Microsoft Security intelligence and leveraging AI and machine learning, it provides real-time threat mitigation, reduces the impact of attacks, and enhances the efficiency of security operations. However, to fully realize the benefits of automatic disruption, it’s essential to include Defender for Identity in your security strategy, filling a critical need in your defenses.
Use this quick installation guide to deploy Defender for Identity.
Azure Communication Services at the DEVIntersection Conference
Join us for the DEVintersection Conference from September 10 to 12, 2024, in Las Vegas, Nevada. This event gathers technology enthusiasts from across the globe. Whether you are a developer, IT professional, or business leader, this conference provides an exceptional chance to explore the latest in cloud technology.
Our experts from Azure Communication Services will be at the event hosting the following sessions.
Take Your Apps to the Next Level: Azure OpenAI, Communication, and Organizational Data Features
Sept 10, 14:00 – 15:00 | Lab | Grand Ballroom 118 | Dan Wahlin
Many of us are building Line of Business (LOB) apps that can integrate data from custom APIs and third-party data sources. That's great, but when was the last time you sat down to think through how you can leverage the latest technologies to take the user experience to the next level?
In this session, Dan Wahlin introduces different ways to enhance customer experience by adding calling and SMS. You can integrate organizational data to minimize user context shifts, leverage the power of AI to enhance customer productivity, and take your LOB apps to the next level.
Register and add this session.
Bridging the Gap: Integrating Custom Applications with Microsoft Using Azure
Sept 11, 15:30 – 16:30 | Session | Grand Ballroom 122 | Milan Kaur
Discover how Azure can facilitate seamless integration between non-Teams users and Microsoft Teams. If you're invested in Teams and seeking to develop audio-video solutions that connect your custom third-party applications with Teams, this session is for you. Join us to explore the possibilities and streamline collaboration beyond your internal Teams environment.
Register and add this session.
Beyond chatbots: add multi-channel communication to your AI apps
Sept 12, 08:30 – 09:30 | Session | Grand Ballroom 122 | Milan Kaur
Unlock the potential of conversational AI with Azure!
In this session, discover how to extend your bot's functionality beyond standard chat interactions. We'll learn together how to add voice and other messaging channels, such as WhatsApp, to build pro-code AI bots grounded in custom data.
Register and add this session.
About our speakers
Dan Wahlin
Principal Cloud Developer Advocate
Milan Kaur
Senior Product Manager
Dan Wahlin is a Principal Cloud Developer Advocate at Microsoft focusing on Microsoft 365 and Azure integration scenarios. In addition to his work at Microsoft, Dan creates training courses for Pluralsight, speaks at conferences and meetups around the world, and offers webinars on a variety of technical topics.
Twitter: @DanWahlin
Milan is a seasoned software engineer turned product manager passionate about building innovative communication tools. With over a decade of experience in the industry, she has a deep understanding of the challenges and opportunities in the field of cloud communications.
LinkedIn: @milankaurintech