Enable Chat History on Azure AI Studio with Azure Cosmos DB
Azure AI Studio offers a feature that allows you to enable chat history for your web app users. This feature provides your users with access to their previous queries and responses, allowing them to easily reference past conversations. Check out the blog below for the full details on how to enable it today!
Benefits of enabling chat history
With Azure AI Studio, developers can build a chatbot with cutting-edge models that draws on your own data for informed and custom responses to customers’ questions. In addition, you can incorporate multimodality – enabling your app to see, hear, and speak by pairing Azure OpenAI Service with Speech and Vision models.
Streamline customer support: Chat history serves as a powerful ally for streamlining customer support services. By referencing past chat logs, support teams gain the ability to quickly find solutions for customers. This enhances the efficiency of issue resolution while enabling support agents to manage request volumes effectively leading to improved customer satisfaction.
Data Analytics: Analyzing past interactions provides valuable insights into user behavior, preferences, and recurring issues. Armed with this data, you can make informed decisions to optimize user experiences, tailor content, and refine your application’s performance. The analytics derived from chat history pave the way for data-driven strategies, ensuring your application evolves in tune with user needs and expectations.
Product Enhancements: By studying past interactions, you gain a comprehensive view of user feedback, pain points, and preferences. This user-centric insight becomes a compass for product enhancement. Whether it’s refining features, addressing common concerns, or identifying opportunities for innovation, chat history becomes a valuable resource in the iterative process of improving your product for end-users.
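As a small sketch of the analytics idea above, the snippet below counts recurring topics across stored conversation titles. The data shape and keyword list are hypothetical for illustration, not the schema the web app actually uses:

```python
from collections import Counter

def top_topics(conversations, keywords, n=3):
    """Count how often each known keyword appears across conversation titles."""
    counts = Counter()
    for title in conversations:
        for kw in keywords:
            if kw in title.lower():
                counts[kw] += 1
    return counts.most_common(n)

# Hypothetical conversation titles pulled from stored chat history
history = [
    "Reset my account password",
    "Password expired again",
    "Billing question about invoice",
    "How do I reset a password?",
]
print(top_topics(history, ["password", "billing", "invoice"]))
```

A real pipeline would run a query like this over the exported chat-history records to surface the most common pain points.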
How to enable chat history?
To enable chat history, deploy or redeploy your model as a web app using Azure AI Studio. Once completed, activate chat history by clicking the dedicated enablement button within the Azure AI Studio interface. With chat history enabled, users gain control over their interaction.
In the top right corner, they can show or hide their chat history. When displayed, users can rename or delete conversations, giving full control of the chat history experience to users. Conversations are automatically ordered from newest to oldest, simplifying navigation. Each conversation is named based on the initial query, making it easy for users to locate and reference past interactions.
Enabling chat history in Azure AI Studio provides a valuable resource for your web app users, allowing them to easily reference past conversations and queries.
Important! Please note that enabling chat history with Azure Cosmos DB will incur additional charges for the storage used.
About Azure AI Advantage Offer
About Azure Cosmos DB
Azure Cosmos DB is a fully managed, serverless NoSQL database for high-performance applications of any size or scale. It is a multi-tenant, distributed, shared-nothing, horizontally scalable database that provides planet-scale NoSQL capabilities. It offers APIs for Apache Cassandra, MongoDB, Gremlin, Tables, and Core (SQL).
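As an illustration of how chat history might be modeled for storage in Cosmos DB, the sketch below builds a conversation document keyed by user. All field names here are hypothetical, not the actual schema the web app deploys; in practice you would upsert such a document with the azure-cosmos client library:

```python
from datetime import datetime, timezone

def make_conversation_doc(user_id, first_query):
    """Build a conversation record; the title is derived from the initial
    query, mirroring how the web app names conversations. Schema is illustrative."""
    return {
        "id": f"{user_id}-{int(datetime.now(timezone.utc).timestamp())}",
        "userId": user_id,  # a natural partition key candidate
        "title": first_query[:60],
        "messages": [{"role": "user", "content": first_query}],
    }

doc = make_conversation_doc("user-42", "How do I enable chat history?")
print(doc["title"])
```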
Get started
Azure Cosmos DB Docs
Check us out on YouTube
Follow us on X (Twitter)
About Azure AI Studio
Azure AI Studio is a trusted and inclusive platform that empowers developers of all abilities and preferences to innovate with AI and shape the future. Seamlessly explore, build, test, deploy, and manage AI innovations at scale. Integrate cutting-edge AI tools and models, prompt orchestration, app evaluation, model fine-tuning, and responsible AI practices. Directly from Azure AI Studio, interact with your projects in a code-first environment using the Azure AI SDK and Azure AI CLI.
Build with Azure AI Studio
Learn more about Azure AI Studio
Watch the Demo!
Azure AI Studio Documentation
Microsoft Learn: Intro to Azure AI Studio
Enabling Chat History Microsoft Docs
Microsoft Tech Community – Latest Blogs –Read More
The Philosophy of the Federal Cyber Data Lake (CDL): A Thought Leadership Approach
Pursuant to Section 8 of Executive Order (EO) 14028, “Improving the Nation’s Cybersecurity”, Federal Chief Information Officers (CIOs) and Chief Information Security Officers (CISOs) aim to comply with the U.S. Office of Management and Budget (OMB) Memorandum 21-31, which centers on system logs for services both within authorization boundaries and deployed on Cloud Service Offerings (CSOs). This memorandum not only instructs Federal agencies to provide clear guidelines for service providers but also offers comprehensive recommendations on logging, retention, and management to increase the Government’s visibility before, during and after a cybersecurity incident. Additionally, OMB Memorandum 22-09, “Moving the U.S. Government Toward Zero Trust Cybersecurity Principles”, references M-21-31 in its Section 3.
While planning to address and execute these requirements, Federal CIOs and CISOs should explore the use of a Cyber Data Lake (CDL). A CDL is a capability to assimilate and house vast quantities of security data, whether in its raw form or as derivatives of original logs. Thanks to its adaptable, scalable design, a CDL can encompass data of any nature, be it structured, semi-structured, or unstructured, all without compromising quality. This article examines the philosophy behind the Federal CDL, exploring topics such as:
The Importance of CDL for Agency Missions and Business
Strategy and Approach
CDL Infrastructure
Application of CDL
The Importance of CDL for Agency Missions and Business
The overall reduction in both capital and operational expenditures for hardware and software, combined with enhanced data management capabilities, makes CDLs an economically viable solution for organizations looking to optimize their data handling and security strategies. CDLs are cost-effective due to their ability to consolidate various data types and sources into a single platform, eliminating the need for multiple, specialized data management tools. This consolidation reduces infrastructure and maintenance costs significantly. CDLs also adapt easily to increasing data volumes, allowing for scalable storage solutions without the need for expensive infrastructure upgrades. By enabling advanced analytics and efficient data processing, they reduce the time and resources needed for data analysis, further cutting operational costs. Additionally, improved accuracy in threat detection and reduction in false positives lead to more efficient security operations, minimizing the expenses associated with responding to erroneous alerts and increasing the speed of detection and remediation.
However, CDLs are not without challenges. As technological advancements and the big data paradigm evolve, the complexity of network, enterprise, and system architecture escalates. This complexity is further exacerbated by the integration of tools from various vendors into the Federal ecosystem, managed by diverse internal and external teams. For security professionals, maintaining pace with this intricate environment and achieving real-time transparency into technological activities is becoming an uphill battle. These professionals require a dependable, almost instantaneous source that adheres to the National Institute of Standards and Technology (NIST) core functions: identify, protect, detect, respond, and recover. Such a source empowers them to strategize, prioritize, and address any anomalies or shifts in their security stance. The present challenge lies in acquiring a holistic view of security risk, especially when large agencies might deploy hundreds of applications across the US and in some cases globally. The security data logs, scattered across these applications, clouds, and environments, often exhibit conflicting classifications or categorizations. Further complicating matters are the differing logging maturity levels across cloud deployment models: infrastructure, platform, and software.
It is vital to scrutinize any irregularities to ensure the environment is secure, aligning with zero-trust principles, which advocate for a dual approach: never automatically trust and always operate under the assumption that breaches may occur. As security breaches become more frequent and advanced, malicious entities will employ machine learning to pinpoint vulnerabilities across an expansive threat landscape. Artificial intelligence will leverage machine learning and large language models to further enhance organizations’ abilities to discover and adapt to changing risk environments, allowing security professionals to do more with less.
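As a toy illustration of the scrutiny described above, even a simple statistical baseline can flag unusual spikes in security log volume. The counts and threshold below are invented, and a real CDL would apply far richer models:

```python
from statistics import mean, stdev

def flag_anomalies(counts, threshold=2.0):
    """Flag indices whose event count deviates more than `threshold`
    standard deviations from the mean of the series."""
    mu, sigma = mean(counts), stdev(counts)
    return [i for i, c in enumerate(counts) if sigma and abs(c - mu) / sigma > threshold]

# Hypothetical daily counts of failed authentications; day 6 spikes
daily_auth_failures = [12, 15, 11, 14, 13, 12, 95]
print(flag_anomalies(daily_auth_failures))
```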
Strategy and Approach
The optimal approach to managing a CDL depends on several variables, including leadership, staff, services, governance, infrastructure, budget, maturity, and other factors spanning all agencies. It is debatable whether a centralized IT team can cater to the diverse needs and unique challenges of every agency. We are seeing a shift where departments are integrating multi-cloud infrastructure into their ecosystem to support the mission. An effective department strategy is pivotal for success, commencing with systems under the Federal Information Security Modernization Act (FISMA) and affiliated technological environments. Though there may be challenges at the departmental level in a federated setting, it often proves a more effective strategy than a checklist approach.
Regarding which logs to prioritize, there are several methods. CISA has published a guide on how to prioritize deployment: Guidance for Implementing M-21-31: Improving the Federal Government’s Investigative and Remediation Capabilities. Some might opt to begin with network-level logs, followed by enterprise and then system logs. Others might prioritize logs from high-value assets based on FISMA’s security categorization, from high to moderate to low. Some might start with systems that can provide logs most effortlessly, allowing them to accumulate best practices and insights before moving on to more intricate systems.
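One of the prioritization approaches above, ordering by FISMA security categorization, can be sketched as a simple sort. The system names and impact labels are invented for illustration:

```python
# FISMA impact level drives the order in which systems are onboarded to the CDL.
IMPACT_RANK = {"high": 0, "moderate": 1, "low": 2}

def onboarding_order(systems):
    """Sort systems by FISMA security categorization, high impact first."""
    return sorted(systems, key=lambda s: IMPACT_RANK[s["impact"]])

systems = [
    {"name": "payroll", "impact": "moderate"},
    {"name": "public-site", "impact": "low"},
    {"name": "case-mgmt", "impact": "high"},
]
print([s["name"] for s in onboarding_order(systems)])
```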
Efficiently performing analysis, enforcement, and operations across data repositories dispersed across multiple cloud locations in a departmental setting involves adopting a range of strategies. This includes data integration and aggregation, cross-cloud compatibility, API-based connectivity, metadata management, cloud orchestration, data virtualization, and the use of cloud-agnostic tools to ensure seamless data interaction. Security and compliance should be maintained consistently, while monitoring, analytics, machine learning, and AI tools can enhance visibility and automate processes. Cost optimization and ongoing evaluation are crucial, as is investing in training and skill development. By implementing these strategies, departments can effectively manage their multi-cloud infrastructure, ensuring data is accessible, secure, and cost-effective, while also leveraging advanced technologies for analysis and operations.
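Data integration across clouds typically starts with normalizing provider-specific log fields onto a common schema. The sketch below shows the idea; the native field names are illustrative stand-ins, not authoritative mappings for any provider:

```python
def normalize(record, field_map):
    """Map a provider-specific log record onto a common schema."""
    return {common: record.get(native) for common, native in field_map.items()}

# Hypothetical field names from two different cloud providers.
AZURE_MAP = {"timestamp": "TimeGenerated", "source_ip": "CallerIpAddress", "action": "OperationName"}
AWS_MAP = {"timestamp": "eventTime", "source_ip": "sourceIPAddress", "action": "eventName"}

azure_rec = {"TimeGenerated": "2024-01-01T00:00:00Z", "CallerIpAddress": "10.0.0.4", "OperationName": "Delete"}
print(normalize(azure_rec, AZURE_MAP))
```

Once every source emits the same shape, analytics, enforcement, and cross-cloud correlation can run against one schema instead of many.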
CDL Infrastructure
One of the significant challenges is determining how a CDL aligns with an agency’s structure. The decision between a centralized, federated, or hybrid approach arises, with cost considerations being paramount. Ingesting logs in their original form into a centralized CDL comes with its own set of challenges, including accuracy, privacy, cost, and ownership. Employing a formatting tool can lead to substantial cost savings in the extract, transform, and load (ETL) process. Several agencies have experienced cost reductions of up to 90% and significant data size reductions by incorporating formatting in tables, which can be reorganized as needed during the investigation phase. A federated approach means the logs remain in place, analyses are conducted locally, and the results are then forwarded to a centralized CDL for further evaluation and dissemination.
For larger and more complex agencies, a multi-tier CDL might be suitable. By implementing data collection rules (DCR), data can be categorized during the collection process, with department-specific information directed at the respective department’s CDL, while still ensuring that high-value and timely logs are forwarded to a centralized CDL at the agency level, prioritizing privileged accounts. Each operating division or bureau could establish its own CDL, reporting up to the agency headquarters’ CDL. The agency’s Office of Inspector General (OIG) or a statistical component of a department may need to create its own independent CDL for independence purposes. This agency HQ CDL would then report to DHS. In contrast, smaller agencies might only need a single CDL. This could integrate with the existing Cloud Log Aggregation Warehouse (CLAW), a CISA-deployed architecture for collecting and aggregating security telemetry data from agencies using commercial CSP services, and align with the National Cybersecurity Protection System (NCPS) Cloud Interface Reference Architecture. This program ensures security data from cloud-based traffic is captured and analyzed, and enables CISA analysts to maintain situational awareness and provide support to agencies.
If data is consolidated in a central monolithic repository, stringent data stewardship is crucial, especially concerning data segmentation, access controls, and classification. Data segmentation provides granular access control based on a need-to-know approach, with mechanisms such as encryption, authorization, access audits, firewalls, and tagging. If constructed correctly, this can eliminate the need for separate CDL infrastructures for independent organizations. This should be compatible with role-based user access schemes, segment data based on sensitivity or criticality, and meet Federal authentication standards. This supports Zero Trust initiatives in Federal agencies and aligns with Federal cybersecurity regulations, data privacy laws, and current TLS encryption standards. Data must also adhere to retention standards outlined in OMB 21-31 Appendix C and the latest National Archives and Records Administration (NARA) publications, and comply with Data Loss Prevention requirements, covering data at rest, in transit, and at endpoints, in line with NIST 800-53 Revision 5.
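Need-to-know segmentation with tagging can be reduced to a deny-by-default check, sketched here in miniature; real implementations layer encryption, access audits, and Federal authentication standards on top of this core rule:

```python
def may_read(user_clearances, record_tags):
    """Grant access only when the user holds every tag on the record
    (deny by default, in line with zero-trust principles)."""
    return set(record_tags).issubset(user_clearances)

# Hypothetical clearances: a SOC analyst cleared for PII but not OIG data
analyst = {"soc", "pii"}
print(may_read(analyst, {"soc"}))          # permitted
print(may_read(analyst, {"soc", "oig"}))   # denied: missing the OIG tag
```

Tag-based checks like this are what let an OIG or statistical component share one physical CDL while retaining logical independence.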
In certain scenarios, data might require reclassification or recategorization based on its need-to-know status. Agencies must consider storage capabilities, ensuring they have a scalable, redundant and highly available storage system that can handle vast amounts of varied data, from structured to unstructured formats. Other considerations include interoperability, migrating an existing enterprise CDL to another platform, integrating with legacy systems, and supporting multi-cloud enterprise architectures that source data from a range of CSPs and physical locations. When considering data portability, the ease of transferring data between different platforms or services is crucial. This necessitates storing data in widely recognized formats and ensuring it remains accessible. Moreover, the administrative efforts involved in segmenting and classifying the data should also be considered.
Beyond cost and feasibility, the CDL model also gives CIOs and CISOs the opportunity to achieve data dominance over their security and log data. Data dominance means gathering data quickly and securely while reducing processing time, which shortens the time to respond. That faster response, the strategic goal of any security implementation, is only possible with the appropriate platform and infrastructure, bringing organizations closer to real-time situational awareness.
The Application of CDL
With a solid strategy in place, it’s time to delve into the application of a CDL. Questions arise about its operation, making it actionable, its placement relative to the Security Operations Center (SOC), and potential integrations with agency Governance, Risk Management, and Compliance (GRC) tools and other monitoring systems. A mature security program needs a comprehensive real-time view of an agency’s security posture, encompassing SOC activities and the agency’s governance, risk management, and compliance tasks. The CDL should interface seamlessly with existing or future Security Orchestration, Automation, and Response (SOAR) and Endpoint Detection and Response (EDR) tools, as well as ticketing systems.
CDLs facilitate the sharing of analyses within their agencies, as well as with other Federal entities like the Department of Homeland Security (DHS), the Cybersecurity and Infrastructure Security Agency (CISA), Federal law enforcement agencies, and intelligence agencies. Moreover, CDLs can bridge the gaps in a Federal security program, interlinking entities such as the SOC, GRC tools, and other security monitoring capabilities. At the highest levels of maturity, the CDL will leverage Network Operations Center (NOC) data and even potentially administrative information such as employee leave schedules. The benefit of modernizing the CDL lies in eliminating the requirement to segregate data before ingestion. Data is no longer categorized as security-specific or operations-specific. Instead, it is centralized into a single location, allowing CDL tools and models to assess the data’s significance. Monolithic technology stacks are effective when all workloads are in the same cloud environment. However, in a multi-cloud infrastructure, this approach becomes challenging. With workloads spread across different clouds, selecting one as a central hub incurs egress costs to transfer log data between clouds. Departments are exploring options to store data in the cloud where it’s generated, while also considering whether Cloud Service Providers (CSPs) offer tools for analysis, visibility, machine learning, and artificial intelligence.
The next step is for agencies to send actionable information to security personnel regarding potential incidents and provide mission owners with the intelligence necessary to enhance efficiency. Additionally, this approach eliminates the creation of separate silos for security data, mission data, financial information, and operations data. This integration extends to other Federal security initiatives such as Continuous Diagnostics and Mitigation (CDM), Authority to Operate (ATO), Trusted Internet Connection (TIC), and the Federal Risk and Authorization Management Program (FedRAMP).
It’s also pivotal to determine if the CDL aligns with the MITRE ATT&CK Framework, which can significantly assist in incident response. MITRE ATT&CK® is a public knowledge base outlining adversary tactics and techniques based on observed events. The knowledge base aids in developing specific threat models and methodologies across various sectors.
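A minimal sketch of aligning CDL detections with MITRE ATT&CK is to roll observed technique IDs up to the tactics they belong to. The three techniques shown are a tiny excerpt of the knowledge base, included for illustration only:

```python
# A tiny excerpt of ATT&CK technique IDs mapped to their tactics.
ATTACK_TACTICS = {
    "T1110": "Credential Access",   # Brute Force
    "T1566": "Initial Access",      # Phishing
    "T1486": "Impact",              # Data Encrypted for Impact
}

def tactics_seen(technique_ids):
    """Roll detected technique IDs up to the tactics they belong to,
    ignoring IDs not present in the (partial) mapping."""
    return sorted({ATTACK_TACTICS[t] for t in technique_ids if t in ATTACK_TACTICS})

print(tactics_seen(["T1110", "T1566", "T9999"]))
```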
Lastly, to gauge the CDL’s applicability, one might consider creating a test case. Given the vast amount of log data — since logs are perpetual — this presents an ideal scenario for machine learning. Achieving real-time visibility can be challenging with the multiple layers of log aggregation, but timely insights might be within reach. For more resources from Microsoft Federal Security, please visit https://aka.ms/FedCyber.
Stay Connected
Connect with the Public Sector community to keep the conversation going, exchange tips and tricks, and join community events. Click “Join” to become a member and follow or subscribe to the Public Sector Blog space to get the most recent updates and news directly from the product teams.
Microsoft Tech Community – Latest Blogs –Read More
Creating Azure Container Apps using Azure Python SDK
The Azure Python SDK, also known as the Azure SDK for Python, is a set of libraries and packages that allow developers to interact with Microsoft Azure services using the Python programming language. It simplifies the process of integrating Python applications with Azure services by providing a set of high-level abstractions and APIs. With the SDK, developers can programmatically manage and interact with Azure resources, such as virtual machines, storage accounts, databases, and other cloud services.
To use the Azure Python SDK, developers typically install the required Python packages using a package manager like pip. They can then import the relevant modules in their Python code and use the provided classes and methods to interact with Azure services.
For Azure Container Apps specifically, Microsoft provides comprehensive documentation and samples to help developers get started with the Azure Python SDK.
In this blog, we will be looking at how to create Container Apps using Azure Python SDK.
Getting Started
Prerequisites
It is assumed here that you already have an Azure subscription, a resource group, a Container Apps environment, and a container registry available. We will also be using a Windows machine with Python 3.7 or later installed to run the file.
As an example, we will create an Azure Container App, test it, and then delete it via the Azure Python SDK. To run the file, we will use the Azure CLI. This has been tested with Azure CLI version 2.56.
Package Installation
Install the packages that will be used for managing the resources. The Azure Identity package is needed almost every time; we will use the Azure Container Apps management package along with it.
pip install azure-identity
pip install azure-mgmt-appcontainers
Authentication
There are two options for authenticating: via subscription ID and via service principal. In this example, we will use the subscription ID to authenticate to Azure. You can specify the subscription ID as an environment variable or use it directly in the code. Both examples are provided below.
from azure.identity import DefaultAzureCredential
from azure.mgmt.appcontainers import ContainerAppsAPIClient
import os

sub_id = os.getenv("AZURE_SUBSCRIPTION_ID")
client = ContainerAppsAPIClient(credential=DefaultAzureCredential(), subscription_id=sub_id)

from azure.identity import DefaultAzureCredential
from azure.mgmt.appcontainers import ContainerAppsAPIClient

client = ContainerAppsAPIClient(credential=DefaultAzureCredential(), subscription_id="<YOUR_SUBSCRIPTION_ID>")
Python File
We will be using the following file for the management tasks specified above. I am naming this file containerapp.py.
from azure.identity import DefaultAzureCredential
from azure.mgmt.appcontainers import ContainerAppsAPIClient

def main():
    client = ContainerAppsAPIClient(
        credential=DefaultAzureCredential(),
        subscription_id="4db72a57-a748-41c7-aabc-1f7a153960cf"
    )

    # Create the container app and wait for the operation to complete
    response = client.container_apps.begin_create_or_update(
        resource_group_name="defaultrg",
        container_app_name="containerapp-test",
        container_app_envelope={
            "location": "East US 2",
            "properties": {
                "configuration": {
                    "ingress": {
                        "external": True,
                        "targetPort": 80,
                        "transport": "http",
                        "stickySessions": {
                            "affinity": "none"
                        }
                    }
                },
                "environmentId": "/subscriptions/4db72a57-a748-41c7-aabc-1f7a153960cf/resourceGroups/defaultrg/providers/Microsoft.App/managedEnvironments/defaultcaenv",
                "template": {
                    "containers": [
                        {
                            "image": "docker.io/nginx:latest",
                            "name": "testapp4",
                            "resources": {
                                "cpu": 0.25,
                                "memory": ".5Gi"
                            }
                        }
                    ]
                },
            },
        },
    ).result()
    print(response)

    # Delete the container app after testing
    client.container_apps.begin_delete(
        resource_group_name="defaultrg",
        container_app_name="containerapp-test",
    ).result()

if __name__ == "__main__":
    main()
In the above file, we are using a public repository (Docker Hub) as our image source. If you want to use your private Azure Container Registry as the image source instead, the configuration section must include the registry credentials, with the password stored as a secret:

"configuration": {
    "registries": [
        {
            "server": "<YOUR_ACR_NAME>.azurecr.io",
            "username": "<YOUR_ACR_USERNAME>",
            "passwordSecretRef": "acr-password"
        }
    ],
    "secrets": [
        {
            "name": "acr-password",
            "value": "<YOUR_ACR_PASSWORD>"
        }
    ]
},
"template": {
    "containers": [
        {
            "image": "nginx:latest",
            "name": "containerapp-test",
            "resources": {
                "cpu": 0.25,
                "memory": ".5Gi"
            }
        }
    ]
}
The above configuration assumes that there is an image called "nginx" with the tag "latest" in your ACR, and that the ACR has admin credentials enabled.
After editing the python management file, we can run it simply by using the command
python containerapp.py
On successful run, the result will be printed in json format on the cli.
Troubleshooting
In some cases, an error can be resolved by simply restarting the Azure CLI. I am listing some common scenarios that we usually see while working with the SDK.
InvalidAuthenticationTokenTenant
The error message suggests that the access token is from the wrong issuer and must match one of the tenants associated with the subscription. It is usually seen when the subscription ID in the file does not match the account you are logged in with. Re-logging with the correct account may help (az logout, then az login).
InvalidParameterValueInContainerTemplate
The error message notes two possible issues: an invalid or missing image, or a problem with authentication. Check for any typo in the ‘registryPassword’. Apart from that, if you are using an external public registry like Docker Hub, make sure the full repository URL is included in the ‘image’ parameter. When using ACR, make sure that only the image and the tag are specified as its value.
ZoomIt v8.01
Nominations are now open for this year’s Microsoft Partner of the Year Awards!
Celebrated annually, these awards recognize the incredible impact that Microsoft partners are delivering to customers and celebrate the outstanding successes and innovations across Solution Areas, industries, and key areas of impact, with a focus on strategic initiatives and technologies. Partners of all types, sizes, and geographies are encouraged to self-nominate. This is an opportunity for partners to be recognized on a global scale for their innovative solutions built using Microsoft technologies.
In addition to recognizing partners for their impact in our award categories, we also recognize partners from over 100 countries/regions around the world as part of the Country/Region Partner of the Year Awards. In 2024, we’re excited to offer additional opportunities to recognize partner impact through new awards – read our blog to learn more and download the official guidelines for specific eligibility requirements.
Visit the Microsoft Partner of the Year Awards page to see the full list of awards and to submit your nomination in advance of the April 3, 2024, deadline. To ensure you create a strong entry, we encourage you to explore the provided resources and expert advice on the nomination process. We look forward to receiving another amazing set of nominations this year and are excited to celebrate another round of incredible partner innovations!
Read more on the Partner Blog
Become a Microsoft Defender Vulnerability Management Ninja
Do you want to become a ninja for Microsoft Defender Vulnerability Management? We can help you get there! We have collected content organized into multiple modules, and we will keep updating this training on a regular basis.
In addition, we offer a knowledge check based on the training material. Since there’s a lot of content, the goal of the knowledge check is to help ensure understanding of the key concepts that were covered. Lastly, there’ll be a fun certificate issued at the end of the training. Disclaimer: this is not an official Microsoft certification and only acts as a way of recognizing your participation in this training content.
Module 1 – Getting started
What is Microsoft Defender Vulnerability Management
Prerequisites & permissions
Supported operating systems, platforms and capabilities
Compare Defender Vulnerability Management plans and capabilities
Interactive Guide – Reduce organizational risk with Microsoft Defender Vulnerability Management
Defender Vulnerability Management trial
Defender Vulnerability Management add-on trial
Defender Vulnerability Management standalone trial
Frequently asked questions
What’s new in Public Preview
Module 2 – Portal Orientation
Onboard to Defender Vulnerability Management
Dashboard overview
Device inventory
Software inventory
Browser extensions assessment
Certificate inventory
Firmware and hardware assessment
Authenticated scan
Module 3 – Prioritization
Vulnerabilities in my organization
Exposure score
Microsoft Secure Score for Devices
Assign device value
Security recommendation
Mitigate zero-day vulnerabilities
Module 4 – Remediation
Remediate vulnerabilities
Request Remediation
Create and view exceptions for security recommendations
View remediation activities
Block vulnerable applications
Module 5 – Posture and Compliance
Microsoft Secure Score for Devices
Security baselines assessment
Module 6 – Data access
Hunt for exposed devices
Vulnerable devices report
Device health reporting in Defender for Endpoint
Monthly security summary reporting in Defender for Endpoint
APIs
Export assessment methods and properties per device
Export secure configuration assessment per device
Export software inventory assessment per device
Build your own custom reports
Are you ready for the Knowledge check?
Once you’ve finished the training and passed the knowledge check, please click here to request your certificate (you’ll see it in your inbox within 3-5 business days.)
Firewall considerations for gMSA on Azure Kubernetes Service
This week I spent some time helping a customer with a gMSA environment in which they were finding some issues deploying their app. The issues started when they were trying to figure out why the Kerberos ticket was not being issued for the Windows pod with gMSA configured in AKS. I decided to write this blog post to list some of the firewall considerations for different scenarios in which security rules might block the authentication process.
gMSA and its moving parts
To use gMSA on AKS, you must understand that there are many moving parts in play. First, your Kubernetes cluster on AKS is composed of both Linux and Windows nodes. Your nodes will all be part of a virtual network, but only the Windows nodes will try to reach the Domain Controller (DC).
The DC itself might be in another virtual network, in the same virtual network, or even outside of Azure. Then you have the Azure Key Vault (AKV) on which the secret (username and password) is securely stored. Your AKV should only be available to the proper Windows nodes, no one else.
The problem though, comes when you have Windows nodes on AKS and DCs running on different networks or even sites, and you need to open the proper ports between the Windows nodes and the Active Directory DC.
Ports to open for Active Directory and gMSA
We have had documentation on which ports to open for Active Directory for a while. That is relatively well known and can be leveraged here.
The thing to understand is that when using gMSA on AKS, not all these ports need to be opened, and allowing unnecessary traffic might expose you to threats without a need for it. For gMSA, there’s no computer or user account being used interactively, and thus we can compile the following list:
Protocol and port – Purpose
TCP and UDP 53 – DNS
TCP and UDP 88 – Kerberos
TCP 139 – NetLogon
TCP and UDP 389 – LDAP
TCP 636 – LDAP SSL
Keep in mind this list of ports does not take into consideration ports that your application might need to query AD or perform any other action with the DC. You might need to check for those with the application owner.
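To make the port list actionable, it can be expressed as data that drives firewall or NSG rule generation. The rule text format and the DC address below are illustrative only, not any particular firewall's syntax:

```python
# The minimal gMSA port set from the table above, expressed as data.
GMSA_PORTS = [
    ("DNS",      ["TCP", "UDP"], 53),
    ("Kerberos", ["TCP", "UDP"], 88),
    ("NetLogon", ["TCP"],        139),
    ("LDAP",     ["TCP", "UDP"], 389),
    ("LDAP SSL", ["TCP"],        636),
]

def allow_rules(dc_address):
    """Render one allow rule per protocol/port pair toward the DC."""
    return [f"ALLOW {proto}/{port} -> {dc_address}  # {name}"
            for name, protos, port in GMSA_PORTS
            for proto in protos]

for rule in allow_rules("10.0.0.10"):
    print(rule)
```

Keeping the list as data makes it easy to extend with any application-specific ports identified with the application owner.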
Domain Controllers in Azure
You might mitigate a lot of firewall issues by simply adding one (or more) DC to Azure as a VM. By doing that, you have two things that play in your favor:
You keep the authentication process within Azure. Your Windows pods and nodes don't need to reach an on-premises environment – unless the DC(s) in Azure is down.
You have a better understanding of ports to open between NSGs in Azure rather than traffic between workloads on Azure and DCs on-premises.
On the other hand, you must consider that the DCs in Azure do need to replicate to the DCs on-premises. However, this is a preferred scenario because you know who the DCs are, versus workload machines that might scale out, or new workloads/clusters that might be added in the future. At the end of the day, the scope for opening ports is smaller, which minimizes exposure. Please refer to the documentation to understand ports for AD replication as well.
Hopefully this will help you fix any issues you might be having with gMSA caused by blocked traffic. Keep in mind the ports listed above might not be the full list of ports you need to open, but they are the minimal set for proper authentication. As always, let us know in the comments what your thoughts are and whether you have a different scenario.
Microsoft Tech Community – Latest Blogs –Read More
ADX Continuous Export to Delta Table – Preview
We’re excited to announce that continuous export to Delta table is now available in Preview.
Continuous export in ADX allows you to export data from Kusto to an external table with a periodically run query. The results are stored in the external table, which defines the destination, such as Azure Blob Storage, and the schema of the exported data. This process guarantees that all records are exported "exactly once", with some exceptions. Continuous export previously supported CSV, TSV, JSON, and Parquet formats.
Starting today, you can continuously export to a delta table.
To define continuous export to a delta table:
Create an external delta table, as described in Create and alter delta external tables on Azure Storage.
(.create | .alter | .create-or-alter) external table TableName [(Schema)] kind = delta (StorageConnectionString ) [with (Property [, …])]
Define continuous export to this table using the commands described in Create or alter continuous export.
.create-or-alter continuous-export continuousExportName [over (T1, T2 )] to table externalTableName [with (propertyName = propertyValue [, …])] <| query
A few things to note:
If the schema of the delta table isn't provided when defining the external table, Kusto will try to infer it automatically from the delta table defined in the target storage container.
If the schema is provided when defining the external table and there is no delta table defined in the target storage container yet, continuous export will create one during the first export.
The schema of the delta table must be in sync with the continuous export query. If the underlying delta table changes, the export might start failing with unexpected behavior.
Delta table partitioning is not supported today.
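Putting the two commands together, a minimal sketch might look like this. The table, container, and export names here are hypothetical, and the interval property shown is one of the continuous-export options:

```kusto
// 1. Create an external delta table pointing at a storage container
.create external table DeltaSales (Timestamp: datetime, Product: string, Amount: real)
kind = delta
(
    h@'https://mystorage.blob.core.windows.net/sales;secretKey'
)

// 2. Continuously export query results into it
.create-or-alter continuous-export SalesExport
over (Sales)
to table DeltaSales
with (intervalBetweenRuns = 10m)
<| Sales | project Timestamp, Product, Amount
```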
Read more: Continuous data export – Azure Data Explorer & Real-Time Analytics | Microsoft Learn
As always, we’d love to hear your feedback and comments.
AI for Developers
The era of AI is here, and today’s developer needs the skills and tools to build intelligent apps. This month, we’re exploring resources to help developers modernize their applications and get started with AI. Join a Hack Together event, complete a Cloud Skills Challenge, work through guided tutorials, and register for upcoming events. These resources will help you build intelligent chat apps, extend Microsoft Copilot or create a custom copilot, learn about Microsoft Fabric, and much more.
Cloud Skills Challenge: Build Intelligent Apps
Join a Cloud Skills Challenge to compete against peers, showcase your talents, and learn new skills. Combine AI, cloud-scaled data, and cloud-native app development to create intelligent apps. Join a challenge today.
Hack Together: The AI Chat App Hack
It’s not too late to join the AI Chat App Hack! This Hack Together event (January 29 – February 12) offers a playground for experimenting with RAG chat apps and a chance to learn from Microsoft experts.
Azure Cosmos DB Conf Call for Proposals
Want to give a presentation at the Azure Cosmos DB Conference 2024? Submit proposals for presentations on AI integration, innovative use cases, and other topics emphasizing practical insights and hands-on experiences. Submit by February 15, 2024.
Hack Together: The Microsoft Fabric Global AI Hack
Join the Microsoft Fabric Global AI Hack February 19 – March 1 for hands-on learning and find out why Microsoft Fabric is the data platform of choice for AI.
Official Collection: Learn how to build intelligent apps with .NET
Explore a collection of Microsoft Learn modules, videos, and samples on GitHub that will help you build intelligent apps with .NET.
Microsoft Fabric Community Conference
Register for the first annual Microsoft Fabric Community Conference—a live, in-person event taking place March 26 – 28 in Las Vegas. Immerse yourself in data and AI, get hands-on experience with the latest technologies, and connect with other experts.
Playwright Testing and GitHub Actions tutorial: How to run Playwright tests on every code commit
Set up continuous, end-to-end testing for your web apps with Microsoft Playwright and GitHub Actions. Watch this tutorial to see how you can run tests on every code commit and validate that your app works across different browsers and operating systems.
The future of collaboration and AI
Build the next era of AI apps with the Teams AI Library, now generally available. Combined with Azure OpenAI Service, you have everything you need to build your own AI apps and copilots. Learn more about extending your app to the Copilot ecosystem.
Azure Cosmos DB Conf 2024
Sign up for Azure Cosmos DB Conf, a free virtual developer event. Tune into the live show on April 16 to learn why Azure Cosmos DB is the leading database for AI and modern app development. Then explore more sessions on demand.
POSETTE Call for Presentations
Every great event starts with great speakers. Do you have Postgres tips, tricks, stories, or expertise to share? Submit your presentation proposals to be considered for POSETTE (formerly Citus Con), a free, virtual developer event organized by the Postgres team at Microsoft.
Build and modernize AI apps with new solution accelerators
Build intelligent apps on Azure with new tools that bring top use cases to life. Explore demos, GitHub repos, and Hackathon content to help you get started building AI-powered apps, such as a copilot using your own data.
New Azure AI Advantage offer
There’s a new Azure AI Advantage offer that lets Azure AI and GitHub Copilot customers save when using Azure Cosmos DB.
Build a production RAG chat using Azure AI Studio and Prompt Flow
Learn how to build a production-level RAG app for a customer support agent – and integrate it with your web-based product catalog. Streamline your end-to-end app development from prompt engineering to LLMOps with prompt flow in Azure AI Studio.
Train a machine learning model and debug it with the Responsible AI dashboard
Ready to build a machine learning model or integrate one into your app? Learn how to debug your model to assess it for Responsible AI practices using the Azure Responsible AI Dashboard.
How to Convert Audio to .WAV for Speech Service Using MoviePy
Azure Speech Service requires audio files to adhere to specific standards. Find out how to use MoviePy to easily convert your audio files to make them compatible with Azure Speech Service.
Build it with AI video series
Ready to get started with AI? Check out the Build it with AI video series from Microsoft Reactor. Deepen your engagement, grow your AI-driven solutions, and start building your business on AI technology.
How to build a custom copilot using Azure AI Studio and Microsoft Copilot Studio
Want to build your own copilot? Explore options in the Microsoft ecosystem for building a copilot. This blog post looks into low code tools and out-of-the-box features. A follow-up post will focus on code-heavy and extensible options.
Build an AI Powered Image App
Use AI image technologies to build and deploy an AI-powered image web app. A new Microsoft Learn challenge module steps you through a bite-sized project to give you a taste of the latest tools.
Microsoft JDConf 2024
Get ready for JDConf 2024—a free virtual event for Java developers. Explore the latest in tooling, architecture, cloud integration, frameworks, and AI. It all happens online March 27-28. Learn more and register now.
Step-by-step guide: Build a recommender full stack app using OpenAI and Azure SQL
Check out this step-by-step guide for creating an intelligent web app with Azure OpenAI Service. This blog post shows you how to create a recommender full stack app with OpenAI and Azure SQL.
Official collection: AI Kick-off Projects
Put your AI skills to the test and start building innovative solutions. This collection of AI Challenge Projects provides modules that will teach you how to build various intelligent solutions, such as a minigame and a speech translator.
Register now: Microsoft Fabric Community Conference
Join us at the first ever Microsoft Fabric Community Conference—a live, in-person event. Discover how Microsoft data and AI services accelerate innovation and prepare you for the era of AI. Use discount code MSCUST to save $100.
ICYMI | Microsoft 365 Blog: Introducing the new Microsoft 365 Document Collaboration Partner Program
If you’re an independent software vendor (ISV) who provides a cloud communication and collaboration platform, you may want to offer customers a collaboration experience inside and outside meetings. That’s why we are excited to introduce the Microsoft 365 Document Collaboration Partner Program (MDCPP), a new opportunity for eligible platform providers to integrate Microsoft 365 apps into their platforms. Whether it’s a presentation, a spreadsheet, or a document, the program can enable users to share, edit, and coauthor, without switching between apps or losing context.
Continue reading in our Microsoft 365 blog
Partner Blog | Empowering Partners: Celebrating Excellence in Tech
Authored by Leona Locke, Director, GTM Benefits and Partner Engagement, with contributions from Regina Johnson, (RJohnson_Microsoft) Senior Manager and Community Lead.
Our partner community is enriched by its diversity, and we are committed to strengthening our collective capacity. In this blog, we will share a few opportunities for partners to access information, resources, and capital that are designed not only to empower you to achieve your goals, but also connect you to collaborators with shared visions and aspirations.
Partner-led associations at Microsoft
Partner-led Associations are nonprofit organizations led by Microsoft partners and technology company business owners. They boast strong membership bases and provide a direct pipeline to channels driving partner engagement, professional skilling, P2P opportunities, and enablement to increase partner knowledge, growth, and sales.
Drive growth for your organization by engaging individually and collectively with partner-led associations such as the Black Channel Partner Alliance (BCPA), the International Association of Microsoft Channel Partners (IAMCP), the Women in Tech Network (WIT), and Women in Cloud (WIC).
Continue reading here
**Don’t forget to join our Partner-led communities to stay connected!**
Inclusive Growth Discussion Board
Make demo typing easy with DemoType in ZoomIt v8.0
DemoType is a ZoomIt feature that allows you to synthesize keystrokes from a script. Queue code blocks or Copilot prompts and send them to target windows during a live demo. Additionally, ZoomIt counteracts editor-specific auto-formatting, allowing a script to be interchangeable between target windows. Watch a video overview here.
Standard mode
Default behavior immediately begins injecting keystrokes to the target window upon pressing the DemoType hotkey (e.g. Ctrl + 7). No user input is required. ZoomIt will simply run to the end of the current text segment and exit, returning control to the user.
User-driven mode
You can select the option to drive input with your own typing; toggle this behavior in the ZoomIt options dialog. One user key press will trigger one output character, a 1:1 injection ratio.
You can adjust the injection ratio between 1:1, 1:2, and 1:3 with the speed slider in the ZoomIt options dialog. Upon reaching the end of the current text segment in user-driven mode, DemoType will continue blocking keyboard input until you press the space bar.
Input Script
Your script can be sourced from a file or from the clipboard. To use the clipboard, you must put the control keyword [start] at the beginning of your selection. This deliberate safety prefix is meant to stop you from unintentionally presenting sensitive data in the clipboard.
To use a file, select it from the ZoomIt options dialog. If you were previously sourcing input from the clipboard and would like to switch to file, set the clipboard to some text that doesn’t include the [start] prefix, or clear the clipboard via Windows Settings > System > Clipboard > Clear clipboard data.
The [end] control keyword is used to split your script into text segments. It is important to note that DemoType will look to the left and right of an [end] and absorb a single newline from each side if present. This allows you to format your script and pad an [end] with newlines that won’t render.
Cancelling DemoType
To cancel an active session, press escape. DemoType will also quit if focus is changed to a different window. Terminating a DemoType session mid text segment will hop to the next text segment. To hop back to the previous text segment, enter the DemoType hotkey with the Shift key in the opposite mode (e.g. Ctrl + Shift + 7).
Control Keywords
Use the following keywords throughout your script to control behavior.
[start] is a safety prefix only used when tagging clipboard data as a viable script
[end] is a delimiter to segment your script into snippets
[enter], [up], [down], [left], [right] synthesize the corresponding keystrokes
[pause:n] synthesizes a pause of n seconds
[paste] with a closing [/paste] allows you to inject a chunk of text via the clipboard
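As an illustrative sketch, a script file combining these keywords might look like this (the commands themselves are placeholders):

```
git status[enter]
[pause:2]
git log --oneline[enter]
[end]
[paste]SELECT TOP 10 * FROM Sales ORDER BY Amount DESC;[/paste][enter]
```

Pressing the hotkey plays the first snippet; the [end] splits the script so the SQL statement is queued as a second segment, injected via the clipboard to sidestep editor auto-formatting.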
Microsoft Copilot for Sales is here!
Microsoft Copilot for Sales is here! The next step in the evolution of Viva Sales and Sales Copilot was released on February 1, 2024 for Dynamics 365 Sales and Salesforce CRM. You can read the announcement about the general availability of Copilot for Sales (and Copilot for Service) and learn what’s new.
With Copilot for Sales, we bring together the power of Copilot for Microsoft 365 with role-specific insights and actions to streamline business processes, automate repetitive tasks, and unlock productivity. We still provide the flexibility to integrate with Microsoft Dynamics 365 Sales and Salesforce to get more done with less effort.
Check out the Copilot for Sales Adoption Center where we provide resources to deploy, use, and scale Copilot for you, your team, and your organization!
Get started
Ready to join us and other top-performing sales organizations worldwide? Reach out to your Microsoft sales team or visit our product web page.
Ready to install? Have a look at our deployment guides for Dynamics 365 Sales users or for Salesforce users.
Stay connected
Keep up to date on the latest improvements at https://aka.ms/salescopilotupdates and learn what we're planning next. Join our community in the discussion forum; we always welcome your feedback and ideas in our product feedback portal.
Sync Up Episode 08: From Waterfalls to Weekly Releases with Steven Bailey & John Selbie
Ever wanted to learn more about what goes into making OneDrive? Then this is the podcast for you! Join Stephen Rice and Arvind Mishra as they talk with CVP for OneDrive Engineering Steven Bailey and Engineering Manager John Selbie! This month, we're talking about how OneDrive became the product that it is today, how engineering itself has evolved at Microsoft, and how the OneDrive engineering team strives to deliver a product that surpasses your expectations!
Announcing OpenAI text-to-speech voices on Azure OpenAI Service and Azure AI Speech
At OpenAI DevDay on November 6th, 2023, OpenAI announced a new text-to-speech (TTS) model that offers 6 preset voices to choose from, in a standard format as well as respective high-definition (HD) equivalents. Today, we are excited to announce that we are bringing those models in preview to Azure. Developers can now access OpenAI's TTS voices through Azure OpenAI and Azure AI Speech services. Each of the 6 voices has its own personality and style. The standard voice models are optimized for real-time use cases, and the HD equivalents are optimized for quality.
These new TTS voices augment capabilities, such as building custom voices and avatars, already available in Azure AI and allow customers to build entirely new experiences across customer support, training videos, live-streaming and more.
This capability allows developers to give human-like voices to chatbots, narrate audiobooks or articles, translate across multiple languages, create content for games, and offer much-needed assistance to the visually impaired.
Click here to see these voices in action:
The new voices will support a wide range of languages from Afrikaans to Welsh, and the service can cater to diverse linguistic needs. For a complete list of supported languages, please follow this link.
The table below shows the 6 preset voices, each with sample text, sample language(s), and standard/HD audio samples:

Alloy (English – French)
Sample text: The world is full of beauty when your heart is full of love. / Le monde est plein de beauté quand votre cœur est plein d’amour.
Standard: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_alloy.wav
HD: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_alloyHD.wav

Echo (English – French)
Sample text: Well, John, that’s very kind of you to invite me to your quarters. I’m flattered that you want to spend more time with me. / Des efforts de collaboration entre les pays sont nécessaires pour lutter contre le changement climatique, protéger les océans et préserver les écosystèmes fragiles.
Standard: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_echo.wav
HD: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_echoHD.wav

Fable (English – German)
Sample text: Success is not the key to happiness, but happiness is the key to success. / Erfolg ist nicht der Schlüssel zum Glück, aber Glück ist der Schlüssel zum Erfolg.
Standard: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_fable.wav
HD: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_fableHD.wav

Onyx (English – German)
Sample text: Conserving water resources through efficient usage and implementing responsible water management practices is crucial, especially in regions prone to drought and water scarcity. / Die Einführung nachhaltiger Praktiken in unserem täglichen Leben, wie z. B. die Einsparung von Wasser und Energie, die Auswahl umweltfreundlicher Produkte und die Reduzierung unseres CO2-Fußabdrucks, kann erhebliche positive Auswirkungen auf die Umwelt haben.
Standard: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_onyx.wav
HD: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_onyxHD.wav

Nova (English – Spanish)
Sample text: Success is not the key to happiness, but happiness is the key to success. / El éxito no es la clave de la felicidad, pero la felicidad es la clave del éxito.
Standard: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_nova.wav
HD: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_novaHD.wav

Shimmer (English – Japanese)
Sample text: In this moment, I realized that amid the chaos of life, tranquility and peace can always be found. / 人生は、一度きりのチャンスです。失敗しても、次に向けて立ち上がりましょう。
Standard: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_Shimmer.wav
HD: https://nerualttswaves.blob.core.windows.net/oai-samples/test_oai_ShimmerHD.wav
In addition to making these voices available in Azure OpenAI Service, customers will also find them in the Azure AI Speech SDK with added support for Speech Synthesis Markup Language (SSML).
Getting started
With these updates, we’re excited to be powering natural and intuitive voice experiences for more customers.
For more information:
Try the demo from AI Studio
See our documentation (AI Speech Learn doc link)
Check out our sample code
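To complement the resources above, here is a minimal sketch of how a request to the new voices might be constructed against the Azure OpenAI REST endpoint. The resource name, deployment name, and api-version below are assumptions; substitute your own values and send the request with any HTTP client.

```python
import json

def build_tts_request(resource: str, deployment: str, text: str,
                      voice: str = "alloy",
                      api_version: str = "2024-02-15-preview"):
    """Return (url, headers, body) for a speech synthesis request.

    resource/deployment/api_version are placeholders for your own
    Azure OpenAI deployment values.
    """
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/audio/speech?api-version={api_version}")
    headers = {"api-key": "<YOUR-KEY>", "Content-Type": "application/json"}
    body = json.dumps({"model": "tts-1", "input": text, "voice": voice})
    return url, headers, body

url, headers, body = build_tts_request(
    "my-resource", "my-tts",
    "The world is full of beauty when your heart is full of love.")
print(url)
```

POSTing the body to the URL with those headers returns the synthesized audio stream, which you can write to a file such as `speech.wav`.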
IPv6 Transition Technology Survey
The journey toward IPv6 only networking is challenging, and today there are several different approaches to the transition from IPv4 to IPv6 with multiple dual-stack or tunneling stages along the way. As we prioritize future Windows work, we would like to know more about what customers like you are using to support your own IPv6 deployments. We have published the survey below to ask you a few questions that will contribute to that exercise.
The survey is fairly short and anonymous (though we left a field for sharing your contact information if you would be ok with direct follow up). Thank you in advance for your responses; your experiences will help us focus on what you find most valuable in our future work.
To BE or not to be – case sensitive using Power BI and ADX/RTA in Fabric
To BE or not to be – case sensitive
Power BI, Power Query and ADX/RTA in Fabric
Summary
You use this combination: Power BI, including Power Query, with data coming from Kusto/ADX/RTA in Fabric.
Is your solution case sensitive? Is “PoKer” == “poker”?
It depends on who you ask:
Power Query says definitely no, Power BI says definitely yes, and Kusto says that
“PoKer” == “poker” is false but “PoKer” =~ “poker” is true.
What about your data? Is the same piece of information always written in the same way, or is it sometimes “Canada” and other times “canada”?
In this article I’ll highlight the challenges of using mixed case data and navigating the differences between the different technologies.
Power BI
Chris Webb in his blog writes:
Case sensitivity is one of the more confusing aspects of Power BI: while the Power Query engine is case sensitive, the main Power BI engine (that means datasets, relationships, DAX etc.) is case insensitive
In this post, Chris mentioned a way to do case-insensitive comparisons in PQ, but it is not supported in Direct Query.
Kusto/ADX/RTA
Kusto is case sensitive. Every function and language term must be written in the right case, which is usually all lower case.
The same goes for tables, functions, and columns.
What about text comparisons?
The KQL language offers case sensitive and case insensitive comparisons:
== vs. =~, != vs. !~, has_cs vs. has, in vs. in~, contains_cs vs. contains
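As a quick sketch, both behaviors can be seen side by side in a single query:

```kusto
print case_sensitive   = ("PoKer" == "poker"),  // false
      case_insensitive = ("PoKer" =~ "poker")   // true
```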
Comparisons created in Power BI or in Power Query and folded to KQL
The connector uses case-sensitive comparisons by default: has_cs and ==.
You can change this behavior by using settings in the connection.
Mixed case data
This is the trickiest topic. I attach a PBI report that shows a list of colors in two pages.
The slicer on the first page is showing the color Blue twice. If you edit the query, you can see that there are some products that have the color as “blue” all lower case.
This confuses PBI, and it shows two different variations that look exactly the same.
If you filter on either version on the first page, you get the same value, which is the value of the “Blue” version.
For the second page I created a copy of the query where the versions of Blue are well separated, and you can see the total for each one. You can see that the total shown on the first page is just for the “Blue” version.
What can you do in such cases (pun intended)?
I created a third version of the query where I converted all color names to proper case.
The M function for proper casing couldn’t be used in Direct Query, so I added a KQL snippet using the M function Value.NativeQuery.
The snippet is
| extend ColorName = strcat(toupper(substring(ColorName,0,1)),substring(ColorName,1))
In the third page you can see that filtering “Blue” shows the total values for “Upper blue” and “Lower blue” as they appear in the second page.
So, if you have a column in mixed case, you must convert all values to a standard case.
GenAI Solutions: Elevating Production Apps Performance Through Latency Optimization
As the influence of GenAI-based applications continues to expand, the critical need to enhance their performance becomes ever more apparent. In the realm of production applications, responses are expected within a range of milliseconds to seconds. The integration of Large Language Models (LLMs) can extend the response times of such applications by a few more seconds. This blog explores diverse strategies aimed at optimizing response times in applications that harness Large Language Models on the Azure platform. In broad terms, the following methodologies can be employed to optimize the responsiveness of Generative Artificial Intelligence (GenAI) applications:
Response Optimization of LLM models
Designing an Efficient Workflow orchestration
Improving Latency in Ancillary AI Services
Response Optimization of LLM models
The inherent complexity and size of Language Model (LLM) architectures contribute substantially to the latency observed in any application upon their integration. Therefore, prioritizing the optimization of LLM responsiveness becomes imperative. Let’s now explore various strategies aimed at enhancing the responsiveness of LLM applications, placing particular emphasis on the optimization of the Large Language Model itself.
Key factors influencing the latency of LLMs are
Prompt Size and Output token count
A token is a unit of text that the model processes. It can be as short as one character or as long as one word, depending on the model’s architecture. For example, in the sentence “ChatGPT is amazing,” there are five tokens: [“Chat”, “G”, “PT”, ” is”, ” amazing”]. Each word or sub-word is considered a token, and the model analyses and generates text based on these units. A helpful rule of thumb is that one token corresponds to ~4 characters of text for common English text. This translates to ¾ of a word (so 100 tokens ~= 75 words).
A deployment of a GPT-3.5-turbo instance on Azure comes with a rate limit of around 120,000 tokens per minute, equivalent to approximately 2,000 tokens per second (details of the TPM limits of each Azure OpenAI model are given here). It is evident that the quantity of output tokens has a direct impact on the response time of Large Language Models (LLMs), consequently influencing the application’s responsiveness. To optimize application response times, it is recommended to minimize the number of output tokens generated. Set an appropriate value for the max_tokens parameter to limit the response length. This can help in controlling the length of the generated output.
The latency of LLMs is influenced not only by the output tokens but also by the input prompts. Input prompts can be categorized into two main types:
Instructions, which serve as guidelines for LLMs to follow, and
Information, providing a summary or context for the grounded data to be processed by LLMs.
While instructions are typically of standard length and crucial for prompt construction, the inclusion of multiple tasks may lead to varied instructions, ultimately increasing the overall prompt size. It is advisable to limit prompts to a maximum of one or two tasks to manage prompt size effectively. Additionally, the information or content can be condensed or summarized to optimize the overall prompt length.
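As a rough sketch, the ~4 characters-per-token rule of thumb from above can be used to budget a request before it is sent. The context window and cap values here are illustrative, not an Azure OpenAI API:

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters of common English text per token."""
    return max(1, round(len(text) / 4))

prompt = "Summarize the customer ticket in two sentences."
context_window = 4096                       # hypothetical model limit
# Cap the reply length explicitly, leaving room for the prompt itself.
max_tokens = min(150, context_window - estimate_tokens(prompt))
print(estimate_tokens(prompt), max_tokens)
```

Passing the resulting `max_tokens` value with your completion request keeps the output token count, and therefore latency, bounded.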
Model size
The size of an LLM is typically measured in terms of its parameters. A simple neural network with just one hidden layer has a parameter for each connection between nodes (neurons) across layers and for each node’s bias. The more layers and nodes a model has, the more parameters it will contain. A larger parameter count usually translates into a more complex model that can capture intricate patterns in the data.
Applications frequently utilize Large Language Models (LLMs) for various tasks such as classification, keyword extraction, reasoning, and summarization. It is crucial to choose the appropriate model for the specific task at hand. Smaller models like Davinci are well-suited for tasks like classification or key value extraction, offering enhanced accuracy and speed compared to larger models. On the other hand, large models are more suitable for complex use cases like summarization, reasoning and chat conversations. Selecting the right model tailored to the task optimizes both efficiency and performance.
Leverage Azure-hosted LLM Models
Azure AI Studio provides customers with cutting-edge language models like OpenAI’s GPT-4, GPT-3, Codex, DALL-E, and Whisper models, and other open-source models all backed by the security, scalability, and enterprise assurances of Microsoft Azure. The OpenAI models are co-developed by Azure OpenAI and OpenAI, ensuring seamless compatibility and a smooth transition between the different models.
By opting for Azure OpenAI, customers not only benefit from the security features inherent to Microsoft Azure but also run on the same models employed by OpenAI. This service offers additional advantages such as private networking, regional availability, scalability, and responsible AI content filtering, enhancing the overall experience and reliability of language AI applications.
For anyone using GenAI models directly from their creators, transitioning to the Azure-hosted versions of these models has yielded notable enhancements in response time. This shift to Azure infrastructure has led to improved efficiency and performance, resulting in more responsive and timely outputs from the models.
Rate Limiting, Batching, Parallelize API calls
Large language models are subject to rate limits, such as RPM (requests per minute) and TPM (tokens per minute), which depend on the chosen model and platform. It is important to recognize that rate limiting can introduce latency into the application. To accommodate high-traffic requirements, it is recommended to select an appropriate value for the max_tokens parameter to prevent any occurrence of a 429 error, which can lead to subsequent latency issues. Additionally, it is advisable to implement retry logic in your application to further enhance its resilience.
Effectively managing the balance between RPM and TPM allows for enhanced latency through strategies like batching or parallelizing API calls.
When you find yourself reaching the upper limit of RPM but remain comfortably within TPM bounds, consolidating multiple requests into a single batch can optimize your response times. This batching approach enables more efficient utilization of the model’s token capacity without violating rate limits.
Moreover, if your application involves multiple calls to the LLMs API, you can achieve a notable speed boost by adopting an asynchronous programming approach that allows requests to be made in parallel. This concurrent execution minimizes idle time, enhancing overall responsiveness and making the most of available resources.
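The parallel-call approach can be sketched with asyncio. The call_llm coroutine below is a hypothetical stand-in for a real async client call (for example, an async chat-completions request); here it only simulates network latency:

```python
import asyncio

async def call_llm(prompt: str) -> str:
    """Stand-in for an async LLM API call; sleeps to mimic latency."""
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"

async def run_batch(prompts):
    # All requests are in flight at once instead of one after another,
    # so total wall time is close to the slowest single call.
    return await asyncio.gather(*(call_llm(p) for p in prompts))

results = asyncio.run(run_batch(["q1", "q2", "q3"]))
print(results)
```

With a real client, swap the body of call_llm for the actual API request while keeping the gather pattern, and mind your RPM/TPM limits when choosing the batch size.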
If the parameters are already optimized and the application requires additional support for higher traffic and a more scalable approach, consider implementing a load balancing solution through Azure API Management layer.
Stream output and use stop sequence
Every LLM endpoint has a particular throughput capacity. As discussed earlier, a GPT-3.5-turbo instance on Azure comes with a rate limit of 120,000 tokens per minute, equivalent to approximately 2 tokens per millisecond. So, to get an output paragraph with 2,000 tokens it takes 1 second, and the time taken to get the output response increases as the number of tokens increases. The time taken for the output response (latency) can be measured as the sum of the time taken for the first token and the time taken per token from the first token onwards. That is:
Latency = (Time to first token + (Time taken per token * Total tokens))
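Plugging assumed figures into this formula shows how the token count dominates total latency. The numbers below are illustrative only: 200 ms to first token and 0.5 ms per token (the 2 tokens-per-millisecond throughput from the example above) for a 2,000-token response.

```python
def estimated_latency(time_to_first_token, time_per_token, total_tokens):
    """Latency = time to first token + (time per token * total tokens)."""
    return time_to_first_token + time_per_token * total_tokens

# Assumed figures: 0.2 s to first token, 0.0005 s per token, 2000 tokens.
latency_s = estimated_latency(0.2, 0.0005, 2000)
print(round(latency_s, 2))  # -> 1.2 (seconds)
```

Halving the response length to 1,000 tokens would cut this estimate to roughly 0.7 seconds, which is why trimming verbose outputs is one of the cheapest latency wins.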
So, to improve perceived latency we can stream the output as each token is generated instead of waiting for the entire paragraph to finish. Both the completions and chat Azure OpenAI APIs support a stream parameter which, when set to true, streams the response back from the model via Server-Sent Events (SSE). We can use Azure Functions with FastAPI to stream the output of OpenAI models in Azure, as shown in the blog here.
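A minimal sketch of consuming a streamed response is shown below. The fake_streamed_response generator is a simulation standing in for the chunk iterator a real stream=True call would return, so the pattern runs without an API key:

```python
import time

# Simulated stream of tokens; a real call would set stream=True on the
# chat completions API and iterate the returned chunks the same way.
def fake_streamed_response(text):
    for token in text.split():
        time.sleep(0.01)  # simulate per-token generation delay
        yield token + " "

# The caller can render each token as it arrives instead of waiting for
# the full paragraph, so time-to-first-token is what the user perceives.
received = []
for chunk in fake_streamed_response("streaming reduces perceived latency"):
    received.append(chunk)  # in a web app, flush each chunk to the client via SSE
full_text = "".join(received).strip()
print(full_text)
```

The total generation time is unchanged; what streaming buys you is that the user sees the first words after the time-to-first-token rather than after the full response.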
Designing an Efficient Workflow orchestration
Incorporating GenAI solutions into applications requires the utilization of specialized frameworks like LangChain or Semantic Kernel. These frameworks play a crucial role in orchestrating multiple Large Language Model (LLM) based tasks and grounding these models on custom datasets. However, it’s essential to address the latency introduced by these frameworks in the overall application response. To minimize the impact on application latency, a strategic approach is imperative. A highly effective strategy involves optimizing LLM usage through workflow consolidation, either by minimizing the frequency of calls to LLM APIs or simplifying the overall workflow steps. By streamlining the process, you not only enhance the overall efficiency but also ensure a smoother user experience.
For example, consider a requirement to identify the intent of a user query and then, based on that intent, generate a response grounded on data from multiple sources. Such requirements are typically executed as a 3-step process:
Identify the intent using an LLM.
Retrieve the content relevant to that intent from the knowledge base.
Pass that content, together with the query, to the LLM to derive the final output.
One simple approach is to leverage data engineering to build a consolidated knowledge base containing data from all sources, then use the input user text directly as the prompt against the grounded data in that knowledge base, producing the final LLM response in almost a single step.
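The difference in LLM round trips between the two flows can be sketched with hypothetical stubs (llm and search below are illustrative placeholders, not real SDK calls):

```python
# Hypothetical stubs standing in for real LLM and retrieval calls.
llm_calls = {"n": 0}

def llm(prompt):
    llm_calls["n"] += 1  # count round trips to the model
    return f"answer({prompt[:20]}...)"

def search(index, query):
    return f"docs from {index} for '{query}'"

# Three-step flow: intent-classification LLM call, retrieval, answer LLM call.
def three_step(query):
    intent = llm(f"Classify the intent of: {query}")
    context = search(f"kb_{intent}", query)
    return llm(f"Context: {context}\nQuestion: {query}")

# Consolidated flow: one unified knowledge base, one retrieval, one LLM call.
def one_step(query):
    context = search("kb_unified", query)
    return llm(f"Context: {context}\nQuestion: {query}")

three_step("How do I reset my password?")
calls_three = llm_calls["n"]  # 2 LLM round trips
one_step("How do I reset my password?")
calls_one = llm_calls["n"] - calls_three  # 1 LLM round trip
print(calls_three, calls_one)  # -> 2 1
```

Eliminating the intent-classification call removes an entire time-to-first-token delay from every request, which is usually the larger share of the saving.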
Improving Latency in Ancillary AI Services
The supporting AI services like Vector DB, Azure AI Search, data pipelines, and others that complement a Language Model (LLM)-based application within the overall Retrieval-Augmented Generation (RAG) pattern are often referred to as “ancillary AI services.” These services play a crucial role in enhancing different aspects of the application, such as data ingestion, searching, and processing, to create a comprehensive and efficient AI ecosystem. For instance, in scenarios where data ingestion plays a substantial role, optimizing the ingestion process becomes paramount to minimize latency in the application.
Similarly, let's look at improving a few other such services.
Azure AI Search
Here are some tips for better performance in Azure AI Search:
Index size and schema: Queries run faster on smaller indexes. One best practice is to periodically revisit index composition, both schema and documents, to look for content reduction opportunities. Schema complexity can also adversely affect indexing and query performance. Excessive field attribution builds in limitations and processing requirements.
Query design: Query composition and complexity are one of the most important factors for performance, and query optimization can drastically improve performance.
Service capacity: A service is overburdened when queries take too long or when the service starts dropping requests. To avoid this, you can increase capacity by adding replicas or upgrading the service tier.
For more information on optimizing Azure AI Search indexes, please refer here.
For optimizing third-party vector databases, consider exploring techniques such as vector indexing, Approximate Nearest Neighbor (ANN) search (instead of KNN), optimizing data distribution, implementing parallel processing, and incorporating load balancing strategies. These approaches enhance scalability and improve overall performance significantly.
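For context, the exhaustive (KNN) scan that ANN indexes replace can be sketched as below. This pure-Python brute-force search is illustrative only; a real vector database would use an optimized ANN structure such as an HNSW graph or IVF index instead of scoring every vector:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def knn(query, vectors, k=2):
    """Exact k-nearest-neighbor search: score every stored vector."""
    scored = sorted(vectors.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional "embeddings" for three documents.
vectors = {
    "doc1": [1.0, 0.0, 0.1],
    "doc2": [0.9, 0.1, 0.0],
    "doc3": [0.0, 1.0, 0.2],
}
top = knn([1.0, 0.0, 0.0], vectors, k=2)
print(top)  # -> ['doc1', 'doc2']
```

Because this scan is O(n) per query, latency grows linearly with corpus size; ANN indexes keep query time near-constant at the cost of occasionally missing an exact nearest neighbor.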
Conclusion
In conclusion, these strategies contribute significantly to mitigating latency and improving response times in large language model applications. However, given the inherent complexity of these models, the optimal response time can fluctuate between milliseconds and 3-4 seconds. It is crucial to recognize that comparing the response expectations of large language models to those of traditional applications, which typically operate in milliseconds, may not be entirely equitable.
Microsoft Tech Community – Latest Blogs – Read More
New Teams for US Government (GCC) Webinars
Discover the latest updates and best practices for a seamless transition to the new Microsoft Teams. Join us for an informative webinar with Teams Engineering, designed to provide you with firsthand knowledge. Ensure a successful migration as we approach the upcoming deadlines:
3.31.2024 (for Desktop, Web)
6.30.2024 (for VDI)
Session Agenda:
New Teams App Considerations for GCC (Mac/Web/Windows/VDI)
Migration Approaches
Known Limitations
Recent Service Updates
Q&A – extended for an additional 30 minutes (attending during this extra time is optional)
Reserve your spot today by registering:
Option #1
When: February 07, 2024 9:00 AM-10:30 AM EST
Duration: 90 minutes
Registration URL: New Teams in GCC Registration Page (eventbuilder.com)
Option #2
When: February 12, 2024 2:30 PM-4:00 PM PST
Duration: 90 minutes
Registration URL: New Teams in GCC (West Coast friendly) Registration Page (eventbuilder.com)
Don’t miss this opportunity to stay informed and make the transition to the new Teams smooth. Reserve your spot now by registering for the webinar. We look forward to your participation and addressing any questions you have, ensuring a successful migration process.
Support tip: Improving the efficiency of dynamic group processing with Microsoft Entra ID and Intune
By: Chris Kunze – Sr. Product Manager | Microsoft Intune
If you’re managing a lot of devices, you know how important it is to keep your Microsoft Entra ID dynamic group processing running smoothly and efficiently. To encourage performant dynamic group rules, the ‘contains’ and ‘notContains’ operators were recently removed (MC705357) from the rule builder’s list of operators. While it’s still possible to use these operators if you edit the rule syntax manually, there is a reason they were removed: certain properties and operators, such as ‘contains’ and ‘match’, are significantly less efficient in group processing than others, and that inefficiency can lead to significant delays in dynamic group processing. You can optimize these rules by using more performant alternatives such as ‘equals’, ‘notEquals’, ‘startsWith’, and ‘notStartsWith’.
In addition, some device properties that are available when creating a dynamic group are not indexed, which also leads to inefficiencies in processing group membership. It’s best to avoid using these properties until they are indexed, if possible. The deviceOwnership and enrollmentProfileName properties have recently been indexed, and work is ongoing to index the following properties to improve dynamic group processing efficiency:
deviceCategory
deviceManagementAppId
deviceManufacturer
deviceModel
deviceOSType
deviceOSVersion
devicePhysicalIds
deviceTrustType
isRooted
managementType
objectId
profileType
systemLabels
Using this guidance, we saw significant improvement in group membership evaluation times in a large customer’s production environment.
Here’s a quick example. An organization wants to group all devices that were enrolled with any of these 3 enrollment profiles:
iOS devices – Teachers
iOS devices – Students
iOS devices – Admins
While the rule (device.enrollmentProfileName -contains "iOS devices") works, the rule (device.enrollmentProfileName -startsWith "iOS devices") yields the same results and is a much more efficient query.
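The equivalence of the two rules for these profile names is easy to check; here is the same comparison expressed in Python purely for illustration (this is not Entra rule syntax), using the three example profiles above plus one non-matching name:

```python
# All three matching enrollment profile names share the prefix "iOS devices",
# so a prefix match selects exactly the same set as a substring match.
profiles = [
    "iOS devices - Teachers",
    "iOS devices - Students",
    "iOS devices - Admins",
    "Android devices - Teachers",
]
by_contains = [p for p in profiles if "iOS devices" in p]
by_startswith = [p for p in profiles if p.startswith("iOS devices")]
print(by_contains == by_startswith)  # -> True
```

The prefix form can be answered from an index on the property, whereas a substring match generally cannot, which is where the processing-time difference comes from.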
Evaluating your dynamic group rules with PowerShell
The following is a sample script that you can use to output the displayName, id, and membershipRule for each of the dynamic groups in your organization to a CSV file. Using this output, you can quickly list and evaluate the membership rules of all your Microsoft Entra ID dynamic groups for inefficiencies and start improving them.
$csvPath = "C:\temp"
$csvFile = "dynGroups.csv"

# Require the Microsoft.Graph PowerShell module
if (!(Get-InstalledModule Microsoft.Graph -ErrorAction SilentlyContinue)) {
    Write-Host "You need to install the Microsoft.Graph module to run this script." -ForegroundColor Red
    Write-Host "Run 'Install-Module Microsoft.Graph -Scope CurrentUser' as an administrator" -ForegroundColor Red
    exit 1
}

# Connect with read-only scopes if there is no existing Graph session
if (!(Get-MgContext -ErrorAction SilentlyContinue)) {
    Connect-MgGraph -Scopes "Directory.Read.All,Group.Read.All"
}

# Fetch the first page of dynamic groups, then follow @odata.nextLink paging
$results = Invoke-MgGraphRequest -Method GET -Uri "https://graph.microsoft.com/v1.0/groups?`$filter=groupTypes/any(c:c+eq+'dynamicMembership')"
$dynamicGroups = $results.value
while ($results.'@odata.nextLink') {
    $results = Invoke-MgGraphRequest -Method GET -Uri $results.'@odata.nextLink'
    $dynamicGroups += $results.value
}

$dynamicGroups | Select-Object displayName, id, membershipRule |
    Export-Csv -Path (Join-Path $csvPath $csvFile) -NoTypeInformation
Conclusion
We recommend evaluating your group membership rules to see how you can write them more efficiently. Use ‘Equals’ and ‘Starts With’ wherever possible and avoid using the non-indexed properties listed above if they don’t materially change the membership of the dynamic group. You can learn more about creating efficient rules by reading this documentation: Create simpler, more efficient rules for dynamic groups in Microsoft Entra ID.
We hope this helps to improve the processing of your dynamic group memberships! If you have any questions, leave a comment below or reach out to us on X @IntuneSuppTeam.