Tag Archives: microsoft
How do I get a refund for a wrong transaction in PhonePe?
If money has been debited from PhonePe, these steps can be taken: contact customer support (9154^621√700). In the PhonePe app, go to the "Transactions" or "History" section, select the failed transaction, and choose the "Refund" or "Request a refund" option.
Where do I find account-opening customer service on the Tenglong company official website?
To open an account with Tenglong, first find the Tenglong official website, for example tl0809.com, or add the WeChat of the company's customer service lead, ch0098888, to learn about the company online. It is best to add the customer service lead's WeChat so that you can also ask him about any problems you run into on the platform. After entering the URL above in a browser and reaching the official site, click to register an account, fill in your preferred username and password on the registration page, and enter the other required information. After completing registration, return to the home page to download the app; once it is downloaded, log in and you can play normally.
EDGE Issue
We are using the Edge browser with iOS 17.7, and the website will not open: there is no error, just a white page. Please provide the necessary information/solution regarding this issue.
AG account-opening URL and AG platform account-opening customer service
AG account-opening customer service WeChat: ch0098888.
How do you choose a legitimate AG online entertainment platform? Anti-fraud tips you need to know!
In recent years, the rapid growth of online entertainment platforms has given users more choices, but it has also given unscrupulous platforms an opening. Today, let's talk about how to identify illegitimate platforms and avoid being scammed.
1. Key points for choosing a legitimate platform
Legitimate platforms usually hold multiple international certifications and third-party verifications to protect users' rights. The following features can help you make an initial judgment about a platform's authenticity:
Legal certification
Compliant entertainment platforms usually hold well-known international certifications, such as authorizations issued by the Malta gaming authority or the UK gaming commission. Users can check certification information on the platform's official website; legitimate platforms generally display it transparently.
Third-party testing
Trusted platforms usually invite independent third-party organizations (such as GLI or eCOGRA) to test the fairness of platform content. These organizations audit the platform's technology to ensure its fairness.
User reputation
Reading other users' reviews and feedback helps you judge a platform's credibility. You can look at other players' posts on social media and review sites to get a fuller picture of the platform.
2. How to recognize the usual tricks of illegitimate platforms
Exaggerated marketing
Illegitimate platforms often exaggerate returns or use "fast rebate" slogans to attract sign-ups. Be wary of this kind of marketing; legitimate platforms do not make such false promises.
Deposit inducement
Some platforms use specific mechanisms to induce users to top up repeatedly. Plan your entertainment spending sensibly and do not trust promises of high returns.
Opaque withdrawal rules
Some platforms impose complicated terms when users withdraw funds, or even delay withdrawals for no reason. Legitimate entertainment platforms generally have clear, transparent withdrawal processes, and users can review these terms before registering.
3. How do you protect yourself?
Stay alert and don't lose big chasing small gains
Some platforms use "super-high rewards" slogans to attract new users. Keep your eyes open when choosing, do not trust promotions blindly, and register with caution.
Spread your participation
When trying different platforms, avoid putting too much into any single one; choosing platforms with good user feedback can reduce risk.
Pay attention to platform transparency
When choosing a platform, prefer companies with clear user agreements, privacy policies, and security safeguards; legitimate platforms usually publish these terms to protect users' right to know.
Conclusion
With online entertainment platforms multiplying today, users need not only to enjoy the entertainment but also to sharpen their ability to spot fake platforms. Understanding platform certifications, user feedback, and the relevant terms helps you protect your rights and avoid potential risks.
We hope this article gives you more confidence when choosing an entertainment platform and helps you avoid the traps of illegitimate ones!
How do I complain about a wrong PhonePe transaction?
If money has been debited from PhonePe, these steps can be taken: contact customer support (9154^621√700). In the PhonePe app, go to the "Transactions" or "History" section, select the failed transaction, and choose the "Refund" or "Request a refund" option.
How do I file a complaint with Union Bank online?
To file a complaint against Union Bank, you can follow these steps: 198 Contact Union Bank customer support: (O9339✓176✓804). Visit the Airtel website or open Union Bank on your mobile device.
Does Hololens support compiling scripts on the fly?
I’m thinking of building a VR application using Unity (and I’m still a beginner).
Is it possible to generate new code while the user is using the VR application, and recompile it while the application is being run?
How do I file a complaint with Union Bank online?
To file a complaint against Union Bank, you can follow these steps: 198 Contact Union Bank customer support: (O8102✓611✓817). Visit the Airtel website or open Union Bank on your mobile device.
Azure Deployment Dashboard
Is there a way to build a dashboard view for deployment status (Active, Successful, Failed) from activity logs across subscriptions in my tenant?
I tried the Azure Resource Graph table deploymentresources, but it does not provide the who (actor) and when details. Please advise.
How do I turn off Kuku FM autopay?
To stop automatic payments to Kuku FM, contact their customer service number, (0801~6727√970), and request that auto-pay be disabled.
How do I file a complaint with Union Bank?
To file a complaint against Union Bank, you can follow these steps: 198 Contact Union Bank customer support: (O8102✓611✓817). Visit the Airtel website or open Union Bank on your mobile device.
How do I file a complaint with Indian Bank?
To file a complaint with Indian Bank, you can follow these steps: 1980 Contact Axis Bank customer support: (08102✓611✓811). Visit the Airtel website or open Axis Bank on your mobile device.
How do I file a complaint with Meesho?
Go to the customer support portal on Meesho's website and file a complaint. You can also call the Meesho helpline number (+91 O8102-611-817) (languages: Hindi and English).
Review Defender Scan Results – Linux
Hi Team,
Please advise how to review Defender full scan results on a Linux endpoint and any detections identified. Per Microsoft, they should show up in the Microsoft 365 Defender > Alerts section; however, I have found nothing there.
I have tried to browse the directory on the endpoint, /var/opt/microsoft/mdatp/log/, but it doesn't exist. Do I have to enable logging to review scan results?
Can these results be shipped to Sentinel so that we have logging enabled?
How can I complain to Meesho?
Go to the customer support portal on Meesho's website and file a complaint. You can also call the Meesho helpline number (+91 O8102-611-817) (languages: Hindi and English).
When Windows 10 Paint won't launch
Where do I download Paint? Do I need to reinstall it? It is different from Windows 11 Paint.
Limitations of Web-Version Office 365 Excel Data Validation
I just need to check with someone else: does the web version of Microsoft 365 lack certain features that are available in desktop copies of Excel? I'm trying to create a dependent drop-down, but I keep getting error messages when I adjust the formula in Data Validation to make it cascade into lower rows automatically.
Every single tutorial I've seen and read has been for a desktop version of Excel, so I'm starting to lose hope that it's even possible. All I want is a single layer of dependent drop-downs that works on every row in a sheet, but getting it to work on every row is proving impossible.
If there's a way to get the formula set up in B3 to work in every row below, changing the reference to B4 and so on, without moving the cell references for the tables, that'd be terrific. But I don't think it's possible in the web version of Excel.
Questions on the Microsoft Edge effect "Enhance video".
Hi, my name is Cristian, and recently I've been having questions about the "Enhance video" effect that Microsoft Edge provides to improve the appearance of a video while you're watching it.
My question is whether anyone knows the exact values that the effect applies when you turn it on, so I can replicate it on my own.
That's it. Thank you!
Fine-tune/Evaluate/Quantize SLM/LLM using torchtune on Azure ML
In this blog, we’ll explore how to leverage torchtune on Azure ML to fine-tune, evaluate, and quantize small and large language models (SLM/LLM) effectively.
As demand for adaptable and efficient language models grows, there’s a need for robust tools that make model fine-tuning and optimization more accessible. torchtune is a versatile library that simplifies these processes, offering support for distributed training, flexible logging, and model quantization. Azure ML complements torchtune by providing scalable infrastructure and integration options, making it an ideal platform for experimenting with and deploying SLM/LLMs.
This guide provides hands-on code examples and step-by-step instructions for:
- Setting up Azure ML to work with torchtune for distributed model fine-tuning.
- Handling dynamic path adjustments in the YAML recipe, particularly useful for Azure’s storage-mounted environments.
- Applying quantization techniques to optimize models for deployment on resource-limited devices.
By the end of this guide, you’ll be equipped to run scalable and efficient language model pipelines using torchtune on Azure ML, enhancing your model’s performance and accessibility.
Hands-on Labs: https://github.com/Azure/torchtune-azureml
1. Introduction
1.1. torchtune
torchtune is a Python library designed to simplify fine-tuning SLM/LLM models with PyTorch. torchtune stands out for its simplicity and flexibility, enabling users to perform fine-tuning, evaluation, and quantization effortlessly with minimal code through YAML-based recipes. This intuitive setup allows users to define and adjust complex training configurations in a structured, readable format, reducing the need for extensive code changes. By centralizing settings into a YAML recipe, torchtune not only speeds up the experimentation process but also makes it easy to replicate or modify configurations across different models and tasks. This approach is ideal for streamlining model optimization, ensuring that fine-tuning and deployment processes are both quick and highly adaptable.
The representative features are as follows:
- Easy Model Tuning: torchtune is a PyTorch-native library that simplifies SLM fine-tuning, making it accessible to users without advanced AI expertise.
- Easy Application of Distributed Training: torchtune simplifies the setup for distributed training, allowing users to scale their models across multiple GPUs with minimal configuration. This significantly reduces users' trial and error.
- Simplified Model Evaluation and Quantization: torchtune makes model evaluation and quantization straightforward, providing built-in support to easily assess model performance and optimize models for deployment.
- Scalability and Portability: torchtune is flexible enough to be used on various cloud platforms and local environments. It can be easily integrated with AzureML.
For more information about torchtune, please check this link.
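As a quick illustration of the recipe-driven workflow, the torchtune CLI can list built-in recipes, copy a config locally, and launch a run. This is a minimal sketch; the recipe and config names shown (lora_finetune_single_device, llama3_2/1B_lora_single_device) are examples that depend on your torchtune version and model:

tune ls                                                      # list built-in recipes and configs
tune cp llama3_2/1B_lora_single_device my_lora_config.yaml   # copy a built-in config to edit locally
tune run lora_finetune_single_device --config my_lora_config.yaml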
1.2. Azure ML with torchtune
Running torchtune on AzureML offers several advantages that streamline the GenAI workflow. Here are some key benefits of using AzureML with torchtune:
- Scalability and Compute Power: Azure ML provides powerful, scalable compute resources, allowing torchtune to handle multiple SLMs/LLMs across multiple GPUs or distributed clusters. This makes it ideal for efficiently managing intensive tasks like fine-tuning and quantization on large datasets.
- Managed ML Environment: Azure ML offers a fully managed environment, so setting up dependencies and managing versions are handled with ease. This reduces setup time for torchtune, letting users focus directly on model optimization without infrastructure concerns.
- Model Deployment and Scaling: Once the model is optimized with torchtune, AzureML provides a straightforward pathway to deploy it on Azure’s cloud infrastructure, making it easy to scale applications to production with robust monitoring and scaling features.
- Seamless Integration with Other Azure Services: Users can leverage other Azure services, such as Azure Blob Storage for dataset storage or Azure SQL for data management. This ecosystem support enhances workflow efficiency and makes AzureML a powerful choice for torchtune-based model tuning and deployment.
2. torchtune YAML configuration
In a torchtune YAML configuration, each parameter and setting controls specific training aspects for fine-tuning large language models (LLMs). Here’s a breakdown of key components like supervised fine-tuning (SFT), direct preference optimization (DPO), knowledge distillation (KD), and quantization:
- SFT (Supervised Fine-Tuning): This setting manages the fine-tuning process by training the model with labeled datasets. It involves specifying the dataset path, batch size, learning rate, and the number of epochs. SFT is critical for adapting pre-trained models to specific tasks using supervised data.
- DPO (Direct Preference Optimization): This setting is for training models based on human preference data. It generally uses a reward model to rank outputs, guiding the model to optimize directly for preferred responses. In torchtune, you can apply DPO with a few YAML settings (see the sketch below).
- KD (Knowledge Distillation): In this setting, a larger, more accurate model (teacher) transfers knowledge to a smaller model (student). YAML settings might define teacher and student model paths, temperature (for smoothing probabilities), and alpha (weight for balancing loss between teacher predictions and labels). KD allows smaller models to mimic larger models' performance while reducing computation needs. In torchtune, you can apply KD with a few YAML settings (see the sketch below).
- Evaluation: torchtune integrates seamlessly with EleutherAI's LM Evaluation Harness, which allows you to evaluate the truthfulness and accuracy of your models using benchmarks like TruthfulQA. You can easily perform these evaluations using torchtune's eleuther_eval recipe.
- Quantization: This setting reduces model size and computational requirements by lowering the bit precision of model weights. YAML settings specify the quantization method (e.g., 8-bit or 4-bit), target layers, and possibly additional parameters for post-training quantization. This is particularly helpful for deploying models on edge devices with limited resources. In torchtune, you can apply quantization with a few YAML settings (see the sketch below).
Check out the YAML samples on torchtune’s official website.
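For orientation, here is a minimal sketch of the kinds of recipe fragments the bullets above refer to. Treat the component paths, recipe names, and values as illustrative assumptions; they vary across torchtune versions, so confirm them against the official samples:

# DPO (lora_dpo_* recipes): swap in a preference-optimization loss
loss:
  _component_: torchtune.rlhf.loss.DPOLoss

# KD (knowledge_distillation_* recipes): add a teacher checkpointer and loss weighting
teacher_checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: ./teacher_model
kd_ratio: 0.5

# Quantization (quantize recipe): pick a quantizer component and its parameters
quantizer:
  _component_: torchtune.training.quantization.Int8DynActInt4WeightQuantizer
  groupsize: 256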
3. Azure ML Training Life Hacks
Applying torchtune's standalone command to Azure ML is very simple. However, applying the pipeline of Hugging Face model download, fine-tuning, evaluation, and quantization, together with distributed training as expressed in the architecture, requires some trial and error. Refer to the life hacks below to minimize trial and error when applying them to your workload.
3.1. Downloading model
The torch_distributed_zero_first decorator is used to ensure that only one process (typically rank 0 in a distributed setup) performs certain operations, such as downloading or loading a model. This approach is crucial in a distributed environment where multiple processes might attempt to load a model concurrently, which could lead to redundant downloads, excessive memory usage, or conflicts.
Here's why torch_distributed_zero_first is used to download the model on a single process:
- Prevent Redundant Downloads: In a distributed setup, if every process tries to download the model simultaneously, it can lead to unnecessary network traffic and redundant file storage. By ensuring that only one process downloads the model, torch_distributed_zero_first prevents this redundancy.
- Avoid Conflicts and File Corruption: If multiple processes attempt to write or modify the same file during download, it could lead to file corruption or access conflicts. torch_distributed_zero_first minimizes this risk by allowing only one process to handle the file download.
After downloading, the model can be distributed or loaded into memory across all processes using standard PyTorch distributed training methods. This approach makes the model loading process more efficient and stable in multi-process environments.
3.2. Destroying process group
When applying distributed training on Azure ML with torchtune's CLI, it's essential to manage the process groups carefully. The distributed training recipe in the torchtune CLI initializes a process group using dist.init_process_group(...). However, if a process group is already active, initializing another one can cause conflicts, leading to nested or redundant process groups.
To prevent this, you should close any existing process groups before torchtune's distributed training starts. This can be done by calling dist.destroy_process_group(...) to terminate any active process groups, ensuring a clean state. By doing so, you avoid process conflicts, enabling the torchtune CLI's distributed training recipe to operate smoothly without overlapping with pre-existing groups. Code snippets for 3.1 and 3.2 are below.
import os
from contextlib import contextmanager

import torch
import torch.distributed as dist

MASTER_ADDR = os.environ.get('MASTER_ADDR', '127.0.0.1')
MASTER_PORT = os.environ.get('MASTER_PORT', '7777')
WORLD_SIZE = int(os.environ.get("WORLD_SIZE", 1))
GLOBAL_RANK = int(os.environ.get('RANK', -1))
LOCAL_RANK = int(os.environ.get('LOCAL_RANK', -1))
NUM_GPUS_PER_NODE = torch.cuda.device_count()
NUM_NODES = WORLD_SIZE // NUM_GPUS_PER_NODE

if LOCAL_RANK != -1:
    dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo")

@contextmanager
def torch_distributed_zero_first(local_rank: int):
    """
    Context manager that makes all processes in distributed training
    wait for the local master (rank 0) to do something first.
    """
    if local_rank not in [-1, 0]:
        dist.barrier(device_ids=[local_rank])
    yield
    if local_rank == 0:
        dist.barrier(device_ids=[0])

...

with torch_distributed_zero_first(LOCAL_RANK):
    # Download the teacher/student models on the local master only
    download_model(args.teacher_model_id, args.teacher_model_dir)
    download_model(args.student_model_id, args.student_model_dir)

# Construct the fine-tuning command
if "single" in args.tune_recipe:
    print("***** Single Device Training *****")
    full_command = (
        f'tune run '
        f'{args.tune_recipe} '
        f'--config {args.tune_config_name}'
    )
    # Run the fine-tuning command
    run_command(full_command)
else:
    print("***** Distributed Training *****")
    # torchtune's distributed recipe initializes its own process group,
    # so destroy the one created above to avoid nested process groups.
    dist.destroy_process_group()
    if GLOBAL_RANK in {-1, 0}:
        # Run the fine-tuning command (torchrun-style launch via the tune CLI)
        full_command = (
            f'tune run --master-addr {MASTER_ADDR} --master-port {MASTER_PORT} '
            f'--nnodes {NUM_NODES} --nproc_per_node {NUM_GPUS_PER_NODE} '
            f'{args.tune_recipe} '
            f'--config {args.tune_config_name}'
        )
        run_command(full_command)
...
3.3. Dynamic configuration
Since the path to the blob storage mounted on the computing cluster is dynamic, the YAML recipe must be modified dynamically. Here’s an example of how to adjust the configuration using Jinja templates to ensure the paths are set correctly at runtime:
# Dynamically modify the fine-tuning YAML file.
import os
from pathlib import Path

import jinja2

jinja_env = jinja2.Environment()
template = jinja_env.from_string(Path(args.tune_config_name).open().read())

train_path = os.path.join(args.train_dir, "train.jsonl")
metric_logger = "DiskLogger"
if len(args.wandb_api_key) > 0:
    metric_logger = "WandBLogger"

Path(args.tune_config_name).open("w").write(
    template.render(
        train_path=train_path,
        log_dir=args.log_dir,
        model_dir=args.model_dir,
        model_output_dir=args.model_output_dir,
        metric_logger=metric_logger
    )
)
lora_finetune.yaml code snippet
# Model arguments
model:
  ...

# Tokenizer
tokenizer:
  _component_: torchtune.models.phi3.phi3_mini_tokenizer
  path: {{model_dir}}/tokenizer.model
  max_seq_len: null

# Checkpointer
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: {{model_dir}}
  checkpoint_files: [
    model-00001-of-00002.safetensors,
    model-00002-of-00002.safetensors
  ]
  recipe_checkpoint: null
  output_dir: {{model_output_dir}}
  model_type: PHI3_MINI
resume_from_checkpoint: False
save_adapter_weights_only: False

# Dataset
dataset:
  _component_: torchtune.datasets.instruct_dataset
  source: json
  data_files: {{train_path}}
  column_map:
    input: instruction
    output: output
  train_on_input: False
  packed: False
  split: train
seed: null
shuffle: True

# Logging
output_dir: {{log_dir}}/lora_finetune_output
metric_logger:
  _component_: torchtune.training.metric_logging.{{metric_logger}}
  log_dir: {{log_dir}}/training_logs
log_every_n_steps: 1
log_peak_memory_stats: False
...
In this setup:
- The script reads the template YAML file and dynamically injects the appropriate paths and configurations.
- train_path, log_dir, model_dir, and model_output_dir are populated based on the environment's dynamically assigned paths, ensuring that the YAML file reflects the actual storage locations.
- metric_logger is set to "DiskLogger" by default but changes to "WandBLogger" if a wandb_api_key is provided, allowing for flexible metric logging configurations.
This approach guarantees that the configuration is always in sync with the environment, even when paths are assigned dynamically by Azure ML’s blob storage mounting.
3.4. Logging
When running a training pipeline with the torchtune CLI, it may be challenging to use MLflow for logging. Therefore, you should use torchtune's DiskLogger or WandBLogger instead.
The DiskLogger option logs metrics and training information directly to disk, making it a suitable choice when MLflow is unavailable. Alternatively, if you have a Weights & Biases (WandB) account and API key, the WandBLogger can be used to log metrics to your WandB dashboard, enabling remote access and visualization of training progress. This way, you can ensure robust logging and monitoring within the torchtune framework.
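As a small illustration, the logger can also be switched at launch time via the tune CLI's key=value config overrides, without editing the YAML. The component path below mirrors the metric_logger settings shown earlier and is an assumption that may differ across torchtune versions:

tune run lora_finetune_single_device --config my_lora_config.yaml \
  metric_logger._component_=torchtune.training.metric_logging.DiskLogger \
  metric_logger.log_dir=./outputs/training_logs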
4. Azure ML Training
Before reading this section, please refer to the Azure guide and past blogs (Blog 1, Blog 2) for basic information on Azure ML training and serving.
4.1. Dataset preparation
torchtune provides several dataset options, but in this blog, we will introduce how to save a Hugging Face dataset as JSON and register it as a Data asset in the Azure Blob datastore. Please note that if you would like to build or augment your own dataset, refer to the blog and the GitHub repo for synthetic data generation.
Instruction Dataset for SFT and KD
Preprocessing the dataset is not difficult, but don't forget to rename the columns to match the specifications in the YAML file.
from datasets import load_dataset

dataset = load_dataset("HuggingFaceH4/helpful_instructions", name="self_instruct", split="train[:10%]")
dataset = dataset.rename_column('prompt', 'instruction')
dataset = dataset.rename_column('completion', 'output')
print(f"Loaded Dataset size: {len(dataset)}")

if IS_DEBUG:
    logger.info("Activated Debug mode. The number of samples was resampled to 800.")
    dataset = dataset.select(range(800))
    print(f"Debug Dataset size: {len(dataset)}")

logger.info(f"Save dataset to {SFT_DATA_DIR}")
dataset = dataset.train_test_split(test_size=0.2)
train_dataset = dataset['train']
train_dataset.to_json(f"{SFT_DATA_DIR}/train.jsonl", force_ascii=False)
test_dataset = dataset['test']
test_dataset.to_json(f"{SFT_DATA_DIR}/eval.jsonl", force_ascii=False)
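To register the saved JSONL files as a Data asset, a minimal sketch with the azure-ai-ml v2 SDK is below. AZURE_SFT_DATA_NAME and SFT_DATA_DIR follow the names used elsewhere in this post; the description string is an assumption:

from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data

# Register the folder holding train.jsonl/eval.jsonl as a uri_folder Data asset,
# so the training job can consume it via Input(path=f"{AZURE_SFT_DATA_NAME}@latest").
sft_data = Data(
    name=AZURE_SFT_DATA_NAME,
    path=SFT_DATA_DIR,
    type=AssetTypes.URI_FOLDER,
    description="SFT instruction dataset for torchtune fine-tuning.",  # assumed description
)
ml_client.data.create_or_update(sft_data)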
Preference Dataset for DPO
For the preference dataset, it may be necessary to convert it into a chat template format. Below is a code example.
import json

from datasets import load_dataset

def convert_to_preference_format(dataset):
    json_format = [
        {
            "chosen_conversations": [
                {"content": row["prompt"], "role": "user"},
                {"content": row["chosen"], "role": "assistant"}
            ],
            "rejected_conversations": [
                {"content": row["prompt"], "role": "user"},
                {"content": row["rejected"], "role": "assistant"}
            ]
        }
        for row in dataset
    ]
    return json_format

# Load dataset from the hub
data_path = "jondurbin/truthy-dpo-v0.1"
dataset = load_dataset(data_path, split="train")
print(f"Dataset size: {len(dataset)}")

# if IS_DEBUG:
#     logger.info("Activated Debug mode. The number of samples was resampled to 800.")
#     dataset = dataset.select(range(800))

logger.info(f"Save dataset to {DPO_DATA_DIR}")
dataset = dataset.train_test_split(test_size=0.2)
train_dataset = dataset['train']
test_dataset = dataset['test']
train_dataset = convert_to_preference_format(train_dataset)
test_dataset = convert_to_preference_format(test_dataset)

with open(f"{DPO_DATA_DIR}/train.jsonl", "w") as f:
    json.dump(train_dataset, f, ensure_ascii=False, indent=4)
with open(f"{DPO_DATA_DIR}/eval.jsonl", "w") as f:
    json.dump(test_dataset, f, ensure_ascii=False, indent=4)
4.2. Environment asset
You can add pip install to the command based on the curated environment or add a conda-based custom environment, but in this blog, we will add a Docker-based custom environment.
FROM mcr.microsoft.com/aifx/acpt/stable-ubuntu2004-cu124-py310-torch241:biweekly.202410.2

# Install pip dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt --no-cache-dir

# Inference requirements
COPY --from=mcr.microsoft.com/azureml/o16n-base/python-assets:20230419.v1 /artifacts /var/
RUN /var/requirements/install_system_requirements.sh && \
    cp /var/configuration/rsyslog.conf /etc/rsyslog.conf && \
    cp /var/configuration/nginx.conf /etc/nginx/sites-available/app && \
    ln -sf /etc/nginx/sites-available/app /etc/nginx/sites-enabled/app && \
    rm -f /etc/nginx/sites-enabled/default

ENV SVDIR=/var/runit
ENV WORKER_TIMEOUT=400
EXPOSE 5001 8883 8888

# Support the DeepSpeed launcher requirement of passwordless SSH login
RUN apt-get update
RUN apt-get install -y openssh-server openssh-client

RUN MAX_JOBS=4 pip install flash-attn==2.6.3 --no-build-isolation
[Tip] If you are building a container with Ubuntu 22.04, make sure to remove the liblttng-ust0 related packages/dependencies. Otherwise, you will get an error when building the container.
FROM mcr.microsoft.com/aifx/acpt/stable-ubuntu2204-cu124-py310-torch250:biweekly.202410.2
...
# Remove packages or dependencies related to liblttng-ust0.
# Starting from Ubuntu 22.04, liblttng-ust0 has been updated to liblttng-ust1 package, deprecating liblttng-ust0 for compatibility reasons.
# If you build a docker file on Ubuntu 22.04 without including this syntax, you will get the following liblttng-ust0 error:
# -- Package 'liblttng-ust0' has no installation candidate
RUN sed -i '/liblttng-ust0/d' /var/requirements/system_requirements.txt
...
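Once the Dockerfile is ready, it can be registered as an Environment asset. Below is a minimal sketch with the azure-ai-ml v2 SDK, assuming the Dockerfile lives in a local ./docker folder and the asset name matches the azure_env_name variable used later in this post:

from azure.ai.ml.entities import BuildContext, Environment

# Build the custom Docker image from a local context (./docker is an assumed layout)
# and register it under azure_env_name, referenced later as f"{azure_env_name}@latest".
env_docker = Environment(
    name=azure_env_name,
    build=BuildContext(path="./docker", dockerfile_path="Dockerfile"),
    description="Custom Docker environment for torchtune fine-tuning.",  # assumed description
)
ml_client.environments.create_or_update(env_docker)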
4.3. Start a Training job
The code snippet below activates a compute cluster for training. The command allows the user to configure the following key aspects:
- code – This is the path where the code to run the command is located.
- command – This is the command that needs to be run, referencing inputs using the ${{inputs.<input_name>}} expression.
- inputs – This is the dictionary of inputs using name/value pairs to the command. To use files or folders as inputs, we can use the Input class. The Input class supports three parameters:
  - type – The type of input. This can be a uri_file or uri_folder. The default is uri_folder.
  - path – The path to the file or folder. These can be local or remote files or folders. For remote files, http/https and wasb are supported. Azure ML data/dataset or datastore are of type uri_folder. To use data/dataset as input, you can use a registered dataset in the workspace using the format '<data_name>:<version>', e.g., Input(type='uri_folder', path='my_dataset:1').
  - mode – Mode of how the data should be delivered to the compute target. Allowed values are ro_mount, rw_mount, and download. The default is ro_mount.
- environment – This is the environment needed for the command to run. Curated (built-in) or custom environments from the workspace can be used.
- compute – The compute on which the command will run. You can run it on the local machine by using local for the compute.
- instance_count – Number of nodes. The default is 1.
- distribution – Distribution configuration for distributed training scenarios. Azure Machine Learning supports PyTorch, TensorFlow, and MPI-based distributed training.
from azure.ai.ml import command
from azure.ai.ml import Input
from azure.ai.ml.entities import ResourceConfiguration
from utils.aml_common import get_num_gpus

num_gpu = get_num_gpus(azure_compute_cluster_size)
logger.info(f"Number of GPUs={num_gpu}")

str_command = ""
if USE_BUILTIN_ENV:
    str_env = "azureml://registries/azureml/environments/acpt-pytorch-2.2-cuda12.1/versions/19"  # Use built-in Environment asset
    str_command += "pip install -r requirements.txt && "
else:
    str_env = f"{azure_env_name}@latest"  # Use custom Environment asset

if num_gpu > 1:
    tune_recipe = "lora_finetune_distributed"
    str_command += "python launcher_distributed.py "
else:
    tune_recipe = "lora_finetune_single_device"
    str_command += "python launcher_single.py "

if wandb_api_key is not None and len(wandb_api_key) > 0:
    str_command += (
        "--wandb_api_key ${{inputs.wandb_api_key}} "
        "--wandb_project ${{inputs.wandb_project}} "
        "--wandb_watch ${{inputs.wandb_watch}} "
    )

str_command += (
    "--train_dir ${{inputs.train_dir}} "
    "--hf_token ${{inputs.hf_token}} "
    "--tune_recipe ${{inputs.tune_recipe}} "
    "--tune_action ${{inputs.tune_action}} "
    "--model_id ${{inputs.model_id}} "
    "--model_dir ${{inputs.model_dir}} "
    "--log_dir ${{inputs.log_dir}} "
    "--model_output_dir ${{inputs.model_output_dir}} "
    "--tune_config_name ${{inputs.tune_config_name}}"
)
logger.info(f"Tune recipe: {tune_recipe}")

job = command(
    inputs=dict(
        #train_dir=Input(type="uri_folder", path=SFT_DATA_DIR),  # Get data from local path
        train_dir=Input(path=f"{AZURE_SFT_DATA_NAME}@latest"),   # Get data from Data asset
        hf_token=HF_TOKEN,
        wandb_api_key=wandb_api_key,
        wandb_project=wandb_project,
        wandb_watch=wandb_watch,
        tune_recipe=tune_recipe,
        tune_action="fine-tune,run-quant",
        model_id=HF_MODEL_NAME_OR_PATH,
        model_dir="./model",
        log_dir="./outputs/log",
        model_output_dir="./outputs",
        tune_config_name="lora_finetune.yaml"
    ),
    code="./scripts",  # local path where the code is stored
    compute=azure_compute_cluster_name,
    command=str_command,
    environment=str_env,
    instance_count=1,
    distribution={
        "type": "PyTorch",
        "process_count_per_instance": num_gpu,  # For multi-gpu training set this to an integer value more than 1
    },
)
returned_job = ml_client.jobs.create_or_update(job)
logger.info("""Started training job. A dedicated Compute Cluster for training is now provisioned, and the environment
required for training is automatically set up from the Environment asset.
If you have set up a new custom Environment, it will take approximately 20 minutes or more to build the Environment before provisioning the training cluster.
""")
ml_client.jobs.stream(returned_job.name)
4.4. Logging
Use torchtune.training.metric_logging.DiskLogger or torchtune.training.metric_logging.WandBLogger. When applying DiskLogger, the save path must be a subfolder of outputs; otherwise, you cannot check it in the Azure ML UI.
[Screenshot: DiskLogger applied]
[Screenshot: WandBLogger applied]
Any additional training history is recorded in the user_logs folder of Azure ML. [Screenshot: user_logs example when using Standard_NC48ads_A100_v4 (2x NVIDIA A100 GPUs) as a compute cluster]
Please do not forget to save the quantized model parameters when you apply the fine-tuning/evaluation/quantization pipeline in your training code. It is recommended that you also save the original model weights before quantization for comparison.
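As a minimal sketch of that advice, the snippet below copies quantized checkpoint files into an outputs/quant subfolder so they are captured as job artifacts. The source directory name "quantized" is an assumption about where your quantization step writes its output; the registration code later in this post expects the files under {model_dir}/quant:

import os
import shutil

# Copy quantized checkpoint files into outputs/quant so they land in the job artifacts.
quant_src = os.path.join(args.model_output_dir, "quantized")  # assumed quantization output dir
quant_dst = os.path.join("outputs", "quant")
os.makedirs(quant_dst, exist_ok=True)
for fname in os.listdir(quant_src):
    shutil.copy2(os.path.join(quant_src, fname), os.path.join(quant_dst, fname))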
4.5. Registering a Model
Once you have fine-tuned and quantized your model using torchtune, you can register it as a Model asset on Azure ML. This registration process makes model management and deployment more efficient and organized. Here are the advantages of registering a Model asset:
- Version Control: Azure ML’s Model asset allows you to maintain multiple versions of a model. Each new iteration of your model, whether it’s a different fine-tuning configuration or an updated quantization approach, can be registered as a new version. This makes it easy to track model evolution, compare performance across versions, and roll back to previous versions if necessary.
- Centralized Repository: By registering your model as an asset, you store it in a centralized repository. This repository provides easy access for other team members or projects within your organization, enabling collaboration and consistent model usage across different applications.
- Deployment Ready: Models registered as assets in AzureML are directly deployable. This means you can set up endpoints, batch inference pipelines, or other serving mechanisms using the registered model, streamlining the deployment process and minimizing potential errors.
- Metadata Management: Along with the model, you can also store relevant metadata (such as training configuration, environment details, and evaluation metrics) in the Model asset. This metadata is essential for reproducibility and for understanding model performance under different conditions.
Below is a code snippet that registers a model asset and downloads the model artifact.
from azure.ai.ml.entities import Model
from azure.core.exceptions import ResourceExistsError, ResourceNotFoundError

def get_or_create_model_asset(ml_client, model_name, job_name, model_dir="outputs", model_type="custom_model",
                              download_quantized_model_only=False, update=False):
    try:
        latest_model_version = max([int(m.version) for m in ml_client.models.list(name=model_name)])
        if update:
            raise ResourceExistsError('Found Model asset, but will update the Model.')
        else:
            model_asset = ml_client.models.get(name=model_name, version=latest_model_version)
            print(f"Found Model asset: {model_name}. Will not create again")
    except (ResourceNotFoundError, ResourceExistsError, ValueError) as e:
        # ValueError covers the case where no versions exist yet (max() on an empty list)
        print(f"Exception: {e}")
        model_path = f"azureml://jobs/{job_name}/outputs/artifacts/paths/{model_dir}"
        if download_quantized_model_only:
            model_path = f"azureml://jobs/{job_name}/outputs/artifacts/paths/{model_dir}/quant"
        run_model = Model(
            name=model_name,
            path=model_path,
            description="Model created from run.",
            type=model_type  # mlflow_model, custom_model, triton_model
        )
        model_asset = ml_client.models.create_or_update(run_model)
        print(f"Created Model asset: {model_name}")
    return model_asset

model = get_or_create_model_asset(ml_client, azure_model_name, job_name, model_dir, model_type="custom_model",
                                  download_quantized_model_only=True, update=False)

# Download the model (this is optional)
DOWNLOAD_TO_LOCAL = False
local_model_dir = "./artifact_downloads_dpo"
if DOWNLOAD_TO_LOCAL:
    os.makedirs(local_model_dir, exist_ok=True)
    ml_client.models.download(name=azure_model_name, download_path=local_model_dir, version=model.version)
We have published the end-to-end code for this post at https://github.com/Azure/torchtune-azureml. We hope it helps you easily perform fine-tuning, evaluation, and quantization using torchtune and Azure ML.
References
- Azure ML Fine-tuning (Florence-2) Blog
- Synthetic QnA Generation Blog
- torchtune official website
- Fine-tune Meta Llama 3.1 models using torchtune on Amazon SageMaker