Category: Microsoft
AdaptiveCard not working with incoming webhook
Does anyone have any idea why the webhook endpoint would return the following error? I already have the type set, yet I still get this error in the response, even though the card is actually posted in Teams. It seems to work fine with MessageCard, though.
“AdaptiveCards.AdaptiveSerializationException: Property ‘type’ must be ‘AdaptiveCard’”
Here is an example payload that still fails for me with the same error message.
const payload = {
  type: 'message',
  attachments: [
    {
      contentType: 'application/vnd.microsoft.card.adaptive',
      contentUrl: null,
      content: {
        $schema: 'http://adaptivecards.io/schemas/adaptive-card.json',
        type: 'AdaptiveCard',
        version: '1.6',
        body: [
          {
            type: 'TextBlock',
            text: 'Submitted response',
          },
        ],
      },
    },
  ],
};
axios
  .post(options.webhookUrl, payload)
  .then(function (response) {
    console.log(response);
  })
  .catch(function (error) {
    console.log(error);
  });
Trojan and Malware download
Recently, I ran a full scan of my computer using Defender and it reported trojan malware on my computer. The threat was quarantined. My question is: why did the system not detect the trojan when it was downloaded in the first place? How confident can I be that Defender will protect my system? Here is the message I got:
Is there a setting that I can enable to make sure this does not happen in the future? I have the default settings now. Having used Microsoft 365 since the BPOS days in 2009, I was pretty confident about the capabilities of Defender. This now has me worried. Should I install McAfee or Norton in addition to Defender? Thanks
Show data value when I hover mouse over a data point on Chart created in Microsoft 365
I have installed Microsoft 365 free. I have entered some data in an Excel spreadsheet and created an Excel chart. The chart is fine, but I do not see the data value when I hover the mouse over a data point. This works when I open the same file using an earlier version of Office. Is there an option somewhere to turn this feature on, or is it not available in Microsoft 365 free?
Regards, John
Is it safe to link ID?
Hello everyone,
My question is: can I safely link my Microsoft account with the third-party CapCut app? If any of you have experience with this, please let me know, because I need to edit my template video. Here is the app link I am referring to.
Thanks
Using SUMIF with Greater/less than times
Hi there,
I have a spreadsheet that calculates a time. I want to automate a count of how many are >3:00 and <3:00.
I use =SUMIF(M2:M110,"<3:00")
But instead of getting an answer with a whole number, I’m getting a strange percentage .21 ? (The column format is set to general, I changed it to number and it didn’t make a difference).
Likewise, when I run it for >3:00; I get 3.12 …
What am I doing wrong?
Thanks
Sarah
Microsoft Defender for Endpoint Plan 2 enforcing and licenses
Hi Everyone,
We started onboarding our devices to Defender, and I have noticed that the Subscription state for Microsoft Defender for Endpoint Plan 2 says “These plan features and capabilities are applied to all your devices”. How can I restrict this to the devices in a specific security group?
Also, once we finished the deployment, we were prompted with an error stating “Your organization is using more Plan 2 licenses than you own. Compare flexible purchase options, or assign licenses to devices in the subscription settings”. How can we rectify the situation, given that we are pushing this to mobile phones as well? I understand that every user gets to enroll 5 devices, so it is normal for the device count to reach 5 times the number of licenses without any compliance issues, correct?
What if the number of devices is less than 5 times the number of licenses, but some users who don’t have licenses are using these devices? Will that be a problem?
Thanks
Microsoft Defender for Endpoint for Server vs for Endpoint
Hi Everyone,
Can I onboard my servers to Microsoft Defender for Endpoint P2, or is it mandatory to obtain a separate Microsoft Defender for Endpoint for Server license for them?
Also, what is the difference between the two licenses? Is it for compliance only, with the two being technically the same?
Thanks
PivotChart (bar graph) of average with standard deviation as error
Hello everybody out there using Excel,
Is there a way to create a PivotChart with a group average as the height of the bars and the standard deviation as the length of the error indicator?
I managed to generate a pivot table with the average and the standard deviation as columns, but Excel would consider each column as an individual group of bars (which the one with the standard errors isn’t).
how to recover my office product key
Can you guys teach me how to activate my Office? I recently deleted my Microsoft Office; I didn’t know that it could affect all Microsoft products, and now I can’t access any of them because they require a product key. This affects me because I need Office for my studies, but now I can’t access it. I hope you guys can help me. I’m Zhadrack Quibo from the Philippines.
Word Print format
Good evening,
I created an organization chart in Word with a large page layout (ANSI C, 43.18 cm x 55.88 cm).
Now I want to print it on A4. But when I choose A4, only a fraction of my chart shows up; it is cropped. When I print it to a PDF and then print that on A4, it looks fine.
How can I print it directly from Word? When I shrink the chart manually, the text in it ends up at about 7 pt, which is way too small.
I cannot find an option in the print options to scale it automatically to the right size.
Maybe someone can help me with this?
Thanks for your time!
Best regards
Months remaining
Good morning,
I’m currently having an issue getting a formula to work.
Currently, column I contains expiration dates formatted as MMM-YYYY. I need column J to show either months remaining as a positive number or months passed as a negative number. In addition, if column I shows N/A, column J should just show a blank string. This is the formula I have, but I keep getting either #NUM! for months that haven’t passed yet or the number 1494 for some odd reason.
This is the formula I’m working with:
=IF(OR(I3="",I3="N/A"),"",IF(TODAY()>=I3,DATEDIF(I3,TODAY(),"m")*(-1),IF(DATE(YEAR(12),MONTH(12),1)>TODAY(),DATEDIF(12,DATE(YEAR(TODAY()),MONTH(TODAY()),0),"m"),DATEDIF(12,EOMONTH(TODAY(),0),"m"))))
Hotmail Subscription Query
Hope this makes sense.
For years now I’ve subscribed to the premium hotmail account for ad free and larger email storage etc, about £15 a year. This year I’ve got a new MacBook and taken out a yearly subscription to office 365 which I’ll probably keep subscribing to.
My question is, now that I’m doing this can I cancel the premium hotmail and get the benefits now I’m a yearly subscriber please?
Thanks in advance
How to get access to the free office licenses!
Hi,
Techsoup advise that the non-profit organisation I represent has been approved for the related application despite the hub reporting that eligibility is still being reviewed a month post the application. No confirmation email has been received.
But when I login to the Non-profit hub I don’t see any way to access the free licenses. We are a small organisation and the limited 10 free licenses on offer would suit.
Copy cells onto another sheet
If I type a number into cell A1 on sheet one so the cell A1 on sheet two copies, is there a formula for the next day, to change the number in cell A1 on sheet one and the cell A1 on sheet two stays the same as the previous day but cell A2 on sheet two updates to the new figure in cell A1 on sheet one?
Thanks
Conditional Access Grant Access options
Scenario:
In Conditional Access Policies, under the grant controls section, we select 2 options:
1. Require multifactor authentication
2. Require approved client app
and then, for multiple controls, we select the “Require one of the selected controls” option.
Now assuming all the conditions defined in previous steps are satisfied, in this case which of the above 2 options would be evaluated? Is there a criteria? I tried checking the documentation, didn’t find the answer there.
Also, does this mean if I am coming from an approved app, I don’t have to do MFA?
Lastly, if this is the main MFA policy, then this configuration is not correct, right?
Link a table from MS Fabric
Is it possible to link a table stored in MS Fabric Dataflow Gen2 to an Access database? The data set is roughly 500k rows of data.
FYI I’m not trying to link Fabric FROM an Access database. Rather I would like to work with the data in MS Access.
Any help would be appreciated.
Creating a SLURM Cluster for Scheduling NVIDIA MIG-Based GPU Accelerated workloads
Today, researchers and developers often use a dedicated GPU for their workloads, even when only a fraction of the GPU’s compute power is needed. The NVIDIA A100, A30, and H100 Tensor Core GPUs introduce a revolutionary feature called Multi-Instance GPU (MIG). MIG partitions the GPU into up to seven instances, each with its own dedicated compute, memory, and bandwidth. This enables multiple users to run their workloads on the same GPU, maximizing per-GPU utilization and boosting user productivity.
In this blog, we will guide you through the process of creating a SLURM cluster and integrating NVIDIA’s Multi-Instance GPU (MIG) feature to efficiently schedule GPU-accelerated jobs. We will cover the installation and configuration of SLURM, as well as the setup of MIG on NVIDIA GPUs.
Overview:
SLURM (Simple Linux Utility for Resource Management) is an open-source job scheduler used by many of the world’s supercomputers and HPC (High-Performance Computing) clusters. It facilitates the allocation of resources such as CPUs, memory, and GPUs to users and their jobs, ensuring efficient use of available hardware. SLURM provides robust workload management capabilities, including job queuing, prioritization, scheduling, and monitoring.
MIG (Multi-Instance GPU) is a feature introduced by NVIDIA for its A100, A30, and H100 Tensor Core GPUs that allows a single physical GPU to be partitioned into multiple independent GPU instances. Each MIG instance operates with dedicated memory, cache, and compute cores, enabling multiple users or applications to share a single GPU securely and efficiently. This improves resource utilization and provides a level of flexibility and isolation that was not possible on GPUs without MIG.
Advantages of Using NVIDIA MIG (Multi-Instance GPU):
Improved Resource Utilization
– Maximizes GPU Usage: MIG allows you to run multiple smaller workloads on a single GPU, ensuring that the GPU’s resources are fully utilized. This is especially useful for applications that do not need the full capacity of a GPU.
– Cost Efficiency: By enabling multiple instances on a single GPU, organizations can achieve better cost-efficiency, reducing the need to purchase additional GPUs.
Workload Isolation
– Security and Stability: Each GPU instance is fully isolated, ensuring that workloads do not interfere with each other. This is critical for multi-tenant environments where different users or applications might run on the same physical hardware.
– Predictable Performance: Isolation ensures consistent and predictable performance for each instance, avoiding resource contention issues.
Scalability and Flexibility
– Adaptability: MIG allows dynamic partitioning of GPU resources, making it easy to scale workloads up or down based on demand. You can allocate just the right amount of resources needed for different tasks.
– Multi-Tenant Support: Ideal for cloud service providers and data centers that host services for multiple customers, each requiring different levels of GPU resources.
Simplified Management
– Administrative Control: Administrators can use NVIDIA tools to easily configure, manage, and monitor the GPU instances. This includes allocating specific memory and compute resources to each instance.
– Automated Management: Tools and software can automate the allocation and management of GPU resources, reducing the administrative overhead.
Enhanced Performance for Diverse Workloads
– Support for Various Applications: MIG supports a wide range of applications, from AI inference and training to data analytics and virtual desktops. This makes it versatile for different types of computational workloads.
– Optimized Performance: By running multiple instances optimized for specific tasks, you can achieve better overall performance compared to running all tasks on a single monolithic GPU.
Better Utilization in Shared Environments
– Educational and Research Institutions: In environments where GPUs are shared among students or researchers, MIG allows multiple users to access GPU resources simultaneously without impacting each other’s work.
– Development and Testing: Developers can use MIG to test and develop applications in an environment that simulates multi-GPU setups without requiring multiple physical GPUs.
By leveraging the power of NVIDIA’s MIG feature within a SLURM-managed cluster, you can significantly enhance the efficiency and productivity of your GPU-accelerated workloads. Join us as we delve into the steps for setting up this powerful combination and unlock the full potential of your computational resources.
Prerequisites
Scheduler:
Size: Standard D4s v5 (4 vCPUs, 16 GiB memory)
Image: Ubuntu-HPC 2204 – Gen2 (Ubuntu 22.04)
Scheduling software: Slurm 23.02.7-1
Execute VM:
Size: Standard NC40ads H100 v5 (40 vCPUs, 320 GiB memory)
Image: Ubuntu-HPC 2204 – Gen2 (Ubuntu 22.04) – Image contains Nvidia GPU driver.
It is recommended to install the latest NVIDIA GPU driver. The minimum versions are provided below:
If using H100, then CUDA 12 and NVIDIA driver R525 ( >= 525.53) or later
If using A100/A30, then CUDA 11 and NVIDIA driver R450 ( >= 450.80.02) or later
Scheduling software: Slurm 23.02.7-1
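The driver minimums listed above can be checked with a small version comparison before going any further. This is only a sketch: the helper name version_ge is made up for this example, and it assumes GNU sort with the -V (version sort) option, which Ubuntu ships by default.

```shell
# Succeeds (exit status 0) when version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option orders dotted version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: the driver version shown later in this post (535.161.08)
# against the H100 minimum (525.53).
if version_ge "535.161.08" "525.53"; then
    echo "driver OK for MIG on H100"
else
    echo "driver too old for MIG on H100"
fi

# On a real node the installed version could be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```

For A100/A30 nodes the second argument would be 450.80.02 instead.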
Slurm Scheduler setup:
Step 1: First, create users for Munge and SLURM services to manage their operations securely.
groupadd -g 11101 munge
useradd -u 11101 -g 11101 -s /bin/false -M munge
groupadd -g 11100 slurm
useradd -u 11100 -g 11100 -s /bin/false -M slurm
Step 2: Setup NFS Server on Scheduler
NFS will be used to share configuration files across the cluster.
apt install nfs-kernel-server -y
mkdir -p /sched /shared/home
echo "/sched *(rw,sync,no_root_squash)" >> /etc/exports
echo "/shared *(rw,sync,no_root_squash)" >> /etc/exports
systemctl restart nfs-server
systemctl enable nfs-server.service
showmount -e
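Because the export lines are appended with >>, running this step a second time leaves duplicate entries in /etc/exports. A minimal sketch of an idempotent variant is shown below; add_line_once is a hypothetical helper, and the demonstration writes to a scratch file rather than the real /etc/exports.

```shell
# Append a line to a file only if that exact line is not already present.
# $1 = target file, $2 = line to add.
add_line_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Demonstration against a scratch file instead of /etc/exports.
exports_file="$(mktemp)"
add_line_once "$exports_file" "/sched *(rw,sync,no_root_squash)"
add_line_once "$exports_file" "/shared *(rw,sync,no_root_squash)"
add_line_once "$exports_file" "/sched *(rw,sync,no_root_squash)"   # rerun: no duplicate added
cat "$exports_file"

# On the scheduler the target would be /etc/exports, followed by
# `exportfs -ra` to reload the export table.
```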
Step 3: Install and Configure Munge
Munge is used for authentication across the SLURM cluster.
apt install -y munge
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
cp /etc/munge/munge.key /sched/
chown munge:munge /sched/munge.key
chmod 400 /sched/munge.key
systemctl restart munge
systemctl enable munge
Step 4: Install and Configure SLURM on Scheduler
Installing Slurm Scheduler daemon and setting up the directories for slurm.
apt install slurm-slurmctld -y
mkdir -p /etc/slurm /var/spool/slurmctld /var/log/slurmctld
chown slurm:slurm /etc/slurm /var/spool/slurmctld /var/log/slurmctld
Creating the `slurm.conf` file. Alternatively, you can generate the file using the Slurm configurator tool.
cat <<EOF > /sched/slurm.conf
MpiDefault=none
ProctrackType=proctrack/cgroup
ReturnToService=2
PropagateResourceLimits=ALL
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity,task/cgroup
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
GresTypes=gpu
ClusterName=mycluster
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=debug
SlurmctldLogFile=/var/log/slurmctld/slurmctld.log
SlurmctldParameters=idle_on_node_suspend
SlurmdDebug=debug
SlurmdLogFile=/var/log/slurmd/slurmd.log
PrivateData=cloud
TreeWidth=65533
ResumeTimeout=1800
SuspendTimeout=600
SuspendTime=300
SchedulerParameters=max_switch_wait=24:00:00
Include accounting.conf
Include partitions.conf
EOF
echo "SlurmctldHost=$(hostname -s)" >> /sched/slurm.conf
Creating cgroup.conf for Slurm:
This command creates a configuration file named cgroup.conf in the /sched directory with specific settings for Slurm’s cgroup resource management.
cat <<EOF > /sched/cgroup.conf
CgroupAutomount=no
ConstrainCores=yes
ConstrainRamSpace=yes
ConstrainDevices=yes
EOF
Configuring Accounting Storage Type for Slurm:
echo "AccountingStorageType=accounting_storage/none" >> /sched/accounting.conf
Changing Ownership of Configuration Files:
chown slurm:slurm /sched/*.conf
Creating Symbolic Links for Configuration Files:
ln -s /sched/slurm.conf /etc/slurm/slurm.conf
ln -s /sched/cgroup.conf /etc/slurm/cgroup.conf
ln -s /sched/accounting.conf /etc/slurm/accounting.conf
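Note that slurm.conf includes partitions.conf, which does not exist yet at this point; it is created later on the execute VM. Before starting slurmctld it may be worth confirming that every included file is present. The sketch below uses a hypothetical helper, missing_includes, and assumes that Include lines use relative paths resolved next to the slurm.conf that is passed in.

```shell
# Print every file named by an Include directive that does not exist
# in the same directory as the given slurm.conf. $1 = path to slurm.conf.
missing_includes() {
    conf_dir="$(dirname "$1")"
    awk 'tolower($1) == "include" { print $2 }' "$1" | while read -r inc; do
        [ -e "$conf_dir/$inc" ] || echo "$inc"
    done
}

# Demonstration in a scratch directory that mimics /sched at this stage.
demo_dir="$(mktemp -d)"
printf 'ClusterName=mycluster\nInclude accounting.conf\nInclude partitions.conf\n' > "$demo_dir/slurm.conf"
touch "$demo_dir/accounting.conf"
missing_includes "$demo_dir/slurm.conf"    # prints: partitions.conf
```

On the scheduler the same check could be run as missing_includes /etc/slurm/slurm.conf; an empty result means all included files are in place.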
Configure the Execute VM
1. Check and enable the NVIDIA GPU driver and MIG mode. More details on MIG can be found in the NVIDIA MIG documentation.
Ensure the GPU driver is installed. The Ubuntu-HPC 2204 image includes the NVIDIA GPU driver; if your image does not have it, install it first. The following commands enable persistence mode and MIG mode:
root@h100vm:~# nvidia-smi -pm 1
Enabled persistence mode for GPU 00000001:00:00.0.
All done.
root@h100vm:~# nvidia-smi -mig 1
Enabled MIG Mode for GPU 00000001:00:00.0
All done.
2. Check supported profiles and create MIG partitions.
The following command lists the MIG profiles supported by the NVIDIA H100 GPU.
root@h100vm:~# nvidia-smi mig -lgip
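+-----------------------------------------------------------------------------+
| GPU instance profiles:                                                      |
| GPU   Name             ID    Instances   Memory     P2P    SM    DEC   ENC  |
|                              Free/Total   GiB              CE    JPEG  OFA  |
|=============================================================================|
|   0  MIG 1g.12gb       19     7/7        10.75      No     16     1     0   |
|                                                             1     1     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.12gb+me    20     1/1        10.75      No     16     1     0   |
|                                                             1     1     1   |
+-----------------------------------------------------------------------------+
|   0  MIG 1g.24gb       15     4/4        21.62      No     26     1     0   |
|                                                             1     1     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 2g.24gb       14     3/3        21.62      No     32     2     0   |
|                                                             2     2     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 3g.47gb        9     2/2        46.38      No     60     3     0   |
|                                                             3     3     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 4g.47gb        5     1/1        46.38      No     64     4     0   |
|                                                             4     4     0   |
+-----------------------------------------------------------------------------+
|   0  MIG 7g.94gb        0     1/1        93.12      No     132    7     0   |
|                                                             8     7     1   |
+-----------------------------------------------------------------------------+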
Create the MIG partitions using the following command. In this example, we are creating 4 MIG partitions using the 1g.24gb profile.
root@h100vm:~# nvidia-smi mig -cgi 15,15,15,15 -C
Successfully created GPU instance ID 6 on GPU 0 using profile MIG 1g.24gb (ID 15)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 6 using profile MIG 1g.24gb (ID 7)
Successfully created GPU instance ID 5 on GPU 0 using profile MIG 1g.24gb (ID 15)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 5 using profile MIG 1g.24gb (ID 7)
Successfully created GPU instance ID 3 on GPU 0 using profile MIG 1g.24gb (ID 15)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 3 using profile MIG 1g.24gb (ID 7)
Successfully created GPU instance ID 4 on GPU 0 using profile MIG 1g.24gb (ID 15)
Successfully created compute instance ID 0 on GPU 0 GPU instance ID 4 using profile MIG 1g.24gb (ID 7)
root@h100vm:~# nvidia-smi
Fri Jul  5 06:32:39 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA H100 NVL                On  | 00000001:00:00.0 Off |                   On |
| N/A   38C    P0              61W / 400W |     51MiB / 95830MiB |     N/A      Default |
|                                         |                      |              Enabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| MIG devices:                                                                          |
+------------------+--------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                   Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                     BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                |        ECC|                       |
|==================+================================+===========+=======================|
|  0    3   0   0  |              12MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               0MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
|  0    4   0   1  |              12MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               0MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
|  0    5   0   2  |              12MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               0MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
|  0    6   0   3  |              12MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               0MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
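The argument passed to -cgi when creating the partitions is simply the profile ID repeated once per instance. If the partition count is likely to change, the list can be generated instead of typed by hand. The helper below, repeat_profile, is a hypothetical name used only for this sketch.

```shell
# Build the comma-separated profile list expected by `nvidia-smi mig -cgi`.
# $1 = GPU instance profile ID, $2 = number of instances.
repeat_profile() {
    i=0
    list=""
    while [ "$i" -lt "$2" ]; do
        list="${list:+$list,}$1"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

repeat_profile 15 4     # prints: 15,15,15,15

# On the execute VM this could then be used as:
#   nvidia-smi mig -cgi "$(repeat_profile 15 4)" -C
```

The instance count should stay within the Free/Total column of the profile listing, which is 4 for the 1g.24gb profile (ID 15) on this GPU.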
3. Create Munge and SLURM users on the execute VM
groupadd -g 11101 munge
useradd -u 11101 -g 11101 -s /bin/false -M munge
groupadd -g 11100 slurm
useradd -u 11100 -g 11100 -s /bin/false -M slurm
4. Mount NFS Shares from Scheduler (Use Scheduler IP address)
mkdir /shared /sched
mount <scheduler ip>:/sched /sched
mount <scheduler ip>:/shared /shared
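Mounts created this way do not survive a reboot of the execute VM. One common way to make them persistent is to add matching entries to /etc/fstab; the sketch below only prints the lines so they can be reviewed first. The helper name nfs_fstab_lines, the address 10.0.0.4, and the mount options are assumptions for illustration and should be adapted to your environment.

```shell
# Print /etc/fstab entries for the two NFS shares exported by the scheduler.
# $1 = scheduler IP address or hostname.
nfs_fstab_lines() {
    for share in /sched /shared; do
        printf '%s:%s %s nfs defaults,_netdev 0 0\n' "$1" "$share" "$share"
    done
}

# 10.0.0.4 is a placeholder; substitute the scheduler's real address.
nfs_fstab_lines 10.0.0.4
```

After review, the output could be appended to /etc/fstab on the execute VM and checked with mount -a.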
5. Install and Configure Munge
apt install munge -y
cp /sched/munge.key /etc/munge/
chown munge:munge /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
systemctl restart munge.service
6. Install and Configure SLURM on execute VM
apt install slurm-slurmd -y
mkdir -p /etc/slurm /var/spool/slurmd /var/log/slurmd
chown slurm:slurm /etc/slurm /var/spool/slurmd /var/log/slurmd
chown slurm:slurm /etc/slurm/
ln -s /sched/slurm.conf /etc/slurm/slurm.conf
ln -s /sched/cgroup.conf /etc/slurm/cgroup.conf
ln -s /sched/accounting.conf /etc/slurm/accounting.conf
7. Create the GRES configuration for MIG. The following steps use the MIG discovery program (slurm-mig-discovery), with a single H100 system as an example.
git clone https://gitlab.com/nvidia/hpc/slurm-mig-discovery.git
cd slurm-mig-discovery
gcc -g -o mig -I/usr/local/cuda/include -I/usr/cuda/include mig.c -lnvidia-ml
./mig
8. Check the GRES config file.
root@h100vm:~/slurm-mig-discovery# cat gres.conf
# GPU 0 MIG 0 /proc/driver/nvidia/capabilities/gpu0/mig/gi3/access
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap30
# GPU 0 MIG 1 /proc/driver/nvidia/capabilities/gpu0/mig/gi4/access
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap39
# GPU 0 MIG 2 /proc/driver/nvidia/capabilities/gpu0/mig/gi5/access
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap48
# GPU 0 MIG 3 /proc/driver/nvidia/capabilities/gpu0/mig/gi6/access
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap57
9. Copy the generated configuration files to the central location.
cp gres.conf cgroup_allowed_devices_file.conf /sched/
chown slurm:slurm /sched/cgroup_allowed_devices_file.conf
chown slurm:slurm /sched/gres.conf
10. Create symlinks in the Slurm configuration directory.
ln -s /sched/cgroup_allowed_devices_file.conf /etc/slurm/cgroup_allowed_devices_file.conf
ln -s /sched/gres.conf /etc/slurm/gres.conf
11. Create the Slurm partitions file. This command creates a configuration file named `partitions.conf` in the `/sched` directory. It defines:
– A GPU partition named `gpu` on node `h100vm` with default settings.
– The node `h100vm` has 40 CPUs, 1 board, 1 socket per board, 40 cores per socket, and 1 thread per core.
– It has a real memory of 322243 MB.
– GPU resources are specified with 4 partitions using the `gpu:1g.22gb` profile.
cat << 'EOF' > /sched/partitions.conf
PartitionName=gpu Nodes=h100vm Default=YES MaxTime=INFINITE State=UP
NodeName=h100vm CPUs=40 Boards=1 SocketsPerBoard=1 CoresPerSocket=40 ThreadsPerCore=1 RealMemory=322243 Gres=gpu:1g.22gb:4
EOF
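The Gres value on the NodeName line has to agree with what gres.conf declares: the same type name and the same number of entries. A small check like the one below can derive that value from the file instead of counting by hand. It is a sketch only; gres_summary is a hypothetical helper, and the demonstration file reuses the four entries generated earlier in this post.

```shell
# Summarise a gres.conf as "name:type:count", one line per distinct type.
# $1 = path to gres.conf; comment lines are skipped.
gres_summary() {
    awk '
        /^[[:space:]]*#/ { next }
        {
            name = ""; type = ""
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == "Name") name = kv[2]
                if (kv[1] == "Type") type = kv[2]
            }
            if (name != "" && type != "") count[name ":" type]++
        }
        END { for (k in count) print k ":" count[k] }
    ' "$1"
}

# Demonstration with a scratch copy of the generated entries.
demo_gres="$(mktemp)"
cat > "$demo_gres" <<'EOF'
# GPU 0 MIG 0 /proc/driver/nvidia/capabilities/gpu0/mig/gi3/access
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap30
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap39
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap48
Name=gpu Type=1g.22gb File=/dev/nvidia-caps/nvidia-cap57
EOF
gres_summary "$demo_gres"    # prints: gpu:1g.22gb:4
```

The printed value matches the Gres=gpu:1g.22gb:4 setting used in partitions.conf. Note that the type reported by the discovery tool (1g.22gb) differs from the profile name used when creating the instances (1g.24gb), so it is the gres.conf name that should be used in Slurm.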
12. Set the ownership of partitions.conf and create a symlink in the Slurm configuration directory.
chown slurm:slurm /sched/partitions.conf
ln -s /sched/partitions.conf /etc/slurm/partitions.conf
Finalize and Start the SLURM Services
On Scheduler:
ln -s /sched/partitions.conf /etc/slurm/partitions.conf
ln -s /sched/cgroup_allowed_devices_file.conf /etc/slurm/cgroup_allowed_devices_file.conf
ln -s /sched/gres.conf /etc/slurm/gres.conf
systemctl restart slurmctld
systemctl enable slurmctld
On Execute VM:
systemctl restart slurmd
systemctl enable slurmd
Run the `sinfo` command on the scheduler VM to verify the Slurm configuration.
root@scheduler:~# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
gpu* up infinite 1 idle h100vm
Testing the job and functionality
1. To submit the job, first create a test user. In this example, we’ll create a test user named `vinil` for testing purposes. Start by creating the user on the scheduler and then on the execute VM. We have set up an NFS server to share the `/shared` directory, which will serve as the centralized home directory for the user.
# On Scheduler VM
useradd -m -d /shared/home/vinil -u 20001 vinil
# On Execute VM
useradd -d /shared/home/vinil -u 20001 vinil
On Scheduler VM:
2. I am using the CIFAR-10 training model to run tests on the 4 MIG instances we created. I will set up an Anaconda environment to run the CIFAR-10 job. This involves installing the TensorFlow GPU machine learning libraries and running 4 jobs simultaneously on a single node using Slurm to demonstrate the capabilities of MIG partitions and GPU workload scheduling on MIG partitions.
# Download and install Anaconda software.
curl -O https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
chmod +x Anaconda3-2024.06-1-Linux-x86_64.sh
sh Anaconda3-2024.06-1-Linux-x86_64.sh -b
3. Create a Conda environment named `mlprog` and install the TensorFlow GPU libraries.
# Set the PATH and create a Conda environment called mlprog.
export PATH=$PATH:/shared/home/vinil/anaconda3/bin
/shared/home/vinil/anaconda3/bin/conda init
source ~/.bashrc
/shared/home/vinil/anaconda3/bin/conda create -n mlprog tensorflow-gpu -y
4. The following code will download the `cifar10.py` script, which contains the CIFAR-10 image classification machine learning code written using TensorFlow.
#Download the CIFAR10 code.
wget https://raw.githubusercontent.com/vinil-v/slurm-mig-setup/main/test_job_setup/cifar10.py
5. Create a job submission script named `mljob.sh` to run the job on a GPU using the Slurm scheduler. This script submits a job named `MLjob` to the GPU partition (`--partition=gpu`) of the Slurm scheduler. It allocates 10 tasks (`--ntasks=10`) and requests one MIG instance as a GPU resource (`--gres=gpu:1g.22gb:1`). The script sets up the environment by adding Conda to the PATH and activating the `mlprog` Conda environment before executing the `cifar10.py` script to perform CIFAR-10 image classification using TensorFlow.
#!/bin/bash
#SBATCH --job-name=MLjob
#SBATCH --partition=gpu
#SBATCH --ntasks=10
#SBATCH --gres=gpu:1g.22gb:1
# Add the Anaconda bin directory (not the conda executable itself) to PATH.
export PATH=$PATH:/shared/home/vinil/anaconda3/bin
source /shared/home/vinil/anaconda3/bin/activate mlprog
python cifar10.py
6. Submit the job using the `sbatch` command and execute 4 instances of the job using the same `mljob.sh` script. This method will fully utilize all 4 MIG partitions available on the node. After submission, use the `squeue` command to check the status. You will observe all 4 jobs in the Running state.
(mlprog) vinil@scheduler:~$ sbatch mljob.sh
Submitted batch job 7
(mlprog) vinil@scheduler:~$ sbatch mljob.sh
Submitted batch job 8
(mlprog) vinil@scheduler:~$ sbatch mljob.sh
Submitted batch job 9
(mlprog) vinil@scheduler:~$ sbatch mljob.sh
Submitted batch job 10
(mlprog) vinil@scheduler:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
7 gpu MLjob vinil R 0:05 1 h100vm
8 gpu MLjob vinil R 0:01 1 h100vm
9 gpu MLjob vinil R 0:01 1 h100vm
10 gpu MLjob vinil R 0:01 1 h100vm
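The same check can be done programmatically. The following minimal Python sketch parses the default `squeue` table shown above and lists the jobs in the running state on a given node. The sample text is copied from that output, and the helper names `parse_squeue` and `running_jobs` are hypothetical.

```python
# Sample text copied from the squeue output above.
SQUEUE_OUTPUT = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
7 gpu MLjob vinil R 0:05 1 h100vm
8 gpu MLjob vinil R 0:01 1 h100vm
9 gpu MLjob vinil R 0:01 1 h100vm
10 gpu MLjob vinil R 0:01 1 h100vm
"""


def parse_squeue(text):
    """Turn whitespace-separated squeue output into a list of row dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Limit the split so a multi-word reason stays in the last column.
        values = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, values)))
    return rows


def running_jobs(rows, node):
    """Return the IDs of jobs in state R (running) on the given node."""
    return [
        row["JOBID"]
        for row in rows
        if row["ST"] == "R" and row["NODELIST(REASON)"] == node
    ]


jobs = running_jobs(parse_squeue(SQUEUE_OUTPUT), "h100vm")
print(f"{len(jobs)} running jobs on h100vm: {', '.join(jobs)}")
```

With four MIG instances and one `gpu:1g.22gb:1` request per job, four running jobs on `h100vm` is the expected result. A fifth submission would presumably wait in the pending state until an instance is released.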
7. Log in to the execution VM and execute the `nvidia-smi` command. You will observe that all 4 MIG GPU partitions are allocated to the jobs and are currently running.
azureuser@h100vm:~$ nvidia-smi
Fri Jul 5 07:32:50 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA H100 NVL                On  | 00000001:00:00.0 Off |                   On |
| N/A   43C    P0              90W / 400W |  83393MiB / 95830MiB |     N/A      Default |
|                                         |                      |              Enabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| MIG devices:                                                                          |
+------------------+--------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                   Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                     BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                |        ECC|                       |
|==================+================================+===========+=======================|
|  0    3   0   0  |           20846MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               2MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
|  0    4   0   1  |           20846MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               2MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
|  0    5   0   2  |           20850MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               2MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
|  0    6   0   3  |           20850MiB / 22144MiB  | 26      0 |  1   0    1    0    1 |
|                  |               2MiB / 32767MiB  |           |                       |
+------------------+--------------------------------+-----------+-----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0    3    0      11813      C   python                                    20826MiB |
|    0    4    0      11836      C   python                                    20826MiB |
|    0    5    0      11838      C   python                                    20830MiB |
|    0    6    0      11834      C   python                                    20830MiB |
+---------------------------------------------------------------------------------------+
azureuser@h100vm:~$
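To put the memory figures in perspective, the short Python sketch below computes how full each MIG instance is and sums the usage across instances. The numbers are copied by hand from the `nvidia-smi` output above, and the dictionary name `MIG_MEMORY_MIB` is hypothetical. The high per-instance usage is probably TensorFlow's default behavior of reserving most of the memory on the device it sees, rather than the real working set of the CIFAR-10 job.

```python
# (used MiB, total MiB) per GPU instance ID, copied from the nvidia-smi output.
MIG_MEMORY_MIB = {
    3: (20846, 22144),
    4: (20846, 22144),
    5: (20850, 22144),
    6: (20850, 22144),
}


def utilization(used, total):
    """Fraction of an instance's memory that is in use."""
    return used / total


for gi, (used, total) in sorted(MIG_MEMORY_MIB.items()):
    print(f"GI {gi}: {used} / {total} MiB ({utilization(used, total):.1%})")

# Sum of per-instance usage, for comparison with the GPU-level Memory-Usage.
total_used = sum(used for used, _ in MIG_MEMORY_MIB.values())
print(f"Total used across MIG instances: {total_used} MiB")
```

The per-instance sum is 83392 MiB, within 1 MiB of the 83393MiB reported for the whole GPU in the header of the output.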
Conclusion:
You have now successfully set up a SLURM cluster with NVIDIA MIG integration. This setup allows you to efficiently schedule and manage GPU jobs, ensuring optimal utilization of resources. With SLURM and MIG, you can achieve high performance and scalability for your computational tasks. Happy computing!
Microsoft Tech Community – Latest Blogs
Two OneDrive folders showing with Explorer
Explorer shows a OneDrive – Personal folder, which I assume is the one I should use, and also a OneDrive folder under C:. Can I delete the C: folder after saving any unique files from it?
Unable to open Embedded excel document in excel
Hi,
I have an Excel document from work that contains embedded Word, Excel, and PDF documents. I cannot open one of the embedded Excel documents or one of the embedded PDF documents, although other people can. I get the same message when trying to open either of them: "Cannot start the source application for this object."
Any ideas?
Thanks
Older versions of Teams are still appearing in the registry for other user profiles and are being fl
Hello,
I wanted to update you on the issues we are facing after cleaning up Classic Teams. Older versions of Teams are still appearing in the registry for other user profiles and are being flagged as vulnerable in 365 Defender, specifically under the HKEY_USERS registry path for other users.
For example, as evidence from the Defender portal, here are some entries indicating software issues:
– Endpoint Name: TestPC
– Computer\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Uninstall\Teams
– HKEY_USERS\user1\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Teams
– HKEY_USERS\user2\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Teams
– HKEY_USERS\user3\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Teams
We attempted to remove the registry entries from other user profiles to clean up the Classic Teams presence by using the following commands:
PowerShell:
reg load "hku\$user" "C:\Users\$user\NTUSER.DAT"
Test-Path -Path Registry::HKEY_USERS\$hiveName\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Teams
For checking the registry presence, we used the detection and remediation method in Intune for cleaning Classic Teams. I ran the detection script on only three PCs for testing.
Surprisingly, we received a warning from Sentinel about “User and group membership reconnaissance (SAMR) on one endpoint,” indicating a potential security incident involving suspicious SAMR (Security Account Manager Remote) queries. This was detected for admin accounts, DC, and also for an account belonging to someone who left the organization five years ago (ABC Admin).
I would appreciate your guidance on best practices for detecting and removing Classic Teams leftovers in the registry for other user profiles.
Best Practice:
– How can we detect and remove Classic Teams registry entries for other user profiles on the system?
– Is loading each user's registry hive (NTUSER.DAT) and removing the Classic Teams entries from it the best method?
Reference Links:
– [Older versions of Teams showing in user profiles](https://answers.microsoft.com/en-us/msteams/forum/all/older-versions-of-teams-showing-in-user-profiles/2bc7563c-ccc9-4afc-b522-337acff9d20e?page=1)
– [Remove old user profiles on Microsoft Teams (Reddit)](https://www.reddit.com/r/PowerShell/comments/1bvjner/remove_old_user_profiles_on_microsoft_teams/)