Category: Microsoft
Is it safe to end the search app running in background?
I noticed something odd yesterday. The “search” app was running in the background even after I finished my virus scan. I ended the task, but it popped back up in the task manager almost immediately. Any insights would be appreciated! Thank you!
SSL Certificate Renewal Process for Exchange 2019 Hybrid Environment with Edge Servers
Hello Community,
We are managing an Exchange Hybrid environment with the following on-premises setup:
Two Exchange 2019 Mailbox Servers
Two Exchange 2019 Edge Servers
Recently, we renewed our third-party SSL certificates, which include SANs for mail.xyz.com and autodiscover.xyz.com. I imported the renewed certificate on all four servers, but encountered these warnings:
Edge Servers need to be resubscribed after the SSL certificate renewal.
The same SSL certificate should not be used on both Hub Transport Servers and Edge Servers.
I’ve reviewed various Microsoft resources, but I’m still seeking clear guidance on the best practices and specific requirements for SSL certificate renewal in an Exchange 2019 environment, particularly when Edge Servers are involved.
Could anyone provide detailed advice or clarify the correct process for:
Handling SSL certificates between Mailbox and Edge Servers
Resubscribing Edge Servers following certificate renewal
Best practices for SSL management in a hybrid setup
Any expert insights or pointers to relevant documentation would be highly valuable!
Thank you in advance for your support!
History export
Hi,
I’m trying to create a report in Power BI or Excel with the history of tasks for the entire project. We have options to track the history of each task, but I would like to see the history as one log for the complete project. I have been unable to do it by myself. However, maybe someone has already created such a Power BI report or a flow to upload it to an Excel file?
Best regards,
Bartosz
PowerPoint File Size Doubles After Saving on macOS Sonoma
Hello,
I recently upgraded my Mac to macOS Sonoma, and I’m experiencing an issue with Microsoft PowerPoint. When I open a PowerPoint file of approximately 60MB in size and then save it again, the file size increases to about 100MB without any changes made to the content. This issue didn’t occur prior to the macOS upgrade.
Here are some additional details:
macOS version: 14.6.1 (23G93) on an Apple M2 Pro
PowerPoint version: Microsoft® PowerPoint for Mac, Version 16.88.1 (Microsoft 365 subscription)
File format: .pptx
The file contains images and media, but no changes were made to them before saving.
Could you please assist me in resolving this issue?
Any guidance or troubleshooting steps would be appreciated.
Any way to turn off Windows 11 upgrade prompt?
Is it possible to turn it off? I don’t want to upgrade, even though this PC meets all the system requirements.
Unresponsive Modal Windows in SSMS Database Diagram Floating Windows
Description:
I am encountering a persistent issue in SQL Server Management Studio (SSMS) where modal windows, such as the “Save” dialog, become unresponsive when using floating windows within database diagrams. This occurs when I switch focus from the floating window to the main SSMS window.
Steps to Reproduce:
Open SSMS and create a new database diagram.
Create a floating window within the diagram.
Trigger a modal window (e.g., by attempting to save the diagram).
Shift focus to the main SSMS window.
Expected Behavior:
The modal window should remain responsive and allow me to interact with it.
Actual Behavior:
The modal window becomes unresponsive, preventing me from proceeding.
Workarounds Attempted:
I have tried the following workarounds without success:
Minimizing the main SSMS window
Moving the floating window
Switching to a different window and back
Attempting to use full-screen mode
Additional Information:
SSMS Version: SQL Server Management Studio 20.2.30.0
Operating System: Windows 11 (the same problem occurred on Windows 10)
I request that Microsoft investigate and address this issue to improve the usability of SSMS for database diagram design.
Resource calendar hour mistake
Hi!
In Project for the web, I created a calendar template from a calendar resource and assigned it 08:00–17:00 (GMT+1) working hours.
Then I assigned this calendar to each resource.
I have some plans in Project with these resources, but when I assign them a task, the hours appear to be random. Why?
This causes mistakes in the daily hours count:
Start date: 5/7
End date: 5/7
The total should be 1 day (8 hours) instead of 0.75 days.
What am I missing?
Thanks!
Detect client platform
Is there a way to detect the client type connected with the Visual Script? I want to show a simple tutorial on teleporting and interacting with objects, but it differs for PC and Quest; therefore, I want to display a different image for PC and a different one for Quest.
Project for the web – view only user
Hi,
is it possible in Project for the web to assign a role to each user in the plan?
Some users should not modify data, only view it.
Currently, everyone can add tasks, modify dates, etc., which is a huge issue!
Only the PM should be able to do that.
Thanks
User Profile Service Application doesn’t function correctly in SharePoint Server 2019
Hi, I am new to SharePoint Server. I want to configure profile synchronization using SharePoint Active Directory import, but the sync status of the User Profile Service Application is always Idle, even when I manually run a full sync, and I get Critical event ID 6398 in my event logs. I also checked for failing timer jobs: the “User Profile service application – feed cache full repopulation” job is always failing with the error message: “Unexpected exception in FeedCacheService.IsRepopulationNeeded: Unable to create a Data Cache. SPDistributedCache is probably down.”
Is there any advice?
Thanks,
MS Project – Different links between operations in print and processing view
Dear Community,
I have a problem with the links of my processes. In order to be able to compare the former dates (target dates) with the actual dates (actual) or the postponed dates, I have carried out a target/actual comparison via the GANTT diagram monitoring view. However, something does not seem to work with the display of the link lines between the tasks. The following problem description:
In my Gantt chart view (image 1), the links are displayed correctly by the system, just as I would like them to be. The actual dates (on the right-hand side) are linked with connecting lines. However, if I then want to print the plan, the links refer to the target dates (on the left) of the activities and no longer to the postponed or actual dates (on the right).
I would also like to have the views from image 1 as a print view so that I can send it to the customer as a PDF.
What could be the problem?
My thoughts go in the direction of the different stored dates (Fig. 3) or bar types (see Fig. 4).
However, I have tried everything and have not found a solution to the problem. Can anyone help me?
Figure 1:
Figure 2:
Figure 3:
Figure 4:
Last executed queries on SQL Server
I have tried this code snippet to get last n queries from SQL SERVER:
SELECT deqs.last_execution_time AS [Time],
dest.TEXT AS [Query]
FROM
sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
ORDER BY
deqs.last_execution_time DESC
But as per the documentation, my understanding is that sys.dm_exec_query_stats returns aggregated results, so if a query ran three times (simultaneously, or perhaps a minute apart), it will show as one query.
So, is there a way I can see precisely how many times a query has been executed on the server?
Maybe my understanding of sys.dm_exec_query_stats is not quite right, but please help: is this query correct, or is there another way to get the required result?
How to screen record on Windows 10 laptop with audio?
Hi all,
I know this is quite an old question, but I really need some help here. I am running a workshop and need to record the screen from time to time to make demonstrations for clients. The computer is pretty old and not eligible to run Windows 11 due to its out-of-date CPU.
Is there any simple way to screen record on a Windows 10 laptop with audio? I know how to take a screenshot with the built-in shortcut, but I am unable to find a way to record the screen on Windows 10.
Please advise!
Freezing and Lagging in Windows 365
I’ve been using Windows 365 for a while now, and it has worked perfectly well, but for the last few weeks it keeps freezing and lagging during simple tasks: web browsing, copy and paste, and even tabbing through fields in an Excel spreadsheet.
I have tried logging this with the Windows 365 support team, but they have been unable to help, except to suggest that I increase the RAM (which I did, and it did not work), so now I am paying more for a system that still does not work.
Has anybody else had issues with lagging and freezing, and were they able to overcome them?
Thanks
Train Vision Transformer model and run Inference
You’re welcome to follow my GitHub repo and give it a star: https://github.com/xinyuwei-david/david-share.git. Lots of useful code is there!
As of now, CV models are primarily based on convolutional neural networks. However, with the rise of Transformers, Vision Transformers are gradually being applied.
Next, let’s look at mainstream CV implementations and their characteristics.
CV Architecture
U-Net
Features: Encoder-decoder structure, skip connections.
Network Type: Convolutional Neural Network (CNN).
Applications: Image segmentation, medical image processing.
Advantages: Efficient in segmentation tasks, preserves details.
Disadvantages: Limited scalability for large datasets.
Usage: Widely used in medical image segmentation.
Main Models: Original U-Net, 3D U-Net, Stable Diffusion.
R-CNN
Features: Selective search for generating candidate regions.
Network Type: CNN-based.
Applications: Object detection.
Advantages: High detection accuracy.
Disadvantages: High computational complexity, slow speed.
Usage: Replaced by faster models like Faster R-CNN.
Main Models: Fast R-CNN, Faster R-CNN.
GAN
Features: Adversarial training between generator and discriminator.
Network Type: Framework, usually using CNN.
Applications: Image generation, style transfer.
Advantages: Generates high-quality images.
Disadvantages: Unstable training, prone to mode collapse.
Usage: Widely used in generation tasks.
Main Models: DCGAN, StyleGAN.
RNN/LSTM
Features: Handles sequential data, remembers long-term dependencies.
Network Type: Recurrent Neural Network.
Applications: Time series prediction, video analysis.
Advantages: Suitable for sequential data.
Disadvantages: Difficult to train, gradient vanishing.
Usage: Commonly used in sequence tasks.
Main Models: LSTM, GRU.
GNN
Features: Processes graph-structured data.
Network Type: Graph Neural Network.
Applications: Social network analysis, chemical molecule modeling.
Advantages: Captures graph structure information.
Disadvantages: Limited scalability for large graphs.
Usage: Used in graph data tasks.
Main Models: GCN, GraphSAGE.
Capsule Networks
Features: Capsule structure, captures spatial hierarchies.
Network Type: CNN-based.
Applications: Image recognition.
Advantages: Captures pose variations.
Disadvantages: High computational complexity.
Usage: Research stage, not widely applied.
Main Models: Dynamic Routing.
Autoencoder
Features: Encoder-decoder structure.
Network Type: Can be CNN-based.
Applications: Dimensionality reduction, feature learning.
Advantages: Unsupervised learning.
Disadvantages: Limited generation quality.
Usage: Used for feature extraction and dimensionality reduction.
Main Models: Variational Autoencoder (VAE).
Vision Transformer (ViT)
Features: Based on self-attention mechanism, processes image patches.
Network Type: Transformer.
Applications: Image classification.
Advantages: Captures global information.
Disadvantages: Requires large amounts of data for training.
Usage: Gaining popularity, especially on large datasets.
Main Models: Original ViT, DeiT.
ViT and U-Net
According to the paper: “Understanding the Efficacy of U-Net & Vision Transformer for Groundwater Numerical Modelling,” U-Net is generally more efficient than ViT, especially in sparse data scenarios. U-Net’s architecture is simpler with fewer parameters, making it more efficient in terms of computational resources and time. While ViT has advantages in capturing global information, its self-attention mechanism has high computational complexity, particularly when handling large-scale data.
In the experiments of the paper, models combining U-Net and ViT outperformed the Fourier Neural Operator (FNO) in both accuracy and efficiency, especially in sparse data conditions.
In image processing, sparse data typically refers to incomplete or unevenly distributed information in images. For example:
Low-resolution images: Fewer pixels, missing details.
Occlusion or missing data: Parts of the image are blocked or data is missing.
Uneven sampling: Lower pixel density in certain areas.
In these cases, models need to infer the complete image content from limited pixel information.
After the emergence of Vision Transformers, new branches and variations have appeared:
DeiT (Data-efficient Image Transformers) by Facebook AI: DeiT models are refined ViT models. The authors also released more training-efficient ViT models, which can be directly integrated into ViTModel or ViTForImageClassification. Four variants are available (in three different sizes): facebook/deit-tiny-patch16-224, facebook/deit-small-patch16-224, facebook/deit-base-patch16-224, and facebook/deit-base-patch16-384. Note that images should be prepared using DeiTImageProcessor.
BEiT (BERT pre-training of Image Transformers) by Microsoft Research: BEiT models use a self-supervised method inspired by BERT (masked image modeling) and based on VQ-VAE, outperforming vision transformers with supervised pre-training.
DINO (a self-supervised training method for Vision Transformers) by Facebook AI: Vision Transformers trained with the DINO method exhibit interesting properties not found in convolutional models. They can segment objects without being explicitly trained for it. DINO checkpoints can be found on the hub.
MAE (Masked Autoencoder) by Facebook AI: By pre-training Vision Transformers to reconstruct the pixel values of a large portion (75%) of masked patches (using an asymmetric encoder-decoder architecture), the authors demonstrate that this simple method outperforms supervised pre-training after fine-tuning.
The following diagram describes the workflow of Vision Transformer (ViT):
Image Patching: The input image is divided into small, fixed-size patches.
Linear Projection: Each image patch is flattened and transformed into a vector through linear projection.
Position Embedding: Position embeddings are added to each image patch to retain positional information.
CLS Token: A learnable CLS token is added at the beginning of the sequence for classification tasks.
Transformer Encoder: These embedded vectors (including the CLS token) are fed into the Transformer encoder for multi-layer processing. Each layer includes a multi-head attention mechanism and a feedforward neural network.
MLP Head: After processing by the encoder, the output of the CLS token is passed to a multi-layer perceptron (MLP) head for the final classification decision.
This entire process demonstrates how the Transformer architecture can directly handle sequences of image patches to perform image classification tasks.
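As a rough illustration of the tokenization arithmetic described above, here is a minimal sketch (plain Python, no deep-learning library required; the 224×224 image and 16×16 patch sizes are the common ViT-Base defaults, assumed for illustration):

```python
# Sketch of ViT tokenization arithmetic for a square RGB image.
# image_size=224 and patch_size=16 are the usual ViT-Base defaults;
# any pair where image_size divides evenly by patch_size works.
def vit_token_shapes(image_size=224, patch_size=16, channels=3):
    assert image_size % patch_size == 0, "image must divide evenly into patches"
    num_patches = (image_size // patch_size) ** 2   # patches per image
    patch_dim = channels * patch_size * patch_size  # flattened patch length
    seq_len = num_patches + 1                       # +1 for the CLS token
    return num_patches, patch_dim, seq_len

print(vit_token_shapes())        # (196, 768, 197)
print(vit_token_shapes(32, 4))   # (64, 48, 65) -- the CIFAR-10 config used later
```

The same arithmetic explains the `num_patches + 1` position embedding size in the training code that follows.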
Training ViT
Pure ViT is mainly used for image classification.
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from einops import rearrange, repeat
from einops.layers.torch import Rearrange

class Attention(nn.Module):
    def __init__(self, dim, heads=8, dim_head=64, dropout=0.):
        super().__init__()
        inner_dim = dim_head * heads
        project_out = not (heads == 1 and dim_head == dim)
        self.heads = heads
        self.scale = dim_head ** -0.5
        self.norm = nn.LayerNorm(dim)
        self.attend = nn.Softmax(dim=-1)
        self.dropout = nn.Dropout(dropout)
        self.to_qkv = nn.Linear(dim, inner_dim * 3, bias=False)
        self.to_out = nn.Sequential(
            nn.Linear(inner_dim, dim),
            nn.Dropout(dropout)
        ) if project_out else nn.Identity()

    def forward(self, x):
        x = self.norm(x)
        qkv = self.to_qkv(x).chunk(3, dim=-1)
        q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> b h n d', h=self.heads), qkv)
        dots = torch.matmul(q, k.transpose(-1, -2)) * self.scale
        attn = self.attend(dots)
        attn = self.dropout(attn)
        out = torch.matmul(attn, v)
        out = rearrange(out, 'b h n d -> b n (h d)')
        return self.to_out(out)

# Feed Forward Network (FFN)
class FFN(nn.Module):
    def __init__(self, dim, hidden_dim, dropout=0.):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, dim),
            nn.Dropout(dropout)
        )

    def forward(self, x):
        return self.net(x)

# Transformer encoder
class Transformer(nn.Module):
    def __init__(self, dim, depth, heads, dim_head, mlp_dim_ratio, dropout):
        super().__init__()
        self.layers = nn.ModuleList([])
        mlp_dim = mlp_dim_ratio * dim
        for _ in range(depth):
            self.layers.append(nn.ModuleList([
                Attention(dim=dim, heads=heads, dim_head=dim_head, dropout=dropout),
                FFN(dim=dim, hidden_dim=mlp_dim, dropout=dropout)
            ]))

    def forward(self, x):
        for attn, ffn in self.layers:
            x = attn(x) + x
            x = ffn(x) + x
        return x

# Helper function
def pair(t):
    return t if isinstance(t, tuple) else (t, t)

# Vision Transformer (ViT)
class ViT(nn.Module):
    def __init__(self, *, image_size, patch_size, num_classes, dim, depth, heads, mlp_dim_ratio, pool='cls', channels=3, dim_head=64, dropout=0.):
        super().__init__()
        image_height, image_width = pair(image_size)
        patch_height, patch_width = pair(patch_size)
        assert image_height % patch_height == 0 and image_width % patch_width == 0, 'Image dimensions must be divisible by the patch size.'
        num_patches = (image_height // patch_height) * (image_width // patch_width)
        patch_dim = channels * patch_height * patch_width
        self.to_patch_embedding = nn.Sequential(
            Rearrange('b c (h p1) (w p2) -> b (h w) (p1 p2 c)', p1=patch_height, p2=patch_width),
            nn.LayerNorm(patch_dim),
            nn.Linear(patch_dim, dim),
            nn.LayerNorm(dim)
        )
        self.cls_token = nn.Parameter(torch.randn(1, 1, dim))
        self.pos_embedding = nn.Parameter(torch.randn(1, num_patches + 1, dim))
        self.dropout = nn.Dropout(dropout)
        self.transformer = Transformer(dim, depth, heads, dim_head, mlp_dim_ratio, dropout)
        self.pool = pool
        self.to_latent = nn.Identity()
        self.mlp_head = nn.Linear(dim, num_classes)

    def forward(self, img):
        x = self.to_patch_embedding(img)
        b, n, _ = x.shape
        cls_tokens = repeat(self.cls_token, '1 1 d -> b 1 d', b=b)
        x = torch.cat((cls_tokens, x), dim=1)
        x += self.pos_embedding[:, :(n + 1)]
        x = self.dropout(x)
        x = self.transformer(x)
        cls_token = x[:, 0]
        feature_map = x[:, 1:]
        pooled_output = cls_token if self.pool == 'cls' else feature_map.mean(dim=1)
        pooled_output = self.to_latent(pooled_output)
        classification_result = self.mlp_head(pooled_output)
        return classification_result

# Data preprocessing
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the CIFAR-10 training set
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

# Initialize the ViT model
model = ViT(
    image_size=32,
    patch_size=4,
    num_classes=10,
    dim=128,
    depth=6,
    heads=8,
    mlp_dim_ratio=4,
    dropout=0.1
)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=3e-4)

# Train the model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
for epoch in range(10):  # train for 10 epochs
    model.train()
    total_loss = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f'Epoch {epoch+1}, Loss: {total_loss/len(train_loader)}')

# Save the entire model
torch.save(model, 'vit_complete_model.pth')
print("Training complete; model saved!")
Training result:
Files already downloaded and verified
Epoch 1, Loss: 1.5606277365513774
Epoch 2, Loss: 1.2305729564498453
Epoch 3, Loss: 1.0941925532067829
Epoch 4, Loss: 1.0005672584714183
Epoch 5, Loss: 0.9230595080139082
Epoch 6, Loss: 0.8589703797379418
Epoch 7, Loss: 0.7988450761188937
Epoch 8, Loss: 0.7343863746546724
Epoch 9, Loss: 0.6837297593388716
Epoch 10, Loss: 0.6306750321632151
Training complete; model saved!
Inference test:
import torch
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Data preprocessing
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the CIFAR-10 test set
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Load the entire saved model (the ViT class definition must be importable)
model = torch.load('vit_complete_model.pth')
model.eval()

# Device setup
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

# Run inference
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        # Show predictions and images for the first 5 samples
        for i in range(5):
            image = images[i].cpu().numpy().transpose((1, 2, 0))
            image = (image * 0.5) + 0.5  # un-normalize
            plt.imshow(image)
            plt.title(f'Predicted: {test_dataset.classes[predicted[i]]}, Actual: {test_dataset.classes[labels[i]]}')
            plt.show()
        break  # only show one batch
Inference result:
Florence-2
Microsoft’s Florence-2 uses a Transformer-based architecture, specifically adopting DeiT (Data-efficient Image Transformer) as its visual encoder. DeiT’s architecture is the same as ViT’s, with the addition of a distillation token among the input tokens. Distillation is a method to improve training performance, especially since ViT performs poorly with insufficient data.
Incidentally, Phi-3 Vision is also based on ViT (ViT-L).
Florence-2’s model architecture employs a sequence-to-sequence learning approach. This means the model processes input sequences (such as images with text prompts) progressively and generates output sequences (such as descriptions or labels). In the sequence-to-sequence framework, each task is treated as a translation problem: the model receives an input image and a specific task prompt, then generates the corresponding output.
Please refer to my repo for more info about Florence-2:
Qianwen-VL
Qwen2-VL adopts an encoder-decoder architecture, combining Vision Transformer (ViT) with the Qwen2 language model. This architecture enables Qwen2-VL to handle image and video inputs and support multimodal tasks.
Qwen2-VL also utilizes a new Multimodal Rotary Position Embedding (M-ROPE). Position embeddings are decomposed to capture one-dimensional text, two-dimensional visual, and three-dimensional video positional information, enhancing the model’s ability to process multimodal data.
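The exact M-ROPE decomposition is described in the Qwen2-VL report; as background, here is a minimal sketch of the 1-D rotary position embedding (RoPE) that M-ROPE extends across text (1-D), image (2-D), and video (3-D) axes. This is an illustrative toy in plain Python, not Qwen2-VL's implementation:

```python
import math

# Minimal 1-D rotary position embedding (RoPE) sketch: consecutive pairs
# of vector components are rotated by a position-dependent angle, so the
# dot product between two rotated vectors depends only on their relative
# position. Illustrative only; real implementations work on tensors.
def rope_rotate(vec, pos, base=10000.0):
    assert len(vec) % 2 == 0, "vector length must be even"
    out = []
    for i in range(0, len(vec), 2):
        theta = pos / (base ** (i / len(vec)))  # per-pair frequency
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2-D rotation
    return out

# Position 0 leaves the vector unchanged; other positions rotate it
# without changing its length.
print(rope_rotate([1.0, 0.0, 1.0, 0.0], pos=0))  # [1.0, 0.0, 1.0, 0.0]
```

M-ROPE applies this kind of rotation along multiple coordinate axes at once, so a video token's position encodes its temporal, height, and width indices together.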
Training of Qwen2-VL
Pre-training Phase:
Objective: The main goal is to optimize the visual encoder and adapter, while the language model (LLM) remains frozen.
Dataset: A large, curated image-text pair dataset is used, crucial for the model to understand the relationship between visuals and text.
Optimization Goal: Improve the model’s text generation ability by minimizing the cross-entropy of text tokens, enabling more accurate text descriptions given an image.
Multitask Pre-training Phase:
Training the Entire Model: In this phase, the entire model, including the LLM, is trained.
Task Types: The model is trained on various vision-language tasks, such as image captioning and visual question answering.
Data Quality: High-quality, fine-grained data is used to provide richer visual and language information.
Input Resolution: Increasing the input resolution of the visual encoder to reduce information loss, helping the model capture image details better.
Instruction Fine-tuning Phase:
Objective: Enhance the model’s conversational and instruction-following capabilities.
Freezing the Visual Encoder: The visual encoder remains frozen, focusing on optimizing the LLM and adapter.
Data Type: A mix of multimodal and pure text dialogue data is used for optimization, aiding the model in better understanding and generating natural language when processing multimodal inputs.
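The three phases above can be summarized compactly — which modules train, which stay frozen, and what data drives each stage. This table is an illustrative restatement of the description above, not an official Qwen2-VL configuration:

```python
# Compact restatement of the three-phase training schedule described above
# (illustrative summary, not an official Qwen2-VL config).
PHASES = {
    "pretrain": {
        "trainable": {"visual_encoder", "adapter"},
        "data": "large curated image-text pairs",
    },
    "multitask_pretrain": {
        "trainable": {"visual_encoder", "adapter", "llm"},
        "data": "captioning, VQA, and other vision-language tasks",
    },
    "instruction_finetune": {
        "trainable": {"adapter", "llm"},
        "data": "multimodal + pure-text dialogue",
    },
}

def frozen_in(phase, all_modules=("visual_encoder", "adapter", "llm")):
    # Any module not listed as trainable is frozen in that phase.
    return {m for m in all_modules if m not in PHASES[phase]["trainable"]}
```

So the LLM is frozen only in phase 1, everything trains in phase 2, and the visual encoder is frozen again in phase 3 while the LLM and adapter are tuned on dialogue data.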
Qianwen-VL-Inference
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "/root/image0.jpg",
            },
            {"type": "text", "text": "How many dogs do you see? What are they doing? Reply in Chinese."},
        ],
    }
]
['在这张图片中,我看到两只狗。左边的狗看起来像是柯基犬,而右边的狗看起来像是约克夏梗犬。它们似乎在户外的环境中奔跑,可能是散步或玩耍。']
The corresponding English translation is:
['In this picture, I see two dogs. The dog on the left looks like a Corgi, while the dog on the right appears to be a Yorkshire Terrier. They seem to be running outdoors, possibly taking a walk or playing.']
The model also supports video analysis by splitting the video into frames; it does not analyse audio.
import torch
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_name = "Qwen/Qwen2-VL-2B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_name)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "video": "/root/cars.mp4",
                "max_pixels": 360 * 420,
                "fps": 1.0,  # make sure fps is passed through correctly
                "video_fps": 1.0,  # also pass video_fps
            },
            {"type": "text", "text": "Describe this video in Chinese."},
        ],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
inputs = inputs.to("cuda")

generated_ids = model.generate(**inputs, max_new_tokens=256)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
['视频中展示了一条繁忙的街道,车辆密集,交通堵塞。街道两旁是高楼大厦,天空阴沉,可能是傍晚或清晨。']
The corresponding English translation is:
['The video shows a busy street with heavy traffic and congestion. Tall buildings line both sides of the street, and the sky is overcast, suggesting it might be either dusk or dawn.']
Reference
https://huggingface.co/docs/transformers/model_doc/vit
https://arxiv.org/pdf/2307.04010
https://newsletter.kaitchup.com/p/run-qwen2-vl-on-your-computer-with
Microsoft Tech Community – Latest Blogs
Configure RPC packet level privacy setting for incoming connections not show in group policy
Hi,
it’s about the group policy setting in windows server 2022,
I downloaded Windows Server 2022 Security Baseline.zip from the Microsoft Security Compliance Toolkit 1.0 page (https://www.microsoft.com/en-us/download/details.aspx?id=55319)
and imported SecGuide.adml and SecGuide.admx.
I can see the MS Security Guide folder in the Local Group Policy Editor, but there is no setting named “Configure RPC packet level privacy setting for incoming connections”.
I also tried looking under All Settings, and the setting is not there either.
Could you advise how to configure this so that it shows up in the Group Policy Editor / GPO?
Thanks.
How Can I Downgrade iOS 18 to 17 from iPhone on a Windows 10 PC?
Hi all,
I have an iPhone 15 Pro Max and recently upgraded to iOS 18. My intention was to try out the new iOS 18 features, but I ended up with nothing, and my iPhone has been running a bit slowly since upgrading to iOS 18.0. I heard it is possible to downgrade iOS using a Windows PC.
Please let me know how to downgrade iOS 18 to 17 on a Windows 10 computer if you know how to do this. Much appreciated for your help!
Microsoft Word 2016 – Nested Field Code problem
Hi,
I have a problem with field codes in Word 2016. The field code shows that for page 10 the value should be “test”, but when I update the field I see the value “REVISED PAGES” on page 10. What is wrong? How is this possible?
Regards
Tom