A better Phi Family is coming – multi-language support, better vision, intelligent MoEs
Since the release of Phi-3 at Microsoft Build 2024, it has received widespread attention, especially for the use of Phi-3-mini and Phi-3-vision on edge devices. In the June update, we improved benchmark results and System role support by adjusting the high-quality training data. In the August update, based on community and customer feedback, we brought multi-language support to Phi-3.5-mini-128k-instruct, multi-frame image input to Phi-3.5-vision-128k, and added the new Phi-3.5 MOE for AI Agent scenarios. Next, let's take a look.
Multi-language support
In previous versions, Phi-3-mini had good English corpus support but weak support for non-English languages. When we tried to ask questions in Chinese, we often got incorrect answers, such as:
But in the new version, with the added Chinese corpus support, the model understands Chinese questions and answers them much more reliably.
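A quick way to try this is to ask a question in Chinese directly, using the System role support mentioned above. The snippet below is a minimal sketch, assuming the Hugging Face transformers chat pipeline and the microsoft/Phi-3.5-mini-instruct checkpoint; the prompt and generation settings are only illustrative.

from transformers import pipeline

# Minimal sketch (assumed setup): chat with Phi-3.5-mini in Chinese
pipe = pipeline(
    "text-generation",
    model="microsoft/Phi-3.5-mini-instruct",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "你能用中文介绍一下自己吗?"},  # "Can you introduce yourself in Chinese?"
]

result = pipe(messages, max_new_tokens=256, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])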
Better vision
from PIL import Image

images = []
placeholder = ""
# Load the 21 keyframes extracted from the video and build the multi-image prompt placeholder
for i in range(1, 22):
    images.append(Image.open(f"../output/keyframe_{i}.jpg"))
    placeholder += f"<|image_{i}|>\n"
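To feed these frames to Phi-3.5-vision, the placeholder string and the image list are passed to the processor together. The following is a minimal sketch, assuming the Hugging Face transformers AutoProcessor / AutoModelForCausalLM API and the microsoft/Phi-3.5-vision-instruct checkpoint; the question text and generation parameters are illustrative.

from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="cuda", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True, num_crops=4)

# One <|image_i|> tag per keyframe, followed by the question about the whole clip
messages = [{"role": "user", "content": placeholder + "Summarize what happens in this video."}]
prompt = processor.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = processor(prompt, images, return_tensors="pt").to("cuda")
generate_ids = model.generate(**inputs, max_new_tokens=500)
# Strip the prompt tokens before decoding the answer
generate_ids = generate_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(generate_ids, skip_special_tokens=True)[0])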
Intelligent MoEs
Mixture-of-experts (MoE) models combine multiple expert sub-networks with a router, and compared with dense models they have the following characteristics:
Faster pre-training than dense models of comparable quality
Faster inference than a dense model with the same total number of parameters
Higher GPU memory requirements, because all experts must be kept in memory even though only a few are activated per token (see the toy sketch below)
Fine-tuning is challenging, but recent research shows that instruction tuning of mixture-of-experts models has great potential.
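The memory point is easier to see in code. The following is only a toy sketch of a sparse MoE layer, not the actual Phi-3.5 MOE implementation: every expert's weights stay resident, while the router runs just the top-k experts for each token.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseMoELayer(nn.Module):
    """Toy sparse MoE layer: all experts stay in memory,
    but each token is processed by only the top-k experts."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=16, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])
        self.router = nn.Linear(d_model, num_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, d_model)
        # The router scores every expert, but only the top-k are actually run per token
        probs = F.softmax(self.router(x), dim=-1)
        weights, expert_idx = probs.topk(self.top_k, dim=-1)  # (num_tokens, top_k)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = expert_idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

Phi-3.5 MOE applies this routing idea at a much larger scale: VRAM cost follows the total parameter count, while compute per token follows only the activated experts. The next example uses Phi-3.5 MOE as the reasoning engine for a multi-tool agent; the system prompt below describes the available tools and the JSON plan format the model must produce.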
sys_msg = """You are a helpful AI assistant, you are an agent capable of using a variety of tools to answer a question. Here are a few of the tools available to you:

- Blog: This tool helps you describe a certain knowledge point and content, and finally write it into Twitter or Facebook style content
- Translate: This is a tool that helps you translate into any language, using plain language as required
- Final Answer: the final answer tool must be used to respond to the user. You must use this when you have decided on an answer.

To use these tools you must always respond in JSON format containing `"tool_name"` and `"input"` key-value pairs. For example, to answer the question "Build Multi Agents with MOE models" you must use the Blog tool like so:

{
    "tool_name": "Blog",
    "input": "Build Multi Agents with MOE models"
}

Or to translate the question "can you introduce yourself in Chinese" you must respond:

{
    "tool_name": "Translate",
    "input": "can you introduce yourself in Chinese"
}

Remember, just output the final result in JSON format containing `"agentid"`, `"tool_name"`, `"input"` and `"output"` key-value pairs:

[
    {
        "agentid": "step1",
        "tool_name": "Blog",
        "input": "Build Multi Agents with MOE models",
        "output": "………"
    },
    {
        "agentid": "step2",
        "tool_name": "Translate",
        "input": "can you introduce yourself in Chinese",
        "output": "………"
    },
    {
        "agentid": "final",
        "tool_name": "Result",
        "output": "………"
    }
]

The user's question is as follows.
"""
We can see that by telling the model which skills it has and how the tasks should be arranged, Phi-3.5 MOE can split a request into steps, assign each step to the appropriate tool, and complete the related work.
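Below is a minimal sketch of how such a prompt could be sent to the model. It assumes the Hugging Face transformers text-generation pipeline and the microsoft/Phi-3.5-MoE-instruct checkpoint; the user message and generation settings are illustrative, not the exact ones used here.

from transformers import pipeline

# Illustrative sketch: run the agent-style system prompt against Phi-3.5 MOE
pipe = pipeline(
    "text-generation",
    model="microsoft/Phi-3.5-MoE-instruct",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": sys_msg},
    {"role": "user", "content": "Introduce Generative AI with MOE and translate the introduction into Chinese."},
]

result = pipe(messages, max_new_tokens=1024, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])  # the JSON plan, similar to the example below

The model then returns a step-by-step plan such as the following: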
[
    {
        "agentid": "step1",
        "tool_name": "Blog",
        "input": "Generative AI with MOE",
        "output": "Generative AI with MOE (Mixture of Experts) is a powerful approach that combines the strengths of generative models and the flexibility of MOE architecture. This hybrid model can generate high-quality, diverse, and contextually relevant content, making it suitable for various applications such as content creation, data augmentation, and more."
    },
    {
        "agentid": "step2",
        "tool_name": "Translate",
        "input": "Generative AI with MOE is a powerful approach that combines the strengths of generative models and the flexibility of MOE architecture. This hybrid model can generate high-quality, diverse, and contextually relevant content, making it suitable for various applications such as content creation, data augmentation, and more.",
        "output": "基于生成AI的MOE(Mixture of Experts)是一种强大的方法,它结合了生成模型的优势和MOE架构的灵活性。这种混合模型可以生成高质量、多样化且上下文相关的内容,使其适用于各种应用,如内容创建、数据增强等。"
    },
    {
        "agentid": "final",
        "tool_name": "Result",
        "output": "基于生成AI的MOE(Mixture of Experts)是一种强大的方法,它结合了生成模型的优势和MOE架构的灵活性。这种混合模型可以生成高质量、多样化且上下文相关的内容,使其适用于各种应用,如内容创建、数据增强等。"
    }
]
Thoughts on SLMs
Resources