Introducing AI21 Labs Jamba 1.5 Large and Jamba 1.5 Mini on Azure AI Models-as-a-Service
This June, AI21 Jamba-Instruct from AI21 Labs launched first on Azure and now, in partnership with AI21, we are excited to announce the availability of two new open models, AI21 Jamba 1.5 Large and AI21 Jamba 1.5 Mini, in the Azure AI model catalog. These models are based on the Jamba architecture, which combines Mamba and Transformer layers to achieve performance and efficiency for long-context processing tasks. You can get started with the models in your Azure AI Studio Hub through client samples like LangChain, LiteLLM, web requests and AI21’s Azure Client.
“We are excited to deepen our collaboration with Microsoft, bringing the cutting-edge innovations of the Jamba Model family to Azure AI users,” said Pankaj Dugar, SVP and GM of North America at AI21. “As an advanced hybrid SSM-Transformer model suite, the Jamba open model family democratizes access to LLMs that offer efficiency, low latency, high quality, and long-context handling. These models elevate enterprise performance and are seamlessly integrated with the Azure AI platform.”
Azure AI’s expansive model catalog, featuring over 1,600 foundational models, offers versatility and ease of use. This collection includes contributions from industry leaders such as AI21 Labs, Cohere, NVIDIA, OpenAI, G42, Mistral, and more, ensuring comprehensive coverage for diverse needs. Our partnerships with top AI providers and the release of Phi-3 from Microsoft Research have significantly broadened our offerings, making it easier for customers to find and choose models tailored to their specific applications.
What makes the Jamba 1.5 Models Unique?
Based on information from AI21 Labs, the Jamba 1.5 Large and Jamba 1.5 Mini models are the most powerful models to be built to date from the Jamba architecture. These models leverage the Hybrid Mamba-Transformer architecture, which optimizes the trade-off between speed, memory, and quality by using Mamba layers for short-range dependencies and Transformer layers for long-range dependencies. The result is a family of models that can handle long contexts with high efficiency and low latency.
Jamba 1.5 Mini has 12 billion active parameters and 52 billion total parameters, while Jamba 1.5 Large has 94 billion active parameters and 398 billion total parameters. Both models support a 256K context window, which means they can process up to 256,000 tokens or characters at a time. This is a significant improvement over the standard context window of most large language models, and it enables new possibilities for generative AI applications that require longer texts, such as document summarization, text generation, or information extraction.
These models come with several features that make them easy to use and integrate, such as function calling, RAG optimizations, and JSON mode. These features allow you to perform complex operations, such as querying external knowledge sources, composing multiple functions, or formatting the output, with simple natural language instructions.
AI21 Labs highlights various use cases of these models including the following:
Financial Services
Loan Term Sheet Generation
Customer Service Agents (Grounded Q&A)
Investment Research (Grounded Q&A)
Healthcare / Life Sciences
Digital Health Assistant
Research Assistant
Retail / CPG
Product Description Generator
Product FAQ Generator
Shopping Assistant
Why the Jamba Model Family on Azure?
Utilizing the Jamba 1.5 Model Family on Azure allows organizations to fully leverage AI with safety, reliability, and security. In addition, this offering allows developers to effortlessly integrate with Azure AI Studio tools, like Azure AI Content Safety to enhance responsible AI practices, Azure AI Search, and prompt flow to evaluate LLM outputs by computing metrics like groundedness.
Customers can use the API with various clients, including prompt flow, OpenAI, LangChain, LiteLLM, CLI with curl and Python web requests, and AI21 Lab’s Azure client. Since the Jamba 1.5 Large and Jamba 1.5 Mini models are available as Models-as-a-Service (MaaS) on Azure AI, they can easily be deployed as pay-as-you-go inference APIs without having to manage the underlying infrastructure.
Developers can also build confidently, knowing their data is secure. To start using Jamba 1.5 Large and Jamba 1.5 Mini, access the Azure AI Studio model catalog, select the Jamba 1.5 models, and deploy it using the pay go option.
How to use Jamba 1.5 Large and Jamba 1.5 Mini on Azure AI
To start building, enter the Azure AI Studio model catalog and utilize the Jamba 1.5 models. To view documentation on getting started, visit this link. Deploying Jamba 1.5 models takes a couple of minutes by following these steps:
Familiarize Yourself: If you’re new to Azure AI Studio, start by reviewing this documentation to understand the basics and set up your first project.
Access the Model Catalog: Open the model catalog in AI Studio.
Find the Model: Use the filter to select the AI21 Labs collection or click the “View models” button on the MaaS announcement card.
Select the Model: Open the Jamba 1.5 models from the list.
Deploy the Model: Click on ‘Deploy’ and choose the Pay-as-you-go (PAYG) deployment option.
Subscribe and Access: Subscribe to the offer to gain access to the model (usage charges apply), then proceed to deploy it.
Explore the Playground: After deployment, you will automatically be redirected to the Playground. Here, you can explore the model’s capabilities.
Customize Settings: Adjust the context or inference parameters to fine-tune the model’s predictions to your needs.
Access Programmatically: Click on the “View code” button to obtain the API, keys, and a code snippet. This enables you to access and integrate the model programmatically.
Integrate with Tools: Use the provided API in Large Language Model (LLM) tools such as prompt flow, Semantic Kernel, LangChain, or any other tools that support REST API with key-based authentication for making inferences.
Frequently Asked Questions (FAQ’s)
What does it cost to use the Jamba 1.5 Large or Jamba 1.5 Mini models on Azure?
You are billed based on the number of prompt and completions tokens. You can review the pricing in the Marketplace offer details tab when deploying the model. You can also find the pricing on the Azure Marketplace.
Jamba 1.5 Large: Paygo-inference-input tokens are 1k for $0.002; paygo-inference-output-tokens are 1k for $0.008
Jamba 1.5 Mini: Paygo-inference-input tokens are 1k for $0.0002; paygo-inference-output-tokens are 1k for $0.0004
Do I need GPU capacity in my Azure subscription to use Jamba 1.5 models?
No, you do not need GPU capacity. The Jamba 1.5 Large and Jamba 1.5 Mini models are offered as an API through Models as a Service.
Are Jamba 1.5 Large or Jamba 1.5 Mini available in Azure Machine Learning Studio?
Yes, Jamba 1.5 models are available in the model catalog in both Azure AI Studio and Azure Machine Learning Studio.
Jamba 1.5 Large and Jamba 1.5 Mini are listed on the Azure Marketplace. Can I purchase and use Jamba 1.5 models directly from Azure Marketplace?
Azure Marketplace is our foundation for commercial transactions for models built on or built for Azure. The Azure Marketplace enables the purchasing and billing of Jamba 1.5 models. However, model discoverability occurs in both Azure Marketplace and the Azure AI model catalog. Meaning you can search and find Jamba 1.5 models in both the Azure Marketplace and Azure AI model catalog.
If you search for Jamba 1.5 models in Azure Marketplace, you can subscribe to the offer before being redirected to the Azure AI model catalog in Azure AI Studio where you can complete subscribing and can deploy the model.
If you search for Jamba 1.5 models in the Azure AI model catalog, you can subscribe and deploy the model from the Azure AI model catalog without starting from the Azure Marketplace. The Azure Marketplace still tracks the underlying commerce flow.
Given that Jamba 1.5 models billed through the Azure Marketplace, does it retire my Azure consumption commitment (aka MACC)?
Yes, Jamba 1.5 models are an “Azure benefit eligible” Marketplace offer, which indicates MACC eligibility. Learn more about MACC here: https://learn.microsoft.com/en-us/marketplace/azure-consumption-commitment-benefit
Is my inference data shared with AI21 Labs?
No, Microsoft does not share the content from prompts or outputs with AI21 Labs. Learn more about data use through model catalog here: Data, privacy, and security for use of models through the Model Catalog in Azure AI Studio
Are there rate limits for the Jamba 1.5 Large or Jamba 1.5 Mini model on Azure?
Yes, there are rate limits for the Jamba 1.5 Large and Jamba 1.5 Mini model on Azure. Each deployment has a rate limit of 400K tokens per minute and 1,000 API requests per minute. Contact Azure customer support if you have additional questions.
Are the Jamba 1.5 models region specific?
Jamba 1.5 Large or Jamba 1.5 Mini model API endpoints can be created in AI Studio projects to Azure Machine Learning workspaces in the following regions:
Jamba 1.5 Mini Regions: East US 2, Sweden Central
Jamba 1.5 Large Regions: East US, Sweden Central
If you want to use the model in prompt flow in project or workspaces in other regions, you can use the API and key as a connection to prompt flow manually. Essentially, you can use the API from any Azure region once you create it in the regions listed above.
Can I fine-tune Jamba 1.5 Large and Jamba 1.5 Mini models on Azure?
You cannot currently fine-tune the model through Azure AI Studio.
Can I use MaaS models in any Azure subscription types?
Customers can use MaaS models in all Azure subsection types with a valid payment method, except for the CSP (Cloud Solution Provider) program. Free or trial Azure subscriptions are not supported.
Microsoft Tech Community – Latest Blogs –Read More