Deployment rules for Notebooks: Enhancing Efficiency with Microsoft Fabric.
Introduction.
Fabric Notebooks, an integral component of the Microsoft Fabric ecosystem, offer a powerful platform for interactive data exploration, analysis, and collaboration. Designed to enhance productivity and streamline workflows, Fabric Notebooks give users a versatile environment to write, execute, and share code and visualizations within their workspaces.
Fabric Notebooks let users perform interactive data analysis in programming languages such as Python and R. By seamlessly integrating code execution with explanatory text and visualizations, Fabric Notebooks streamline the workflow of data exploration and interpretation. Notebooks also support collaborative work, enabling multiple users to edit the same notebook simultaneously.
Using source control through Fabric’s integration with Git brings transparency and documentation to team workflows, making it easier to replicate analyses and share findings with stakeholders.
In this article, we describe the relationship between Notebooks and Lakehouses in Fabric and the advantages of putting notebooks under source control, and we show how to use deployment rules for Microsoft Fabric Notebooks. These rules speed up deployment and help compartmentalize knowledge about the components that make up an analytical solution.
Notebooks and Lakehouses.
The Microsoft Fabric Notebook is a primary code item for developing Apache Spark jobs and machine learning experiments. It’s a web-based interactive surface used by data scientists and data engineers to write code benefiting from rich visualizations and Markdown text. Data engineers write code for data ingestion, data preparation, and data transformation. Data scientists also use notebooks to build machine learning solutions, including creating experiments and models, model tracking, and deployment.
You can either create a new notebook or import an existing notebook.
Like other standard Fabric item creation processes, you can easily create a new notebook from the Fabric Data Engineering homepage, the workspace New option, or the Create Hub.
You can create new notebooks or import one or more existing notebooks from your local computer to a Fabric workspace from the Data Engineering or the Data Science homepage. Fabric notebooks recognize standard Jupyter Notebook (.ipynb) files and source files such as .py, .scala, and .sql.
To learn more about notebook creation, see How to use notebooks – Microsoft Fabric | Microsoft Learn.
The next figure shows Notebook1 in a workspace named “Learn”.
The items contained in this workspace can be added to source control if you have integrated it with an Azure DevOps (ADO) repository, using the Fabric feature explained at Overview of Fabric Git integration – Microsoft Fabric | Microsoft Learn.
With Git integration, you can back up and version your notebook, revert to previous stages as needed, collaborate or work alone using Git branches, and manage your notebook content lifecycle entirely within Fabric.
Fabric notebooks now support close interactions with Lakehouses.
Microsoft Fabric Lakehouse is a data architecture platform for storing, managing, and analyzing structured and unstructured data in a single location.
You can easily add a new or existing lakehouse to a notebook from the Lakehouse explorer:
We can also create a notebook directly from a Lakehouse; in that case, that Lakehouse becomes the default Lakehouse of the notebook.
Notebooks with different code for analyzing data may already exist, and we can select which notebook, or notebooks, the Lakehouse is going to be analyzed with.
Here is an example of a notebook and its associated Lakehouse.
So, a Lakehouse can be analyzed with one or more notebooks and, conversely, a notebook can analyze one or more Lakehouses. However, a notebook can have a pinned Lakehouse: the default Lakehouse where the notebook’s code stores, transforms, and visualizes data.
You can navigate to different lakehouses in the Lakehouse explorer and set one lakehouse as the default by pinning it. Your default is then mounted to the runtime working directory, and you can read or write to the default lakehouse using a local path.
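For example, once a lakehouse is pinned as the default, its Tables and Files areas can be addressed with relative paths from the built-in spark session. The following is a minimal sketch; the table name sales and the file Files/raw/sales.csv are hypothetical and assume such objects already exist in your default lakehouse:
# Read a Delta table from the default lakehouse using a relative path.
df = spark.read.format("delta").load("Tables/sales")
# Read a raw CSV file from the Files area of the default lakehouse.
raw = spark.read.option("header", "true").csv("Files/raw/sales.csv")
# Write transformed data back to the default lakehouse as a new Delta table.
df.write.mode("overwrite").format("delta").save("Tables/sales_clean")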
The default lakehouse in a notebook is typically managed in the configuration settings of the notebook’s code. You can set or overwrite the default lakehouse for the current session programmatically using a configuration block in your notebook. Here’s an example of how you might configure it:
%%configure
{
    "defaultLakehouse": {
        "name": "your-lakehouse-name" // the name of your lakehouse
        // "id": "<(optional) lakehouse-id>", // the ID of your lakehouse (optional)
        // "workspaceId": "<(optional) workspace-id-that-contains-the-lakehouse>" // the workspace ID if the lakehouse is in another workspace (optional)
    }
}
This code snippet should be placed at the beginning of your notebook to set the default lakehouse for the session. If you’re using a relative path to access data from the lakehouse, the default lakehouse will serve as the root folder at runtime.
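After the session starts with this configuration, you can sanity-check which lakehouse is mounted. Here is a small sketch, assuming the standard mssparkutils utilities available in Fabric notebooks:
from notebookutils import mssparkutils
# List the Files area of the default lakehouse mounted for this session.
for f in mssparkutils.fs.ls("Files"):
    print(f.name, f.path)
# List the tables registered in the default lakehouse.
for t in spark.catalog.listTables():
    print(t.name)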
Change the default lakehouse of a notebook using the Fabric user interface.
Using just the UI, in the Lakehouse list, the pin icon next to the name of a Lakehouse indicates that it’s the default Lakehouse in your current notebook.
After several lakehouses have been added to the notebook, you can pin the one you want as the default. Click the pin, and the previously added lakehouses appear:
To switch to a different default lakehouse, move the pin icon.
Notebooks inside Deployment Pipelines.
You can define a deployment pipeline in the “Learn” workspace, considering this workspace as the Development stage.
To learn more about Fabric’s deployment pipelines you can read Microsoft Fabric: Integration with ADO Repos and Deployment Pipelines – A Power BI Case Study.
Creating a deployment pipeline looks like this:
When you select “Create”, you can assign the desired workspace to the Development Stage.
After the workspace is assigned, you can see three stages and begin to deploy items to the next stage. Deploying creates the target workspace if it doesn’t already exist. See the next three images.
How can you use “deployment rules” for notebooks in Test and in Production stages?
For notebooks, Fabric lets users define deployment rules associated with specific notebooks. These rules let you change the default lakehouse that a notebook uses in each stage.
Here are the steps to follow.
1. Select the deployment rules icon at the upper right corner of the workspace stage, as seen in the next figure.
2. You see the notebooks created in your workspace after deploying content from the Development workspace:
3. Select the notebook you want to deploy to Production while changing the default lakehouse it will work with:
4. Add a rule to change the default lakehouse this notebook uses in the Test workspace to another lakehouse that already exists in Production. When you point the notebook at a different lakehouse in the target environment, the Lakehouse ID is required. You can find the ID of a lakehouse in its URL (see the sketch after these steps).
5. Press Deploy. The Production stage now runs the code of the notebook to which you applied a deployment rule against the different Lakehouse you specified, and that Lakehouse serves as the basis for the definitive analyses viewed by organization members, stakeholders, and authorized users.
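In the Fabric UI, the lakehouse ID is typically the GUID that appears after /lakehouses/ in the item’s URL. Alternatively, here is a hedged sketch that lists the lakehouses in a workspace, with their IDs, through the Fabric REST API; the workspace ID and access token are placeholders you must supply, and the call assumes your token has permissions for the Fabric API:
import requests

workspace_id = "<your-workspace-id>"  # placeholder
access_token = "<your-access-token>"  # placeholder: a valid Microsoft Entra token for the Fabric API

# List the lakehouses in the workspace and print their names and IDs.
response = requests.get(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/lakehouses",
    headers={"Authorization": f"Bearer {access_token}"},
)
response.raise_for_status()
for lakehouse in response.json().get("value", []):
    print(lakehouse["displayName"], lakehouse["id"])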
You can also deploy content backwards, from a later stage in the deployment pipeline to an earlier one. So if a notebook needs to be deployed from Production to Test, you can apply a deployment rule to it in the Production workspace to change its default Lakehouse.
Summary.
Microsoft Fabric notebooks are highly significant for data scientists.
They provide a comprehensive environment for completing the entire data science process, from data exploration and preparation to experimentation, modeling, and serving predictive insights.
Data scientists can easily attach a Lakehouse to a notebook to browse and interact with data, streamlining the process of reading data into data frames for analysis.
Git integration provides backup, versioning, branch-based collaboration, and full lifecycle management of notebook content within Fabric.
Deployment rules applied to notebooks in Microsoft Fabric are used to manage the application lifecycle, particularly when deploying content between different stages such as development, test, and production. This feature streamlines the development process and ensures quality and consistency during deployment.
You can find more information in the following resources:
How To Create NOTEBOOK in Microsoft Fabric | Apache Spark | PySpark – YouTube
How to use notebooks – Microsoft Fabric | Microsoft Learn
Develop, execute, and debug notebook in VS Code – Microsoft Fabric | Microsoft Learn
Explore the lakehouse data with a notebook – Microsoft Fabric | Microsoft Learn
Notebook source control and deployment – Microsoft Fabric | Microsoft Learn
Solved: How to set default Lakehouse in the notebook progr… – Microsoft Fabric Community