How to use Log Analytics log data exported to Storage Accounts
Introduction
Exporting your logs from Sentinel or Log Analytics to Azure storage account blobs gives you low-cost, long-term retention, as well as benefits such as immutability for legal hold and geographical redundancy.
But in the event of an incident, or perhaps a legal case, you may need the data archived away in those storage account blobs to help the investigation.
How do you go about retrieving and analyzing that data? This blog will answer exactly that question. Hint: it does involve an Azure Data Explorer cluster. I will also briefly explain how data ends up in those blobs in the first place.
**Note:** Throughout this blog post, ADX is used as shorthand for “Azure Data Explorer”.
How to export Log Analytics data to Storage Account Blobs
Option One: Workspace Export
The easiest method is to enable the export on the Log Analytics Workspace itself. To do this, browse to Settings > Data Export in the Log Analytics Workspace blade, then select “New Export rule”. You can select one or more tables to export, as well as the destination storage account.
Note that the export happens at ingestion time, which means you will end up with duplicate data, at least for the duration of the Log Analytics retention period. The export runs continuously, at 5-minute intervals.
The result in the storage account is the following structure:
A container is created in the Storage Account for each exported table, named am- followed by the name of the table (for example, “am-azureactivity”).
Blobs are stored in 5-minute folders in the following path structure: WorkspaceResourceId=/subscriptions/subscription-id/resourcegroups/<resource-group>/providers/microsoft.operationalinsights/workspaces/<workspace>/y=<four-digit numeric year>/m=<two-digit numeric month>/d=<two-digit numeric day>/h=<two-digit 24-hour clock hour>/m=<two-digit 60-minute clock minute>/PT05M.json. Append blobs are limited to 50,000 writes, so more blobs will be added in the folder as PT05M_#.json, where ‘#’ is the incremental blob count.
Source: Log Analytics workspace data export in Azure Monitor – Azure Monitor | Microsoft Learn
Note that the export itself has a per-GB cost, so be sure to take that into consideration: Pricing – Azure Monitor | Microsoft Azure
Option Two: Logic App
Alternatively, you can build a Logic App or a Function App that exports exactly what you want (which tables and which columns, or perhaps data that is first summarized into a smaller set) and when you want it. This option is documented here: Export data from a Log Analytics workspace to a storage account by using Logic Apps.
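As an illustration, the query that such a Logic App runs could already trim the data down before it is written to the blob. The following is only a hypothetical example; the SecurityEvent table and the selected columns are placeholders, so use whatever fits your own scenario:

SecurityEvent
| where TimeGenerated > ago(1d)
// summarize into a smaller result set before writing it to the storage account
| summarize EventCount = count() by Computer, EventID, Activity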
In contrast to the previous option (export on the workspace), this example leads to a flat folder, with a name that you define in the Logic App, and all the JSON files listed in that same folder.
Now how do I get to the data?
Querying data in a storage account can be done in various ways, as we will see below.
externaldata() in KQL
If you use the externaldata() KQL function from Log Analytics or Azure Data Explorer (ADX), your query will look similar to the example below, where two different JSON files are read from the storage account container “loganalytics-data” (from Option Two above).
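The following is only a rough sketch: the blob URLs, SAS tokens and column list are placeholders and assumptions, so adjust them to match the schema of your own exported data.

externaldata(TimeGenerated: datetime, OperationName: string, ActivityStatusValue: string)
[
    h@'https://<storageaccount>.blob.core.windows.net/loganalytics-data/<first-file>.json?<sas-token>',
    h@'https://<storageaccount>.blob.core.windows.net/loganalytics-data/<second-file>.json?<sas-token>'
]
// multijson handles both JSON arrays and newline-delimited JSON records
with (format='multijson')
| where TimeGenerated > ago(90d)
| summarize count() by OperationName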
The URL and SAS token for each JSON file are needed. You can obtain those by going to each JSON file in the storage account container and selecting “Generate SAS” and then “Generate SAS token and URL” (or use your favorite scripting language).
Here are some of the advantages and disadvantages of this option:
Advantages:
No need to deploy any additional components such as the Azure Data Explorer cluster.
Allows for querying “on the fly” of your blob content.
Disadvantages:
Requires a SAS token (Entra ID authentication is always better).
Involves “leg work” in the form of defining the fields and copying and pasting the URL with the SAS token.
Potentially slow performance if a lot of data is queried (many JSON files).
external_table in ADX
The next option is to leverage Azure Data Explorer (ADX) and create an “external table” in ADX, which is then pointed at the storage account container holding the exported logs in JSON format. Deploying an Azure Data Explorer cluster is beyond the scope of this blog; more information can be found here: Create an Azure Data Explorer cluster and database.
Once deployed, your Azure Data Explorer cluster can be kept in the “Stopped” state when it is not needed, allowing you to save costs. Be sure to start it before testing the following steps (and stop it again when you are finished and nobody else needs it!).
The data in the cluster can be accessed by browsing to the cluster URI (which you can copy from the cluster’s Overview page in the Azure portal). The detailed steps are here: Create an external table using the Azure Data Explorer web UI wizard in Azure Data Explorer.
Below are some steps to get you started:
To create an external table, go to Query on the left-hand side of the screen, then right-click on the database and select “Create external table”:
Follow the wizard: give your external table a name in the Destination part, then in the Source part select Container and browse to your storage subscription, storage account and container. Here I selected the container generated under Option One above (the one created by the export from Log Analytics).
ADX then reads all the JSON files (even across the hierarchical folder structure, so there is no need to browse that yourself) and lets you select one of the files as a template for generating the schema of the new external table. Proceed with the wizard if you are happy with the proposed schema. In my case the data comes from the Azure Activity log, so the resulting external table uses the Azure Activity columns.
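As a side note, the same external table can also be created with a KQL management command instead of the wizard. The following is only a rough sketch under a few assumptions: the table name ExternalData, an Azure Activity export in the am-azureactivity container, a SAS token on that container, and a trimmed-down column list that you would extend to match the full schema.

.create external table ExternalData (TimeGenerated: datetime, OperationName: string, Level: string, ResourceGroup: string)
kind=storage
dataformat=multijson
(
    // placeholder storage account and SAS token; an account key or managed identity can be used instead
    h@'https://<storageaccount>.blob.core.windows.net/am-azureactivity?<sas-token>'
)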
External tables in ADX can be queried by using external_table(‘tablename’). Here are some of the advantages and disadvantages of this option:
Advantages:
No need to ingest the data into the Azure Data Explorer database.
Allows for querying of the blob content as if it were a native table.
Disadvantages:
Potentially slow performance if a lot of data is queried (many JSON files).
Ingest into ADX
The final option covered here is to ingest the data into a table in ADX. The difference from the external table option, where queries in reality still read from the storage account, is that in this case the data actually ends up residing in a table in the ADX database. This is described here: Get data from Azure storage.
Here are some high-level steps to get you started:
Again, right-click on the database, and this time select “Get data”.
Follow the wizard, select Azure Storage, then connect to the container in the storage account, either via the URI, or by selecting the Azure components. Be sure to create a new table, as in the example below (“DataFromBlobs”).
ADX is smart enough to inspect the JSON files in the container hierarchy and generate a schema. You can click “Edit columns” or “Advanced” to alter it.
For the purposes of this blog, I am happy enough, so I click on Finish at this point. ADX now goes about its business of ingesting the JSON files from the storage account. Note that depending on how much data there is to ingest, it can take a while. At the end, each blob (JSON file) will have “Successfully ingested” next to it.
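As a side note, individual blobs can also be ingested with a KQL command instead of the wizard. The following is only a hypothetical sketch: it assumes the DataFromBlobs table already exists with a matching schema, and the blob path and SAS token are placeholders to fill in.

.ingest into table DataFromBlobs (
    h@'https://<storageaccount>.blob.core.windows.net/am-azureactivity/<path-to-blob>/PT05M.json?<sas-token>'
) with (format='multijson')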
Now I can query the data table directly:
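For example, assuming the Azure Activity data from above (the column names below are therefore Azure Activity columns; adjust them to your own data):

DataFromBlobs
| where TimeGenerated > ago(30d)
| summarize Operations = count() by OperationName
| order by Operations desc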
Here are some of the advantages and disadvantages of this option:
Advantages:
Performance-wise, the fastest querying experience once the data is ingested.
Disadvantages:
The cost of ingesting the data and the responsibility for securing the ADX cluster accordingly.
Cleaning up
To remove the data ingested in the last step, click on the three dots next to the table name and select “Drop table”, or issue the query command:
.drop table tablename (for example .drop table DataFromBlobs).
For external tables there is no right-click option for dropping the tables. Instead issue the following query command:
.drop external table tablename (for example .drop external table ExternalData)
Don’t forget to stop the Azure Data Explorer cluster to save costs.
Conclusion
In this blog post, I explored some options for accessing logs that were archived in Azure storage account containers, either through export from Log Analytics and Sentinel or through a custom Logic App.
This addresses the exceptional cases where you need that archived data, for example for historical context during an investigation.