Migrating Azure Data Factory’s Snowflake Connector from Legacy to V2
We recently announced V2 of our popular Snowflake connector in Azure Data Factory. The new connector delivers better performance and security, supports both pipelines and Data Flows, and uses the new Go-based Snowflake driver.
Differences between the Legacy and V2 Connector
The V2 connector brings enhancements in performance and security. The major differences between the legacy and V2 connectors are:
| Snowflake (V2) | Snowflake (legacy) |
| --- | --- |
| Go driver | ODBC driver |
| Supports Basic and key pair authentication | Supports Basic authentication |
| Script parameters are not currently supported in the Script activity; as an alternative, use dynamic expressions | Supports script parameters in the Script activity |
| Supports BigDecimal in the Lookup activity; the NUMBER type, as defined in Snowflake, is displayed as a string | BigDecimal is not supported in the Lookup activity |
| Connection parameters: Account, Warehouse, Database, Schema, and Role | Connection parameter: connection string |
| Additional connection properties are not supported | Additional connection properties are supported |
| Only passwords / key pair keys can be stored in Key Vault | Passwords and connection strings can be stored in Key Vault |
The V2 connector offers several enhancements over the legacy version, including autoscaling, multi-availability, static IP, support for enhanced storage integration, and upcoming features such as Iceberg support. Find more information in the connector docs.
Deprecation of Legacy Connector
The Snowflake legacy connector reaches end of support on October 31st, 2024. At end of support, the connector is considered deprecated and is no longer supported. Customers are advised to upgrade to the new V2 connector to continue using the connector and benefit from the new features. Read the official announcement to learn more.
Migrating from Legacy to V2 Connector
Updating from the legacy to the V2 connector involves the following steps:
1. Check whether script parameters are being used. If so, modify them to use dynamic expressions.
2. Check whether additional connection properties are being used. If so, note that they cannot be carried over to the V2 connector.
3. Update the linked service to the V2 connector.
4. Update the dataset to the V2 dataset type.
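For the script-parameter step above, a parameterized statement can instead be built with a dynamic expression in the Script activity's text. A hedged sketch (the `region` pipeline parameter and the query itself are hypothetical; single quotes inside an expression string are escaped by doubling them):

```json
"typeProperties": {
    "scripts": [
        {
            "type": "Query",
            "text": {
                "value": "@concat('SELECT * FROM sales WHERE region = ''', pipeline().parameters.region, '''')",
                "type": "Expression"
            }
        }
    ]
}
```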
Updating the Linked Service:
Here is a typical example of a linked service using parameters and Basic authentication. The connector version needs to be changed to V2, and the type properties should be updated to include the following values from the connection string:
"typeProperties": {
    "authenticationType": "Basic",
    "accountIdentifier": "@{linkedService().accountIdentifier}",
    "user": "@{linkedService().user}",
    "database": "@{linkedService().database}",
    "warehouse": "@{linkedService().warehouse}",
    "encryptedCredential": "<encrypted_credential_value>"
}
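For orientation, a complete V2 linked-service definition might look like the following. This is a sketch, not a definitive template: the service name is hypothetical, angle-bracket values are placeholders, and the Key Vault reference assumes you store the password as a secret.

```json
{
    "name": "SnowflakeV2LinkedService",
    "properties": {
        "type": "SnowflakeV2",
        "typeProperties": {
            "authenticationType": "Basic",
            "accountIdentifier": "<account_identifier>",
            "user": "<user>",
            "database": "<database>",
            "warehouse": "<warehouse>",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "<KeyVaultLinkedService>",
                    "type": "LinkedServiceReference"
                },
                "secretName": "<password_secret_name>"
            }
        }
    }
}
```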
Updating the Dataset:
The dataset needs to be updated to use V2 of the connector: change the type property from SnowflakeTable to SnowflakeV2Table.
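As a sketch (the dataset name, linked-service reference, and table names are placeholders), a migrated dataset definition would look like:

```json
{
    "name": "SnowflakeV2Dataset",
    "properties": {
        "type": "SnowflakeV2Table",
        "linkedServiceName": {
            "referenceName": "<SnowflakeV2LinkedService>",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "schema": "<schema_name>",
            "table": "<table_name>"
        }
    }
}
```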
Updating the Definition files:
The Linked Service and the Dataset can be updated through Git. The following steps can be used to update the Linked Service.
Create a new Branch from the Main Branch
Update the Definition Files
Test out the Connection and the Pipelines
Open a pull request and merge it into the main branch
All other branches should then pull from the main branch, or make manual changes so their definitions match.
Gathering the list of Data Factories with Legacy Connector:
If you have a large environment and need to find which Data Factories use the legacy connector, you can do so with PowerShell. Please note that the script's runtime grows with the number of resources.
You can access the code in this Gist.
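The Gist above scans live resources with PowerShell. As a complementary sketch, if your factories are Git-backed, you can scan the exported linked-service definition files instead; this assumes the repository layout produced by ADF's Git integration (one JSON file per linked service) and relies on the legacy connector's type being "Snowflake" versus "SnowflakeV2" for the new one.

```python
import json
from pathlib import Path

def find_legacy_snowflake(defs_dir):
    """Return names of linked services still on the legacy Snowflake connector.

    Scans `defs_dir` recursively for linked-service JSON files, as exported
    by Azure Data Factory's Git integration.
    """
    legacy = []
    for path in Path(defs_dir).rglob("*.json"):
        try:
            doc = json.loads(path.read_text())
        except (ValueError, OSError):
            continue  # skip unreadable or non-JSON files
        props = doc.get("properties", {})
        # Legacy connector type is "Snowflake"; the V2 type is "SnowflakeV2".
        if props.get("type") == "Snowflake":
            legacy.append(doc.get("name", path.stem))
    return sorted(legacy)
```

Run it against the folder that holds your linked-service definitions and review each name it returns before migrating.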
Validating with Snowflake Query History:
The queries generated by Data Factory can be validated in Snowflake to check whether they originate from the legacy or the V2 connector.
SELECT DISTINCT CLIENT_APPLICATION_ID, USER_NAME
FROM SNOWFLAKE.ACCOUNT_USAGE.SESSIONS
WHERE CREATED_ON > DATEADD(DAY, -1, CURRENT_TIMESTAMP) // OPTIONAL: FILTER BY TIME
AND USER_NAME = '<ADF_USER>' // THE USER THAT IS CONFIGURED IN DATA FACTORY
ORDER BY CLIENT_APPLICATION_ID;
Sessions from the legacy connector show "ODBC" followed by the version, and sessions from the V2 connector show "Go" followed by the version. To check only for connections coming in over ODBC, filter on the CLIENT_APPLICATION_ID column.
SELECT DISTINCT CLIENT_APPLICATION_ID, USER_NAME
FROM SNOWFLAKE.ACCOUNT_USAGE.SESSIONS
WHERE CREATED_ON > DATEADD(DAY, -1, CURRENT_TIMESTAMP) // OPTIONAL: FILTER BY TIME
AND USER_NAME = '<ADF_USER>' // THE USER THAT IS CONFIGURED IN DATA FACTORY
AND CLIENT_APPLICATION_ID LIKE '%ODBC%' // FILTER BY ODBC FOR LEGACY CONNECTOR
ORDER BY CLIENT_APPLICATION_ID;
Best Practices:
The following best practices make the connector migration easier:
Use Git for version control with the Data Factory
Create a separate branch to manage just Linked Services without any Pipelines / Artifacts so that it can be merged with the main branch and other branches
Update the Username / Password based authentication to KeyPair Authentication for enhanced security
Use Key Vault for storing the Passwords / Keypair keys
Test out the connector in a specific dev environment / branch before pushing to the main branch
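For the key pair recommendation above, a sketch of the relevant type properties is shown below. This assumes key pair authentication on the V2 connector with the private key stored as a Key Vault secret; names in angle brackets are placeholders.

```json
"typeProperties": {
    "authenticationType": "KeyPair",
    "accountIdentifier": "<account_identifier>",
    "user": "<user>",
    "database": "<database>",
    "warehouse": "<warehouse>",
    "privateKey": {
        "type": "AzureKeyVaultSecret",
        "store": {
            "referenceName": "<KeyVaultLinkedService>",
            "type": "LinkedServiceReference"
        },
        "secretName": "<private_key_secret_name>"
    }
}
```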
We hope you will use the new and improved connector. If you have any issues or need help with the migration, please reach out to your Microsoft account team.
Microsoft Tech Community – Latest Blogs