Is the use of NVARCHAR(MAX) acceptable for this scenario?
Hi,
I’m using Azure Data Factory (ADF) Pipelines to ingest several external datasources into our domain. As you’ll know, I need to create a target database table for each source table I’m looking to ingest.
When creating the target tables for the first time, I can look at the source tables’ field properties (datatype and size) and set those same properties on the target tables. My concern is that if the source owners change the source field properties in the future and do not notify us (which is a strong possibility), the ADF Pipelines may fail, which I would then have to investigate and resolve. There is also another scenario: source field A is 20 characters long, so I set the target field to 20 characters, but the source field is later amended to 50 characters without my being informed. This may not break the ADF Pipeline execution but simply truncate the data.
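One way I could reduce the investigation burden is a pre-flight check that compares source and target column lengths before each run. A minimal sketch of the idea, using plain dictionaries to stand in for metadata that would in practice come from each database's INFORMATION_SCHEMA.COLUMNS view (the column names and lengths below are made up for illustration):

```python
# Sketch: flag source/target column-length drift before a pipeline run.
# The dicts stand in for metadata queried from INFORMATION_SCHEMA.COLUMNS
# (column name -> CHARACTER_MAXIMUM_LENGTH) on each side.

def find_length_drift(source_cols: dict, target_cols: dict) -> list:
    """Return (column, source_len, target_len) tuples where the source
    column has grown wider than the target column can hold."""
    drifted = []
    for name, src_len in source_cols.items():
        tgt_len = target_cols.get(name)
        if tgt_len is not None and src_len > tgt_len:
            drifted.append((name, src_len, tgt_len))
    return drifted

# Example: FieldA was widened from 20 to 50 characters at the source.
source = {"FieldA": 50, "FieldB": 100}
target = {"FieldA": 20, "FieldB": 100}

for name, src_len, tgt_len in find_length_drift(source, target):
    print(f"{name}: source allows {src_len} chars, target only {tgt_len}")
```

Running something like this ahead of the copy activity would surface silent truncation risks without having to widen every column defensively.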
One approach would be to set every target field to NVARCHAR(MAX). This would (should) avoid the situations above. I’m a one-person band on all things data at my company and am attempting to juggle many things as best I can. I feel the NVARCHAR(MAX) approach would reduce headaches and the time needed to investigate failures.
Questions:
1) I understand it isn’t good practice to use NVARCHAR(MAX) unless needed, but would you advise it acceptable with what I’m concerned about?
2) I understand that NVARCHAR(MAX) can take up more database space (it allows for up to 2 GB per value), but if we’re not actually filling that 2 GB, does it cost more in Azure SQL Server? The volume of data we have isn’t great, but I am still considering the costs involved.
TLDR: Would setting target fields as NVARCHAR(MAX) increase Azure SQL Server costs even if the fields only contain 20 characters?
Thanks in advance.