XML to CSV: how to remove CRLF from XML source field?
I’m new to Azure Data Factory. I built a pipeline with a Data Flow to convert an XML file to a CSV file, following this tutorial: https://www.youtube.com/watch?v=ltR5xof6wq8
In the XML source file, the <description> elements contain embedded line breaks (CRLF), and these carry over into the resulting CSV file. This makes the output CSV file invalid, since each record should occupy exactly one line.
The XML file was created by copying the data from https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms762271(v=vs.85), pasting it into a plain XML file, and uploading the file to Azure storage.
How do I get rid of the CRLF in the <description> field? I don’t even know where to start looking, since I’m pretty new to Azure Data Factory. Maybe I’m asking the wrong question: in your world, would a valid XML file possibly have embedded CRLF in some of its fields? In my world, we get all kinds of garbage data from customers who don’t know how to fix it.
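To make the desired result concrete, here is a minimal Python sketch of the transformation, for illustration only. It is not an Azure Data Factory solution. The one-record sample is hypothetical (loosely modeled on the layout of the linked sample file), and `flatten` is a made-up helper name. The sketch also shows that a standard XML parser accepts line breaks inside element text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical one-record sample with CRLF inside <description>; not the real file.
SAMPLE_XML = (
    "<catalog>\r\n"
    "  <book id=\"bk101\">\r\n"
    "    <description>An in-depth look at creating applications\r\n"
    "      with XML.</description>\r\n"
    "  </book>\r\n"
    "</catalog>"
)


def flatten(text):
    """Collapse each run of whitespace (CR, LF, tabs, spaces) into one space."""
    return re.sub(r"\s+", " ", text).strip()


root = ET.fromstring(SAMPLE_XML)             # parses without error
raw = root.find("book/description").text     # still contains a line break
clean = flatten(raw)                         # single-line value, safe for one CSV row

print(repr(raw))
print(repr(clean))
```

What I am looking for is the equivalent of `flatten` applied to the <description> column inside the Data Flow, before the CSV sink.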
NOTE: I must use Azure Data Factory for this solution; I cannot use another Azure service or an outside service. Manually editing the input or output file is not feasible, because the files could be quite large in production.
Thank you.