SQL Server Virtualization and S3 – Authentication Error
We are experimenting with data virtualization in SQL server 2022 where we have data in S3 that we want to access from our SQL Server instances. I have completed the configuration according to the documentation, but I am getting an error when trying to access the external table. SQL Server says it cannot list the contents of the directory. Logs in AWS indicate that it cannot connect due to an authorization error where the header is malformed.
I verified that I can access that bucket with the same credentials using the AWS cli from the same machine, but I cannot figure out why it is failing or what the authorization header looks like. Any pointers on where to look?
Enable Polybase
select serverproperty(‘IsPolyBaseInstalled’) as IsPolyBaseInstalled
exec sp_configure @configname = ‘polybase enabled’, @configvalue = 1
Create Credentials and data source
create master key encryption by password = ‘<some password>’
go
create credential s3_dc with identity = ‘S3 Access Key’, SECRET = ‘<access key>:<secret key>’
go
create external data source s3_ds
with (
location = ‘s3://<bucket_name>/<path>/’,
credential = s3_dc,
connection_options = ‘{
“s3”:{
“url_style”:”virtual_hosted”
}
}’
)
go
Create External Table
CREATE EXTERNAL FILE FORMAT ParquetFileFormat WITH(FORMAT_TYPE = PARQUET)
GO
CREATE EXTERNAL TABLE sample_table(
code varchar,
the_date date,
ref_code varchar,
value1 int,
value2 int,
value3 int,
cost numeric(12,2),
peak_value varchar
)
WITH (
LOCATION = ‘/sample_table/’,
DATA_SOURCE = s3_ds,
FILE_FORMAT = ParquetFileFormat
)
GO
We are experimenting with data virtualization in SQL server 2022 where we have data in S3 that we want to access from our SQL Server instances. I have completed the configuration according to the documentation, but I am getting an error when trying to access the external table. SQL Server says it cannot list the contents of the directory. Logs in AWS indicate that it cannot connect due to an authorization error where the header is malformed. I verified that I can access that bucket with the same credentials using the AWS cli from the same machine, but I cannot figure out why it is failing or what the authorization header looks like. Any pointers on where to look? Enable Polybaseselect serverproperty(‘IsPolyBaseInstalled’) as IsPolyBaseInstalled
exec sp_configure @configname = ‘polybase enabled’, @configvalue = 1Create Credentials and data sourcecreate master key encryption by password = ‘<some password>’
go
create credential s3_dc with identity = ‘S3 Access Key’, SECRET = ‘<access key>:<secret key>’
go
create external data source s3_ds
with (
location = ‘s3://<bucket_name>/<path>/’,
credential = s3_dc,
connection_options = ‘{
“s3”:{
“url_style”:”virtual_hosted”
}
}’
)
go Create External TableCREATE EXTERNAL FILE FORMAT ParquetFileFormat WITH(FORMAT_TYPE = PARQUET)
GO
CREATE EXTERNAL TABLE sample_table(
code varchar,
the_date date,
ref_code varchar,
value1 int,
value2 int,
value3 int,
cost numeric(12,2),
peak_value varchar
)
WITH (
LOCATION = ‘/sample_table/’,
DATA_SOURCE = s3_ds,
FILE_FORMAT = ParquetFileFormat
)
GO Read More