With SQL Server 2016 CTP2.x, PolyBase connected to Hadoop is a great way to access all the data stored in your Hadoop installation using familiar T-SQL commands. And if you have been reading some of the latest commentary on NoSQL, your company may have the urge to start using Hadoop as its sole data warehouse and repository. All you have to do is migrate your data warehouse data, and with the PolyBase connection that should be fairly simple - right?
[ Take me directly to PolyBase - Insert data into a Hadoop Hue Directory, where we outline just how to do it.]
For the early CTP releases, INSERT was not supported. I have now tested it (see PolyBase - Insert data into a Hadoop Hue Directory), and you can now INSERT INTO an external table via PolyBase. Once you have an external table set up, the next step is to run an INSERT command.
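As a refresher, the external table needs an external data source and file format behind it. Below is a rough sketch of those prerequisites; the data source and file format names (HDP_Hadoop, CsvFormat), the HDFS directory, and the column types are placeholders rather than the exact definitions from this environment (the HDFS address matches the one used later in this post).

-- Placeholder external data source pointing at the Hadoop name node.
create external data source HDP_Hadoop
with ( type = HADOOP, location = 'hdfs://192.168.1.120:8020' );

-- Placeholder delimited-text file format for a simple CSV file.
create external file format CsvFormat
with ( format_type = DELIMITEDTEXT, format_options ( field_terminator = ',' ) );

-- External table used in the INSERT example below; the HDFS location is illustrative.
create external table [dbo].InsertTest_1
(
    firstcol  varchar(50),
    secondcol varchar(50)
)
with (
    location = '/user/hue/InsertTest_1/',
    data_source = HDP_Hadoop,
    file_format = CsvFormat
);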
Note: The following was written before INSERT was fully supported.
Perhaps something like this:
insert into [dbo].InsertTest_1 ( firstcol, secondcol )
values ( 'firstcolvalue1', 'secondcolvalue1' );
In the earlier CTP versions, after running the insert command, you were greeted with
an error message similar to the following:
Msg 46519, Level 16, State 16, Line 68
DML Operations are not supported with external tables.
In a later CTP, we tried a similar insert against a different external table:
insert into [dbo].AWDW_CSV_3String ( FullName, AddressLine1, DataSrc )
values ( 'Ziggy Stardust', '1 Moonshot Drive', 'Manual' );
The message returned was:
Msg 46914, Level 16, State 1, Line 61
INSERT into external table is disabled. Turn on the configuration
option 'allow polybase export' to enable.
OK, update the configuration option.
sp_configure 'allow polybase export', 1;
reconfigure
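If it still misbehaves, it is worth confirming the change actually took effect; a quick sanity check (nothing here is specific to this environment):

-- value_in_use should show 1 after the reconfigure above.
select name, value, value_in_use
from sys.configurations
where name = 'allow polybase export';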
Close, but we still have issues. PolyBase denies the existence of the path behind my table:
Msg 7320, Level 16, State 102, Line 64
Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11"
for linked server "SQLNCLI11". EXTERNAL TABLE access failed because the
specified path name 'hdfs://192.168.1.120:8020/user/hue/AWDW_CSV_3String.csv'
does not exist. Enter a valid path and try again.
Switching to a Hive table, we were able to insert data into our table successfully, though not initially.
The full path we first used for our test Hive table was
LOCATION='/apps/hive/warehouse/sample_08/sample_08'
This did not work. Pointing at the Hive table's directory instead (below) did allow us to insert new data:
LOCATION='/apps/hive/warehouse/sample_08'
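For reference, here is a rough sketch of the Hive-backed external table that ended up accepting inserts. The table name, column list, data source and file format names are illustrative placeholders (reusing the placeholder names from the sketch above), not the exact definitions from our environment; the LOCATION pointing at the table directory is the piece that mattered.

-- External table over the Hive warehouse directory; LOCATION is the table's
-- directory, not a file inside it.
create external table [dbo].Sample08_Hive
(
    code        varchar(50),
    description varchar(200),
    total_emp   int,
    salary      int
)
with (
    location = '/apps/hive/warehouse/sample_08',
    data_source = HDP_Hadoop,
    file_format = CsvFormat
);

-- With the directory-level LOCATION, an insert along these lines succeeded.
insert into [dbo].Sample08_Hive ( code, description, total_emp, salary )
values ( '99-9999', 'Test occupation', 1, 50000 );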
Ok, now take me directly to PolyBase - Insert data into a Hadoop Hue Directory, where we outline just how to do it.
1 comment:
Thank you.
Same issue... I could not insert into my directory, only into Hive/Warehouse.
I wish the error message was better than "invalid directory" - how about displaying the permissions?
Thanks again!!!
Rafael