So, starting off, all four columns were declared as nvarchar. But this was a Hive ORC table, and that did not work. The error returned was:
Msg 106000, Level 16, State 1, Line 75
org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
OLE DB provider "SQLNCLI11" for linked server "(null)" returned message
"Unspecified error".
Msg 7421, Level 16, State 2, Line 76
Cannot fetch the rowset from OLE DB provider "SQLNCLI11" for linked server " (null)"..
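For context, the original statement declared every column as nvarchar. I haven't reproduced the exact failing DDL here, but it would have looked roughly like this, with total_emp and salary mismatched against the int columns in the underlying ORC file:

CREATE EXTERNAL TABLE [dbo].sample_07c (
codeid nvarchar(200) null,
descrip nvarchar(200) null,
total_emp nvarchar(200) null, -- the ORC column is actually int
salary nvarchar(200) null -- the ORC column is actually int
)
WITH (
LOCATION='/apps/hive/warehouse/sample_07c/000000_0',
DATA_SOURCE = hdp23_orc,
FILE_FORMAT = ORCsnappy
);
go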
Looking closely, the key phrase in the error message was "cannot be cast." Once the two int columns were properly declared as int, the CREATE EXTERNAL TABLE statement ran successfully, and the table could then be queried. The successful DDL:
CREATE EXTERNAL TABLE [dbo].sample_07c (
codeid nvarchar(200) null,
descrip nvarchar(200) null,
total_emp int null, -- was nvarchar; must match the int column in the ORC file
salary int null -- was nvarchar; must match the int column in the ORC file
)
WITH (
LOCATION='/apps/hive/warehouse/sample_07c/000000_0',
DATA_SOURCE = hdp23_orc,
FILE_FORMAT = ORCsnappy
);
go
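With the corrected types in place, a quick query confirms that PolyBase can now read the ORC data (the WHERE clause here is just an arbitrary example):

SELECT TOP 10 codeid, descrip, total_emp, salary
FROM [dbo].sample_07c
WHERE salary > 50000
ORDER BY salary DESC;
go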
Also note that I had a dedicated data source for the ORC file type. According to MSDN, each data source must have a single file format type associated with it, so you'll need one data source for your DELIMITEDTEXT format types and another for your ORC format types (sample definitions are sketched after the note below). The exact warning in MSDN is:
Note: All data sources must have the same format type. You cannot have some data sources with text-delimited file formats and some with RCFILE formats.
See: MSDN: CREATE EXTERNAL TABLE (Transact-SQL)
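For reference, the hdp23_orc data source and ORCsnappy file format used above were created ahead of time. Their exact definitions aren't shown in this post, but they would look roughly like the following; the Hadoop name node address is a placeholder, so substitute your own cluster:

CREATE EXTERNAL FILE FORMAT ORCsnappy
WITH (
FORMAT_TYPE = ORC,
DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);
go

CREATE EXTERNAL DATA SOURCE hdp23_orc
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://sandbox.hortonworks.com:8020' -- placeholder name node address
);
go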