Tuesday, September 1, 2015

Setting up PolyBase for YARN in SQL Server 2016

If your target Hadoop system is YARN based, you will need to follow the optional configuration outlined in MSDN for PolyBase and make one more configuration adjustment: copy the value of the yarn.application.classpath key from the Hadoop server into the yarn-site.xml file on your SQL Server machine.

Locating the Hadoop yarn.application.classpath
On my Hortonworks 2.0 server, the yarn.application.classpath key is defined in the yarn-site.xml file. This file lives in the Hadoop configuration directory, typically etc/hadoop/ on your Hadoop server. If you are unsure where yarn-site.xml is located, use the Linux locate command to find it:  locate yarn-site.xml



If locate is not installed on your system, use the find command instead:  find / -name yarn-site.xml.  On both my single-node Hortonworks 2.0 sandbox and my Hortonworks 2.3 sandbox, the file was in the directory /etc/hadoop/conf/. After locating the file, open it with the Linux command  vi yarn-site.xml  and scroll down to read the value.


To exit vi without saving any changes, type  :q!  and press Enter.

 
  For my Hortonworks 2.0 installation, the yarn.application.classpath value was:

       
<name>yarn.application.classpath</name>
<value>/etc/hadoop/conf,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,/usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,/usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*,/usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*</value>
    


For my Hortonworks 2.3 installation, the yarn.application.classpath value was:
       
<name>yarn.application.classpath</name>
<value>$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
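
If you would rather not scroll through the file in vi, a short script can pull the value out instead. The following is a minimal Python sketch, not something from the PolyBase documentation: the read_property helper and the SAMPLE text are hypothetical names made up for illustration, and the sample is a trimmed-down stand-in for a real yarn-site.xml. For a real file, read its text from the path you located earlier and pass that in.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the <value> of the Hadoop <property> whose <name> matches, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Hypothetical, shortened stand-in for the yarn-site.xml on the Hadoop server.
SAMPLE = """<configuration>
  <property>
    <name>yarn.application.classpath</name>
    <value>/etc/hadoop/conf,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*</value>
  </property>
</configuration>"""

classpath = read_property(SAMPLE, "yarn.application.classpath")

# The value is a comma-separated list, so print one entry per line for easier reading.
for entry in classpath.split(","):
    print(entry)
```

Printing one entry per line makes it easier to compare the Hadoop-side value with whatever ends up in the SQL Server copy of the file.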

Once you have this value, locate the yarn-site.xml file on your SQL Server machine and update its yarn.application.classpath value.  You can find this file under the SQL Server installation path:  <SqlBinRoot>\Polybase\Hadoop\Conf.  In most cases, the path is:


   C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf\   


yarn-site.xml File on SQL Server Before update:

    
<!-- Applications' Configuration-->
 <property>
<description>CLASSPATH for YARN applications. A comma-separated list of CLASSPATH entries</description>
<!-- Please set this value to the correct yarn.application.classpath that matches your server side configuration -->
<!-- For example: $HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/* -->
<name>yarn.application.classpath</name>
<value></value>
 </property>


yarn-site.xml File on SQL Server After update:


<!-- Applications' Configuration-->
 <property>
<description>CLASSPATH for YARN applications. A comma-separated list of CLASSPATH entries</description>
<!-- Please set this value to the correct yarn.application.classpath that matches your server side configuration -->
<!-- For example: $HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/* -->
<name>yarn.application.classpath</name>
<value>/etc/hadoop/conf,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,/usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,/usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*,/usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*</value>
 </property>
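
The same edit can be scripted rather than made by hand. This is a rough Python sketch under a few assumptions: set_property and BEFORE are hypothetical names for illustration, the sample is a shortened version of the file shown above, and Python 3.8 or later is assumed because insert_comments is used to keep the XML comments. Anything outside the root element, such as the XML declaration, may not be preserved by this approach, so back up the real file before overwriting it.

```python
import xml.etree.ElementTree as ET


def set_property(xml_text, name, value):
    """Return xml_text with the <value> of the named <property> replaced."""
    # insert_comments=True (Python 3.8+) keeps <!-- ... --> notes inside the tree.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                # Create the <value> element if the property has none yet.
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return ET.tostring(root, encoding="unicode")
    raise KeyError(name)


# Hypothetical, shortened stand-in for the SQL Server side yarn-site.xml.
BEFORE = """<configuration>
  <!-- Applications' Configuration-->
  <property>
    <name>yarn.application.classpath</name>
    <value></value>
  </property>
</configuration>"""

updated = set_property(
    BEFORE,
    "yarn.application.classpath",
    "/etc/hadoop/conf,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*",
)
print(updated)
```

Raising KeyError when the property is missing is a deliberate choice in this sketch: it avoids silently writing back a file that never contained the key.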

After updating the yarn.application.classpath value on your SQL Server machine, restart all of the SQL Server services, including the PolyBase services, so that the change takes effect.


For more information, see the MSDN article on PolyBase for YARN.

2 comments:

Dennes said...


I tried to do this configuration with HDInsight, but there wasn't a classpath property in the file on the name node.

Do you have information about how to do this with HDInsight?

Thank you !

Andrew Peterson said...

It could be that the configuration settings are not correct. The HDInsight versions do not directly match the Hortonworks versions that are listed for selection.
This site might help if that is the problem:
https://azure.microsoft.com/en-us/documentation/articles/hdinsight-component-versioning/

It could be that if you have an earlier version of HDInsight, it is not YARN based. Just a thought. I'm hoping to get to HDInsight in the next few weeks.