Compilation for Hadoop 3.1.1

Vasily Shokov
Dear all,

I'm trying to compile CarbonData for Hadoop 3.1.1.

First of all, when I used the jar built for Hadoop 2.7.2 (as described in the documentation), I got an error from spark-shell:

./spark-shell --master yarn --driver-memory 1G --executor-memory 2G --executor-cores 2

java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.1.3.1.4.0-315
  at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
  at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
  at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
  at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
  at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
  at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1079)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:99)

Without the CarbonData jar, everything works fine.
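
My reading of the trace is that the Hive shim loader picked up with the jar takes the leading component of the Hadoop version string and accepts only certain major versions. The sketch below illustrates that kind of check. It is not the actual Hive source: the function names and the set of accepted major versions (1 and 2) are assumptions made for illustration.

```python
# Illustrative sketch only; names and the accepted set are assumptions,
# not the real org.apache.hadoop.hive.shims.ShimLoader implementation.
SUPPORTED_MAJOR_VERSIONS = {1, 2}  # assumed: Hadoop 3.x is not recognized


def hadoop_major_version(version: str) -> int:
    """Return the leading numeric component of a Hadoop version string."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"Illegal Hadoop version: {version} (expected A.B.* format)")
    return int(parts[0])


def check_hadoop_version(version: str) -> int:
    """Reject any version whose major number is outside the accepted set."""
    major = hadoop_major_version(version)
    if major not in SUPPORTED_MAJOR_VERSIONS:
        raise ValueError(f"Unrecognized Hadoop major version number: {version}")
    return major
```

Under these assumptions, check_hadoop_version("2.7.2") passes, while check_hadoop_version("3.1.1.3.1.4.0-315") is rejected because its major version is 3, which would match the message I see.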

Next, I tried to compile CarbonData with the <hadoop.version>3.1.1</hadoop.version> option set in pom.xml. In that case, I received a compilation error:

The field org.apache.carbondata.core.indexstore.PartitionSpec.locationPath is transient but isn't set by deserialization [org.apache.carbondata.core.indexstore.PartitionSpec] In PartitionSpec.java SE_TRANSIENT_FIELD_NOT_RESTORED

Could someone advise me on how to use (or recompile) CarbonData with Hadoop 3.1.1? As far as I know, CarbonData has supported Hadoop 3.1.1 since version 1.5.

Thank you!

Best regards,
Vasily Shokov.