FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask


I am getting the below error on creating a hive database

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com/facebook/fb303/FacebookService$Iface

Hadoop version: **hadoop-1.2.1**

Hive version: **hive-0.12.0**

Hadoop path: /home/hadoop_test/data/hadoop-1.2.1
Hive path: /home/hadoop_test/data/hive-0.12.0

I have copied hive*.jar, jline-*.jar and antlr-runtime.jar from hive-0.12.0/lib to hadoop-1.2.1/lib.

asked Apr 28, 2014 at 5:26 by user3579986

set hive.msck.path.validation=ignore;
MSCK REPAIR TABLE table_name;

Make sure the location is specified correctly
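If you are unsure whether the location is right, a quick way to check is to inspect the table's registered location and, if needed, repoint it before running the repair. This is only a sketch; my_db.my_table and the HDFS path are hypothetical placeholders:

    -- check where the metastore thinks the table lives (see the Location: line)
    DESCRIBE FORMATTED my_db.my_table;

    -- repoint the table if the registered location is wrong (path is a placeholder)
    ALTER TABLE my_db.my_table SET LOCATION 'hdfs:///user/hive/warehouse/my_db.db/my_table';

    set hive.msck.path.validation=ignore;
    MSCK REPAIR TABLE my_db.my_table;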

answered Nov 15, 2017 at 16:16 by loneStar

In the following way, I solved the problem.

set hive.msck.repair.batch.size=1;
set hive.msck.path.validation=ignore;

If you cannot set the value and get the following error: Error: Error while processing statement: Cannot modify hive.msck.path.validation at runtime. It is not in list of params that are allowed to be modified at runtime (state=42000,code=1)

add the following property in hive-site.xml:

<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>hive.msck.path.validation|hive.msck.repair.batch.size</value>
</property>
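The hive-site.xml change typically only takes effect after HiveServer2 is restarted; once it has been, the two properties should be settable per session again. A quick session-level check, with a placeholder table name:

    set hive.msck.path.validation=ignore;
    set hive.msck.repair.batch.size=1;
    -- printing a property without a value shows its current setting
    set hive.msck.path.validation;
    MSCK REPAIR TABLE my_db.my_table;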


answered May 20, 2019 at 3:32 by QiuYi

Set the hive.metastore.schema.verification property in hive-site.xml to true; by default it is false.

For further details check this link.

answered Sep 9, 2014 at 10:57 by Shailvi

Amazon Athena

If you get here because of Amazon Athena errors, you might use this bit below. First check that all your files have the same schema:

If you run an ALTER TABLE ADD PARTITION (or MSCK REPAIR TABLE) statement and mistakenly specify a partition that already exists and an incorrect Amazon S3 location, zero byte placeholder files of the format partition_value_$folder$ are created in Amazon S3. You must remove these files manually.

We removed the files with the awscli.

aws s3 rm s3://bucket/key/table/ --exclude="*" --include="*folder*" --recursive --dryrun 

See also the docs with some extra steps included.
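For reference, once the placeholder files are gone, the partition can be registered again with an ALTER TABLE ADD PARTITION pointing at the correct S3 location. The table, partition column, and bucket names below are placeholders, not taken from the question:

    ALTER TABLE my_table ADD IF NOT EXISTS
      PARTITION (dt = '2022-05-19')
      LOCATION 's3://bucket/key/table/dt=2022-05-19/';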

answered May 19, 2022 at 8:16 by Roelant

To properly fix this with MSCK:

  1. Remove the older partitions from the metastore, if their paths no longer exist, using

    ALTER TABLE dbname.tablename DROP PARTITION IF EXISTS (partition_column_name > 0);

  2. RUN MSCK REPAIR COMMAND

    MSCK REPAIR TABLE dbname.tablename;

Step 1 is required because the MSCK REPAIR command will throw an error if a partition has been removed from the file system (HDFS). By dropping all such partitions from the metastore first and then syncing with MSCK, the required partitions are added back properly.
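If you are on Hive 3.0 or later, the two steps can reportedly be collapsed into one statement, since MSCK accepts ADD/DROP/SYNC PARTITIONS clauses; treat this as a sketch and verify it against your Hive version:

    -- drops metastore partitions whose paths are gone and adds new ones found on HDFS
    MSCK REPAIR TABLE dbname.tablename SYNC PARTITIONS;
    -- or, to only remove stale partitions (step 1 above) without adding new ones
    MSCK REPAIR TABLE dbname.tablename DROP PARTITIONS;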

answered Aug 11, 2022 at 7:54 by Yash

The reason we got this error was that we had added a new column to the external Hive table. set hive.msck.path.validation=ignore; was enough to fix the Hive queries, but Impala had additional issues, which were solved with the steps below:

After doing an INVALIDATE METADATA, Impala queries started failing with: Error: incompatible Parquet schema for column

Solution for the Impala error: set PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;

If you're using a Cloudera distribution, the steps below will make the change permanent so you don't have to set the option per session.

Cloudera Manager -> Clusters -> Impala -> Configuration -> Impala Daemon Query Options Advanced Configuration Snippet (Safety Valve)

Add the value: PARQUET_FALLBACK_SCHEMA_RESOLUTION=name

NOTE: do not use SET or semi-colon when setting the parameter in Cloudera Manager
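For a per-session test before making the Cloudera Manager change permanent, something like the following can be run in impala-shell (the database and table names are placeholders):

    INVALIDATE METADATA my_db.my_table;
    SET PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;
    -- should no longer fail with "incompatible Parquet schema for column"
    SELECT * FROM my_db.my_table LIMIT 5;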

answered Oct 22, 2018 at 7:17 by Selwyn Fernandes

Open the Hive CLI using hive --hiveconf hive.root.logger=DEBUG,console to enable logs and debug from there. In my case, a camelCase name for the partition had been written on HDFS while I had created the Hive table with its name fully in lowercase.

answered Aug 24, 2020 at 18:29 by musiceni

None of the proposed solutions worked for me.

I discovered a 0-byte file named _$folder$ inside my table location path (at the same level as the partitions).
Removing it allowed me to run MSCK REPAIR TABLE t without issues.

This file was coming from an S3 restore (rollback to a previously versioned state).

answered Sep 19, 2022 at 14:04 by Jérémy

I faced the same error. The reason in my case was a directory created in the HDFS warehouse with the same name. Deleting this directory resolved my issue.

answered Nov 6, 2014 at 21:47 by Sumit Awkash

It's probably because your metastore_db is corrupted. Delete the .lck files from metastore_db.

answered Feb 18, 2016 at 9:27 by Ashish Chaudhari

hive -e "msck repair table database.tablename"
It will repair the table's partition metadata in the metastore.

answered Sep 28, 2016 at 5:23 by dilshad

Setting the below property and then doing an MSCK repair worked for me:

  • set hive.mapred.mode=unstrict;
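Note that the documented values for hive.mapred.mode are strict and nonstrict (anything other than strict appears to be treated as non-strict, which may be why unstrict also works). A minimal session sketch with a placeholder table name:

    -- printing the property without a value shows the current mode
    set hive.mapred.mode;
    set hive.mapred.mode=nonstrict;
    MSCK REPAIR TABLE my_db.my_table;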

answered Nov 7, 2022 at 13:18 by Rohan Mudaliar

I faced a similar issue when the underlying HDFS directory was updated with new partitions and the Hive metastore went out of sync.

Solved using the following two steps:

  1. MSCK TABLE table_name showed which partitions were out of sync.
  2. MSCK REPAIR TABLE table_name added the missing partitions.
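In other words, MSCK without REPAIR is a read-only check, and adding REPAIR applies the fix. A small sketch with a placeholder table name:

    -- check only: lists partitions that are out of sync, changes nothing
    MSCK TABLE my_db.my_table;
    -- adds the partitions that exist on HDFS but not in the metastore
    MSCK REPAIR TABLE my_db.my_table;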

answered Apr 30, 2020 at 9:35 by Varanasi Sai Bhargav

Writing Data

1.1 Caused by: org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field ‘col1’ not found

It is recommended that the schema should evolve in a backwards-compatible way while using Hudi. Please refer to https://avro.apache.org/docs/1.8.2/spec.html for more information on Avro schema resolution. This error generally occurs when the schema has evolved in a backwards-incompatible way by deleting some column 'col1', and we are trying to update some record in a parquet file which has already been written with the previous schema (which had 'col1'). In such cases, parquet tries to find all the fields present in the incoming record, and when it finds that 'col1' is not present, the mentioned exception is thrown.

The fix for this is to try and create an uber schema using all the schema versions evolved so far for the concerned event, and use this uber schema as the target schema. One good approach can be fetching the schema from the Hive metastore and merging it with the current schema.

Sample stacktrace where a field named "toBeDeletedStr" was omitted from a new batch of updates: https://gist.github.com/nsivabalan/cafc53fc9a8681923e4e2fa4eb2133fe

1.2 Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.avro.AvroConverters$FieldIntegerConverter

This error will again occur due to schema evolution in a non-backwards-compatible way. Basically, there is some incoming update U for a record R which is already written to your Hudi dataset in the concerned parquet file. R contains a field F with a certain data type, let us say long. U has the same field F with an updated data type of int. Such incompatible data type conversions are not supported by Parquet FS.

For such errors, please try to ensure that only valid data type conversions are happening in the primary data source from which you are ingesting.

Sample stacktrace when trying to evolve a field from Long type to Integer type with Hudi: https://gist.github.com/nsivabalan/0d81cd60a3e7a0501e6a0cb50bfaacea

1.3 org.apache.hudi.exception.SchemaCompatabilityException: Unable to validate the rewritten record <record> against schema <schema>
at org.apache.hudi.common.util.HoodieAvroUtils.rewrite(HoodieAvroUtils.java:215)

This can possibly occur if your schema has some non-nullable field whose value is not present or is null. It is recommended to evolve the schema in backwards-compatible ways. In essence, this means either making every newly added field nullable or defining default values for every new field. If you are relying on a default value for your field, note that as of Hudi version 0.5.1 this is not handled.

1.4 Hudi consumes too much space in a temp folder while upsert

When upserting large input data, Hudi spills part of the input data to disk when it reaches the maximum memory for merge. If there is enough memory, please increase the Spark executor's memory and the "hoodie.memory.merge.fraction" option, for example:

option("hoodie.memory.merge.fraction", "0.8")

Ingestion

2.1 Caused by: java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
at kafka.utils.Utils$.read(Utils.scala:381)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)

This might happen if you are ingesting from a Kafka source, your cluster is SSL-enabled by default, and you are using a version of Hudi older than 0.5.1. Previous versions of Hudi used the spark-streaming-kafka-0-8 library. With the 0.5.1 release of Hudi, Spark was upgraded to 2.4.4 and the spark-streaming-kafka library was upgraded to spark-streaming-kafka-0-10. SSL support was introduced in spark-streaming-kafka-0-10. Please see here for reference.

The workaround is to either use a Kafka cluster that is not SSL-enabled, or upgrade the Hudi version to at least 0.5.1 (or the spark-streaming-kafka library to spark-streaming-kafka-0-10).

2.2 Exception in thread "main" org.apache.kafka.common.KafkaException: Failed to construct kafka consumer

Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set

This might happen when you are trying to ingest from an SSL-enabled Kafka source and your setup is not able to read the jaas.conf file and its properties. To fix this, you need to pass the required properties as part of your spark-submit command, something like:

--files jaas.conf,failed_tables.json --conf 'spark.driver.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf' --conf 'spark.executor.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf'

2.3 com.uber.hoodie.exception.HoodieException: created_at(Part -created_at) field not found in record. Acceptable fields were :[col1, col2, col3, id, name, dob, created_at, updated_at]

This generally happens when a field marked as recordKey or partitionKey is not present in some incoming record. Please cross-verify your incoming records.

2.4 Is it possible to use a nullable field that contains null records as the primary key when creating a Hudi table?

No, it will throw a HoodieKeyException:

Caused by: org.apache.hudi.exception.HoodieKeyException: recordKey value: "null" for field: "name" cannot be null or empty.
  at org.apache.hudi.keygen.SimpleKeyGenerator.getKey(SimpleKeyGenerator.java:58)
  at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:104)
  at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:100)

Hive Sync

3.1 Caused by: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
__col1,__col2

This will usually happen when you are trying to add a new column to an existing Hive table using the HiveSyncTool.java class. Databases usually will not allow modifying a column datatype from a higher order to a lower order, or in cases where the datatype may clash with the data that is already stored or will be stored in the table. To fix this, try setting the following property:

set hive.metastore.disallow.incompatible.col.type.changes=false;
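As a sketch of how this plays out (the column name is taken from the error above, the new type is hypothetical), the kind of column change that was previously rejected should go through once the property is relaxed in the same session, or in hive-site.xml if your metastore enforces it server-side:

    set hive.metastore.disallow.incompatible.col.type.changes=false;
    -- example of a column type change that was previously rejected
    ALTER TABLE my_db.my_table CHANGE COLUMN __col1 __col1 bigint;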

3.2 com.uber.hoodie.hive.HoodieHiveSyncException: Could not convert field Type from <type1> to <type2> for field col1

This occurs because HiveSyncTool currently supports only a few compatible data type conversions. Doing any other incompatible change will throw this exception. Please check the data type evolution for the concerned field and verify whether it can indeed be considered a valid data type conversion as per the Hudi code base.

3.3 Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Database does not exist: test_db

This generally occurs if you are trying to do a Hive sync for your Hudi dataset and the configured hive_sync database does not exist. Please create the corresponding database on your Hive cluster and try again.
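A minimal fix, using the database name from the error message above:

    CREATE DATABASE IF NOT EXISTS test_db;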

3.4 Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'get_table_req'

This issue is caused by a Hive version conflict: Hudi is built with Hive 2.3.x, so if you still want Hudi to work with an older Hive version, rebuild it as follows.

Steps (build with hive-2.1.0):
1. git clone git@github.com:apache/incubator-hudi.git
2. rm hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/hive/HoodieCombineHiveInputFormat.java
3. mvn clean package -DskipTests -DskipITs -Dhive.version=2.1.0

3.3 Caused by: java.lang.UnsupportedOperationException: Table rename is not supported

This issue can occur when syncing to Hive. A possible reason is that Hive does not play well if your table name has a mix of upper- and lower-case letters. Try using all lower-case letters for your table name and it should likely get fixed. Related issue: https://github.com/apache/hudi/issues/2409


Running from IDE

4.1 java.lang.IllegalArgumentException: Unsupported class file major version 56

Please use Java 8, not Java 11.

Error:

When I try to create a table in Hive, I get an error due to lack of permission:

hive> CREATE EXTERNAL TABLE threshold_reference_table (Attribute STRING, Low_Age_Limit INT, High_Age_Limit INT, Low_Range_Value INT, High_Range_Value INT, Alert_Flag INT, Alert_Message STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=ec2-user, access=EXECUTE, inode="/user/hive/warehouse":hive:hive:drwxrwx---
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:201)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:154)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6855)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4455)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:912)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:533)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:862)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
)

Workaround:

Switch to the hdfs user and give ownership of the warehouse directory to the hive user with the hadoop group:

[root@ip-172-31-55-239 HBase]# su - hdfs
[hdfs@ip-172-31-55-239 ~]$  hdfs dfs -chown -R hive:hadoop /user/hive/warehouse

Now create a table

[hdfs@ip-172-31-55-239 ~]$ hive
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/jars/hive-common-1.1.0-cdh5.15.1.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> CREATE EXTERNAL TABLE threshold_reference_table (Attribute STRING, Low_Age_Limit INT, High_Age_Limit INT, Low_Range_Value INT, High_Range_Value INT, Alert_Flag INT, Alert_Message STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
OK
Time taken: 2.611 seconds


Hive Related Errors and fixes

1) FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict
Fix: a) set hive.exec.dynamic.partition.mode=nonstrict;
b) set hive.exec.dynamic.partition=true;
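A typical statement that triggers this error is a dynamic-partition insert; here is a sketch with placeholder table and column names, run after the two settings above:

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;
    -- the dynamic partition column (dt) must come last in the SELECT list
    INSERT OVERWRITE TABLE my_db.sales PARTITION (dt)
    SELECT id, amount, dt FROM my_db.sales_staging;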

2) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table.

Fix: deploy the latest mysql-jdbc jar

3) ClassNotFoundException: org.apache.hadoop.hive.jdbc.HiveDriver

add hive-jdbc JAR

4) java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException

add hive-metastore jar

5) java.lang.NoClassDefFoundError: org/apache/thrift/TBase

add libthrift JAR

6) java.lang.NoClassDefFoundError: org/apache/hadoop/hive/service/HiveInterface

add hive-service Jar

7) java.lang.NoClassDefFoundError: com/facebook/fb303/FacebookService$Iface

add the libfb303 JAR (this provides com/facebook/fb303/FacebookService$Iface, the class missing in the question at the top of this page)

8) java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory

add the slf4j-api and slf4j-log4j12 jars

9) java.lang.NoClassDefFoundError: org/apache/log4j/Level

add log4j Jar

10) java.lang.NoClassDefFoundError: org/apache/http/HttpRequestInterceptor

add Httpcore and httpclient jar

11) java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf$ConfVars

add hive-common jar

12)  java.lang.NoClassDefFoundError: org/apache/hadoop/util/Shell

add hadoop-common jar

13) java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory

add commons-logging jar

14) java.lang.NoClassDefFoundError: com/google/common/base/Preconditions

add guava.jar

15) java.lang.NoClassDefFoundError: org/apache/commons/collections/map/UnmodifiableMap

add commons-collections jar

