Presto

Metadata queries are failing with java.lang.RuntimeException: error initializing deserializer: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe


    Quboler
    Posted 04-17-2019 16:39
    When all metadata queries start failing with the following exception, what could be the cause?

    2019-04-12T18:27:04.334Z ERROR   SplitRunner-184-797948  com.facebook.presto.execution.executor.TaskExecutor Error processing Split 20190412_182551_18118_2rvw8.1.0-0 InformationSchemaSplit{tableHandle=hive:information_schema:columns, filters={}, addresses=[10.79.5.177:8081]} (start = 6.76413448971468E8, wall = 73014 ms, cpu = 0 ms, wait = 0 ms, calls = 1)
    java.lang.RuntimeException: error initializing deserializer: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
        at com.facebook.presto.hive.HiveUtil.initializeDeserializer(HiveUtil.java:421)
        at com.facebook.presto.hive.HiveUtil.getDeserializer(HiveUtil.java:378)
        at com.facebook.presto.hive.HiveUtil.getTableObjectInspector(HiveUtil.java:340)
        at com.facebook.presto.hive.HiveUtil.getTableStructFields(HiveUtil.java:357)
        at com.facebook.presto.hive.HiveUtil.getRegularColumnHandles(HiveUtil.java:790)
        at com.facebook.presto.hive.HiveUtil.hiveColumnHandles(HiveUtil.java:772)
        at com.facebook.presto.hive.HiveMetadata.listTableColumns(HiveMetadata.java:390)


    When all the metadata queries start failing with the above exception, please look into the following:

    1. Try re-running the metadata queries on a different cluster. If they fail there as well, the likely cause is a bad table that was created at some point and is now breaking every metadata query.

    2. How do we detect which table?
    This is the tricky part of the process. First identify when you started observing the behavior. Pin down the date and, ideally, a narrower time window, though the latter can be difficult.

    > If all you know is the date, you will have to query the metadata and look at the table creation timestamps, with a condition such as:

    SERDES.SLIB='org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' and CREATE_TIME
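    A minimal sketch of such a lookup, assuming the standard Hive metastore tables (TBLS, DBS, SDS, SERDES) and that CREATE_TIME is stored as epoch seconds. The function names, the date window, and the `?` placeholder style are illustrative assumptions; the placeholder syntax depends on your database driver (for example, MySQL drivers typically use `%s`).

```python
import sqlite3
from datetime import datetime, timezone

LAZY_SIMPLE_SERDE = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"

# Join tables to their storage descriptor and SerDe, then filter by SerDe
# library and creation window. "?" is the sqlite3 placeholder (assumption:
# adapt it to the driver used for your metastore database).
CANDIDATE_TABLES_SQL = """
SELECT d.NAME, t.TBL_NAME, t.CREATE_TIME, s.SLIB
FROM TBLS t
JOIN DBS d ON d.DB_ID = t.DB_ID
JOIN SDS sd ON sd.SD_ID = t.SD_ID
JOIN SERDES s ON s.SERDE_ID = sd.SERDE_ID
WHERE s.SLIB = ?
  AND t.CREATE_TIME >= ?
  AND t.CREATE_TIME < ?
ORDER BY t.CREATE_TIME
"""


def epoch_seconds(day: str) -> int:
    """Convert a 'YYYY-MM-DD' date (midnight UTC) to epoch seconds."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def candidate_tables(conn, start_day, end_day, slib=LAZY_SIMPLE_SERDE):
    """Return (db, table, create_time, slib) rows created in [start_day, end_day)."""
    cur = conn.cursor()
    cur.execute(
        CANDIDATE_TABLES_SQL,
        (slib, epoch_seconds(start_day), epoch_seconds(end_day)),
    )
    return cur.fetchall()
```

    For example, `candidate_tables(conn, "2019-04-12", "2019-04-13")` would list tables created on April 12 (UTC) that use LazySimpleSerDe, which gives a short list to inspect in the next step.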


    3. Once you identify the timeframe, you will have to narrow it down to a database and then within the database to a table or multiple tables that may be causing the issue.

    > When you run a "describe formatted" Hive query on a bad table in a database, you will see an exception like the following:

    ERROR: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.ClassNotFoundException Class org.apache.hive.hcatalog.data.JsonSerDe not found
    (Ex query: 30357552)


    This will eventually lead you to the table that is causing the error. Dropping this (bad) table should unblock you. The process is tricky, can take some time, and involves a bit of trial and error, but the steps above can serve as a template for investigating further.
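    The per-table check can be sketched as a small loop. This is only an illustration: `run_query` is a hypothetical callable that submits a Hive query and returns its output as text (or raises on failure), so it must be wired to however you actually run Hive commands.

```python
def find_bad_tables(tables, run_query):
    """Run "describe formatted" on each (db, table) pair and collect failures.

    run_query is a hypothetical callable (an assumption, not a real API):
    it takes a Hive query string and returns the output text, or raises.
    """
    bad = []
    for db, table in tables:
        query = f"describe formatted `{db}`.`{table}`"
        try:
            output = run_query(query)
        except Exception as exc:
            # The runner raised: record the table with the error text.
            bad.append((db, table, str(exc)))
            continue
        # Some runners return the error as text instead of raising.
        failed = [line for line in output.splitlines() if "FAILED" in line]
        if failed:
            bad.append((db, table, failed[0].strip()))
    return bad
```

    Feeding it the candidate tables from the creation-time window keeps the number of "describe formatted" calls small. Tables that come back with a ClassNotFoundException for their SerDe are the ones to examine first.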

    I hope this helps. 



    ------------------------------
    Best,
    Urvashi Kohli
    Big Data Support Engineer, Qubole, Inc.
    ------------------------------