
Can't use spark-batch-indexer #74

Open
benwck opened this issue Jun 23, 2016 · 7 comments

Comments

@benwck commented Jun 23, 2016

Hey guys,

I don't want to spam the druid-development group thread, so I'm posting here.
I built the jar myself against Spark 1.6.1 and added it to /druidpath/extensions/druid-spark-batch/ on both the overlord and the middle manager. I added
druid.indexer.task.defaultHadoopCoordinates=["org.apache.spark:spark-core_2.10:1.6.1"]
to the runtime properties of both nodes, restarted them, then submitted the job with the JSON file.
I still get: "error": "Could not resolve type id 'index_spark' into a subtype of [simple type, class io.druid.indexing.common.task.Task]\n at [Source: HttpInputOverHTTP@2cecd2f2; line: 54, column: 38]"
Any ideas, or is there more documentation available somewhere?

Thanks,
Ben
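For reference, the configuration described above would look roughly like the sketch below. The `druid.extensions.loadList` line is an assumption (it is the standard way to register an extension in the Druid 0.9.x line); the coordinate value is quoted from the comment.

```shell
# Sketch of the runtime.properties additions described above. The
# druid.extensions.loadList line is an assumption about how the extension
# was registered; the coordinate is taken verbatim from the comment.
cat >> runtime.properties <<'EOF'
druid.extensions.loadList=["druid-spark-batch"]
druid.indexer.task.defaultHadoopCoordinates=["org.apache.spark:spark-core_2.10:1.6.1"]
EOF

# Sanity check that both lines landed:
grep -E 'loadList|defaultHadoopCoordinates' runtime.properties
```

The same two lines would need to be present in the runtime.properties of both the overlord and the middle manager.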

@benwck (Author) commented Jun 23, 2016

The overlord raises the exception, but I checked the log and the module is loaded correctly:
2016-06-23T15:03:43,693 INFO [main] io.druid.initialization.Initialization - Loading extension [druid-spark-batch] for class [io.druid.initialization.DruidModule]

@drcrallen (Contributor) commented:

Hi Ben, thanks for the information. A few things to double-check: make sure you are using the https://github.com/metamx/druid-spark-batch/tree/druid0.9.0 branch for Druid 0.9.0, and if you are using a middle manager, make sure it ALSO loads the extension properly.

@benwck (Author) commented Jun 24, 2016

Actually, my middleManager doesn't load the extension. I did the same things on both nodes, but it is not working on the middle manager. I will investigate and let you know.
Many thanks !

@benwck (Author) commented Jun 28, 2016

I see the following logs on both the middleManager and the overlord, but I still get the same error.
Logs:
2016-06-28T08:14:11,916 INFO [main] io.druid.initialization.Initialization - Loading extension [druid-spark-batch] for class [io.druid.cli.CliCommandCreator]
2016-06-28T08:14:11,917 INFO [main] io.druid.initialization.Initialization - added URL[file:/home/ec2-user/druid-0.9.0/extensions/druid-spark-batch/druid-spark-batch_2.10.jar]
Error:
{
"error": "Could not resolve type id 'index_spark' into a subtype of [simple type, class io.druid.indexing.common.task.Task]\n at [Source: HttpInputOverHTTP@7e42fe95; line: 54, column: 38]"
}

Any suggestions on how to investigate this problem?

@drcrallen (Contributor) commented:

@benwck Assuming there are no errors reported during node startup, the only thing I can think of is to test with just the overlord locally (where the runner is local rather than remote) and see whether the overlord accepts the task under that condition. That way you rule out potential cross-node communication issues.
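The isolation test suggested above can be sketched like this. `druid.indexer.runner.type` is the Druid property that selects local versus remote task running; the file path is an assumption about this cluster's layout.

```shell
# Run tasks inside the overlord process itself ("local" runner) instead of
# farming them out to middle managers ("remote"), so cross-node communication
# is taken out of the picture. The path below is an assumed layout.
mkdir -p overlord
cat >> overlord/runtime.properties <<'EOF'
druid.indexer.runner.type=local
EOF
grep runner.type overlord/runtime.properties
```

If the overlord accepts the task with the local runner but rejects it with the remote runner, the problem is almost certainly on the middle manager side.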

@ImrulKayes commented:
I am having a similar issue. As @drcrallen suggested, I am running the overlord locally (i.e., a single-node cluster with all services running locally).
I also built the jar with Spark 1.6.1 and added it to $DRUID_HOME/extensions/druid-spark-batch/ on both the overlord and the middle managers, and added druid.indexer.task.defaultHadoopCoordinates=["org.apache.spark:spark-core_2.10:1.6.1"] to the runtime properties of the overlord and middle managers. Then I used pull-deps to fetch org.apache.spark:spark-core_2.10:1.6.1 into $DRUID_HOME/hadoop-dependencies. When I start all the services, the overlord and middle managers come up with no errors in the logs. However, when I submit the job with the JSON file I get the same error: {"error":"Could not resolve type id 'index_spark' into a subtype of [simple type, class io.druid.indexing.common.task.Task]\n at [Source: HttpInputOverHTTP@2c648040; line: 1, column: 3426]"}. Any idea?

@kosii commented Apr 10, 2017

Make sure to submit your task to the /druid/indexer/v1/task endpoint, not to /druid/indexer/v1/supervisor. I lost a lot of time because of this. :D
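In concrete terms: batch index tasks (including index_spark) are POSTed to the overlord's task endpoint; the supervisor endpoint is for streaming ingestion supervisors. The host, port, and task file name below are assumptions.

```shell
# Hypothetical overlord address; adjust for your cluster.
OVERLORD="http://localhost:8090"

# Correct endpoint for submitting a task spec:
TASK_URL="$OVERLORD/druid/indexer/v1/task"
echo "$TASK_URL"   # → http://localhost:8090/druid/indexer/v1/task

# Actual submission (commented out; needs a running overlord and a task file):
# curl -X POST -H 'Content-Type: application/json' \
#      -d @spark_index_task.json "$TASK_URL"
```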
