
Can't use spark-batch-indexer #74

Open
benwck opened this issue Jun 23, 2016 · 7 comments

Comments

@benwck commented Jun 23, 2016

Hey guys,

I don't want to spam the druid-development group thread, so I'm posting here.
I built the jar myself against Spark 1.6.1 and added it to /druidpath/extensions/druid-spark-batch/ on both the overlord and the middle manager. I added
druid.indexer.task.defaultHadoopCoordinates=["org.apache.spark:spark-core_2.10:1.6.1"]
to the runtime properties of both nodes, restarted them, then submitted the job with the JSON file.
I still get: "error": "Could not resolve type id 'index_spark' into a subtype of [simple type, class io.druid.indexing.common.task.Task]\n at [Source: HttpInputOverHTTP@2cecd2f2; line: 54, column: 38]"
Any ideas, or is there more documentation available somewhere?

Thanks,
Ben
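For reference, the configuration described above would look roughly like the sketch below. The `druid.extensions.loadList` line is an assumption (it is the standard way to register an extension in the Druid 0.9.x line); the coordinate value is quoted from the comment.

```shell
# Sketch of the runtime.properties additions described above. The
# druid.extensions.loadList line is an assumption about how the extension
# was registered; the coordinate is taken verbatim from the comment.
cat >> runtime.properties <<'EOF'
druid.extensions.loadList=["druid-spark-batch"]
druid.indexer.task.defaultHadoopCoordinates=["org.apache.spark:spark-core_2.10:1.6.1"]
EOF

# Sanity check that both lines landed:
grep -E 'loadList|defaultHadoopCoordinates' runtime.properties
```

The same two lines would need to be present in the runtime.properties of both the overlord and the middle manager.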

@benwck (Author) commented Jun 23, 2016

The overlord raises the exception, but I checked the log and the module is loaded correctly:
2016-06-23T15:03:43,693 INFO [main] io.druid.initialization.Initialization - Loading extension [druid-spark-batch] for class [io.druid.initialization.DruidModule]

@drcrallen (Contributor) commented:

Hi Ben, thanks for the information. A few things to double-check: make sure you are using the https://github.com/metamx/druid-spark-batch/tree/druid0.9.0 branch for Druid 0.9.0, and if you are using a middle manager, make sure it ALSO loads the extension properly.

@benwck (Author) commented Jun 24, 2016

Actually, my middleManager doesn't load the extension. I did the same things on both nodes, but it is not working on the middle manager. I will investigate and let you know.
Many thanks !

@benwck (Author) commented Jun 28, 2016

I see the following logs on both the middleManager and the overlord, but I still get the same error.
Logs:
2016-06-28T08:14:11,916 INFO [main] io.druid.initialization.Initialization - Loading extension [druid-spark-batch] for class [io.druid.cli.CliCommandCreator]
2016-06-28T08:14:11,917 INFO [main] io.druid.initialization.Initialization - added URL[file:/home/ec2-user/druid-0.9.0/extensions/druid-spark-batch/druid-spark-batch_2.10.jar]
Error:
{
"error": "Could not resolve type id 'index_spark' into a subtype of [simple type, class io.druid.indexing.common.task.Task]\n at [Source: HttpInputOverHTTP@7e42fe95; line: 54, column: 38]"
}

Any suggestions on how to investigate this problem?

@drcrallen (Contributor) commented:

@benwck Assuming there are no errors reported during node startup, the only thing I can think of is to test with just the overlord locally (where the runner is local rather than remote) and see whether the overlord accepts the task under that condition. That way you rule out potential cross-node communication issues.
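The isolation test suggested above can be sketched like this. `druid.indexer.runner.type` is the Druid property that selects local versus remote task running; the file path is an assumption about this cluster's layout.

```shell
# Run tasks inside the overlord process itself ("local" runner) instead of
# farming them out to middle managers ("remote"), so cross-node communication
# is taken out of the picture. The path below is an assumed layout.
mkdir -p overlord
cat >> overlord/runtime.properties <<'EOF'
druid.indexer.runner.type=local
EOF
grep runner.type overlord/runtime.properties
```

If the overlord accepts the task with the local runner but rejects it with the remote runner, the problem is almost certainly on the middle manager side.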

@ImrulKayes commented:
I am having a similar issue. As @drcrallen suggested, I am running the overlord locally (i.e., a single-node cluster with all services running locally).
I also built the jar with Spark 1.6.1 and added it to $DRUID_HOME/extensions/druid-spark-batch/ on both the overlord and the middle managers, and added druid.indexer.task.defaultHadoopCoordinates=["org.apache.spark:spark-core_2.10:1.6.1"] to the runtime properties of the overlord and middle managers. Then I used pull-deps to fetch org.apache.spark:spark-core_2.10:1.6.1 into $DRUID_HOME/hadoop-dependencies. When I start all the services, the overlord and middle managers come up with no errors in the logs. However, when I submit the job with the JSON file I get the same error: {"error":"Could not resolve type id 'index_spark' into a subtype of [simple type, class io.druid.indexing.common.task.Task]\n at [Source: HttpInputOverHTTP@2c648040; line: 1, column: 3426]"}. Any idea?

@kosii commented Apr 10, 2017

Make sure to submit your task to the /druid/indexer/v1/task endpoint, not to /druid/indexer/v1/supervisor. I lost a lot of time because of this. :D
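In concrete terms: batch index tasks (including index_spark) are POSTed to the overlord's task endpoint; the supervisor endpoint is for streaming ingestion supervisors. The host, port, and task file name below are assumptions.

```shell
# Hypothetical overlord address; adjust for your cluster.
OVERLORD="http://localhost:8090"

# Correct endpoint for submitting a task spec:
TASK_URL="$OVERLORD/druid/indexer/v1/task"
echo "$TASK_URL"   # → http://localhost:8090/druid/indexer/v1/task

# Actual submission (commented out; needs a running overlord and a task file):
# curl -X POST -H 'Content-Type: application/json' \
#      -d @spark_index_task.json "$TASK_URL"
```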
