GitHub - himanshug/druid-hadoop-utils: Read druid segments from hadoop

About

This is a collection of utilities for reading Druid segments stored on HDFS from Hadoop. It contains a Hadoop input format, a Pig loader, and a Pig UDF for Druid complex metrics. The code is an early-stage prototype, so some details might change. That said, I have tested that it works and will update it as necessary. If you have any questions, please post them to the Druid community user groups.

It works by fetching the list of segments from the Druid overlord and then reading the segments directly from HDFS, so the overlord is the only Druid node used.
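The two-step flow described above can be sketched roughly as follows. This is only an illustration of the idea, not the project's actual API: the class SegmentReadSketch, the interfaces SegmentLister and SegmentReader, and the method readAll are all hypothetical names made up for this sketch. For the real entry points, see the javadoc of DruidInputFormat and DruidStorage.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the flow: ask the overlord which segments exist,
// then read each segment straight from HDFS. None of these names come
// from druid-hadoop-utils itself.
class SegmentReadSketch {

    /** Hypothetical: returns the HDFS paths of the segments known to the overlord. */
    public interface SegmentLister {
        List<String> listSegmentPaths(String dataSource, String interval);
    }

    /** Hypothetical: reads the rows of one segment directly from its HDFS path. */
    public interface SegmentReader {
        List<String> readRows(String hdfsPath);
    }

    /**
     * Step 1: fetch the segment list (the only call that would touch a Druid node).
     * Step 2: read every listed segment from storage, bypassing Druid entirely.
     */
    public static List<String> readAll(SegmentLister lister, SegmentReader reader,
                                       String dataSource, String interval) {
        List<String> rows = new ArrayList<>();
        for (String path : lister.listSegmentPaths(dataSource, interval)) {
            rows.addAll(reader.readRows(path));
        }
        return rows;
    }
}
```

Keeping the listing step and the reading step behind separate interfaces mirrors the point made above: only the listing step needs to talk to the overlord, while the reading step depends on HDFS alone.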

Quick Start

Get the code: git clone https://github.com/himanshug/druid-hadoop-utils.git
Build: mvn clean package
Download the required dependencies: mvn dependency:copy-dependencies
Create the javadocs: mvn javadoc:javadoc (the docs will be in submodule/target/site/apidocs/)
For help with the Druid Hadoop input format, see the javadoc of DruidInputFormat.
For the Druid Pig loader, see the javadoc of DruidStorage.
