Hadoop 2.x Support

Raghu Angadi edited this page May 12, 2013 · 1 revision

Elephant-bird added support for Hadoop 2.x in version 4.0.

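A major source of incompatibility between the Hadoop 1.x and 2.x APIs is that some classes in 1.x are interfaces in 2.x. Though the API itself is functionally compatible, the Java bytecode generated for method invocations differs depending on the Hadoop version used in the build: a call on a class compiles to invokevirtual, while a call on an interface compiles to invokeinterface. As a result, code compiled against one version can fail at runtime with an IncompatibleClassChangeError on the other.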

Many Hadoop-dependent projects publish two Maven artifacts, one compiled against 1.x and the other compiled against 2.x. This imposes an extra burden on applications, which must produce and deploy multiple versions, especially during the transition to Hadoop 2.x. Elephant-Bird artifacts are built to work with both Hadoop versions. They handle the class vs. interface incompatibility through reflection. The utility methods are defined in ContextUtil.java.
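The reflection approach can be illustrated with a minimal, self-contained sketch. It is not the ContextUtil implementation: `ReflectiveCallSketch`, `DemoContext`, `DemoContextImpl`, and `invoke` are hypothetical names used only for illustration, and `DemoContext` stands in for a Hadoop context type. The point is that the call goes through `java.lang.reflect.Method`, so the compiled caller contains no invokevirtual or invokeinterface instruction tied to the context type.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch; none of these names come from Elephant-Bird or Hadoop.
class ReflectiveCallSketch {

    // Stand-in for a type that is a class in one library version
    // and an interface in another.
    interface DemoContext {
        String getConfiguration();
    }

    static class DemoContextImpl implements DemoContext {
        @Override
        public String getConfiguration() {
            return "demo-configuration";
        }
    }

    // Looks up a public no-argument method by name on the declaring type
    // and invokes it on the target. Because the method is resolved at
    // runtime, the caller's bytecode does not depend on whether the
    // declaring type is a class or an interface.
    static Object invoke(Class<?> declaringType, String methodName, Object target) {
        try {
            Method method = declaringType.getMethod(methodName);
            return method.invoke(target);
        } catch (NoSuchMethodException | IllegalAccessException e) {
            throw new IllegalArgumentException("Cannot call " + methodName, e);
        } catch (InvocationTargetException e) {
            // Surface the exception thrown by the invoked method itself.
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        Object conf = invoke(DemoContext.class, "getConfiguration", new DemoContextImpl());
        System.out.println(conf);
    }
}
```

A real utility would likely look up each `Method` once and cache it, since repeated lookups add overhead on hot paths.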

ContextUtil.java

Example uses of ContextUtil in Elephant-Bird for handling Hadoop 1.x and 2.x, from pull #308:

  • Configuration job = context.getConfiguration(); --> Configuration job = ContextUtil.getConfiguration(context);
  • ioCtx.setStatus(status); --> ContextUtil.setStatus(ioCtx, status);
  • ioCtx.getCounter(key).increment(amount); --> ContextUtil.incrementCounter(ioCtx.getCounter(key), amount);

JobContext.getConfiguration() is the most commonly used method affected by this incompatibility.