
CG Code Transformations

Paul Rogers edited this page Nov 17, 2016 · 1 revision

Code Transformations

An unusual aspect of Drill's code generation system is its use of the ASM library to merge Java classes. Byte-code manipulation appears to have grown out of the Aspect-Oriented Programming community, and appears to be used by projects such as HBase. A good place to start is this tutorial.

The code transformer attempts to perform scalar replacement, which also seems to be a feature of the JVM itself. More information is available in this paper.
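
As a rough illustration of what scalar replacement means, the sketch below shows a method that accumulates into a small holder object next to a hand-rewritten version that keeps the same value in a plain local variable. The class and method names (ScalarReplacementSketch, IntBox, sumWithHolder, sumWithLocal) are hypothetical and are not part of Drill; the point is only that both forms compute the same result while the second allocates no object.

```java
public class ScalarReplacementSketch {
    /** Hypothetical holder with a single field; not a Drill class. */
    static final class IntBox {
        int value;
    }

    /** Original form: the running total lives in a heap-allocated holder. */
    static int sumWithHolder(int[] data) {
        IntBox box = new IntBox();
        for (int v : data) {
            box.value += v;
        }
        return box.value;
    }

    /** After scalar replacement: the holder's field becomes a local variable. */
    static int sumWithLocal(int[] data) {
        int boxValue = 0;
        for (int v : data) {
            boxValue += v;
        }
        return boxValue;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        // Both forms are expected to agree on every input.
        System.out.println(sumWithHolder(data) == sumWithLocal(data));
    }
}
```

The rewrite is only valid when the holder never escapes the method, which is the case here because IntBox is created, used and discarded inside sumWithHolder.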

Example - ExternalSortBatch, SingleBatchSorter

The SingleBatchSorter interface, template and template definition:

public interface SingleBatchSorter {
  public void setup(FragmentContext context, SelectionVector2 vector2, VectorAccessible incoming) 
              throws SchemaChangeException;
  public void sort(SelectionVector2 vector2);

  public static TemplateClassDefinition<SingleBatchSorter> TEMPLATE_DEFINITION = 
                new TemplateClassDefinition<SingleBatchSorter>(SingleBatchSorter.class, 
                                                               SingleBatchSorterTemplate.class);
}

public abstract class SingleBatchSorterTemplate implements SingleBatchSorter, IndexedSortable{
  ...

  public abstract void doSetup(@Named("context") FragmentContext context,
                               @Named("incoming") VectorAccessible incoming, 
                               @Named("outgoing") RecordBatch outgoing);
  public abstract int doEval(@Named("leftIndex") char leftIndex, 
                             @Named("rightIndex") char rightIndex);
}

The following code in ExternalSortBatch generates the specialized class:

  public SingleBatchSorter createNewSorter(FragmentContext context, VectorAccessible batch)
          throws ClassTransformationException, IOException, SchemaChangeException {
    CodeGenerator<SingleBatchSorter> cg = CodeGenerator.get(
        SingleBatchSorter.TEMPLATE_DEFINITION,
        context.getFunctionRegistry(),
        context.getOptions());
    ClassGenerator<SingleBatchSorter> g = cg.getRoot();

    generateComparisons(g, batch);
    return context.getImplementationClass(cg);
  }

Here, generateComparisons() builds the query-specific methods that perform the comparisons the sort requires.
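
The division of labor between the template and the generated code can be pictured with ordinary subclassing. This is only a logical sketch: Drill merges byte code rather than writing a subclass by hand, and every name below (TemplateSketch, SorterTemplate, StringColumnSorter, sortedIndexes) is hypothetical. The template owns the generic sort loop, and the "generated" part supplies only the comparison, in the role that doEval() plays above.

```java
public class TemplateSketch {
    /** Simplified stand-in for a sorter template: the sort logic is fixed, the comparison is not. */
    abstract static class SorterTemplate {
        /** Query-specific comparison, analogous in role to doEval() in the template. */
        abstract int doEval(int leftIndex, int rightIndex);

        /** Generic insertion sort over an index array, using only doEval() to compare. */
        void sort(int[] indexes) {
            for (int i = 1; i < indexes.length; i++) {
                int current = indexes[i];
                int j = i - 1;
                while (j >= 0 && doEval(indexes[j], current) > 0) {
                    indexes[j + 1] = indexes[j];
                    j--;
                }
                indexes[j + 1] = current;
            }
        }
    }

    /** Hand-written stand-in for generated code: compares the values of one String column. */
    static final class StringColumnSorter extends SorterTemplate {
        private final String[] column;

        StringColumnSorter(String[] column) {
            this.column = column;
        }

        @Override
        int doEval(int leftIndex, int rightIndex) {
            return column[leftIndex].compareTo(column[rightIndex]);
        }
    }

    /** Returns row indexes ordered by the column's values; the column itself is not moved. */
    static int[] sortedIndexes(String[] column) {
        int[] indexes = new int[column.length];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = i;
        }
        new StringColumnSorter(column).sort(indexes);
        return indexes;
    }
}
```

Sorting an index array rather than the data mirrors the role of the selection vector in the interface above: rows stay where they are and only the indexes are reordered.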

This particular case compares nullable VarChar fields. Internally, the code generator looks for a function whose template name is compare_to_nulls_high and whose arguments are nullable VarChar. The match is:

  @FunctionTemplate(name = FunctionGenerationHelper.COMPARE_TO_NULLS_HIGH,
                    scope = FunctionTemplate.FunctionScope.SIMPLE,
                    nulls = NullHandling.INTERNAL)
  public static class GCompareVarCharVsVarCharNullHigh implements DrillSimpleFunc {
    @Param VarCharHolder left;
    @Param VarCharHolder right;
    @Output IntHolder out;
    public void setup() {}
    public void eval() {
     outside: {
      out.value = org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare(
          left.buffer, left.start, left.end, right.buffer, right.start, right.end );
      } // outside
    }
  }

The code generator will "inline" the above code. To do that, it needs to create local variables that correspond to the VarCharHolder parameters.
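
The sketch below shows, under simplifying assumptions, what replacing the holder parameters with local variables could look like. It is not Drill's generated code: VarCharRange is a hypothetical stand-in for VarCharHolder, a byte[] stands in for the buffer, the offsets array is an assumed layout for variable-width values, and compare() is a simple stand-in rather than ByteFunctionHelpers.compare.

```java
public class InlineHolderSketch {
    /** Hypothetical stand-in for a VarChar holder: a byte range within a buffer. */
    static final class VarCharRange {
        byte[] buffer;
        int start;
        int end;
    }

    /** Stand-in comparison: unsigned lexicographic order over two byte ranges. */
    static int compare(byte[] l, int lStart, int lEnd, byte[] r, int rStart, int rEnd) {
        int n = Math.min(lEnd - lStart, rEnd - rStart);
        for (int i = 0; i < n; i++) {
            int a = l[lStart + i] & 0xFF;
            int b = r[rStart + i] & 0xFF;
            if (a != b) {
                return a < b ? -1 : 1;
            }
        }
        // Equal prefix: the shorter range sorts first.
        return Integer.compare(lEnd - lStart, rEnd - rStart);
    }

    /** Holder-based form: allocate two holders, fill them, then compare. */
    static int evalWithHolders(byte[] data, int[] offsets, int leftIndex, int rightIndex) {
        VarCharRange left = new VarCharRange();
        left.buffer = data;
        left.start = offsets[leftIndex];
        left.end = offsets[leftIndex + 1];

        VarCharRange right = new VarCharRange();
        right.buffer = data;
        right.start = offsets[rightIndex];
        right.end = offsets[rightIndex + 1];

        return compare(left.buffer, left.start, left.end, right.buffer, right.start, right.end);
    }

    /** Inlined form: each holder field is a local variable and no holder is allocated. */
    static int evalInlined(byte[] data, int[] offsets, int leftIndex, int rightIndex) {
        // Locals that take the place of the "left" holder's fields.
        byte[] leftBuffer = data;
        int leftStart = offsets[leftIndex];
        int leftEnd = offsets[leftIndex + 1];

        // Locals that take the place of the "right" holder's fields.
        byte[] rightBuffer = data;
        int rightStart = offsets[rightIndex];
        int rightEnd = offsets[rightIndex + 1];

        return compare(leftBuffer, leftStart, leftEnd, rightBuffer, rightStart, rightEnd);
    }
}
```

Both methods return the same result for the same inputs; the inlined form simply avoids creating two short-lived objects on every comparison, which matters in a sort that compares rows many times.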
