Type safe API for Transport #8

Open
rdsr opened this issue Feb 2, 2019 · 0 comments

rdsr commented Feb 2, 2019

This ticket makes a case for a more type-safe API for Transport. In my opinion, our API today falls short in two ways:

  1. The containers API (Struct, Map and Array) is not parameterized, so consuming elements from these containers requires a typecast. E.g. taking an Int out of a map means taking a StdData out and then typecasting it to a StdInt.
  2. The StdUDF API is parameterized, but the parameters must extend StdData. This is limiting because:
    1. Supporting #7 (Support for java.util.{Map, List} and Record container types) would require typecasting.
    2. It would make it impossible to support #6 (Support for standard java.lang primitive types).
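To make the first point concrete, here is a minimal Java sketch of what consuming a non-parameterized container looks like. The `StdData`, `StdInt`, `StdString` and `StdMap` types below are simplified, hypothetical stand-ins written only for this illustration; they are not the actual Transport classes, and `UntypedLookup` is an invented helper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for a non-parameterized container API (not the real Transport types).
interface StdData {}

final class StdInt implements StdData {
    private final int value;
    StdInt(int value) { this.value = value; }
    int get() { return value; }
}

final class StdString implements StdData {
    private final String value;
    StdString(String value) { this.value = value; }
    String get() { return value; }
    @Override public boolean equals(Object o) {
        return o instanceof StdString && ((StdString) o).value.equals(value);
    }
    @Override public int hashCode() { return value.hashCode(); }
}

// Non-parameterized map: every lookup returns the StdData supertype.
final class StdMap implements StdData {
    private final Map<StdData, StdData> entries = new HashMap<>();
    void put(StdData key, StdData value) { entries.put(key, value); }
    StdData get(StdData key) { return entries.get(key); }
}

final class UntypedLookup {
    static int lookup(StdMap map, String key) {
        // The compiler cannot verify this downcast; a wrong guess only fails at runtime.
        StdInt boxed = (StdInt) map.get(new StdString(key));
        return boxed.get();
    }

    public static void main(String[] args) {
        StdMap counts = new StdMap();
        counts.put(new StdString("a"), new StdInt(1));
        System.out.println(lookup(counts, "a"));
    }
}
```

If the map happened to hold a `StdString` under that key, the code would still compile and then throw a `ClassCastException` at the cast.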

I think we can do better. Presto may have some unknowns, but I was able to achieve some success with a Spark prototype. Below I outline my approach.

Containers API
As described in #7, I chose java.util.{List, Map} for the List and Map types. For Struct I defined a Record type, along the lines of Avro's GenericRecord. The key point is that all of these container APIs are parameterized, as shown below:

trait Schema {
  def schema: DataType
}

trait IndexedRecord extends Schema {
  def put[V](i: Int, v: V): Unit

  def get[V](i: Int): V
}

trait GenericRecord extends IndexedRecord {
  def put[V](key: String, v: V): Unit

  def get[V](key: String): V
}

abstract class GenericList[A] extends util.AbstractList[A] with Schema

abstract class GenericMap[K, V] extends util.AbstractMap[K, V] with Schema

So, to get a field out of a record, we'd write:

final GenericRecord r = ... 
final List<Integer> f = r.get("A");

This involves no explicit typecast at the call site, because the result type is inferred from the assignment target. Similar examples can be given for the other container types.
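A runnable Java sketch of the record lookup above may help. It assumes a Java rendering of the `GenericRecord` trait with the schema part left out; `MapBackedRecord` and `RecordDemo` are hypothetical names invented for this example and are not taken from the prototype.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Java rendering of the GenericRecord trait (schema omitted for brevity).
interface GenericRecord {
    <V> void put(String key, V value);
    <V> V get(String key);
}

// Illustrative in-memory implementation backed by a HashMap.
final class MapBackedRecord implements GenericRecord {
    private final Map<String, Object> fields = new HashMap<>();

    @Override
    public <V> void put(String key, V value) { fields.put(key, value); }

    @Override
    @SuppressWarnings("unchecked")
    public <V> V get(String key) {
        // The only cast lives here, inside the container, not in user code.
        return (V) fields.get(key);
    }
}

final class RecordDemo {
    static int sumField(GenericRecord r, String key) {
        List<Integer> values = r.get(key); // V is inferred as List<Integer>
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        GenericRecord r = new MapBackedRecord();
        r.put("A", Arrays.asList(1, 2, 3));
        System.out.println(sumField(r, "A")); // prints 6
    }
}
```

One design note on this sketch: the cast inside `get` is unchecked, so the compiler trusts the caller's target type. A real implementation would presumably rely on the record's schema to keep stored values and requested types consistent.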

UDF API
Similarly, for the UDF API, we can have generic parameters that need not extend StdData. Below is the API I used in my prototype:

trait Fn0[F] extends UDF0[F] with Fn

trait Fn1[T, F] extends UDF1[T, F] with Fn

trait Fn2[T1, T2, F] extends UDF2[T1, T2, F] with Fn

trait Fn3[T1, T2, T3, F] extends UDF3[T1, T2, T3, F] with Fn

trait Fn4[T1, T2, T3, T4, F] extends UDF4[T1, T2, T3, T4, F] with Fn

trait Fn5[T1, T2, T3, T4, T5, F] extends UDF5[T1, T2, T3, T4, T5, F] with Fn

trait Fn6[T1, T2, T3, T4, T5, T6, F] extends UDF6[T1, T2, T3, T4, T5, T6, F] with Fn

This would help us implement #7 cleanly and in a type-safe manner, and would make #6 possible.
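As an illustration of what a UDF could look like once its parameters no longer have to extend StdData, here is a small Java sketch. The `Fn2` interface below is a simplified, hypothetical analogue of the Scala trait above (it does not extend Spark's `UDF2` or the `Fn` marker), and `ListContains` is an invented example function.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, self-contained analogue of the Fn2 trait: two inputs, one result,
// with no bound tying the type parameters to a StdData hierarchy.
interface Fn2<T1, T2, R> {
    R call(T1 a, T2 b) throws Exception;
}

// A UDF written directly against java.util.List and boxed java.lang types.
final class ListContains implements Fn2<List<Integer>, Integer, Boolean> {
    @Override
    public Boolean call(List<Integer> xs, Integer x) {
        return xs.contains(x);
    }
}

final class FnDemo {
    public static void main(String[] args) {
        ListContains f = new ListContains();
        System.out.println(f.call(Arrays.asList(1, 2, 3), 2)); // prints true
    }
}
```

Because the argument and result types appear in the function's signature, neither the UDF body nor its caller needs a typecast.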

rdsr changed the title from "Better type safe API for Transport" to "Type safe API for Transport" on Feb 2, 2019