Adding Batch leap frame and a sample batch tf transformer #600

Open
wants to merge 8 commits into master
Conversation

@sushrutikhar commented Nov 19, 2019

Currently, MLeap only has the default leap frame (DefaultLeapFrame), which applies transformations to the dataset row by row. However, since TensorFlow supports predictions over a batch of requests and is internally optimised for that, we can leverage this in MLeap with a batch leap frame. This increases throughput and decreases latency compared to sequential, row-by-row processing.
A BatchTransformer takes a Seq[Row] as input and returns the transformed and enriched output as a Seq[Row].
A sample BatchTensorflowTransformer is added in this PR.
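
For clarity, here is a minimal sketch of the difference between the row-at-a-time contract and the batch contract described above. The names below are illustrative only, not the exact classes added in this PR:

```scala
object BatchSketch {
  // Illustrative row type; the actual PR works with MLeap's own Row/LeapFrame classes.
  type Row = Seq[Any]

  // Row-at-a-time contract: how the default leap frame applies transformers today.
  trait RowTransformer {
    def transformRow(row: Row): Row
  }

  // Batch contract: the whole Seq[Row] is handed to the underlying library in one
  // call (e.g. a single TensorFlow session run for all rows), which is what the
  // batch leap frame and the sample BatchTensorflowTransformer take advantage of.
  trait BatchTransformer {
    def transformBatch(rows: Seq[Row]): Seq[Row]
  }

  // Naive fallback: applying a row transformer over a batch still runs row by row
  // and cannot exploit the library's internal batching optimisations.
  def sequentialBatch(t: RowTransformer, rows: Seq[Row]): Seq[Row] =
    rows.map(t.transformRow)
}
```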

Here is a comparison of benchmark numbers (using a Gatling client) between DefaultLeapFrame and BatchLeapFrame for a simple LR model written in TensorFlow.
The throughput gain is almost 2x.

TF-MLeap (DefaultLeapFrame):

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     300000 (OK=300000 KO=0     )
> min response time                                      0 (OK=0      KO=-     )
> max response time                                    238 (OK=238    KO=-     )
> mean response time                                     7 (OK=7      KO=-     )
> std deviation                                         10 (OK=10     KO=-     )
> response time 50th percentile                          5 (OK=5      KO=-     )
> response time 75th percentile                          9 (OK=8      KO=-     )
> response time 95th percentile                         26 (OK=26     KO=-     )
> response time 99th percentile                         55 (OK=55     KO=-     )
> mean requests/sec                                   3750 (OK=3750   KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 5 ms                                          146849 ( 49%)
> 5 ms < t < 20 ms                                  132300 ( 44%)
> t > 20 ms                                          20851 (  7%)
> failed                                                 0 (  0%)
================================================================================

TF-MLeap with Batching (BatchLeapFrame):

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     300000 (OK=300000 KO=0     )
> min response time                                      0 (OK=0      KO=-     )
> max response time                                     68 (OK=68     KO=-     )
> mean response time                                     3 (OK=3      KO=-     )
> std deviation                                          2 (OK=2      KO=-     )
> response time 50th percentile                          2 (OK=3      KO=-     )
> response time 75th percentile                          4 (OK=5      KO=-     )
> response time 95th percentile                          8 (OK=8      KO=-     )
> response time 99th percentile                         12 (OK=12     KO=-     )
> mean requests/sec                                7142.857 (OK=7142.857 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 5 ms                                          217808 ( 73%)
> 5 ms < t < 20 ms                                   81691 ( 27%)
> t > 20 ms                                            501 (  0%)
> failed                                                 0 (  0%)
================================================================================

@sushrutikhar (Author)

@hollinwilkins @ancasarb

@sushrutikhar (Author)

Hey @ancasarb, did you get a chance to have a look at the PR?

@lucagiovagnoli (Member)

This looks interesting.
@sushrutikhar do you think our benchmark analysis in #631 (between xgboost4j and mleap) could be related? The 2x factor might be a pattern common to both?

@sushrutikhar (Author)

@lucagiovagnoli the gain we saw is mainly because the default leap frame does not utilise the underlying library's ability to do parallel, batched processing. In your case it looks like the xgboost library itself has performance issues. However, as an exercise we could also try the batch leap frame introduced in this PR for xgboost and see whether it gives additional performance gain on top of the changes you proposed in your PR.
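
Purely as a hypothetical sketch of that exercise (not code from this PR or from #631): batchPredict below stands in for whatever single batched call the underlying library exposes, e.g. one xgboost prediction over a matrix built from all rows of the batch.

```scala
object BatchScoringSketch {
  // Illustrative row type, as in the earlier sketch.
  type Row = Seq[Any]

  // Hypothetical adapter: extract the feature vectors of all rows, make one
  // batched library call, and append the resulting score to each row.
  class BatchScorer(batchPredict: Seq[Array[Double]] => Seq[Double],
                    features: Row => Array[Double]) {
    def transformBatch(rows: Seq[Row]): Seq[Row] = {
      val scores = batchPredict(rows.map(features)) // single call for the whole batch
      rows.zip(scores).map { case (row, score) => row :+ score }
    }
  }
}
```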
