
Collections API: Use Futures #70

Merged: 2 commits merged into develop on Sep 1, 2017
Conversation

rajadain
Member

Overview

By using Futures instead of immediate values, we can fetch tiles for multiple layers in parallel, speeding up the entire operation, since IO is the most expensive part.

Tagging @lossyrob for code review.

Connects #67
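The core idea, starting all per-layer tile fetches at once and combining them with `Future.sequence`, can be sketched like this (the `fetchTile` helper and the layer names are hypothetical stand-ins, not this project's actual API):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelFetchSketch extends App {
  // Hypothetical stand-in for the expensive, IO-bound tile fetch for one layer.
  def fetchTile(layerId: String): Future[String] = Future {
    Thread.sleep(100) // simulate network/disk latency
    s"tile-for-$layerId"
  }

  val layerIds = List("nlcd", "soils", "streams")

  // Each fetchTile call starts its work immediately, so the fetches overlap:
  // roughly 100 ms total here instead of roughly 300 ms sequentially.
  val allTiles: Future[List[String]] = Future.sequence(layerIds.map(fetchTile))

  println(Await.result(allTiles, 5.seconds))
}
```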

Demo

Here are some recorded run times (in seconds) when running a RasterGroupedCount operation for HUC-08 in isolation, using nlcd-soils-request-huc8.json:

| Trial | Future | Natural |
|---|---|---|
| Run 1.1 | 6.76 | 8.25 |
| Run 1.2 | 3.11 | 4.15 |
| Run 1.3 | 2.77 | 4.92 |
| Run 1.4 | 2.52 | 3.66 |
| Run 1.5 | 3.22 | 3.73 |
| **Run 1 Average** | **3.676** | **4.942** |
| Run 2.1 | 2.75 | 3.74 |
| Run 2.2 | 2.41 | 3.69 |
| Run 2.3 | 2.8 | 3.62 |
| Run 2.4 | 2.33 | 3.61 |
| Run 2.5 | 3.67 | 4.08 |
| **Run 2 Average** | **2.792** | **3.748** |
| Run 3.1 | 2.42 | 4.19 |
| Run 3.2 | 2.23 | 3.31 |
| Run 3.3 | 2.48 | 3.45 |
| Run 3.4 | 2.28 | 3.59 |
| Run 3.5 | 2.82 | 3.62 |
| **Run 3 Average** | **2.446** | **3.632** |
| **Total Average** | **2.971** | **4.107** |
| **Speedup** | **1.38×** | |

So using Futures is clearly faster. Total throughput is lower when servicing multiple requests at once, though:

image

Notes

I tested with nlcd-soils-request-huc8.json and nlcd-streams-request.json and got correct results. Running many of them in parallel took longer, but everything came through in the end:

image

Note that these times are from my local machine, not from within the VM, which is likely to be a little slower.

I could not reproduce the original issue in #67, wherein a single request "blocked" the machine from accepting others. If there are any recommendations for reproducing it, I can try them, otherwise I think we should proceed with what we have and make a new card if that behavior returns.

Testing Instructions

  • Check out this branch and run the server
  • Run one of each type of operation and ensure the values are the same as before

Commits:

1. By using Futures instead of immediate values we can fetch tiles for multiple layers in parallel, thus speeding up the entire operation, as the IO is the most expensive part. This should also help with handling multiple large requests, as they are deferred appropriately.

2. Replace extraneous var with val. Remove unused imports.
@kellyi
Contributor

kellyi commented Aug 31, 2017

About to take a look at this.

lossyrob
lossyrob previously approved these changes Aug 31, 2017

@lossyrob lossyrob left a comment


Curious about the slowdown with 4 parallel requests, for both the Future-ized version and the previous version. I would guess that you'd still see a performance increase from the changes in this PR.

That's just a curiosity, though; I'm 👍 on this.

@rajadain
Member Author

For reference, here is nlcd-soils-request-huc8.json (a RasterGroupedCount operation on a ~3500 sq km shape) without Futures:

image

and with:

image

Here is nlcd-streams-request.json (a RasterLinesJoin operation on a shape of a few hundred sq km) without Futures:

image

and with:

image

These are the files I'm testing with:

pr70-futures-sample-requests.zip

Most notably, the initial heavy load of fetching tiles is significantly reduced, which is great for cold requests, the kind we're likely to have most often.

@lossyrob lossyrob dismissed their stale review August 31, 2017 14:59

Need to review how to use execution contexts for this.

@lossyrob

After talking with some of the GT team, I need to learn a bit more about how to properly use ExecutionContexts in this setting (based on some painful experience on the Raster Foundry team with their tiler). I'll report back here when I sort things out.

@kellyi
Contributor

kellyi commented Aug 31, 2017

Ran this script with the nlcd-soils-request-huc8 simultaneously in 4 tmux panes:

#!/usr/bin/env ruby

# Fire the given request at the geoprocessing server 10 times,
# printing only the wall-clock time for each run.
input = ARGV[0].to_s

10.times do
  # `time -p` writes to stderr; redirect it through the pipe to grep
  # while discarding the HTTP response body on stdout.
  system("/usr/bin/time -p http --timeout=90 :8090/run < #{input} 2>&1 > /dev/null | grep real")
end

Here are the results from develop:

real 10.24
real 9.81
real 9.35
real 8.86
real 9.33
real 8.33
real 7.16
real 7.15
real 6.65
real 7.63

real 10.27
real 8.37
real 10.22
real 9.24
real 8.82
real 7.56
real 7.33
real 6.81
real 7.68
real 7.11

real 12.52
real 8.24
real 9.81
real 9.41
real 8.58
real 8.31
real 6.99
real 6.75
real 7.96
real 7.10

real 9.49
real 8.63
real 9.45
real 9.09
real 8.07
real 8.77
real 7.49
real 6.80
real 6.99
real 7.46

And here's this branch:

real 8.19
real 6.71
real 5.90
real 9.33
real 8.44
real 6.71
real 9.91
real 9.21
real 5.47
real 8.65

real 16.66
real 6.21
real 10.21
real 9.52
real 8.60
real 5.70
real 8.39
real 6.16
real 6.93
real 5.37

real 4.15
real 4.65
real 11.63
real 5.53
real 9.75
real 10.07
real 11.92
real 5.86
real 6.55
real 9.93

real 6.59
real 7.30
real 12.84
real 7.33
real 12.81
real 12.39
real 10.55
real 7.07
real 4.39
real 3.76

Seems like this branch is generally faster. The VisualVM output was also pretty interesting -- let me see if I can make a gif of it.

@kellyi
Contributor

kellyi commented Aug 31, 2017

Not fully sure how to parse these, but this is develop, without the Futures, where the work seems to happen in the dispatchers (green represents running, brown is "parked", yellow is "waiting"):

screen shot 2017-08-31 at 11 56 07 am

This is this branch, where things seem to get forked off:

screen shot 2017-08-31 at 11 59 44 am

and

screen shot 2017-08-31 at 12 00 38 pm

@kellyi
Contributor

kellyi commented Aug 31, 2017

+1 from me.

One thing this doesn't include is the "blocking-dispatcher" suggestion made here: http://doc.akka.io/docs/akka-http/10.0.9/scala/http/handling-blocking-operations-in-akka-http-routes.html

I think we could probably delay those changes until we see how this performs without them?

Copy link
Contributor

@kellyi kellyi left a comment


+1

@kellyi kellyi assigned rajadain and unassigned kellyi Aug 31, 2017
@rajadain
Member Author

I tried the blocking dispatcher stuff like this:

diff --git a/api/src/main/resources/application.conf b/api/src/main/resources/application.conf
index e3de9fd..b3f93d2 100644
--- a/api/src/main/resources/application.conf
+++ b/api/src/main/resources/application.conf
@@ -1,3 +1,12 @@
+mmw-dispatcher {
+  type = Dispatcher
+  executor = "thread-pool-executor"
+  thread-pool-executor {
+    fixed-pool-size = 32
+  }
+  throughput = 500
+}
+
 geoprocessing {
     port = 8090
     hostname = "0.0.0.0"
diff --git a/api/src/main/scala/WebServer.scala b/api/src/main/scala/WebServer.scala
index 7a6814c..503e990 100644
--- a/api/src/main/scala/WebServer.scala
+++ b/api/src/main/scala/WebServer.scala
@@ -1,10 +1,10 @@
 package org.wikiwatershed.mmw.geoprocessing
 
+import akka.dispatch.MessageDispatcher
 import akka.http.scaladsl.unmarshalling.Unmarshaller._
-import akka.http.scaladsl.server.{ HttpApp, Route }
+import akka.http.scaladsl.server.{HttpApp, Route}
 import akka.http.scaladsl.marshallers.sprayjson.SprayJsonSupport._
 import spray.json._
-
 import com.typesafe.config.ConfigFactory
 import com.typesafe.scalalogging.LazyLogging
 
@@ -42,16 +42,21 @@ object WebServer extends HttpApp with App with LazyLogging with Geoprocessing {
     } ~
     post {
       path("run") {
-        entity(as[PostRequest]) { data =>
-          data.input.operationType match {
-            case "RasterGroupedCount" =>
-              complete(getRasterGroupedCount(data.input))
-            case "RasterGroupedAverage" =>
-              complete(getRasterGroupedAverage(data.input))
-            case "RasterLinesJoin" =>
-              complete(getRasterLinesJoin(data.input))
-            case _ =>
-              throw new Exception(s"Unknown operationType: ${data.input.operationType}")
+        extractActorSystem { system =>
+          implicit val blockingDispatcher:MessageDispatcher =
+            system.dispatchers.lookup("mmw-dispatcher")
+
+          entity(as[PostRequest]) { data =>
+            data.input.operationType match {
+              case "RasterGroupedCount" =>
+                complete(getRasterGroupedCount(data.input))
+              case "RasterGroupedAverage" =>
+                complete(getRasterGroupedAverage(data.input))
+              case "RasterLinesJoin" =>
+                complete(getRasterLinesJoin(data.input))
+              case _ =>
+                throw new Exception(s"Unknown operationType: ${data.input.operationType}")
+            }
           }
         }
       }

and tweaking the configuration with different numbers. Here are the results:

mmw-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 32
  }
  throughput = 500
}

screen shot 2017-08-31 at 3 36 05 pm

mmw-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 32
  }
  throughput = 50
}

screen shot 2017-08-31 at 3 34 55 pm

mmw-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 32
  }
  throughput = 100
}

screen shot 2017-08-31 at 3 33 36 pm

mmw-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 16
  }
  throughput = 100
}

screen shot 2017-08-31 at 3 32 12 pm

All of these numbers seem worse than the default performance. Tweaking them may be useful in certain contexts, but since we don't have a good enough understanding of the different dispatcher types and how their parameters affect runtime, I'd lean towards forgoing this for now.

I will, however, wait for @lossyrob to comment on the wisdom of using the global ExecutionContext.
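For background on that concern: the default global ExecutionContext is a fork-join pool sized to the number of cores, so blocking IO run on it can starve CPU-bound work; the usual alternative is a dedicated fixed-size pool for blocking calls. A minimal sketch (the pool size and names are illustrative assumptions, not this project's code):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingPoolSketch extends App {
  // Dedicated fixed-size pool for blocking IO, so tile fetches cannot
  // starve the default fork-join pool used for CPU-bound work.
  val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(32))

  def fetchTileBlocking(id: String): Future[String] =
    Future {
      Thread.sleep(50) // simulated blocking IO
      s"tile-$id"
    }(blockingEc) // pass the blocking pool explicitly instead of the global EC

  println(Await.result(fetchTileBlocking("nlcd"), 5.seconds))
}
```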

@lossyrob

I'm getting caught up in other things, and I don't want to be a blocker. Since this has a +1 and any remaining work is optimization, perhaps we merge this and write an issue to investigate potential optimizations around ExecutionContext usage?

@rajadain rajadain merged commit 821d107 into develop Sep 1, 2017
@rajadain rajadain deleted the tt/collections-api-futures branch September 1, 2017 02:50
@rajadain
Member Author

rajadain commented Sep 1, 2017

No problem! I created #71 to track that effort. I don't think there's any real rush on it.

@kellyi we should probably create a 3.0.0-beta-2 now that this has been merged, and upgrade MMW to use it.

@kellyi
Contributor

kellyi commented Sep 1, 2017

we should probably create a 3.0.0-beta-2 now that this has been merged, and upgrade MMW to use it.

Sounds good to me. I'll make a new release and create the corresponding MMW PR.
