
Failed to create thread #227

Open
keiranFTW opened this issue Apr 15, 2021 · 0 comments

Issue Description

Please describe your issue, along with:

  • expected behavior
  • encountered behavior

The crawler crashes unexpectedly after a while, claiming that resource limits have been reached.

How to reproduce it

If you are describing a bug, please describe here how to reproduce it.

Seed the crawler with 10,000 unique URLs and crawl using the default fetcher; you will be greeted with the following:

2021-04-15 13:45:06 INFO FairFetcher$:71 - Adding doc to SOLR
[15128.721s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
2021-04-15 13:45:06 WARN BlockManager:69 - Block rdd_25_0 could not be removed as it was not found on disk or in memory
2021-04-15 13:45:06 ERROR Executor:94 - Exception in task 0.0 in stage 15.0 (TID 11)
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
at java.lang.Thread.start0(Native Method) ~[?:?]
at java.lang.Thread.start(Thread.java:799) ~[?:?]
at shaded.org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at shaded.org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1227) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.impl.HttpSolrClient.(HttpSolrClient.java:204) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:952) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at edu.usc.irds.sparkler.storage.solr.SolrProxy.newClient(SolrProxy.scala:45) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at edu.usc.irds.sparkler.storage.solr.SolrProxy.(SolrProxy.scala:78) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at edu.usc.irds.sparkler.storage.StorageProxyFactory.getProxy(StorageProxyFactory.scala:33) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at edu.usc.irds.sparkler.model.SparklerJob.newStorageProxy(SparklerJob.scala:54) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at edu.usc.irds.sparkler.pipeline.FairFetcher.next(FairFetcher.scala:72) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at edu.usc.irds.sparkler.pipeline.FairFetcher.next(FairFetcher.scala:29) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at scala.collection.Iterator$$anon$11.next(Iterator.scala:494) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:222) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1371) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1298) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1362) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1186) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:360) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:311) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.rdd.RDD.iterator(RDD.scala:313) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.scheduler.Task.run(Task.scala:127) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377) ~[sparkler-app-0.2.2-SNAPSHOT.jar:?]
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449) [sparkler-app-0.2.2-SNAPSHOT.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
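Despite the name, `java.lang.OutOfMemoryError: unable to create native thread` usually means the OS refused `pthread_create` because a process/thread limit was hit, not that the Java heap is full. A quick sketch to check the limits that gate thread creation on Linux (paths and commands are typical for Linux; adjust for your distro):

```shell
# Check the limits that gate pthread_create on Linux.
# "unable to create native thread" is usually an OS limit, not Java heap.
ulimit -u                          # max user processes (each JVM thread counts)
cat /proc/sys/kernel/threads-max   # system-wide thread ceiling
ps -eLf | wc -l                    # total threads currently running
```

If the running thread count is near `ulimit -u` when the crash happens, the limit (or a leak that exhausts it) is the culprit.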

Environment and Version Information

Please indicate relevant versions, including, if relevant:

  • Java Version
    1.8
  • Spark Version
    Embedded Spark in the JAR
  • Operating System name and version
    Linux crawler 4.19.0-16-cloud-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux

External links for reference

If you think any other resources on the internet will help to understand and/or resolve this issue, please share them here.

Contributing

If you'd like to help us fix the issue by contributing some code, but would like guidance or help in doing so, please mention it!

I have raised the maximum number of processes to unlimited. Checking the system while the crawl was in progress showed 27,302 processes, 26,540 of which belonged to sparkler — this looks like a thread leak somewhere.
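The trace shows a new `HttpSolrClient` being built inside `FairFetcher.next` via `SolrProxy.newClient`; if each such client's `IdleConnectionEvictor` thread is never shut down, threads accumulate exactly as observed. A way to confirm from the shell (the `pgrep` pattern "sparkler" and the "Connection evictor" thread name are assumptions, the latter based on Apache HttpClient's default evictor thread naming):

```shell
# Count native threads held by the sparkler JVM. The pgrep pattern
# "sparkler" is an assumption about the process name; adjust as needed.
PID=$(pgrep -f sparkler | head -n1)
ls "/proc/$PID/task" | wc -l     # native threads in that process
# If jstack is on the PATH, count HttpClient evictor threads by name:
jstack "$PID" | grep -c 'Connection evictor'
```

A steadily growing evictor count across crawl iterations would point at clients being created per fetch but never closed.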
