[Cloud][Databricks] OAP MLLib jar can not be loaded by Databricks runtime #138

Open
yao531441 opened this issue Oct 31, 2021 · 2 comments

@yao531441
Contributor

We want to get the OAP MLlib performance gain on Databricks, but it seems the OAP MLlib jar cannot be loaded by the Databricks runtime. The error log is shown below:
image

We used the KMeans demo to test:

import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.evaluation.ClusteringEvaluator

spark.sparkContext.setLogLevel("INFO")

val dataset = spark.read.format("libsvm").load("/FileStore/mllib_data/sample_kmeans_data.txt")

// Trains a k-means model.
val kmeans = new KMeans().setK(2).setSeed(1L)
val model = kmeans.fit(dataset)

// Make predictions
val predictions = model.transform(dataset)

// Evaluate clustering by computing Silhouette score
val evaluator = new ClusteringEvaluator()

val silhouette = evaluator.evaluate(predictions)
println(s"Silhouette with squared euclidean distance = $silhouette")

// Shows the result.
println("Cluster Centers: ")
model.clusterCenters.foreach(println)

xwu99 changed the title from "[Cloud][Databricks]InvalidClassException: scala.Product$class on Dataproc 2.0 when running Hibench" to "[Cloud][Databricks] OAP MLLib jar can not be loaded by Databricks runtime" on Nov 2, 2021

@xwu99
Collaborator

xwu99 commented Nov 15, 2021

Do you know whether the Databricks runtime uses K8S as its cluster manager?

@yao531441
Contributor Author

Do you know whether the Databricks runtime uses K8S as its cluster manager?

I tried to dig this information up, but I couldn't find anything about it in the official docs.
Based on page 23 of Databricks' slides, my guess is that Databricks uses its own cluster manager.
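
One way to check this empirically, rather than from docs, might be to look at the master URL the running session reports. The sketch below is only an illustration: `ClusterManagerCheck` and `describe` are hypothetical names, not part of Spark or OAP MLlib, and the mapping relies only on the standard Spark master URL schemes (`k8s://`, `spark://`, `mesos://`, `yarn`, `local`). What Databricks actually reports is not established in this thread.

```scala
// Hypothetical helper (not a Spark or OAP MLlib API): maps a Spark master URL
// to the cluster manager that its scheme conventionally indicates.
object ClusterManagerCheck {
  def describe(master: String): String = master match {
    case m if m.startsWith("k8s://")   => "Kubernetes"
    case m if m.startsWith("spark://") => "Spark standalone"
    case m if m.startsWith("mesos://") => "Mesos"
    case m if m.startsWith("yarn")     => "YARN"
    case m if m.startsWith("local")    => "local mode"
    case other                         => s"unknown ($other)"
  }
}

// In a notebook, where `spark` is the existing SparkSession:
// println(ClusterManagerCheck.describe(spark.sparkContext.master))
```

If the result is anything other than "Kubernetes", that would at least rule out a plain K8S deployment from Spark's point of view, though a vendor runtime could still wrap another manager underneath.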
