Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

making scoverage work with classpaths #267

Closed
wants to merge 8 commits into from

Conversation

sammy-1234
Copy link

@sammy-1234 sammy-1234 commented Jul 2, 2019

I have created a pull request for this issue. The changes I have made are recommendations given by @stuhood in the comment on the issue.

in Plugin.scala, added an additional option writeToClasspath that, if set to true, stores the instruments to the classpath and invokes Invoker.invokedWriteToClasspath in Invoker.scala at runtime.

In Invoker.scala, creating a new directory from the system option scoverage_measurement_path and copying the instruments there along with storing the measurements. If the system var is not set, it creates a subdirectory in the cwd.

If the option is not set (or is set to False), scoverage works the usual way.

Added a test for invokedUseEnvironment: sbt "testOnly scoverage.InvokerUseEnvironmentTest"

@@ -35,16 +34,23 @@ class ScoveragePlugin(val global: Global) extends Plugin {
options.dataDir = opt.substring("dataDir:".length)
} else if (opt.startsWith("extraAfterPhase:") || opt.startsWith("extraBeforePhase:")) {
// skip here, these flags are processed elsewhere
} else {
} else if (opt.startsWith("useEnvironment:")) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

userEnvironment should be renamed to something more meaningful related to classpath --use-classpath-dir.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should also be clear that it'll override dataDir.

@@ -35,16 +34,23 @@ class ScoveragePlugin(val global: Global) extends Plugin {
options.dataDir = opt.substring("dataDir:".length)
} else if (opt.startsWith("extraAfterPhase:") || opt.startsWith("extraBeforePhase:")) {
// skip here, these flags are processed elsewhere
} else {
} else if (opt.startsWith("useEnvironment:")) {
options.useEnvironment = opt.substring("useEnvironment:".length).toBoolean
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some error checking here? .toBoolean could potentially throw an error here.


if (opts.exists{p: String => p.contains("useEnvironment") && p.contains("false")} || (!opts.exists(_.startsWith("useEnvironment")))) {
if (!opts.exists(_.startsWith("dataDir:")))
throw new RuntimeException("Cannot invoke plugin without specifying <dataDir>")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or without specifying new option?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is being checked by !opts.exists(_.startsWith("useEnvironment"))

instrumentationComponent.setOptions(options)
}

override val optionsHelp: Option[String] = Some(Seq(
"-P:scoverage:useEnvironment:<boolean> if true, use the environment variable to store measurements" +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include some description for overrides dataDir?

if (!opts.exists(_.startsWith("dataDir:")))
throw new RuntimeException("Cannot invoke plugin without specifying <dataDir>")

if (opts.exists{p: String => p.contains("useEnvironment") && p.contains("false")} || (!opts.exists(_.startsWith("useEnvironment")))) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to replace dataDir in the future?

Copy link
Author

@sammy-1234 sammy-1234 Jul 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is just a check that if the option to write to the classpath is either set to false p.contains("useEnvironment") && p.contains("false") or is not present (!opts.exists(_.startsWith("useEnvironment")), then make sure that dataDir attribute is given by the user.

Overriding dataDir takes place in def setOptions(options: ScoverageOptions): Unit

// When code is changed in source file, old measurements present
// inside the dataDir can skew the results. Thus, clean the dataDir to remove
// old measurements.
resetDataDir(dataDir)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an atomic operation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! I'll move this part inside the synchronized block to make it thread safe.


def resetDataDir(dataDir: String): Unit = {
// Making the directory if it does not exist
new File(dataDir).mkdirs()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a safe operation?

Copy link
Author

@sammy-1234 sammy-1234 Jul 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns true if the directories are made and false otherwise (in case the directories are already present). It raises an exception only if a SecurityManager exists and denies to create the named dir. https://www.tutorialspoint.com/java/io/file_mkdirs.htm
I think it should be safe?

Copy link

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

instrumentationComponent.setOptions(options)
}

override val optionsHelp: Option[String] = Some(Seq(
"-P:scoverage:writeToClasspath:<boolean> if true, use the environment variable to store measurements" +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this string is broken means that it won't have spaces in it: need to leave the spaces you want in as trailing, like:

measurements " +

"-P:scoverage:writeToClasspath:<boolean> if true, use the environment variable to store measurements" +
"and store instruments to the output classpath where compiler emits classfiles." +
"Overrides the datadir with the output classpath.",
"-P:scoverage:classpathSubdir:<pathtosubdir> subdir to attach to classpath. Default: META-INF/scoverage",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should reference the other option probably: "subdir to prefix to classpath when writeToClasspath is in use"

* [invokedUseEnvironment] as it helps differentiating the source
* files at runtime, i.e helps in creating a unique subdir for each source file.
*/
if (options.writeToClasspath){
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference between these two blocks is the newTermName value. You should probably extract that into a variable to avoid duplicating the blocks.

}

private def deleteMeasurementFolders(): Unit = {
val d = s"${System.getenv("SCOVERAGE_MEASUREMENT_PATH")}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should not be expect this env var to already be set.

What I'd recommend instead is changing the code in the invoker to consume a Java "system property", rather than an env var, and to then set the property during the test. System properties can be changed at runtime (see "Writing system properties" here: https://docs.oracle.com/javase/tutorial/essential/environment/sysprop.html).

This would avoid setting it in the build.sbt file, which affects all tests: not just this one.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but before doing this, see the comment below about how the dataDir is initialized. That change would also allow for setting the dataDir in tests.

/**
* We record that the given id has been invoked by appending its id to the coverage
* data file.
*
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method (and method doc) duplicates a lot of the def invoke method. You should extract more of the method body in order to dedupe them.

* @param instrumentsDir the directory where the instrument data is held and needs to be copied from
*/
def invokedWriteToClasspath(id: Int, instrumentsDir: String): Unit = {
// Get the output data directory from the environment variable or write to a subdir.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Computing the dataDir like this is likely to be expensive. You should probably move this out into a @volatile var, and compute it once at startup.

That would also allow/require you to mutate it in unit tests.

// and it is possible that a test file tests multiple instrumented binaries. Thus, to
// segregate reports for each source file, we need this hash. This was not an issue with
// previous version as then we explicitly specified a different [dataDir] each time.
dataDir += s"/${safe_name(instrumentsDir)}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto doing this at startup time. You should probably add a setter for the datadir for tests to use that will do the appropriate concatenation.

This invoke method needs to be very fast since it will be called millions of times per run. Doing too much setup here will definitely slow down an application.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this concatenation needs to be at each invocation as it is a parameter to the invokedWriteToClasspath method

// When code is changed in source file, old measurements present
// inside the dataDir can skew the results. Thus, clean the dataDir to remove
// old measurements.
resetDataDir(dataDir)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting this directory feels like it attempts to solve a problem that is unrelated to this PR. In particular: a caller should always set a new dataDir if they don't want to merge over top of the old one. Would drop the removal.

}

def safe_name(str: String): String = {
val pattern = "[^/]+".r
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should have a camel-case name.

But also: you should watch out for turning filepaths into single names, there are filename length limits, and take a bunch of filenames and merging them into one is very likely to violate that.

Instead, you should consider preserving the full name of the file under the directory, and call mkdirs for the parents of each file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method does not exists in the latest PR.

Copy link

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! But how targetids are chosen might still need some work.

if (!opts.exists(_.startsWith("dataDir:")))
throw new RuntimeException("Cannot invoke plugin without specifying <dataDir>")

if (opts.exists{p: String => p.contains("writeToClasspath") && p.contains("false")} || (!opts.exists(_.startsWith("writeToClasspath")))) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to extract this entire condition and give it a name: for example: ```
val isUsingWriteToClasspath: Boolean = ???
if (!isUsingWriteToClasspath) { ??? }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the latest change (below), we always need the dataDir to be set.


// Overriding the [dataDir] if [writeToClasspath] is true.
// If writeToClasspath is true, we set the output directory to output classpath,
// i.e to the directory where compiler is going to output the classfiles.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be unique, unfortunately. Even if you include the absolute path to the output directory, two compiles happening on different machines might use the same output directory, and not be identical.

I don't have any great ideas here. Hashing the source files themselves relative to the current working directory might work? Using a random number or UUID would make builds non-deterministic.

Maybe this information does need to be user provided, and you could require that the dataDir is a relative path that you'd use in this position?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I have currently set the dataDir to be the targetID in case the writeToClasspath option is set to true.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should update this comment, and below.

Copy link

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

// which needs to be stored SEPARATELY from each other (but along with THEIR instrument files).
// Thus, to segregate (instruments + measurements) for each target, we need this subdir.
// This was not an issue with previous version as then we explicitly
// specified a different [dataDir] each time we compiled a target.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this back to being the case? The dataDir will still be required?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If by dataDir you mean the variable on line 96, then yes, you are correct - dataDir is no longer necessary.

Copy link
Author

@sammy-1234 sammy-1234 Jul 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, I still think its a good idea to provide an option for setting the directory prefix at runtime (using the System var in this case) - would help in setting it to runtime classpath. What do you think?

Copy link
Author

@sammy-1234 sammy-1234 Jul 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your concern regarding doing additional setup here (its effect in terms of performance)


// Overriding the [dataDir] if [writeToClasspath] is true.
// If writeToClasspath is true, we set the output directory to output classpath,
// i.e to the directory where compiler is going to output the classfiles.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should update this comment, and below.

if(options.writeToClasspath){
settings.outputDirs.getSingleOutput match {
case Some(dest) =>
//creating targetId out of [dataDir]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should validate that this is a relative path.

@gslowikowski
Copy link
Member

Hi guys. Thank you for your effort. I was very busy recently. Analyzing and testing your work will require considerable amount of time. I promise to do it soon.

@stuhood
Copy link

stuhood commented Sep 17, 2019

Thank you @gslowikowski!

@callicles
Copy link

Is this still needed?

@ckipp01 ckipp01 deleted the branch scoverage:main May 29, 2022 08:38
@ckipp01 ckipp01 closed this May 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants