Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JPERF-1106: Gather Apache2 logs #163

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,19 @@ Dropping a requirement of a major version of a dependency is a new contract.
## [Unreleased]
[Unreleased]: https://github.com/atlassian/aws-infrastructure/compare/release-3.0.0...master

## Added
- Add `DiagnosableLoadBalancer`.
- Add ability to `gatherEvidence` from `ApacheProxyLoadBalancer` by making it `DiagnosableLoadBalancer`.
- Make it possible to specify `extraMeasurementSources` inside `Jira`, so that it's possible to `gatherResults` from any of Jira hosted integrations, e.g. load balancer or plugins.
- Let `Jira` produced by `DataCenterFormula` gather logs of `ApacheProxyLoadBalancer`. Resolve [JPERF-1106].
- Make `StartedNode` a `MeasurementSource`.

## Fixed
- Let every `MeasurementSource` inside `Jira` finish its gathering even if one of them fails. Fix [JPERF-1114].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's split those changes. One sounds good, while the other needs some further considerations.


[JPERF-1106]: https://ecosystem.atlassian.net/browse/JPERF-1106
[JPERF-1114]: https://ecosystem.atlassian.net/browse/JPERF-1114

## [3.0.0] - 2023-05-25
[3.0.0]: https://github.com/atlassian/aws-infrastructure/compare/release-2.29.0...release-3.0.0

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
package com.atlassian.performance.tools.awsinfrastructure

internal class FailSafeRunnable(
private val delegates: Iterable<Runnable>
) : Runnable {
override fun run() {
val exceptions = delegates.mapNotNull {
try {
it.run()
null
} catch (e: Exception) {
e
}
}

when {
exceptions.isEmpty() -> return
exceptions.size == 1 -> throw exceptions[0]
else -> {
val root = Exception("Multiple exceptions were thrown and are added suppressed into this one")
exceptions.forEach { root.addSuppressed(it) }
throw root
}
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ import com.atlassian.performance.tools.awsinfrastructure.api.hardware.M4ExtraLar
import com.atlassian.performance.tools.awsinfrastructure.api.hardware.Volume
import com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer.ApacheEc2LoadBalancerFormula
import com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer.ApacheProxyLoadBalancer
import com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer.DiagnosableLoadBalancer
import com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer.LoadBalancerFormula
import com.atlassian.performance.tools.awsinfrastructure.api.network.Network
import com.atlassian.performance.tools.awsinfrastructure.api.network.NetworkFormula
Expand All @@ -19,6 +20,7 @@ import com.atlassian.performance.tools.awsinfrastructure.jira.DataCenterNodeForm
import com.atlassian.performance.tools.awsinfrastructure.jira.DiagnosableNodeFormula
import com.atlassian.performance.tools.awsinfrastructure.jira.StandaloneNodeFormula
import com.atlassian.performance.tools.awsinfrastructure.jira.home.SharedHomeFormula
import com.atlassian.performance.tools.awsinfrastructure.loadbalancer.LoadBalancerMeasurementSource.Extension.asMeasurementSource
import com.atlassian.performance.tools.concurrency.api.AbruptExecutorService
import com.atlassian.performance.tools.concurrency.api.submitWithLogContext
import com.atlassian.performance.tools.infrastructure.api.app.Apps
Expand Down Expand Up @@ -326,6 +328,8 @@ class DataCenterFormula private constructor(
loadBalancer.waitUntilHealthy(Duration.ofMinutes(5))
}

val loadBalancerResultsSource = (provisionedLoadBalancer.loadBalancer as? DiagnosableLoadBalancer)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First time used this as? syntax. Casts are usually bad, however given the circumstances of our current model I think this is a better solution than putting a new field into ProvisionedLoadBalancer and I'm also not certain if adding methods to LoadBalancer would be a good idea. Extending the interface and detecting it allows for wrapping of DiagnosableLoadBalancer implementations without losing the feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed casts are bad. See https://github.com/atlassian/aws-infrastructure/blob/release-2.29.0/src/main/kotlin/com/atlassian/performance/tools/awsinfrastructure/api/jira/DataCenterFormula.kt#L360:

if (loadBalancer is ApacheProxyLoadBalancer) listOf(loadBalancer::updateJiraConfiguration) else emptyList()

Fortunately, ApacheProxyLoadBalancer will match both is/as? casts, so it will function.
We need hooks API to avoid further tight coupling. If the apache logs are necessary, then we can merge the tech debt. But if not, then let's skip that part.

?.asMeasurementSource(resultsTransport.location)
val jira = Jira.Builder(
nodes = nodes,
jiraHome = RemoteLocation(
Expand All @@ -336,6 +340,7 @@ class DataCenterFormula private constructor(
address = loadBalancer.uri
)
.jmxClients(jiraNodes.mapIndexed { i, node -> configs[i].remoteJmx.getClient(node.publicIpAddress) })
.extraMeasurementSources(listOfNotNull(loadBalancerResultsSource))
.build()

logger.info("$jira is set up, will expire ${jiraStack.expiry}")
Expand Down
Original file line number Diff line number Diff line change
@@ -1,40 +1,49 @@
package com.atlassian.performance.tools.awsinfrastructure.api.jira

import com.atlassian.performance.tools.awsinfrastructure.api.CustomDatasetSource
import com.atlassian.performance.tools.awsinfrastructure.FailSafeRunnable
import com.atlassian.performance.tools.awsinfrastructure.api.RemoteLocation
import com.atlassian.performance.tools.concurrency.api.submitWithLogContext
import com.atlassian.performance.tools.infrastructure.api.MeasurementSource
import com.atlassian.performance.tools.infrastructure.api.jvm.jmx.JmxClient
import com.atlassian.performance.tools.jvmtasks.api.TaskTimer.time
import com.google.common.util.concurrent.ThreadFactoryBuilder
import org.apache.logging.log4j.LogManager
import org.apache.logging.log4j.Logger
import java.net.URI
import java.util.concurrent.Executors

/**
* @param extraMeasurementSources source of results/diagnostics of: reverse proxy, Crowd, LDAP, DVCS, DB, Jira plugins
* or anything that can be integrated with Jira as part of web application provisioning
*/
class Jira private constructor(
private val nodes: List<StartedNode>,
val jiraHome: RemoteLocation,
val database: RemoteLocation,
val address: URI,
val jmxClients: List<JmxClient> = emptyList()
val jmxClients: List<JmxClient>,
private val extraMeasurementSources: List<MeasurementSource>
) : MeasurementSource {
private val logger: Logger = LogManager.getLogger(this::class.java)

override fun gatherResults() {
if (nodes.isEmpty()) {
logger.warn("No Jira nodes known to JPT, not downloading node results")
val firstNode = nodes.firstOrNull()
val measurementSources = extraMeasurementSources + nodes + listOfNotNull(firstNode?.let { AnalyticsLogsSource(it) })
if (measurementSources.isEmpty()) {
logger.warn("No result sources known to Jira, can't download anything")
return
}
val executor = Executors.newFixedThreadPool(
nodes.size.coerceAtMost(4),
measurementSources.size.coerceAtMost(4),
ThreadFactoryBuilder()
.setNameFormat("results-gathering-thread-%d")
.build()
)
nodes.map { executor.submitWithLogContext("gather $it") { it.gatherResults() } }
.forEach { it.get() }
time("gather analytics") { nodes.firstOrNull()?.gatherAnalyticLogs() }
FailSafeRunnable(
measurementSources.map { executor.submitWithLogContext("gather $it") { it.gatherResults() } }
.map { Runnable { it.get() } }
).run()

executor.shutdownNow()
}

Expand All @@ -55,14 +64,26 @@ class Jira private constructor(
private val address: URI
) {
private var jmxClients: List<JmxClient> = emptyList()
private var extraMeasurementSources: List<MeasurementSource> = emptyList()

fun jmxClients(jmxClients: List<JmxClient>) = apply { this.jmxClients = jmxClients }
fun extraMeasurementSources(extraMeasurementSources: List<MeasurementSource>) = apply { this.extraMeasurementSources = extraMeasurementSources }

fun build() = Jira(
nodes = nodes,
jiraHome = jiraHome,
database = database,
address = address,
jmxClients = jmxClients
jmxClients = jmxClients,
extraMeasurementSources = extraMeasurementSources
)
}

private class AnalyticsLogsSource(
private val node: StartedNode
) : MeasurementSource {
override fun gatherResults() {
node.gatherAnalyticLogs()
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ package com.atlassian.performance.tools.awsinfrastructure.api.jira

import com.atlassian.performance.tools.aws.api.Storage
import com.atlassian.performance.tools.awsinfrastructure.api.aws.AwsCli
import com.atlassian.performance.tools.infrastructure.api.MeasurementSource
import com.atlassian.performance.tools.infrastructure.api.jira.JiraGcLog
import com.atlassian.performance.tools.infrastructure.api.process.RemoteMonitoringProcess
import com.atlassian.performance.tools.ssh.api.Ssh
Expand All @@ -15,10 +16,10 @@ class StartedNode(
private val jiraPath: String,
private val monitoringProcesses: List<RemoteMonitoringProcess>,
private val ssh: Ssh
) {
) : MeasurementSource {
private val resultsDirectory = "results"

fun gatherResults() {
override fun gatherResults() {
ssh.newConnection().use { shell ->
monitoringProcesses.forEach { it.stop(shell) }
val nodeResultsDirectory = "$resultsDirectory/'$name'"
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
package com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer

import com.atlassian.performance.tools.aws.api.StorageLocation
import com.atlassian.performance.tools.awsinfrastructure.api.aws.AwsCli
import com.atlassian.performance.tools.infrastructure.api.Sed
import com.atlassian.performance.tools.infrastructure.api.loadbalancer.LoadBalancer
import com.atlassian.performance.tools.jvmtasks.api.ExponentialBackoff
import com.atlassian.performance.tools.jvmtasks.api.IdempotentAction
import com.atlassian.performance.tools.ssh.api.Ssh
Expand All @@ -14,7 +15,7 @@ class ApacheProxyLoadBalancer private constructor(
private val ssh: Ssh,
ipAddress: String,
httpPort: Int
) : LoadBalancer {
) : DiagnosableLoadBalancer {

override val uri: URI = URI("http://${ipAddress}:$httpPort/")

Expand Down Expand Up @@ -81,6 +82,15 @@ class ApacheProxyLoadBalancer private constructor(
connection.execute("echo \"$line\" | sudo tee -a $APACHE_CONFIG_PATH")
}

override fun gatherEvidence(location: StorageLocation) {
ssh.newConnection().use { connection ->
val resultsDir = "/tmp/s3-results"
connection.execute("mkdir -p $resultsDir")
connection.execute("cp -R /var/log/apache2 $resultsDir")
AwsCli().upload(location, connection, resultsDir, Duration.ofMinutes(1))
}
}

class Builder(
private val ssh: Ssh
) {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
package com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer

import com.atlassian.performance.tools.aws.api.StorageLocation
import com.atlassian.performance.tools.infrastructure.api.loadbalancer.LoadBalancer

interface DiagnosableLoadBalancer : LoadBalancer {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name suggestions welcome

Copy link
Contributor Author

@pczuj pczuj May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make that LoadBalancer-agnostic, however that would make the model more abstract and that possibly could make it harder to understand. Looking for thoughts on that.

fun gatherEvidence(location: StorageLocation)
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
package com.atlassian.performance.tools.awsinfrastructure.loadbalancer

import com.atlassian.performance.tools.aws.api.StorageLocation
import com.atlassian.performance.tools.awsinfrastructure.api.loadbalancer.DiagnosableLoadBalancer
import com.atlassian.performance.tools.infrastructure.api.MeasurementSource

internal class LoadBalancerMeasurementSource(
private val loadBalancer: DiagnosableLoadBalancer,
private val target: StorageLocation
) : MeasurementSource {
internal object Extension {
fun DiagnosableLoadBalancer.asMeasurementSource(
target: StorageLocation
) = LoadBalancerMeasurementSource(
loadBalancer = this,
target = target
)
}

override fun gatherResults() {
loadBalancer.gatherEvidence(target)
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
package com.atlassian.performance.tools.awsinfrastructure

import org.assertj.core.api.Assertions.assertThat
import org.junit.Test

class FailSafeRunnableTest {
@Test
fun shouldExecuteAllEvenIfFirstFails() {
var executed1 = false
var executed2 = false
var executed3 = false
val runnable = FailSafeRunnable(
listOf(
Runnable { executed1 = true; throw Exception("Fail 1") },
Runnable { executed2 = true; throw Exception("Fail 2") },
Runnable { executed3 = true; throw Exception("Fail 3") }
)
)

try {
runnable.run()
} catch (e: Exception) {
// Expected and ignored, so that we can go to asserts
}

assertThat(executed1).isTrue()
assertThat(executed2).isTrue()
assertThat(executed3).isTrue()
}

@Test
fun shouldThrowAllFailures() {
val runnable = FailSafeRunnable(
listOf(
Runnable { throw Exception("Banana") },
Runnable { throw Exception("Apple") },
Runnable { throw Exception("Pear") },
Runnable { throw Exception("Peach") }
)
)

val exception = try {
runnable.run()
null
} catch (e: Exception) {
e
}

val allExceptions = listOf(exception!!) + exception.suppressed.toList()
val allMessages = allExceptions.map { it.message }
assertThat(allMessages).contains("Banana", "Apple", "Pear", "Peach")
}


@Test
fun shouldExecuteAll() {
val allIndexes = Array(20) { it }
val finishedIndexes = mutableListOf<Int>()
val runnable = FailSafeRunnable(
allIndexes.map { index -> Runnable { finishedIndexes.add(index) } }
)

runnable.run()

assertThat(finishedIndexes).contains(*allIndexes)
}
}