[Bug]: Kafka integration test log spam #1800

Open
Abacn opened this issue Aug 19, 2024 · 7 comments
Labels: bug (Something isn't working), p2

Comments

@Abacn
Contributor

Abacn commented Aug 19, 2024

Related Template(s)

N/A

Template Version

N/A

What happened?

The Java PR Action has long been flaky, but in the past it at least showed clearly which tests failed. Recently, however, the test log size has increased substantially, likely due to Kafka template and test development. The log is now larger than 80 MB.

The majority of the log consists of lines like

2024-08-19T15:39:29.6622952Z [kafka-producer-network-thread | producer-7] INFO org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-7] Node 1 disconnected.
2024-08-19T15:39:29.6626237Z [kafka-producer-network-thread | producer-7] WARN org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-7] Connection to node 1 (/10.128.0.40:57609) could not be established. Node may not be available.
2024-08-19T15:39:29.6629592Z [kafka-admin-client-thread | adminclient-3] INFO org.apache.kafka.clients.NetworkClient - [AdminClient clientId=adminclient-3] Node 1 disconnected.
2024-08-19T15:39:29.6633503Z [kafka-admin-client-thread | adminclient-3] WARN org.apache.kafka.clients.NetworkClient - [AdminClient clientId=adminclient-3] Connection to node 1 (/10.128.0.40:57609) could not be established. Node may not be available.

and

2024-08-19T13:31:21.7130918Z [docker-java-stream--996043679] INFO org.apache.beam.it.testcontainers.TestContainerResourceManager - confluentinc/cp-kafka:7.3.1: [2024-08-19 13:23:47,251] INFO [Broker id=1] Handling LeaderAndIsr request correlationId 1 from controller 1 for 5 partitions (state.change.logger)
2024-08-19T13:31:21.7133326Z 
2024-08-19T13:31:21.7143323Z [docker-java-stream--996043679] INFO org.apache.beam.it.testcontainers.TestContainerResourceManager - confluentinc/cp-kafka:7.3.1: [2024-08-19 13:23:47,253] TRACE [Broker id=1] Received LeaderAndIsr request LeaderAndIsrPartitionState(topicName='testkafkatogcsbinaryencoding-20240819-132347-044185', partitionIndex=0, controllerEpoch=1, leader=1, leaderEpoch=0, isr=[1], partitionEpoch=0, replicas=[1], addingReplicas=[], removingReplicas=[], isNew=true, leaderRecoveryState=0) correlation id 1 from controller 1 epoch 1 (state.change.logger)

Relevant log output

No response

@Abacn Abacn added bug Something isn't working p2 needs triage labels Aug 19, 2024
@AnandInguva
Contributor

I think we need to separate the Kafka IT tests from the Java tests, since there are so many Kafka IT tests. Having a separate GH action for the Kafka tests would partially solve this.

@AnandInguva
Contributor

I can take a look at this later this week and put up a PR that separates the Kafka tests into a different GH actions suite.

@AnandInguva AnandInguva self-assigned this Aug 19, 2024
@Abacn
Contributor Author

Abacn commented Aug 29, 2024

The log spam causes the mvn command to get stuck for 30 minutes writing logs to the console, between the last test run and finalization; see #1817 (comment).

We have to disable printing logs until this is resolved.

@AnandInguva
Contributor

@Abacn John mentioned that there is a way to configure Kafka to emit fewer logs.

> We have to disable printing logs until resolved.

What do you mean by this?
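
One possible way to do this, assuming the broker runs from the confluentinc/cp-kafka image seen in the log above: that image reads the `KAFKA_LOG4J_ROOT_LOGLEVEL` and `KAFKA_LOG4J_LOGGERS` environment variables and applies them to the broker's log4j configuration. Below is a minimal sketch that builds such an environment map. The class and method names are hypothetical, and whether the test harness offers a hook to pass these variables to the container is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QuietKafkaEnv {
  /**
   * Builds environment variables intended to raise the broker's log threshold.
   * Hypothetical helper: the logger name state.change.logger is taken from the
   * TRACE lines quoted in this issue.
   */
  public static Map<String, String> quietBrokerEnv(String level) {
    Map<String, String> env = new LinkedHashMap<>();
    // Root log level for the broker process.
    env.put("KAFKA_LOG4J_ROOT_LOGLEVEL", level);
    // Per-logger overrides, as a comma-separated list of name=LEVEL pairs.
    env.put("KAFKA_LOG4J_LOGGERS", "state.change.logger=" + level);
    return env;
  }

  public static void main(String[] args) {
    // Print the variables so they can be inspected before wiring them in.
    quietBrokerEnv("WARN").forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

This would only affect broker-side output forwarded from the container. The client-side NetworkClient messages come from the test JVM and would need to be handled separately.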

@Abacn
Contributor Author

Abacn commented Aug 29, 2024

> @Abacn John mentioned that there is a way to configure Kafka to emit fewer logs.
>
> > We have to disable printing logs until resolved.
>
> What do you mean by this?

For example, replace "-e" with "-q" here:

fullArgs = append(fullArgs, "-e")

to see whether the workflow stops getting stuck while processing the log.

@Abacn
Contributor Author

Abacn commented Sep 6, 2024

I tried to set the log level, but it did not work:

LogManager logManager = LogManager.getLogManager();
java.util.logging.Logger rootLogger = logManager.getLogger("");
rootLogger.setLevel(Level.WARNING);

or

java.util.logging.Logger logger = java.util.logging.Logger.getLogger("org.apache.beam.it.testcontainers.TestContainerResourceManager");
logger.setLevel(Level.WARNING);

This is likely because the logger was added as a log consumer to the test container:

https://github.com/apache/beam/blob/6901d7c862388ded58e1cda3286c429edab58c7c/it/testcontainers/src/main/java/org/apache/beam/it/testcontainers/TestContainerResourceManager.java#L75
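
If that is the cause, one option would be to filter container output lines before they reach the logger. Below is a minimal sketch that uses a plain `Consumer<String>` as a stand-in for the Testcontainers log consumer. The class name `BrokerLogFilter` is hypothetical, the line pattern is inferred from the broker lines quoted above, and the WARN line in `main` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BrokerLogFilter implements Consumer<String> {
  // Matches lines such as "[2024-08-19 13:23:47,253] TRACE ..." and captures the level.
  private static final Pattern LEVEL =
      Pattern.compile("^\\[[^\\]]+\\]\\s+(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\\b");

  private final Consumer<String> downstream;

  public BrokerLogFilter(Consumer<String> downstream) {
    this.downstream = downstream;
  }

  @Override
  public void accept(String line) {
    Matcher m = LEVEL.matcher(line);
    // Drop TRACE/DEBUG/INFO lines; lines without a recognizable level pass through.
    if (m.find() && isBelowWarn(m.group(1))) {
      return;
    }
    downstream.accept(line);
  }

  private static boolean isBelowWarn(String level) {
    return level.equals("TRACE") || level.equals("DEBUG") || level.equals("INFO");
  }

  public static void main(String[] args) {
    List<String> kept = new ArrayList<>();
    BrokerLogFilter filter = new BrokerLogFilter(kept::add);
    filter.accept("[2024-08-19 13:23:47,251] INFO [Broker id=1] Handling LeaderAndIsr request (state.change.logger)");
    // Illustrative line, not taken from the real log.
    filter.accept("[2024-08-19 13:23:47,260] WARN [Broker id=1] example warning (kafka.server)");
    kept.forEach(System.out::println);
  }
}
```

Filtering this way keeps warnings and errors visible while dropping the high-volume INFO and TRACE lines, but the broker still produces them; lowering the log level inside the container would avoid that cost as well.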

@Abacn
Contributor Author

Abacn commented Sep 20, 2024

The original issue title covered two separate issues (Java test flakiness and Kafka log spam). The Kafka PR workflow has now been separated from the Java PR workflow, though the log spam issue is still present. I changed the title and assigned the issue to the Kafka test owner.

@Abacn Abacn changed the title from "[Bug]: Java PR Action flaky / extremely hard to debug due to Kafka log spam" to "[Bug]: Kafka integration test log spam" Sep 20, 2024