Consumer doesn't consume after onLost #1288

Open
ya-at opened this issue Jul 19, 2024 · 5 comments

ya-at commented Jul 19, 2024

When a broker goes down, the consumer loses its connection and tries to reconnect. Then onLost is called, and after that the runloop never calls poll() again, so no new events arrive. Is this the desired behavior? If so, how should the consumer be restarted when onLost happens? (The last event was consumed at 20:08.) The application also has two consumer groups that read the same topic in parallel: one fails (onLost happens), while the other keeps working (onLost doesn't happen, since it is connected to a broker that stays up).

Related #1250. Version: 2.8.0.

Logs (newest message first):
2024-07-19 20:10:37.441	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Offering partition assignment Set()   location: Runloop.scala:523
2024-07-19 20:10:37.438	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop.makeRebalanceListener   onLost done   location: Runloop.scala:240
2024-07-19 20:10:37.433	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.PartitionStreamControl   Partition sample-topic-3 lost   location: PartitionStreamControl.scala:98 partition: 3 topic: sample-topic
2024-07-19 20:10:37.428	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop.makeRebalanceListener   1 partitions are lost   location: Runloop.scala:234
2024-07-19 20:10:37.375	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Starting poll with 1 pending requests and 0 pending commits, resuming Set(sample-topic-3) partitions   location: Runloop.scala:466
2024-07-19 20:10:37.375	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Processing 0 commits, 1 commands: Poll   location: Runloop.scala:733
2024-07-19 20:10:37.325	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Starting poll with 1 pending requests and 0 pending commits, resuming Set(sample-topic-3) partitions   location: Runloop.scala:466
2024-07-19 20:10:37.325	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Processing 0 commits, 1 commands: Poll   location: Runloop.scala:733
2024-07-19 20:10:37.275	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Starting poll with 1 pending requests and 0 pending commits, resuming Set(sample-topic-3) partitions   location: Runloop.scala:466
2024-07-19 20:10:37.275	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Processing 0 commits, 1 commands: Poll   location: Runloop.scala:733
2024-07-19 20:10:37.225	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Starting poll with 1 pending requests and 0 pending commits, resuming Set(sample-topic-3) partitions   location: Runloop.scala:466
2024-07-19 20:10:37.225	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Processing 0 commits, 1 commands: Poll   location: Runloop.scala:733
2024-07-19 20:10:37.174	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Starting poll with 1 pending requests and 0 pending commits, resuming Set(sample-topic-3) partitions   location: Runloop.scala:466
2024-07-19 20:10:37.174	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Processing 0 commits, 1 commands: Poll   location: Runloop.scala:733
2024-07-19 20:10:37.124	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Starting poll with 1 pending requests and 0 pending commits, resuming Set(sample-topic-3) partitions   location: Runloop.scala:466
2024-07-19 20:10:37.124	zio-kafka-runloop-thread-3   zio.kafka.consumer.internal.Runloop   Processing 0 commits, 1 commands: Poll   location: Runloop.scala:733
@ya-at ya-at changed the title Consumer doesn't consumer after onLost Consumer doesn't consume after onLost Jul 19, 2024
@erikvanoosten
Collaborator

erikvanoosten commented Jul 20, 2024

Hello @ya-at. Thanks for your detailed bug report.
I think we can release 2.8.1 and then (due to #1252) a lost partition is no longer considered fatal.

@erikvanoosten
Collaborator

erikvanoosten commented Jul 20, 2024

For now, your options are:

  • restart the consumer by re-subscribing
  • create a new consumer
  • downgrade zio-kafka to 2.6.0
  • restart the application

@erikvanoosten
Collaborator

Correction: #1252 is already part of zio-kafka 2.8.0, so something else is going on.

@erikvanoosten
Collaborator

The newest log line (the first line) indicates that no partitions (Set()) are assigned to this consumer. That should not cause polling to stop! (See shouldPoll: subscriptionState.isSubscribed and assignedStreams.isEmpty should both be true in this case.)

Can you check the java consumer configurations?
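
To make the reasoning about shouldPoll concrete, here is a minimal, self-contained sketch of the polling condition. It is a simplified model and not zio-kafka's actual code: RunloopState and its fields are hypothetical stand-ins for subscriptionState.isSubscribed, assignedStreams, and the pending request and commit counts that appear in the log lines, and the way they are combined is an assumption based on the comment above.

```scala
// Hypothetical, simplified model of the runloop state (not the library's classes).
final case class RunloopState(
  isSubscribed: Boolean,           // stands in for subscriptionState.isSubscribed
  assignedPartitions: Set[String], // stands in for assignedStreams
  pendingRequests: Int,            // "pending requests" from the log lines
  pendingCommits: Int              // "pending commits" from the log lines
) {
  // Assumed rule: poll while subscribed and there is either outstanding work
  // or no assignment yet (polling is what lets a rebalance hand out partitions).
  def shouldPoll: Boolean =
    isSubscribed &&
      (pendingRequests > 0 || pendingCommits > 0 || assignedPartitions.isEmpty)
}

object ShouldPollSketch {
  def main(args: Array[String]): Unit = {
    // State right after onLost: still subscribed, nothing assigned, no pending work.
    val afterOnLost = RunloopState(
      isSubscribed = true,
      assignedPartitions = Set.empty,
      pendingRequests = 0,
      pendingCommits = 0
    )
    println(afterOnLost.shouldPoll) // prints "true"
  }
}
```

Under this model, an empty assignment on a subscribed consumer keeps polling alive, which is why the stalled runloop in the logs is unexpected.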

@ya-at
Author

ya-at commented Jul 20, 2024

Can you check the java consumer configurations?

The settings are almost all defaults. The only things we changed are client.id, group.id and metrics.reporter (all of them passed through ConsumerSettings). I don't think metrics.reporter is the cause, since every consumer has this property set.
