[Bug]: Offset::end() with topic replicas reads from beginning. #3788
@gibbz00 thanks for reporting, we'll take a look.
Thanks for the example code @gibbz00! I used it to reproduce the behavior and then analyze the issue a bit further. Overall it looks like it has to do with our replica startup logic. If a client starts looking at end offsets before the `FLV_SHORT_RECONCILLATION` default of 10 seconds has elapsed, it can get different end offset values depending on which SPU it is connected to. Once the sync starts, the replicas stay in sync at a finer interval. The `FLV_SHORT_RECONCILLATION` time can be customized by setting an environment variable of the same name to a value in seconds when the cluster is started. It is a little unexpected that we don't start the sync up earlier, and I have opened an issue to take a look at that over the longer term (and to fix the misspelling of "reconciliation"): #3790
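As a rough sketch of the semantics described above, the variable holds a number of seconds and falls back to 10 when unset. The helper name and the exact parsing behavior here are assumptions for illustration; the actual SPU logic may differ:

```rust
use std::time::Duration;

// Hypothetical helper: interpret an FLV_SHORT_RECONCILLATION value
// (seconds, defaulting to 10) as described in the comment above.
// The real SPU parsing logic is not shown here and may differ.
fn parse_reconciliation(raw: Option<&str>) -> Duration {
    raw.and_then(|s| s.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(Duration::from_secs(10))
}

fn main() {
    // Unset (or unparsable) values fall back to the 10-second default.
    assert_eq!(parse_reconciliation(None), Duration::from_secs(10));
    // A value of "2" would shorten the reconciliation window to 2 seconds.
    assert_eq!(parse_reconciliation(Some("2")), Duration::from_secs(2));
    println!("default = {:?}", parse_reconciliation(None));
}
```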
Some small modifications to rule out a race from producer to consumer, and to play with when reading from `End` starts:

```rust
use std::sync::Arc;
use std::time::Duration;

use fluvio::{metadata::topic::TopicSpec, FluvioAdmin, Offset, RecordKey};
use futures::TryStreamExt;

const DELAY_MILLIS: u64 = 1000;
const MAX_RECORDS: u8 = 15;
// Start the End consumer sometime after the FLV_SHORT_RECONCILLATION setting (in seconds).
const REC_TRIGGER_OFFSET: u8 = 11;
const TOPIC_NAME: &str = "dectest-offset";
const TOPIC_REPLICAS: u32 = 3;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    reset_topic().await?;
    println!("Number of topic replicas: {}", TOPIC_REPLICAS);
    let notify_start = Arc::new(tokio::sync::Notify::new());
    tokio::spawn(consume(TestOffset::Beginning, notify_start.clone()));
    tokio::spawn(consume(TestOffset::End, notify_start.clone()));
    let producer = fluvio::producer(TOPIC_NAME).await?;
    for index in 0..MAX_RECORDS {
        producer.send(RecordKey::NULL, index.to_string()).await?;
        println!("[PRODUCER] sent: {}", index);
        tokio::time::sleep(Duration::from_millis(DELAY_MILLIS)).await;
    }
    Ok(())
}

#[derive(PartialEq)]
enum TestOffset {
    Beginning,
    End,
}

async fn consume(offset: TestOffset, onotify: Arc<tokio::sync::Notify>) -> anyhow::Result<()> {
    // The End consumer waits for a signal before it starts streaming.
    if offset == TestOffset::End {
        onotify.notified().await;
    }
    let mut stream = fluvio::consumer(TOPIC_NAME, 0)
        .await?
        .stream(match offset {
            TestOffset::Beginning => Offset::beginning(),
            TestOffset::End => Offset::end(),
        })
        .await?;
    while let Some(record) = stream.try_next().await? {
        let index = record.get_value().as_utf8_lossy_string().parse::<u8>()?;
        println!(
            "[CONSUMER_{}] received: {}",
            match offset {
                TestOffset::Beginning => "BEGINNING",
                TestOffset::End => "END",
            },
            index
        );
        // Once the Beginning consumer reaches the trigger offset, wake the End consumer.
        if offset == TestOffset::Beginning && index == REC_TRIGGER_OFFSET {
            onotify.notify_waiters();
        }
        if index == MAX_RECORDS - 1 {
            break;
        }
    }
    Ok(())
}

async fn reset_topic() -> anyhow::Result<()> {
    let admin = FluvioAdmin::connect().await?;
    let _ = admin.delete::<TopicSpec>(TOPIC_NAME).await;
    admin
        .create(TOPIC_NAME.to_string(), false, TopicSpec::new_computed(1, TOPIC_REPLICAS, None))
        .await?;
    Ok(())
}
```
Hi, thanks for the response and for taking your time to analyze the issue 😊 Can also confirm that your conclusion seems correct by running your code over here (only when […]). Again, a big thank you!
I'll close this, but if you run into more problems or have additional questions, feel free to reopen or create a new issue.
As the title suggests, I'm unable to get a consumer with `Offset::end()` to work when the topic replica factor is higher than 1. Here's what I'm using to reproduce this:
And here's the output:
About 10% of the time, though, things execute correctly, showing the same output as the first case.
Replicas were enabled by applying this resource:
Versions