This repository has been archived by the owner on Mar 15, 2024. It is now read-only.

Leaked log message from previous Syzkaller reproducer run #21

Open
shunghsiyu opened this issue Jul 7, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@shunghsiyu
Contributor

Sometimes printk messages do not immediately show up when the reproducer runs, and are only emitted when the next reproducer starts to run.

In the example below, the backtrace for the previous reproducer (PID: 22329, Comm: 7e93129ee310878) shows up in the result of 7e7470f0058e0950706a4fd684c2d86c7b43df31.

{
  "tid": "7e7470f0058e0950706a4fd684c2d86c7b43df31",
  "runs": 1,
  "passed": 0,
  "failed": 1,
  "broken": 0,
  "skipped": 0,
  "warnings": 0,
  "runtime": "14.1485938299447",
  "log": [
    "...omitted...",
    "kL.{$d'reH~_bPm~Uq.1:?_.|[y{gSHeU&Oo{n`r+pr/Q7^U'rDZ4yqrF1umrVE]ZPs)|L@!_9[5pesyzwrap  -d /opt/ltp/testcases/bin -n 7e7470f0058e0950706a4fd684c2d86c7b43ddf31; echo cmd-exit-108-$?",
    "tst_test.c:1245: INFO: Timeout per run is 0h 00m 20s",
    "syzwrap.c:71: INFO: https://syzkaller.appspot.com/bug?id=7e7470f0058e0950706a4fd684c2d86c7b43df31",
    "syzwrap.c:74: INFO: /opt/ltp/testcases/bin/7e7470f0058e0950706a4fd684c2d86c7b43df31",
    "7e93129ee310878a85ad9d6617cf768b635d2aee: cxa_atexit.c:100: __new_exitfn: Assertion `l != NULL' failed.",
    "[  296.036863] 000: traps: 7e93129ee310878[22272] general protection fault ip:7f09e6f95e79 sp:7f09e7768cc8 error:0 in libc-2.26.so[7f09e6f72000+1b2000]",
    "[  296.576930] 000: ------------[ cut here ]------------",
    "[  296.576975] 000: WARNING: CPU: 0 PID: 22329 at fs/super.c:236 destroy_unused_super.part.6+0x120/0x150",
    "[  296.576990] 000: Modules linked in:",
    "[  296.576993] 000: CPU: 0 PID: 22329 Comm: 7e93129ee310878 Not tainted 5.3.18-rt_syzkaller #1 SLE15-SP2 (unreleased)",
    "...omitted..."
  ]
}

This is a hard one to solve because there may be tasks related to the last reproducer lingering around somewhere (e.g. inside work queues).

Some ideas for this issue:

  1. Add a sleep after each reproducer run (slows things down a bit and still can't reliably avoid this issue)
  2. Reboot after each reproducer run (a bit crazy as this will cause a massive slowdown, but should work reliably)
  3. Use dmesg to flush the printk buffer (not sure if this will work; see the sketch below)
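
A minimal sketch of idea 3, assuming the harness can run an extra shell command on the SUT between reproducers (the run_reproducer wrapper below is made up for illustration). dmesg --read-clear dumps whatever is currently in the ring buffer and clears it, so leftovers from the previous reproducer get consumed before the next one starts; it cannot help with messages that are only printed later, e.g. from a deferred work item.

run_reproducer() {
    repro="$1"                      # path to the syzkaller reproducer binary
    dmesg --read-clear > /dev/null  # drop leftover messages from the previous run
    "$repro"
    sleep 1                         # best effort: give late printk output a moment
    dmesg --read-clear              # capture messages emitted during this run
}
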
@shunghsiyu changed the title from "Leak log message from previous Syzkaller reproducer run" to "Leaked log message from previous Syzkaller reproducer run" on Jul 7, 2020
@richiejp added the bug label on Jul 8, 2020
@metan-ucw
Owner

I doubt that this will ever get fixed; the timing is just too unpredictable, and rebooting or even loading from a snapshot would make the test run way too slow.

I guess the best we can do is to keep the dmesg output in one big file and only store a line offset to the relevant part in the testcase.
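
A rough sketch of that bookkeeping, assuming the harness appends the SUT's console/dmesg output to a single dmesg.log on the host (the file names and the run_one_reproducer helper are made up for illustration):

before=$(wc -l < dmesg.log)                 # line offset before the reproducer runs
run_one_reproducer "$tid"                   # hypothetical helper running one reproducer
after=$(wc -l < dmesg.log)                  # line offset after it finished
echo "$tid $before $after" >> dmesg.index   # lines (before, after] belong to this testcase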

@richiejp
Collaborator

richiejp commented Jul 8, 2020

We could try waiting for the system load to drop below some threshold before continuing. However, it seems like some of the metrics for measuring load are broken with the low-latency scheduler on my machine, so this could be tricky.

I guess you can also compare the PIDs in the stack trace with the reproducer's.
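
A possible sketch of the load-threshold wait, polling the 1-minute average from /proc/loadavg (the 0.5 threshold and the retry count are arbitrary, and as noted above the numbers may not be reliable under the low-latency scheduler):

wait_for_idle() {
    tries=0
    while [ "$tries" -lt 30 ]; do
        load=$(cut -d' ' -f1 /proc/loadavg)              # 1-minute load average
        if awk -v l="$load" 'BEGIN { exit !(l < 0.5) }'; then
            return 0                                     # quiet enough, continue the run
        fi
        sleep 1
        tries=$((tries + 1))
    done
    return 1                                             # still busy; caller decides whether to go ahead
}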

@shunghsiyu
Contributor Author

I decided to try the crazy idea of rebooting after every single reproducer run to see how painful vs. useful it is.

Any suggestions for the runltp-ng option name if the crazy idea turns out to be useful enough for a PR? (--always-reboot?)

@richiejp
Collaborator

I'd prefer the more generic --always-revert because of #23
