
How to list files corresponding to corrupted data segments? #372

Open
bopolissimus opened this issue Jan 14, 2016 · 4 comments

@bopolissimus

Apologies for posting this question here (although this could be a feature request if the feature doesn't currently exist). librelist is not accepting my subscribe attempts from several different addresses.

My backup devices are external USB drives. Every few years I replace them as they develop bad blocks. If the bad blocks happen to be under one or another of the data/#/##### files, then a very small part of my backup files will be corrupt.

I can recover most of the backup data since only a few of the data/#/##### blocks will have been corrupted. I copy the attic data to a good drive, run attic check --repair on the copy, and can then use the backup.

Is there an attic command that will allow me to determine which files (in which archives) are affected by the bad blocks? e.g., if I have:

Starting repository check...
Error reading segment 3
attempting to recover work-backup.attic/data/0/3
Error reading segment 11
attempting to recover work-backup.attic/data/0/11

How would I determine which archives/files are affected so that I know to not try to restore those files from backup (or remove them after a restore)?

@ThomasWaldmann
Contributor

What you posted is part of the (relatively low-level) repository check.
Recovering a segment file first creates a backup copy <filename>.beforerecover and then transfers all entries with a valid CRC/length to <filename> (skipping over everything invalid).
At that level, it has no idea about archives or files; it just deals with segment entries.
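
For illustration only, a minimal sketch of that recovery idea in Python. The entry layout assumed here (4-byte CRC32, 4-byte size, 1-byte tag) is simplified and hypothetical; attic's actual on-disk format and code differ:

```python
# Hypothetical sketch of segment recovery, NOT attic's actual code:
# keep a copy of the damaged file, then scan it and carry over only
# entries whose CRC and length check out, skipping everything invalid.
import shutil
import struct
import zlib

HEADER = struct.Struct('<IIB')  # assumed layout: crc32, size, tag byte

def recover_segment(path):
    shutil.copyfile(path, path + '.beforerecover')  # keep the damaged original
    with open(path + '.beforerecover', 'rb') as f:
        data = f.read()
    valid, offset = [], 0
    while offset + HEADER.size <= len(data):
        crc, size, tag = HEADER.unpack_from(data, offset)
        entry = data[offset:offset + size]
        # an entry is valid if its size is plausible and the stored CRC
        # matches the CRC of everything after the CRC field itself
        if (HEADER.size <= size <= len(data) - offset
                and crc == zlib.crc32(entry[4:]) & 0xffffffff):
            valid.append(entry)
            offset += size
        else:
            offset += 1  # resync: slide forward one byte and try again
    with open(path, 'wb') as f:
        for entry in valid:
            f.write(entry)
```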

An archive check goes over all archives and checks whether every chunk needed for every file (or other item) is still there. If a chunk is missing, it replaces it with an all-zero chunk of the same length (and it also tells you the path/filename when that happens).
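
Sketched very roughly in Python (the item structure and names here are hypothetical, just to show the idea):

```python
# Hypothetical sketch of the archive-check idea, not attic's actual API:
# walk each item's chunk list; if a referenced chunk is missing from the
# repository, substitute zeroes of the same length and report the path.

def check_item(item, repo_chunks):
    """item: dict with 'path' and 'chunks' = [(chunk_id, size), ...]."""
    for chunk_id, size in item['chunks']:
        if chunk_id not in repo_chunks:
            print('%s: Missing file chunk detected' % item['path'])
            repo_chunks[chunk_id] = b'\x00' * size  # all-zero replacement chunk
```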

In general: as the warning says when you invoke "attic check", be very careful. Maybe make a backup of your repo before trying to repair it.

@ThomasWaldmann
Contributor

BTW, maybe you want to monitor the SMART status of your disks. Often you can see there that a disk is developing problems before it actually starts to fail severely.

@bopolissimus
Author

Many thanks, Thomas. Yes, I've done the backup first (copying from the failing drive, so some of the data files will be bad where reads hit bad blocks and fail), and then run the check on the copy.

The repository check ran for long enough that it was only the next morning (this morning) that I noticed the archive checks do list the broken files :-).

So the repository check shows:
Error reading segment 3
attempting to recover work-backup.attic/data/0/3
Error reading segment 11
attempting to recover work-backup.attic/data/0/11
Error reading segment 77854
attempting to recover work-backup.attic/data/7/77854

and then the per-archive check shows the broken files:

Analyzing archive work-2015-11-19 (1/71)
backup/someDirectory/someFile: Missing file chunk detected (Byte 10556744-10589663)
backup/someDirectory/someFile: Missing file chunk detected (Byte 10589663-10616774)

etc.

I'll just need to be sure to keep the output log of attic check so I'll know which files to remove/ignore after an extract.
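
For example (a hypothetical helper, not an attic feature), something like this could pull the affected paths back out of a saved check log, assuming the "Missing file chunk detected" message format shown above:

```python
# Hypothetical helper: list the paths reported as damaged in a saved
# "attic check" log, so they can be removed/ignored after an extract.
import re
import sys

def bad_files(log_path):
    pattern = re.compile(r'^(.*): Missing file chunk detected')
    paths = set()
    with open(log_path) as log:
        for line in log:
            m = pattern.match(line)
            if m:
                paths.add(m.group(1))
    return sorted(paths)

if __name__ == '__main__':
    for path in bad_files(sys.argv[1]):
        print(path)
```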

Would this (keeping track of known bad files and not extracting them during an extract) be worth a feature request? Or would we normally just keep track ourselves and ignore/remove the known bad files after an extract? It seems like a very useful feature from a usability standpoint, particularly since extracts and checks (or looking at check reports after errors show up in scheduled checks) happen so rarely that people are likely to forget to keep a log of the checks (less of a problem if the checks are automated, though).

OTOH, I'm probably just not doing things right, since I'm only slowly figuring out how to get things done properly. I use this for my desktop, so I haven't automated much more than the daily backup runs.

smartctl -- yes, I'll be doing that. For some of my drives I can't, though. This is all for home use, not enterprise, so my drives are cheap external USB ones and quite a lot of smartctl features are often not available :-).

@ThomasWaldmann
Contributor

A lot of USB drives will work with smartctl -d sat /dev/sdX.
Not sure about the bad-file tracking feature.
