Tagging archives for "prune"? #846

Closed
enkore opened this issue Apr 6, 2016 · 7 comments
@enkore
Contributor

enkore commented Apr 6, 2016

Currently borg prune can only restrict the archives to be pruned by a common prefix. This works for naming schemes where the "prune-relevant" part of the archive name comes first, e.g. system-<hostname>-<date> and userdata-<hostname>-<date>, but not for anything else.

Adding tags, i.e. a list of arbitrary strings (excluding "," which would be the tag separator) would help. "prune" and other commands using "--prefix" would get a "--tags" option, and only archives which have all (or any, discuss) tags listed would be affected (and they should be immutable for this reason).
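
The "all or any" question can be made concrete with a small sketch. It assumes a hypothetical `Archive` record that carries its tags as separate, immutable metadata; neither the class nor the `select` helper is part of borg, and the archive names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Archive:
    """Hypothetical archive record; not borg's real data model."""
    name: str
    tags: frozenset = frozenset()  # immutable, as proposed above


def parse_tags(option: str) -> frozenset:
    """Split a --tags value on ',' (the proposed separator), dropping empties."""
    return frozenset(t for t in option.split(",") if t)


def select(archives, wanted: frozenset, mode: str = "all"):
    """Return the archives that carry all (or any) of the wanted tags."""
    if mode == "all":
        return [a for a in archives if wanted <= a.tags]
    if mode == "any":
        return [a for a in archives if wanted & a.tags]
    raise ValueError(f"unknown mode: {mode!r}")


archives = [
    Archive("2016-04-06-alpha", frozenset({"system", "alpha"})),
    Archive("2016-04-06-beta", frozenset({"userdata", "alpha"})),
    Archive("2016-04-07-gamma", frozenset({"system", "gamma"})),
]
wanted = parse_tags("system,alpha")
print([a.name for a in select(archives, wanted, "all")])
print([a.name for a in select(archives, wanted, "any")])
```

With "all", only the first archive is selected; with "any", all three are, which is why the choice matters for a destructive command like prune.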


EDIT: Different approach maybe, no extra metadata fields, backwards-applicable.

The names are already there. Most people probably already have some kind of "tag-gy" names, like those above or yyyy-mm-dd-hostname-part. We could just add something like --tags some,tags (always use , as delimiter here?) and --tag-delim - (what delim as default?). Then in stuff like prune:

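# --tags is always comma-separated; --tag-delim splits the archive name
tags = set(args.tags.split(","))
for archive in ...:
  # match archives whose name parts include every requested tag
  if tags <= set(archive.name.split(args.tag_delim)):
    ...  # prune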
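
A self-contained version of this idea, as a sketch: archive names are plain strings, `matches_tags` is a hypothetical helper, and the defaults (`,` for `--tags`, `-` for `--tag-delim`) are only the values floated above, not settled choices.

```python
def matches_tags(name: str, tags_option: str, tag_delim: str = "-") -> bool:
    """True if every requested tag appears as a delimited part of the name."""
    wanted = {t for t in tags_option.split(",") if t}
    parts = set(name.split(tag_delim))
    return wanted <= parts


names = [
    "system-alpha-2016-04-06",
    "userdata-alpha-2016-04-06",
    "2016-04-06-beta-system",
]
# Position in the name no longer matters, unlike with a common prefix.
selected = [n for n in names if matches_tags(n, "system")]
print(selected)
```

One consequence of splitting the name: every delimited part becomes a "tag", so date components such as 2016 or 04 are matchable too, and a hostname containing the delimiter is split into several parts.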

enkore added the question label Apr 6, 2016

@ThomasWaldmann
Member

Mixing names and tags feels unclean. Tags could be separate archive metadata.

@enkore
Contributor Author

enkore commented Apr 9, 2016

Good point, but I'm not sure it's a problem here (as a design decision). #866 made me think "Hm, what is the archive name really for?". "Recycling" it for tagging isn't really a clean thing to do, but it seems quite practical to me (if it's 100% explicit opt-in). In a way, "tags" would just be a different way of looking at the "name" field.

@billyc

billyc commented Mar 10, 2017

I'd like to bump this feature request for tags/aliases.

After spending so much time in the git universe, I find myself wishing I could apply additional tags to specific borg archives.

Embedding tags in the archive name is currently possible, but it's quite unruly when you want to use multiple tags for an archive. For example, I already use the archive name to embed hostname, timestamp, and one or two other fields. I also want to add additional tags such as "@latest" and "@release-1". This gets messy quickly. Worse, I sometimes want to move a tag such as @latest from one archive to another.

If you're just using borg to back up files (granted, its original mission), there probably isn't a lot of need for tags. But if, like me, you have found borg's deduplication to be massively useful in other situations, like archiving very large files used in a data analysis pipeline :-), then the ability to assign multiple tags to an existing archive becomes really important.

Currently, my work-around is to create the original archive with the naming scheme I've devised, and then to immediately create multiple additional archives with names that begin with "@" -- @latest, @v1.0, @beta2, etc. Each one of those additional archives takes a couple minutes to scan/create, and adds just a few hundred bytes to the repository since the contents are completely identical to the original archive. (Well, as long as the files haven't changed in those couple minutes.)

It would be really nice to eliminate that slowdown by adding tag metadata.

I envision the UI being something like this:

  • Create a new tag and point it to an existing archive:
    borg tag [repo::archive-name] [tag1] [tag2] ...

  • List all tags and the archives they point to
    borg tag --list [repo]

  • Deleting tags could re-use the existing borg delete command or could also be a command option:
    borg tag -d [repo] [tagname]
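
The semantics of this proposed UI can be sketched as a mapping from tag name to a single archive. `TagRegistry` and its methods are hypothetical stand-ins for the three operations listed above, and the archive names are invented; nothing here is an existing borg command or class.

```python
class TagRegistry:
    """Hypothetical tag store: each tag points at exactly one archive."""

    def __init__(self):
        self._tags = {}  # tag name -> archive name

    def tag(self, archive: str, *tags: str) -> None:
        """Like `borg tag repo::archive tag1 tag2`: create or move tags."""
        for t in tags:
            self._tags[t] = archive

    def list_tags(self) -> dict:
        """Like `borg tag --list repo`: all tags and their archives."""
        return dict(sorted(self._tags.items()))

    def delete(self, tag: str) -> None:
        """Like `borg tag -d repo tagname`: drop the tag, keep the archive."""
        del self._tags[tag]


reg = TagRegistry()
reg.tag("alpha-2017-03-01", "@latest", "@v1.0")
reg.tag("alpha-2017-03-10", "@latest")  # moves @latest to the newer archive
reg.delete("@v1.0")
print(reg.list_tags())
```

Because tagging only updates the mapping, moving @latest costs nothing like the couple of minutes needed to create a duplicate archive.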

Thanks for considering this!

@ChrisDowning

Only started trying out borg recently but wanted to +1 the tagging idea. I can see a use-case relevant to backups whereby tags are used to define which of multiple cloud services an archive is backed up to. I imagine (based on other discussions) that the cloud backup would most likely be via a separate tool which picks up on the tags and, for example, handles creating a *.tgz file to be uploaded. (You could even add backup frequency as a separate detectable tag, but that sort of thing would be within the scope of the backup tool rather than borg itself.)

@billyc

billyc commented Mar 14, 2017

See issue #2300 for a possible tag implementation. It's currently more like git tag than like Gmail labels -- in other words, additional aliases can exist for an archive, but they need to be unique. It might not be hard to merge that idea with what's discussed here -- labels applied to multiple archives.

@ThomasWaldmann
Member

See also #8425.

@ThomasWaldmann
Member

ThomasWaldmann commented Oct 4, 2024

Guess this is implemented in borg2:

  • prune supports -a
  • --match-archives (-a) can be given multiple times
  • -a tags:FOO,BAR
  • -a tags:FOO -a tags:BAR
