-
Hi @nkrusch These are very interesting questions! Please allow me to transfer this issue to our Discussions tab to continue, and we can see from there if we can create new feature issues.
-
Hi. I find this discussion interesting (thanks). Does …
-
Can you please give me some tips or code on doing this for Zoo? Thank you.
On Fri, 16 Dec 2022, 05:46, Neea Rusch wrote:
I have continued to work on this problem since then. ZooAttack does not have mask support (see https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/987052c405e05d458276299aafc7d47bb584e738/art/attacks/evasion/zoo.py#L204-L212). Depending on your use case, it may be doable with some modification.
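On the Zoo question above: since ZooAttack exposes no mask parameter, one workaround is to run the attack unconstrained and then copy the immutable columns back from the clean inputs afterwards. A minimal sketch — the toy data, the logistic-regression model, the column indices, and the attack parameters are all illustrative assumptions, not part of the original question:

```python
# Sketch only: run ZooAttack unconstrained, then restore the immutable columns.
import numpy as np
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import ZooAttack

rng = np.random.default_rng(0)
x = rng.random((50, 8)).astype(np.float32)   # 8 hypothetical features: A..H
y = (x[:, 5] > 0.5).astype(int)              # toy binary labels

clf = SklearnClassifier(model=LogisticRegression().fit(x, y),
                        clip_values=(0.0, 1.0))

# Tabular data: disable the image-specific resize/importance heuristics.
attack = ZooAttack(classifier=clf, max_iter=20, use_resize=False,
                   use_importance=False, nb_parallel=1, variable_h=0.2)
x_adv = attack.generate(x)

# Undo any perturbation of the attributes the adversary may not touch (A, B).
immutable = [0, 1]
x_adv[:, immutable] = x[:, immutable]
```

Note that restoring the immutable columns afterwards can break the attack's effect, so it is worth re-checking `clf.predict` on the repaired examples and keeping only those that still flip the prediction.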
-
Is your feature request related to a problem? Please describe.
Suppose I want to perform an evasion attack where only a subset of attributes can be mutated by the adversary; the remaining attributes cannot be modified. A related but separate question: how do I express a group of attributes produced by binarizing a categorical feature, where only one category can be selected at a time (and selecting multiple would render the data invalid)?
In this example, the values of (A, B) are immutable; (C, D, E) are the binarized values of one categorical attribute after preprocessing; and the rest (F, ...) can be mutated freely by the adversary:
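For concreteness, here is a hypothetical layout matching that description, together with the kind of 0/1 perturbation mask that would encode the immutability part of the constraint (the feature count and indices are assumptions made up for illustration):

```python
import numpy as np

n_features = 8                      # hypothetical width of one preprocessed row
immutable = [0, 1]                  # A, B: the adversary may not change these
onehot_group = [2, 3, 4]            # C, D, E: exactly one of them may be 1

# 1 = the attack may perturb this column, 0 = the column is frozen
mask = np.ones(n_features, dtype=np.float32)
mask[immutable] = 0.0
```

The mask alone captures only the immutability constraint; the "exactly one of (C, D, E)" constraint needs a separate repair or projection step.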
How can I set up the attack so that these constraints are guaranteed to be preserved in the generated adversarial instances?
This question is about ART in general; I am looking for an existing (or future) way to achieve this behavior.
Describe the solution you'd like
I would like to specify explicitly, as an attack parameter, the mutable and immutable attributes, and similar firm constraints about relationships between attributes (if there is an existing way to achieve this behavior, please advise).
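For the immutability half of this, some ART evasion attacks already accept a `mask` keyword in `generate()` — the gradient-based ones such as FastGradientMethod and ProjectedGradientDescent do in recent versions, while ZooAttack does not. A hedged sketch, reusing `clf` from the ZooAttack example and `mask` from the layout example above (a gradient-based attack only applies if the wrapped model exposes loss gradients):

```python
from art.attacks.evasion import ProjectedGradientDescent

pgd = ProjectedGradientDescent(estimator=clf, eps=0.1, eps_step=0.01, max_iter=50)
# Columns where mask == 0 are never perturbed; the one-hot constraint on
# (C, D, E) is not expressed by the mask and still needs post-processing.
x_adv_pgd = pgd.generate(x, mask=mask)
```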
Describe alternatives you've considered
It is currently unclear to me whether the specific attacks support this kind of constrained scenario in theory (I will need to review the papers).
Assuming it can be done, the technical alternatives are to: (A) run the attack first, then post-prune or repair the examples that are invalid (see the sketch at the end of this post), or (B) extend the toolkit to support this behavior. Simply removing the immutable attributes is not an option, because they are needed for training.
This question may seem moot in a black-box setting, where the attacker is not supposed to know the internals of the classifier. However, let's assume it is "common knowledge", extending beyond the classifier, that the data must adhere to some format, and that the attacker is aware of this. Then it is not unreasonable to assume the attacker wants to preserve these constraints.
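A sketch of alternative (A): repair constraint violations after the attack, then keep only the instances that remain adversarial. The indices follow the (A, B) / (C, D, E) example above, and `clf`, `x`, `x_adv` come from the ZooAttack sketch earlier in the thread; all of these are assumptions for illustration:

```python
import numpy as np

def repair(x_clean: np.ndarray, x_adv: np.ndarray) -> np.ndarray:
    """Restore immutable columns and project the one-hot group to a valid encoding."""
    out = x_adv.copy()
    out[:, [0, 1]] = x_clean[:, [0, 1]]            # immutable A, B
    group = out[:, [2, 3, 4]]                      # binarized categorical C, D, E
    onehot = np.zeros_like(group)
    onehot[np.arange(len(group)), group.argmax(axis=1)] = 1.0
    out[:, [2, 3, 4]] = onehot
    return out

x_fixed = repair(x, x_adv)

# "Post-prune": keep only instances whose prediction still flips after repair.
clean_pred = clf.predict(x).argmax(axis=1)
still_adv = clf.predict(x_fixed).argmax(axis=1) != clean_pred
x_valid = x_fixed[still_adv]
```

The obvious drawback of (A) is that the attack never "knows" about the constraints, so the repair step may discard many otherwise-successful examples; (B) avoids that, at the cost of modifying the toolkit.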