🚀
PhD student @ EPFL🇨🇭. Interested in robustness and generalization in LLMs.
- EPFL
- Lausanne
- Time zone: UTC +01:00
- https://andriushchenko.me/
- @maksym_andr
Pinned
- tml-epfl/llm-past-tense (Public): Does Refusal Training in LLMs Generalize to the Past Tense? [NeurIPS 2024 Safe Generative AI Workshop (Oral)]
- tml-epfl/llm-adaptive-attacks (Public): Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]
- JailbreakBench/jailbreakbench (Public): JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
- RobustBench/robustbench (Public): RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]
- square-attack (Public): Square Attack: a query-efficient black-box adversarial attack via random search [ECCV 2020]
- relu_networks_overconfident (Public): Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem [CVPR 2019, oral]