Full Program
Summary:
Adversarial attacks pose significant threats to machine learning models, with white-box attacks such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and the Basic Iterative Method (BIM) achieving high success rates when model gradients are accessible. However, in real-world scenarios, direct access to model internals is often restricted, necessitating black-box attack strategies that typically suffer from lower effectiveness. In this work, we propose a novel approach to transform white-box attacks into black-box attacks by leveraging state-of-the-art surrogate models, including Multilayer Perceptrons (MLP) and XGBoost (XGB). Our method involves training a surrogate model to mimic the decision boundaries of an inaccessible target model using pseudo-labeling, thereby enabling the application of gradient-based white-box attacks in a black-box setting. We systematically compare our approach against conventional black-box attacks such as Zero Order Optimization (ZOO), evaluating their effectiveness in terms of attack success rate, transferability, and computational efficiency. The results demonstrate that surrogate-assisted ...
Author(s):
Dimitrios-Christos Asimopoulos
MetaMind Innovations P.C.
Greece
Panagiotis Radoglou-Grammatikis
K3Y Ltd
Greece
Panagiotis Fouliras
University of Macedonia
Greece
Konstandinos Panitsidis
University of Western Macedonia
Greece
Georgios Efstathopoulos
Greece
Thomas Lagkas
Democritus University of Thrace
Greece
Vasileios Argyriou
Kingston University London
United Kingdom
Igor Kotsiuba
Durham University Business School
United Kingdom
Panagiotis Sarigiannidis
University of Western Macedonia
Greece
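
The surrogate pipeline described in the summary (query the target for pseudo-labels, fit a surrogate, run a gradient-based attack on the surrogate, and check whether the perturbation transfers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the target `target_predict` is a hypothetical hidden linear classifier, the data are random 2-D points, and a hand-written logistic regression stands in for the MLP and XGBoost surrogates named in the summary. All function names and parameter values are illustrative.

```python
import math
import random


def target_predict(x):
    """Hypothetical black-box target: the attacker only sees its output label."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_surrogate(xs, ys, epochs=200, lr=0.5):
    """Fit a logistic-regression surrogate on pseudo-labels (assumed stand-in for MLP/XGB)."""
    w, b, n = [0.0, 0.0], 0.0, len(xs)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def surrogate_predict(x, w, b):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def fgsm(x, y, w, b, eps):
    """FGSM on the surrogate: step eps along the sign of the input gradient of the loss."""
    p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
    grad = [(p - y) * wi for wi in w]  # d(cross-entropy)/dx for logistic regression
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


rng = random.Random(0)

# 1. Query the target to obtain pseudo-labels for unlabeled inputs.
queries = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(500)]
pseudo_labels = [target_predict(x) for x in queries]

# 2. Train the surrogate to mimic the target's decision boundary.
w, b = train_surrogate(queries, pseudo_labels)

# 3. Craft adversarial examples on the surrogate and test them on the target.
test_points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
eps = 0.3
agree = 0
flipped = 0
for x in test_points:
    y = target_predict(x)  # label observed from the black box
    agree += surrogate_predict(x, w, b) == y
    x_adv = fgsm(x, y, w, b, eps)
    flipped += target_predict(x_adv) != y

agreement = agree / len(test_points)
success_rate = flipped / len(test_points)
print(f"surrogate/target agreement: {agreement:.2f}")
print(f"transfer attack success rate: {success_rate:.2f}")
```

In this toy setting the attacker never reads the target's weights; the only interaction is through label queries, which is the black-box constraint the summary describes. Swapping in PGD or BIM would only change step 3, and a non-differentiable surrogate such as a tree ensemble would need a different way of obtaining attack directions, which this sketch does not attempt to cover.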