IEEE-CSR - Peer Review & Conference Management System

Summary:

Adversarial attacks pose significant threats to machine learning models, with white-box attacks such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and the Basic Iterative Method (BIM) achieving high success rates when model gradients are accessible. However, in real-world scenarios, direct access to model internals is often restricted, necessitating black-box attack strategies that typically suffer from lower effectiveness. In this work, we propose a novel approach to transform white-box attacks into black-box attacks by leveraging state-of-the-art surrogate models, including Multi-Layer Perceptrons (MLP) and XGBoost (XGB). Our method involves training a surrogate model to mimic the decision boundaries of an inaccessible target model using pseudo-labeling, thereby enabling the application of gradient-based white-box attacks in a black-box setting. We systematically compare our approach against conventional black-box attacks such as Zeroth Order Optimization (ZOO), evaluating their effectiveness in terms of attack success rates, transferability, and computational efficiency. The results demonstrate that surrogate-assisted
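
The surrogate-assisted idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the target model `target_predict`, the two-feature synthetic data, and the logistic-regression surrogate (standing in for the MLP and XGBoost surrogates named above) are all assumptions chosen to keep the example dependency-free. The sketch queries the black-box target for hard labels (pseudo-labeling), fits the surrogate on them, takes one FGSM step using the surrogate's gradient, and checks how often the perturbed inputs change the target's decision.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def target_predict(x):
    # Hypothetical black-box target: the attacker sees only the hard label.
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0


def train_surrogate(queries, pseudo_labels, epochs=300, lr=0.5):
    # Logistic regression fitted by full-batch gradient descent
    # (an assumed stand-in for the MLP/XGBoost surrogates).
    w, b = [0.0, 0.0], 0.0
    n = len(queries)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(queries, pseudo_labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[i] - lr * gw[i] / n for i in range(2)]
        b -= lr * gb / n
    return w, b


def surrogate_predict(x, w, b):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def fgsm(x, y, w, b, eps):
    # One FGSM step on the surrogate: x + eps * sign(dL/dx), where L is the
    # cross-entropy loss and dL/dx = (p - y) * w for a logistic model.
    p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def sample_point():
    return [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]


random.seed(0)

# 1. Pseudo-labeling: query the target and keep its labels as training data.
queries = [sample_point() for _ in range(200)]
pseudo_labels = [target_predict(x) for x in queries]

# 2. Fit the surrogate on the pseudo-labeled queries.
w, b = train_surrogate(queries, pseudo_labels)

# 3. Craft adversarial inputs on the surrogate and replay them on the target.
test_points = [sample_point() for _ in range(200)]
agreement = sum(
    surrogate_predict(x, w, b) == target_predict(x) for x in test_points
) / len(test_points)

flipped = 0
for x in test_points:
    y = target_predict(x)
    x_adv = fgsm(x, y, w, b, eps=0.3)
    flipped += target_predict(x_adv) != y
success_rate = flipped / len(test_points)

print(f"surrogate/target agreement: {agreement:.2f}")
print(f"transfer attack success rate: {success_rate:.2f}")
```

The perturbation budget `eps=0.3` and the query budget of 200 are arbitrary illustration values. In a realistic setting the surrogate would be a differentiable model trained with an automatic-differentiation framework, and success would be measured only on inputs the target originally classifies correctly.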

Author(s):

Dimitrios-Christos Asimopoulos
MetaMind Innovations P.C.
Greece

Panagiotis Radoglou-Grammatikis
K3Y Ltd
Greece

Panagiotis Fouliras
University of Macedonia
Greece

Konstandinos Panitsidis
University of Western Macedonia
Greece

Georgios Efstathopoulos
Greece

Thomas Lagkas
Democritus University of Thrace
Greece

Vasileios Argyriou
Kingston University London
United Kingdom

Igor Kotsiuba
Durham University Business School
United Kingdom

Panagiotis Sarigiannidis
University of Western Macedonia
Greece