EFSAttack: Edge Noise-Constrained Black-Box Attack Using Artificial Fish Swarm Algorithm (2024)

1. Introduction

Deep neural networks (DNNs) are widely used in various fields, such as image classification [1], natural language processing [2], and speech recognition [3]. With the rapid development and extensive application of deep learning technology, adversarial example attacks have become a significant concern in the field of artificial intelligence security. Adversarial example attacks refer to the manipulation of input data with small perturbations, causing the target DNN models to produce misclassifications or erroneous outputs [4]. This attack method poses potential threats to the credibility and security of deep learning models in practical applications. Additionally, the existence of adversarial examples can measure the robustness of system models and provide targeted guidance for security defenses, highlighting the importance of researching adversarial example generation methods.

Adversarial example generation techniques can be mainly classified into two categories: white-box attacks and black-box attacks. White-box attacks refer to scenarios where attackers can obtain information about the target network, including the model, parameters, and training set. Attackers can generate adversarial examples by utilizing the gradient information of the target network, such as the fast gradient sign method (FGSM) [5] and the projected gradient descent (PGD) method [6]. However, the attack conditions for white-box attacks are more stringent and may not be easily realized in practical scenarios. Black-box attacks refer to scenarios where attackers cannot obtain information about the target network but only have access to its outputs. Based on the different outputs, black-box attack algorithms can be further divided into decision-based attacks and score-based attacks. Decision-based black-box attacks are aimed at attackers who can only get the final prediction result of the model, such as Boundary Attack [7], HopSkipJumpAttack [8], and Query-Efficient Boundary-based blackbox Attack (QEBA) [9], while score-based black-box attacks target models with accessible confidence scores, such as PatchAttack [10]. Generally, score-based black-box attacks require fewer queries than decision-based black-box attacks, making them more efficient. Therefore, this paper will focus on score-based black-box attacks. Score-based black-box attacks query the confidence score of the model by sending the constructed samples and updating the adversarial perturbation based on the confidence score, thus generating adversarial examples that can deceive the model.

Score-based black-box attack methods face two main challenges. Firstly, the attack process is prone to system detection. In most cases, black-box attack methods require submitting a large number of queries to the target model to obtain feedback information. This can arouse suspicion from the target system and increase the risk of detection. The system may detect abnormal query strategies and take corresponding defensive measures, making the attack difficult to succeed. Secondly, current black-box attack methods primarily focus on controlling the magnitude of perturbations while neglecting the impact of perturbation placement on the stealthiness of adversarial examples. In certain scenarios, even when using small perturbations, the resulting adversarial examples may still appear abnormal. This requires attackers to design the perturbation’s location and intensity more delicately to reduce the perceptibility of adversarial examples. To address the shortcomings of the current research, we aim to design a novel black-box attack technique that can successfully generate adversarial examples without raising suspicion, thereby advancing the boundaries of effectiveness and stealth in adversarial attacks. To achieve this goal, the method is required to satisfy the following subrequirements: (1) High attack success rate. The method should effectively deceive the target model, causing it to make incorrect predictions on adversarial examples. (2) Imperceptible perturbations. Adversarial examples should closely resemble the original samples, preserving their visual characteristics and potentially increasing the stealthiness of the attack. (3) Fewer queries. To ensure effectiveness, a black-box attack method must generate adversarial examples with minimal queries, decreasing the risk of detection.

In this work, we propose a novel Edge noise-constrained black-box attack method using the artificial Fish Swarm algorithm, EFSAttack for short. As illustrated in Figure 1, the proposed EFSAttack contains three key components: edge map extraction, edge noise-constrained population initialization, and edge noise-constrained population evolution. Specifically, for each input image, the edge map extraction module initially derives an initial edge map via edge detection algorithms. Then, we extend the initial edge map by proposing the edge extension methods to attain the refined edge map. Subsequently, the edge noise-constrained population initialization module initiates a set of random noises, which, after being filtered through the edge map to retain only edge-aligned disturbances, yields edge-constrained noise. This tailored noise is subsequently superimposed onto the input image to produce the initialized population for the attack. Next, the edge noise-constrained population evolution module innovates upon the updating strategies of the artificial fish swarm algorithm. It directs the population’s advancement under the constraint of edge noise, launching attacks on the target model and harnessing the feedback from the model to compute fitness functions. These functions guide the evolutionary process of the population until the attack is successful. Moreover, extensive comparative algorithms and ablation experiments on the CIFAR-10 and MNIST datasets have fully validated the effectiveness of our proposed EFSAttack.

Our main contributions can be summarized as follows:

  • We introduce the concept of the edge noise constraint, which keeps perturbations out of the low-frequency regions of the image by restricting them to the edge regions, effectively improving the concealment of the adversarial examples.

  • We adapt the artificial fish swarm algorithm to the black-box attack task based on the edge noise constraint, including edge noise-constrained population initialization and edge noise-constrained population evolution.

  • We demonstrate the effectiveness of EFSAttack by conducting empirical evaluations on the CIFAR-10 and MNIST datasets.

The structure of this paper is as follows: Section 2 introduces the related research and background knowledge; Section 3 discusses the details of our method; Section 4 verifies the effectiveness and performance advantages of the proposed method through experiments; and finally, Section 5 summarizes the paper.

2. Related Work

2.1. Adversarial Attacks

2.1.1. White-Box Attacks

White-box attacks can be carried out when the attacker has detailed knowledge of the neural network’s structure and parameters or possesses a neural network with a similar structure and parameters. The FGSM calculates the gradient of the loss function with respect to the input features and multiplies the sign of the gradient by a small constant (the perturbation budget). The adjusted gradient vector is then added directly to the original sample, enabling the construction of adversarial examples in a single step. PGD improves upon the FGSM by iteratively updating the gradient multiple times. Instead of a one-time perturbation addition as in the FGSM, PGD takes multiple small gradient steps to approach the optimal adversarial perturbation, resulting in more robust adversarial examples. The attack proposed by Carlini and Wagner [11] (C&W) is an iterative optimization-based, low-distortion adversarial example generation algorithm. It designs a loss function that takes a small value on adversarial examples and a large value on the original sample, so adversarial examples can be found by minimizing this loss function. In practice, white-box attacks require the attacker to have complete knowledge of the target model’s structure, parameters, and internal operations; since attackers often cannot obtain all of this information, the applicability of white-box attacks is limited.

2.1.2. Black-Box Attacks

Black-box attacks can only access the neural network’s inputs and outputs and are divided into decision-based black-box attacks and score-based black-box attacks.

Decision-based black-box attacks refer to situations where attackers can only observe the model’s final classification result without access to the intermediate confidence or score information. Boundary Attack, HopSkipJumpAttack, and QEBA generate adversarial examples by simulating and estimating the decision boundary. In these methods, the sample is initialized as the target image and moved along the classification boundary between the source and target images. This process can maintain the adversarial nature of the samples while reducing the distance between them. HopSkipJumpAttack utilizes binary information at the decision boundary for gradient direction estimation. QEBA finds the classification boundary between the source and target images through a binary search and then estimates the gradient of adversarial perturbations in a low-dimensional subspace using the Monte Carlo method. Decision-based black-box attacks often require numerous queries to generate adversarial examples because they can only obtain the model’s predicted labels.

Score-based black-box attack methods allow attackers to observe the scores or confidence information outputted by the model. These scores reflect the model’s prediction level for different classes. Attackers can use this information to guide the generation of adversarial examples, making the model more inclined to classify the adversarial examples into the desired category.

In score-based black-box attacks, one type of method is score-based patch attacks. PatchAttack utilizes monochrome patches and employs sampling and reinforcement learning to search for the position and shape of rectangular patches. However, monochrome patches often result in a very low success rate for attacks. The authors of [12] construct a class-specific texture dictionary through style transfer to create targeted patch attacks. Sparse-rs [13] designs initialization schemes and sampling distributions for adversarial patches. The adversarial examples generated by these methods have obvious perturbations and are easily detected as anomalies.

Another type is evolutionary algorithm-based black-box attack methods. OnePixel [14] and SparseEvo [15] craft attacks within the framework of evolutionary algorithms, performing operations such as crossover and mutation. However, the binary representation used in these methods cannot define continuous regions, making it impossible to model and execute patch attacks. AdversarialPSO [16] generates adversarial examples based on the Particle Swarm Optimization (PSO) algorithm. This algorithm first divides the search space, and different particles randomly select several image blocks to apply perturbations for particle swarm initialization, after which an optimization search is performed.

Current black-box attack methods still require a high number of model queries and mainly focus on the magnitude of perturbations while ignoring the influence of the perturbation location on the concealment of adversarial examples.

2.2. Heuristic Algorithms

Heuristic algorithms are approximate solution strategies based on experience or intuitive reasoning. They draw inspiration from optimization mechanisms found in nature, biology, and social phenomena to solve complex optimization problems, particularly those that are difficult to solve precisely using traditional methods. Common heuristic algorithms include the GA [17], PSO [18], ant colony algorithms [19], simulated annealing algorithms [20], etc.

The GA is an optimization algorithm based on Darwin’s theory of evolution. It searches for the optimal solution by simulating the process of genetic inheritance and mutation in organisms. The basic operations of the GA include selection, crossover, mutation, etc., which iterate continuously to approximate the optimal solution. PSO is an optimization algorithm based on collective behavior. It searches for the optimal solution by simulating the behavior patterns of biological groups, such as bird flocks and fish schools. The basic operations of PSO include updating the velocity, position, etc., which iterate continuously to approximate the optimal solution. The ant colony algorithm is an optimization algorithm that simulates the foraging behavior of ants. It searches for the optimal solution by simulating the process of ant pheromone communication. The basic operations of ant colony algorithms include ant movement, pheromone evaporation, etc., which iterate continuously to approximate the optimal solution. However, the classical heuristic algorithms mentioned above all have certain limitations. For example, the GA may fall into premature convergence, PSO is easily influenced by local optima, and ant colony algorithms have complex parameter settings.

The artificial fish swarm algorithm (AFSA) [21] simulates the behavioral patterns of fish societies, such as foraging, schooling, and following. Individual fish adjust their status by perceiving neighborhood information, cooperating to search for the optimization region, and avoiding the problem of local optima. The algorithm has strong global search capability and simple parameter settings, making it easier to implement and adjust, and has been widely used in many fields [22].

3. Methods

In this section, we first introduce the problem definition and then we provide the details of EFSAttack.

3.1. Problem Description

Black-box attacks aim to find a perturbation $\delta$ without computing gradients, such that the resulting image $\tilde{x}$ with the added perturbation deceives the target model into making incorrect predictions, which can be formally described as follows:

$y \neq \arg\max f(\tilde{x}) \quad \mathrm{s.t.} \quad \tilde{x} = x + \delta \ \ \mathrm{and} \ \ \|\delta\|_p < \varepsilon \qquad (1)$

where $x$ represents the original image, $\tilde{x}$ represents the perturbed image, $y$ represents the ground-truth class, and $\delta$ represents a perturbation with an $L_p$ norm smaller than $\varepsilon$. $f$ is the target model, $f(x) = (f_1, f_2, \ldots, f_k)$ denotes the confidence vector output by the model, and $f_i$ denotes the probability that model $f$ assigns to class $i$ for input $x$. Each submission of an image to the target model to obtain the confidence vector is counted as one query.
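For concreteness, a minimal Python sketch of this query interface is given below; the `model_fn` callable, the class name, and the success check are illustrative assumptions rather than part of the paper's formulation.

```python
import numpy as np

class QueryCounter:
    """Minimal sketch of the score-based query interface assumed in this
    section; `model_fn` is a hypothetical callable returning the confidence
    vector f(x) for a single image."""

    def __init__(self, model_fn):
        self.model_fn = model_fn
        self.num_queries = 0

    def query(self, x_tilde):
        # Each call corresponds to one query of the target model.
        self.num_queries += 1
        return np.asarray(self.model_fn(x_tilde), dtype=np.float64)

    def is_adversarial(self, x_tilde, y_true):
        # The attack succeeds when argmax f(x + delta) differs from the label.
        return int(np.argmax(self.query(x_tilde))) != int(y_true)
```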

3.2. EFSAttack

To tackle the mentioned challenges, we present EFSAttack, an edge noise-constrained black-box attack method based on the artificial fish swarm algorithm. The overall process of EFSAttack is shown in Figure 2. EFSAttack first extracts the edge map of the input image through the edge map extraction module, including the edge detection and the edge enhancement steps. Then, the edge noise-constrained population initialization module samples random noises and superimposes them on the input image, and the edge map constrains the noises to the edge regions of the image. Further, the samples in the population are fed into the target model to check if the attack is successful. If the attack succeeds, the procedure returns the adversarial example. Otherwise, we execute the edge noise-constrained population evolution to generate a new population and query the model again, repeating this process until attacking successfully.

3.2.1. Edge Noise Constraint

The edge of an image refers to the pixels that exhibit abrupt changes in grayscale values. By analyzing the neuron features in the hidden layers of neural networks using the deconvolution visualization algorithm, it can be observed that shallow neurons extract low-level features, such as color and edges. Furthermore, the extraction of higher-level semantic information relies on edge details, ultimately impacting the model’s predictions. To demonstrate the impact of edge features, we randomly selected 1000 samples from the CIFAR-10 dataset and visualized the features extracted from the original samples and the samples with added noise in the image edge region using the t-SNE method [23] in a two-dimensional space. We employed examples from three categories as depicted in Figure 3. Each point in the figure represents an individual sample, with points of the same color belonging to the same class. The distances between points denote the degree of similarity among the extracted image features. Specifically, Figure 3a illustrates the distribution of the features from unaltered original images, while Figure 3b reveals how the feature distribution changes when edge noise is introduced to the images. We discovered that in Figure 3a, we can easily observe the class boundary denoted as the red line. However, this boundary becomes blurred in Figure 3b after introducing noise into the edge regions of the images. The coherence of intra-class features decreases and the overlap region of inter-class features increases, indicating that the number of samples that the model tends to confuse increases. This situation provides greater opportunities for generating adversarial examples that can deceive the model.

Additionally, the human visual system has varying sensitivity to changes in different frequency regions of an image. It is more sensitive to alterations in low-frequency regions and relatively less sensitive to changes in high-frequency regions [24]. High-frequency regions correspond to areas in an image with significant variations in grayscale values, such as edges, textures, and noise. These regions exhibit sudden shifts in brightness, color, or shape. Conversely, low-frequency regions exhibit smooth transitions in grayscale values and represent large, flat areas in an image. Figure 4 visually demonstrates this concept. In the illustration, the brown background represents a low-frequency region where any added noise is easily detectable. However, in Figure 4c, the noise is deliberately limited to the edge region of the image, effectively excluding the background noise depicted in Figure 4b. Consequently, this selective placement of noise makes it virtually imperceptible to the human eye.

Given the above, we introduce the concept of edge noise constraint, aiming to confine the adversarial perturbations added to the image within the edge region, as shown in Equation (2).

$\tilde{x} = x + \delta \odot M_x \qquad (2)$

where $x$ represents the original image, $\delta$ denotes the noise sampled from a Gaussian distribution, and $M_x$ represents the edge map indicating the edge region of the image $x$. The operator $\odot$ denotes element-wise multiplication, and $\tilde{x}$ represents the perturbed image under the edge noise constraint. The specific implementation steps are as follows:

  • Edge extraction. To begin, we employ an edge detection algorithm [25] to extract the initial edge information from the given original image $x$. This process generates the initial edge map $M_x^{ini}$, a single-channel binary matrix in which a value of 1 at a position indicates that the corresponding position in the original image belongs to the edge region.

  • Edge expansion. To achieve a more continuous edge region, we perform edge expansion by shifting the initial edge map by one pixel in four directions: up, down, left, and right. We then add these shifted edge maps together and duplicate the edge map in the channel direction to ensure its channel number is consistent with that of the original image. Eventually, we obtain the final edge map M x .

  • Edge noise constraint. To enforce the noise δ to be confined within the edge region, we perform an element-wise multiplication between the noise matrix and the edge map matrix. This operation masks out the noise in the non-edge region, ensuring that it only affects the pixels belonging to the edges. Finally, we add the resulting processed noise to the original image. This process generates the final perturbed image, where the noise is constrained to the edge region of the image.

The edge noise constraint achieves a clever balance between two crucial aspects: adversarial perturbation and stealthiness. By confining the noise to the edge region of the image, it effectively disrupts the semantic information while maintaining the imperceptibility of the perturbation, which perfectly fulfills the core requirement for generating adversarial examples.
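To make the three steps above concrete, here is a minimal Python sketch of the edge noise constraint, assuming an RGB image with pixel values in [0, 1], OpenCV's Canny detector as the edge detection algorithm [25], and an illustrative noise scale `sigma`; the Canny thresholds and the use of `np.roll` for the one-pixel shifts are implementation choices, not values from the paper.

```python
import cv2
import numpy as np

def edge_noise_perturb(x, sigma=0.1, seed=None):
    """Sketch of the edge noise constraint (Equation (2)).

    x: H x W x C image with values in [0, 1].
    sigma: standard deviation of the Gaussian noise (illustrative value).
    Returns the perturbed image x + (delta * M_x) and the edge map M_x.
    """
    rng = np.random.default_rng(seed)

    # Edge extraction: Canny on the grayscale image gives the initial edge map.
    gray = cv2.cvtColor((x * 255).astype(np.uint8), cv2.COLOR_RGB2GRAY)
    edges = (cv2.Canny(gray, 100, 200) > 0).astype(np.float32)  # H x W binary map

    # Edge expansion: union of the map shifted by one pixel up/down/left/right.
    expanded = edges.copy()
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        expanded = np.maximum(expanded, np.roll(edges, shift, axis=axis))

    # Duplicate along the channel dimension to match the image shape.
    mask = np.repeat(expanded[..., None], x.shape[2], axis=2)

    # Edge noise constraint: mask the Gaussian noise, then add it to the image.
    delta = rng.normal(0.0, sigma, size=x.shape).astype(np.float32)
    x_tilde = np.clip(x + delta * mask, 0.0, 1.0)
    return x_tilde, mask
```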

3.2.2. Population Initialization Based on Edge Noise Constraint

Population initialization is a crucial step in the process, which involves initializing a set of perturbed images and population parameters and setting the objective function. In this section, we propose an innovative population initialization strategy that incorporates the edge noise constraint to generate adversarial examples of candidate solutions that balance both adversarial perturbation and stealthiness.

Let us consider a population of size N. When processing a given target image $x$, we first extract its edge map $M_x$. Next, we use a Gaussian distribution to randomly sample N noise vectors. These noise vectors are element-wise multiplied with the edge map to ensure that the noise primarily affects the edge region of the image. Subsequently, we overlay the processed noise onto the original image, resulting in N perturbed image samples. Collectively, these samples form the initial population $X = \{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N\}$, where $\tilde{x}_i = x + \delta_i \odot M_x$. By employing the edge noise constraint for population initialization, we ensure that the initial population is of higher quality: it consists of perturbed images that are potentially close to optimal solutions. As a result, the subsequent search can discover high-quality adversarial examples with fewer queries.
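A correspondingly small sketch of this initialization step is shown below, reusing the expanded edge map `mask` from the previous sketch; the population size of 5 mirrors the setting reported in Section 4.1, while `sigma` remains an illustrative value.

```python
import numpy as np

def init_population(x, mask, pop_size=5, sigma=0.1, seed=0):
    """Sketch of edge noise-constrained population initialization.
    `mask` is the expanded edge map M_x from the previous sketch;
    pop_size and sigma are illustrative values."""
    rng = np.random.default_rng(seed)
    population = []
    for _ in range(pop_size):
        delta = rng.normal(0.0, sigma, size=x.shape).astype(np.float32)
        # x_i = x + delta_i * M_x, clipped to the valid pixel range.
        population.append(np.clip(x + delta * mask, 0.0, 1.0))
    return population
```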

The population parameters consist of the population size N, the visual range of the artificial fish, denoted as $Visual$, and the step size for artificial fish movement, denoted as $Step$. In the black-box attack scenario, N represents the number of perturbed images in the population. $Visual$ indicates the perception distance of the perturbed images: a perturbed image can perceive others whose $L_2$ distance from it is smaller than the visual range. $Step$ represents the maximum magnitude of the pixel value changes when updating the perturbed images. In Section 4.3.3, we analyze the impact of different parameters on black-box attacks and set the parameter values manually based on the analysis.

To quantify the quality of perturbed images in the population, we define an objective function tailored for black-box attack tasks, as shown in Equation (3).

$o_i = -\log f_y(\tilde{x}_i) + \log \sum_{j=0,\, j \neq y}^{k} f_j(\tilde{x}_i) \qquad (3)$

where $f_y(\tilde{x}_i)$ represents the probability of correctly classifying the perturbed image as class $y$, and $\sum_{j=0,\, j \neq y}^{k} f_j(\tilde{x}_i)$ represents the total confidence assigned to misclassifying the perturbed image. The larger the value of the objective function, the higher the probability of misclassification by the model, indicating a stronger adversarial nature of the perturbed image.
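The fitness can be computed directly from the confidence vector returned by a single query. The sketch below assumes the reconstructed form of Equation (3) with a negated first term, and the small `eps` is added purely to avoid log(0).

```python
import numpy as np

def objective(scores, y_true, eps=1e-12):
    """Sketch of the fitness in Equation (3): larger values indicate a more
    adversarial perturbed image. `scores` is the confidence vector returned
    by the target model; eps guards against log(0)."""
    scores = np.asarray(scores, dtype=np.float64)
    p_true = scores[y_true]
    p_other = np.delete(scores, y_true).sum()
    return -np.log(p_true + eps) + np.log(p_other + eps)
```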

3.2.3. Population Evolution Based on Edge Noise Constraint

The AFSA updates potential solutions by simulating the preying, swarming, following, and moving behaviors of a fish school. However, directly applying the population evolution method of the AFSA would disrupt the noise distribution of perturbed images in the initial population, introducing noise in the low-frequency regions of the images and compromising their stealthiness. Therefore, we adapt the population evolution method of the AFSA to black-box attack scenarios under the edge noise constraint. Specifically, we propose three population evolution strategies: edge noise-constrained preying, edge noise-constrained swarming, and edge noise-constrained following. These strategies enforce the edge noise constraint during population evolution to address the issue of noise diffusion.

Edge noise-constrained preying. The perturbed image $\tilde{x}_i$ in the population checks whether there exists a new perturbed image $\tilde{x}_{new}$ with stronger adversarial properties within its visual range ($d(\tilde{x}_i, \tilde{x}_{new}) < Visual$). If such an image is found, $\tilde{x}_i$ moves one step toward $\tilde{x}_{new}$. To ensure that the update of $\tilde{x}_i$ does not introduce new noise into the low-frequency regions, we employ the edge noise constraint to sample and generate $\tilde{x}_{new}$ according to Equation (4).

$\tilde{x}_{new} = x + \delta_{L_2 < Visual} \odot M_x \qquad (4)$

Specifically, $\delta_{L_2 < Visual}$ represents randomly sampled Gaussian noise with an $L_2$ norm smaller than $Visual$. This noise is element-wise multiplied with the edge map $M_x$ of the target image and then added to the original image $x$ to create the new perturbed image $\tilde{x}_{new}$, which satisfies the edge noise constraint.

The objective function values $o_i$ and $o_{new}$ of $\tilde{x}_i$ and $\tilde{x}_{new}$ are computed based on Equation (3). If $o_i < o_{new}$, then $\tilde{x}_{new}$ exhibits stronger adversarial properties and is more likely to cause misclassification by the model than $\tilde{x}_i$. In this case, $\tilde{x}_i$ moves one step toward $\tilde{x}_{new}$, as shown in Equation (5). Here, $random(Step)$ represents a random number between 0 and $Step$, which introduces randomness to expand the search range and increase the possibility of finding better adversarial examples.

$\tilde{x}_i = \tilde{x}_i + \frac{\tilde{x}_{new} - \tilde{x}_i}{\|\tilde{x}_{new} - \tilde{x}_i\|} \times random(Step) \qquad (5)$

As $\tilde{x}_i$ and $\tilde{x}_{new}$ differ only in the pixel values within the edge map region and contain no additional noise elsewhere, this update modifies only the perturbation values in the edge region without introducing new noise into the low-frequency regions, thereby maintaining the stealthiness of the perturbations.
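The following is a sketch of the preying step built on Equations (4) and (5), reusing the hypothetical `objective` function and query wrapper from the earlier sketches; the way the $L_2$-bounded noise is drawn (rescaling a Gaussian sample) is an implementation choice, not something prescribed by the paper.

```python
import numpy as np

def prey(x, x_i, mask, visual, step, counter, y_true, rng):
    """Sketch of edge noise-constrained preying. `counter` is the hypothetical
    query wrapper and `objective` the fitness sketch defined earlier."""
    # Equation (4): candidate with edge-constrained noise of L2 norm < Visual.
    delta = rng.normal(0.0, 1.0, size=x.shape)
    delta *= (visual * rng.uniform(0.0, 1.0)) / (np.linalg.norm(delta) + 1e-12)
    x_new = np.clip(x + delta * mask, 0.0, 1.0)

    o_i = objective(counter.query(x_i), y_true)
    o_new = objective(counter.query(x_new), y_true)
    if o_i < o_new:
        # Equation (5): move one random step toward the better candidate.
        direction = x_new - x_i
        norm = np.linalg.norm(direction) + 1e-12
        x_i = x_i + direction / norm * rng.uniform(0.0, step)
    return np.clip(x_i, 0.0, 1.0)
```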

Edge noise-constrained swarming. The perturbed image $\tilde{x}_i$ searches within its visual range for other perturbed images in the population. If it finds $n_f$ perturbed images within its visual range, their pixel values are averaged to generate a new perturbed image $\tilde{x}_c$. This operation clearly preserves the edge noise constraint on the noise distribution of $\tilde{x}_c$. Subsequently, the objective function values of both $\tilde{x}_i$ and $\tilde{x}_c$ are computed. If $\tilde{x}_c$ exhibits a stronger adversarial effect than $\tilde{x}_i$ and the density of perturbed images around $\tilde{x}_c$ is moderate, then $\tilde{x}_i$ moves one step toward $\tilde{x}_c$. Otherwise, edge noise-constrained preying is executed.

The perturbed image density is determined by comparing $o_c / n_f$ with $\mu \cdot o_i$, where $o_c$ and $o_i$ represent the objective function values of $\tilde{x}_c$ and $\tilde{x}_i$, respectively, and $\mu$ is a scalar crowding factor. If $o_c / n_f > \mu \cdot o_i$, the density of the perturbed images around $\tilde{x}_c$ is not too high. A low crowding factor encourages perturbed images to perform a fine-grained search in known promising regions, while a high crowding factor motivates perturbed images to explore unknown regions.

Edge noise-constrained following. The perturbed image $\tilde{x}_i$ searches for the most adversarial perturbed image within its visual range, denoted as $\tilde{x}_b$. If the density of the perturbed images around $\tilde{x}_b$ is moderate, $\tilde{x}_i$ moves one step toward $\tilde{x}_b$. The updated perturbed image still satisfies the edge noise constraint.
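Since both swarming and following move $\tilde{x}_i$ toward a target (the neighborhood center or the best neighbor) subject to the crowding check $o_c / n_f > \mu \cdot o_i$, they can share one routine. The sketch below reuses the earlier hypothetical helpers and, as a simplification, applies the same neighbor count $n_f$ to the density check in both modes.

```python
import numpy as np

def swarm_or_follow(x_i, neighbors, mu, step, counter, y_true, rng, mode="swarm"):
    """Sketch of edge noise-constrained swarming/following with the crowding
    check o_c / n_f > mu * o_i; `objective` and `counter` come from the
    earlier sketches. Returns (updated image, whether a move happened)."""
    n_f = len(neighbors)
    if n_f == 0:
        return x_i, False  # caller falls back to preying

    if mode == "swarm":
        # Center of the neighbors; averaging preserves the edge constraint.
        target = np.mean(neighbors, axis=0)
    else:
        # Following: best (most adversarial) neighbor in the visual range.
        target = max(neighbors, key=lambda n: objective(counter.query(n), y_true))

    o_i = objective(counter.query(x_i), y_true)
    o_t = objective(counter.query(target), y_true)
    if o_t / n_f > mu * o_i:  # moderate density around the target
        direction = target - x_i
        norm = np.linalg.norm(direction) + 1e-12
        x_i = np.clip(x_i + direction / norm * rng.uniform(0.0, step), 0.0, 1.0)
        return x_i, True
    return x_i, False
```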

The perturbed images are iteratively updated based on the aforementioned scheme until generating the adversarial examples or reaching the maximum iteration count. During the iteration, the preying behavior identifies the optimal solution, the following behavior escapes from the local optimal solutions, and the swarming behavior gathers the fish swarm around the optimal solution. Further, by introducing the edge noise constraint mechanism, noise is not introduced in the low-frequency regions of the image, enhancing the stealthiness of the adversarial examples. Additionally, the edge noise blurs class boundaries, enabling the perturbed images to cause misclassification with fewer iterations, thereby increasing the attack success rate and reducing the queries.

3.2.4. Edge Noise Reduction

During the attack, we constrain the perturbations to the edge regions. However, not all edge noise affects the target model. To reduce the perturbations, we apply edge noise reduction to the adversarial examples.

We evaluate the $L_2$ distance between the original images and the adversarial examples before reduction. If the distance exceeds a predefined threshold, we execute the following operations:

  • Divide the adversarial example and the original image into blocks.

  • Iterate through all the image blocks. (i) Replace the block in the adversarial example with the counterpart in the original image. (ii) Query the target model to check if the current adversarial example causes the model to classify incorrectly. (iii) If the current adversarial example causes misclassification, retain the replacement of the image block; otherwise, revert the replacement.
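A minimal sketch of this block-wise reduction loop follows, assuming square blocks of an illustrative size and the hypothetical query wrapper introduced earlier; the $L_2$ threshold is likewise a placeholder, not a value reported in the paper.

```python
import numpy as np

def reduce_edge_noise(x, x_adv, counter, y_true, block=4, l2_threshold=1.0):
    """Sketch of block-wise edge noise reduction. `block` and `l2_threshold`
    are illustrative values; `counter` is the hypothetical query wrapper."""
    if np.linalg.norm(x_adv - x) <= l2_threshold:
        return x_adv  # perturbation already small enough, skip reduction

    h, w = x.shape[:2]
    x_adv = x_adv.copy()
    for i in range(0, h, block):
        for j in range(0, w, block):
            saved = x_adv[i:i + block, j:j + block].copy()
            # Try replacing this block with the original pixels.
            x_adv[i:i + block, j:j + block] = x[i:i + block, j:j + block]
            if not counter.is_adversarial(x_adv, y_true):
                # The replacement destroyed the attack: revert it.
                x_adv[i:i + block, j:j + block] = saved
    return x_adv
```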

Through these steps, we can partially remove redundant noise, enhancing the stealthiness of the adversarial examples. Additionally, by pre-evaluating the $L_2$ distance, we balance the extra queries introduced by the reduction against the benefit it brings. Algorithm 1 shows the overall process of EFSAttack.

Algorithm 1 Overall procedure of EFSAttack
Input: target image x, maximum iteration number M, population size N, step size Step, visual range Visual
Output: the adversarial example x̃_adv
1:  M_x ← the edge map of the target image x
2:  Sample noise δ from a Gaussian distribution
3:  Initialize the population X according to Equation (2)
4:  for epoch = 1 to M do
5:      for each x̃_i in X do
6:          X_f ← the samples within the visual range of x̃_i
7:          if |X_f| > 0 then
8:              Swarm(x̃_i, X_f)
9:          else
10:             Prey(x̃_i)
11:         end if
12:         X_f ← the samples within the visual range of x̃_i
13:         if |X_f| > 0 then
14:             Follow(x̃_i, X_f)
15:         else
16:             Prey(x̃_i)
17:         end if
18:     end for
19:     if the attack succeeds then
20:         break
21:     end if
22: end for
23: Divide x̃_bestg and the original image x into blocks
24: for each block do
25:     Substitute the block in x̃_bestg with its counterpart in x
26:     if x̃_bestg is still an adversarial example then
27:         continue
28:     else
29:         undo the substitution
30:     end if
31: end for
32: x̃_adv ← x̃_bestg
33: return x̃_adv
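To connect the pieces, the compact sketch below strings the earlier hypothetical helpers (`init_population`, `prey`, `swarm_or_follow`, `reduce_edge_noise`, and the query wrapper) into the main loop of Algorithm 1. The adaptive Step/Visual schedules used in the experiments (Section 4.1) are omitted, success is checked after every update for simplicity, and all defaults (including μ) are illustrative rather than the paper's exact settings.

```python
import numpy as np

def efs_attack(x, y_true, counter, mask, pop_size=5, max_iter=100,
               visual=0.01, step=0.5, mu=0.5, seed=0):
    """Compact sketch of Algorithm 1 built from the earlier hypothetical
    helpers; parameter defaults are illustrative."""
    rng = np.random.default_rng(seed)
    population = init_population(x, mask, pop_size=pop_size, seed=seed)

    for _ in range(max_iter):
        for idx in range(len(population)):
            x_i = population[idx]
            for mode in ("swarm", "follow"):
                # Neighbors within the visual range of the current image.
                neighbors = [p for k, p in enumerate(population)
                             if k != idx and np.linalg.norm(p - x_i) < visual]
                x_i, moved = swarm_or_follow(x_i, neighbors, mu, step, counter,
                                             y_true, rng, mode=mode)
                if not moved:
                    x_i = prey(x, x_i, mask, visual, step, counter, y_true, rng)
            population[idx] = x_i
            if counter.is_adversarial(x_i, y_true):
                # Attack succeeded: apply block-wise edge noise reduction.
                return reduce_edge_noise(x, x_i, counter, y_true)
    return None  # attack failed within the iteration budget
```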

4. Experiments

4.1. Setup

Datasets. In this section, we evaluate the performance of our method on two commonly used datasets, MNIST and CIFAR-10. MNIST consists of a collection of grayscale handwritten digits, with 60,000 training samples and 10,000 testing samples. Each class contains 7000 samples of size 28 × 28. The CIFAR-10 dataset contains 60,000 color images of size 32 × 32, divided into a training set of 50,000 images and a test set of 10,000 images. We conducted attacks using EFSAttack on the first 1000 correctly classified samples from the MNIST and CIFAR-10 test sets and reported the results obtained.

Metrics. To evaluate EFSAttack, we consider the following three metrics: the success rate (the ratio of successfully generated adversarial examples to the total number of samples), the average $L_2$ distance between input images and their corresponding adversarial examples, and the average number of queries required to generate adversarial examples.
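For reference, these metrics can be computed from per-sample attack records as sketched below; the record format (success flag, $L_2$ distance, query count) is a hypothetical bookkeeping structure, and averaging over successful attacks only is an assumption rather than something the paper specifies.

```python
import numpy as np

def summarize(results):
    """Sketch of the three evaluation metrics. `results` is assumed to be a
    list of dicts like {"success": bool, "l2": float, "queries": int}."""
    success = [r for r in results if r["success"]]
    return {
        "success_rate": len(success) / max(len(results), 1),
        "avg_l2": float(np.mean([r["l2"] for r in success])) if success else float("nan"),
        "avg_queries": float(np.mean([r["queries"] for r in success])) if success else float("nan"),
    }
```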

Parameters. For the CIFAR-10 dataset, we set the population size to 5, the initial Step to 0.5 with an increment rate of 1.1, and the initial Visual to 0.01 with a decrease rate of 0.98. For the MNIST dataset, we set the population size to 5, the initial Step to 1.5 with an increment rate of 1.1, and the initial Visual to 0.01 with a decrease rate of 0.98.

Baseline Models. We compared our results with the following algorithms on the CIFAR-10 and MNIST benchmark datasets:

  • Zeroth Order Optimization (ZOO) [26]. ZOO generates adversarial examples based on differential evolution. It evaluates the coordinates of the samples after applying perturbations, approximates the gradients for each coordinate, and then applies small perturbations in the direction of the gradient to generate adversarial examples. ZOO reduces the number of queries required by using random coordinate descent and attack space dimensionality reduction.

  • GenAttack [27]. GenAttack generates adversarial examples using the GA. It proposes the use of dimensionality reduction and adaptive parameter scaling to reduce the number of queries required.

  • AdversarialPSO. AdversarialPSO generates adversarial examples based on PSO. This algorithm first divides the search space and initializes the particle swarm by randomly selecting a few image blocks to apply perturbations. It then performs an optimization search.

  • ABCAttack [28]. ABCAttack generates adversarial examples based on the artificial bee colony algorithm.

  • Multi-Group PSO with Random Redistribution (MGRR-PSO) [29]. MGRR-PSO uses multiple groups of PSO with random redistribution to generate perturbations. It solves the problem of the PSO algorithm getting stuck in local optima, leading to low attack success rates.

  • Brownian Arithmetic Optimization Algorithm (BAOA) [30]. The BAOA uses simple arithmetic operations inspired by the Brownian motion of molecules in fluids and gases to search for the best adversarial examples in high-dimensional image space.

All the experiments were conducted on a server equipped with a 32-core Intel (R) Xeon (R) CPU E5-2620 v4 2.10 GHz and two NVIDIA GeForce RTX 2080 Ti GPUs.

4.2. Results

Table 1 presents a comparison of the performance of different methods across various evaluation metrics. From the table, we can make the following observations.

Regarding the queries. The average number of queries for the ZOO method on the CIFAR-10 and MNIST datasets is 12,800 and 384,000, respectively. However, in practical environments, MLaaS providers and other remote hosts typically monitor the number of queries, so submitting such a large number of queries is unrealistic. In contrast, our EFSAttack method has an average number of queries of 541 and 583, respectively, which is much lower than that of ZOO and also better than the GenAttack and AdversarialPSO methods. This indicates that our method can converge faster, search more effectively for valid adversarial examples, and is less likely to be detected in real-world applications.

Regarding the attack success rate. On the CIFAR-10 and MNIST datasets, our EFSAttack method demonstrated a very high attack success rate, which was on par with the ZOO, MGRR-PSO Attack, and ABCAttack methods and significantly better than the GenAttack and AdversarialPSO methods. This proves that our EFSAttack method is very effective in maintaining high attack success rates.

Regarding the $L_2$ distance. The average $L_2$ distance of the adversarial examples generated by EFSAttack on the CIFAR-10 and MNIST datasets is 1.396 and 3.745, respectively. Although higher than that of ZOO, it remains within a reasonable range: it is better than all methods other than ZOO on the MNIST dataset and significantly better than the ABCAttack, MGRR-PSO Attack, and BAOA methods on CIFAR-10. This indicates that our method can generate highly stealthy adversarial examples.

In summary, our EFSAttack achieves significant performance gains on the attack success rate, number of queries, and stealthiness. The adversarial examples generated by EFSAttack on the CIFAR-10 dataset are shown in Figure 5a. The first row shows the original images with labels ‘automobile’, ‘airplane’, ‘horse’, and ‘dog’. The second row shows the generated adversarial examples with labels ‘truck’, ‘cat’, ‘cat’, and ‘deer’. The adversarial examples generated on the MNIST dataset are shown in Figure 5b. The first row shows the original images with labels ‘4’, ‘5’, ‘4’, and ‘8’. The second row shows the generated adversarial examples with labels ‘9’, ‘9’, ‘9’, and ‘9’. By observing the adversarial example images, we find that it is difficult for the human eye to detect the differences between the original image and the adversarial example.

4.3. Ablation Studies

4.3.1. Update Strategy

To validate the effectiveness of the update strategies, we conducted three derivative experiments on the CIFAR-10 and MNIST datasets by disabling the preying, swarming, and following behaviors. We named these three methods as EFSAttack-w/o-prey, EFSAttack-w/o-swarm, and EFSAttack-w/o-follow, respectively. The performance of these methods is presented in detail in Table 2.

When the preying behavior is disabled (EFSAttack-w/o-prey), the attack success rate drops sharply to 22.8% and 2.2%, because the perturbed images lack the necessary updating mechanism. As the fallback behavior for swarming and following, preying ensures self-updating and improvement of the current distribution of perturbed images, driving them to explore a better solution space in the absence of external guidance or when facing local optima. As shown in Figure 6, assuming that the initialized perturbed images are not within each other’s field of view, without the preying behavior the perturbed images cannot be effectively updated or explored.

The swarming and following behaviors achieve exploration and exploitation of the solution space through coordinated updates of the perturbed images, effectively preventing the algorithm from falling into local optima and helping it find the global optimum. After disabling swarming (EFSAttack-w/o-swarm), the average $L_2$ distance on the CIFAR-10 and MNIST datasets increased to 2.512 and 3.949, respectively. After disabling following (EFSAttack-w/o-follow), the average $L_2$ distance on the CIFAR-10 and MNIST datasets increased to 2.516 and 3.961, respectively.

Through the analysis of these ablation experiment results, we can conclude that preying, swarming, and following strategies play important roles in optimizing the search process and preventing the algorithm from falling into a locally optimal solution. These behaviors cooperate to promote the effectiveness of the search process and the discovery of the global optimal solution.

4.3.2. Noise Constraint Strategy

In this study, we proposed two noise constraint strategies, namely, the edge noise constraint and redundant noise handling, to enhance the generation quality of adversarial examples. To validate the effectiveness of these two strategies, we compared EFSAttack with the following two derived methods: (1) EFSAttack-w/o-mask: in this variant, we removed the constraint of adding noise only at the edges and allowed perturbations to be added at any location in the image. (2) EFSAttack-w/o-reduction: in this variant, we disabled the redundant noise removal operation, meaning that no additional steps were taken to eliminate redundant noise after generating adversarial examples with the artificial fish swarm algorithm. Table 3 provides a detailed comparison of the performance of EFSAttack and its derived methods, while Figure 7 shows the adversarial examples generated by EFSAttack-w/o-mask, EFSAttack-w/o-reduction, and EFSAttack.

Firstly, by comparing Figure 7b,d, we can observe that the adversarial examples generated by EFSAttack are visually less conspicuous, which indicates that removing the constraint on edge perturbations hurts the stealthiness of the generated examples. EFSAttack-w/o-mask only restricts the magnitude of the perturbations during population initialization and the search process, without imposing constraints on their location, resulting in a relatively random distribution of perturbations. When noise appears in the low-frequency region of the image, the human eye perceives the changes sharply. In contrast, EFSAttack determines the high-frequency regions of the input sample through edge detection and, during population initialization and the search process, not only restricts the size of the perturbations but also ensures that they are superimposed on the high-frequency regions of the image. Therefore, even with larger perturbations, humans find it difficult to perceive the changes due to their insensitivity to high-frequency signals. Thus, adding noise only in the edge regions is crucial for generating high-quality adversarial examples. Secondly, the average $L_2$ distance of EFSAttack-w/o-reduction is higher than that of EFSAttack and EFSAttack-w/o-mask. By comparing Figure 7c,d, we can see that the perturbations in the adversarial examples generated by EFSAttack-w/o-reduction are more pronounced, confirming that removing redundant noise effectively eliminates irrelevant noise and improves the quality of the generated adversarial examples.

4.3.3. Hyperparameter Analysis

To evaluate the impact of different hyperparameter values on the performance of EFSAttack, we conducted experiments with various population sizes, field-of-view ranges, and step sizes. We randomly selected 100 images from the CIFAR-10 and MNIST datasets and performed attacks using different parameter configurations. The performance results are shown in Figure 8.

Population Size. As shown in Figure 8a, the average $L_2$ distance during the generation of adversarial examples decreases when the population size increases. This is because a larger population size enhances the algorithm’s global search capability. A larger population size means more ‘explorers’ deployed in the search space, increasing the probability of finding the global optimal solution. Additionally, a larger population size can also lead to more intense competition and cooperation among individuals, which benefits the algorithm’s local search efficiency. Competition among individuals helps the fish swarm converge to local optimal solutions more quickly, while cooperation among individuals helps prevent the fish swarm from prematurely getting trapped in local optima. However, although increasing the population size can improve the convergence speed and solution accuracy of the algorithm, it may also increase the computation costs and time complexity. As shown in the graph, as the population size increases, the number of queries also increases, and the increase is much greater than the decrease in the average $L_2$ distance. Therefore, it is advisable to choose a smaller population size within an acceptable range of perturbations.

Visual Range. As shown in Figure 8b, the average $L_2$ distance of generated adversarial examples decreases when the field of view expands, but the average number of queries increases. This is because the field of view directly affects the search capability for candidate images and the perception level of other candidate images. A smaller field of view focuses on local search, allowing for faster convergence to locally optimal solutions with fewer queries, but it generates larger perturbations in the generated adversarial examples. On the other hand, a larger field of view enables fish individuals to perceive information from a greater distance, thereby increasing the possibility of discovering global optimal solutions, but it requires more queries. To address this, we employ an adaptive field of view strategy with gradual decay. In the initial stages, we encourage exploration by using a larger field of view. As the algorithm approaches the optimal solution, we gradually decrease the field of view to achieve rapid convergence and strike a balance between searching for optimal solutions and reducing the number of queries required.

Step Size. As shown in Figure 8c, the perturbation of generated adversarial examples slightly increases with an increase in step size, but the query count significantly decreases. For example, when the step size increases from 0.1 to 0.2, the perturbation increases by less than 3%, while the query count decreases from about 1000 to below 800, reducing by more than 20%. Within an acceptable range of perturbations, we tend to choose a larger step size to improve the convergence speed and reduce the number of queries required.

5. Discussion

The stealthiness of adversarial attacks can be examined along two dimensions: (1) the stealthiness of the attack process to the target system, reflected in the number of queries required (the more queries an attack submits, the easier it is to detect); and (2) the stealthiness of the generated adversarial examples, reflected in whether the noise is noticeable to humans. Compared to existing methods that focus on controlling the magnitude of the perturbation but overlook the impact of noise placement on both the convergence rate of the algorithm and the stealthiness of the adversarial disturbance, EFSAttack introduces the concept of edge noise, leveraging the discrepancy between a target model’s sensitivity to edge features in images and the human eye’s insensitivity to high-frequency areas. On the one hand, EFSAttack accelerates the convergence of the algorithm and reduces the number of queries required by confining the noise to the edge region, enhancing the stealth of the attack process. On the other hand, EFSAttack improves the population initialization and evolution strategies of the AFSA with edge noise constraints, guaranteeing that the noise remains confined to regions where human perception is dulled and is therefore less perceptible. In contrast, methods like GenAttack and AdversarialPSO may introduce noise across any part of the image, making the adversarial examples more conspicuous.

Although the proposed method effectively enhances the stealthiness of black-box attacks, some issues remain to be addressed. First, the proposed method targets the black-box attack scenario that relies on the model’s confidence score; in real-world environments, obtaining the confidence score of the target model is often infeasible, so a thorough study of decision-based black-box attack methods is worthwhile. Second, although EFSAttack significantly reduces the number of queries and increases the attack success rate through the edge noise strategy, its effectiveness against highly robust defense mechanisms such as adversarial training or integrated defense systems remains to be evaluated. Future research could prioritize enhancing the penetration of EFSAttack against these advanced defense techniques and further optimize the algorithm to adapt to different attack strategies under various defense mechanisms.

6. Conclusions

We proposed a black-box attack method named EFSAttack based on the AFSA under the edge noise constraint. In this paper, we studied the effect of the location of added noise on the attack success rate and the concealment of adversarial examples and proposed the concept of the edge noise constraint. Further, we applied the AFSA to the task of adversarial example generation and improved the population initialization and population evolution strategies of the artificial fish swarm algorithm based on the edge noise constraint. Our work strikes a balance between query efficiency and adversarial stealth, outperforming existing methods by substantially reducing the queries needed for successful attacks while making the generated examples harder to detect. Our work has important implications for the security and robustness of neural networks, and it underscores the need to improve defense mechanisms against such attacks.

Author Contributions

Conceptualization and methodology, J.G.; software, J.G. and B.W.; validation, X.W., K.Z. and C.W.; writing—original draft preparation, J.G.; writing—review and editing, X.W. and C.W.; supervision, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data that support the findings of this study are openly available at https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 21 March 2024) and http://yann.lecun.com/exdb/mnist/ (accessed on 21 March 2024).

Conflicts of Interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

References

  1. Yang, J.; Shi, R.; Wei, D.; Liu, Z.; Zhao, L.; Ke, B.; Pfister, H.; Ni, B. Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification. Sci. Data 2023, 10, 41. [Google Scholar] [CrossRef] [PubMed]
  2. Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y.; et al. A survey on evaluation of large language models. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–45. [Google Scholar] [CrossRef]
  3. Radford, A.; Kim, J.W.; Xu, T.; Brockman, G.; McLeavey, C.; Sutskever, I. Robust speech recognition via large-scale weak supervision. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 28492–28518. [Google Scholar]
  4. Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199. [Google Scholar]
  5. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
  6. Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards deep learning models resistant to adversarial attacks. arXiv 2017, arXiv:1706.06083. [Google Scholar]
  7. Brendel, W.; Rauber, J.; Bethge, M. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. arXiv 2021, arXiv:1712.04248. [Google Scholar]
  8. Chen, J.; Jordan, M.I.; Wainwright, M.J. Hopskipjumpattack: A query-efficient decision-based attack. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (sp), San Francisco, CA, USA, 18–21 May 2020; pp. 1277–1294. [Google Scholar]
  9. Li, H.; Xu, X.; Zhang, X.; Yang, S.; Li, B. Qeba: Query-efficient boundary-based blackbox attack. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2020; pp. 1221–1230. [Google Scholar]
  10. Yang, C.; Kortylewski, A.; Xie, C.; Cao, Y.; Yuille, A. Patchattack: A black-box texture-based attack with reinforcement learning. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2020; pp. 681–698. [Google Scholar]
  11. Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (sp), San Jose, CA, USA, 22–26 May 2017; pp. 39–57. [Google Scholar]
  12. Gatys, L.A.; Ecker, A.S.; Bethge, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2414–2423. [Google Scholar]
  13. Croce, F.; Andriushchenko, M.; Singh, N.D.; Flammarion, N.; Hein, M. Sparse-rs: A versatile framework for query-efficient sparse black-box adversarial attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; pp. 6437–6445. [Google Scholar]
  14. Su, J.; Vargas, D.V.; Sakurai, K. One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. 2019, 23, 828–841. [Google Scholar] [CrossRef]
  15. Vo, V.Q.; Abbasnejad, E.; Ranasinghe, D.C. Query efficient decision based sparse attacks against black-box deep learning models. arXiv 2022, arXiv:2202.00091. [Google Scholar]
  16. Mosli, R.; Wright, M.; Yuan, B.; Pan, Y. They might not be giants crafting black-box adversarial examples using particle swarm optimization. In Proceedings of the Computer Security-ESORICS 2020: 25th European Symposium on Research in Computer Security, ESORICS 2020, Guildford, UK, 14–18 September 2020; Proceedings, Part II 25. Springer: Berlin/Heidelberg, Germany, 2020; pp. 439–459. [Google Scholar]
  17. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence; MIT Press: Cambridge, MA, USA, 1992. [Google Scholar]
  18. Eberhart, R.; Kennedy, J. A new optimizer using particle swarm theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; pp. 39–43. [Google Scholar]
  19. Dorigo, M.; Birattari, M.; Stutzle, T. Ant colony optimization. IEEE Comput. Intell. Mag. 2006, 1, 28–39. [Google Scholar] [CrossRef]
  20. Kirkpatrick, S.; Gelatt, C.D., Jr.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680. [Google Scholar] [CrossRef] [PubMed]
  21. Neshat, M.; Sepidnam, G.; Sargolzaei, M.; Toosi, A.N. Artificial fish swarm algorithm: A survey of the state-of-the-art, hybridization, combinatorial and indicative applications. Artif. Intell. Rev. 2014, 42, 965–997. [Google Scholar] [CrossRef]
  22. Pourpanah, F.; Wang, R.; Lim, C.P.; Wang, X.-Z.; Yazdani, D. A review of artificial fish swarm algorithms: Recent advances and applications. Artif. Intell. Rev. 2023, 56, 1867–1903. [Google Scholar] [CrossRef]
  23. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
  24. Kaur, M.; Jindal, S.; Behal, S.J. A study of digital image watermarking. J. Res. Eng. Appl. Sci. 2012, 2, 126–136. [Google Scholar]
  25. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, 8, 679–698. [Google Scholar] [CrossRef] [PubMed]
  26. Chen, P.-Y.; Zhang, H.; Sharma, Y.; Yi, J.; Hsieh, C.-J. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 15–26. [Google Scholar]
  27. Alzantot, M.; Sharma, Y.; Chakraborty, S.; Zhang, H.; Hsieh, C.-J.; Srivastava, M.B. Genattack: Practical black-box attacks with gradient-free optimization. In Proceedings of the Genetic and Evolutionary Computation Conference, Prague, Czech Republic, 13–17 July 2019; pp. 1111–1119. [Google Scholar]
  28. Cao, H.; Si, C.; Sun, Q.; Liu, Y.; Li, S.; Gope, P. Abcattack: A gradient-free optimization black-box attack for fooling deep image classifiers. Entropy 2022, 24, 412. [Google Scholar] [CrossRef] [PubMed]
  29. Suryanto, N.; Kang, H.; Kim, Y.; Yun, Y.; Larasati, H.T.; Kim, H. A distributed black-box adversarial attack based on multi-group particle swarm optimization. Sensors 2020, 20, 7158. [Google Scholar] [CrossRef] [PubMed]
  30. Mukeri, A.F.; Gaikwad, D.P. Towards Query Efficient and Derivative Free Black Box Adversarial Machine Learning Attack. Int. J. Image Graph. Signal Process. 2022, 13, 16. [Google Scholar] [CrossRef]

EFSAttack: Edge Noise-Constrained Black-Box Attack Using Artificial Fish Swarm Algorithm (1)

Figure 1. The overall framework of EFSAttack. For the input image, we first extract its edge map via edge detection and edge expansion. Then, in the population initialization process, we sample noise from the Gaussian distribution and preprocess the noise with the edge map before adding it to the input image. Finally, the perturbation images in the population are continuously updated through the update strategy during the population evolution iterations until the adversarial examples are generated. The input image can be correctly classified by the model as the ‘automobile’, while the adversarial example generated by EFSAttack is misclassified as the ‘truck’ by the model.


Figure 2. The overall process of EFSAttack.


Figure 3. Distribution plots in a two-dimensional space of the features extracted from different classes of samples in the CIFAR-10 dataset. (a) Features of the original examples. (b) Features of the examples with edge noise.
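Since the reference list includes t-SNE [23], the two-dimensional feature distributions in Figure 3 can presumably be reproduced along the following lines. The `extract_features` call, the choice of feature source (e.g., an intermediate layer of the target classifier), and the t-SNE settings are assumptions for illustration rather than the authors' exact setup.

```python
# Hedged sketch of the Figure 3 visualization using t-SNE [23].
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_feature_distribution(features, labels, title):
    """features: N x D array of model features; labels: N integer class labels."""
    emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
    plt.figure(figsize=(5, 5))
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=5)
    plt.title(title)
    plt.show()

# Hypothetical usage (extract_features, clean_images, edge_noise_images are placeholders):
# plot_feature_distribution(extract_features(clean_images), labels, "Original examples")
# plot_feature_distribution(extract_features(edge_noise_images), labels, "Examples with edge noise")
```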


Figure 4. The effects of adding noise at different locations on the image. (a) The original image. (b) Adding noise to the entire region of the image. (c) Adding noise to the edge region of the image.
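A short sketch of the comparison in Figure 4: the same Gaussian noise is applied once to the entire image and once only within the (expanded) edge region. It reuses the hypothetical edge_mask helper sketched under Figure 1, and the noise scale is an illustrative choice.

```python
# Full-image noise (Figure 4b) versus edge-constrained noise (Figure 4c).
import numpy as np

def full_vs_edge_noise(image_uint8, mask, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    x = image_uint8.astype(np.float32) / 255.0
    noise = rng.normal(0.0, sigma, size=x.shape).astype(np.float32)
    noisy_full = np.clip(x + noise, 0.0, 1.0)         # noise over the entire region
    noisy_edge = np.clip(x + noise * mask, 0.0, 1.0)  # noise confined to the edge region
    return noisy_full, noisy_edge
```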


Figure 5. Adversarial examples generated by EFSAttack on CIFAR-10 and MNIST datasets. (a) EFSAttack on CIFAR-10. (b) EFSAttack on MNIST.


Figure 6. The distribution of the perturbed images. The five images lie outside each other’s fields of view; without the random update strategy, the perturbed images would not be updated.
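The situation in Figure 6 can be sketched as follows: each perturbed image checks for neighbours within its visual range (measured here, as an assumption, by the L2 distance between perturbed images), and when none is found a small random, edge-masked step is taken so the individual still moves. The exact form of the paper's random update strategy is not reproduced here; this is an assumed illustration of the mechanism.

```python
# Hedged sketch of the visual-range check and the random fallback update.
import numpy as np

def neighbors_in_view(population, i, visual_range):
    """Indices of individuals whose L2 distance to individual i is within the visual range."""
    xi = population[i].ravel()
    return [j for j, xj in enumerate(population)
            if j != i and np.linalg.norm(xj.ravel() - xi) <= visual_range]

def random_update(individual, mask, step=0.02, rng=None):
    """Fallback move when the field of view is empty: a small edge-constrained random step."""
    rng = rng if rng is not None else np.random.default_rng()
    direction = rng.normal(0.0, 1.0, size=individual.shape).astype(np.float32)
    return np.clip(individual + step * direction * mask, 0.0, 1.0)
```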


Figure 7. Comparison of adversarial examples generated by EFSAttack and its derived algorithms. (a) The input examples; (b) the adversarial examples generated by EFSAttack-w/o-mask; (c) the adversarial examples generated by EFSAttack-w/o-reduction; and (d) the adversarial examples generated by EFSAttack.


Figure 8. The impact of different hyperparameters on attack performance on the CIFAR-10 and MNIST datasets. (a) Population size. (b) Visual range. (c) Step size.


Table 1. Performance comparison of EFSAttack and compared algorithms.


Method              CIFAR-10                                   MNIST
                    Success Rate   Avg. Queries   Avg. L2      Success Rate   Avg. Queries   Avg. L2
ZOO                 100.00%        12,800         0.199        100%           384,000        1.496
GenAttack           96.50%         1360           1.3651       94.45%         1801           5.191
AdversarialPSO      99.60%         1224           1.414        96.30%         593            4.143
ABCAttack           98.60%         330            1.643        100%           629            4.010
MGRR-PSO Attack     100.00%        694            1.767        100%           1288           4.2805
BAOA                69.00%         820            1.990        -              -              -
EFSAttack (ours)    100.00%        541            1.396        98.4%          583            3.745


Table 2. Performance of EFSAttack with different update strategies.


Method                  CIFAR-10                                   MNIST
                        Success Rate   Avg. Queries   Avg. L2      Success Rate   Avg. Queries   Avg. L2
EFSAttack               100%           541            1.396        98.4%          583            3.745
EFSAttack-w/o-prey      22.8%          5              0.105        2.2%           5              0.038
EFSAttack-w/o-swarm     99.6%          471            2.512        98.4%          528            3.949
EFSAttack-w/o-follow    99.6%          473            2.516        98.3%          525            3.961


Table 3. Performance of EFSAttack with different noise constraint strategies.


Method                     CIFAR-10                                   MNIST
                           Success Rate   Avg. Queries   Avg. L2      Success Rate   Avg. Queries   Avg. L2
EFSAttack                  100%           541            1.396        98.4%          585            3.769
EFSAttack-w/o-mask         100%           497            1.304        100%           515            3.703
EFSAttack-w/o-reduction    100%           393            3.103        98.1%          416            5.865

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).