Abstract
Textual adversarial attack methods aim to replace some words in an input text so that the victim model misbehaves. This paper proposes an effective word-level adversarial attack method based on sememes and an improved quantum-behaved particle swarm optimization (QPSO) algorithm. The sememe-based substitute method, which uses words sharing the same sememes as substitutes for the original words, is first employed to form a reduced search space. Then, an improved QPSO algorithm, called historical information guided QPSO with random drift local attractor (HIQPSO-RD), is proposed to search the reduced search space for adversarial examples. HIQPSO-RD introduces historical information into the current mean best position of QPSO to enhance the algorithm's exploration ability and prevent premature convergence of the swarm, thereby improving its convergence speed. The proposed algorithm employs the random drift local attractor technique to strike a good balance between exploration and exploitation, so that it can find better adversarial examples with fewer grammatical errors and lower perplexity. In addition, it employs a two-stage diversity control strategy to enhance search performance. Experiments on three natural language processing (NLP) datasets, with three commonly used NLP models as victim models, show that our method achieves higher attack success rates with lower modification rates than state-of-the-art adversarial attack methods. Moreover, the results of human evaluations show that adversarial examples generated by our method better maintain the semantic similarity and grammatical correctness of the original input.
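The sememe-based reduction of the search space described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary `TOY_SEMEMES`, its entries, and the helper names `substitutes`, `build_search_space`, and `search_space_size` are hypothetical, and "sharing the same sememes" is simplified here to mean an identical sememe set. A real system would draw sememe annotations from a knowledge base rather than a hand-written table.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy sememe dictionary, for illustration only.
# The words and sememe labels are assumptions, not data from the paper.
TOY_SEMEMES: Dict[str, FrozenSet[str]] = {
    "film": frozenset({"shows", "entertainment"}),
    "movie": frozenset({"shows", "entertainment"}),
    "picture": frozenset({"image"}),
    "good": frozenset({"positive"}),
    "great": frozenset({"positive"}),
    "fine": frozenset({"positive"}),
    "bad": frozenset({"negative"}),
}


def substitutes(word: str, sememes: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return the other words whose sememe set equals that of `word`.

    Assumption: two words are interchangeable only if their sememe
    sets are identical. Words without an entry get no substitutes.
    """
    target = sememes.get(word)
    if target is None:
        return []
    return sorted(w for w, s in sememes.items() if w != word and s == target)


def build_search_space(
    tokens: List[str], sememes: Dict[str, FrozenSet[str]]
) -> List[List[str]]:
    """For each position, list the original word followed by its substitutes."""
    return [[tok] + substitutes(tok, sememes) for tok in tokens]


def search_space_size(space: List[List[str]]) -> int:
    """Count the candidate sentences (product of per-position choices)."""
    size = 1
    for candidates in space:
        size *= len(candidates)
    return size


if __name__ == "__main__":
    space = build_search_space(["the", "film", "is", "good"], TOY_SEMEMES)
    print(space)
    print(search_space_size(space))
```

In this toy setting, each position offers only the original word plus its sememe-matched substitutes, so the optimizer searches a small product of per-position candidate lists instead of the full vocabulary at every position. The search itself (HIQPSO-RD in the paper) is not sketched here, because the abstract does not give its update rules.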
Original language | English |
---|---|
Pages (from-to) | (In-Press) |
Number of pages | 12 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | (In-Press) |
Publication status | Published - 19 Jun 2023 |
Keywords
- Deep neural networks (DNNs)
- natural language processing (NLP)
- particle swarm optimization (PSO)
- sememe-based substitute method
- textual adversarial attacking