Preserving the Integrity of Large Language Models: Strategies Against Adversarial Attacks and Input Distortions
Keywords: Large Language Models, Robustness, Adversarial Attacks, Input Perturbations, Adversarial Training, Robust Optimization, Input Preprocessing, Vulnerabilities

Abstract
Large language models (LLMs) have demonstrated unprecedented performance across diverse natural language processing tasks, yet their vulnerability to adversarial attacks and input distortions raises concerns about their integrity and reliability. This paper investigates strategies for preserving the integrity of LLMs against adversarial attacks and input distortions, including adversarial training, robust optimization, and input preprocessing. By addressing this crucial issue of integrity preservation, the research contributes to the development of trustworthy and dependable large language models for real-world applications.
Published: 01-05-2024
Section: Articles
How to Cite
Preserving the Integrity of Large Language Models: Strategies Against Adversarial Attacks and Input Distortions. (2024). Asian American Research Letters Journal, 1(1). https://aarlj.com/index.php/AARLJ/article/view/10