
Adversarial Math Word Problem Generation

Publication, Conference
Xie, R; Huang, C; Wang, J; Dhingra, B
Published in: EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
January 1, 2024

Large language models (LLMs) have significantly transformed the educational landscape. As current plagiarism detection tools struggle to keep pace with LLMs' rapid advancements, the educational community faces the challenge of assessing students' true problem-solving abilities in the presence of LLMs. In this work, we explore a new paradigm for ensuring fair evaluation: generating adversarial examples that preserve the structure and difficulty of the original questions intended for assessment but are unsolvable by LLMs. Focusing on the domain of math word problems, we leverage abstract syntax trees to structurally generate adversarial examples that cause LLMs to produce incorrect answers by simply editing the numeric values in the problems. We conduct experiments on various open- and closed-source LLMs, quantitatively and qualitatively demonstrating that our method significantly degrades their math problem-solving ability. We identify shared vulnerabilities among LLMs and propose a cost-effective approach to attack high-cost models. Additionally, we conduct automatic analysis to investigate the cause of failure, providing further insights into the limitations of LLMs.
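The abstract's core idea, representing a problem's solution as an abstract syntax tree and editing only the numeric values, can be illustrated with a minimal sketch. This is not the authors' implementation: the template, the solution expression, the sampling range, and the "positive integer answer" filter are all assumptions chosen for illustration, and the names `evaluate` and `generate_variants` are hypothetical.

```python
import ast
import operator
import random

# Binary operators supported by this toy evaluator (an illustrative subset).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node, values):
    """Recursively evaluate an arithmetic expression AST with the given variable values."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, values)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = evaluate(node.left, values)
        right = evaluate(node.right, values)
        return _OPS[type(node.op)](left, right)
    if isinstance(node, ast.Name):
        return values[node.id]
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    raise ValueError(f"unsupported node: {ast.dump(node)}")


def generate_variants(template, solution, original, n, seed=0, max_tries=1000):
    """Return up to n (problem text, answer) pairs with new numeric values.

    The solution expression is parsed once; only the numbers change, so every
    variant shares the original problem's structure. Keeping only positive
    integer answers is an assumed stand-in for a difficulty constraint.
    """
    tree = ast.parse(solution, mode="eval")
    rng = random.Random(seed)
    variants = []
    for _ in range(max_tries):
        if len(variants) == n:
            break
        # Assumed sampling rule: each value drawn from 1 to 10x its original.
        values = {name: rng.randint(1, 10 * v) for name, v in original.items()}
        if values == original:
            continue
        try:
            answer = evaluate(tree, values)
        except ZeroDivisionError:
            continue
        if answer > 0 and answer == int(answer):
            variants.append((template.format(**values), int(answer)))
    return variants


# Hypothetical example problem, not taken from the paper.
TEMPLATE = ("Ava has {a} apples and buys {b} bags with {c} apples each. "
            "How many apples does she have now?")
SOLUTION = "a + b * c"
ORIGINAL = {"a": 3, "b": 4, "c": 5}
```

Calling `generate_variants(TEMPLATE, SOLUTION, ORIGINAL, n=3)` yields three reworded-number versions of the problem together with their recomputed answers. In an adversarial setting, each variant would then be posed to a target model and retained only if the model answers incorrectly; that selection step is omitted here.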


Published In

EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024

DOI

10.18653/v1/2024.findings-emnlp.292

Publication Date

January 1, 2024

Start / End Page

5075 / 5093
 

Citation

APA: Xie, R., Huang, C., Wang, J., & Dhingra, B. (2024). Adversarial Math Word Problem Generation. In EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024 (pp. 5075–5093). https://doi.org/10.18653/v1/2024.findings-emnlp.292

Chicago: Xie, R., C. Huang, J. Wang, and B. Dhingra. “Adversarial Math Word Problem Generation.” In EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024, 5075–93, 2024. https://doi.org/10.18653/v1/2024.findings-emnlp.292.

ICMJE: Xie R, Huang C, Wang J, Dhingra B. Adversarial Math Word Problem Generation. In: EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024. 2024. p. 5075–93.

MLA: Xie, R., et al. “Adversarial Math Word Problem Generation.” EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024, 2024, pp. 5075–93. Scopus, doi:10.18653/v1/2024.findings-emnlp.292.

NLM: Xie R, Huang C, Wang J, Dhingra B. Adversarial Math Word Problem Generation. EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024. 2024. p. 5075–5093.
