Benchmarking Failures in Tool-Augmented Language Models
Published in NAACL, 2025
Paper on latest benchmark accepted to NAACL
Recommended citation: Eduardo Treviño*, Hugo Contant*, James Ngai, Graham Neubig, Zhiruo Wang (2025). "Benchmarking Failures in Tool-Augmented Language Models." NAACL.
Download Paper