Posts by Collection

publications

Benchmarking Failures in Tool-Augmented Language Models

Published in NAACL, 2025

Paper on our latest benchmark, accepted to NAACL 2025

Recommended citation: Eduardo Treviño*, Hugo Contant*, James Ngai, Graham Neubig, Zhiruo Wang (2025). "Benchmarking Failures in Tool-Augmented Language Models." NAACL.

teaching

Module 0: Intro to Competition Math

Daily Challenge @ Expii, High School 2022

TA for Intro to Competition Math

10-708: Probabilistic Graphical Models

Carnegie Mellon Machine Learning Department, Spring 2025

TA for a graduate course in graphical and generative models