Revisiting Automated Evaluation for Long-form Table Question Answering in the Era of Large Language Models (EMNLP 2024, TQA)
Is This a Bad Table? A Closer Look at the Evaluation of Table Generation from ...