@inproceedings{verma-etal-2023-evaluating,
    title = "Evaluating Paraphrastic Robustness in Textual Entailment Models",
    author = "Verma, Dhruv  and
      Lal, Yash Kumar  and
      Sinha, Shreyashee  and
      Van Durme, Benjamin  and
      Poliak, Adam",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-short.76",
    pages = "880--892",
    abstract = "We present PaRTE, a collection of 1,126 pairs of Recognizing Textual Entailment (RTE) examples to evaluate whether models are robust to paraphrasing. We posit that if RTE models understand language, their predictions should be consistent across inputs that share the same meaning. We use the evaluation set to determine if RTE models{'} predictions change when examples are paraphrased. In our experiments, contemporary models change their predictions on 8--16{\%} of paraphrased examples, indicating that there is still room for improvement.",
}