Paraphrase Identification: Current State of the Art
By: Zia Ul-Qayyum (Corresponding Author), Muhammad Aslam, Wasif Altaf, Muhammad Ramzan
Paraphrasing may be done at various levels: word, sentence, paragraph or discourse. From an NLP perspective, research issues related to paraphrasing include paraphrase generation, paraphrase acquisition and paraphrase identification. Paraphrase identification (PI) is an important research dimension with practical applications in domains such as Information Retrieval, Automatic Identification of Copyright Infringement, Question Answering, Natural Language Generation, Modelling Language Perception in an Intelligent Agent, and Intelligent Tutoring Systems. PI has previously been approached using various lexical, syntactic, semantic and hybrid techniques, as well as machine learning approaches. This paper provides a comprehensive overview of this research domain, along with a review of the prevalent techniques employed to address the PI problem.