ICTIR ’15 paper on Entity Linking in Queries
In this paper, we differentiate between the tasks of entity linking in documents, entity linking in queries, and semantic mapping (entity ranking). We also present the proper evaluation metrics for entity linking in queries, examine the publicly available datasets, and introduce a new manually curated dataset for the task. We further show effective methods for addressing it. All the resources (including the dataset, runs, and evaluation scripts) and the slides are made publicly available.
Below, I summarize the paper and highlight the take-away messages:
– The output of entity linking in documents is a set of entity-mention pairs, while the output of entity linking in queries should be a set of entity set(s), referred to as interpretation sets. Entity linking in queries should not be addressed as the task of semantic mapping, which ranks entities based on their relevance to the query (see Table 1 and Section 3).
– Entity linking in queries should be evaluated by Precision, Recall and F-measure (see Section 3.4). Some refinements and a new evaluation metric (lean evaluation) are proposed (see Equations 2-7).
– The semantic mapping task should not be considered as an entity linking task. Entity disambiguation is an essential part of entity linking, which is completely ignored in semantic mapping (see Section 3).
– Comparing semantic mapping to the results generated by traditional entity linking methods is inappropriate and results in false and misleading claims (see Section 6.3).
– A new dataset, Y-ERD, is introduced and made publicly available for the task of entity linking in queries. This dataset contains 2398 queries, which is much larger than the previously introduced ERD-dev dataset (91 queries) (see Section 4).
– A simple, yet very effective, entity ranking model is introduced (see Equation 17).
– An effective greedy algorithm (GIF) is introduced for the task of interpretation finding, which is the final goal of entity linking in queries (see Section 5.3).
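To make the set-based evaluation mentioned above more concrete, here is a minimal Python sketch of precision, recall, and F-measure computed over interpretation sets. It is an illustration only, not the paper's evaluation script: the function name `evaluate_query`, the example query and entity names, and the handling of empty sets are all assumptions of mine, and the exact definitions (including the lean variant) are given in Section 3.4 and Equations 2-7 of the paper.

```python
from typing import FrozenSet, Set, Tuple

# Each entity set (one interpretation of the query) is modelled as a frozenset
# of entity identifiers; a query's annotation is a set of such entity sets.
Interpretation = FrozenSet[str]


def evaluate_query(
    gold: Set[Interpretation], predicted: Set[Interpretation]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for a single query.

    An entity set counts as correct only if it matches a gold entity set
    exactly. Empty-set handling is an assumption of this sketch: if both
    gold and predicted are empty, the system is given full credit.
    """
    if not gold and not predicted:
        return 1.0, 1.0, 1.0
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations for an ambiguous query with two valid readings.
gold = {
    frozenset({"New York-style pizza", "Manhattan"}),
    frozenset({"New York City", "Manhattan"}),
}
predicted = {
    frozenset({"New York-style pizza", "Manhattan"}),
    frozenset({"New York City"}),  # incomplete set, so it does not match
}

print(evaluate_query(gold, predicted))  # (0.5, 0.5, 0.5)
```

Note that the second predicted set gets no partial credit for containing one correct entity. This is what separates evaluating sets of entities from evaluating a ranked list of individual entities, as is done in semantic mapping.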