Applying Markov Logic to Document Annotation and Citation Deduplication
Abstract
Structured learning approaches are able to take into account the relational
structure of data, thus promising an enhancement over non-relational
approaches. In this paper we explore two document-related tasks in a relational
setting: the annotation of semi-structured documents and citation
deduplication. For both tasks, we compare a relational learning
approach, Markov logic, with a non-relational one, Support Vector
Machines (SVM). We find that the increased complexity due to the relational
setting is difficult to manage in large-scale cases, where non-relational models
might perform better. Moreover, our experiments show that in large-scale domains
the contribution of the probabilistic component of Markov logic decreases,
and the model tends to behave like first-order logic (FOL).