Transformation of MSNBC and ACE2004 to NIF

MSNBC and ACE2004 are two popular gold-standard datasets, commonly used to evaluate the quality of Entity Linking (EL) approaches.

MSNBC, available here, was constructed from MSNBC news articles and an automatic named-entity disambiguation process. The original ACE2004, available from here, contains an annotated subset of the ACE co-reference dataset.

Given the increasing availability of tools that work with NIF, we transformed both datasets to NIF and make the new versions available: MSNBC and ACE2004. This transformation is part of our work [1].
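To give a rough idea of what an entity annotation looks like in NIF (the NLP Interchange Format), the following is a minimal sketch that serializes one mention as Turtle. It is an illustration under stated assumptions, not the converter used to produce these datasets: the function name mention_to_nif, the document URI http://example.org/doc1, and the sample sentence are all hypothetical, and the actual files may use additional or different properties.

```python
# Minimal sketch: serialize one entity mention as NIF (Turtle syntax).
# The document URI, sentence and helper names below are hypothetical.

PREFIXES = (
    "@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .\n"
    "@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def escape_literal(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def mention_to_nif(doc_uri, text, surface, entity_uri):
    """Return Turtle for a context (the whole text) and one linked mention.

    Only the first occurrence of `surface` in `text` is annotated.
    """
    begin = text.index(surface)          # character offset where the mention starts
    end = begin + len(surface)           # exclusive end offset
    context = f"<{doc_uri}#char=0,{len(text)}>"
    mention = f"<{doc_uri}#char={begin},{end}>"

    return (
        PREFIXES + "\n"
        # The context resource holds the full document text.
        f"{context} a nif:Context, nif:String, nif:RFC5147String ;\n"
        f'    nif:isString "{escape_literal(text)}" ;\n'
        f'    nif:beginIndex "0"^^xsd:nonNegativeInteger ;\n'
        f'    nif:endIndex "{len(text)}"^^xsd:nonNegativeInteger .\n\n'
        # The mention points back to its context and to the linked entity.
        f"{mention} a nif:String, nif:RFC5147String ;\n"
        f"    nif:referenceContext {context} ;\n"
        f'    nif:anchorOf "{escape_literal(surface)}" ;\n'
        f'    nif:beginIndex "{begin}"^^xsd:nonNegativeInteger ;\n'
        f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger ;\n'
        f"    itsrdf:taIdentRef <{entity_uri}> .\n"
    )


example = mention_to_nif(
    "http://example.org/doc1",                 # hypothetical document URI
    "Barack Obama visited Santiago.",          # hypothetical sentence
    "Santiago",
    "http://dbpedia.org/resource/Santiago",
)
print(example)
```

Each mention is identified by its character offsets within the document text, so tools that consume NIF can recover the annotated span and its target entity without re-tokenizing the text.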

[1] Henry Rosales-Méndez, Aidan Hogan and Barbara Poblete. NIFify: Supporting NIF for Entity Linking. (in progress)