Multi-CoreSC Annotation Guidelines

Posted on 5 December 2016 22:44

Here you can find the annotation guidelines used by experts to annotate publications with multiple Core Scientific Concepts (CoreSC) per sentence.

Multi-CoreSC CRA corpus (MCCRA)

Posted on 27 May 2016 10:54

As part of the SAPIENTA project, 50 papers from the domain of Cancer Risk Assessment (CRA) were manually annotated by three biology experts, allowing multiple core scientific concepts per sentence. The corpus and its evaluation are described in our LREC 2016 paper: Multi-label Annotation in Scientific Articles – The Multi-label Cancer Risk Assessment Corpus You…

ART Project

Posted on 10 November 2010 13:53

The project that produced the SAPIENT tool for annotation of general scientific papers.

The ART Corpus

Posted on 10 November 2010 13:53

As part of the ART project, 265 chemistry papers were manually annotated using core scientific concepts. The resultant corpus contains over 1 million words or 40,000 sentences. For further information and to download the corpus, visit: Please reference the corpus as: Liakata Maria and Soldatova Larisa. 2009. The ART corpus. Technical report, Aberystwyth University. All…

Easily browsable ART corpus

Posted on 10 November 2010 13:52

A site for browsing papers in the ART corpus, hosted at UKOLN. It contains the corpus description, and the pages can also be downloaded from there.