Frequently Asked Questions

  1. What is SAPIENTA?
  2. What is SAPIENT?
  3. How do I install SAPIENT and/or SAPIENTA?
  4. How do I run SAPIENTA or SAPIENT as a standalone process?
  5. Do I need to be on-line to work with SAPIENTA?
  6. How do I upload a paper?
  7. How do I annotate a paper?
  8. How do I save my annotations?
  9. What is CoreSC Annotation?
  10. What is Clear CoreSC Annotations?
  11. What is OSCAR Annotation?
  12. What is Clear OSCAR Annotations?
  13. How do I remove or change an existing annotation?
  14. Where are my annotations stored?
  15. What is the Comments area for?
  16. What is the OSCAR key?
  17. How do I quit SAPIENTA or SAPIENT?
  18. “Warning unresponsive script” : How do I deal with this?
  19. My PC has 512M or less RAM available. Can I run SAPIENT or SAPIENTA?
  20. Can I have two browser tabs open at the same time pointing to different papers viewed by SAPIENT?
  21. How can I port SAPIENTA or SAPIENT to work with other XML schemas?

1. What is SAPIENTA?


SAPIENTA stands for “Semantic Annotation of Papers: Interface & ENrichment Tool Automated”. It incorporates a machine learning classifier, trained using Conditional Random Fields (CRFs), for identifying Core Scientific Concepts (CoreSCs). The classifier has been evaluated on 265 chemistry and biochemistry papers, yielding more than 50% average accuracy across the 11 CoreSCs. The automatically generated concepts have been used to produce automatic summaries, which chemistry experts evaluated in a question answering task, yielding a precision of 75% and a recall of 66%. SAPIENTA also allows multi-label annotation at the sentence level and has been used by three biology experts to annotate 50 biology papers from PubMed Central.

SAPIENTA is the successor of SAPIENT and can also be used for manual multi-label annotation of sentences.

2. What is SAPIENT?


SAPIENT stands for “Semantic Annotation of Papers: Interface & ENrichment Tool”. It is an annotation interface, implemented as a web application, that helps users annotate scientific papers in XML, sentence by sentence, with a set of concepts called Core Scientific Concepts (CoreSCs). See the Guidelines for the Annotation of General Scientific Concepts, http://ie-repository.jisc.ac.uk/88/ (note that General Scientific Concepts have since been renamed Core Scientific Concepts). CoreSCs constitute the set of concepts essential for describing a scientific investigation. However, SAPIENT can also be used with other annotation schemes to annotate papers in XML sentence by sentence (see question 21). SAPIENT also incorporates Oscar3 functionality, allowing the automatic annotation of chemical named entities.

The functionality of SAPIENT is now incorporated in SAPIENTA, which also allows the recognition of core scientific concepts automatically. Using SAPIENTA one can also perform multi-label annotation at the sentence level. We recommend that you use SAPIENTA rather than SAPIENT. However, SAPIENT is still available to download from here.

3. How do I install SAPIENT and/or SAPIENTA?


Both pieces of software are installed in the same manner.

  1. First you need to make sure you have downloaded and installed the web browser Firefox 3 or later from http://www.mozilla.com/en-US/firefox/.

    NOTE: You need Firefox 3 or later because SAPIENT and SAPIENTA use JavaScript features that earlier browsers did not enable by default. There are separate versions of SAPIENTA for Firefox 3 and for later versions of Firefox; choose the one appropriate for you. SAPIENT is only available for Firefox 3 at the moment.

    Make sure you have Java 1.6 or later. You only need the Java Runtime Environment (JRE). To check whether you have Java, open a command line (Command Prompt on Windows), type “java -version” and press Enter. This should tell you whether Java is installed and which version you have.

    If you don’t have Java, you can download the latest version for your operating system (OS) here. SAPIENTA and SAPIENT are Java-based and should therefore, in theory, run on all operating systems.
    The answers to all other questions refer to SAPIENTA, but they apply to SAPIENT as well.

  2. You have several options:
    1. Run the latest version of SAPIENTA for Firefox 3 or later. Both versions are available from here [INCLUDES CORESC AUTOMATION, RECOMMENDED]
    2. Decide to use SAPIENT from within Oscar3 [NO CORESC AUTOMATION]. In this case refer to the README.txt file released with Oscar3 for instructions on how to install and run Oscar3. A link to SAPIENT will appear from the index page of Oscar3.
    3. Decide to run SAPIENT as a standalone process. [NO CORESC AUTOMATION]

      In this case you can:

      1. Download sapient.jar from sourceforge.net or click here to download from this site.
      2. Compile your own sapient.jar from the source code on sourceforge.net. If you choose this option, we assume you know what you are doing. Make sure you have Java 1.6 or later, then run ant with the ARTbuild.xml file instead of the default build file.
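
The Java check from step 1 can also be scripted. The following is a minimal Python sketch and not part of SAPIENT or SAPIENTA; all function names are hypothetical. It relies on the fact that “java -version” prints its banner on the error stream, and it treats any version numbered 1.6 or higher as acceptable.

```python
import re
import subprocess


def parse_java_version(banner: str):
    """Return (major, minor) from a `java -version` banner, or None if absent."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def java_is_recent_enough(banner: str, minimum=(1, 6)) -> bool:
    """True when the reported version is at least `minimum` (Java 1.6 by default)."""
    version = parse_java_version(banner)
    return version is not None and version >= minimum


def read_java_banner() -> str:
    """Run `java -version`; the JVM writes its banner to stderr, not stdout."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr or result.stdout


if __name__ == "__main__":
    try:
        banner = read_java_banner()
    except FileNotFoundError:
        print("Java was not found on the PATH.")
    else:
        print("Java 1.6 or later:", java_is_recent_enough(banner))
```

Older releases report themselves as 1.x (for example 1.6.0) and newer ones as a plain major number (for example 17.0.2); comparing the parsed pair against (1, 6) handles both styles.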

4. How do I run SAPIENTA or SAPIENT as a standalone process?


If you have Firefox (3 or later) and Java 1.6 or later installed, you can run SAPIENTA [SAPIENT only runs with Firefox 3 at the moment]. Open a command prompt and navigate to the “sapienta” directory you have just created (see Q3).

Type: “java -Xmx512m -jar sapienta.jar Server”

The first time you run this, you will be asked a series of questions:

  1. Whether you want to configure a server. Answer ‘yes’.
  2. Whether you want a full web server. Answer ‘full’.
  3. Whether you would like to specify a working directory. Just press Enter.
  4. Whether you want to lock the server down so that it can only be accessed from the machine it is running on. Answer ‘yes’.

If the server setup is successful, you should see a four-line message at the command prompt, one line of which should be: Server ready – go to http://127.0.0.1:8181/.

In Firefox, navigate to the URL: http://localhost:8181. You should be able to see the SAPIENT Index now!

NOTE: The first time you run sapient.jar as described above, it will create new folders in your sapient directory. Put any files for annotation in the “corpora” folder. Annotations will be stored automatically in the “scrapbook” folder.
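
To confirm from a script that the server is answering, a check along the following lines may help. This is a minimal Python sketch, assuming the server listens at the address given above; the function name server_is_up is hypothetical and not part of SAPIENT or SAPIENTA.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://localhost:8181/", timeout: float = 3.0) -> bool:
    """Return True if something answers HTTP requests at `url`."""
    # The server is local, so any system-wide proxy settings are bypassed.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means that a server is listening.
        return True
    except OSError:
        # Covers refused connections and timeouts (URLError is an OSError).
        return False


if __name__ == "__main__":
    print("Server reachable:", server_is_up())
```

The check only shows that some web server answers on that port; it does not verify that the page served is the SAPIENT Index.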

5. Do I need to be on-line to work with SAPIENTA?


For manual annotation, you don’t need to be on-line to work with either SAPIENT or SAPIENTA.
Each runs its own, safe web server, which is locked down so that it cannot be reached from other machines. If you have SAPIENT or SAPIENTA running (see Q4), you can access the SAPIENT/SAPIENTA annotation interface at http://localhost:8181.

HOWEVER, for automatic annotation with SAPIENTA you do need to be online, as papers are sent to an external server which runs the automation process.

6. How do I upload a paper?


SAPIENTA allows one to upload papers in XML. The schemas it currently supports are SciXML and the PubMed Central DTD. However, SAPIENTA should be able to handle any XML document which has the elements <PAPER></PAPER>, <TITLE></TITLE>, <ABSTRACT></ABSTRACT> and <BODY></BODY>. The minimal DTD that would work with SAPIENTA is available for download here.
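
As a rough check before uploading, the presence of these four elements can be tested with a few lines of Python. This is only a sketch: it assumes that PAPER is the root element and that the other three may appear anywhere beneath it, which the text above does not state. The minimal DTD remains the authoritative description, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements expected somewhere inside the (assumed) PAPER root element.
REQUIRED_ELEMENTS = ("TITLE", "ABSTRACT", "BODY")


def missing_elements(xml_text: str) -> list:
    """List the required element names that the document lacks."""
    root = ET.fromstring(xml_text)
    missing = []
    if root.tag != "PAPER":
        missing.append("PAPER")
    for name in REQUIRED_ELEMENTS:
        # iter() walks the whole tree, so nesting depth does not matter here.
        if next(root.iter(name), None) is None:
            missing.append(name)
    return missing


sample = "<PAPER><TITLE>t</TITLE><ABSTRACT>a</ABSTRACT><BODY>b</BODY></PAPER>"
print(missing_elements(sample))  # prints []
```

An empty list means all four elements were found; it does not guarantee that the document validates against the DTD.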

On the SAPIENTA Index page, click on “Browse” to locate the folder containing the paper(s) you want to upload (we recommend that you store papers in the “corpora” directory). Select a paper and click “Open”. You then need to give it a name, preferably the name of the paper without the suffix. Click on “Upload”. A link to the paper should appear on the page, and a folder with the same name as the paper you have just uploaded will appear in the “scrapbook” folder. From then on, all previously uploaded papers should appear as links when you go to the SAPIENT Index.

7. How do I annotate a paper?


It is recommended that you read the paper first in .pdf prior to annotating it sentence by sentence. When annotating the papers you should also have annotation guidelines handy.


At the SAPIENT Index page, click on the paper you want to annotate. This will re-direct you to a new page, where the paper is displayed sentence by sentence.


Annotation involves selecting for each sentence an option from EACH of the three drop-down menus below it. Please do not leave sentences with incomplete annotations.


a) The first drop-down corresponds to the types of CoreSC one can assign to a sentence. The CoreSCs are also visible at all times in the top menu bar.


b) Depending on the type of CoreSC you have chosen, you may also need to specify properties of the CoreSC, which correspond to the second drop-down (subtypes). For most CoreSC types the only subtype option is <None>, which means that there are no properties to be chosen. <Object>, <Method> and <Motivation> (the latter in SAPIENTA but not SAPIENT) are exceptions. <Method> can have the properties <New>/<Old>, specifying whether the method is new or old (see the annotation guidelines), or the properties <Advantage>/<Disadvantage>. The latter can be chosen when a sentence has already been annotated as <Method> and the current sentence refers to an advantage or disadvantage of that particular method. Similarly, <Object> can have the properties <New> and <Advantage>/<Disadvantage>. <Motivation> can have the properties <New>/<Old>/<Future>.


c) The third drop-down corresponds to concept identifiers (IDs). A concept may span several sentences, so we cannot rely on sentence IDs (the numbers to the left of each sentence) to keep track of different concepts. Concept IDs are contingent upon the type of CoreSC. Once you have selected the CoreSC type of a sentence, its concept ID can either be that of an already annotated sentence of the same CoreSC type or a new ID, for a new instance of this CoreSC type. To choose between the two possibilities, decide whether the sentence talks about the same CoreSC concept as a previous sentence or not.
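
The relationship between the three drop-downs can be pictured with a small data model. The Python sketch below is purely illustrative and says nothing about how SAPIENT or SAPIENTA store annotations internally; the class, its field names and the integer IDs starting at 1 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SentenceAnnotation:
    sentence_id: int
    coresc_type: Optional[str] = None  # first drop-down
    subtype: Optional[str] = None      # second drop-down; the string "None" is a valid choice
    concept_id: Optional[int] = None   # third drop-down

    def is_complete(self) -> bool:
        """An annotation counts only when all three drop-downs have a value."""
        return None not in (self.coresc_type, self.subtype, self.concept_id)


def next_concept_id(annotations: List[SentenceAnnotation], coresc_type: str) -> int:
    """Return a fresh ID for a new concept of the given CoreSC type."""
    used = [a.concept_id for a in annotations
            if a.coresc_type == coresc_type and a.concept_id is not None]
    return max(used, default=0) + 1


done = [
    SentenceAnnotation(1, "Method", "New", 1),
    SentenceAnnotation(2, "Method", "New", 1),   # same method as sentence 1
    SentenceAnnotation(3, "Object", "None", 1),  # IDs are counted per CoreSC type
]
print(next_concept_id(done, "Method"))      # prints 2
print(next_concept_id(done, "Motivation"))  # prints 1
```

Sentences 1 and 2 share concept ID 1 because they describe the same method, while a sentence introducing a different method would receive ID 2.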

8. How do I save my annotations?


This question only applies to manual annotation with SAPIENTA or SAPIENT, as automatic annotations are saved when the classifier results are transferred to your machine.

When you manually annotate a sentence by selecting an option for each of its drop-downs, these changes will last for as long as you keep the browser open and the server running.


To save your changes to a file, click on the link “Save” in the top menu bar. An alert will confirm that the changes have been saved; sometimes you may need to wait a couple of seconds. The annotations are translated into XML and saved in the file “mode2.xml”, in the folder of the scrapbook directory that corresponds to the current paper. The “mode2.xml” for each paper is updated every time you make a change to an annotation and click on “Save”. It is also loaded and translated back into HTML every time you click on the current paper from the SAPIENT Index.


You don’t have to save after each and every change and you don’t have to completely finish a paper in one go. You can work however you like as long as you remember to save your changes before quitting SAPIENT or closing the browser.


Note: To annotate a sentence, you need to choose an option from ALL three drop-downs. If you don’t specify a concept ID, the annotation will not be saved, even if you click on “Save”.


If you want to transfer your work to another machine, copy the contents of your “scrapbook” and “corpora” folders across.
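
Copying the two folders can be done by hand or scripted. The following Python sketch shows one way to do it; the function name copy_work and the idea of passing the installation directory as an argument are assumptions made for illustration, not part of SAPIENT or SAPIENTA.

```python
import shutil
from pathlib import Path

# The two folders named above that hold papers and annotations.
FOLDERS = ("scrapbook", "corpora")


def copy_work(install_dir, destination):
    """Copy the scrapbook and corpora folders into `destination`."""
    install_dir, destination = Path(install_dir), Path(destination)
    copied = []
    for name in FOLDERS:
        source = install_dir / name
        if source.is_dir():
            # dirs_exist_ok (Python 3.8+) merges into an existing folder,
            # overwriting files that have the same name.
            shutil.copytree(source, destination / name, dirs_exist_ok=True)
            copied.append(name)
    return copied
```

The function returns the names of the folders it actually found and copied, so an empty list suggests that the wrong installation directory was given. Save your annotations before copying, since unsaved changes are not yet in the files.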

9. What is CoreSC Annotation?


This question applies to SAPIENTA but not SAPIENT.

SAPIENTA allows users to automatically annotate core scientific concepts in papers (CoreSC).
You need to be online for this feature to work.
When you click on the “CoreSC Annotation” link, the system asks for your email address and your paper is submitted for classification on an EBI server, which runs a CoreSC machine learning classifier trained using CRFs. You will be notified by email once the paper has been classified (it takes about 1-2 minutes for the queuing system to detect it). This way you can upload several papers at once.

10. What is Clear CoreSC Annotations?


If, rather than removing a single sentence annotation from a paper as described in question 13, you want to remove ALL of your CoreSC sentence annotations for that paper, click on “Clear CoreSC Annotations”.


We do not recommend using this option unless you want to start annotating a paper from scratch. “Clear CoreSC Annotations” takes effect immediately, removing all annotations from the mode2.xml file without requiring “Save”, so be careful!

Note: In SAPIENT this feature used to be called “Clear Own Annotation”

11. What is OSCAR Annotation?


In SAPIENTA, if you click on the link “OSCAR Annotation” in the top menu bar, SAPIENTA will invoke Oscar3, a system for the automatic annotation of noun phrases representing chemical entities. These annotations are colour-coded and you can consult the “OSCAR key” for their interpretation.


You can remove these annotations at any point by clicking on “Clear OSCAR Annotations”.

Note that in SAPIENT, these features were called “Auto Annotation” and “Clear Auto Annotation” respectively.

12. What is Clear OSCAR Annotations?


See question 11.

Note that in SAPIENT this feature used to be called “Clear Auto Annotations”.

13. How do I remove or change an existing annotation?


To remove or change a single sentence annotation, modify the options selected for that particular sentence accordingly and click on “Save”.

14. Where are my annotations stored?


Once you click on “Save”, your annotations are stored in “mode2.xml”, in the folder inside “scrapbook” that bears the same name as the current paper.

15. What is the Comments area for?


If you want to make remarks about particular sentences while annotating a paper (e.g. if you had difficulties making a decision, you may want to keep a note of the alternative CoreSCs you considered), you can use the comments area. When you click “Save”, any comments you have entered are saved in the “comments.txt” file in the scrapbook folder for that paper.


Every time you edit the comments area and save the changes, the “comments.txt” file of the corresponding paper is updated.

16. What is the OSCAR key?


See question 11.

17. How do I quit SAPIENTA or SAPIENT?


To quit SAPIENT, close the Firefox tab pointing to http://localhost:8181 and stop the sapient.jar process running in the command line. (Windows users can select the command prompt window, press Ctrl + C, then close the window.)


Remember that you need to restart the server as in Q4 to use SAPIENT again. For your convenience, you may want to bookmark the server address (http://localhost:8181).

18. “Warning unresponsive script” : How do I deal with this?


This message is less likely to occur with SAPIENTA.

It may come up when you click to view a paper in SAPIENT which you have already annotated. It probably happens because the JavaScript is too memory-heavy for your computer. Just click “continue” every time the message pops up, and eventually the paper will display fully.


Please DON’T click “Save” before the script has finished loading (or after cancelling the script), as this will only partially save your annotations.

Another remedy to stop this problem from occurring is to allocate less RAM to SAPIENT or SAPIENTA when starting the server. For example you could try “java -Xmx249m -jar sapient.jar Server”.

Finally, the more expensive solution (with long term benefits, though) is a RAM upgrade.

19. My PC has 512M or less RAM available. Can I run SAPIENT or SAPIENTA?


Yes, you can run SAPIENT or SAPIENTA, but you will have to allocate less memory to it. When you run the server, try using “java -Xmx249m -jar sapient.jar Server”.

You are also more likely to experience warnings about unresponsive JavaScript.
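
If you start the server with different memory limits on different machines, the command can be assembled by a small helper. This Python sketch is illustrative only; the function name server_command is hypothetical, and the only facts it takes from this page are the jar names, the “Server” argument and the -Xmx heap option.

```python
def server_command(jar="sapienta.jar", max_heap_mb=512):
    """Build the argument list for starting the server with a given heap limit."""
    if max_heap_mb <= 0:
        raise ValueError("max_heap_mb must be positive")
    return ["java", f"-Xmx{max_heap_mb}m", "-jar", jar, "Server"]


print(" ".join(server_command("sapient.jar", 249)))
# prints: java -Xmx249m -jar sapient.jar Server
```

The resulting list can be passed to subprocess.run from the directory containing the jar. Because the first run asks its configuration questions interactively, it is best started from a terminal.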

20. Can I have two browser tabs open at the same time pointing to different papers viewed by SAPIENT?


Yes, there is no reason why you cannot have two browser tabs open looking at SAPIENT or SAPIENTA. However, if you have 512M RAM or less it may cause your browser to crash.

21. How can I port SAPIENTA or SAPIENT to work with other XML schemas?


There are two aspects to this question, namely the following:


(a) How can SAPIENT recognise papers written in other XML schemas?


(b) How can SAPIENT annotate papers according to annotation schemas other than CISP?


In answer to (a), see question 6.

Two example documents are downloadable from the Software page of this site to give you an indication of how SAPIENT will work. The first one, test.xml, is a full paper in SciXML and the second, testsmall.xml is a minimal version of a document that will be accepted by SAPIENT.

In answer to (b), in order to write your own sentence-based schemas to use with SAPIENT, you need to obtain the source code for SAPIENT (the source code for SAPIENTA is not available for download at this stage). To this, you need to add a new .xsl file in the uk.ac.aber.art_tool.art_tool_web.xsl package and substitute this new file for mode2.xsl wherever it is referenced in ARTSciXMLDocument.java. The latter class is found in the uk.ac.aber.art_tool package.

To give you an example of an alternative annotation schema, we have included the dummy fruit.xsl. Instructions are available as comments in fruit.xsl, mode2.xsl and art-tool.js. Re-compile SAPIENT by running ant and you will have a new version of SAPIENT that works with your particular sentence-based annotation schema.