[Elsnet-list] New paper and resources to support anatomical entity recognition at literature scale
Paul.Thompson at manchester.ac.uk
Mon Oct 28 12:25:45 CET 2013
Anatomical entity mention recognition at literature scale
Sampo Pyysalo and Sophia Ananiadou
Bioinformatics 2013, doi: 10.1093/bioinformatics/btt580
Anatomical entities ranging from sub-cellular structures to organ systems are central to biomedical science, and mentions of these entities are essential to understanding the scientific literature. Despite extensive efforts to automatically analyse various aspects of biomedical text, there have been only a few studies focusing on anatomical entities, and no dedicated methods for learning to automatically recognize anatomical entity mentions in free-form text have been introduced.
We present AnatomyTagger, a machine learning-based system for anatomical entity mention recognition. The system incorporates a broad array of approaches proposed to benefit tagging, including the use of UMLS- and OBO-based lexical resources, word representations induced from unlabelled text, statistical truecasing, and non-local features. We train and evaluate the system on a newly introduced corpus that substantially extends previously available resources, and apply the resulting tagger to automatically annotate the entire Open Access scientific domain literature. The resulting analyses have been applied to extend services provided by the Europe PMC literature database.
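To give a rough idea of how lexical resources can feed a machine learning tagger, here is a minimal Python sketch of per-token feature extraction. It is not AnatomyTagger's actual feature set or code: the tiny `ANATOMY_LEXICON`, the `word_shape` helper, and the `token_features` function are all hypothetical names chosen for illustration, and the real system draws on much larger UMLS- and OBO-based resources together with the other techniques listed above.

```python
import re

# Hypothetical toy lexicon (assumption); a stand-in for the far larger
# UMLS- and OBO-based lexical resources mentioned in the abstract.
ANATOMY_LEXICON = {"liver", "kidney", "mitochondria", "epithelium"}


def word_shape(token):
    """Map a token to a coarse shape: uppercase -> 'A', lowercase -> 'a', digit -> '0'."""
    shape = re.sub(r"[A-Z]", "A", token)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "0", shape)
    return shape


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i (illustrative only)."""
    token = tokens[i]
    return {
        "lower": token.lower(),
        "shape": word_shape(token),
        # Lexicon membership is one simple way a dictionary can inform a tagger.
        "in_lexicon": token.lower() in ANATOMY_LEXICON,
        # Neighbouring tokens give the classifier local context.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


if __name__ == "__main__":
    sentence = ["The", "liver", "cells", "were", "stained"]
    for position in range(len(sentence)):
        print(token_features(sentence, position))
```

Feature dictionaries of this kind are typically passed to a sequence labelling model, which then decides for each token whether it begins, continues, or lies outside an entity mention.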
All tools and resources introduced in this work are available from http://nactem.ac.uk/anatomytagger
The following resources described in the paper have all been made available under open source (MIT) and open data (CC BY-SA) licences:
- AnatEM: corpus of 1,200 documents manually annotated for 13,700 anatomical entity mentions
- AnatomyTagger: tool for the recognition of anatomical entity mentions in text
- Results of tagging all of the 600,000 PMC OA full-text documents, identifying 48M anatomical entity mentions
For these and other resources, please see the AnatomyTagger homepage: http://nactem.ac.uk/anatomytagger
School of Computer Science
National Centre for Text Mining
Manchester Institute of Biotechnology
University of Manchester
131 Princess Street
Tel: 0161 306 3091