[Elsnet-list] text mining of full text articles/books

John McNaught John.McNaught at manchester.ac.uk
Fri Sep 30 12:51:56 CEST 2011


Dear elsnetters,

This mail is primarily addressed to academic researchers, and relates to
proposals by the UK Government to introduce legislation creating a
copyright exception for non-commercial text and data mining. Experiences
of the kind described below in other areas of NLP/CL (e.g. corpus
linguistics, machine translation, multidocument summarisation) would
also be of interest.

I would be interested to hear from any academic researcher (from any
country) who has attempted to obtain access to published content
(especially full text articles and books) for research purposes
involving text mining or data mining, and has not been successful in
obtaining such access. Brief details are perfectly OK, e.g.

Institution/Country:
Research envisaged:  (very brief generic indication e.g. 1 sentence)
Reason(s) for failure to obtain access:
e.g. (by no means a closed list)
* blanket refusal
* read licensing conditions and gave up at that point (which particular
conditions presented barriers?)
* protracted negotiations leading nowhere, life is too short, gave up
* would have had to contact too many publishers to seek permission
* could not feasibly assign individual author attribution especially in
data mining phases
* payment requested even though your institution subscribes to the
journals or the e-books
* could not release results to the community, or build services on them,
so not worthwhile to pursue
  A special case of this: got access only within a collaborative research
  project involving the publisher as a data provider, but could not use
  content or results outside that project for the benefit of the community
* format or broker issues (told "you have access already via your
institutional subscriptions" but this turns out to be access for humans
via some intermediate application that prevents or hinders text mining)

Any information of the above kind would be very welcome in helping to
form an evidence base.

John McNaught

-- 
John McNaught                   John.McNaught at manchester.ac.uk
School of Computer Science

and
Deputy Director
National Centre for Text Mining
Manchester Interdisciplinary Biocentre
University of Manchester
131 Princess Street                      tel: +44.161.306.3098
Manchester                               fax: +44.161.306.5201
M1 7DN                                   web: www.nactem.ac.uk
UK                                            www.textminingcentre.ac.uk


