MESUR summary

The Los Alamos National Laboratory (LANL) Research Library was awarded funding from the Andrew W. Mellon Foundation for a two-year project investigating usage-based scholarly evaluation metrics. The project’s Principal Investigator will be Johan Bollen, who will conduct the work under the auspices of the Research Library’s Digital Library Research & Prototyping team, led by Herbert Van de Sompel.


The shift to digital dissemination models has created a need for novel means of evaluating the scholarly communication process. Usage data has attracted considerable attention because it does not suffer from publication delays and can in principle be recorded for any type of scholarly communication item. Unfortunately, defining and validating usage-based scholarly evaluation metrics has proven problematic. Most usage data sets are recorded for the user communities of particular services, so they are not representative of the scholarly community as a whole, and they are insufficiently linked to other sources of information on the scholarly communication process to be cross-validated.


The proposed work therefore begins with the definition of a formal model of the scholarly communication process, one that semantically relates a range of bibliographic, citation and usage data. The project aims to use the model to obtain and organize data from a variety of potential sources, such as JSTOR, CiteSeer, HighWire and Medline. The Andrew W. Mellon Foundation may play a facilitating role in identifying appropriate data sources and establishing subsequent agreements. In addition, LANL will leverage local data sets such as its own usage data, recently acquired California State University usage data and licensed ISI data sets. Recent LANL-Ex Libris collaborations focused on the acquisition of usage data can complement these efforts. This work will result in a large-scale reference data set on which a program for the definition and validation of usage-based scholarly evaluation metrics can be conducted. The proposed work will first focus on determining the overall properties and structure of the scholarly communication process on the basis of this reference data set. The acquired information will subsequently inform the definition of a range of usage-based scholarly evaluation metrics. A program for the cross-validation of the defined metrics will then commence.


The project will benefit the scholarly community involved with usage-based evaluation metrics by delivering a formal model of the scholarly communication process, lessons learned from the generation of a large-scale reference data set, a range of defined and validated usage-based scholarly evaluation metrics, a set of guidelines on their semantics, and a resulting taxonomic model of the concept of scholarly status.