Strategic assessment of research performance indicators

- an ARC Linkage Project


Project personnel

Researchers

Research Assistants

The increasing importance of quantitative indicators of research performance

In most OECD countries, increasing emphasis is being laid on greater public accountability, with a need to demonstrate the effectiveness and efficiency of government-supported research. A workshop held by the OECD in 1997 characterised recent evaluation of basic research as “a rapid growth industry” (OECD 1997).
This new demand for research evaluation cannot be fully met by traditional peer review, which has only a finite capacity. Researchers can devote only a limited proportion of their time to peer evaluation before their own work begins to suffer. As a result, there has been an increased use of quantitative performance indicators, which have the added advantage of being more cost-efficient. Peer review at the institutional or system level is very expensive - it has been estimated that the British Research Assessment Exercise (RAE) expends 10% of available research funding on the process (Schnitzer & Kazemzadeh 1995).
Australia is no exception to this changing policy environment. The first major study on performance measures for universities reported little systematic use of research indicators at the level of department or institution (Bourke 1985). By 1991, the use of a wide range of performance indicators was being proposed, including the establishment of a research publications collection (Linke et al. 1991).
The introduction of the Research Quantum (RQ) in the early 1990s saw the first distribution of research funds to universities based on a formula encapsulating a number of performance measures (graduate student numbers or completion rates, research income, and publications). The RQ was subsequently replaced by two new funding schemes - the Institutional Grants Scheme (IGS) and the Research Training Scheme (RTS). Both use a formula comprising the same three elements, though the weighting for each element varies between schemes. In 1996, a study undertaken on behalf of the National Board of Employment, Education and Training reported that most universities used some research performance measures as a basis for the distribution of a portion of their research funds within the institution.
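The general shape of such a formula can be sketched in a few lines of Python. This is only an illustration of a weighted, share-based allocation: the weights, the institution names and all figures below are hypothetical and are not the actual RQ, IGS or RTS parameters, which are not given here.

```python
def allocate_funds(pool, institutions, weights):
    """Split `pool` in proportion to each institution's weighted share
    of the sector total on every element of the formula."""
    # Sector-wide total for each element (e.g. all publications in the sector).
    totals = {k: sum(inst[k] for inst in institutions.values()) for k in weights}
    for element, total in totals.items():
        if total == 0:
            raise ValueError(f"sector total for {element!r} is zero")

    # Weighted sum of each institution's share on every element.
    scores = {
        name: sum(weights[k] * inst[k] / totals[k] for k in weights)
        for name, inst in institutions.items()
    }
    total_score = sum(scores.values())
    return {name: pool * score / total_score for name, score in scores.items()}


# Hypothetical data: two institutions, three formula elements.
institutions = {
    "University A": {"students": 100, "income": 2.0, "publications": 300},
    "University B": {"students": 300, "income": 6.0, "publications": 300},
}
# Hypothetical weights; a different scheme would simply use different values.
weights = {"students": 0.3, "income": 0.6, "publications": 0.1}

allocation = allocate_funds(1000.0, institutions, weights)
```

With these invented figures, University A receives roughly 275 and University B roughly 725 of a pool of 1000. Changing only the `weights` dictionary shows how two schemes built on the same three elements can produce different distributions.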
The application of quantitative performance indicators - be it in funding formulas or elsewhere - is not without problems. An indication of this can be seen in the public discussions and opposing views on what is actually measured by performance indicators and how they affect the research enterprise. For example, the publications element in the RQ/IGS/RTS formulas attracted a considerable amount of attention in submissions to recent government enquiries (White Paper 1999, Senate 2001, Batterham 2000). A proposal to remove the publications component from the formula was vigorously opposed by those institutions that believed they would be financially disadvantaged.
Another source of tension within the sector is the internal use of variations of the RQ formula by many institutions to distribute research funds to faculties or even to individuals (Anderson et al. 1996, Marginson & Considine 2000). This has occurred in spite of the fact that the formula was never designed for intra-institutional funding allocations (Strand 1998).
It is essential that when the deployment of performance measures in the higher education sector results in substantive impact, the choice of measures used rests on a sound knowledge base covering their validity, fairness, transparency, independence, cost and the impact they will have on the research enterprise. By assessing an extensive range of performance measures, this study will provide Australian research management and research policy makers with rigorous information on which to base informed judgements on the utilisation of quantitative indicators of research performance.

Critical assessment of performance indicators

The assessment of research performance on the basis of quantitative indicators must meet a number of requirements, of which three are paramount. Firstly, it must accurately measure the characteristics which have been selected as the basis for the assessment. Secondly, it must be just - it must not create disadvantages for institutions or for researchers working in specific fields for reasons which cannot be influenced by them. And finally, its effects on research must be consistent with policy objectives. It may be impossible for every performance measure to satisfy each requirement, and trade-offs will frequently have to be made (Sizer 1998).
Quantitative indicators of research performance are used in two different contexts. They are increasingly applied systematically to rank the performance of institutions, groups or individuals, or to feed into formulas used to distribute research funds. Despite the rapid development of this use of quantitative measures, no thorough critical assessment of the indicators used has been undertaken in Australia or elsewhere (OECD 1997). Criticisms of the RQ by Anderson et al. (1996) and Marginson & Considine (2000) remain on a descriptive level and are not backed up by empirical research.
A recent investigation of Australia’s scientific output gives rise to concerns about the continued use of formulas in their existing forms (Butler 2001, 2002). It documents a significant increase in the country’s journal output, accompanied by a worrying decrease in the relative international impact of these publications as measured by citations. The timing of this productivity increase in relation to the introduction of funding formulas suggests that there is a causal relationship. This assumption is further supported by micro-sociological studies on researchers’ adaptive behaviour (Knorr-Cetina 1981), anecdotal evidence from an Australian study (Marginson & Considine 2000), and results of an extensive survey among Australian academics (Taylor 2001a, b). However, none of these studies provides conclusive evidence of the RQ’s impact on research practice. Moreover, the indicators applied in this formula have not been investigated separately and therefore cannot be assessed individually.
Quantitative indicators are also being applied in ad hoc evaluations. In this context, science studies have focused on the application of bibliometric indicators for the evaluation of research performance. A number of problems have been identified where such evaluations rest on a single indicator. Using data from the Institute for Scientific Information (ISI), sociologists of science have demonstrated that publication counts are a poor measure of quality. Contrasting publication and citation counts in an analysis of American physicists, Cole and Cole (1967) classified scientists into four different types according to their publication practices. They found that publication counts tend to overrate “mass producers” (high publication counts, low citation counts) while underrating “perfectionists” (low publication counts, high citation counts). Later analyses also showed that it is misleading to apply simple publication counts across fields because publication practices (and thus average numbers of publications per year) are field-specific (Moed et al. 1985a).
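The two-by-two typology described above can be illustrated with a short Python sketch. The thresholds separating "high" from "low" counts are hypothetical, as are the example figures. Only the labels "mass producer" and "perfectionist" are taken from the text; the other two labels are descriptive placeholders.

```python
def classify_scientist(publications, citations, pub_threshold, cite_threshold):
    """Place a scientist in one of four cells by publication and citation counts.

    The thresholds are assumptions supplied by the caller; the text does not
    say how the original study drew the line between high and low counts.
    """
    high_pubs = publications >= pub_threshold
    high_cites = citations >= cite_threshold

    if high_pubs and not high_cites:
        return "mass producer"        # overrated by simple publication counts
    if high_cites and not high_pubs:
        return "perfectionist"        # underrated by simple publication counts
    if high_pubs and high_cites:
        return "high output, high citations"   # placeholder label
    return "low output, low citations"         # placeholder label


# Hypothetical scientists as (publications, citations) pairs.
sample = {"P": (40, 15), "Q": (5, 120), "R": (35, 200), "S": (3, 4)}
types = {
    name: classify_scientist(pubs, cites, pub_threshold=20, cite_threshold=50)
    for name, (pubs, cites) in sample.items()
}
```

Ranking the same four scientists by publication count alone would place P above Q, which is the distortion the typology is meant to expose.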
The use of ISI’s journal impact factors for evaluation has been heavily criticised: although they are easy to use, the calculation method employed has serious flaws (e.g. Glänzel and Moed 2002). Recent work in scientometrics has focused on the application of advanced bibliometric indicators, such as citation counts that have been normalised by field, the journals used, or other characteristics (e.g. Schubert 1988). However, the validity of these indicators has only been assessed against the peer review process of evaluation (e.g. van Raan 1996, Rinia et al. 1998). No comparative analysis on the application of those indicators in repeated evaluations or in funding mechanisms has been undertaken. The one exception is the recent development of a funding allocation model at the Delft University of Technology by researchers from the Centre for Science and Technology Studies (CWTS) at the University of Leiden (van Leeuwen and Moed 2002).
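One simple way to see what normalising citation counts by field means is to compare the citations a unit actually received with the number expected from the average citation rates of the fields it publishes in. The Python sketch below uses this generic observed-to-expected ratio. It is not claimed to be the specific indicator of any author cited here, and the field baselines and paper data are hypothetical.

```python
def field_normalised_impact(papers, field_baselines):
    """Ratio of observed citations to the citations expected from field averages.

    `field_baselines` maps each field to an assumed mean number of citations
    per paper in that field. A result above 1.0 means the papers are cited
    more than the field averages would predict.
    """
    observed = sum(paper["citations"] for paper in papers)
    expected = sum(field_baselines[paper["field"]] for paper in papers)
    if expected == 0:
        raise ValueError("expected citations are zero; check the baselines")
    return observed / expected


# Hypothetical baselines: physics papers attract far more citations on average.
baselines = {"physics": 10.0, "history": 1.0}

# Hypothetical output of one research unit.
papers = [
    {"field": "physics", "citations": 12},
    {"field": "history", "citations": 2},
    {"field": "history", "citations": 1},
]

impact = field_normalised_impact(papers, baselines)
```

Here the unit has 15 observed citations against 12 expected, a ratio of 1.25. The two history papers contribute little to the raw citation count yet both perform at or above their field average, which a raw count would hide.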
The conduct of research is a complex activity. No single measure will provide an adequate assessment of its performance, and it is necessary to use a suite of indicators. The choice of performance measures sends a powerful message about what is and is not considered an important outcome of that research (Weiler 2000). It is essential that all measures used to distribute scarce research funds, or to determine the relative standing of institutions or researchers, are critically analysed and the full ramifications of their deployment understood.

Aims of the project

The project aims to provide a knowledge base that supports informed decisions on the use of quantitative indicators of research performance. This knowledge base will consist of:

The project’s primary focus will be the range of bibliometric indicators that can be used for assessing the written output of research. However, other input and output measures will be included in the analysis.
The study will not attempt to identify a single ‘best-practice’ list of performance measures for use in the higher education sector. A basic premise of the project is that the ideal measures to be applied in any context will vary according to institutional settings, management priorities, and the basic purpose of the exercise in which they are being deployed. For example, measures to be used in a formula to distribute funds between institutions may have little overlap with measures aimed at identifying the leading researchers in a university. However, in achieving its aims the project will provide analysts with rigorous data on which to make informed judgements on the employment of performance indicators in a variety of common situations.

Significance and innovation

The use of performance indicators sends a powerful message to those being evaluated. Implicit in the choice of measures is a statement of what those utilising them consider to be most important. Participants at a 1997 OECD workshop noted that, in spite of the increasing emphasis on measuring research performance, “the effectiveness of the various approaches to the evaluation of research has not been critically assessed” (OECD 1997).
This study aims to close the gap in our knowledge base on performance indicators by undertaking an extensive empirical analysis to critically assess the measures commonly used. It will also undertake a systematic investigation of alternative performance measures and will assess measures used in other higher education systems. The database constructed in the course of the study, drawing information from a wide range of sources and covering an extensive variety of research characteristics, will be the most comprehensive information bank yet assembled on these issues.
On completion of this study, it will be possible, for the first time, to compare a wide range of performance measures (in terms of validity, fairness, transparency, impact on research, cost, and behavioural impact) when deciding on the most appropriate indicators to be introduced in a specific context. Until now, policy debate has foundered on a lack of firm empirical data to guide decision making. When the adoption of a modified version of England’s Research Assessment Exercise (RAE) was mooted in 1997, debate on the proposal foundered on institutions’ inability to calculate its resource implications (Bourke 1997).
The study will enable research managers and governments to make informed judgements on the deployment of performance indicators. Any projected changes in government policy, or in the administrative practices of research institutions, will now be informed by an extensive knowledge base on a wide range of possible indicators. That knowledge can be used to judge the robustness and likely impact of using performance measures in the specific context proposed.
It will provide a shared information base for dialogue with government over higher education research policy and enable a greater understanding of the implications of proposed management strategies.

Identifying performance measures

The first step in this study will identify possible performance measures applicable in assessing research. Three different strategies will be used to obtain a comprehensive overview of such indicators:

Strategy 1

All quantitative performance measures currently in use in the Australian higher education sector will be identified. This will be accomplished by extensively utilising internet resources to obtain institutional research management policies and procedures. In addition, follow-up interviews will be undertaken with research managers - either by phone or in person. The information obtained in this phase of the study will be used to select the institutions to be covered in more detailed case studies.

Strategy 2

A comprehensive search to identify additional bibliometric performance indicators will be undertaken. We will canvass the literature extensively for all measures that could conceivably be used to assess research performance. An important part of this work will be to determine whether bibliometric indicators developed by CWTS for ad hoc evaluations can be applied in a more systematic manner in the Australian context.
CWTS has undertaken several studies focusing on the evaluation of research using a number of complex bibliometric techniques. While most have not focused on higher education systems as a whole, the measures they encompassed may be highly relevant to this study. The centre’s work has covered the problematic question of indicators that can be applied to the humanities and social sciences (Nederhof and van Raan 1989). CWTS has also validated a number of indicators against more traditional peer review judgements (Nederhof and van Raan 1987, van Raan 1996, Rinia et al. 1998), and has undertaken several evaluations of university research performance (Moed et al. 1985b, Moed and van Raan 1988).

Strategy 3

Other quantitative performance indicators will be identified. A number of studies have sought the opinion of academics on the most appropriate indicators by which to judge their research performance. A report commissioned in the early 1990s by the National Board of Employment, Education and Training (NBEET) detailed the responses of nearly 4000 Australian academics to a survey on research performance indicators (NBEET 1993). In addition to the standard performance indicators (publication counts, research student numbers and completion rates, and levels of external research earnings), this study identified keynote addresses and prizes as important indicators of research performance.
In England, a working group set up by the Higher Education Funding Council of England (HEFCE) to consider the role of quality assurance and evaluation identified several measures that universities were putting forward as indicators of quality: patents, innovations and spin-off developments, consultancies, industry links, awards, prizes and fellowships, journal editorships and editorial board membership, visiting positions elsewhere, and professional body activities (HEFCE 2000).
Other studies have looked at performance measures for specific fields of research. These include a detailed study of performance indicators that are relevant to the creative arts (Strand 1998), and an analysis of book reviews as a measure of research performance in the history of medicine (Lewison 2001). Other less common elements used include: institutional rankings, survey responses from graduate students, and number of patents (Husso et al. 2000, Shale 1999).

Establishing an experimental database

A central element of the proposed study will be the establishment of an experimental database, which will be used to test the efficacy of proposed performance measures identified by our research. This database will consist of:

The various components of the experimental database will be classified by field of research. This will enable us to test the use of various measures at both the sectoral and institutional level, and identify the effects of field-specific characteristics.
In order to provide an additional test of robustness to any proposed measure, the project will draw on international data from CWTS. CWTS maintains a database of all ISI-indexed publications, and in addition has institutional data equivalent to REPP’s Australian data for a number of European higher education systems. Research activities are global. Most performance measures that can be applied in the Australian context should also be applicable in other systems, and testing the measures in an additional setting will strengthen their assessment.

Assessing the performance measures

Each performance measure will initially be assessed on the ease with which the necessary data can be accessed and/or compiled. The robustness of those measures that are deemed practical will then be assessed in relation to:

In order to fully assess the impact of indicators used in an institutional setting (rather than in the sector as a whole), case studies of selected Australian universities will be conducted using a representative sample of institutions. Senior research managers from several universities have already signalled their interest in participating in this phase of the project. Institutions will be chosen to reflect a range of research management strategies and research cultures, and will include both research-intensive universities and regional universities with a small research base. These case studies will be discussed with the universities’ senior research managers. In order to obtain information about likely impacts on research practices, the project will draw on the social studies of science literature and on the experience and opinions of the wider research community.
The project will identify performance measures suitable for deployment in the Australian higher education system, and will identify the range of contexts in which their use is appropriate. This will involve providing for each possible measure a comprehensive list of benefits and shortcomings. The project will also investigate the use of differential weighting systems to handle major variations that occur in the practices of researchers in the different fields.

Communication of results

The results of the research undertaken in this project will be disseminated through several communication channels to all interested parties - peers in the research field, participants in the project, relevant government agencies, and senior research administrators.

Communication with peers: The major findings of the project will be published in international journals focussing on research policy, the sociology of science, and bibliometrics. They will also form the basis for presentations at international conferences and seminars in these and related fields.

Communication with policy analysts: The results of the study will be published in a report detailing the assessment of all measures analysed. The identity of individual institutions will be protected to ensure the focus of discussion is on the performance measures themselves, not the relative standing of institutions. The aim of the report will be to raise issues and provide detailed data to inform further discussions and policy decisions.

Communication with senior research managers and policy makers: Both intermediate and final results of case studies will be discussed with senior research managers of the universities involved. Prior to the completion of the final report, a workshop of senior research managers and policy makers will be held to discuss the findings of the study and seek feedback.

References

Anderson, Don, Richard Johnson, Bruce Milligan, 1996: Performance-based funding of universities. Commissioned Report No. 51, National Board of Employment, Education and Training. Canberra.

Batterham, Robin, 2000: The Chance to Change: Final report by the Chief Scientist. Canberra, Department of Industry, Science and Resources.

Bourke, Paul, 1997: Evaluating University Research: The British Research Assessment Exercise and Australian Practice, National Board of Employment, Education and Training Commissioned Report No. 56, Australian Government Publishing Service, Canberra.

Butler, Linda, 2001: Monitoring Australia’s Scientific Research: Partial indicators of Australia’s research performance. Canberra: Australian Academy of Science.

Butler, Linda, 2002: ‘Explaining Australia’s increased share of ISI publications - the effects of a funding formula based on publication counts’, Research Policy, 32(1):143-155.

Cole, Stephen and Jonathan Cole, 1967: ‘Scientific Output and Recognition, a Study in the Operation of the Reward System in Science’, American Sociological Review, 32(3), 377-90.

Glänzel, Wolfgang and Henk Moed, 2002: ‘Journal impact measures in bibliometric research’, Scientometrics 53(2), 171-193.

HEFCE, 2000 (Higher Education Funding Council of England): HEFCE Fundamental Review of Research Policy and Funding: Sub-group to consider the role of quality assurance and evaluation. www.hefce.ac.uk/Research/review/sub/qaa.pdf.

Husso, Kai, Karjalainen, Sakari and Parkkari, Tuomas, 2000: The State and Quality of Scientific Research in Finland, www.aka.fi/users/33/1979.cfm.

Knorr-Cetina, Karin D., 1981: The Manufacture of Knowledge: An Essay on the Constructivist and Contextual Nature of Science. Oxford et al.: Pergamon Press.

Lewison, Grant, 2001: ‘Evaluation of books as research outputs in history’. Research Evaluation. 10(2), 89-95.

Linke, Russell et al., 1991: Performance Indicators in Higher Education. Report of a Trial Evaluation Study Commissioned by the Commonwealth Department of Employment, Education and Training. Volume 1, Commonwealth Department of Employment, Education and Training.

Marginson, Simon and Mark Considine, 2000: The Enterprise University. Power, Governance and Reinvention in Australia. Cambridge: Cambridge University Press.

Moed, H., W.J.M. Burger, J.G. Frankfort, A.F.J. van Raan, 1985a: ‘The application of bibliometric indicators: Important field- and time-dependent factors to be considered’, Scientometrics 8, 177-203.

Moed, H., W.J.M. Burger, J.G. Frankfort, A.F.J. van Raan, 1985b: ‘The use of bibliometric data for the measurement of university research performance’, Research Policy, 14, 131-149.

Moed, Henk F and Anthony FJ van Raan, 1988: ‘Indicators of research performance: applications in university research policy’, in Anthony FJ van Raan (ed), Handbook of Quantitative Studies of Science and Technology, Elsevier, Netherlands.

NBEET, 1993 (National Board of Employment, Education and Training): Research Performance Indicators Survey. Commissioned Report No.21. Canberra.

Nederhof, Anton J and Anthony FJ van Raan, 1987: ‘Peer Review and Bibliometric Indicators of Scientific Performance: A Comparison of Cum Laude Doctorates with Ordinary Doctorates in Physics’, Scientometrics, 11(5-6), 333-350.

Nederhof, Anton J and Anthony FJ van Raan, 1989: ‘A Validation Study of Bibliometric Indicators: The Comparative Performance of Cum Laude Doctorates in Chemistry’, Scientometrics, 17(5-6), 427-435.

OECD, 1997: The Evaluation of Scientific Research: Selected Experiences, OCDE/GD(97)194, Paris.

Rinia, E.J., van Leeuwen, Th.N., van Vuren, H.G. and van Raan, A.F.J., 1998: ‘Comparative Analysis of a set of Bibliometric Indicators and Central Peer Review Criteria: Evaluation of Condensed Matter Physics in the Netherlands’, Research Policy, 27(1), 95-107.

Schnitzer, Klaus and Kazemzadeh, Foad, 1995: Formelgebundene Finanzzuweisung des Staates an die Hochschulen - Erfahrungen aus dem europäischen Ausland, HIS-Kurzinformationen A 11/1995, Hannover.

Senate Employment, Workplace Relations, Small Business and Education References Committee, 2001: Universities in Crisis, Commonwealth of Australia, www.aph.gov.au/senate/committee/eet_ctte/public uni/report.

Shale, Doug, 1999: Alberta’s performance based funding mechanism and the Alberta Universities. University of Calgary: www.uquebec.ca/conf-quebec/actes/s12.pdf.

Sizer, John, 1998: ‘The politics of performance assessment: lessons for higher education? A comment.’, Studies in Higher Education, 13 (1), 101-3.

Strand, Dennis, 1998: Research in the Creative Arts. No 98/6, Evaluations and Investigations Program, Department of Employment, Education, Training and Youth Affairs, Canberra.

Taylor, Jeannette, 2001a: ‘Improving performance indicators in higher education: the academics’ perspective’, Journal of Further and Higher Education, 25(3), 379-393.

Taylor, Jeannette, 2001b: ‘The impact of performance indicators on the work of university academics: evidence from Australian universities’, Higher Education Quarterly, 55(1), 42-61.

van Leeuwen, Thed N. and Henk Moed, 2002: ‘Development and application of journal impact measures in the Dutch science system’, Scientometrics, 53(2), 249-266.

van Raan, Anthony FJ, 1996: ‘Advanced bibliometric methods as quantitative core of peer review based evaluation and foresight exercises’, Scientometrics, 36(3), 397-420.

Weiler, Hans W, 2000: ‘States, Markets and University Funding: new paradigms for the reform of higher education in Europe’. Compare, 30 (3), 333-9.

White Paper, 1999: Knowledge and Innovation: A policy statement on research and research training. Department of Education, Training and Youth Affairs. www.detya.gov.au/archive/highered/whitepaper.

