Analyzing the Performance of an Institutional Scientific Repository – A Case Study

Maria Eduarda Rodrigues and António Moitinho Rodrigues
Liber Quarterly, Volume 22, Issue 2, 2012

Scientific knowledge evolution depends mainly on the effective dissemination of research results. The concept of Open Access gives us the theoretical foundation of a model for accessing scientific knowledge, free from the constraints of traditional publishing and technologically supported by the Internet. Institutional repositories are information systems that preserve, store and disseminate the scientific knowledge produced in higher education and scientific research institutions. They increase the visibility and the citation level of documents. They also help to minimize negative aspects such as plagiarism, because documents are exposed to peers in real time. As an alternative to the traditional system of publishing scientific research content, repositories are developed in a cultural climate of great visibility that leads to immediate critical evaluation by peers. The Scientific Repository of the Polytechnic Institute of Castelo Branco – Portugal (RCIPCB) was created in 2009, but its official presentation took place in January 2010. Its main purposes are promoting Open Access (OA), and preserving and disseminating the scientific knowledge produced at the Polytechnic Institute of Castelo Branco (IPCB). Using DSpace as its technological platform, RCIPCB is an institutional project supported by the president of the IPCB. The present study was developed with the aim of analyzing the performance of RCIPCB, considering its evolution and growth in terms of users, archiving and self-archiving, the number of published scientific documents versus deposited documents in 2010, and the heterogeneity among communities/collections and its causes.
Data were collected from RCIPCB, from the institute's 2010 scientific publication list and through a questionnaire survey distributed among the members of the community with the most documents deposited and those of the community with the fewest. For data collected from RCIPCB and from the publication list, averages, standard deviations and counts were calculated. Data collected from questionnaires were analyzed with SPSS. The results show that RCIPCB exhibits asymmetric growth dynamics. Nevertheless, it reflects the institutional organization, in the sense that the communities related to the older schools possess more documents than the communities related to more recent schools. Communities with higher numbers of deposited documents also seem to have higher levels of searches and downloads, which significantly increases the visibility of the institution and its researchers. When the 2010 scientific production is compared with the deposit level of the corresponding community, the results show that the number of documents deposited is much lower than the number of published documents. Data obtained from the questionnaire answers from the School of Agriculture (ESACB) and School of Applied Arts (ESART) communities suggest that the communication strategy used by RCIPCB is correct because everybody knows about the Repository. However, that awareness is not related to the number of documents deposited. The answers also suggest that the strategy is not efficient and needs some improvements in order to become effective. Considering the results, it is clear that RCIPCB needs a mandatory depositing policy that might also be extended to user registration. Those measures would minimize both the heterogeneity and the asymmetric growth of communities and collections. Moreover, they would also decrease the difference between scientific production and the corresponding deposit in RCIPCB.


Introduction
The evolution of scientific knowledge is based mostly on an effective diffusion of research results (Prosser, 2005; Duarte, Paiva, & Silva, 2007) through scholarly communication.
Open Access (OA) offers us the theoretical foundation for the dissemination of scientific knowledge, free from the severe constraints imposed by scientific content publishers (Rodrigues & Rodrigues, 2011; Vézina, 2006). Open Access is achieved through "gold OA", when authors publish in open access journals, and "green OA", which relies on institutional repositories (Harnad et al., 2008; Rodrigues, 2004; Saraiva & Rodrigues, 2010).
The present study aims to analyze the performance of the Scientific Repository of the Polytechnic Institute of Castelo Branco – Portugal (RCIPCB), which is an institutional repository. Its performance is analyzed along several dimensions in order to identify the main problems and to list some solutions that can address them.

Institutional Repositories – Overview
According to Lynch (2003) an institutional repository "is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members".
Within the academic community, reputation is an important aspect that directly influences information production, dissemination and consumption. For some authors, publishing in open access repositories may be seen as something that can seriously affect their reputation, because open access repositories contain much more than peer reviewed papers. They also include more informal documents (Grundman, 2009).
The Open Access movement advocates the free dissemination of all scientific literature, allowing all to read, download, copy and reference the full text of documents (Saraiva & Rodrigues, 2010).
In this context both thematic and institutional repositories have been set up. They establish free access to scientific knowledge (Frias & Travieso Rodriguez, 2008), since publication in open access journals is not yet widespread. Some authors even mention that repositories are generally used to implement fast access to scientific knowledge (Batista & Ferreira, 2006). A repository is a system that provides an alternative to the traditional system of scholarly publishing; institutional repositories are developed in a climate of high visibility and public exposure, which allows their content to be critically evaluated by peers in real time (Marques & Maio, 2007).
Repositories stimulate scientific production in a competitive way, allowing its reuse, on a basis of sharing and collaboration awareness (Seonghee & Boryung, 2008).
Repositories also reduce the scattering of the information/knowledge produced by researchers, since they bring together in a single location all the scientific output produced by an institution and its researchers (Marques & Maio, 2007), contributing to a drastic reduction of publication time and costs. Some authors even consider that institutional repositories increase the efficiency of the publication process, while also increasing the visibility of their institutions (Rodrigues, 2010; Saraiva & Rodrigues, 2010). At present, in the Web age, authors should make their scientific production available by placing it in their institutional repositories, without restrictions or limitations of any kind. Some authors even argue that institutional repositories produce a higher return, and that they should therefore be encouraged, financed and released. Regardless of their electronic platform, institutional repositories gather documents and metadata into a single system, and allow a document to be located unequivocally in the context of scientific publishing through its unique identifier. These two features add value to institutional repositories, making them important mediators in the process of disseminating scientific work (Womack, 2002).

Institutional Repositories in Portugal – general aspects
In Portugal, Open Access is realized mostly through institutional repositories. The number of Portuguese institutional repositories increased from 3 in 2007 to 35 in 2011 (UMIC, 2012). However, the level of development differs from institution to institution. There are differences not only between institutions but also within them, where asymmetries are evident in terms of communities and collections (Rodrigues & Rodrigues, 2011; Saraiva & Rodrigues, 2010). The difficulties often stem from a low submission rate and a lack of mandatory policies, which combined make it very difficult to both manage and maintain institutional repositories (Rodrigues, 2004).

The Scientific Repository of the Polytechnic Institute of Castelo Branco
The Polytechnic Institute of Castelo Branco is a higher education institution that comprises six schools covering different scientific fields. The IPCB started with two schools, the ESACB and the School of Education (ESECB), both founded in the 1980s. In the 1990s the School of Management (ESGIN), the School of Technology (ESTCB) and the ESART were founded. Finally, in 2002 the School of Health (ESALD) joined the IPCB.
The IPCB has about 5,000 students distributed over undergraduate and master's degree courses. Teaching staff/researchers are distributed as shown in Table 1.
[...] & Rodrigues, 2011; Rodrigues, 2010). RCIPCB aims to bring together all scientific documents produced at the IPCB by its professors and researchers in order to make them freely available to the scientific and academic community in general.

Methodology
The RCIPCB data, with November 2011 as the reference date, were collected at http://repositorio.ipcb.pt and were analyzed to determine the size of the corresponding communities and collections in terms of both number of documents and number of registered IPCB users.
To evaluate the relationship between the total number of documents produced and the number of documents submitted to the repository, we considered the case of the ESACB community, since this is the largest RCIPCB community. The scientific production in 2010 (papers, communications, books/book chapters, posters) was taken as a reference, comparing the documents produced by the researchers with the documents actually deposited in RCIPCB until November 2011.

RCIPCB
The ESACB Community showed the highest number of deposited documents (375) while the ESART Community showed the fewest (41). One of the reasons for such a difference might be the age of the schools. The older school, ESACB, has more documents deposited than the more recent school, ESART, possibly because its teaching staff/researchers have had more time to publish (Table 3; source: RCIPCB, until Nov. 2011). Nevertheless, other factors might also contribute to this difference, such as the scientific fields involved or even the inadequacy of the existing collections for the types of output produced by teaching staff/researchers from ESART, such as musical scores, drawings and paintings, clothes and furniture.
Considering these two extreme cases of RCIPCB communities, we analyzed the number of deposited documents according to the two forms of filing: archiving and self-archiving. Table 4 shows the results obtained for those two communities. Of the 416 documents filed in the ESACB and ESART communities, only 7% were self-archived (28 documents).
The results are similar to those found by Xia (2008), who reports that authors are not very enthusiastic about self-archiving even though they are familiar with the practice.
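The self-archiving share quoted above can be rechecked from the figures given in the text. The following is a minimal sketch, not the authors' procedure; the helper name `share` is introduced here purely for illustration.

```python
# Figures reported in the text: deposited documents per community and the
# number of self-archived documents across both communities.
deposited = {"ESACB": 375, "ESART": 41}
self_archived = 28


def share(part: int, total: int) -> float:
    """Return part/total expressed as a percentage (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


total_deposited = sum(deposited.values())
self_archived_pct = share(self_archived, total_deposited)

print(f"{self_archived} of {total_deposited} documents were self-archived "
      f"({self_archived_pct:.1f}%)")
```

The two community totals add up to the 416 documents mentioned in the text, and the resulting share rounds to the reported 7%.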
We also analyzed the data related to repository users. Comparing the number of registered RCIPCB users with the total number of teaching staff/researchers per community (Table 1), we see that the ESACB, the ESECB and the ESTCB communities have more registered users than the ESALD, the ESART and the ESGIN communities (Figure 1). This confirms that the oldest schools have the highest number of registered RCIPCB users. Table 3 shows that a higher number of deposited documents is associated with a higher number of downloads (R=0.91; P<0.05) and searches (R=0.99; P<0.01), which allows us to say that the visibility level increased with the number of deposited documents.
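The significance levels attached to the two correlation coefficients can be sanity-checked with the standard t test for a Pearson correlation, t = r·sqrt((n − 2)/(1 − r²)). The sketch below assumes one data point per school community (n = 6); the text does not state the sample size, so this value is an assumption, and the critical values are the usual two-tailed table entries for 4 degrees of freedom.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# ASSUMPTION: one observation per school community (six schools).
N_COMMUNITIES = 6

# Two-tailed critical values of Student's t for n - 2 = 4 degrees of freedom,
# taken from a standard t table.
T_CRIT_05 = 2.776
T_CRIT_01 = 4.604

t_downloads = t_statistic(0.91, N_COMMUNITIES)  # documents vs. downloads
t_searches = t_statistic(0.99, N_COMMUNITIES)   # documents vs. searches

print(f"downloads: t = {t_downloads:.2f}, significant at 5%: {t_downloads > T_CRIT_05}")
print(f"searches:  t = {t_searches:.2f}, significant at 1%: {t_searches > T_CRIT_01}")
```

Under that assumption, the downloads correlation clears the 5% threshold but not the 1% threshold, and the searches correlation clears both, which is consistent with the reported P<0.05 and P<0.01. With so few data points, such correlations describe an association and do not by themselves establish causation.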

Scientific production/deposited documents
When comparing the 2010 scientific production of the researchers from the ESACB Community with the number of those documents that have been deposited in RCIPCB, we find great heterogeneity in the deposit rate, ranging from 78.4% for the Scientific and technical communications collection to 23.8% for the Peer reviewed papers collection (Table 5). The low rate obtained for the Peer reviewed papers collection (only 23.8% deposited in RCIPCB) might be related to the copyright policies of scientific journals. This idea is also mentioned by Grundman (2009), who adds the reputation factor. The overall average percentage of documents that have not been deposited in RCIPCB, close to 40%, is a very high rate considering the Repository's goal. In our opinion this is due to the absence of a mandatory policy. The same was found by other authors (Harnad et al., 2008; Grundman, 2009; Xia, 2008; Bankier & Perciali, 2008; Covey, 2011).

Sample characterization – Context data
Of the 40 questionnaires that were distributed to teaching staff/researchers, we received 26 completed answers (65%), divided equally over both [...]. The number of years the respondents had worked in the IPCB ranged from 6 to 20 years (12, 46.2%) to more than 20 years (14, 53.8%).

"Open Access" Movement
When asked about their knowledge of the Open Access Movement, all the respondents reported having become aware of this movement either through the conferences organized by the IPCB (18, 69.2%), by searching the Internet (7, 26.9%), through the mass media (3, 11.5%), by e-mail (3, 11.5%) or through the library staff (3, 11.5%). All the respondents reported they were willing to put their scientific production in "open access repositories" and all but one, 25 out of 26 (96.2%), agreed with providing open access to the full text of scientific literature in general.

Knowledge about RCIPCB
With regard to knowledge about RCIPCB, 25 out of 26 (96.2%) of the respondents reported knowing RCIPCB. Of these respondents, 16 (64%) indicated that they had been informed by internal promotion, 7 (28%) through conferences organized by the IPCB, 4 (16%) through the IPCB Office of Information, 3 (12%) through the RCIPCB Newsletter and 1 (4%) by colleagues. The data suggest that the diffusion strategy used at IPCB is consistent with its objective.
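The percentages in this paragraph use the 25 respondents who knew RCIPCB as their base, not all 26 respondents. The sketch below recomputes them from the counts in the text. The channel counts add up to more than 25, which suggests that respondents could name several channels; that reading is an inference, not something the text states for this question.

```python
# Counts reported in the text for how respondents learned about RCIPCB.
channels = {
    "internal promotion": 16,
    "IPCB conferences": 7,
    "IPCB Office of Information": 4,
    "RCIPCB Newsletter": 3,
    "colleagues": 1,
}

# Base: the 25 of 26 respondents who reported knowing RCIPCB.
BASE = 25

# Percentage of the base that mentioned each channel.
percentages = {name: round(100 * count / BASE) for name, count in channels.items()}
total_mentions = sum(channels.values())

for name, pct in percentages.items():
    print(f"{name}: {channels[name]} of {BASE} ({pct}%)")
print(f"total mentions: {total_mentions} (more than {BASE}, so shares exceed 100% in total)")
```

Because each percentage is a share of respondents and not a share of mentions, the figures are not expected to sum to 100%.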

Users and registration
Registration in the repository is mandatory in order to self-archive or to receive e-mail updates, for instance. When asked if they had registered in RCIPCB, 15 (60%) of the respondents said they were registered. Of the 10 researchers (40%) who were not registered, 8 said that lack of time was the reason and only 2 reported not knowing that they could register themselves in RCIPCB. Seven respondents indicated that they wished to register in RCIPCB in the future. Although these results indicate that RCIPCB's promotional actions have fulfilled their objectives, it is necessary to continue the dissemination and training activities to ensure the systematic growth of RCIPCB communities/collections. Other authors identified the same trends (Grundman, 2009; Frias & Travieso Rodriguez, 2008; Bankier & Perciali, 2008). When we crossed the age of the respondents with registration in the repository, we found that 83.3% of the respondents aged between 41 and 50 were registered, as were 70% of those aged 51 or more and 50% of those aged between 31 and 40. Of the total respondents who reported having registered in RCIPCB (n = 15), 80% belong to the ESACB community. Considering age, the data contradict some literature (Covey, 2011) stating that older researchers may not be as receptive to registering in a repository.

Archiving documents
With regard to the archiving of documents, 76% (n = 19) of the respondents reported having their own papers filed in RCIPCB. A similar figure (81%) is reported by Swan & Brown, cited by Cassella (2010).
Twenty-two of the respondents (88%) said that they wanted to deposit more of their own documents in RCIPCB and 13 (52%) wanted to do it by self-archiving. However, 4 of these respondents reported that they also wanted to deposit documents in RCIPCB with the help of library staff. Of the respondents reporting that RCIPCB did not contain any of their own documents (6, 24%), one third said that they wanted to deposit documents or authorize their filing on a voluntary basis. The low levels found for the parameters in this section of the questionnaire may be related to the absence of a mandatory policy.
Of the researchers who reported having documents deposited in RCIPCB (n = 19), 84.2% belong to the ESACB community. Of these 19 users, only 15 (78.9%) are registered in RCIPCB. Considering the values found, there seems to be more interest in RCIPCB in the ESACB community than in the ESART community. Some authors mention that this could be related to the authors' attitude (Xia & Sun, cited by Cassella, 2010).

Scientific publications
Concerning scientific publications (n = 19 respondents), 13 (68.4%) of the respondents reported having from 1 to 5 documents deposited in RCIPCB, 5 (26.3%) from 11 to 20 documents and only 1 (5.3%) from 6 to 10 documents. The highest number of deposited documents (11 to 20 documents) is associated with teaching staff/researchers who have worked for more than 20 years at the IPCB. That might be related to the fact that those researchers have published more documents. We also found that in the ESART community nobody has deposited more than 5 documents (all 3 respondents fall in the 1 to 5 range).
The ESACB community shows deposited documents for all categories up to 20 documents (1 to 5 documents: 10; 6 to 10 documents: 1; 11 to 20 documents: 5). This might be related to the age of these schools, considering that the ESACB community belongs to an older school than ESART. But it could also be related to the collections, considering the specificity of some outputs produced by teaching staff/researchers from ESART, such as music.

Use of RCIPCB
Regarding the use of RCIPCB, 40% of the respondents (10) reported not having accessed it. Of the 15 respondents who had accessed the repository (60%), 9 (36%) said that they usually did so once a week. The percentage of teaching staff/researchers that access the RCIPCB is the same in both communities, 57%.

Purposes of using RCIPCB
RCIPCB is an open access repository and everybody can use it. We found that 4 (40%) of the 10 teaching staff/researchers who said that they were not registered in RCIPCB actually use it. In order of importance, the main purposes for accessing RCIPCB were searching for scientific information (13, 52%), consulting their own documents (5, 20%), accessing full text (4, 16%), and querying statistics (4, 16%). It should be noted that 60% (15) of the respondents said that they recommended the RCIPCB to their students to search for specific subjects (60% of cases), to access documents of their own (33.3% of cases), to access other authors' documents (33.3% of cases) and to download full text documents (26.7% of cases). In this section of the questionnaire the respondents could give multiple answers.
These figures are reasonable, taking into consideration that about 75% of researchers use Google as their first search option (Frias & Travieso Rodriguez, 2008).
The importance of RCIPCB
When asked about the importance of the RCIPCB, all the respondents said that it is very important for the IPCB.
Table 6 shows that, using a scale from 1 (unimportant) to 5 (extremely important), 52% of the respondents gave a score of 5 to the parameter "importance of the RCIPCB for assessing the IPCB" and 44% of the respondents gave a score of 4 to the parameter "importance of RCIPCB in terms of dissemination of their scientific production as an author or co-author".
On average the respondents rated the importance of the RCIPCB for assessing the IPCB at 4.44 (±0.651; P>0.05) and the importance of the RCIPCB in terms of the dissemination of their scientific production as an author or co-author at 3.84 (±0.898; P>0.05). The responses obtained for these parameters are similar to the ones obtained by other authors (Grundman, 2009).
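To illustrate how a mean and standard deviation of this kind follow from scores on a 1 to 5 scale, the sketch below uses two hypothetical score distributions for 25 respondents. They were chosen here only because they are consistent with the figures quoted in the text (52% giving a 5 on the first parameter, 44% giving a 4 on the second, and the reported means and deviations). They are not taken from Table 6, and other distributions may fit as well.

```python
from statistics import mean, stdev


def expand(counts: dict[int, int]) -> list[int]:
    """Turn {score: number_of_respondents} into a flat list of scores."""
    return [score for score, n in sorted(counts.items()) for _ in range(n)]


# HYPOTHETICAL distributions (assumptions, not the paper's Table 6 data).
importance_for_ipcb = expand({5: 13, 4: 10, 3: 2})          # 13/25 = 52% gave a 5
importance_for_author = expand({5: 6, 4: 11, 3: 6, 2: 2})   # 11/25 = 44% gave a 4

for label, scores in [
    ("importance for assessing the IPCB", importance_for_ipcb),
    ("importance for the author's dissemination", importance_for_author),
]:
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    print(f"{label}: n = {len(scores)}, mean = {mean(scores):.2f}, sd = {stdev(scores):.3f}")
```

With these assumed distributions the sample standard deviation reproduces the reported values, which suggests that the ± figures in the text are sample standard deviations.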

Conclusions
The results obtained show that the RCIPCB exhibits asymmetric growth dynamics. Nevertheless, it reflects the institutional organization, in the sense that the communities related to the older schools have more documents than the communities related to more recent schools. Communities with a higher number of deposited documents also have higher levels of searches and downloads. This significantly increases the visibility of the institution and its researchers.
The scientific production in 2010, compared with the deposit level of the corresponding community, shows that the number of documents deposited is much lower than the number of published documents. This might be related to the fact that the RCIPCB is still in its early days and also to the lack of a mandatory policy, which also seems to be related to the low levels of self-archiving. Data obtained from the questionnaire survey applied to the ESACB and ESART communities suggest that the communication strategy used by the RCIPCB is correct because almost everybody knows about the Repository, but this awareness appears to be poorly related to the number of documents deposited. This also shows that the strategy is not efficient and that some improvements are needed for it to become effective. There is still a considerable number of teaching staff/researchers who are not even registered in the RCIPCB but who intend to register. They also consider that the RCIPCB is very important not only for the institution's reputation but also for their individual reputation.
Considering the results of the survey, it is clear that the RCIPCB needs a mandatory depositing policy that might also be extended to user registration. We are convinced that this could minimize the heterogeneity and the asymmetric growth of communities and collections, bridging the gap between scientific production and the corresponding deposit in the RCIPCB.

Context data
The ESART and the ESACB communities were used as examples to assess the individual knowledge of IPCB teaching staff/researchers about the RCIPCB. For this purpose, 40 questionnaires were randomly distributed among 20 ESACB teaching staff/researchers and 20 ESART teaching staff/researchers in July 2011. The survey included 27 questions covering context data, knowledge about the Open Access movement and the RCIPCB, and the submission, archiving, use and importance of RCIPCB for the teaching staff/researcher and for the organization. At the end of the month 26 completed questionnaires were collected, 19 from the ESACB community and 7 from the ESART community. The data collected from the questionnaires were analyzed with SPSS (average, standard deviation and Pearson Chi-square).

Table 1: Number of IPCB's teaching staff/researchers in the different schools.

Table 4: File form at ESACB and ESART communities.

Table 6: The importance of the RCIPCB.