Why and how people are visiting the website
Dr. Zemskov, Andrei I., library director, firstname.lastname@example.org;
Dr. Goncharov, Michael V., library department head, email@example.com
Russian National Public Library for Science and Technology,
107996, Moscow, K-31, GSP 6, Russia, www.gpntb.ru
The goal of the study is to analyze the behavior and motivation of website visitors in order to improve user services. According to government statistics, there were 6 mln. regular Internet users in Russia at the beginning of 2003 (4% of the total population), and Internet activity has been growing rapidly.
Methodology. We did not survey users directly; instead, we carried out a comparative analysis of statistical data produced by the OPAC module and by website statistics. The data cover the period 15.12.02-15.01.03; previous samplings showed the same or similar trends. The difficulties of such an indirect comparative analysis are easy to see:
Which information is produced? The National Report “Russian Information Resources” indicates that most information (more than 90%) is produced for internal use. Less than 5% is produced for public use. More general data are given below:
Table 1. Annual global production of information:
25 TB of newspapers;
10 TB of periodicals (ca. 1 mln. issues);
2 TB of books (1 300 000 titles);
195 TB of internal information.
General preferences of Russian readers are given in Table 2.
Table 2. Russian reader preferences 2001, % of reading audience
Professional publications 22%
Literature for youth and children 19%
Dictionaries, vocabularies 14%
Love stories 12%
Cooking recipes, housekeeping advice 11%
Science fiction 8%
Foreign poetry 1.5%
The library resources. The library collection comprises 8 mln. items, mainly on pure and applied sciences, engineering, economics and so on.
Table 3. Traditional statistics (items)
Books 2.0 mln.
Periodicals 3.8 mln.
Other materials 1.9 mln.
Microforms 1.6 mln.
Unpublished translations 0.3 mln.
Digital documents 6.2 ths. (less than 0.1% of total collection)
Table 4. Library collections in terms of information resources available
Books 2 TB
Periodicals and other materials 3 TB
Digital offline resources 0.6 TB (ca. 10% of total)
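The two shares quoted in Tables 3 and 4 can be checked with a few lines of arithmetic. The sketch below is only an illustration; it assumes that "total" in Table 4 means the sum of the three rows of that table, and the variable and function names are ours, not the authors'.

```python
# Figures quoted in the text and in Tables 3 and 4.
TOTAL_ITEMS = 8_000_000           # "8 mln. items" in the whole collection
DIGITAL_ITEMS = 6_200             # "6.2 ths." digital documents

BOOKS_TB = 2.0
PERIODICALS_AND_OTHER_TB = 3.0
DIGITAL_OFFLINE_TB = 0.6


def share_percent(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Share of digital documents by item count (Table 3).
digital_item_share = share_percent(DIGITAL_ITEMS, TOTAL_ITEMS)

# Assumption: the "total" of Table 4 is the sum of its three rows.
total_tb = BOOKS_TB + PERIODICALS_AND_OTHER_TB + DIGITAL_OFFLINE_TB
digital_volume_share = share_percent(DIGITAL_OFFLINE_TB, total_tb)

print(f"Digital documents by item count: {digital_item_share:.2f}%")
print(f"Digital offline resources by volume: {digital_volume_share:.1f}%")
```

Under these assumptions the item share stays below 0.1% and the volume share is close to 10%, in line with the figures given in the tables.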
The website of the Russian National Public Library for Science and Technology (www.gpntb.ru) has been online since 1995. Before that we used X.25 packet communication technology. During 1995 – 1997 the total content was ca. 700 MB, of which 95% was the OPAC and the Union Catalogue of SciTech publications. By the beginning of 1999 the total content had grown to 1.2 GB through the addition of bibliographic databases and full-text materials. In 2001-2002 we expanded the capacity of our Internet line to 2 Mb/s. We have 4 servers: communication, firewall, applications and file server. At present the site content grows by several percent a year. We have 358 PCs, of which 324 are connected to the LAN. There are 83 PCs for users, of which 45 are connected to the Internet.
A comparative analysis of expert evaluations of library websites shows the position of the Russian NPLSaT website among other library sites.
Table 5. Expert evaluation of library sites. Number one corresponds to the highest appreciation
Basically, our site is similar to the sites of other federal (or national) level libraries, in particular the Russian State Library, the Russian National Library, the Library of Natural Sciences, the Central Agricultural and Central Medical libraries and so on. The essential difference is the Union Catalog of ST publications on our site. Nevertheless, we are sure that the findings of our study can be applied to evaluating user behavior at other special libraries.
Which documents do our readers prefer? At registration readers declare their subjects of interest; at the reader's choice, registration can be permanent or temporary.
Table 6. Declared subjects of interest (January – April 2002)
One can see that the stability coefficient for Physics, Math. (1.28), Construction and Architecture (1.28), Electronics, Radio (1.28) and Power Engineering, Communications (1.26) shows a better match between permanent and temporary requests than for Ecology (1.12), Economics (1.16) and Chemistry, Chem. Technology (1.20). As a rule, people come to solve one specific problem. In any case, more than 2/3 of registered readers need a reference service rather than permanent library work.
How well does the subject distribution of the library collection satisfy user requests?
Table 7 presents the 11 major (most numerous) subjects of the State STI Subject Heading Tables (SH numbers) and circulation data for April – December 2002.
Table 7. Major subjects of book collection
The ratio of the subject part of the collection to its circulation (lending) can be regarded as the completeness of the subject collection. This parameter indicates the range of choice available to the reader; it varies from 5.5 (economics) to 37 (physics). For further analysis we consider aggregated data on several subject groups.
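As a minimal sketch of the completeness parameter, the ratio can be written as a one-line function. The function name and the input numbers below are hypothetical and serve only to show the calculation; they are not taken from Table 7.

```python
def completeness(titles_in_collection: int, titles_lent: int) -> float:
    """Ratio of subject holdings to circulation for the same subject and period.

    A value of 10.0 means the collection holds ten times more titles
    on the subject than readers actually borrowed.
    """
    if titles_lent <= 0:
        raise ValueError("circulation must be a positive number")
    return titles_in_collection / titles_lent


# Hypothetical inputs, for illustration only (not data from the study).
example_ratio = completeness(titles_in_collection=12_000, titles_lent=800)
print(f"Completeness of the example subject collection: {example_ratio:.1f}")
```

The higher the ratio, the wider the choice offered to the reader relative to what was actually requested.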
Table 8. Circulation referred to subject groups
Findings of subject analysis. Readers of this specialized library require access to pure and applied sciences. Their need for LIS documents (group 6) ranks far lower in priority, despite the relative completeness of this subject collection: readers could get 23 times more titles than they actually took.
Declared requests in general do not differ from actual ones. The strikingly high priority of Informatics and STI can be explained by a misunderstanding: at registration people assume that STI means publications on metallurgy, engineering, etc., and they have no pronounced interest in LIS problems as a science.
Does the activity of requests depend on stock? This is a fundamental question; the term “critical mass” usually denotes a threshold that marks a change in the behavior of a system. Common sense suggests something like a “dose-dependent” relation between stock and requests. But while the total number of OPAC records grew monotonically, requests varied in a different way (see Table 9). So we failed to find a correlation between stock and requests; seasonal variations of library visits are much stronger.
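One common way to test for such a relation is the Pearson correlation coefficient between the two series. The sketch below shows that technique only; the text does not say which method the authors used, and the two series are hypothetical placeholders, not the data of Table 9.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder series (NOT the values of Table 9):
# monotonically growing record counts and fluctuating request counts.
opac_records = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
requests = [40.0, 55.0, 35.0, 60.0, 38.0, 52.0]

r = pearson(opac_records, requests)
print(f"Pearson r between stock and requests: {r:.2f}")
```

A coefficient close to zero would be consistent with the absence of a linear relation; seasonal effects would have to be examined separately.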
Table 9. Requests versus total number of OPAC records
Does the number of requests depend on the publication year? Website pages are accessible starting from the year of publication, i.e. the last opening or modernization of the site. That is why we have analyzed the activity of requests as a function of publication year and of subject.
Fig. 1. Requests on books Group 4: Mathematics, Cybernetics, Physics, Chemistry, Mechanics (SH numbers 27, 28, 29, 30, 31)
Fig. 2. Requests on books Group 3: Ecology and General Problems (SH numbers 38, 81, 82, 87)
Fig. 3. Requests on books in general
Findings of request activity analysis. All studied profiles feature 4 stages of activity.
The first stage is the growth of request activity from the initial zero level to the maximum value, which takes approximately 1.5 years. This process is usually not supported or accelerated by an advertising campaign. Besides, the monitored collections were fairly large, numbering many thousands of items. The time lag due to in-library processing should be taken into account as well.
The second stage corresponds to maximum request activity; it lasts from 3 to 7 years depending on the subject group. We can only note that documents of group 6 (informatics and so on) hold readers' interest for 7-9 years.
The next stage shows a decline in readers’ interest over 2-3 years. Again, the averaged reader behavior is universal in character.
The final stage corresponds to stable and small activity: 1 - 2 items per year are requested for our subject collections. Empirical formula (in relative units) for the dependence of request activity (Y) on book publication age (x): $Y = x^{2} e^{1-x}$
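The shape of this empirical curve is easy to explore numerically. The sketch below assumes the formula reads Y = x^2 * e^(1 - x), with x and Y in the relative units mentioned above; the mapping of x to calendar years is not given in the text, so no such mapping is attempted. The function name is ours.

```python
import math


def request_activity(x: float) -> float:
    """Empirical curve Y = x^2 * e^(1 - x), in relative units.

    Assumption: the exponent is (1 - x) and x is a non-negative
    relative publication age.
    """
    if x < 0:
        raise ValueError("publication age must be non-negative")
    return x * x * math.exp(1.0 - x)


# dY/dx = x * (2 - x) * e^(1 - x), which is zero at x = 0 and x = 2,
# so the curve rises from zero, peaks at x = 2 and then decays.
PEAK_X = 2.0
PEAK_Y = request_activity(PEAK_X)  # equals 4 / e

for x in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"x = {x:>4}: Y = {request_activity(x):.3f}")
```

Read this way, the curve reproduces the qualitative stages described above: growth from zero, a maximum, a decline, and a long tail of low activity.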
What is the reason for the decline in requests across a fairly wide spectrum of subjects? For the overwhelming majority of monitored subjects there were no new developments or discoveries, and nobody has invalidated the old facts. The main reason is the striving for new information, and expectations of:
So keeping content up to date is a very important factor, and the website designer (system administrator) should take care of regular updating.
Unfortunately, we could not find any correlation between request activity and publication age for periodicals.
Comparison of library visits and website visits. Library visit data are the basis of library statistics, see Fig. 4. One can see some growth after the 1998 crisis, but in general the curve fluctuates around an average of 240-270 thousand visits per year. As for website visits, one can see monotonic growth over 6 years.
Fig. 4. Dynamics of the traditional library visits and website visits
Visitor Sessions profiles.
Table 10. Average web site visits per day
(100 sessions of the same visitor are counted as one).
The profile by number of visits is an important category for evaluating user satisfaction; 2.18% of all visitors visit our website 10 times per month or more.
Fig. 5. Distribution by number of visits during one month
More than 22% of visitors come directly to known pages, in particular to the OPAC, the Union Catalogue and doctoral theses. In the same way, the majority of visitors leave the site from the search pages. The ratio of the total number of visits to visits to full-text pages is 6.
With respect to page content, website visitors’ priorities are as follows:
The overwhelming majority of our readers live in Moscow and its suburbs. The total population of this region is ca. 15 mln., and that is our potential audience. Services for remote users via ILL amount to 1% of the service provided on library premises. The geographic distribution of users is therefore a narrow function with a nominal half-width of 150-200 km.
Fig. 6. Conventional distribution of library readers
The regional distribution of remote users (in terms of Internet visits) covers all continents, regions and countries (including 7 users from Polynesia).
Fig. 7. Conventional website visitors’ distribution
The most active external referring sites are Yandex, Aport, Rambler and Google. The library’s website is indexed by 300 search systems and directories.
Distribution of requests by season, day and time of day.
Monthly and daily distributions of library visitors are presented in Fig. 8 and Fig. 9.
Fig. 8. Library visits by months 2002.
Fig. 9. Library visits by days
Activity of website users by day of the week and by hour of the day is presented in Fig. 10 and Fig. 11. The busiest day (20% of total visits) is Thursday; the least busy is Sunday. Website traffic by month shows no seasonal variation.
Fig. 10. Activity of website users by day of the week
The website is visited without seasonal or daily gaps.
Fig. 11. Activity of web site visitors by hour of the day
Website visits peak between 11.00 and 17.00 (Moscow local time). Note that the service runs non-stop around the clock: the activity level varies little from hour to hour.
Fig. 12. Activity of library visits by hour of the day
The dip in activity at lunchtime suggests that readers treat a library visit as work.
Local users of remote digital resources. We have statistical data on requests from our library readers for digital documents (more than 1200 scientific journals) of the Russian Foundation of Basic Research (RFBR) Electronic Library.
Fig. 13. Requests on digital full text documents of RFBR Electronic Library
The majority of users (80%) prefer Microsoft Internet Explorer, and 97% prefer the Windows platform with the Cyrillic code page cp-1251.
The average duration of a website session is 14 min., which is equivalent to browsing (reading) 7 printed pages.
The average session of a library reader (700 - 900 readers daily) lasts 3 hours, which includes searching and waiting for orders.
In 2002 there were 260 000 visits to our library, and circulation (lending in reading rooms) was 2 300 000 items, or 9 items per visitor. Assuming 300 pages per item, this corresponds to 2 700 pages for each reader. Bearing in mind that an average person can read no more than 100 pages in 3 hours, we get a fairly low efficiency of work.
In total readers ordered 1725 GB of information (less than 30% of the total collection content), of which they could read less than 65 GB.
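The efficiency estimate above can be retraced step by step. In the sketch below the per-item and per-page sizes are not stated in the text; they are inferred from the quoted totals, assuming decimal units (1 GB = 1000 MB = 1 000 000 KB), and the variable names are ours.

```python
# Figures quoted in the text for the year 2002.
VISITS = 260_000
ITEMS_LENT = 2_300_000
PAGES_PER_ITEM = 300           # assumption stated in the text
PAGES_READ_PER_VISIT = 100     # "no more than 100 pages in 3 hours"
ORDERED_GB = 1725.0

# Items and pages ordered per visit.
items_per_visit = ITEMS_LENT / VISITS                        # about 8.8
pages_ordered_per_visit = round(items_per_visit) * PAGES_PER_ITEM

# Share of the ordered pages that a reader can actually read.
reading_share = PAGES_READ_PER_VISIT / pages_ordered_per_visit

# Sizes implied by the quoted totals (inferred, not given in the text).
mb_per_item = ORDERED_GB * 1000 / ITEMS_LENT
kb_per_page = mb_per_item * 1000 / PAGES_PER_ITEM
readable_gb = VISITS * PAGES_READ_PER_VISIT * kb_per_page / 1_000_000

print(f"Items per visit: {items_per_visit:.1f}")
print(f"Pages ordered per visit: {pages_ordered_per_visit}")
print(f"Share of ordered pages that can be read: {reading_share:.1%}")
print(f"Implied size: {mb_per_item:.2f} MB per item, {kb_per_page:.1f} KB per page")
print(f"Readable volume: {readable_gb:.0f} GB of {ORDERED_GB:.0f} GB ordered")
```

Under these assumptions the readable volume comes to about 65 GB of the 1725 GB ordered, so a reader can go through only a few percent of what is delivered.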
In 2002, 6 370 000 files were downloaded from the website; in other words, the entire content of the website was browsed 15 times.
Copyright © 2004. International Library Information and Analytical Center.
All rights reserved.