Insights into ETD Access Trends and Characteristics at PUC-Rio

Examining Accesses by Country, Language and Area of Knowledge
ETD 2011 – Cape Town
Ana Pavani
Laboratório de Automação de Museus, Bibliotecas Digitais e Arquivos
Departamento de Engenharia Elétrica
Pontifícia Universidade Católica do Rio de Janeiro
Brazil
apavani@lambda.ele.puc-rio.br
http://www.maxwell.lambda.ele.puc-rio.br/
ETD 2011 – South Africa
This work continues a study presented last year in Austin. The two works differ in the following aspects:
 In 2010 there were 71 data sets; this work considers 85 (20% more)
 East Timor was included because accesses from this country have started to occur
 The UNDP has changed the way the HDI is computed, so the HDI data were updated, as were the populations of the countries
 My co-author left the university, so this time I am by myself
ETDs, PUC-Rio, BDTD & NDLTD
PUC-Rio
Rio de Janeiro
Brazil
PUC-Rio is a small private university. It is divided into 3 centers, each with graduate programs:
 CTCH (Humanities) – 6
 CCS (Social Sciences) – 10
 CTC (Science & Technology) – 10
The oldest graduate program (EE) started in 1963.
The newest graduate program is less than 5 years old.
Characteristics of PUC-Rio’s ETD program:
First published ETD – May 2000
ETDs became mandatory – Aug 2002
Number of ETDs – 5,694 (Jun 2011)
 CTCH – 1,442
 CCS – 1,291
 CTC – 2,961
Yearly average number of defended T&Ds (*) – 590
 (*) 2007, 2008, 2009 & 2010; (**) 2006, 2007, 2008 & 2009.
There is retrospective digitization.
ETDs are made available in chapters (graduate school regulation; please don’t ask me the reason! It will change as of Oct 2011)
PUC-Rio’s ETDs, BDTD (*) and NDLTD (**):
Number of BDTD institutions – 97 (OAI-PMH data providers)
Number of BDTD metadata records – 170K+ (BDTD is an OAI-PMH data and service provider)
BDTD records are/were harvested by OCLC and other institutions, and made available worldwide
Brazilian ETDs are the largest collection in Portuguese available worldwide
 (*) BDTD – Biblioteca Digital de Teses e Dissertações = Brazilian Nat’l Consortium.
 (**) You must know what NDLTD stands for!!!
Accesses to PUC-Rio’s ETDs:
Access logs saved since – Jun 2004
Number of monthly logs when article was written – 85
pt & es IN THE WORLD
pt is the official or one of the official languages of:
 Angola
 Brazil
 Cape Verde
 Equatorial Guinea (*)
 East Timor (**)
 Guinea-Bissau
 Macau (***)
 Mozambique
 Portugal
 Sao Tome and Principe
es is the official or one of the official languages of:
 Argentina
 Bolivia
 Chile
 Colombia
 Costa Rica
 Cuba
 Dominican Rep
 Ecuador
 El Salvador
 Equatorial Guinea (*)
 Guatemala
 Honduras
 Mexico
 Nicaragua
 Panama
 Paraguay
 Peru
 Puerto Rico
 Spain
 Uruguay
 Venezuela
(*) es & pt official
(**) less than 5% of the population knows it; it was banned during the Indonesian rule
(***) UNDP did not publish data in its last report; other data were used
Assumptions for the analysis:
ETDs are very specialized items – people who seek ETDs are highly educated
es and pt are quite similar languages – educated people who can speak one can read the other
es- and pt-speakers are potential readers of PUC-Rio’s ETDs
2 countries were not considered:
 Brazil – it is the home country
 US – there are very large groups of es- and pt-speaking persons, but neither is the language of the country
2 groups were defined:
 “international group” – all countries except Brazil and the US
 “pt+es group” – all countries that have pt and/or es as one of the official languages
Factors considered to influence accesses to ETDs:
 Population size
 Level of education
 Access to the Internet
DEALING WITH COUNTRY DIFFERENCES
Mexico has 110M inhabitants
Sao Tome and Principe has 165K inhabitants
Portugal and Spain are in Europe
Argentina and Honduras are in Latin America
Angola and Mozambique are in Africa
Portugal has 10M inhabitants
Spain has 45M inhabitants
Equatorial Guinea has the 2 languages
Quantification of potential accesses from countries that are very different:
Need to find data on the factors that may influence accesses to ETDs:
 Population size – easy
 Level of education – difficult (literacy rates are easy!)
 Access to the Internet – difficult
 All data should be considered in the same time-frame
Knowledge that the second and the third factors depend on how developed countries are
Knowledge that it was necessary to combine the 3 factors
Decision on how to deal with the country differences:
Use UNDP’s HDI (Human Development Index), which contains information on the second and the third factors (HDI combines indicators of life expectancy, education and income; the new way it is computed includes mean years of schooling and expected years of schooling, going beyond literacy rates)
Decision to combine HDI with the population size:
Index I = Population x HDI
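A minimal Python sketch of how Index I could be computed for a group of countries. The helper names `index_i` and `percent_higher` are hypothetical, and summing population x HDI country by country is an assumption: the slides only state the formula, although the reported group totals do not equal total population times average HDI, which suggests a per-country sum. The two average HDI values are the ones reported in the presentation for the es-speaking and pt-speaking groups.

```python
def index_i(countries):
    """Index I for a group: sum of population x HDI over its countries.

    `countries` is an iterable of (population, hdi) pairs. Summing per
    country is an assumption; the slides only give Index I = Population x HDI.
    """
    return sum(population * hdi for population, hdi in countries)


def percent_higher(a, b):
    """How much higher a is than b, in percent."""
    return (a / b - 1) * 100


# Average HDI values reported in the presentation for the two groups.
avg_hdi_es = 0.707
avg_hdi_pt = 0.527

print(round(percent_higher(avg_hdi_es, avg_hdi_pt), 2))  # 34.16
```

The printed value agrees with the 34.16% difference in average HDI quoted in the comments.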
Comments:
21 es-speaking and 10 pt-speaking countries (Equatorial Guinea was counted in both)
Average HDI for es-speaking countries is 34.16% higher than for the other group
Population of the es-speaking countries is almost 7.4 times the population of the other group
Index I for the es-speaking group is 12.36 times the same index for the pt-speaking group
The expectation was to have many more accesses from es-speaking countries than from pt-speaking countries!!
WORKING WITH DATA AND RESULTS
Information:
Number of sets of data – 85 (one for each month)
For each set, 16 variables were computed (examples – number of countries, number of pt-speaking countries, total number of accesses, etc.)
All data were computed for the complete set and for each of the 3 areas of knowledge
From the sets (collection and areas) side
This analysis focused on the way the whole collection and each individual set – CTCH, CCS and CTC – were accessed from countries in different groups.
Results:
Total number of countries that accessed ETDs – 204
 CTCH – 183
 CCS – 183
 CTC – 189
Total number in the “international group” – 202
 CTCH – 181
 CCS – 181
 CTC – 187
Maximum number of countries in the “international group” in a month – 143
 CTCH – 112
 CCS – 108
 CTC – 132
Maximum number of countries in the “pt+es group” in a month – 28 (maximum possible 30)
 CTCH – 27
 CCS – 27
 CTC – 27
Number of months with accesses from 100 or more countries – 42
 CTCH – 18
 CCS – 15
 CTC – 32
Some percentages follow
Comments:
Absolute values for CTC are higher – this area has the largest collection (larger than the sum of the others)
Percentages for CTC are lower, except for accesses from the “international group”
Is it more international?
It seems that language is not very important in S&T
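As a rough cross-check, the share of accesses coming from the international group can be recomputed from the table of average numbers given later in the presentation, because the normalization factor cancels in a ratio taken within one set. The sketch below is in Python; the helper `percent_share` and the variable names are hypothetical, and the figures are copied from that table.

```python
SETS = ("All", "CTCH", "CCS", "CTC")

# Values from the presentation's table of average numbers.
accesses = dict(zip(SETS, (740.40, 901.73, 755.93, 644.32)))
from_international = dict(zip(SETS, (62.77, 72.01, 59.63, 58.79)))


def percent_share(part, whole):
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


shares = {
    name: percent_share(from_international[name], accesses[name])
    for name in SETS
}
print(shares)
```

The results (8.48, 7.99, 7.89 and 9.12) match the first row of the percentage table, with CTC showing the largest international share.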
From the accesses side
This analysis focused on the way accesses behaved for the complete collection and how they spread among the sets – CTCH, CCS and CTC.
The collection and the sets have different profiles – numbers of ETDs and numbers of partitions. For this reason, normalization was necessary.
Quantification of potential accesses to sets of works with different profiles:
Sets are very different in:
 Numbers of ETDs
 Numbers of partitions per ETD
This means that numbers of accesses had to be normalized in order to compare accesses to the sets
This work presents a first attempt to quantify the way the “average ETD” in a set “attracts” accesses
Decision on how to deal with the differences between sets:
Combine average numbers of ETDs with average of average numbers of partitions:
Index EI = 1 / (average number of ETDs x average of average numbers of partitions)
 numbers computed as the total numbers of accesses x index EI
 numbers to be viewed as accumulated
 monthly averages can be obtained by dividing by 85
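The definition of Index EI can be sketched in Python as follows. The function and variable names are hypothetical; the input figures are the averages for June 2004 to June 2011 taken from the presentation's table of ETDs and partitions per set.

```python
# Averages for June 2004 to June 2011, from the presentation's table.
avg_etds = {"All": 3553.1, "CTCH": 888.9, "CCS": 726.1, "CTC": 1938.3}
avg_partitions = {"All": 6.93, "CTCH": 7.82, "CCS": 6.83, "CTC": 6.57}


def index_ei(avg_number_of_etds, avg_of_avg_partitions):
    """Index EI = 1 / (average number of ETDs x average of average partitions)."""
    return 1.0 / (avg_number_of_etds * avg_of_avg_partitions)


def normalized_accesses(total_accesses, ei):
    """Total accesses scaled by Index EI, as described in the presentation."""
    return total_accesses * ei


ei = {name: index_ei(avg_etds[name], avg_partitions[name]) for name in avg_etds}
for name, value in ei.items():
    print(f"{name}: {value:.6f}")
```

Printed to six decimals, the values are 0.000041, 0.000144, 0.000202 and 0.000079, which agree with the Index EI row reported in the presentation.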
Comments:
When normalized data is considered, the average number of accesses (per ETD in Science & Technology) from the international group is the lowest among all
The same happens with accesses from the es+pt and pt-speaking groups, and from Portugal as well
The reason is that ETDs in this group have the lowest average of accesses per ETD among the 3 subsets
When normalized data is considered, the average number of accesses (per ETD in Humanities) from the international group is the highest among all
The same happens with accesses from the es+pt and pt-speaking groups, and from Portugal as well
The reason is that ETDs in this group have the highest average of accesses per ETD among the 3 subsets
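The two observations above can be checked against the presentation's table of average numbers with a short Python sketch. The helper `extremes` and the row labels are hypothetical names; the figures are copied from that table, leaving out the column for the whole collection.

```python
SETS = ("CTCH", "CCS", "CTC")

# Rows of the presentation's table of average numbers (All column omitted).
rows = {
    "Accesses": (901.73, 755.93, 644.32),
    "International group": (72.01, 59.63, 58.79),
    "es+pt-speaking group": (52.76, 40.88, 38.99),
    "pt-speaking countries": (45.96, 34.52, 30.16),
    "Portugal": (41.32, 29.54, 26.27),
}


def extremes(values):
    """Return (set with the highest value, set with the lowest value)."""
    ranked = sorted(zip(values, SETS))
    return ranked[-1][1], ranked[0][1]


for label, values in rows.items():
    highest, lowest = extremes(values)
    print(f"{label}: highest {highest}, lowest {lowest}")
```

In every row CTCH (Humanities) comes out highest and CTC (Science & Technology) lowest.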
FINAL COMMENTS
Percentage-wise, international accesses are the most significant for ETDs in S&T
At the same time, the “average S&T ETD” “attracts” fewer international accesses than ETDs in other areas of knowledge, and the “average Humanities ETD” “attracts” the most
In all areas of knowledge, accesses from:
 es- and/or pt-speaking countries are the most significant
 pt-speaking countries are the most significant in the es+pt group
 Portugal are the most significant in the pt group
New ways of defining “attraction” should be examined
Results seem to indicate that language and HDI are important factors in accesses
Thank you!
Muito obrigada!

The exploration of ETD access patterns and program specifics at PUC-Rio reveals interesting developments such as increased data sets, new country inclusions, and changes in co-authorship. The university's ETD program, divided into three centers, showcases a rich history and a growing repository of electronic theses and dissertations, with mandatory submission since 2002 and a retrospective digitization process in place.

  • ETD trends
  • PUC-Rio
  • electronic theses
  • academic programs
  • access patterns

Uploaded on Oct 03, 2024 | 0 Views



Presentation Transcript


  1. ETD 2011 – Cape Town. Examining Accesses by Country, Language and Area of Knowledge

  2. ETD 2011 – South Africa. Ana Pavani, Laboratório de Automação de Museus, Bibliotecas Digitais e Arquivos, Departamento de Engenharia Elétrica, Pontifícia Universidade Católica do Rio de Janeiro, Brazil. apavani@lambda.ele.puc-rio.br, http://www.maxwell.lambda.ele.puc-rio.br/

  3. This work continues a study presented last year in Austin. The two works differ in the following aspects: in 2010 there were 71 data sets, and this work considers 85 (20% more); East Timor was included because accesses from this country have started to occur; the UNDP has changed the way the HDI is computed, so the HDI data were updated, as were the populations of the countries; my co-author left the university, so this time I am by myself.

  4. ETDs, PUC-Rio, BDTD & NDLTD

  5. PUC-Rio Rio de Janeiro Brazil

  6. PUC-Rio is a small private university. It is divided into 3 centers, each with graduate programs: CTCH (Humanities) – 6; CCS (Social Sciences) – 10; CTC (Science & Technology) – 10. The oldest graduate program (EE) started in 1963. The newest graduate program is less than 5 years old.

  7. Characteristics of PUC-Rio’s ETD program: First published ETD – May 2000. ETDs became mandatory – Aug 2002. Number of ETDs – 5,694 (Jun 2011): CTCH – 1,442; CCS – 1,291; CTC – 2,961. Yearly average number of defended T&Ds (*) – 590. (*) 2007, 2008, 2009 & 2010; (**) 2006, 2007, 2008 & 2009. There is retrospective digitization.

  8. ETDs are made available in chapters (graduate school regulation; please don’t ask me the reason! It will change as of Oct 2011)

  9.                                                                   All       CTCH      CCS       CTC
     Number of ETDs                                                    5,694     1,442     1,291     2,961
     Average number of ETDs, June 2004 to June 2011                    3,553.1   888.9     726.1     1,938.3
     Average number of partitions, June 2011                           7.3       7.9       7.3       7.0
     Average of average numbers of partitions, June 2004 to June 2011  6.93      7.82      6.83      6.57

  10. PUC-Rio’s ETDs, BDTD (*) and NDLTD (**): Number of BDTD institutions – 97 (OAI-PMH data providers). Number of BDTD metadata records – 170K+ (BDTD is an OAI-PMH data and service provider). BDTD records are/were harvested by OCLC and other institutions, and made available worldwide. Brazilian ETDs are the largest collection in Portuguese available worldwide. (*) BDTD – Biblioteca Digital de Teses e Dissertações = Brazilian Nat’l Consortium. (**) You must know what NDLTD stands for!!!

  11. Accesses to PUC-Rio’s ETDs: Access logs saved since – Jun 2004. Number of monthly logs when article was written – 85.

  12. pt & es IN THE WORLD

  13.              Worldwide   Western languages   Internet
      Portuguese   7th         3rd                 6th
      Spanish      2nd         1st                 3rd

      es is the official or one of the official languages of: Argentina, Bolivia, Chile, Colombia, Costa Rica, Cuba, Dominican Rep, Ecuador, El Salvador, Equatorial Guinea (*), Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Puerto Rico, Spain, Uruguay, Venezuela.
      pt is the official or one of the official languages of: Angola, Brazil, Cape Verde, Equatorial Guinea (*), East Timor (**), Guinea-Bissau, Macau (***), Mozambique, Portugal, Sao Tome and Principe.
      (*) es & pt official. (**) less than 5% of the population knows it; it was banned during the Indonesian rule. (***) UNDP did not publish data in its last report; other data were used.

  14. Assumptions for the analysis: ETDs are very specialized items – people who seek ETDs are highly educated. es and pt are quite similar languages – educated people who can speak one can read the other. es- and pt-speakers are potential readers of PUC-Rio’s ETDs. 2 countries were not considered: Brazil – it is the home country; US – there are very large groups of es- and pt-speaking persons, but neither is the language of the country.

  15. 2 groups were defined: “international group” – all countries except Brazil and the US; “pt+es group” – all countries that have pt and/or es as one of the official languages. Factors considered to influence accesses to ETDs: population size, level of education, access to the Internet.

  16. DEALING WITH COUNTRY DIFFERENCES

  17. Sao Tome and Principe has 165K inhabitants. Mexico has 110M inhabitants. Portugal and Spain are in Europe. Argentina and Honduras are in Latin America. Angola and Mozambique are in Africa. Portugal has 10M inhabitants. Equatorial Guinea has the 2 languages. Spain has 45M inhabitants.

  18. Quantification of potential accesses from countries that are very different: Need to find data on the factors that may influence accesses to ETDs: population size – easy; level of education – difficult (literacy rates are easy!); access to the Internet – difficult; all data should be considered in the same time-frame. Knowledge that the second and the third factors depend on how developed countries are. Knowledge that it was necessary to combine the 3 factors.

  19. Decision on how to deal with the country differences: Use UNDP’s HDI (Human Development Index), which contains information on the second and the third factors (HDI combines indicators of life expectancy, education and income; the new way it is computed includes mean years of schooling and expected years of schooling, going beyond literacy rates). Decision to combine HDI with the population size: Index I = Population x HDI

  20.                     es-speaking countries   pt-speaking countries
      Total population    420,281,000             57,858,800
      Average HDI         0.707                   0.527
      Index I             309,420,871             25,114,111

  21. Comments: 21 es-speaking and 10 pt-speaking countries (Equatorial Guinea was counted in both). Average HDI for es-speaking countries is 34.16% higher than for the other group. Population of the es-speaking countries is almost 7.4 times the population of the other group. Index I for the es-speaking group is 12.36 times the same index for the pt-speaking group. The expectation was to have many more accesses from es-speaking countries than from pt-speaking countries!!

  22. WORKING WITH DATA AND RESULTS

  23. Information: Number of sets of data – 85 (one for each month). For each set, 16 variables were computed (examples – number of countries, number of pt-speaking countries, total number of accesses, etc.). All data were computed for the complete set and for each of the 3 areas of knowledge.

  24. From the sets (collection and areas) side

  25. This analysis focused on the way the whole collection and each individual set – CTCH, CCS and CTC – were accessed from countries in different groups.

  26. Results: Total number of countries that accessed ETDs – 204 (CTCH – 183; CCS – 183; CTC – 189). Total number in the “international group” – 202 (CTCH – 181; CCS – 181; CTC – 187).

  27. Maximum number of countries in the “international group” in a month – 143 (CTCH – 112; CCS – 108; CTC – 132). Maximum number of countries in the “pt+es group” in a month – 28, maximum possible 30 (CTCH – 27; CCS – 27; CTC – 27).

  28. Number of months with accesses from 100 or more countries – 42 (CTCH – 18; CCS – 15; CTC – 32). Some percentages follow.

  29. % accesses                                                  All     CTCH    CCS     CTC
      from the international group                                8.48    7.99    7.89    9.12
      in the international group, from the es+pt-speaking group   69.03   73.27   68.56   66.32
      in the es+pt-speaking group, from pt-speaking countries     82.07   87.11   84.44   77.35
      in the international group, from pt-speaking countries      56.65   63.83   57.89   51.30
      in the international group, from Portugal                   49.74   57.39   49.54   44.69
      in the es+pt-speaking group, from Portugal                  72.05   78.27   72.26   67.39
      in the pt-speaking group, from Portugal                     87.89   88.92   85.57   87.12

  30. Comments: Absolute values for CTC are higher – this area has the largest collection (larger than the sum of the others). Percentages for CTC are lower, except for accesses from the “international group”. Is it more international? It seems that language is not very important in S&T.

  31. From the accesses side

  32. This analysis focused on the way accesses behaved for the complete collection and how they spread among the sets – CTCH, CCS and CTC. The collection and the sets have different profiles – numbers of ETDs and numbers of partitions. For this reason, normalization was necessary.

  33. Quantification of potential accesses to sets of works with different profiles: Sets are very different in numbers of ETDs and numbers of partitions per ETD. This means that numbers of accesses had to be normalized in order to compare accesses to the sets. This work presents a first attempt to quantify the way the “average ETD” in a set “attracts” accesses.

  34. Decision on how to deal with the differences between sets: Combine average numbers of ETDs with average of average numbers of partitions: Index EI = 1 / (average number of ETDs x average of average numbers of partitions)

  35.            All        CTCH       CCS        CTC
      Index EI   0.000041   0.000144   0.000202   0.000079

  36. Average numbers of                       All      CTCH     CCS      CTC
      Accesses                                 740.40   901.73   755.93   644.32
      Accesses from the int group              62.77    72.01    59.63    58.79
      Accesses from the es+pt-speaking group   43.33    52.76    40.88    38.99
      Accesses from the pt-speaking countries  35.56    45.96    34.52    30.16
      Accesses from Portugal                   31.22    41.32    29.54    26.27

      Numbers computed as the total numbers of accesses x index EI; numbers to be viewed as accumulated; monthly averages can be obtained by dividing by 85.

  37. Comments: When normalized data is considered, the average number of accesses (per ETD in Science & Technology) from the international group is the lowest among all. The same happens with accesses from the es+pt and pt-speaking groups, and from Portugal as well. The reason is that ETDs in this group have the lowest average of accesses per ETD among the 3 subsets.

  38. When normalized data is considered, the average number of accesses (per ETD in Humanities) from the international group is the highest among all. The same happens with accesses from the es+pt and pt-speaking groups, and from Portugal as well. The reason is that ETDs in this group have the highest average of accesses per ETD among the 3 subsets.

  39. FINAL COMMENTS

  40. Percentage-wise, international accesses are the most significant for ETDs in S&T. At the same time, the “average S&T ETD” “attracts” fewer international accesses than ETDs in other areas of knowledge, and the “average Humanities ETD” “attracts” the most. In all areas of knowledge, accesses from: es- and/or pt-speaking countries are the most significant; pt-speaking countries are the most significant in the es+pt group; Portugal are the most significant in the pt group.

  41. New ways of defining “attraction” should be examined. Results seem to indicate that language and HDI are important factors in accesses.

  42. Thank you! Muito obrigada!
