Google Scholar
Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. It was released in beta in November 2004. Its index includes most peer-reviewed online academic journals and books, conference papers, theses and dissertations, preprints, abstracts, technical reports, and other scholarly literature, including court opinions and patents.[1] While Google does not publish the size of Google Scholar's database, scientometric researchers estimated that it contained roughly 389 million documents, including articles, citations, and patents, making it the world's largest academic search engine as of January 2018.[2] Previously, the size was estimated at 160 million documents as of May 2014.[3] An earlier statistical estimate published in PLOS ONE, using a mark and recapture method, put the index at roughly 100 million documents, or approximately 80–90% of all articles published in English.[4] The same study also estimated how many documents were freely available on the web.
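The mark and recapture idea behind the PLOS ONE estimate can be illustrated with the basic Lincoln–Petersen estimator. The following is a minimal sketch of that general technique, not the study's actual procedure; the function name and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def lincoln_petersen(n_first: int, n_second: int, n_overlap: int) -> float:
    """Estimate a total population size from two overlapping samples.

    n_first   -- items found in the first sample (e.g. one search engine)
    n_second  -- items found in the second sample (e.g. another engine)
    n_overlap -- items found in both samples
    """
    if n_overlap <= 0:
        raise ValueError("the samples must overlap to produce an estimate")
    # If the second sample is representative, the share of it that was
    # already "marked" (n_overlap / n_second) approximates the share of
    # the whole population captured by the first sample.
    return n_first * n_second / n_overlap


# Hypothetical numbers: 600 documents in sample A, 500 in sample B,
# 300 of them in both.
estimated_total = lincoln_petersen(600, 500, 300)
coverage_of_a = 600 / estimated_total
print(estimated_total, coverage_of_a)
```

With these made-up inputs the estimated population is 1,000 documents, so sample A would cover about 60% of it. The same reasoning, applied to real overlap counts between indexes, is what allows a coverage percentage to be stated without knowing the true total in advance.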
Type of site | Bibliographic database |
---|---|
Owner | Google |
URL | scholar |
Registration | Optional |
Launched | November 20, 2004 |
Current status | Active |
Google Scholar has been criticized for not vetting journals and for including predatory journals in its index.[5]
History
Google Scholar arose out of a discussion between Alex Verstak and Anurag Acharya,[6] both of whom were then working on building Google's main web index.[7][8] Their goal was to "make the world's problem solvers 10% more efficient"[9] by allowing easier and more accurate access to scientific knowledge. This goal is reflected in Google Scholar's advertising slogan, "Stand on the shoulders of giants", which is taken from a phrase attributed to Bernard of Chartres and is a nod to the scholars who have contributed to their fields over the centuries, providing the foundation for new intellectual achievements.
Scholar has gained a range of features over time. In 2006, a citation-importing feature was added that supports bibliography managers such as RefWorks, RefMan, EndNote, and BibTeX. In 2007, Acharya announced that Google Scholar had started a program to digitize and host journal articles in agreement with their publishers, an effort separate from Google Books, whose scans of older journals do not include the metadata required for identifying specific articles in specific issues.[10] In 2011, Google removed Scholar from the toolbars on its search pages,[11] making it both less easily accessible and less discoverable for users not already aware of its existence. Around this period, sites with similar features such as CiteSeer, Scirus, and Microsoft Windows Live Academic search were developed. Some of these are now defunct, although in 2016 Microsoft launched a new competitor, Microsoft Academic.
A major enhancement was rolled out in 2012, with the possibility for individual scholars to create personal "Scholar Citations profiles".[12]
A feature introduced in November 2013 allows logged-in users to save search results into the "Google Scholar library", a personal collection which the user can search separately and organize by tags.[13] A metrics feature now supports viewing the impact of academic journals,[14] and whole fields of science, via the "metrics" button. This reveals the top journals in a field of interest, and the articles generating each journal's impact can also be accessed.
Features and specifications
Google Scholar allows users to search for digital or physical copies of articles, whether online or in libraries.[15] It indexes "full-text journal articles, technical reports, preprints, theses, books, and other documents, including selected Web pages that are deemed to be 'scholarly.'"[16] Because many of Google Scholar's search results link to commercial journal articles, most people will be able to access only an abstract and the citation details of an article, and will have to pay a fee to access the entire article.[16] The most relevant results for the searched keywords are listed first, ordered by the author's ranking, the number of references linked to the article and their relevance to other scholarly literature, and the ranking of the publication in which the article appears.[17]
Groups and access to literature
Through its "group of" feature, Google Scholar shows the available links to journal articles. In the 2005 version, this feature provided a link to both subscription-access versions of an article and to free full-text versions of articles; for most of 2006, it provided links to only the publishers' versions. Since December 2006, it has provided links to both published versions and major open access repositories, including those posted on individual faculty web pages and other unstructured sources identified by similarity. On the other hand, Google Scholar does not allow users to filter explicitly between toll-access and open-access resources, a feature offered by Unpaywall and the tools that embed its data, such as Web of Science, Scopus, and Unpaywall Journals, used by libraries to calculate the real costs and value of their collections.[18]
Citation analysis and tools
Through its "cited by" feature, Google Scholar provides access to abstracts of articles that have cited the article being viewed.[19] It is this feature in particular that provides the citation indexing previously only found in CiteSeer, Scopus, and Web of Science. Google Scholar also provides links so that citations can be either copied in various formats or imported into user-chosen reference managers such as Zotero.
"Scholar Citations profiles" are public author profiles that are editable by authors themselves.[12] Individuals, logging on through a Google account with a bona fide address usually linked to an academic institution, can now create their own page giving their fields of interest and citations. Google Scholar automatically calculates and displays the individual's total citation count, h-index, and i10-index. According to Google, "three quarters of Scholar search results pages [...] show links to the authors' public profiles" as of August 2014.[12]
Related articles
Through its "Related articles" feature, Google Scholar presents a list of closely related articles, ranked primarily by how similar these articles are to the original result, but also taking into account the relevance of each paper.[20]
US legal case database
Google Scholar's legal database of US cases is extensive. Users can search and read published opinions of US state appellate and supreme court cases since 1950, US federal district, appellate, tax, and bankruptcy courts since 1923, and US Supreme Court cases since 1791.[19] Google Scholar embeds clickable citation links within the case, and the "How Cited" tab allows lawyers to research prior case law and the subsequent citations to the court decision.[21] The Google Scholar Legal Content Star Paginator extension inserts Westlaw and LexisNexis style page numbers in line with the text of the case.[22]
Ranking algorithm
While most academic databases and search engines allow users to select one factor (e.g. relevance, citation counts, or publication date) to rank results, Google Scholar ranks results with a combined ranking algorithm in a "way researchers do, weighing the full text of each article, the author, the publication in which the article appears, and how often the piece has been cited in other scholarly literature".[17] Research has shown that Google Scholar puts especially high weight on citation counts[23] and on words included in a document's title.[24] In searches by author or year, the number of citations is the decisive factor, whereas in keyword searches it is probably the most heavily weighted factor, although other factors also contribute.[25] As a consequence, the first search results are often highly cited articles.
Limitations and criticism
Some searchers found Google Scholar to be of comparable quality and utility to subscription-based databases when looking at citations of articles in some specific journals.[26][27] These reviews recognize that its "cited by" feature in particular poses serious competition to Scopus and Web of Science. A study looking at the biomedical field found citation information in Google Scholar to be "sometimes inadequate, and less often updated".[28] The coverage of Google Scholar may vary by discipline compared to other general databases.[29]
Google Scholar strives to include as many journals as possible, including predatory journals, which "have polluted the global scientific record with pseudo-science, a record that Google Scholar dutifully and perhaps blindly includes in its central index."[30] Google Scholar does not publish a list of journals crawled or publishers included, and the frequency of its updates is uncertain. Bibliometric evidence suggests Google Scholar's coverage of the sciences and social sciences is competitive with other academic databases; however, as of 2017, Scholar's coverage of the arts and humanities has not been investigated empirically, and Scholar's utility for disciplines in these fields remains ambiguous.[31]
Especially early on, some publishers did not allow Scholar to crawl their journals. Elsevier journals have been included since mid-2007, when Elsevier began to make most of its ScienceDirect content available to Google Scholar and Google's web search.[32] However, a 2014 study[4] estimates that Google Scholar can find almost 90% (approximately 100 million) of all scholarly documents on the Web written in English. Large-scale longitudinal studies have found that between 40 and 60 percent of scientific articles are available in full text via Google Scholar links.[33]
Google Scholar puts high weight on citation counts in its ranking algorithm and has therefore been criticized for strengthening the Matthew effect:[23] because highly cited papers appear in top positions, they gain more citations, while new papers hardly appear in top positions and therefore get less attention from the users of Google Scholar and hence fewer citations. The Google Scholar effect is a phenomenon in which some researchers pick and cite works appearing in the top results on Google Scholar regardless of their contribution to the citing publication, because they automatically assume these works' credibility and believe that editors, reviewers, and readers expect to see these citations.[34]
Google Scholar has problems identifying publications on the arXiv preprint server correctly. Punctuation characters in titles produce wrong search results, and authors are assigned to the wrong papers, which leads to erroneous additional search results. Some search results are even returned without any comprehensible reason.[35][36]
Google Scholar is vulnerable to spam.[37][38] Researchers from the University of California, Berkeley and Otto-von-Guericke University Magdeburg demonstrated that citation counts on Google Scholar can be manipulated and that complete nonsense articles created with SCIgen were indexed by Google Scholar.[39] They concluded that citation counts from Google Scholar should only be used with care, especially when used to calculate performance metrics such as the h-index or impact factor. Google Scholar started computing an h-index in 2012 with the advent of individual Scholar pages. Several downstream packages like Harzing's Publish or Perish also use its data.[40] The practicality of manipulating h-index calculators by spoofing Google Scholar was demonstrated in 2010 by Cyril Labbe from Joseph Fourier University, who managed to rank "Ike Antkare" ahead of Albert Einstein by means of a large set of SCIgen-produced documents citing each other (effectively an academic link farm).[41]
As of 2010, Google Scholar was not able to Shepardize case law, as Lexis can.[42]
Unlike other indexes of academic work such as Scopus and Web of Science, Google Scholar does not maintain an Application Programming Interface that may be used to automate data retrieval. Use of web scrapers to obtain the contents of search results is also severely restricted by the implementation of rate limiters and CAPTCHAs. Google Scholar does not display or export Digital Object Identifiers (DOIs), a de facto standard implemented by all major academic publishers to uniquely identify and refer to individual pieces of academic work.
Search engine optimization for Google Scholar
Search engine optimization (SEO) for traditional web search engines such as Google has been popular for many years. For several years, SEO has also been applied to academic search engines such as Google Scholar.[43] SEO for academic articles is also called "academic search engine optimization" (ASEO) and is defined as "the creation, publication, and modification of scholarly literature in a way that makes it easier for academic search engines to both crawl it and index it".[43] ASEO has been adopted by organizations such as Elsevier,[44] OpenScience,[45] Mendeley,[46] and SAGE Publishing[47] to optimize their articles' rankings in Google Scholar. ASEO also has drawbacks: the same techniques can be misused to manipulate rankings.[39]
References
- "Search Tips: Content Coverage". Google Scholar. Retrieved 27 April 2016.
- Gusenbauer, Michael (2018-11-10). "Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases". Scientometrics. 118: 177–214. doi:10.1007/s11192-018-2958-5. ISSN 0138-9130. S2CID 53249161.
- Orduña-Malea, E., Ayllón, J. M., Martín-Martín, A., & Delgado López-Cózar, E. (2015). Methods for estimating the size of Google Scholar. Scientometrics, 104(3), 931–49.
- Trend Watch (2014) Nature 509(7501), 405 – discussing Madian Khabsa and C Lee Giles (2014) The Number of Scholarly Documents on the Public Web, PLOS ONE 9, e93949.
- Kolata, Gina (30 October 2017). "Many Academics Are Eager to Publish in Worthless Journals". The New York Times. Retrieved 2 November 2017.
- Giles, J. (2005). "Science in the web age: Start your engines". Nature. 438 (7068): 554–55. Bibcode:2005Natur.438..554G. doi:10.1038/438554a. PMID 16319857. S2CID 4432132.
- Hughes, Tracey (December 2006). "An interview with Anurag Acharya, Google Scholar lead engineer". Google Librarian Central.
- Assisi, Francis C. (3 January 2005). "Anurag Acharya Helped Google's Scholarly Leap". INDOlink. Archived from the original on 2011-06-08. Retrieved 2007-04-19.
- Steven Levy (2015) The gentleman who made Scholar. "Back channel" on Medium.
- Quint, Barbara (August 27, 2007). "Changes at Google Scholar: A Conversation With Anurag Acharya". Information Today.
- Madrigal, Alexis C. (3 April 2012). "20 Services Google Thinks Are More Important Than Google Scholar". Atlantic.
- Alex Verstak: "Fresh Look of Scholar Profiles". Google Scholar Blog, August 21, 2014
- James Connor: "Google Scholar Library". Google Scholar Blog, November 19, 2013
- "International Journal of Internet Science – Google Scholar Citations". Retrieved 2014-08-22.
- Google Scholar Library Links
- Vine, Rita (January 2006). "Google Scholar". Journal of the Medical Library Association. 94 (1): 97–99. PMC 1324783.
- "About Google Scholar". Retrieved 2010-07-29.
- Denise Wolfe (2020-04-07). "SUNY Negotiates New, Modified Agreement with Elsevier - Libraries News Center University at Buffalo Libraries". library.buffalo.edu. University at Buffalo. Retrieved 2020-04-18.
- "Google Scholar Help".
- Official Google Blog: Exploring the scholarly neighborhood
- Dreiling, Geri (May 11, 2011). "How to Use Google Scholar for Legal Research". Lawyer Tech Review.
- "Google Scholar Legal Content Star Paginator". Archived from the original on 2012-03-14. Retrieved 2011-06-06.
- Jöran Beel and Bela Gipp. Google Scholar's Ranking Algorithm: An Introductory Overview. In Birger Larsen and Jacqueline Leta, editors, Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI'09), vol. 1, pp. 230–41, Rio de Janeiro, July 2009. International Society for Scientometrics and Informetrics. ISSN 2175-1935.
- Beel, J.; Gipp, B. (2009). Google Scholar's ranking algorithm: The impact of citation counts (An empirical study) (PDF). 2009 Third International Conference on Research Challenges in Information Science. pp. 439–46. doi:10.1109/RCIS.2009.5089308. ISBN 978-1-4244-2864-9. S2CID 843045.
- Rovira, Cristòfol; Guerrero-Solé, Frederic; Codina, Lluís (2018-06-18). "Received citations as a main SEO factor of Google Scholar results ranking". Profesional de la Información. 27 (3): 559–569. doi:10.3145/epi.2018.may.09. ISSN 1699-2407.
- Bauer, Kathleen; Bakkalbasi, Nisa (September 2005). "An Examination of Citation Counts in a New Scholarly Communication Environment". D-Lib Magazine. 11 (9). doi:10.1045/september2005-bauer.
- Kulkarni, A. V.; Aziz, B.; Shams, I.; Busse, J. W. (2009). "Comparisons of Citations in Web of Science, Scopus, and Google Scholar for Articles Published in General Medical Journals". JAMA: The Journal of the American Medical Association. 302 (10): 1092–96. doi:10.1001/jama.2009.1307. PMID 19738094.
- Falagas, M. E.; Pitsouni, E. I.; Malietzis, G. A.; Pappas, G. (2007). "Comparison of PubMed, Scopus, Web of Science, and Google Scholar: Strengths and weaknesses". The FASEB Journal. 22 (2): 338–42. doi:10.1096/fj.07-9492LSF. PMID 17884971. S2CID 303173.
- Kousha, K.; Thelwall, M. (2007). "Google Scholar citations and Google Web/URL citations: A multi-discipline exploratory analysis" (PDF). Journal of the American Society for Information Science and Technology. 57 (6): 1055–65. Bibcode:2007JASIS..58.1055K. doi:10.1002/asi.20584.
- Beall, Jeffrey (November 2014). "Google Scholar is Filled with Junk Science". Scholarly Open Access. Archived from the original on 2014-11-07. Retrieved 2014-11-10.
- Fagan, Jody (2017). "An evidence-based review of academic web search engines, 2014–2016: Implications for librarians' practice and research agenda". Information Technology and Libraries. 36 (2): 7–47. doi:10.6017/ital.v36i2.9718.
- Brantley, Peter (3 July 2007). "Science Direct-ly into Google". O'Reilly Radar. Archived from the original on 21 April 2008.
- Martín-Martín, Alberto; Orduña-Malea, Enrique; Ayllón, Juan Manuel; Delgado López-Cózar, Emilio (2014-10-30). "Does Google Scholar contain all highly cited documents (1950–2013)?". arXiv:1410.8464 [cs.DL].
- Serenko, A.; Dumay, J. (2015). "Citation classics published in knowledge management journals. Part II: Studying research trends and discovering the Google Scholar Effect" (PDF). Journal of Knowledge Management. 19 (6): 1335–55. doi:10.1108/JKM-02-2015-0086.
- Jacso, Peter (24 September 2009). "Google Scholar's Ghost Authors, Lost Authors, and Other Problems". Library Journal. Archived from the original on 7 June 2011.
- Péter Jacsó (2010). "Metadata mega mess in Google Scholar". Online Information Review. 34: 175–91. doi:10.1108/14684521011024191.
- On the Robustness of Google Scholar against Spam
- Scholarly Open Access – Did A Romanian Researcher Successfully Game Google Scholar to Raise his Citation Count? Archived 2015-01-22 at the Wayback Machine
- Beel, Joeran; Gipp, Bela (December 2010). "Academic search engine spam and google scholar's resilience against it" (PDF). Journal of Electronic Publishing. 13 (3). doi:10.3998/3336451.0013.305.
- "Publish or Perish". Anne-Wil Harzing.com. Retrieved 2013-06-15.
- Labbe, Cyril (2010). "Ike Antkare one of the great stars in the scientific firmament" (PDF). Laboratoire d'Informatique de Grenoble RR-LIG-2008 (technical report). Joseph Fourier University.
- Benn, Oliver (March 9, 2010). "Is Google Scholar a Worthy Adversary?" (PDF). The Recorder.
- Beel, Jöran; Gipp, Bela; Wilde, Erik (2010). "Academic Search Engine Optimization (ASEO)" (PDF). Journal of Scholarly Publishing. 41 (2): 176–90. doi:10.3138/jsp.41.2.176.
- "Get found – optimize your research articles for search engines".
- "Why and how should you optimize academic articles for search engines?".
- "Academic SEO – Market (And Publish) or Perish". 2010-11-29.
- "Help Readers Find Your Article". 2015-05-19.
Further reading
- Jensenius, F., Htun, M., Samuels, D., Singer, D., Lawrence, A., & Chwe, M. (2018). "The Benefits and Pitfalls of Google Scholar" PS: Political Science & Politics, 51(4), 820-824.