There are several different search engines for searching the biomedical literature.
How can you find the one that is best for you?
This site discusses how to compare biomedical search engines and shows comparative results for three leading options.
Quality, not Quantity -
The goal of searching the literature is to find the right facts and the right references.
In that endeavor, getting 100,000 hits is no better than getting 50,000 when only 100 documents are actually relevant.
This means you need a search engine that covers the relevant content and enables you to quickly home in on the papers you need.
It is like finding a needle in the haystack; more hay just makes the task harder and longer.
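The arithmetic behind this point can be made explicit with precision, the fraction of retrieved documents that are relevant. The sketch below is illustrative only: it assumes that both hypothetical searches retrieve all 100 relevant documents, and the function name `precision` is introduced here rather than taken from any search engine.

```python
def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Fraction of retrieved documents that are actually relevant."""
    if total_retrieved <= 0:
        raise ValueError("total_retrieved must be positive")
    return relevant_retrieved / total_retrieved


# Assumption: each search returns all 100 relevant documents.
RELEVANT = 100
precision_large = precision(RELEVANT, 100_000)
precision_small = precision(RELEVANT, 50_000)

print(f"100,000 hits: {precision_large:.2%} relevant")
print(f" 50,000 hits: {precision_small:.2%} relevant")
```

Under that assumption, only one hit in a thousand is relevant in the larger result set, against one in five hundred in the smaller one. Neither is practical to read through, which is why narrowing matters more than raw hit counts.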
Ease of Understanding Why You Got the Results -
Once you get a list of hits, the next thing most of us do is peruse the results to see if the search indeed found relevant documents. You need a search engine that makes this easy since this can be a time-consuming task.
Ease of Exploring the Results and Finding What You Need -
When you do a literature search for the purpose of finding information, such as what genes are associated with a particular process, extracting the desired information can be another pain point. The easier the search engine makes that step, the better for you.
And, when you do a literature search to find an article to use as a reference, refining the results to get to the right reference can also be a hit-or-miss effort. Refining the search too much can exclude the key reference; not refining enough leaves you with a large haystack. Thus, the better search engine will support intuitive drill-down.
PubMed
PubMed is the U.S. government-supported effort to provide a means to search the biomedical literature. Over the years, this amazing effort has been critical for essentially every research project and for most physicians.
Search and ranking algorithms: PubMed uses the Entrez search engine for keyword matches of your search term against the text in the title, abstract, and citation information of each document. There is no full-text document searching, although links to the full text are provided on some of the individual document pages.
The Entrez engine also matches your search terms against the MeSH headings to find documents. By default, results are sorted by date, and can be resorted by author, journal or title. No relevance sorting is available.
Content: PubMed covers the citation information for over 20 million journal articles.
PubMed Alternatives: Most of the alternative biomedical literature search sites are based on PubMed and use the NCBI Entrez search engine - even if the specific interface looks different. Hence, these PubMed "knock-offs" are not included in this comparative analysis.
Google Scholar
Google Scholar is a cross-disciplinary search engine for journal articles and other scholarly works.
Search and ranking algorithms: Google Scholar does keyword matching to find articles that contain all of your search terms. The results are presented in relevance order, where relevance combines the text of the article with citation, journal, and author information. The citation information appears to dominate, so that highly cited articles are listed first, not necessarily the articles that are relevant to your needs.
Content: Google Scholar's actual content is not well-defined, but essentially covers whatever Google can reach with its search engine indexing robots, limited to whatever Google decides is scholarly. As such, this search engine covers the most full-text content as most publishers allow Google to index their articles. This search engine also covers patents and legal opinions.
Quertle
Quertle is a rapidly growing option in the biomedical search engine area. Quertle is the only major biomedical search engine to focus on semantic searching to improve the relevance of results. Like PubMed and Google Scholar, Quertle is freely accessible.
Search and ranking algorithms: Quertle uses natural language processing to match your search against actual assertions made by the author, rather than simple keyword matching or proximity. Quertle uses an extensive ontology so that all synonyms or aliases for your search terms will be applied automatically. The ontology also allows child concepts to be found; for example, a search for "rodent" will find "mice", "rats", etc. By default, Quertle ranks the results based on relevance, where relevance is primarily determined by how many times the document supports the assertion. In addition to its advanced relationship-based results, Quertle simultaneously does a simple keyword matching search, with those results presented in a separate tab.
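Ontology-based expansion of this kind can be sketched in a few lines. The example below is a toy illustration, not Quertle's implementation: the `SYNONYMS` and `CHILDREN` tables, the function names, and the simple word matching are all assumptions made for the sketch, seeded with the "rodent" example from the text.

```python
# Toy ontology: hypothetical data, not Quertle's actual ontology.
SYNONYMS = {
    "mouse": ["mice"],
    "rat": ["rats"],
}
CHILDREN = {
    "rodent": ["mouse", "rat"],  # child concepts of "rodent"
}


def expand_term(term: str) -> set[str]:
    """Return the term plus every synonym and child concept reachable from it."""
    pending = [term.lower()]
    expanded: set[str] = set()
    while pending:
        current = pending.pop()
        if current in expanded:
            continue
        expanded.add(current)
        pending.extend(SYNONYMS.get(current, []))
        pending.extend(CHILDREN.get(current, []))
    return expanded


def matches(text: str, term: str) -> bool:
    """True if the text contains any word from the expanded term set."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    return not expand_term(term).isdisjoint(words)


print(sorted(expand_term("rodent")))
print(matches("Rats were fed a high-fat diet.", "rodent"))
```

A plain keyword match for "rodent" would miss the sentence about rats; the expanded set catches it. A real system would also need to handle multi-word terms and ambiguous aliases, which this sketch ignores.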
Content: Quertle's content covers all of PubMed plus a large number of full-text articles, the NIH RePORTER database of NIH grants, the TOXLINE database of toxic effects of chemicals, biomedical news reports, and whitepapers such as research reports from biomedical companies.
All searches were done on January 26, 2011. The same search was entered on each site, differing only, if needed, to accommodate the syntax required by each site. The results below show the first results page, including the number of documents found. Searches were carried out as most users would, that is, not using the advanced features or limits.
PubMed
PubMed found 202 documents from this search. It is unclear without further examination why some of the documents were found (e.g., item #2), but it is clear that the results include maids as support persons (items 1 and 3).
Google Scholar
Google Scholar found "about 295,000" documents from this search. Many of these results (e.g., 1, 2, and 4) match the author name Maid. Result #3 has nothing to do with MAID, and neither the term MAID nor any of its synonyms appears anywhere in the article. Result #5 is about household help. As noted above, more hits are not better.
Quertle
Quertle found 19 documents that refer to the MAID protein. Because Quertle automatically searches for all aliases of search terms, the documents were found because GCIP, Dip1, etc. are equivalent to MAID. The 19 results are highly relevant and contain assertions by the authors about this protein. An additional 7 results are found by traditional keyword searches (the Keyword Results tab).
PubMed
PubMed found no documents from this search. The term NO has been excluded as a "stopword" (common terms excluded from searching), without recognizing that this is Nitric Oxide instead of the negative "no".
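The way a stopword list can swallow an abbreviation is easy to reproduce. The sketch below is a minimal illustration and not PubMed's actual code: the stopword list, the hypothetical one-word query `NO`, and the case-aware alternative with its `PROTECTED` set are all assumptions made for the example.

```python
# Hypothetical stopword list; the actual list used by any engine is not given here.
STOPWORDS = {"a", "an", "and", "in", "no", "of", "the"}

# Assumption for illustration: upper-case tokens listed here are treated
# as abbreviations (NO = nitric oxide) and are never dropped.
PROTECTED = {"NO"}


def naive_filter(query: str) -> list[str]:
    """Lowercase every token, then drop stopwords."""
    return [tok for tok in query.lower().split() if tok not in STOPWORDS]


def case_aware_filter(query: str) -> list[str]:
    """Keep protected abbreviations as typed; drop ordinary stopwords."""
    kept = []
    for tok in query.split():
        if tok in PROTECTED:
            kept.append(tok)
        elif tok.lower() not in STOPWORDS:
            kept.append(tok.lower())
    return kept


print(naive_filter("NO"))             # nothing left to search for
print(case_aware_filter("NO"))        # abbreviation survives
print(case_aware_filter("no effect")) # lower-case "no" is still dropped
```

With the naive filter the query is empty before the search even starts, which is consistent with a result of zero documents. Case alone is a crude signal, so a production system would more plausibly rely on a domain vocabulary to decide which tokens are abbreviations.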
Google Scholar
Google Scholar found "about 5,030,000" documents from this search. Many of these results (including all on the first page) match author initials or last names rather than the compound.
Quertle
Because Quertle's extensive ontology and semantic rules are fine-tuned for biomedical sciences, Quertle was able to find 88,075 documents that refer to Nitric Oxide, consistent with the vast literature on this compound.
PubMed
PubMed found 57,321 documents from this search. All of the documents contain the terms "disease" and "aging". It is unclear how to go about using these results to identify the actual diseases of aging without reading every article.
Google Scholar
Google Scholar found "about 1,660,000" documents from this search. As with the PubMed results, most appear to contain the terms "disease" and "aging", but again the only way to catalogue a list of the diseases would be to read many of the 1.6 million results.
Quertle
Quertle found 4,736 documents that assert a relationship between disease and aging. Note that Quertle automatically finds alternative spellings, such as "ageing", and related concepts: "longevity" and "senescence" for "aging"; "illness", "malady", etc. for "diseases". Quertle also found 60,065 documents that contain both search terms (or their synonyms) without necessarily asserting a relationship between them.
In this case, the goal of the search was to find out which diseases are associated with aging. For this, Quertle has a unique feature called Power Terms™. A Power Term represents an entire class of entities. For the search of interest here, the better search to do is "$Diseases of aging", where the Power Term $Diseases represents any disease (or more broadly anything bad physiologically), such as diabetes, Alzheimer's, or oxidative stress - but NOT the term "disease" or any of its synonyms. This way, the results focus in on specific diseases as seen in the next image.
The search with the Power Term finds more relevant documents, and Quertle takes this a step further. For all searches, Quertle automatically identifies the Key Concepts and presents them on the left with the other filters. When a Power Term is used, those Key Concepts include a listing of all members of the Power Term that were found in the literature, as in the next figure.
In this case, the list shows the diseases that were stated in the literature to be associated with aging.
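The idea of a class-level term whose matched members are tallied into a Key Concepts list can be sketched as follows. This is an illustration of the concept only, not Quertle's implementation: the `POWER_TERMS` table, the `key_concepts` function, the substring matching, and the sample sentences are all hypothetical, with the class members taken from the examples named above.

```python
from collections import Counter

# Hypothetical class membership, seeded with the examples named in the text.
POWER_TERMS = {
    "$diseases": {"diabetes", "alzheimer's", "oxidative stress"},
}


def key_concepts(power_term: str, sentences: list[str]) -> Counter:
    """Count how many sentences mention each member of the Power Term's class."""
    members = POWER_TERMS[power_term.lower()]
    counts: Counter = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        for member in members:
            if member in lowered:
                counts[member] += 1
    return counts


# Made-up sentences for illustration only; not taken from any search result.
sentences = [
    "Diabetes prevalence rises with aging.",
    "Aging increases oxidative stress in muscle tissue.",
    "Any disease of aging deserves further study.",
    "Type 2 diabetes is more common in aging populations.",
]

concepts = key_concepts("$Diseases", sentences)
for name, count in concepts.most_common():
    print(name, count)
```

The third sentence contains the word "disease" but names no specific disease, so it contributes nothing to the tally. That mirrors the stated behavior of the Power Term, which stands for members of the class rather than for the word itself.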
Getting the results is only the start of the literature searching process. Next is understanding whether the results are on target and drilling down to find what you need.
PubMed -
To understand the results on PubMed and see whether they are relevant, you can read the titles on the results page. Of course, the titles do not usually tell you enough, so you have to open a new page for each result to read the abstract. Search terms are not highlighted in the results list or on the abstract pages. Access to your institution's library subscriptions is from the abstract page. Once you have identified appropriate articles, you can use the Related Citations link to find more articles that contain similar terms (including MeSH terms). You can also use the Also Try search suggestions. To refine your search, you need to enter a new query (or add to the current one) or filter your results by Reviews or Free Full-text available. Additional filters (limits) are available on separate pages. But mostly, it is a matter of reading every title and opening the individual articles, a very time-consuming process.
Google Scholar -
With Google Scholar, three lines of text from each article are presented along with the title. Sometimes, but not always (see the examples above), the additional lines contain your search term(s) in bold. In most cases, further understanding of the relevance requires you to open the link provided. Often, this will take you to a publisher's page, where only the abstract can be viewed without a subscription or a purchase of the document. You can, however, set a preference indicating your institution, which places an additional link to the document on the results page. To refine your search, you need to modify your query. You can also limit the results to a period of time (e.g., since 2005). Once you find the needle in the haystack, you can use Google's Cited By or Related Articles links. But, as with PubMed, exploring results is mostly a time-consuming process of refining your query by trial and error and opening a large number of documents to find what is relevant.
Quertle -
For each result, Quertle presents the title as well as the sentence(s) that assert relevant relationships. Your search terms are clearly highlighted so that you can immediately see whether the results are relevant. In addition, you can view the abstract for any article without having to leave the results page. Furthermore, links are provided to view the original document at the source from which Quertle received it, as well as an available link to your institution's subscriptions. To refine your search on Quertle, a collection of easy-to-use filters is provided on the results page. Included in those filters is a list of automatically extracted Key Concepts that serve as immediate drill-down links. This way, you can quickly follow up on concepts of interest, and perhaps find new concepts that you did not already know were associated with your search terms.