A Novel and Objective Measure of Scientific Impact

Introduction

Two months ago, I wrote that Intuitive Quantitation was hard at work building a new metric for measuring scientific impact – one that works independently of traditional citation metrics. I am pleased to report that those efforts have been largely successful. This post presents some of our results from applying the new methodology to several disease topic areas.

Previously, I teased the idea of tracking the uptake of ideas and keywords as a potential method for calculating impact. We were inspired to develop a system based on memes, hearkening back to the work of Richard Dawkins, who posited that ideas propagate through cultures much as genes do through biological systems. Through quantitative tracking of idea uptake within a topic, we could theoretically identify and track specific instances where new and, critically, successful concepts originated and began to spread.

From Title, Abstract, and Keywords to Score

Our analysis relies on tracking concepts in publications through unique words and phrases (n-grams) appearing in the title, abstract, and keyword list. Our hypothesis was that the frequency of an impactful publication’s n-grams would measurably increase in subsequently published papers. To prepare the data for analysis, each title and abstract was tokenized: trivial words were discarded and duplicate words removed. This generated sets of single words (monograms), two-word phrases (bigrams), and three-word phrases (trigrams). For any given paper, each n-gram was compared against all other papers in the search set to determine how frequently the term appears before the paper’s publication versus after it. The difference gives the term score:

$$ \text{TermScore} = \frac{\text{OccurrencesFuture}}{\text{PapersFuture}} - \frac{\text{OccurrencesPast}}{\text{PapersPast}} $$

A paper’s overall score was then calculated by averaging the individual term scores.
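The tokenization and scoring steps above can be sketched as follows. This is a minimal illustration: the stop-word list, tokenizer, and function names are our own placeholders, not the actual IQuant Engine implementation.

```python
import re

# Illustrative subset of "trivial words" to discard; a real pipeline
# would use a fuller stop-word list.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "with"}

def ngrams(text, max_n=3):
    """Tokenize a title/abstract into a deduplicated set of
    monograms, bigrams, and trigrams (a set removes duplicates)."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if t not in STOP_WORDS]
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams

def term_score(term, past, future):
    """past/future: lists of n-gram sets, one per paper in the search
    set published before/after the paper being scored."""
    return (sum(term in p for p in future) / len(future)
            - sum(term in p for p in past) / len(past))

def paper_score(paper_grams, past, future):
    """A paper's overall score is the average of its term scores."""
    return sum(term_score(t, past, future) for t in paper_grams) / len(paper_grams)
```

A term appearing in none of the earlier papers but in half of the later ones would score 0.5, pulling the paper's average up.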

We also applied a concept-based impact analysis to measure the uptake of MeSH Keywords. Following similar logic, we hypothesized that the first paper to use a highly successful keyword within a topic area should be considered more impactful than a paper whose keywords are not well adopted by future publications.
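One minimal way to realize this idea is to credit each keyword's later adoption rate to the paper that first used it. This is a hypothetical scheme for illustration, assuming papers are supplied in publication order; it is not necessarily the exact scoring used here.

```python
def keyword_first_use_scores(papers):
    """papers: list of sets of MeSH keywords, in publication order.
    Returns, per paper index, the mean adoption rate among later
    papers of the keywords this paper introduced."""
    first_use = {}
    for i, kws in enumerate(papers):
        for kw in kws:
            first_use.setdefault(kw, i)  # record first user of each keyword

    scores = {}
    for i, kws in enumerate(papers):
        introduced = [kw for kw in kws if first_use[kw] == i]
        later = papers[i + 1:]
        if not introduced or not later:
            scores[i] = 0.0
            continue
        # Fraction of subsequent papers that adopted each introduced keyword.
        rates = [sum(kw in p for p in later) / len(later) for kw in introduced]
        scores[i] = sum(rates) / len(rates)
    return scores
```

A paper that introduces a keyword later adopted by every subsequent paper scores 1.0; a paper whose keywords were all seen before contributes nothing new and scores 0.0.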

Combining these data statistically, we generated scores for each paper based entirely on the usage of n-grams and the uptake of keywords.
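The post does not specify how the n-gram and keyword signals are combined; one plausible sketch, standardizing each metric to a common scale and averaging, might look like this.

```python
from statistics import mean, stdev

def combine(ngram_scores, keyword_scores):
    """Standardize each metric to zero mean / unit variance, then
    average per paper. Illustrative only; the actual combination
    used by the authors is not described."""
    def z(xs):
        m = mean(xs)
        s = stdev(xs)
        return [(x - m) / s for x in xs] if s else [0.0] * len(xs)

    return [mean(pair) for pair in zip(z(ngram_scores), z(keyword_scores))]
```

Standardizing first keeps one metric's larger numeric range from dominating the other.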

We were inspired to develop a system based on memes, hearkening back to the work of Richard Dawkins who posited that ideas propagate through cultures much as genes do through biological systems.

– Benjamin Verdoorn, Senior Data Scientist

Data

Our initial data suggests that these new metrics add critical information to the evaluation of paper impact. We performed IQuant Engine searches on three specific disease areas and analyzed the results.

| Search | Number of Results | Range of Scores | Average Score |
| --- | --- | --- | --- |
| Parkinson’s and Dementia [Title/Abstract] | 9903 | -3.89 to 11.22 | -0.004 |
| Restless Leg or Restless Legs | 6358 | -3.01 to 16.25 | 0.016 |
| Charcot Marie Tooth [Title/Abstract] | 5479 | -5.11 to 7.78 | 0.031 |

The following papers from the search topics scored highly using our new metrics.

The new metrics appear to generate results that are independent of citation metrics while still highlighting papers that could reasonably be seen as genuinely impactful. While we recognize that no metric can fully capture a concept as complex as scientific impact, we posit that this analysis not only identifies papers that might otherwise be overlooked, but also flags papers whose true impact citation metrics alone may have exaggerated.

While the results so far have been informative and robust, we have identified certain situations in which this score may be less accurate. The most glaring involves papers published within the last one to two years: while it can be interesting to look at the data for these papers, our language-based metrics can be skewed by the small number of post-publication papers available for analysis, sometimes producing artifactual results. Additionally, we have found that some of the oldest papers – those with very few MeSH keywords, n-grams, or preceding papers – can generate anomalous scores.

Caveats aside, these new metrics provide critical additional data that will help us deliver useful insights to our partners and collaborators, and they have already prompted several new ideas for our own research. Our current focus is on developing a comprehensive impact score that combines all the available metrics, language-based and traditional citation-based alike. We believe this is likely to provide the broadest coverage and the most complete picture. Check back next time for more data as I explore these exciting possibilities.

Let us know how we can help enhance your research.

We work with scientists, drug discovery professionals, pharmaceutical companies and researchers to create custom reports and precision analytics to fit your project's needs – with more transparency, on tighter timelines, and prices that make sense.
