G. N. Kawchuk, R. Guan, C. Keen, B. Hauer, G. Kondrak

August 2020, Volume 29, Issue 8, pp 1917-1924. Original Article. DOI: 10.1007/s00586-020-06447-y

First Online: 22 May 2020

Using artificial intelligence algorithms to identify existing knowledge within the back pain literature


Artificial intelligence algorithms can now identify hidden data patterns within the scientific literature. In 2019, these algorithms identified a thermoelectric material within the pre-2009 chemistry literature, years before its discovery in 2012. This result inspired us to apply the same approach to the back pain literature, as the cause of back pain remains unknown in 90% of cases.


We created a subset of all PubMed abstracts containing “back” and “pain” and then trained the Word2vec algorithm to predict word proximity. We then identified word pairings with high vector proximities between three spinal domains: anatomy, pathology and treatment. We plotted both between-domain and within-domain proximities, then used the highest-proximity pairs as ground truths in analogy testing to identify known associations (e.g., Canal is to Stenosis as Multifidus is to ?).
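The proximity and analogy steps described above can be illustrated with a minimal sketch. It assumes cosine similarity as the proximity measure and the common vector-offset method for analogies (b - a + c); the abstract does not state which formulation was used. The three-dimensional vectors are invented purely for illustration and are not taken from the trained model, and the names `TOY_VECTORS`, `cosine` and `analogy` are hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. Real Word2vec
# embeddings are learned from the corpus and have many more dimensions.
TOY_VECTORS = {
    "pars_interarticularis": [0.9, 0.1, 0.0],
    "stress_reaction": [0.9, 0.1, 1.0],
    "nerve_root": [0.1, 0.9, 0.0],
    "compression": [0.1, 0.9, 1.0],
    "multifidus": [0.5, 0.5, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Return the word d that best completes 'a is to b as c is to d'.

    Uses the vector-offset method: the target is b - a + c, and the
    answer is the remaining word whose vector is closest to that target.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # Exclude the three query words so the answer is a new word.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


if __name__ == "__main__":
    answer = analogy(
        "pars_interarticularis", "stress_reaction", "nerve_root", TOY_VECTORS
    )
    print(answer)
```

With these toy vectors, the query "pars_interarticularis is to stress_reaction as nerve_root is to ?" returns `compression`, mirroring the known association reported in the results. On a real corpus the vectors would come from the trained Word2vec model rather than a hand-written dictionary.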


We found 50,038 abstracts, yielding 27,984 unique words and 108,252 instances of “back pain”. Ground truth pairings ranged in proximity from 0.86 to 0.70. Plotting revealed unique proximity representations between the three spine domains. From analogy testing, we identified 13 known word associations (e.g., pars_interarticularis is to stress_reaction as nerve_root is to compression).


Artificial intelligence algorithms can successfully extract complex concepts from the back pain literature. While the use of AI algorithms to discover potentially unknown word associations requires future validation, our results provide investigators with a novel tool to generate new hypotheses regarding the origins of low back pain (LBP) and other spine-related topics. To encourage use of these tools, we have created a free web-based app for investigator-driven queries.
