Bandolier's Little Book of Making Sense of the Medical Evidence
By Andrew Moore & Henry McQuay
(Oxford University Press, 2006)
This is the farthest I have strayed so far from library technology. This book is outside the LIS domain altogether. However, to my mind evidence-based research is so important to the LIS field that I feel it appropriate to summarize in this blog entry the experience of evidence-based medicine (EBM), the area where evidence-based research was developed and where it has demonstrated the best results.
I will only pay attention to the parts of this book that have general methodological value for any evidence-based study.
"Too often someone will claim an evidence base when the evidence they have is a study of two men and a dog, in which the dog got better and the men weren't ill anyway."
(from Introduction p. XII)
What being evidence-based means:
- minimum amount of evidence (meaning not less, but possibly more)
- evidence of good quality
- evidence free of bias
Realistically, only about 1% of all the articles published in medical journals are scientifically sound.
The peer review process does not guarantee the soundness of the published material.
Law of initial results: If the first results are spectacular, the subsequent ones are always mediocre.
The likelihood that the results of the study are true is directly proportional to
- the sample size
- the effect size
and inversely proportional to
- the ratio of the number of relationships in the area of study to the number of relationships selected for the study
- the degree of flexibility in design, definitions, outcomes, and analytical modes
- the degree of financial and other interests
- the "hotness" of the field (the number of competing scientific teams involved)
Information -> Knowledge -> Wisdom
Information from available sources is filtered and distilled into knowledge (information gathering that is characterized by systematization, generalization, and quality assurance), which is used according to the unique circumstances at hand to produce wisdom (practical application of knowledge to a given case to achieve desired results).
Evidence-based practice is characterized by:
- Production of high-quality evidence through research and scientific review
- Production and dissemination of evidence-based practical guidelines
- Implementation of evidence-based practices through education and change management
- Evaluation of compliance with agreed practices
Searching for evidence.
Identify the question that needs to be asked, select the information sources (databases) that need to be searched, choose a search strategy (keywords and keyword combinations), run the search, evaluate the results retrieved, modify the search strategy or the information sources, and repeat if necessary.
The four components of the question (PICO):
- Population
- Intervention
- Comparison of intervention
- Outcome of interest
Example: Is acupuncture effective in chronic back pain?
Population: all patients with chronic back pain or only those with osteoarthritis or only those after back surgery, etc.
Intervention: studies that involve even a single session or only those providing a course of a minimum of 4 sessions, what types of acupuncture are going to be included, etc.
Comparison: to treatment with analgesics, placebo, sham acupuncture, etc.
Outcome: measure of pain, mobility, quality of life, adverse effects, etc.
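One way to keep the four PICO components explicit is to write them down in a small structure before running any search. The sketch below is only an illustration: the `PicoQuestion` class, its field names, and the `to_query` helper are hypothetical, and the boolean string it builds is a generic AND/OR pattern, not the syntax of any particular database.

```python
from dataclasses import dataclass, field


@dataclass
class PicoQuestion:
    """Hypothetical container for the four PICO components."""
    population: list[str]
    intervention: list[str]
    comparison: list[str] = field(default_factory=list)
    outcome: list[str] = field(default_factory=list)

    def to_query(self) -> str:
        """Join synonyms with OR inside a component, and components with AND."""
        groups = []
        for terms in (self.population, self.intervention,
                      self.comparison, self.outcome):
            if terms:  # skip components that were left unspecified
                quoted = " OR ".join(f'"{t}"' for t in terms)
                groups.append(f"({quoted})")
        return " AND ".join(groups)


# The acupuncture example, with one possible choice for each component.
question = PicoQuestion(
    population=["chronic back pain"],
    intervention=["acupuncture"],
    comparison=["placebo", "sham acupuncture", "analgesics"],
    outcome=["pain", "mobility", "quality of life"],
)
print(question.to_query())
```

Narrowing the population or the intervention (for example, a minimum of 4 sessions) changes the terms in the corresponding list, and the search is then rerun and re-evaluated.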
The importance of size.
The random play of chance is a factor we cannot ignore, and smaller studies are much more prone to chance effects than larger ones.
A high level of statistical significance can be generated by the play of chance alone. The smaller the difference between control and treatment, the larger the trial we need to make sure that the results are clinically valid (not just statistically valid).
Trials of bigger sizes usually have higher quality in general trial architecture than trials of small sizes.
The size of the trial is not equivalent to the number of people who participated in the trial. It's equivalent to the number of events that were registered during the trial. If the number of events is small, then, even if the tested population is large, the trial is very prone to chance influences.
Meta-analysis that aggregates a big number of small trials will very likely reflect the bias common to these trials (noise in - noise out rule).
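To get a feel for how strongly size protects against chance, here is a minimal sketch that computes, with the exact binomial distribution, how often a sample of a given size would show an event rate at least 10 percentage points away from the true rate. The function name `prob_misleading`, the true rate of 50%, and the 10-point margin are all assumptions chosen for illustration; they do not come from the book.

```python
from math import comb


def prob_misleading(n: int, true_rate: float, margin: float) -> float:
    """Exact binomial probability that the observed event rate in a sample
    of size n differs from true_rate by at least `margin`."""
    total = 0.0
    for k in range(n + 1):
        # Small tolerance so that rates exactly on the margin are counted.
        if abs(k / n - true_rate) >= margin - 1e-12:
            total += comb(n, k) * true_rate**k * (1 - true_rate)**(n - k)
    return total


# Same true rate, three different sample sizes.
for n in (20, 100, 500):
    print(n, round(prob_misleading(n, true_rate=0.5, margin=0.1), 4))
```

Under these assumptions the probability of a misleading result shrinks sharply as the sample grows, which is the point made above about small studies.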
Clinical trials fundamentals.
Randomization.
Trials are randomized to exclude selection bias. Randomization by tossing a coin or using a computer program is essential. Randomization by date of birth or first name is not acceptable (they are not random).
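As a rough illustration of computer-generated allocation, the sketch below shuffles a list of participants and splits it into two arms. The function name `allocate`, the participant labels, and the seed value are hypothetical; the fixed seed is there only to make the illustration repeatable.

```python
import random


def allocate(participants, seed=None):
    """Randomly assign participants to 'treatment' or 'control' in equal
    numbers (to within one) by shuffling the list and splitting it."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


arms = allocate([f"patient-{i}" for i in range(1, 11)], seed=2006)
print(arms["treatment"])
print(arms["control"])
```

Nothing about a participant (date of birth, name, order of arrival) influences the arm they end up in, which is what assignment by date of birth or first name fails to guarantee.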
Blinding.
Blinding minimizes observer bias. Double-blind means that neither the patient nor the doctor knows which treatment has been given.
A superiority trial attempts to show that the treatment works better than control.
An equivalence trial attempts to show that the effect of treatment is equivalent to the effect of control (a notoriously difficult task).
A non-inferiority trial attempts to show that the treatment is no worse than control.
Class effect refers to a group of treatments that show the same positive and negative effects on patients. If a class effect is proven, you choose the cheapest treatment from the class.
Measuring outputs and utility.
Output is a quantitative measure of benefit or harm the treatment is going to produce for a group of patients similar to those used in the trial.
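Odds are the ratio of the number of people having the desired effect to the number of people not having the effect. The odds ratio is the odds in the experimental group divided by the odds in the control group.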
Relative risk is the ratio of experimental event rate (EER) to the control event rate (CER)
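Both the odds ratio and the relative risk, together with their confidence intervals, show whether our results are statistically significant (a confidence interval that includes 1 means no significant difference). If we lack statistical significance, the data is useless. Statistical significance alone cannot ensure the usefulness of the data.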
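Relative risk reduction is the absolute risk reduction (CER minus EER) divided by CER, which equals 1 minus the relative risk.
Absolute risk increase is the difference between EER and CER.
Number needed to treat (NNT) shows how many patients we need to treat on average to get one positive result; it is the reciprocal of the absolute difference between EER and CER. This is a measure of the clinical significance of the trial.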
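The measures above can be worked through on a single two-by-two table. The sketch below uses the standard definitions (relative risk = EER / CER, relative risk reduction = (CER - EER) / CER, NNT = 1 / |EER - CER|); the function name `risk_measures` and the counts are invented for illustration, with the "event" taken to be an unwanted outcome that the treatment is meant to prevent.

```python
def risk_measures(exp_events, exp_total, ctl_events, ctl_total):
    """Compute the usual output measures from the event counts of a trial."""
    eer = exp_events / exp_total  # experimental event rate
    cer = ctl_events / ctl_total  # control event rate
    odds_exp = exp_events / (exp_total - exp_events)
    odds_ctl = ctl_events / (ctl_total - ctl_events)
    arr = cer - eer  # absolute risk reduction (negative means an increase)
    return {
        "EER": eer,
        "CER": cer,
        "odds_ratio": odds_exp / odds_ctl,
        "relative_risk": eer / cer,
        "relative_risk_reduction": arr / cer,
        "absolute_risk_reduction": arr,
        "NNT": 1 / abs(arr) if arr != 0 else float("inf"),
    }


# Invented counts: 20 of 100 treated patients and 40 of 100 controls
# had the unwanted event.
for name, value in risk_measures(20, 100, 40, 100).items():
    print(f"{name}: {value:.3f}")
```

With these made-up numbers the relative risk is 0.5 and the NNT is 5: on average, five patients have to be treated for one of them to avoid the event.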
Effect size is the standardized observed effect: difference between the mean of experimental group and the mean of control group divided by the standard deviation.
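A minimal sketch of this calculation follows. The text does not say which standard deviation is meant, so the pooled standard deviation of the two groups is used here as an assumption (this is the common "Cohen's d" form); the function name `effect_size` and the scores are made up.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(experimental, control):
    """Standardized mean difference, using the pooled standard deviation
    of the two groups (an assumption, see the note above)."""
    n1, n2 = len(experimental), len(control)
    s1, s2 = stdev(experimental), stdev(control)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(experimental) - mean(control)) / pooled


# Made-up outcome scores for two small groups.
d = effect_size([6, 7, 8, 9, 10], [4, 5, 6, 7, 8])
print(round(d, 2))
```

Because the difference is expressed in units of standard deviation, effect sizes from trials that used different measurement scales can be put side by side.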
p-value: the probability of obtaining results at least as extreme as those observed if chance alone were at work (that is, if the treatment had no real effect).
Bias in academic studies.
Reasons for bias:
- Lack of randomization
- Lack of double-blinding
- Insufficient size
- Duplication of results
- Geography (Asian studies, studies from USSR show consistently higher positive results)
- Language (positive result is much likelier to be published in English where it will be more cited)
- Publication (only results that "achieved" something get published)
- Industry or marketing bias (who pays for the study, what study gets to be used in marketing)
Clinical trial validity.
The important thing is to compare like with like. The length of the trial can have a profound effect on its outcome. Some treatments work only in the short term; others have effects (positive or negative) that accumulate over a long period of time. Dosage and intensity of the treatment have to be similar when comparing different trials.
Avoid using averages where individual results are more important (treatment had full effect on 1/2 of patients and none on the other half; average effect of 50% does not reflect the fact that every other patient did not experience any effect at all).
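The half-and-half case can be put in numbers. In the sketch below the relief scores are invented, and the 50% cut-off used to call a patient a "responder" is an assumption made only for this illustration.

```python
from statistics import mean

# Invented pain-relief scores (percent) for ten patients:
# five had full relief, five had none.
relief = [100, 100, 100, 100, 100, 0, 0, 0, 0, 0]

average = mean(relief)  # a single summary figure for the group
# Share of patients with at least 50% relief (hypothetical cut-off).
responders = sum(1 for r in relief if r >= 50) / len(relief)

print(f"average relief: {average}%")
print(f"share of responders: {responders:.0%}")
```

The average of 50% describes no actual patient here, while the share of responders shows directly that half of the patients got no benefit.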

