Mammography and probability
Before we come to the central point of this article, we have to answer the question of what mammography is. It is an imaging technique that uses a low-dose X-ray system to examine the breasts. An examination using this method is called a mammogram. Mammograms are widely considered an effective way to detect early breast cancer in women who show no other symptoms, and to find and diagnose the disease in women who already have symptoms. Doctors point out that mammograms can show suspicious changes in the breast up to two years before other symptoms such as a lump, pain or nipple discharge appear, i.e. before a patient or physician can feel any change.
This article is not about the possible risks arising from the radiation. The radiation dose from a mammogram is about 0.7 mSv, roughly what the average person receives from natural background radiation in three months.
It is about false-positive examination results, i.e. results that are positive although the patient does not have breast cancer. Five to 15 percent of screening mammograms are positive (in our diagram the value is 9.6 %), which means that further tests are required, e.g. additional mammograms, ultrasound or even a biopsy. Most of the biopsy results turn out negative, proving that no cancer is present. Women who undergo yearly examinations between the ages of 40 and 49 have a 30 % chance of receiving a false-positive mammogram during these years. What is worse, between seven and eight percent of these women will end up having a breast biopsy because of a false-positive result. For women aged 50 or older, the estimate for false-positive mammograms is about 25 percent.
Let's concentrate on a hypothetical example: A woman who routinely undergoes mammography receives a positive result. She asks her doctor about her chances. He tells her that, according to his calculations, the chance that the result is wrong is between 20 and 30 percent.
The doctor used the following data for his miscalculation:
One percent of all women undergoing mammography have breast cancer. For women with cancer, the test result is positive in 80 percent of cases. On the other hand, the test result is also positive for 9.6 percent of healthy women.
Our doctor failed to use Bayesian reasoning. But he is in "good" company, as many doctors share this innumeracy problem. Several studies, e.g. Gigerenzer and Hoffrage 1995 and Casscells, Schoenberger, and Graboys 1978, show that many doctors have the same difficulty. Only about 15 percent of the doctors were able to solve the problem; most doctors arrived at results between 70 and 80 percent. The authors of another article ("Communicating Statistical Information", Science, 22 December 2000) say that just one out of 24 doctors is capable of solving this problem.
But why is it so difficult to get the correct answer?
If you want to spare yourself the mathematical explanations, you will find the correct answer at the end of this article.
The included diagram will be helpful. Warning: The diagram on this page is not drawn to scale. Its purpose is to make things easier to understand.
We know that 1 percent of all women undergoing regular mammography have breast cancer. For these women, the test result is correct, i.e. positive, in 80 percent of cases.
In other words: The probability of a positive result for a woman with breast cancer is 80 percent.
(B = breast cancer,
Pos = positive result of examination,
H = healthy)
P( Pos | B ) = 0.8
P( Pos | H ) = 0.096
P(B) = 0.01
P(H) = 1 - P(B) = 0.99
P(Pos) = 0.8 * 0.01 + 0.096 * 0.99 = 0.10304
The probability that a woman with a positive result does in fact have breast cancer can now be calculated using Bayes' theorem:
P(B | Pos) = P(Pos | B) * P(B) / P(Pos)
P(B | Pos) = 0.8 * 0.01 / 0.10304 = 0.07764
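The calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `posterior` and its parameter names are our own choice for illustration and do not come from any library.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(B | Pos) using Bayes' theorem.

    prior               -- P(B), share of women who have breast cancer
    sensitivity         -- P(Pos | B), positive result given cancer
    false_positive_rate -- P(Pos | H), positive result given healthy
    """
    # Total probability of a positive result: P(Pos)
    p_pos = sensitivity * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem: P(B | Pos) = P(Pos | B) * P(B) / P(Pos)
    return sensitivity * prior / p_pos


# Values taken from the example in this article
p_b_given_pos = posterior(prior=0.01, sensitivity=0.8, false_positive_rate=0.096)

print(f"P(B | Pos) = {p_b_given_pos:.5f}")      # probability of cancer
print(f"P(H | Pos) = {1 - p_b_given_pos:.5f}")  # probability of being healthy
```

Changing the arguments shows how strongly the outcome depends on the prior P(B): the rarer the disease among the women screened, the less a single positive result says.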
The chances for the unfortunate woman in our example are a lot better than the doctor told her: the probability that she is not sick is about 92.2 %, not 20 to 30 %.
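The result is often easier to grasp when we count women instead of multiplying probabilities. The following sketch does this for an assumed group of 10,000 women; the group size and the function name `natural_frequencies` are our own illustrative choices, while the three rates are the ones used in this article. Of the 10,000 women, 100 have breast cancer and 80 of them test positive. Of the 9,900 healthy women, about 950 test positive as well.

```python
def natural_frequencies(population, prior, sensitivity, false_positive_rate):
    """Split a group of women into true and false positives (expected counts)."""
    sick = population * prior                      # women with breast cancer
    healthy = population - sick                    # women without breast cancer
    true_positives = sick * sensitivity            # sick and tested positive
    false_positives = healthy * false_positive_rate  # healthy but tested positive
    return true_positives, false_positives


# Hypothetical group of 10,000 women, rates as in the example above
tp, fp = natural_frequencies(10_000, 0.01, 0.8, 0.096)

print(f"True positives:  {tp:.0f}")
print(f"False positives: {fp:.0f}")
# Share of women with cancer among all women with a positive result
print(f"Share with cancer among positives: {tp / (tp + fp):.4f}")
```

The healthy women with a positive result outnumber the sick ones by roughly twelve to one, which is why a positive mammogram alone leaves the probability of cancer below 8 percent.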
Annotation: Another detailed example on this topic can be found on our website "Python Course", in the chapter "Introduction into Text Classification".