The concept of causation is a difficult philosophical topic with no definitive definition. The causation implied by the Research Analysis model has been of interest to me for some time, but the inspiration for this article came from the Handbook of Analytic Philosophy of Medicine by Kazem Sadegh-Zadeh [1]. Sadegh-Zadeh provides a great overview of the philosophy of causation and, in particular, of etiology, the science of clinical causation: “Etiology, from the Greek term αιτια (aitia) meaning “the culprit” and “cause”, is the inquiry into clinical causation or causality including the causes of pathological processes and maladies” [1].
In this article, I will explore some of the tools provided in the handbook (refer to section 7.5 of the handbook for a more thorough introduction and overview) and discuss how they can be applied to the Research Analysis model. I will explore the probabilistic interpretation of claims, the concept of causation in relation to claims, and the causal relevance of competing claims.
Let’s take an example claim from Research Analysis:
(1) Statins decrease coronary events in normal humans
This claim can be found in Research Analysis here and its semantics are analysed in this previous article here. The semantic analysis resulted in the following logical description of the claim:
(2) ∀e (DECREASE(statin, myocardial infarction, e) & IN(e, normal human))
Where e represents an event. In my previous article, I suggested that the word “case” may be more natural than “event” for this model, where “the concept of a case suggests that the time period and medical process will be appropriate to the specific disease treatment paradigm”.
A lot of the logical formalisation relates to the event or case. If one specific case is considered then we have the simpler claim:
(3) DECREASE(statin, myocardial infarction)
This claim suggests that there is a decreasing relationship between the administration of statins and the occurrence of myocardial infarction.
At this point probability theory can be introduced. Sadegh-Zadeh [1, p253] shows using probability theory that an event B is probabilistically independent of another event A if and only if (iff):
(4) p(B|A) = p(B)
Where p(B|A) is the probability of the event B conditional on the fact that the event A has occurred. If events A and B are independent then the fact that event A has occurred will have no impact on the probability of event B occurring and thus the conditional probability will simply equal the probability of B occurring independently of A.
Dependence of two events is then simply represented by the opposite probability relationship:
(5) p(B|A) ≠ p(B)
That is, two events are dependent if the conditional probability of B given A is not equal to the probability of event B occurring independently: the fact that A occurred affects the probability that B will occur.
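As a rough illustration of (4) and (5), the probabilities can be estimated as relative frequencies in a sample and the equality checked directly. The data below are invented purely for demonstration and carry no medical meaning:

```python
from fractions import Fraction

# Invented sample of (A occurred, B occurred) pairs -- purely illustrative.
sample = [
    (True, False), (True, False), (True, True), (False, True),
    (False, True), (False, True), (False, False), (False, True),
]

def prob(events, predicate):
    """Relative frequency of `predicate` over `events`."""
    hits = sum(1 for e in events if predicate(e))
    return Fraction(hits, len(events))

p_b = prob(sample, lambda e: e[1])            # p(B)
given_a = [e for e in sample if e[0]]         # restrict to cases where A occurred
p_b_given_a = prob(given_a, lambda e: e[1])   # p(B|A)

# (5): in this sample A and B are dependent, since p(B|A) != p(B)
print(p_b_given_a, p_b, p_b_given_a != p_b)
```

Here p(B|A) works out to 1/3 while p(B) is 5/8, so in this sample the two events are dependent in the sense of (5).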
Coming back to the simple example, the claim “The administration of statins decreases myocardial infarction” implies that there is a conditional dependence between the two events and that:
(6) p(Myocardial infarction | The administration of statins) ≠ p(Myocardial infarction)
That is, the probability of myocardial infarction given that statins have been administered to the patient is not equal to the probability of myocardial infarction in general. There is a probabilistic dependence between the event of statin administration and myocardial infarction.
Note that this probabilistic “dependence” does not imply that there is any causal interaction between the two events A and B (here, the administration of statins and myocardial infarction), only that there is a probabilistic correlation between the events. Correlation may exhibit one of two directions in particular cases, giving:
(7) Positive correlation: p(B|A) > p(B)
(8) Negative correlation: p(B|A) < p(B)
In our example, “decrease” is intended to mean that there is a negative correlation between the events and using the probabilistic terminology we have:
(9) p(Myocardial infarction | The administration of statins) < p(Myocardial infarction)
This statement says that the administration of statins decreases the probability of myocardial infarction when compared to the general unconditional population, or that there is a negative correlation between myocardial infarction and the administration of statins. As discussed in our article relating to Popper’s philosophy of science (find it here), declarative statements like (1) and (2) found in Research Analysis are scientific statements in a form that can be logically falsified, but in the real world there is never certainty, and thus experience only ever supports probabilistic correlations like that in (9). (Strictly speaking, even probabilities are not available according to Popper, but probability has uses beyond its risks in many fields of science.)
Does the statement (9) imply that there is a causal link between statins and myocardial infarction? To answer this question it is necessary to introduce some further concepts.
Probabilistic relevance or irrelevance
The Research Analysis model has always highlighted the importance of the reference population or model for each claim by requiring the specification of the reference species, disease model and whether it is a whole animal or organ model. Most medical research begins in cell culture or animal models, but has the goal of moving into human applications. It is important to clearly separate claims that relate to mice from those that relate to humans. For this reason, the concept of probabilistic relevance conditional on a reference population or background context is introduced below (refer [1, p255-257] for a more detailed introduction):
(10) p(B|X∩A) > p(B|X) Positive probabilistic relevance or conditional correlation
(11) p(B|X∩A) < p(B|X) Negative probabilistic relevance or conditional correlation
(12) p(B|X∩A) = p(B|X) Probabilistic irrelevance or no conditional correlation
Positive probabilistic relevance (10) says that the probability of event B conditional on both events X and A occurring (X ∩ A) is greater than the probability of event B conditional on X alone. In this presentation of probabilistic relevance, X represents the reference population and A and B are the events for which cause and effect are being evaluated. The following example using (1) can be provided:
(13) p(Myocardial infarction | normal humans ∩ the administration of statins) < p(Myocardial infarction | normal humans)
This sentence says that the probability of myocardial infarction is lower in normal humans that have been administered statins than it is in normal humans in general. As noted, Research Analysis has always included the reference population or background context because research is conducted in many different species, genetic types and disease models. Sadegh-Zadeh in his work notes that the notion of background context is of great importance when analyzing issues of causality and that “There are no such things as ‘causes’ isolated from the context where they are effective or not. The background context will therefore constitute an essential element of our concept of causality.” [1, p 256]
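A minimal sketch of how the context-relative comparisons (10)-(12) could be estimated from study records; the record layout and the numbers are hypothetical, chosen only so that the negative-relevance case in (13) holds:

```python
# Hypothetical records: (species, statins_administered, myocardial_infarction).
records = [
    ("human", True, False), ("human", True, False), ("human", True, True),
    ("human", False, True), ("human", False, False), ("human", False, True),
    ("mouse", True, True), ("mouse", False, False),
]

def p_mi(condition):
    """Relative frequency of myocardial infarction among matching records."""
    matching = [r for r in records if condition(r)]
    return sum(1 for r in matching if r[2]) / len(matching)

in_x = lambda r: r[0] == "human"          # background context X: humans only
in_x_and_a = lambda r: in_x(r) and r[1]   # X ∩ A: humans given statins

p_b_given_x = p_mi(in_x)         # p(B|X)
p_b_given_xa = p_mi(in_x_and_a)  # p(B|X ∩ A)

if p_b_given_xa < p_b_given_x:
    print("negative probabilistic relevance in X")  # case (11)
```

Restricting first to the background context X makes explicit that a claim about humans is never tested against the mouse records, which is exactly the separation the Research Analysis model enforces.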
Spurious correlations & Screening Off
To this point in the discussion, it has not been possible to introduce the concept of causation; instead the weaker concepts of relevance and correlation have been used. Where there is a non-zero relevance or correlation between two events A and B as in (10, 11), then A could be a potential cause of B, but “correlation does not imply causation”. To define the concept of causation it is necessary first to define spurious correlation and the concept of screening off.
(14) Screening off: X screens A off from B iff p(B|X∩A) = p(B|X)
This says that X screens off A if and only if A is, in relation to the reference population X (or some other event or set of events), probabilistically irrelevant to B [1, p257]. We can take this concept a step further and define a spurious cause by incorporating it into the sentences (10-12) to assess whether there is an alternative event C that explains the probabilistic relevance of A on B.
(15) Spurious cause: A is a spurious cause of B if there is a C such that p(B |X ∩ A ∩ C) = p(B |X ∩ C).
We can rephrase this and introduce the concept of time order as follows: in a reference population X, an event At1 is a spurious cause of an event Bt2 iff At1 occurs before Bt2 (t1 < t2) and there is an earlier or simultaneous event Ct3 (t3 ≤ t1) that screens At1 off from Bt2, that is, p(Bt2 | X ∩ At1 ∩ Ct3) = p(Bt2 | X ∩ Ct3).
An example of a spurious cause can be provided as follows:
p(Death | Humans ∩ AIDS ∩ HIV) = p(Death |Humans ∩ HIV)
In this example, AIDS is the spurious cause: it is screened off by the earlier HIV infection. The time order of events is discussed further in the next section.
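The screening-off test in (14) is a simple equality check between two conditional probabilities. The probabilities below are hypothetical placeholders, fixed so that the screening relationship in the example holds:

```python
# Hypothetical values of p(Death | Humans ∩ events) -- not real figures.
probs = {
    frozenset({"HIV"}): 0.30,
    frozenset({"AIDS"}): 0.30,
    frozenset({"HIV", "AIDS"}): 0.30,
}

def screens_off(c, a):
    """True iff C screens A off: conditioning on A as well as C changes nothing."""
    return probs[frozenset({c, a})] == probs[frozenset({c})]

print(screens_off("HIV", "AIDS"))  # HIV screens AIDS off
```

With these placeholder numbers the screening holds in both directions, which mirrors the pair of probability relationships discussed in the next section; time order is what breaks the symmetry.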
Dominant cause
In the previous example, AIDS could certainly be a cause of death in untreated individuals; however, AIDS is screened off by HIV. Both AIDS and HIV can be considered as causes of death. Here the concept of a dominant cause can be introduced to provide a ranking between causes.
(16) Dominant Cause: At1 is a dominant cause of Bt2 in X iff At1 is a cause of Bt2 in X and there is no time t with t ≤ t1 and no event Ct in X such that p(Bt2 | X ∩ Ct ∩ At1) = p(Bt2 | X ∩ Ct).
This definition says that a cause is dominant if no earlier or simultaneous event is able to screen it off from the effect [1, p275]. This can be demonstrated with the AIDS example:
In this example there would exist the following probability relationship:
p(Death2010 | Humans ∩ HIV2001 ∩ AIDS2003) = p(Death2010 | Humans ∩ AIDS2003)
This relationship says that the probability of death given a human with both HIV and AIDS is the same as the probability of death given a human with AIDS alone; that is, AIDS2003 screens HIV2001 off from Death2010. It can also be seen that 2001 ≤ 2003 < 2010, which says that the HIV occurred prior to the AIDS. Because the screening event AIDS2003 does not occur earlier than HIV2001, this relationship does not rule HIV2001 out as the dominant cause of death. The following shows the screening relationship in the other direction:
p(Death2010 | Humans ∩ AIDS2003 ∩ HIV2001) = p(Death2010 | Humans ∩ HIV2001)
This relationship says, in a similar way, that the probability of death given a human with both AIDS and HIV is the same as the probability of death given a human with HIV alone; that is, HIV2001 screens AIDS2003 off from Death2010. In this case the screening event Ct (HIV2001) does occur earlier than At1 (AIDS2003), so there exists an earlier event that screens AIDS2003 off from Death2010. So AIDS cannot be the dominant cause of death.
The concept of dominant cause provides a means for ruling out spurious causes and for keeping track of the cause that has not been ruled out to date. But in reality we will never be able to test all possible events as causes, so knowledge of the dominant cause of a disease will always be subject to future falsification.

This is further complicated by the fact that causal chains run off into the infinite past. Taking the example, it may be that the HIV infection was caused by unprotected sex. At the time 2001 in the example above, HIV may be the dominant cause, but if unprotected sex at an earlier time is considered then the HIV would be screened off by the unprotected sex.

Looking at the causes of HIV infection, it can be seen that while unprotected sex may be a common cause of HIV it is not the only one: there is also the sharing of needles, infusion with HIV-infected blood, and so on. So there may be many causes of a disease given a broad background population like all humans, even though there may be only one cause for a specific person who contracts HIV. There can also be a common cause for the several symptoms of a disease.

Finally, it is rare that there is a single cause for an event. It is usually the case that several events contribute to any future event, and we will explore the concept of causal relevance below. Causation is a far more complex concept than most people realise. The concepts presented in this article, and more thoroughly by Sadegh-Zadeh [1], provide some valuable tools for assessing causation and making more thorough use of the concept.
Causal relevance
The concept of causal relevance is a useful metric for answering the question: which event is causally more relevant to a particular disease? Causal relevance can be defined as:
(17) cr(A,B,X) = p(B|X∩A) – p(B|X)
This states that causal relevance is simply the difference between the probability of B given the background context X and the causal event A, and the probability of B given the background context X alone. An example would be:
In a given year:
p(Myocardial infarction | normal humans) = 1%
p(Myocardial infarction | normal humans ∩ smoking) = 2%
cr = p(Myocardial infarction | normal humans ∩ smoking) – p(Myocardial infarction | normal humans)
= 0.02 – 0.01 = 0.01
The numbers are just estimates, but they suggest that while smoking may double the chance of myocardial infarction (MI) it does not have a high causal relevance within a one year period.
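Definition (17) is a one-line calculation. A sketch using the hypothetical one-year figures above:

```python
def causal_relevance(p_b_given_xa, p_b_given_x):
    """cr(A, B, X) = p(B|X ∩ A) - p(B|X); takes values in [-1, +1]."""
    return p_b_given_xa - p_b_given_x

# Hypothetical one-year figures from the smoking example above.
cr_smoking = causal_relevance(0.02, 0.01)
print(cr_smoking)  # 0.01: positive, but small in absolute terms
```

The result makes the point in the text concrete: the conditional probability doubles, yet the causal relevance is only 0.01 because both probabilities are small over a one-year window.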
Some examples that demonstrate causal relevance [1, p277]:
causal irrelevance amounts to cr(A,B,X) = 0 (no relevance), e.g. the causal relevance of your healthcare number to myocardial infarction
positive causal relevance is cr(A,B,X) > 0 (causing), e.g. smoking to myocardial infarction
negative causal relevance is cr(A,B,X) < 0 (discausing, preventing), e.g. statins to myocardial infarction
maximum positive causal relevance cr(A,B,X) = 1 (maximum efficiency), e.g. mechanically clamping your coronary artery to myocardial infarction
maximum negative causal relevance cr(A,B,X) = −1 (maximum prevention), no example
Causal relevance as defined here is not a probability, if only because it ranges from −1 to +1. It is simply a quantitative function that provides a measure of the context-relative causal impact of events.
Causal relevance can also be used to compare the relative importance of different events to an outcome by comparing their causal relevance.
If cr(A1, B, X) > cr (A2, B, X), then A1 is causally more relevant to B in X than A2.
cr(smoking, myocardial infarction, normal humans) > cr (healthcare number, myocardial infarction, normal humans)
This says that smoking is a stronger cause of myocardial infarction than is your healthcare number.
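The comparison can be sketched numerically by reusing the hypothetical figures from the smoking example; the healthcare-number probability is assumed equal to the baseline, since the number carries no causal information:

```python
def causal_relevance(p_b_given_xa, p_b_given_x):
    """cr(A, B, X) = p(B|X ∩ A) - p(B|X)."""
    return p_b_given_xa - p_b_given_x

p_mi_baseline = 0.01                                  # p(MI | normal humans), hypothetical
cr_smoking = causal_relevance(0.02, p_mi_baseline)    # smoking raises the probability
cr_hc_number = causal_relevance(0.01, p_mi_baseline)  # healthcare number: irrelevant

print(cr_smoking > cr_hc_number)  # smoking is causally more relevant
```

The healthcare number comes out with a causal relevance of exactly zero (case 12, probabilistic irrelevance), so any genuinely relevant event ranks above it.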
In later articles or versions of this article, we will explore how the concepts of causation and relative causation might be applied to the Research Analysis model and platform.
Version 1.0, 19th March, 2017