Notes:Statistical test results
Notes
Let T:=(u,v) be a statistical test, where T=1 denotes a positive result and T=0 a negative one, and let P=1 denote that the thing being tested for is "true" (P=0 that it is not), then:
- P[T=1 | P=1]=u and
- P[T=0 | P=0]=v
Here we will investigate P[T=1], P[P=1 | T=1], P[P=0 | T=1] and so forth
Observations
To find, say, P[P=i] you would need P[T=j] for j=0 and j=1 to be known already; both of these require some knowledge about the population.
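To illustrate this observation: the law of total probability gives P[T=1] = P[P=1]u + (1−P[P=1])(1−v), which can be solved for P[P=1] whenever u+v≠1. The sketch below does exactly that; the helper name `prevalence_from_positive_rate` and the numbers are hypothetical, chosen only for illustration.

```python
def prevalence_from_positive_rate(t1, u, v):
    """Solve t1 = p*u + (1 - p)*(1 - v) for p = P[P=1].

    t1 is P[T=1], u is P[T=1 | P=1], v is P[T=0 | P=0].
    (Hypothetical helper name, for illustration only.)
    """
    denom = u + v - 1
    if denom == 0:
        # With u + v = 1 we get P[T=1] = u whatever p is,
        # so the positive rate says nothing about p.
        raise ValueError("u + v = 1: the positive rate does not determine p")
    return (t1 - (1 - v)) / denom


# Assumed numbers: u = 0.9, v = 0.8, and 27% of the population tests positive.
p = prevalence_from_positive_rate(0.27, 0.9, 0.8)
print(p)  # approximately 0.1, up to floating-point rounding
```

Either way some population-level quantity has to be supplied: here it is the overall positive rate, elsewhere it is P[P=1] itself.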
Findings
- P[T=j]=P[P=0]P[T=j | P=0]+P[P=1]P[T=j | P=1]
- $P[P=i \mid T=j] := \frac{P[P=i \cap T=j]}{P[T=j]} = \frac{P[T=j \mid P=i]\,P[P=i]}{P[T=j]}$
- Notice:
- We can find P[T=j | P=i] from the definition of T
- P[P=i] must come from somewhere
- P[T=j] is given by the first finding above
- Which we develop:
- $= \frac{P[T=j \mid P=i]\,P[P=i]}{P[P=0]\,P[T=j \mid P=0] + P[P=1]\,P[T=j \mid P=1]}$ - notice the denominator only depends on j, the value of T
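The findings above can be sketched directly in code. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and p stands for P[P=1], which has to be supplied from outside the test.

```python
def likelihood(j, i, u, v):
    """P[T=j | P=i], read off from the definition T=(u,v)."""
    if i == 1:
        return u if j == 1 else 1 - u
    return 1 - v if j == 1 else v


def prior(i, p):
    """P[P=i], where p = P[P=1] must come from somewhere."""
    return p if i == 1 else 1 - p


def evidence(j, p, u, v):
    """P[T=j] = P[P=0]P[T=j | P=0] + P[P=1]P[T=j | P=1]."""
    return sum(prior(i, p) * likelihood(j, i, u, v) for i in (0, 1))


def posterior(i, j, p, u, v):
    """P[P=i | T=j] = P[T=j | P=i]P[P=i] / P[T=j]."""
    return likelihood(j, i, u, v) * prior(i, p) / evidence(j, p, u, v)


# Assumed example values: p = 0.01, u = 0.99, v = 0.95.
print(posterior(1, 1, 0.01, 0.99, 0.95))  # approximately 1/6
```

Note that `evidence` takes only j from the pair (i, j), matching the remark that the denominator depends only on the value of T.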
We make the following definitions:
- Let P[P=1]:=p
Then:
- Results given the test evaluates to positive
- $P[P=1 \mid T=1] = \frac{pu}{(1-p)(1-v)+pu}$
- Notice that next we could find P[P=0 | T=1] as 1−P[P=1 | T=1]
- $P[P=0 \mid T=1] = \frac{(1-p)(1-v)}{(1-p)(1-v)+pu}$
- Results given the test evaluates to negative
- $P[P=1 \mid T=0] = \frac{p(1-u)}{(1-p)v+p(1-u)}$
- Notice that next we could find P[P=0 | T=0] as 1−P[P=1 | T=0]
- $P[P=0 \mid T=0] = \frac{(1-p)v}{(1-p)v+p(1-u)}$
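As a worked check of the four results above, the sketch below evaluates them with exact rational arithmetic. The values p=1/100, u=99/100, v=95/100 and the function name `posteriors` are illustrative assumptions, not taken from these notes.

```python
from fractions import Fraction


def posteriors(p, u, v):
    """Return P[P=i | T=j] for all i, j as a dict keyed by (i, j).

    p = P[P=1], u = P[T=1 | P=1], v = P[T=0 | P=0].
    (Hypothetical helper, for illustration only.)
    """
    pos = (1 - p) * (1 - v) + p * u   # P[T=1]
    neg = (1 - p) * v + p * (1 - u)   # P[T=0]
    return {
        (1, 1): p * u / pos,
        (0, 1): (1 - p) * (1 - v) / pos,
        (1, 0): p * (1 - u) / neg,
        (0, 0): (1 - p) * v / neg,
    }


# Assumed example values: a rare property and a fairly accurate test.
result = posteriors(Fraction(1, 100), Fraction(99, 100), Fraction(95, 100))
for (i, j), value in result.items():
    print(f"P[P={i} | T={j}] = {value}")
```

With these assumed values a positive result leaves only a 1 in 6 chance that the property is present, while a negative result leaves a 1 in 9406 chance that it is present.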
The result P[P=1 | T=0] is very important in diagnostic tests: it is the probability that a subject has the property but failed the test (a false negative). Usually the function of a (preliminary at least) test is to not miss any possible subjects, usually at the cost of more false positives, which are cases where the test was positive but the property is absent.
Specifically:
- Notice that to have P[P=1 | T=0]=0 (no chance of having the property if your test was negative) we require p(1−u)=0[Note 1]
- if p=0 (i.e. P[P=1]=0) then this is a pointless test.
- Thus we observe we must have 1−u=0, that is, u=1
- this is to say: for a subject with the property failing the test to be an impossibility, the test must be positive with complete certainty whenever the subject has the property
- we could also say that false negatives are an impossibility
- a corollary to this is that if u=1 then testing negative means you can be completely certain that the subject does not have the property
Under the conditions of false negatives being an impossibility
Then:
- $P[P=1 \mid T=1] = \frac{p}{p+(1-p)(1-v)}$
- $P[P=0 \mid T=1] = \frac{(1-p)(1-v)}{p+(1-p)(1-v)}$
- P[P=1 | T=0]=0 - as discussed above
- P[P=0 | T=0]=1
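The special case can be checked against the general formulas by setting u=1. The sketch below does this with exact fractions; the values p=1/100 and v=95/100 and the function names are assumptions made only for illustration.

```python
from fractions import Fraction


def posterior_given_positive(p, u, v):
    """General P[P=1 | T=1] = pu / ((1-p)(1-v) + pu)."""
    return p * u / ((1 - p) * (1 - v) + p * u)


def posterior_given_negative(p, u, v):
    """General P[P=1 | T=0] = p(1-u) / ((1-p)v + p(1-u))."""
    return p * (1 - u) / ((1 - p) * v + p * (1 - u))


p, v = Fraction(1, 100), Fraction(95, 100)  # assumed example values
u = Fraction(1)                             # false negatives are impossible

simplified = p / (p + (1 - p) * (1 - v))    # the u = 1 form of P[P=1 | T=1]
assert posterior_given_positive(p, u, v) == simplified
assert posterior_given_negative(p, u, v) == 0
print(simplified)  # 20/119
```

Even with no false negatives, a positive result here gives only a 20 in 119 chance of having the property, because false positives still occur at rate 1−v.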
Analysis
- Here I document a form of analysis I like to apply in some areas of statistics and probability; it has no name, but it is extremely useful
To study tests I like to make the following definitions:
- $p = 10^{-k}$
- This means that "1 in $10^k$ subjects have the property": for k=0 it's 1 in 1 (certainty), for k=1 it's 1 in 10, for k=2 it's 1 in 100, and so on
- Notice how after k=0 everything is quite "rare" (as 1 in 10 is certainly not common)
- This is the rarity convention, as larger k means the property is rarer.
- Conversely we could want "9 in 10" or "99 out of 100", in this case let q:=1−p now:
- $q = 1 - 10^{-k}$ is how we'd define it, so k=0 is 0 in 1 (impossibility), k=1 is 9 in 10, k=2 is 99 in 100, and so on
- This is the commonality convention
These are also very useful when plotted: a plot that shows p directly on the range [0,1] gets very steep, whereas plotting against k for $k \in \mathbb{R}_{\geq 0}$ keeps a lot of the situation visible.
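A minimal sketch of this style of analysis: tabulate P[P=1 | T=1] against k rather than against p. The test characteristics u=0.99 and v=0.95, the range of k and the function name are assumptions chosen only for illustration; the same rows could be handed to any plotting tool.

```python
def positive_posterior(p, u, v):
    """P[P=1 | T=1] = pu / ((1-p)(1-v) + pu)."""
    return p * u / ((1 - p) * (1 - v) + p * u)


u, v = 0.99, 0.95  # assumed test characteristics, for illustration only

rows = []
for k in range(0, 7):
    p = 10.0 ** -k        # rarity convention: 1 in 10^k have the property
    q = 1 - 10.0 ** -k    # commonality convention: q = 1 - 10^-k
    rows.append((k, p, q, positive_posterior(p, u, v)))

for k, p, q, post in rows:
    print(f"k={k}  p={p:.6f}  q={q:.6f}  P[P=1 | T=1]={post:.6f}")
```

Each step in k moves one order of magnitude in rarity, so the rapid fall in P[P=1 | T=1] for rare properties is spread across the whole table instead of being squeezed next to p=0.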
Notes
- [Note 1] The reader should convince himself that a limit where the denominator tends to positive infinity cannot happen