Knowledge-dependent frequentist probabilities

This is going to be a (relatively) geeky post which I tried to make understandable for lay people.

Given the important role that epistemological assumptions play in debates between theists and atheists, I deemed it necessary to first write a groundwork upon which more interesting discussions (about the existence of God, the historicity of Jesus, miracles, the paranormal…) will rest.

Bayesianism, Degrees of belief

In other posts I explained why I am skeptical about the Bayesian interpretation of probabilities as degrees of belief. I see no need to adjust the intensity of our belief in string theory (which is a subjective feeling) in order to do good science or to avoid irrationality.

Many Bayesians complain that if we don’t consider subjective probabilities, a great number of fields such as economics, biology, geography or even history would collapse.
This is a strong pragmatic argument for being a Bayesian that I hear over and over again.

Law of large numbers and frequencies

I don’t think this is warranted, for I believe that the incredible successes brought about by probabilistic calculations concern events which are (in principle) repeatable and therefore open to a frequentist interpretation of the related likelihoods.

According to the knowledge-dependent interpretation of frequentism I rely on, the probability of an event is the frequency it would have if the known circumstances were repeated an infinite number of times.

Let us consider an ideal die which is thrown in a perfectly random way. Obviously we can only find approximations of this situation in the real world, but a computer can reasonably do the job.

In the following graphs, I plotted the results for five series of trials.

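The frequentist probability of the event “the die shows a 3” is defined as

$$p(3) = \lim_{n \to \infty} \frac{n_3}{n},$$

where n_3 is the number of “3”s obtained in n trials; that is, the limit of the frequency of “3” as the number of trials tends to infinity.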

This is a mathematical abstraction which never exists in the real world, but from the 6000th trial onward the frequency is a very good approximation of the probability, to which it converges according to the law of large numbers.
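To make this concrete, here is a minimal Python sketch of such a computer experiment. It is not the code behind the original plots: the function name `running_frequency`, the seeds and the choice of 6000 throws per series are my own assumptions for illustration.

```python
import random

def running_frequency(n_trials, target=3, seed=None):
    """Return the frequency of `target` after each throw of a fair six-sided die."""
    rng = random.Random(seed)          # seeded generator so a series can be reproduced
    hits = 0
    frequencies = []
    for n in range(1, n_trials + 1):
        if rng.randint(1, 6) == target:  # randint includes both endpoints
            hits += 1
        frequencies.append(hits / n)     # frequency of "3" after n throws
    return frequencies

# Five independent series of 6000 throws each (the seeds are arbitrary).
for seed in range(5):
    freqs = running_frequency(6000, seed=seed)
    print(f"series {seed + 1}: n=10 -> {freqs[9]:.3f}, "
          f"n=100 -> {freqs[99]:.3f}, n=6000 -> {freqs[-1]:.3f}")
```

The early frequencies typically scatter widely from one series to the next, while the values at n = 6000 should all sit close to 1/6 ≈ 0.167.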

Actually my knowledge-dependent frequentist interpretation allows me to consider the probability of unique events which have not yet occurred.

For example, a Bayesian wrote that “the advantage of this view over the frequency interpretation is that it can deal with cases where there is no relative frequency to draw on: for example, Gigerenzer mentions the first ever heart transplant patient who was given a 70% chance of survival by the surgeon. Under the frequency interpretation that statement made no sense, because there had never actually been any similar operations by then.”

I think there are many confusions going on here.
Let us call K the total knowledge of the physician which might include the different bodily features of the patient, the state of his organs and the hazard of the novel procedure.

The frequentist probability would be defined as the ratio of surviving patients to the total number of patients undergoing the operation if the known circumstances underlying K were to be repeated a very great (actually infinite) number of times. Granted, for many people this does not seem as intuitive as the previous example with the die.
And it is obvious there existed for the physician no frequency he could have used to directly approximate the probability.
Nevertheless, this frequentist interpretation is by no means absurd.

The physician could very well have used Bayes’s theorem to approximate the probability while having only used other frequentist probabilities, such as the probability that the body reacting in a certain way would be followed by death or the probability that introducing a device in some organs could have lethal consequences.

Another example is the estimation of the probability that it is going to rain tomorrow morning when you wake up.

While the situation you are confronted with might very well be unique in the whole history of mankind, the probability is well defined by the frequency of rain if all the circumstances you know of were to be repeated an extremely high number of times.

Given this extended, knowledge-dependent variant of frequentism, the probabilities of single events are meaningful, and many fields considered as Bayesian (such as economic simulations, history or evolutionary biology) could just as well be interpreted according to this version of frequentism.

It has a great advantage: it allows us to bypass completely subjective degrees of belief and to focus on an objective concept of probability.

Now, some Bayesians could come up and tell me that it is possible that the frequentist probabilities of the survival of the first heart transplant patient or of the weather do not exist: in other words, if the known circumstances were to be repeated an infinite number of times, the frequency would keep oscillating instead of converging to a fixed value (such as 1/6 for the die).

This is a fair objection, but such a situation would not only show that the frequentist probability does not exist but that the Bayesian interpretation is meaningless as well.

It seems utterly nonsensical to my mind to say that every rational agent ought to have a degree of belief of (say) 0.45 or 0.87 if the frequency of the event (given all known circumstances) would keep fluctuating between 0.01 and 0.99.
For in this case the event is completely unpredictable and it seems entirely misguided to associate a probability to it.
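For readers who doubt that a frequency can fail to converge at all, here is a small Python sketch of a purely hypothetical sequence of outcomes (1 = the event occurs, 0 = it does not) built from alternating blocks whose lengths double. The construction and the function names are my own illustration and model no real phenomenon; the swing here is between 1/3 and 2/3 rather than between 0.01 and 0.99, but the principle is the same.

```python
def oscillating_outcomes(n_blocks):
    """Build a 0/1 sequence from alternating blocks of 1s and 0s of length 1, 2, 4, 8, ..."""
    outcomes = []
    for k in range(n_blocks):
        value = 1 if k % 2 == 0 else 0       # even blocks are all 1s, odd blocks all 0s
        outcomes.extend([value] * (2 ** k))
    return outcomes

def frequency_at_block_ends(n_blocks):
    """Running frequency of 1s, sampled at the end of each block."""
    outcomes = oscillating_outcomes(n_blocks)
    frequencies = []
    ones = 0
    position = 0
    for k in range(n_blocks):
        block = outcomes[position:position + 2 ** k]
        ones += sum(block)
        position += 2 ** k
        frequencies.append(ones / position)
    return frequencies

for k, freq in enumerate(frequency_at_block_ends(12)):
    print(f"after block {k}: frequency of 1s = {freq:.4f}")
```

However long the sequence is continued, the running frequency keeps swinging back and forth and never settles on a limit, so no frequentist probability can be assigned to it.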

Another related problem is that in such a situation a degree of belief could be nothing more than a pure state of mind with no relation to the objective world whatsoever.

As professor Jon Williamson wrote:
Since Bayesian methods for estimating physical probabilities depend on a given prior probability function, and it is precisely the prior that is in question here, this leaves classical (frequentist) estimation methods—in particular confidence interval estimation methods—as the natural candidate for determining physical probabilities. Hence the Bayesian needs the frequentist for calibration.”

But if this frequentist probability does not exist, the Bayesian has absolutely no way to relate his degree of belief to reality, since no prior can be defined and evaluated.

Fortunately, the incredible success of the mathematical treatment of uncertain phenomena (in biology, evolution, geology, history, economics and politics, to name only a few) shows that we are justified in believing in the meaningfulness of the probability of the underlying events, even if they might be quite unique.

In this way, I believe that many examples Bayesians use to argue for the indispensability of their subjectivist probabilistic concept ultimately fail because the same cases could have been handled using the frequentist concept I have outlined here.

However, this still leaves out an important aspect: what are we to do about theories such as universal gravitation, string theory or the existence of a multiverse?
It is obvious no frequentist interpretation of their truth can be given.
Does that mean that without Bayesianism we would have no way to evaluate the relative merits of such competing models in these situations?
Fortunately no, but this will be the topic of a future post.
At the moment I would hate to kill the suspense 🙂

A mathematical proof of Bayesianism?

This is going to be another boring post (at least for most people who are not nerds).

However before approaching interesting questions such as the existence of God, morality and history a sound epistemology (theory of knowledge) must already be present. During most (heated) debates between theists and atheists, people tend to take for granted many epistemological principles which are very questionable.

This is why I spend a certain amount of my time exploring such questions, as a groundwork for more applied discussions.

I highly recommend that all my readers first read my two other posts on the concept of probability before reading what follows.

Bayesianism is a theory of knowledge according to which our degrees of belief in theories are well defined probabilities taking on values between 0 and 1.

According to this view, saying that string theory has a probability of 0.2 of being true is as meaningful as saying that a normal die randomly thrown has a probability of 1/6 of producing a “3”.

Bayesians like to assert over and over again that it is mathematically proven that we ought to compute the likelihood of all beliefs according to the laws of probability, first and foremost Bayes’ formula:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$

Here I want to debunk this popular assertion. Bayes’ theorem can be mathematically proven for frequentist probabilities, but there is no such proof that ALL our degrees of belief behave that way.

Let us consider (as an example) the American population (360 million people) and two features a person might have.

CE (Conservative Evangelical): the individual believes that the Bible contains no error.

FH (Fag Hating): the individual passionately hates gay people.

Let us suppose that 30% of Americans are CE and that 5.8% of Americans hate homosexuals.

The frequencies are f(CE) = 0.30 and f(FH) = 0.058

Let us now consider a random event: you meet an American by chance.
What is the probability that you meet a CE person and what is the probability that you meet a FH individual?
According to a frequentist interpretation, the probability equals the frequency of meeting such kinds of persons given a very great (actually infinite) number of encounters.
From this it naturally follows that p(CE) = f(CE) = 0.30 and p(FH) = f(FH) = 0.058

Let us now introduce the concept of conditional probability: if you meet a Conservative Evangelical, what is the probability p(FH|CE) that he hates gay people? (The | stands for “given”.)

Conversely, if you meet an FH person, what is the probability p(CE|FH) that he believes in Biblical inerrancy?

To answer these questions (thereby proving Bayes’ theorem), it is necessary to get back to our consideration of frequencies.

Let us suppose that 10% of all Conservative Evangelicals and 4% of people who are not CE are FH: f(FH|CE) = 0.1 and f(FH|⌐CE) = 0.04. The symbol ⌐ stands for the negation (denial) of a proposition.

The proportion of Americans who are both CE and FH is f(FH∩CE) = f(FH|CE)*f(CE) = 0.1*0.3 = 0.03.

The proportion of Americans who are not CE but are FH is f(FH∩⌐CE) = f(FH|⌐CE)*f(⌐CE) = 0.04*0.7 = 0.028.

Logically, the frequency of FH people in the whole American population is equal to the sum of the two proportions:

f(FH) = f(FH∩CE) + f(FH∩⌐CE) = 0.03 + 0.028 = 0.058

But what if we want to know the probability p(CE|FH) that a person is a Conservative Evangelical IF that person is FH?

This corresponds to the frequency (proportion) of Conservative Evangelicals among FH people: f(CE|FH).

We know that f(FH∩CE) = f(CE∩FH) = f(CE|FH)*f(FH)

Thus f(CE|FH) = f(FH∩CE) / f(FH)

Given a frequentist interpretation of probability, this entails that

$$p(CE|FH) = \frac{p(FH \cap CE)}{p(FH)} = \frac{p(FH|CE)\,p(CE)}{p(FH)},$$

which is of course Bayes’ theorem. We have mathematically proven it in this particular case, but the rigorous mathematical demonstration would be pretty much the same for any events expressible as frequencies.

If you meet an American who hates gays, the probability that he is a Conservative Evangelical is 0.03 / 0.058 ≈ 51.72% (given the validity of my starting values above).
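The arithmetic above can be checked with a few lines of Python. The sketch below first recomputes p(CE|FH) from the stated frequencies and then estimates the same quantity by simulating a large number of random encounters, which is the frequentist reading in action. The constants are the starting values from the text; the function names, the seed and the number of encounters are my own assumptions.

```python
import random

F_CE = 0.30               # f(CE): proportion of Conservative Evangelicals
F_FH_GIVEN_CE = 0.10      # f(FH|CE)
F_FH_GIVEN_NOT_CE = 0.04  # f(FH|⌐CE)

def exact_posterior():
    """p(CE|FH) from the frequencies, via the sum of the two proportions."""
    f_fh_and_ce = F_FH_GIVEN_CE * F_CE                 # 0.03
    f_fh_and_not_ce = F_FH_GIVEN_NOT_CE * (1 - F_CE)   # 0.028
    f_fh = f_fh_and_ce + f_fh_and_not_ce               # 0.058
    return f_fh_and_ce / f_fh

def simulated_posterior(n_encounters, seed=0):
    """Frequency of CE among the FH people met in `n_encounters` random encounters."""
    rng = random.Random(seed)
    fh_count = 0
    fh_and_ce_count = 0
    for _ in range(n_encounters):
        is_ce = rng.random() < F_CE
        rate = F_FH_GIVEN_CE if is_ce else F_FH_GIVEN_NOT_CE
        if rng.random() < rate:        # this person is FH
            fh_count += 1
            if is_ce:
                fh_and_ce_count += 1
    return fh_and_ce_count / fh_count

print(f"exact:     {exact_posterior():.4f}")
print(f"simulated: {simulated_posterior(200_000):.4f}")
```

The simulated frequency should land near the exact value and get closer as the number of encounters grows, which is precisely why Bayes’ theorem is unproblematic whenever the probabilities involved are frequencies.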

But let us now consider the Bayesian interpretation of probability (our degree of confidence in a theory) in a context having nothing to do with frequencies.

Let S be “String theory is true” and UEP “an Undead Elementary Particle has been detected during an experiment in the LHC”.

In that context, the probabilities correspond to our confidence in the truth of theories and hypotheses.

We have no compelling grounds for thinking that

$$p(S|UEP) = \frac{p(UEP|S)\,p(S)}{p(UEP)},$$

that is to say, that this is the way our brains actually work, or ought to work, in order to strive for truth.

The mathematical demonstration used to prove Bayes’ theorem relies on related frequencies and cannot be employed in a context where propositions (such as S and UEP) cannot be understood as frequencies.
Treating ALL our degrees of belief as probabilities is a philosophical decision and not an inevitable result of mathematics.

I hope that I have not been too boring for lay people.

Now I have some homework for you: what is the probability that Homeschooling Parents would like to employ my post as an introduction to probability interpretation, given that they live in the Bible Belt, p(HP|BB)?

On the ontology of the objective Bayesian probability interpretation

Warning: this post is going to analyse mathematical concepts and will most likely cause intense headaches to non-mathematical brains.

At the beginning I wanted to make it understandable for lay people before I realized I am not the right man for such a huge task.

I considered it necessary to write it since Bayesian considerations play a very important role in many scientific and philosophical fields, including metaphysical problems such as the existence of God.

Basically, objective Bayesianism is a theory of knowledge according to which probabilities are degrees of belief (and vice-versa) whose values can be objectively identified by every rational agent possessing the same information.

It stands in opposition to frequentism, which stipulates that the probability of an event is identical with its frequency over a great (nearly infinite) number of repetitions.

I illustrated how this plays out in a previous post.

The name of the philosophy stems from Bayes’ theorem, which stipulates that

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)},$$

where P(A|B) is the probability of an event A given an event B, P(B|A) the probability of the event B given the event A, and P(A) and P(B) the total probabilities of the events A and B, respectively.

At that point, it is important to realize that the Bayesian identification of these probabilities with degrees of belief in the hypotheses A and B is a philosophical decision and not a mathematical result, as many Bayesians seem to believe.

Bayes’ theorem is used to update the probability of the theory A as new data (the truth of B) come in. Unless one believes in infinite regress, there must be basic probabilities called priors which cannot themselves be deduced from former probabilities or likelihoods.
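To see how much hangs on the priors, here is a minimal Python sketch of repeated updating. The helper `update` and all the numbers (the three priors and the likelihoods 0.8 and 0.3) are hypothetical values chosen only for illustration; they do not come from any real theory or data set.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """One application of Bayes' theorem: returns P(A|B).

    prior               = P(A)
    likelihood_if_true  = P(B|A)
    likelihood_if_false = P(B|not A)
    """
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)  # P(B)
    return likelihood_if_true * prior / evidence

# Three agents see the same three pieces of evidence but start from different priors.
for prior in (0.01, 0.5, 0.9):
    belief = prior
    for _ in range(3):
        belief = update(belief, 0.8, 0.3)
    print(f"prior {prior:.2f} -> belief after three updates {belief:.3f}")
```

The update rule itself is mechanical; what it cannot tell us is which starting value is the right one, and that is exactly the question raised in this post.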

Here I want to go into two closely related problems of Bayesian epistemology, namely those of the ontological nature of these probabilities and the values one objectively assigns to them.

Let us consider that I throw a coin in the air. My degree of belief (1/2) it will land on heads is a subjective brain state which may (or should) be related to a frequency of action if betting money is involved.

But let us now consider the young Isaac Newton who was considering his newly developed theory of universal gravitation. What value should his degree of belief have taken on BEFORE he had begun to consider the first data of the real world?

“Newton’s universal law of gravitation, when suitably supplemented with plausible background assumptions, can be said to confer probabilities on observations. But what does it mean to say that the law has a probability in the light of those observations? More puzzling still is the idea that it has a probability before any observations are taken into account. If God chose the laws of nature by drawing slips of paper from an urn, it would make sense to say that Newton’s law has an objective prior. But no one believes this process model, and nothing similar seems remotely plausible.”

Frequentism provides us with well-defined probabilities in many situations. The likelihood of a coin coming down heads is identical with the frequency of this event if I were to repeat the toss an infinite number of times, and the law of large numbers guarantees that one gets an increasingly better approximation of this quantity with a growing number of trials.
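How quickly the approximation improves can be quantified: for n independent tosses, the typical deviation of the observed frequency from the true probability p is about sqrt(p(1 − p)/n), a result that belongs to the circle of ideas around the central limit theorem. The Python sketch below compares this figure with one simulated run per sample size; the function names, the seed and the sample sizes are my own assumptions.

```python
import math
import random

def standard_error(n_trials, p=0.5):
    """Typical deviation of the observed frequency from p after n_trials tosses."""
    return math.sqrt(p * (1 - p) / n_trials)

def observed_frequency(n_trials, p=0.5, seed=0):
    """Frequency of heads in one simulated run of n_trials tosses."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_trials) if rng.random() < p)
    return heads / n_trials

for n in (100, 10_000, 1_000_000):
    deviation = abs(observed_frequency(n) - 0.5)
    print(f"n={n:>9}: observed deviation {deviation:.5f}, "
          f"typical deviation {standard_error(n):.5f}")
```

Multiplying the number of tosses by 100 only divides the typical deviation by 10, which is why good approximations of a frequentist probability need many trials.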

But what does the likelihood of the theory of universal gravitation being 2%, 5% or 15% mean?

And once one has come up with a definition one thinks to be valid, what is the objective value for the probability prior to any observation being taken into account?

I could not find any answer in the Bayesian papers I have read so far; these questions are apparently best ignored. But to my mind they are very important if you claim to be building up a theory of knowledge based on probabilities.

Next episode: a mathematical proof of Bayesianism?

Thematic list of ALL posts on this blog (regularly updated)

My other blog on Unidentified Aerial Phenomena (UAP)

Why probabilities matter

In real life, it’s pretty rare (some would even say utterly impossible) to be sure of anything at all, like knowing it’s going to rain in one hour, that a conservative president is going to be elected, that you will be happily married in two years and so on and so forth.

We all recognize that it is only meaningful to speak of the probability or likelihood of each of these events.

The question of how to interpret their profound nature (ontology) is, however, far from being an easy one.

I will use the basic proposition “if I roll the die, there is a probability of 1/6 that I will get a 3” in order to illustrate the two main interpretations of the probability concept out there.

1. Frequentism

According to this interpretation, the probability of an event equals its frequency if it is repeated an infinite number of times. If you roll a die a great number of times, the frequency of the event (that is, the number of 3s divided by the total number of rolls) will converge towards 1/6.

Mathematically it is a well defined concept and in many cases it can be relatively easily approximated. One of the main difficulties is that it apparently fails to account for the likelihood of unique situations, such as that (as far as we know in 2013) the Republicans are going to win the next American elections.

This brings us to the next popular interpretation of probability.

2. Bayesianism

For Bayesians, probabilities are degrees of belief and each degree of belief is a probability.

My degree of belief that the die will land on 3 is 1/6.

But what, then, is a “degree of belief”? It is a psychological state of mind which is correlated with a certain readiness for action.

According to many proponents of Bayesianism, degrees of belief are objective in so far as every rational creature possessing the same set of information would have exactly the same ones.

While such a claim is largely defensible for many situations such as the rolling of dice, the spread of a disease or the results of the next elections, there are cases where it does not seem to make any sense at all.

Take for example the young Isaac Newton who was considering his newly developed theory of universal gravitation. What value should his degree of belief have taken on BEFORE he had begun to consider the first data of the real world?

And what would it mean ontologically to say that we have a degree of belief of 60% that the theory is true? What is the relation (in that particular situation) between the intensity of certain brain processes and the objective reality?

Such considerations have led other Bayesians to give up objectivity and define “degrees of belief” as subjective states of mind, which might however be objectively constrained in many situations.

Another criticism of (strong) Bayesianism is that it ties the concept of probability to the beliefs of intelligent creatures. Yet it is clear that even in a universe lacking conscious beings, the probability of the decay of an atom and of more fundamental quantum processes would still exist and be meaningful.

For completeness, I should mention the propensity interpretation of Karl Popper, who viewed the likelihood of an event as an intrinsic tendency of a physical system towards a certain state of affairs.

So these were my completely unbiased (pun intended!) views on probabilities.

When debating (and fighting!) each other, theists and atheists tend to take their own epistemology (theory of knowledge) for granted.

This often leads to fruitless and idle discussions.

This is why I want to take the time to examine how we can know, what it means to know, before discussing what we can (and cannot) know.

Next episode: Naked Bayesianism.