Warning: this post is going to analyse mathematical concepts and will most **likely** cause intense headaches to non-mathematical brains.

At first I wanted to make it understandable to lay people, but I then realized I am not the right man for such a huge task.

I considered it necessary to write it since Bayesian considerations play a very important role in many scientific and philosophical fields, including metaphysical problems such as the existence of God.

Basically, objective Bayesianism is a theory of knowledge according to which probabilities are degrees of belief (and vice-versa) whose values can be objectively identified by every rational agent having the same information.

It stands in opposition to frequentism, which stipulates that the probability of an event is identical with its relative frequency over a great (nearly infinite) number of trials.

I illustrated how this plays out in a previous post.

The name of the philosophy stems from Bayes’ theorem, which states that

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$

where P(A|B) is the probability of an event A given an event B, P(B|A) the probability of the event B given the event A, and P(A) and P(B) the total probabilities of the events A and B, respectively.

At that point, it is important to realize that the Bayesian **identification** of these probabilities with degrees of belief in the hypotheses A and B is a philosophical decision and not a mathematical result, as many Bayesians seem to believe.

Bayes’ theorem is used to update the probability of the theory A as new data (the truth of B) come in. Unless one believes in infinite regress, there are going to be basic probabilities called priors which cannot themselves be deduced from former probabilities or likelihoods.
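To make the role of the priors concrete, here is a minimal sketch of a single Bayesian update. The function name `bayes_update` and the likelihood values 0.8 and 0.2 are purely illustrative assumptions of mine, not numbers taken from any real theory; the only point is that the same datum B leads to very different posteriors depending on the prior one starts with.

```python
def bayes_update(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) computed from P(A), P(B|A) and P(B|not A)."""
    # Total probability of B: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
    return p_b_given_a * prior_a / p_b


# Hypothetical likelihoods: the datum B is four times more probable if A is true.
P_B_GIVEN_A = 0.8
P_B_GIVEN_NOT_A = 0.2

# Same data, different priors: the posterior depends heavily on the starting point.
posteriors = {
    prior: bayes_update(prior, P_B_GIVEN_A, P_B_GIVEN_NOT_A)
    for prior in (0.02, 0.05, 0.15, 0.5)
}
for prior, posterior in posteriors.items():
    print(f"prior P(A) = {prior:.2f}  ->  posterior P(A|B) = {posterior:.3f}")
```

The theorem itself says nothing about which of these priors is the right one to start from.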

Here I want to go into two closely related problems of Bayesian epistemology, namely those of the ontological nature of these probabilities and the values one objectively assigns to them.

Let us consider that I throw a coin in the air. My degree of belief (1/2) it will land on heads is a subjective brain state which may (or should) be related to a frequency of action if betting money is involved.

But let us now consider the young Isaac Newton who was considering his newly developed theory of universal gravitation. What value should his degree of belief have taken on BEFORE he had begun to consider the first data of the real world?

The great philosopher of science Elliott Sober wrote this about this particular situation:

*“Newton’s universal law of gravitation, when suitably supplemented with plausible background assumptions, can be said to confer probabilities on observations. But what does it mean to say that the law has a probability in the light of those observations? More puzzling still is the idea that it has a probability before any observations are taken into account. If God chose the laws of nature by drawing slips of paper from an urn, it would make sense to say that Newton’s law has an objective prior. But no one believes this process model, and nothing similar seems remotely plausible.”*

Frequentism provides us with well-defined probabilities in many situations. The probability of a coin coming down heads is identical with the frequency of this event if I were to repeat the toss an infinite number of times, and the law of large numbers guarantees that one gets an increasingly better approximation of this quantity with a growing number of trials.
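The convergence described above can be illustrated with a small simulation. This is only a sketch under the assumption of a perfectly fair coin, modelled here by Python's pseudo-random generator with a fixed seed; the helper name `heads_frequency` is mine.

```python
import random


def heads_frequency(n_tosses, rng):
    """Simulate n_tosses of a fair coin and return the relative frequency of heads."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


# Fixed seed so that the run is reproducible (an arbitrary choice).
rng = random.Random(0)
for n in (10, 100, 10_000, 100_000):
    print(f"{n:>7} tosses: frequency of heads = {heads_frequency(n, rng):.4f}")
```

With more tosses the printed frequency tends to settle ever closer to 1/2, which is exactly the kind of well-defined quantity that has no obvious counterpart for a one-off theory.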

But what does it mean for the probability of the theory of universal gravitation to be 2%, 5% or 15%?

And once one has come up with a definition one thinks to be valid, what is the **objective value** for the probability prior to any observation being taken into account?

I could not find any answer in the Bayesian papers I have read until now; these questions are apparently best ignored. But to my mind they are very important if you claim to be building up a theory of knowledge based on probabilities.

Next episode: a mathematical proof of Bayesianism?


You’re looking for prior probability, sometimes called a ‘universal prior’.

That is an excellent question. It is [I think fairly well] known that there is no escape from radical skepticism (a completely uninformative prior). So we have to somehow assume lawful behavior. There is possibly a way to pick which laws to first guess, based on the following from SEP: Category Theory:

Category theory is a super-abstract branch of math; it can be used to ‘formalize’ e.g. set theory, which in turn can ‘formalize’ e.g. numbers. So it’s quite meta. Anyway, there is an objective sense in which we can give ‘simpler’ laws a higher probability of fitting the data than more complex laws. I’m not sure whether that’s the best way to go about things, but it’s an avenue worth exploring.

In a sense, you’re ‘just’ doing regression analysis on sensory data. But it’s a little trickier than that, because you have to ask, “which bits of the data I observe with my eyes are relevant?” This is not a trivial question! If you want to get a bit of an idea how it’s tricky, take a look at the field of computer vision. We humans are too used to all the sense we can make out of what our eyes see before we even have well-developed rationality. Computer vision makes you start all over!

That’s enough for a start. 🙂

Many thanks for your thoughtful comment!

“Anyway, there is an objective sense in which we can give ‘simpler’ laws a higher probability of fitting the data than more complex laws. ”

I doubt it. In the category “parsimony” of my blog, I explain why I think this is begging the question unless you define likelihood in a self-serving way.

The only way one could define an OBJECTIVE Bayesian probability would look like this:

Having a degree of belief of (15%) in a theory is a brain state which turns out (in the long run) to correspond to true beliefs in 15% of the cases.

However, I am not aware of any objective Bayesian resorting to this and I would be glad if readers could point me to relevant papers discussing it. Yet this seems to me to be the only game in town for people willing to construct an objective epistemology based on probability.

Do you think this could be a solution for Bayesians willing to hold fast to objectivity?

This doesn’t make sense to me. I’m not sure what you’re saying.

So, the best universal prior is one which perfectly matches reality. That is, unless there is something like true randomness or libertarian free will—such that things aren’t deterministic. In such a situation, the best universal priors would seem to be ones that optimize for some combination of:

(1) what is most likely to be the case;

(2) what would be better if it were the case.

Where ‘better’ is defined according to ‘the good’. If we presuppose that humans start out with excellent universal priors (before any socialization), then that means we are pre-programmed to:

(1′) perceive reality correctly

(2′) perceive morality correctly

Or have I erred? I like this conversation. 🙂

Hey Luke, let us suppose there are different brain states:

A) Each time this brain state comes up, the associated belief will turn out to be true in 10% of the cases

B) Each time this brain state comes up, the associated belief will turn out to be true in 12% of the cases

C) Each time this brain state comes up, the associated belief will turn out to be true in 15% of the cases

and so on and so forth.

Could a Bayesian try to be objective by adopting such a definition of “degree of belief”, or does it sound far-fetched?
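A rough sketch of what checking such a definition could look like: record, for every belief, the stated degree of belief and whether the belief turned out to be true, then compare the two. The function name `calibration_table` and the simulated agent are hypothetical; the agent is calibrated by construction, which is precisely what would have to be established for a real brain.

```python
import random
from collections import defaultdict


def calibration_table(judgements):
    """Map each stated degree of belief to the observed fraction of true beliefs."""
    counts = defaultdict(lambda: [0, 0])  # stated belief -> [number true, total]
    for stated, was_true in judgements:
        counts[stated][0] += int(was_true)
        counts[stated][1] += 1
    return {stated: n_true / total for stated, (n_true, total) in counts.items()}


# Hypothetical agent: brain states A, B and C (10%, 12%, 15%) are made to be
# right with exactly the stated frequency, using a seeded pseudo-random generator.
rng = random.Random(0)
judgements = [
    (stated, rng.random() < stated)
    for stated in (0.10, 0.12, 0.15)
    for _ in range(50_000)
]

for stated, observed in sorted(calibration_table(judgements).items()):
    print(f"stated belief {stated:.2f} -> true in {observed:.3f} of the cases")
```

Note that the table can only be filled in after many beliefs have been checked against reality, so it is a frequency about the believer rather than a prior about any single theory.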

Your example seems to be very wooden and simple and I’m not at all convinced that Bayesian inference must work this way, or whether it even works well that way. Furthermore, there is likely something wrong with your ‘true’: usually we merely come up with models which sufficiently approximate some phenomenon. What makes a given approximation ‘sufficient’ can vary from situation to situation.

But perhaps I am erring by implicitly thinking that Bayesian inference itself is merely a model. 🙂 I would not be at all surprised if there is some better kind of inference for AI that we discover at some point in the future! So perhaps I’m not being a ‘realist’ in the way that you are, and in the way that the Bayesians you talk about are?

“If we presuppose…”

if

@xon-xoff

Presupposing cannot always be avoided. Proper presupposing merely lets a tentative belief develop a bit without having to be supported by evidence. Scientists have to do this from time to time, if not more often. But you are right to point out that sometimes presupposing is invalid.

Read F. Ramsey (“Truth and Probability”, as well as Blackburn’s writings on Ramsey under teleosemantics and success semantics) and Reichenbach (The Theory of Probability); they give this analysis of degree of belief as a metalinguistic truth frequency, which is often used by frequentists. Generally the concept of degree of belief used, particularly in attempts at proving the PP, is relative to the objective language probability interpretation in question. Van Fraassen also discusses this notion in a paper on calibration, ‘a frequentist justification of personalism’ (1983). The betting-interpretation Bayesian is always supervenient on frequentism, insofar as fair bets are defined as breaking even in the long run. Perhaps Russell (1948) uses this notion, although he uses the principle of indifference based on post facto finite frequentism. Essentially

My instinct is that there is no meaningful “objective value for the probability” of something being true, prior to any observations. I don’t see how you can assign probability to one-off occurrences if there’s no previous experience of them. As I understand it, many mathematicians take this view too.

cheers from rainy Vienna, zilch


Also, the central limit theorem and other convergence theorems are strictly irrelevant for frequentism. Convergence is certain and is part of the semantics from which the probability axioms are proved for frequentism, not the other way around.