I came up with some more criticisms of Bayesianism:
- B-ism requires Evidence Individuation (EI) for every update, but offers no guidance on how to do it; this criticism relates to the theory-ladenness of evidence.
- A hypothesis space scope problem (note: distinct from the catch-all problem), which concerns what a given hypothesis space is about and which hypothesis spaces you have. The hypothesis spaces themselves depend on prior theories/hypotheses. This is similarly a CR-inspired criticism. It also makes the complexity of bayesianism much higher, and it interacts with EI too.
- One solution might be to have just one hypothesis space over everything and the hypotheses in there are sets/combinations of hypotheses from other more particular spaces (example: $H_k:$ general relativity AND genomic theory AND not astrology AND lizardmen control the US government AND koalamen control the Aussie government AND (lizardmen and koalamen get along on easter iff lunch is provided) AND ...).
- Naturally, this seems like a bad answer and most bayesians probably wouldn't like it.
(Note: I'm only going into detail about the first in this post; I'll expand on the second one later when I have more time)
Also, I think there's still lots of philosophy left to do in the world (things to discover, new thoughts to have). Some of my thoughts might be novel; others (or close enough) I found in the literature with 202X publication dates (or I found related ideas). That's significant (to me) because those thoughts came after the last time I thought about bayesianism (around 2020), so if I'd had some of them earlier, they'd apparently have been novel enough to publish (not that I would have done the work to publish them, but still).
I don't think the stuff I thought of here was that hard to conceive, so it's reasonable that there's still more lowish hanging fruit. (Or maybe I'm unaware of prior art from earlier philosophers (like Popper) because they used different phrasing or something that makes it harder to search for)
Also, I got AI to do some expansion of these ideas and some mathematical proofs. One thing to come out of that is a trilemma of existing problems (the trilemma appears novel). Repo is here. One thing that occurs to me is that I didn't do a good job of isolating my inputs or saving raw copies, so it's hard to tell what I came up with vs what was deduced. I steered the AI plenty while planning and doing the research, but it also found and connected things on its own. I ran the stuff I generated through some adversarial ChatGPT sessions, and even when I framed it with social bias and in a tone suggesting I expected it to be slop, it still said there were some good new ideas in there.
Evidence Individuation (EI)
There is prior work around double counting evidence, but maybe not from this angle.
Foundational claims: evidence/data is theory-laden, and interpreting data requires an explanatory theory or equivalent (these are often called 'models' in bayesianism).
Claim: Bayesianism requires and depends on explanations before any update can be performed. This is because, before evidence, data, or an event can be used for a bayesian update, it needs to be individuated (like epistemically deduplicated). Even simply counting events might be impossible (in B-ism) without an extension to B-ism that explains how to count, and what an event is, etc.
Additional: learning about a hypothesis can force you to update any/all hypothesis spaces (and thus priors) for any/all things you believe. This is because it can arbitrarily recontextualize events and data.
Additional: disagreements about EI can lead to divergence between bayesians even when observing the same events (not just different rates of convergence). This means that the bayesianist claim to convergence depends on an unknown and unspecified model for EI.
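A minimal numeric sketch of that divergence claim (my toy numbers, not from any real astronomical model): two observers see the same triple flash, but one individuates it as a single burst event while the other individuates it as three independent flashes. Their posteriors on the same hypothesis H come out different.

```python
def update(prior, lh, lnh):
    """One Bayes update: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    num = lh * prior
    return num / (num + lnh * (1 - prior))

prior = 0.5
# Hypothetical likelihoods: under H (linked bursts) a triple flash is fairly
# common (0.3); under ~H each lone flash has probability 0.1.
post_one_event = update(prior, 0.3, 0.1 ** 3)   # triple counted as ONE event

post_three_events = prior                       # triple counted as THREE events
for _ in range(3):
    post_three_events = update(post_three_events, 0.3, 0.1)

print(round(post_one_event, 3), round(post_three_events, 3))  # 0.997 vs 0.964
```

Same raw photons, two individuation rules, two persistently different posteriors; nothing inside the updating machinery itself flags the disagreement.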
Example
Say you were researching supernovae or some other bright astronomical event, and you observe rare double/triple/quadruple (super)novae or similar. Every time you see a group like this, the timing between the components is different, and they're in a different spot in the sky.
How should a B-ist use these events to do bayesian updates? Are they individual data points? What about when you have a hypothesis that it's some interaction (like a chain reaction) between binary/ternary star systems? And what about when you add the hypothesis that it's due to gravitational lensing? I think in these kinds of cases, it's not trivial or obvious how to calculate the update.
Without the right hypothesis/explanation, a B-ist might treat them as independent data points. Part of the problem here is that this artificially inflates their confidence/posterior, and you need error checking external to B-ism to detect it.
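To make the inflation concrete (a toy calculation of mine, not the post's example): updating twice on what is really one observation, e.g. the same supernova logged in two catalogs, produces confidence the evidence doesn't warrant.

```python
def update(prior, lh, lnh):
    # P(H|E) by Bayes' rule, with likelihoods P(E|H)=lh and P(E|~H)=lnh
    num = lh * prior
    return num / (num + lnh * (1 - prior))

honest = update(0.5, 0.8, 0.2)             # one genuine observation
double_counted = update(honest, 0.8, 0.2)  # the SAME observation, counted again

print(round(honest, 3), round(double_counted, 3))  # 0.8 vs 0.941
```

The second number is a spurious boost: no new information arrived, only a failure of individuation, and nothing in the update rule itself can catch it.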
Moreover, if you accidentally double counted some evidence for this or other matters (eg when testing predictions of how frequently supernovae should occur, based on nuclear theories about star internals), then learning about a new hypothesis in one area can retroactively affect other topics in unpredictable ways. You might learn that, if a named hypothesis $H_i$ is true, then you've been double-counting for decades. Those other topics/hypothesis spaces got contaminated by the incorrect prior understanding of what was being observed. Updates must therefore be non-local at least some of the time, and therefore the output of every update is technically an input to all other updates.
Possible Solutions
One way I can think of for B-ism to handle this is to partition priors conditional on hypotheses: you get multiple sets of priors, where each set assumes some hypothesis is true for the purpose of event individuation, and then you hope the sets eventually converge (intuition: it's possible to mathematically prove that 'it always converges' is false). This might be a lot of computation (compared to normal B-ism), but it seems feasible for finite hypothesis spaces (eg isolated coin-flip type stuff).
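A structural sketch of that partitioning idea (names and numbers are mine, purely illustrative): keep one posterior track per assumed individuation scheme, run every track over the same raw data, and check afterwards whether the tracks agree.

```python
def update(p, lh, lnh):
    """P(H|E) by Bayes' rule, with P(E|H)=lh and P(E|~H)=lnh."""
    n = lh * p
    return n / (n + lnh * (1 - p))

# Each partition assumes a different individuation of one raw triple flash,
# paired with hypothetical likelihoods (event, P(event|H), P(event|~H)).
partitions = {
    "one_burst": [("burst", 0.3, 0.001)],
    "three_flashes": [("flash", 0.3, 0.1)] * 3,
}

tracks = {}
for name, events in partitions.items():
    p = 0.5  # shared prior on H
    for _kind, lh, lnh in events:
        p = update(p, lh, lnh)
    tracks[name] = p

# The tracks need not converge as more data arrives; partitioning only makes
# the individuation-dependence explicit, it doesn't resolve it.
print({k: round(v, 3) for k, v in tracks.items()})
```

The bookkeeping cost here is a full posterior track per individuation hypothesis, which is where the "lot of computation" comes from once the partitions multiply.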
However, this formulation of B-ism completely breaks down with open hypothesis spaces (like science) because you need catch-all hypotheses (though there are attempts, like Solomonoff induction, to address this). Catch-alls are a problem because they are opaque (and not explanatory), so they cannot tell you how to individuate events, and thus you can't partition over catch-alls. Catch-alls encapsulate what you don't know, while EI requires an explanation of the structure and meaning behind the data. If you could expand catch-alls into actual theories, the computational requirement would explode. That said, figuring out how to do the expansion would be an achievement in and of itself -- it'd require something like iterating over all possible ideas, e.g., as computer programs.
Another way B-ism could handle EI is to formulate the hypothesis space so that any and all data observed is an event. Don't worry about individuation at all (since all data is atomic and each datum is a unique event), and just feed in data at max bandwidth. This seems kind of like what LLMs do during training. It maybe kinda works, but the tradeoff is that you lose all explanatory power. Using the B-ist's 'ideal bayesian reasoner' fallback, the best the ideal reasoner can be under this formulation is a kind of black-box oracle.
The problem for B-ism is that it does neither of these, and, practically speaking, people don't do the necessary steps (they often seem unaware of the problem) and probably don't want to. People like working at the level of explanations and conceptualized events rather than raw data. At the very least, both solutions move the process used by the 'ideal bayesian reasoner' further away from what people do and from what is practical.
See this post also on the CF forum: https://discuss.criticalfallibilism.com/t/some-more-issues-with-bayesianism/2164