Probly

Correlated Bernoulli Variables Calculator

Enter exactly three values to compute the rest. User-entered values are black, calculated values are grey. Invalid states are shown in red.


Conditional Probabilities

Correlation

Visualizations

Formulas

Controls, Summary Statistics (mean, variance, standard deviation, 25th/50th/75th percentiles, 50/90/95/99% intervals), Sampling, and Density plot panels

Controls, Summary Statistics, and Density plot panels
Formulas

The marginal distribution of the \(k\)-th order statistic

In an i.i.d. sample of size \(n\), the CDF and PDF of the \(k\)-th order statistic, \(X_{(k)}\), are:

$$ \begin{aligned} \mathbb{P}(X_{(k)} \le x) &= \sum_{i=k}^n {n \choose i} F(x)^i(1-F(x))^{n-i} \\ f_{X_{(k)}}(x) &= k{n\choose k}f(x)F(x)^{k-1}(1-F(x))^{n-k} \end{aligned} $$

Note that the case \(k=n\) is usually called the maximum, and \(k=1\) the minimum.
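
As a quick numerical cross-check, here is a small Python sketch of both formulas. The standard normal parent distribution and the sample size are illustrative choices, not app defaults.

```python
# Sketch: CDF and PDF of the k-th order statistic of an i.i.d. sample.
# A standard normal parent distribution is used for illustration; any
# scipy.stats distribution exposing cdf() and pdf() would work.
from math import comb
from scipy.stats import norm

def order_stat_cdf(x, n, k, dist=norm):
    """P(X_(k) <= x): at least k of the n observations are <= x."""
    F = dist.cdf(x)
    return sum(comb(n, i) * F**i * (1 - F)**(n - i) for i in range(k, n + 1))

def order_stat_pdf(x, n, k, dist=norm):
    """Density of the k-th order statistic at x."""
    F, f = dist.cdf(x), dist.pdf(x)
    return k * comb(n, k) * f * F**(k - 1) * (1 - F)**(n - k)

# Example: the median (k = 3) of a sample of n = 5 standard normal draws.
print(order_stat_cdf(0.0, n=5, k=3))   # 0.5, by symmetry
print(order_stat_pdf(0.0, n=5, k=3))
```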

Marginal Distributions

Marginal for X

Marginal for Y

Dependence Structure

Samples from Joint Distribution

Conditional Density

Formulas

Let \(u = F_X(x)\), \(v = F_Y(y)\), \(x' = \Phi^{-1}(u)\), and \(y' = \Phi^{-1}(v)\).

Conditional on \(X=x\)

$$ f_{Y|X}(y|x) = \frac{f_Y(y)}{\phi(y')} \frac{1}{\sqrt{1-\rho^2}} \phi\left(\frac{y' - \rho x'}{\sqrt{1-\rho^2}}\right) $$

Conditional on \(X \ge x\)

$$ f_{Y|X \ge x}(y) = \frac{f_Y(y)}{1-u} \left[ 1 - \Phi\left(\frac{x' - \rho y'}{\sqrt{1-\rho^2}}\right) \right] $$

Conditional on \(X \le x\)

$$ f_{Y|X \le x}(y) = \frac{f_Y(y)}{u} \Phi\left(\frac{x' - \rho y'}{\sqrt{1-\rho^2}}\right) $$
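
A small Python sketch of the first two conditional densities, assuming (for illustration) standard normal marginals for both \(X\) and \(Y\); any scipy.stats distribution with pdf() and cdf() methods could be substituted.

```python
# Sketch: conditional densities under a Gaussian copula with correlation rho.
# Both marginals are taken as standard normal for illustration.
import numpy as np
from scipy.stats import norm

def f_y_given_x_eq(y, x, rho, FX=norm, FY=norm):
    """f_{Y | X = x}(y)."""
    xp = norm.ppf(FX.cdf(x))            # x' = Phi^{-1}(F_X(x))
    yp = norm.ppf(FY.cdf(y))            # y' = Phi^{-1}(F_Y(y))
    s = np.sqrt(1.0 - rho**2)
    return FY.pdf(y) / norm.pdf(yp) * norm.pdf((yp - rho * xp) / s) / s

def f_y_given_x_ge(y, x, rho, FX=norm, FY=norm):
    """f_{Y | X >= x}(y)."""
    u = FX.cdf(x)
    xp, yp = norm.ppf(u), norm.ppf(FY.cdf(y))
    s = np.sqrt(1.0 - rho**2)
    return FY.pdf(y) / (1.0 - u) * (1.0 - norm.cdf((xp - rho * yp) / s))

# Example: density of Y at 0.5, conditioning on X = 1.0 resp. X >= 1.0.
print(f_y_given_x_eq(0.5, 1.0, rho=0.6))
print(f_y_given_x_ge(0.5, 1.0, rho=0.6))
```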

Process Parameters

\( dX_t = m\,dt + \sigma\,dW_t \)

Sample Paths of \( X_t \)


First Hitting Time \( T_b \)

Density of \( T_b = \inf\{t \ge 0 \mid X_t = b\} \)

Hitting Probability \( \mathbb{P}(T_b \le t) \)


Hitting Probability by Barrier Distance

\( \mathbb{P}(\{X_t = b \text{ for some } 0\le t\le T\}) \) as a function of the barrier \(b\)

Formulas

Distribution of the process at time \(t\)

The value of the process \(X_t\) at time \(t\) is normally distributed:

$$ X_t \sim \mathcal{N}(X_0 + mt, \sigma^2 t) $$

First Hitting Time

Let \(a = b - X_0\) be the distance to the barrier. The first time the process hits the barrier, \(T_b = \inf\{t \ge 0 \mid X_t = b\}\), follows an Inverse-Gaussian distribution with PDF (if the drift points away from the barrier, i.e. \(ma < 0\), the distribution is defective and the density integrates to \(\mathbb{P}(T_b < \infty) = e^{2ma/\sigma^2} < 1\)):

$$ f_{T_b}(t) = \frac{|a|}{\sigma\sqrt{2\pi t^3}} e^{-\frac{(a-mt)^2}{2\sigma^2 t}} $$
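
A minimal Python sketch of this density; the drift, volatility, and barrier values are illustrative.

```python
# Sketch: density of the first hitting time T_b, evaluated directly from
# the formula above. Drift, volatility, and barrier are illustrative.
import numpy as np

def hitting_time_pdf(t, b, x0=0.0, m=0.1, sigma=1.0):
    """f_{T_b}(t), with a = b - x0 the distance to the barrier."""
    a = b - x0
    t = np.asarray(t, dtype=float)
    return np.abs(a) / (sigma * np.sqrt(2.0 * np.pi * t**3)) \
        * np.exp(-(a - m * t)**2 / (2.0 * sigma**2 * t))

print(hitting_time_pdf([0.5, 1.0, 5.0], b=1.0))
```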

Hitting Probability by time \(T\)

The probability of hitting the barrier \(b\) at or before time \(T\) is:

$$ \mathbb{P}(T_b \le T) = \Phi\left(\frac{mT-a}{\sigma\sqrt{T}}\right) + e^{\frac{2ma}{\sigma^2}} \Phi\left(\frac{-mT-a}{\sigma\sqrt{T}}\right) $$

where \(a=b-X_0\) and \(\Phi\) is the standard normal CDF. This applies when \(a > 0\). If \(a < 0\), the signs of both \(a\) and \(m\) are flipped.
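
A small Python sketch of this formula, including the sign flip for \(a < 0\); the parameter values are illustrative.

```python
# Sketch: probability of hitting barrier b at or before time T, with the
# sign flip for a < 0 mentioned above. Parameter values are illustrative.
import numpy as np
from scipy.stats import norm

def hitting_prob(T, b, x0=0.0, m=0.1, sigma=1.0):
    a = b - x0
    if a == 0:
        return 1.0                      # the process starts on the barrier
    if a < 0:                           # barrier below the start: flip signs
        a, m = -a, -m
    sT = sigma * np.sqrt(T)
    return norm.cdf((m * T - a) / sT) \
        + np.exp(2.0 * m * a / sigma**2) * norm.cdf((-m * T - a) / sT)

print(hitting_prob(T=1.0, b=1.0))       # uses the default drift and volatility
```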

Expected Value Calculator

Enter the probability mass function (PMF) of a discrete random variable.

PMF Plot

Inputs

Define states, actions, prior probabilities, and payoffs to calculate the value of information.

Value of Information

Expected Value of Sample Information (EVSI)

Formulas & Assumptions

Decision making with (unreliable) information

Assume there are multiple possible states of the world, and multiple actions you can take. The question is: How much effort/money should you invest in learning more about which state of the world you’re in before making a decision? How reliable would your information source need to be to make the investment worth it?

Try out these examples:

  • Link: In the face of low/medium/high demand, should you build a piece of software yourself, or outsource it? How much should you be willing to pay for a somewhat reliable survey that tells you something about the demand landscape?
  • Link: With four hours’ time investment you could automate a business process that’ll save others some amount of time. How much time would you want to spend figuring out whether this process is done often enough and whether your automation would actually save time? And how accurate would your information need to be to make this research worthwhile, instead of just automating the process at the risk of wasting a little of your time?
  • Link: You have a machine that’s expensive to replace, and you can do maintenance if there’s a problem. Should you call an expert to come on site, or just do the maintenance?
  • Link: You want to book a flight for an event, but the organizers are still deciding between two possible dates. You can buy a non-refundable ticket at the risk of having to buy another one; you can wait, but the earlier flight will become more expensive; or you can buy a ticket that allows you to change the date (for a small fee). If a website offers to let you “reserve this ticket price for one week for $x”, how much should that be worth to you?

Model Definition

This model calculates the value of imperfect information under specific assumptions of uniform accuracy and symmetric error. This is a special case of the expected value of sample information (EVSI).

Inputs:

  • States \(S_i\): A set of \(n\) mutually exclusive states of nature, \(\{S_1, \ldots, S_n\}\).
  • Actions \(A_j\): A set of \(m\) possible actions, \(\{A_1, \ldots, A_m\}\).
  • Prior \(\pi\) over states: A prior probability distribution, where \(\pi_i = \mathbb{P}(S_i)\) and \(\sum \pi_i = 1\).
  • Payoffs \(V\): An \(m \times n\) matrix where \(V_{ji}\) is the payoff of taking action \(A_j\) if state \(S_i\) occurs.
  • Accuracy \(a\): A scalar \(a \in (0, 1)\) representing the probability that the information source correctly identifies the true state.

The Update Steps

We assume there’s a report \(R\) that identifies the state of nature with some possibility of error. The update from prior beliefs (\(\pi_i=\mathbb{P}(S_i)\)) to posterior beliefs (\(\mathbb{P}(S_k\mid R_j)\)) works by specifying the information's reliability and then applying Bayes' theorem.

1. Construct the Likelihood Matrix (\(L\))

The \(n \times n\) likelihood matrix \(L\), where \(L_{ij} = \mathbb{P}(R_j\mid S_i)\), formalizes the simplified “uniform accuracy \(a\), symmetric error” assumption:

  • The probability of a correct report is \(\mathbb{P}(R_i\mid S_i) = a\).
  • The probability of an incorrect report is \(1-a\), distributed evenly among the \(n-1\) other possibilities. Thus, \(\mathbb{P}(R_j\mid S_i) = \frac{1-a}{n-1}\) for \(i \neq j\).

Using the Kronecker delta \(\delta_{ij}\), the matrix elements are then: $$L_{ij} = a \cdot \delta_{ij} + \frac{1-a}{n-1} \cdot (1-\delta_{ij})$$
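
For illustration, this is how the likelihood matrix could be built in Python; the dimension and accuracy are made-up values, not app defaults.

```python
# Sketch: the n x n likelihood matrix L[i][j] = P(R_j | S_i) for uniform
# accuracy a and symmetric error. n and a are illustrative values.
import numpy as np

def likelihood_matrix(n, a):
    L = np.full((n, n), (1.0 - a) / (n - 1))   # off-diagonal: errors, spread evenly
    np.fill_diagonal(L, a)                     # diagonal: correct report
    return L

print(likelihood_matrix(3, a=0.8))
# Each row sums to 1: it is the report distribution given one true state.
```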

2. Calculate the Posterior Probabilities (\(\mathbb{P}(S_k\mid R_j)\))

Given that the survey has returned report \(R_j\), the updated (posterior) probability of the true state being \(S_k\) is found using Bayes' theorem:

$$\begin{aligned}\mathbb{P}(S_k\mid R_j) &= \frac{\mathbb{P}(R_j\mid S_k)\mathbb{P}(S_k)}{\mathbb{P}(R_j)}\\ &=\frac{L_{kj} \cdot \pi_k}{\sum_{i=1}^{n} L_{ij} \pi_i}\end{aligned}$$ This calculation is performed for each possible combination of true state \(S_k\) and report \(R_j\).
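
A possible vectorized Python version of this update; the prior and accuracy values are illustrative.

```python
# Sketch: posterior P(S_k | R_j) for every state/report pair via Bayes'
# theorem. The prior and the accuracy are illustrative values.
import numpy as np

prior = np.array([0.5, 0.3, 0.2])             # pi_i = P(S_i)
a, n = 0.8, len(prior)
L = np.full((n, n), (1.0 - a) / (n - 1))      # P(R_j | S_i) for i != j
np.fill_diagonal(L, a)                        # P(R_i | S_i) = a

p_report = prior @ L                          # P(R_j) = sum_i L[i, j] * pi_i
posterior = (L * prior[:, None]) / p_report   # posterior[k, j] = P(S_k | R_j)
print(posterior)                              # each column sums to 1
```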

Using the Updated Probabilities

These posterior probabilities are then used to find the expected value with the information.

  1. Expected Value without Information: $$EV_{\text{without}} = \max_{\text{action }a=1..m} \left( \sum_{i=1}^{n} V_{ai} \pi_i \right)$$
  2. Expected Value with Information: $$EV_{\text{with}} = \sum_{j=1}^{n} \left[ \max_{\text{action }a=1..m} \left( \sum_{k=1}^{n} V_{ak} \mathbb{P}(S_k\mid R_j) \right) \right] \mathbb{P}(R_j)$$ Note that \(\mathbb{P}(R_j) = \sum_i L_{ij}\pi_i\), and the limiting case of “perfect information” is \(\mathbb{P}(R_j\mid S_i) = \delta_{ij}\).
  3. EVSI & EVPI: $$EV\!SI = EV_{\text{with}} - EV_{\text{without}}$$ The expected value of perfect information (EVPI) is the same difference computed in the perfect-information limit \(a = 1\), i.e. \(L_{ij} = \delta_{ij}\); see the sketch below.
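
Putting the pieces together, a minimal Python sketch of the whole calculation; the prior, payoff matrix, and accuracy below are illustrative, not taken from the app.

```python
# Sketch: EV without/with information, EVSI, and EVPI under the uniform-
# accuracy model. The prior, payoffs, and accuracy are illustrative.
import numpy as np

prior = np.array([0.5, 0.3, 0.2])              # P(S_i)
V = np.array([[100.0, 20.0, -50.0],            # V[j, i]: payoff of action j in state i
              [ 40.0, 40.0,  40.0]])

def ev_with_information(accuracy):
    n = len(prior)
    L = np.full((n, n), (1.0 - accuracy) / (n - 1))
    np.fill_diagonal(L, accuracy)
    p_report = prior @ L                       # P(R_j)
    posterior = (L * prior[:, None]) / p_report
    best_per_report = (V @ posterior).max(axis=0)   # best action for each report
    return float(best_per_report @ p_report)

ev_without = float((V @ prior).max())          # best action under the prior alone
evsi = ev_with_information(0.8) - ev_without
evpi = ev_with_information(1.0) - ev_without   # perfect-information limit
print(ev_without, evsi, evpi)
```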

Inputs

Define hypotheses, priors, and enter evidence to see how probabilities update via Bayes' theorem.

Posterior Probability Trajectories

Formulas

Bayesian Updating

This app performs sequential Bayesian updating. We start with a set of mutually exclusive hypotheses \(H_1, \dots, H_m\) and prior probabilities \(\mathbb{P}(H_i)\) for each.

When a piece of evidence \(E\) is observed, we update our beliefs using Bayes' theorem. The posterior probability of hypothesis \(H_i\) given evidence \(E\) is calculated as:

$$ \mathbb{P}(H_i | E) = \frac{\mathbb{P}(E | H_i) \mathbb{P}(H_i)}{\sum_{j=1}^{m} \mathbb{P}(E | H_j) \mathbb{P}(H_j)} $$

Here, \(\mathbb{P}(E | H_i)\) is the likelihood of observing evidence \(E\) if hypothesis \(H_i\) is true.

Sequential Evidence

When multiple pieces of evidence \(E_1, E_2, \dots, E_n\) are observed sequentially, the posterior from one step becomes the prior for the next. Let \(\pi_i^{(k)} = \mathbb{P}(H_i | E_1, \dots, E_k)\). Then the update rule for evidence \(E_k\) is:

$$ \pi_i^{(k)} = \frac{\mathbb{P}(E_k | H_i) \pi_i^{(k-1)}}{\sum_{j=1}^{m} \mathbb{P}(E_k | H_j) \pi_j^{(k-1)}} $$

where \(\pi_i^{(0)}\) is the initial prior \(\mathbb{P}(H_i)\).

This assumes that each piece of evidence \(E_k\) is conditionally independent of the previous evidence \(E_1, \dots, E_{k-1}\) given the hypothesis \(H_i\). (This is the “naive Bayes” assumption, and will lead to overconfident probabilities if the pieces of evidence are correlated.)
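
A small Python sketch of this sequential update under the conditional-independence assumption; the prior and the likelihood values are made-up numbers.

```python
# Sketch: sequential Bayesian updating over m hypotheses. The prior and
# the per-evidence likelihoods P(E_k | H_i) are illustrative numbers.
import numpy as np

prior = np.array([0.4, 0.4, 0.2])              # P(H_i)
likelihoods = [np.array([0.9, 0.5, 0.1]),      # P(E_1 | H_i)
               np.array([0.7, 0.6, 0.3])]      # P(E_2 | H_i)

post = prior.copy()
for lik in likelihoods:                        # posterior becomes the next prior
    post = lik * post
    post /= post.sum()                         # normalize over all hypotheses
    print(post)
```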

Mixture Model

Define a mixture of probability distributions. Priors must sum to 1.

Table columns: Component Name, Prior/Weight, Distribution & Parameters

Mixture Density

Posterior Probabilities \(\mathbb{P}(\text{Component} | X=x)\)

Formulas

Mixture Density

The probability density function (PDF) of a mixture model is a weighted sum of the PDFs of its component distributions.

$$ f_{\text{mixture}}(x) = \sum_{i=1}^k w_i f_i(x; \theta_i) $$

where \(w_i\) is the weight (prior probability) of the i-th component, and \(f_i(x; \theta_i)\) is its PDF with parameters \(\theta_i\).

Posterior Probability

Given an observation \(x\), the posterior probability of it belonging to component \(C_i\) is given by Bayes' theorem:

$$ \mathbb{P}(C_i \mid X=x) = \frac{w_i f_i(x; \theta_i)}{\sum_{j=1}^k w_j f_j(x; \theta_j)} $$
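
For example, a two-component normal mixture in Python; the weights and component parameters are illustrative.

```python
# Sketch: mixture density and component posteriors for a two-component
# normal mixture. Weights and component parameters are illustrative.
import numpy as np
from scipy.stats import norm

weights = np.array([0.7, 0.3])                           # w_i, sum to 1
components = [norm(loc=0.0, scale=1.0), norm(loc=3.0, scale=0.5)]

def mixture_pdf(x):
    return sum(w * c.pdf(x) for w, c in zip(weights, components))

def component_posterior(x):
    """P(C_i | X = x) for each component i."""
    f = np.array([w * c.pdf(x) for w, c in zip(weights, components)])
    return f / f.sum()

print(mixture_pdf(1.5))
print(component_posterior(1.5))                          # sums to 1
```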

Inputs

Selections of \(k\) items from \(N\)

                        Order Matters                Order Doesn't Matter
Without Replacement     Permutations \(P(N, k)\)     Choices \(C(N, k)\)
With Replacement        Sequences \(N^k\)            Multisets \(C(N+k-1, k)\)

Birthday Problem / Collision Probability

Probability of at least one collision in a group of \(k\) items chosen from \(N\) possibilities (with replacement).

Formulas

Selections of \(k\) items from \(N\)

The number of ways to select \(k\) items from a set of \(N\) items depends on whether order matters and whether replacement is allowed.

  • Permutations (without replacement, order matters): The number of ways to choose and arrange \(k\) items from \(N\). $$P(N, k) = \frac{N!}{(N-k)!}$$
  • Choices/Combinations (without replacement, order doesn't matter): The number of ways to choose \(k\) items from \(N\). $$C(N, k) = \binom{N}{k} = \frac{N!}{k!(N-k)!}$$
  • Sequences (with replacement, order matters): The number of sequences of length \(k\) from \(N\) items. $$N^k$$
  • Multisets (with replacement, order doesn't matter): The number of ways to choose \(k\) items from \(N\) with replacement, also known as "stars and bars". $$\binom{N+k-1}{k} = \frac{(N+k-1)!}{k!(N-1)!}$$
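
A short Python sketch computing all four counts with exact integer arithmetic; the example values of \(N\) and \(k\) are arbitrary.

```python
# Sketch: the four counts from the table above, with exact integer
# arithmetic from the standard library. N and k are arbitrary examples.
from math import comb, perm

def selection_counts(N, k):
    return {
        "permutations P(N, k)": perm(N, k),              # no replacement, ordered
        "choices C(N, k)": comb(N, k),                   # no replacement, unordered
        "sequences N^k": N**k,                           # replacement, ordered
        "multisets C(N+k-1, k)": comb(N + k - 1, k),     # replacement, unordered
    }

print(selection_counts(N=10, k=3))
```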

Birthday Problem / Collision Probability

The probability that in a group of \(k\) items chosen from \(N\) possibilities with replacement, at least two are the same.

The probability of no matches is given by:

$$ P(\text{no match}) = \frac{P(N, k)}{N^k} = \frac{N!}{(N-k)! N^k} $$

The probability of at least one match is then \(1 - P(\text{no match})\).
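
A small Python sketch that computes this as a running product, which avoids the huge factorials for large \(N\); the classic \(N = 365\), \(k = 23\) birthday example is shown.

```python
# Sketch: collision probability as a running product, avoiding the huge
# factorials in N! for large N. Example: the classic birthday problem.
def collision_probability(N, k):
    p_no_match = 1.0
    for i in range(k):
        p_no_match *= (N - i) / N      # the (i+1)-th item avoids all earlier ones
    return 1.0 - p_no_match

print(collision_probability(365, 23))  # about 0.507
```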

About

This is a web app that helps you do quick probability calculations on your phone or computer. Its features fall into three main categories:

1. Exploring Probability Distributions

These tools let you visualize and work with probability distributions:

  • Distributions: Plot density functions for common continuous and discrete distributions, see summary statistics, and interactively calculate probabilities and conditional expectations.
  • Order Statistics: Calculate and plot the distribution of the k-th smallest value from a sample of n i.i.d. random variables.
  • Gaussian Copula: Construct and visualize joint distributions for two continuous random variables with arbitrary marginals and a given correlation.
  • Expected Value: Calculate the expected value from a discrete probability mass function, and optionally fit a continuous distribution to it.

2. Bayesian Reasoning

These tools are applications of Bayes' theorem, which is fundamental for updating your beliefs in light of new evidence:

  • Correlated Bernoulli Variables: Solve for the joint probability of two binary events given any combination of marginal probabilities, conditional probabilities, or their correlation.
  • Information Value: Calculate the economic value of acquiring new, imperfect information before making a decision (the Expected Value of Sample Information).
  • Mixture Models: For a value drawn from a mixed population, find the posterior probability that it came from a specific sub-population (component).
  • Multiple Hypothesis Testing: Track how the probabilities of competing hypotheses change as new evidence comes in.

3. Other Probabilistic Models

  • Wiener Process: Simulate sample paths of Brownian motion, and analyze first hitting time distributions and hitting probabilities.
  • Selection Combinatorics: Calculate permutations, combinations, and other counting problems. This app also includes a "Birthday Problem" calculator.

How it works

All calculations either have closed-form solutions or require only simple numerical computations. The app uses numeric.js and jStat for most of them. The values you input never leave your device; all computation is done locally.

The graphs and visualizations are created using d3.js.

If you make consequential decisions, don’t rely solely on the answers given here; consider double-checking the results with some Python code.

Bugs/Contributions

The code is available on GitHub. You are welcome to open issues there, or better yet, send a pull request. The purpose of the app is to enable quick and easy probability calculations – so I won’t be accepting contributions of the form “enter 20 data points and compute statistic X”; you’d want a proper data science environment for that.

Author & License

Copyright 2025 Julius Plenz. Published under the MIT license.

Install Probly

For a better experience, you can install this application on your device. It will work offline and feel like a native app. Click the button below to install.
