In probability analysis, two variables determine the chance of an event happening at least once: **N** (the number of observations, or trials) and **λ** (lambda, our hit rate, i.e. the chance of occurrence in a single trial). When we talk about a cumulative binomial probability, we mean the probability that the event occurs at least once across N independent trials: the greater the number of trials, the higher this overall probability.

$$\text{probability} = 1 - (1 - \lambda)^{N}$$

For instance, let us suppose that the probability of a goal being scored in the first five minutes of a football match is **0.05**. This is our λ.

Now, let us suppose that 50 different matches are played, each independent of the others. The probability that a goal will be scored in the first five minutes of at least one of those matches rises to over 92%:

$$1 - (1 - 0.05)^{50} = 0.923055$$
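The arithmetic above can be checked directly in R. The following is a minimal sketch, assuming the 50 matches are independent and share the same λ; the variable names are chosen for illustration only. It also compares the closed-form result with base R's `pbinom`, since "at least one early goal" is the complement of "zero successes" in a binomial distribution.

```r
lambda <- 0.05   # probability of an early goal in a single match
n <- 50          # number of matches (assumed independent)

# Closed form: 1 minus the probability of no early goal in any match
p_closed <- 1 - (1 - lambda)^n

# Same quantity via the binomial distribution: P(X >= 1) = 1 - P(X <= 0)
p_binom <- 1 - pbinom(0, size = n, prob = lambda)

print(round(p_closed, 6))
print(all.equal(p_closed, p_binom))
```

Both routes should agree up to floating-point tolerance, which is a quick way to confirm that the formula is simply the binomial "at least one success" probability.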

The larger the number of independent trials, the larger the probability of the event happening at least once, even if the probability within a single trial is very low: as N grows, (1 – λ)^N shrinks towards zero. So, let us generate cumulative binomial probabilities to demonstrate how the probability increases with the number of trials.
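Before moving on to the curves, the formula itself can be sanity-checked by simulation. The sketch below assumes independent matches with a common λ of 0.05; the seed, the number of repetitions, and the variable names are arbitrary choices made for illustration.

```r
set.seed(42)       # arbitrary seed, for reproducibility only
lambda <- 0.05     # single-match probability of an early goal
n <- 50            # matches per simulated run
reps <- 100000     # number of simulated runs (illustrative)

# Each draw is the number of matches (out of n) with an early goal;
# the event of interest occurs when that count is at least 1
early_goal_counts <- rbinom(reps, size = n, prob = lambda)
p_simulated <- mean(early_goal_counts >= 1)
p_formula <- 1 - (1 - lambda)^n

print(c(simulated = p_simulated, formula = p_formula))
```

The simulated share of runs with at least one early goal should land close to the value given by the formula, with the gap narrowing as `reps` increases.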

Firstly, we define a function and evaluate it with single-trial probabilities of 2%, 4%, and 6%, for 0 to 100 trials:

    par(bg = '#191661', fg = '#ffffff', col.main = '#ffffff', col.lab = '#ffffff', col.axis = '#ffffff')

    # lambda = probability of event occurring in a single trial
    # powers = number of trials
    # mu = overall probability given n number of trials
    muCalculation <- function(lambda, powers) {1 - ((1 - lambda)^powers)}

    probability_at_lambda <- sapply(c(0.02, 0.04, 0.06), muCalculation, seq(0, 100, 1))

Then, we can store the results in a data frame and plot them as normal:

    probability_at_lambdadf <- data.frame(probability_at_lambda)
    names(probability_at_lambdadf) <- c("probability1", "probability2", "probability3")
    head(probability_at_lambdadf)

    # trial counts matching the rows of the data frame (N = 0, 1, ..., 100)
    trials <- seq(0, 100, 1)

    plot(trials, probability_at_lambdadf$probability1, type="o", col="#b1aef4", xlab="N", ylab="Probability", xlim=c(0, 100), ylim=c(0.0, 1.0), pch=19)
    lines(trials, probability_at_lambdadf$probability2, type="o", col="red", pch=19)
    lines(trials, probability_at_lambdadf$probability3, type="o", col="green", pch=19)
    title(main="Probability Chart")
    grid(nx = NULL, ny = NULL, col = "lightgray", lty = "dotted", lwd = par("lwd"), equilogs = TRUE)
    legend("bottomright", legend = c("probability_at_lambda_1", "probability_at_lambda_2", "probability_at_lambda_3"), cex=0.6, col=c("#b1aef4", "red", "green"), pch=19, lty=1)

**Sample Table**

Here is a sample table with the calculated probabilities (probability_at_lambdadf):

**Plot**

Accordingly, here is a plot of the probabilities:

## Analyse Cumulative Binomial Probability with a Shiny Web Application

This is an example of a Shiny Web application that can calculate cumulative binomial probabilities on the fly.

You’ll remember that our previous R script invoked a function to calculate binomial probabilities based on lambda (the probability of an event happening), and the power value (or number of trials).

The idea is that while the probability of an individual event happening may be low, the cumulative probability of the event happening increases with the number of trials.

$$1 - (1 - \lambda)^{N}$$
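Rearranging the formula above also gives the number of trials needed to reach a target cumulative probability: N must be at least log(1 – target) / log(1 – λ). The helper below is a hypothetical illustration (the name `trials_needed` and its arguments are not part of the original script), and it assumes independent trials with a constant λ.

```r
# Hypothetical helper: smallest N such that 1 - (1 - lambda)^N >= target
trials_needed <- function(lambda, target) {
  stopifnot(lambda > 0, lambda < 1, target > 0, target < 1)
  ceiling(log(1 - target) / log(1 - lambda))
}

# Example: trials needed for a 95% chance of at least one occurrence
# when the single-trial probability is 0.05
n_95 <- trials_needed(lambda = 0.05, target = 0.95)
print(n_95)
print(1 - (1 - 0.05)^n_95)
```

This is the same relationship the sliders let you explore visually: a smaller λ pushes the point at which the curve crosses a given probability further to the right.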

Here is an example of a Shiny Web App that allows us to manipulate the lambda values using a set of sliders and automatically update the probability curve.

To run this app, open RStudio, click **File -> New File -> Shiny Web App**, and select either Single File to paste the **ui.R** and **server.R** code together, or Multiple File to paste them separately.

Additionally, if you are new to Shiny you can find my **full tutorial on Sitepoint** that describes how to build and run a Shiny app from scratch.

**ui.R**

A few points when setting up the UI (User Interface):

- **lambda** represents the probability of an event occurring in a single trial.
- The slider input allows the user to set different values for lambda based on the associated probability.
- The plot is then output under the name "ProbPlot".

    library(shiny)

    # Define UI for application that draws a probability plot
    shinyUI(fluidPage(

      # Application title
      titlePanel("Cumulative Binomial Probability Plot"),

      # Sidebar with a slider input for value of lambda
      sidebarLayout(
        sidebarPanel(
          sliderInput("lambda", "Probability 1:", min = 0, max = 1, value = 0.01),
          sliderInput("lambda2", "Probability 2:", min = 0, max = 1, value = 0.01),
          sliderInput("lambda3", "Probability 3:", min = 0, max = 1, value = 0.01)
        ),

        # Show a plot of the generated probability plot
        mainPanel(
          plotOutput("ProbPlot")
        )
      )
    ))

**server.R**

Now, we set up the server - this is the part that takes the inputs and calculates the output that is eventually shown in the UI.

- The **lambda** values represent the inputs that we defined in the UI; i.e. the user sets the probability from the slider.
- The probability function is defined: **{1 - ((1 - lambda)^powers)}**
- The separate probability arrays are then calculated (probability_at_lambda, probability_at_lambda2, probability_at_lambda3)
- The probability is then plotted.

    library(shiny)

    # Shiny Application
    shinyServer(function(input, output) {

      # Reactive expressions
      output$ProbPlot <- renderPlot({

        # number of trials to evaluate (N = 0, 1, ..., 100)
        trials <- seq(0, 100, 1)

        # cumulative probability for each lambda set by the sliders in ui.R
        muCalculation <- function(lambda, powers) {1 - ((1 - lambda)^powers)}
        probability_at_lambda <- muCalculation(input$lambda, trials)
        probability_at_lambda2 <- muCalculation(input$lambda2, trials)
        probability_at_lambda3 <- muCalculation(input$lambda3, trials)

        # draw the probability
        par(bg = '#191661', fg = '#ffffff', col.main = '#ffffff', col.lab = '#ffffff', col.axis = '#ffffff')
        plot(trials, probability_at_lambda, type="o", col="#b1aef4", xlab="N", ylab="Probability", xlim=c(0, 100), ylim=c(0.0, 1.0), pch=19)
        lines(trials, probability_at_lambda2, type="o", col="red", pch=19)
        lines(trials, probability_at_lambda3, type="o", col="green", pch=19)
        title(main="Cumulative Binomial Probability")
      })
    })
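For the Single File option mentioned earlier, the UI and server can live together in one script. The following is a rough sketch of how that might look, not the layout used in the original app: the object names `ui`, `server`, and `app`, and the use of `matplot` to draw all three curves in one call, are choices made here for illustration. It assumes the shiny package is installed.

```r
library(shiny)

# Cumulative probability of at least one occurrence in `powers` trials
muCalculation <- function(lambda, powers) {
  1 - (1 - lambda)^powers
}

ui <- fluidPage(
  titlePanel("Cumulative Binomial Probability Plot"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("lambda", "Probability 1:", min = 0, max = 1, value = 0.01),
      sliderInput("lambda2", "Probability 2:", min = 0, max = 1, value = 0.01),
      sliderInput("lambda3", "Probability 3:", min = 0, max = 1, value = 0.01)
    ),
    mainPanel(plotOutput("ProbPlot"))
  )
)

server <- function(input, output) {
  output$ProbPlot <- renderPlot({
    trials <- 0:100
    # One column of probabilities per slider value
    curves <- sapply(c(input$lambda, input$lambda2, input$lambda3),
                     muCalculation, powers = trials)
    matplot(trials, curves, type = "l", lty = 1,
            col = c("#b1aef4", "red", "green"),
            xlab = "N", ylab = "Probability", ylim = c(0, 1),
            main = "Cumulative Binomial Probability")
  })
}

# Bundle the two halves into a single app object
app <- shinyApp(ui = ui, server = server)
```

From an interactive R session, calling `runApp(app)` should launch the app in a browser. Keeping `muCalculation` outside the server function also makes it easy to test on its own.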

## Conclusion

Today, you have learned how to:

- Generate a cumulative binomial probability distribution using R
- Use Shiny to visualise cumulative binomial probability

If you have any questions, please leave them in the comments below and I'll do my best to answer them.