What is the purpose of dgeom, pgeom, qgeom, and rgeom in R and how can they be used in data analysis?

The dgeom, pgeom, qgeom, and rgeom functions in R are part of the geometric distribution family and are used to analyze data that follows a geometric distribution.

The dgeom function evaluates the probability mass function (PMF): the probability of observing exactly a given number of failures before the first success, given the probability of success on each trial.

The pgeom function evaluates the cumulative distribution function (CDF): the probability of observing a given number of failures or fewer before the first success.

The qgeom function evaluates the quantile function, which is the inverse of the CDF: the smallest number of failures whose cumulative probability reaches a given value.

The rgeom function generates random values from a geometric distribution with a given success probability. This can be used to simulate data for testing and analysis.

Together, these functions let you compute exact probabilities, cumulative probabilities, and quantiles for the number of failures before the first success in a series of independent trials, and simulate such data when you need to test an analysis.

A Guide to dgeom, pgeom, qgeom, and rgeom in R


This tutorial explains how to work with the geometric distribution in R using the following functions:

  • dgeom: returns the value of the geometric probability mass function.
  • pgeom: returns the value of the geometric cumulative distribution function.
  • qgeom: returns the value of the inverse geometric cumulative distribution function (the quantile function).
  • rgeom: generates a vector of geometrically distributed random values.

Here are some examples of cases where you might use each of these functions.

dgeom

The dgeom function finds the probability of experiencing a certain number of failures before experiencing the first success in a series of Bernoulli trials, using the following syntax:

dgeom(x, prob) 

where:

  • x: number of failures before first success
  • prob: probability of success on a given trial

Here’s an example of when you might use this function in practice:

A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the fourth person the researcher talks to is the first person to support the law?

dgeom(x=3, prob=.2)

#0.1024

The probability that the researcher experiences 3 “failures” before the first success is 0.1024.
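As a quick check, the same value can be reproduced by hand from the geometric probability mass function, P(X = x) = (1 - p)^x * p, where x counts failures (so the fourth person asked corresponds to x = 3). The following is a minimal sketch; the variable names p, x, manual, and builtin are illustrative and not part of the original example.

```r
# Probability of success on each trial and number of failures before it
p <- 0.2
x <- 3

# Geometric PMF written out by hand: x failures followed by one success
manual <- (1 - p)^x * p

# Built-in equivalent
builtin <- dgeom(x = x, prob = p)

# The two values agree up to floating-point tolerance
all.equal(manual, builtin)
```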

pgeom

The pgeom function finds the probability of experiencing a certain number of failures or fewer before experiencing the first success in a series of Bernoulli trials, using the following syntax:

pgeom(q, prob) 

where:

  • q: maximum number of failures before first success
  • prob: probability of success on a given trial

A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the researcher experiences 3 or fewer failures before finding someone who supports the law (that is, finds a supporter among the first 4 people asked)?

pgeom(q=3, prob=.2)

#0.5904

The probability that the researcher experiences 3 or fewer failures before finding someone who supports the law is 0.5904.

A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the researcher experiences more than 5 failures before finding someone who supports the law (that is, none of the first 6 people asked support it)?

1 - pgeom(q=5, prob=.2)

#0.262144

The probability that the researcher experiences more than 5 failures before finding someone who supports the law is 0.262144.
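To make the link between pgeom and dgeom concrete, the sketch below rebuilds the cumulative probability at q = 3 in two ways: as a sum of dgeom values and from the closed form 1 - (1 - p)^(q + 1). It also uses the lower.tail = FALSE argument of pgeom as an alternative to subtracting from 1. The variable names are illustrative only.

```r
p <- 0.2

# CDF at q = 3 as the sum of the PMF over 0, 1, 2, and 3 failures
cdf_by_sum <- sum(dgeom(0:3, prob = p))

# Closed form for the geometric CDF: 1 - (1 - p)^(q + 1)
cdf_closed <- 1 - (1 - p)^(3 + 1)

# Upper tail P(X > 5) without subtracting from 1
upper_tail <- pgeom(q = 5, prob = p, lower.tail = FALSE)

# Each comparison agrees up to floating-point tolerance
all.equal(cdf_by_sum, pgeom(q = 3, prob = p))
all.equal(cdf_closed, pgeom(q = 3, prob = p))
all.equal(upper_tail, 1 - pgeom(q = 5, prob = p))
```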

qgeom

The qgeom function finds the number of failures that corresponds to a certain percentile, using the following syntax:

qgeom(p, prob) 

where:

  • p: percentile, expressed as a cumulative probability between 0 and 1 (e.g. 0.90 for the 90th percentile)
  • prob: probability of success on a given trial

Here’s an example of when you might use this function in practice:

A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. We will consider a “failure” to mean that a person does not support the law. How many “failures” would the researcher need to experience to be at the 90th percentile for number of failures before the first success?

qgeom(p=.90, prob=0.2)

#10

The researcher would need to experience 10 “failures” to be at the 90th percentile for number of failures before the first success.
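Because the geometric distribution is discrete, qgeom returns the smallest number of failures whose cumulative probability is at least p. The sketch below checks this against pgeom for the example above; the names target and k are illustrative.

```r
p <- 0.2
target <- 0.90

# Smallest number of failures whose cumulative probability reaches the target
k <- qgeom(p = target, prob = p)

# One failure fewer falls short of the target (about 0.893)
pgeom(q = k - 1, prob = p)

# k failures reaches or exceeds the target (about 0.914)
pgeom(q = k, prob = p)
```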

rgeom

The rgeom function generates a vector of random values that represent the number of failures before the first success, using the following syntax:

rgeom(n, prob) 

where:

  • n: number of values to generate
  • prob: probability of success on a given trial

Here’s an example of when you might use this function in practice:

A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. We will consider a “failure” to mean that a person does not support the law. Simulate 10 scenarios for how many “failures” the researcher will experience until she finds someone who supports the law.

set.seed(0) #make this example reproducible

rgeom(n=10, prob=.2)

# 1 2 1 10 7 4 1 7 4 1

The way to interpret this is as follows:

  • During the first simulation, the researcher experienced 1 failure before finding someone who supported the law.
  • During the second simulation, the researcher experienced 2 failures before finding someone who supported the law.
  • During the third simulation, the researcher experienced 1 failure before finding someone who supported the law.
  • During the fourth simulation, the researcher experienced 10 failures before finding someone who supported the law.

And so on.
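With a larger simulation, the draws can be compared against theory. The sketch below assumes a sample size of 100,000 (an arbitrary choice for illustration) and compares the sample mean with the theoretical mean of the failures-before-first-success distribution, (1 - p) / p, as well as the share of draws equal to 3 with dgeom(3, 0.2). Exact simulated values are not shown because they depend on the random number generator.

```r
set.seed(0)  # make the simulation reproducible
p <- 0.2

# Draw a large sample of "failures before first success"
draws <- rgeom(n = 100000, prob = p)

# Theoretical mean for the failures-before-first-success version: (1 - p) / p = 4
theoretical_mean <- (1 - p) / p

# The sample mean should be close to the theoretical mean
mean(draws)

# The share of draws equal to 3 should be close to dgeom(3, 0.2) = 0.1024
mean(draws == 3)
```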
