Random number generator in arbitrary probability distribution fashion
Last Updated: 11 Sep, 2023
Given n numbers, each with some frequency of occurrence, return a random number with probability proportional to its frequency of occurrence.
Example:
Let the following be the given numbers.
arr[] = {10, 30, 20, 40}
Let the following be the frequencies of the given numbers.
freq[] = {1, 6, 2, 1}
The output should be
10 with probability 1/10
30 with probability 6/10
20 with probability 2/10
40 with probability 1/10
It is quite clear that the simple random number generator won’t work here as it doesn’t keep track of the frequency of occurrence.
We need to somehow transform the problem into a problem whose solution is known to us.
One simple method is to take an auxiliary array (say aux[]) and duplicate the numbers according to their frequency of occurrence. Generate a random number (say r) between 0 and Sum-1 (both inclusive), where Sum is the sum of the frequency array (freq[] in the above example). Return aux[r]. (Implementation of this method is left as an exercise to the readers.)
The limitation of this method is huge memory consumption when the frequencies are large. If the frequencies are 997, 8761 and 1, aux[] needs 9759 entries to represent just three numbers, which is clearly not efficient.
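For readers who want to compare against their own attempt, the auxiliary-array method could be sketched in Python roughly as follows. The names build_aux and my_rand_naive are illustrative and do not come from the article; the sketch assumes non-negative integer frequencies with a positive sum.

```python
import random

def build_aux(arr, freq):
    """Duplicate each value of arr according to its frequency."""
    aux = []
    for value, count in zip(arr, freq):
        aux.extend([value] * count)
    return aux

def my_rand_naive(arr, freq):
    """Return a value of arr with probability proportional to freq."""
    aux = build_aux(arr, freq)
    # r is uniform over 0..Sum-1 (both inclusive), where Sum == len(aux).
    r = random.randrange(len(aux))
    return aux[r]

arr = [10, 30, 20, 40]
freq = [1, 6, 2, 1]
print(build_aux(arr, freq))      # [10, 30, 30, 30, 30, 30, 30, 20, 20, 40]
print(my_rand_naive(arr, freq))  # one of 10, 30, 20, 40
```

The length of aux equals the sum of the frequencies, which is exactly the memory cost the article objects to.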
How can we reduce the memory consumption? The following algorithm uses O(n) extra space, where n is the number of elements in the input arrays.
1. Take an auxiliary array (say prefix[]) of size n.
2. Populate it with prefix sums of the frequencies, such that prefix[i] is the sum of freq[0] through freq[i].
3. Generate a random number (say r) between 1 and Sum (both inclusive), where Sum is the sum of the input frequency array.
4. Find the index of the ceiling of r in the prefix array, that is, the index of the smallest entry that is greater than or equal to r. Let the index be indexc.
5. Return arr[indexc], where arr[] contains the input n numbers.
Before we go to the implementation, let us take a quick look at the algorithm with an example:
arr[]: {10, 20, 30}
freq[]: {2, 3, 1}
prefix[]: {2, 5, 6}
Since the last entry in prefix[] is 6, all possible values of r are [1, 2, 3, 4, 5, 6]
1: Ceil is 2. Random number generated is 10.
2: Ceil is 2. Random number generated is 10.
3: Ceil is 5. Random number generated is 20.
4: Ceil is 5. Random number generated is 20.
5: Ceil is 5. Random number generated is 20.
6: Ceil is 6. Random number generated is 30.
In the above example
10 is generated with probability 2/6.
20 is generated with probability 3/6.
30 is generated with probability 1/6.
How does this work?
Any number arr[i] is generated with probability proportional to freq[i] because exactly freq[i] integers lie in the range (prefix[i – 1], prefix[i]] (taking prefix[–1] as 0), and the ceiling of each of them is prefix[i]. In the above example, 20 is generated for three values of r, as there are 3 integers (3, 4 and 5) whose ceiling is 5.
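The counting argument above can be checked by enumerating every possible r instead of drawing it at random. The following is a small sketch, not part of the original article: the function name ceil_index_counts is hypothetical, and it uses Python's standard bisect_left as the ceiling search, which returns the first index whose prefix entry is greater than or equal to r.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

def ceil_index_counts(arr, freq):
    """Count how many r in 1..Sum map to each value of arr."""
    prefix = list(accumulate(freq))  # e.g. [2, 5, 6] for freq [2, 3, 1]
    counts = Counter()
    for r in range(1, prefix[-1] + 1):
        # Index of the ceiling of r in the prefix array.
        indexc = bisect_left(prefix, r)
        counts[arr[indexc]] += 1
    return dict(counts)

print(ceil_index_counts([10, 20, 30], [2, 3, 1]))  # {10: 2, 20: 3, 30: 1}
```

Each value is hit exactly freq[i] times, so a uniformly random r selects it with probability freq[i] / Sum.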
C++
// C++ program to generate random numbers
// according to given frequency distribution
#include <bits/stdc++.h>
using namespace std;
// Utility function to find ceiling of r in arr[l..h]
int findCeil(int arr[], int r, int l, int h)
{
int mid;
while (l < h)
{
mid = l + ((h - l) >> 1); // Same as mid = (l+h)/2
(r > arr[mid]) ? (l = mid + 1) : (h = mid);
}
return (arr[l] >= r) ? l : -1;
}
// The main function that returns a random number
// from arr[] according to distribution array
// defined by freq[]. n is size of arrays.
int myRand(int arr[], int freq[], int n)
{
// Create and fill prefix array
int prefix[n], i;
prefix[0] = freq[0];
for (i = 1; i < n; ++i)
prefix[i] = prefix[i - 1] + freq[i];
// prefix[n-1] is sum of all frequencies.
// Generate a random number with
// value from 1 to this sum
int r = (rand() % prefix[n - 1]) + 1;
// Find index of ceiling of r in prefix array
int indexc = findCeil(prefix, r, 0, n - 1);
return arr[indexc];
}
// Driver code
int main()
{
int arr[] = {1, 2, 3, 4};
int freq[] = {10, 5, 20, 100};
int i, n = sizeof(arr) / sizeof(arr[0]);
// Use a different seed value for every run.
srand(time(NULL));
// Let us generate 5 random numbers according to
// given distribution
for (i = 0; i < 5; i++)
cout << myRand(arr, freq, n) << endl;
return 0;
}
// This code is contributed by rathbhupendra
C
//C program to generate random numbers according to given frequency distribution
#include <stdio.h>
#include <stdlib.h>
#include <time.h> // for time(), used to seed rand()
// Utility function to find ceiling of r in arr[l..h]
int findCeil(int arr[], int r, int l, int h)
{
int mid;
while (l < h)
{
mid = l + ((h - l) >> 1); // Same as mid = (l+h)/2
(r > arr[mid]) ? (l = mid + 1) : (h = mid);
}
return (arr[l] >= r) ? l : -1;
}
// The main function that returns a random number from arr[] according to
// distribution array defined by freq[]. n is size of arrays.
int myRand(int arr[], int freq[], int n)
{
// Create and fill prefix array
int prefix[n], i;
prefix[0] = freq[0];
for (i = 1; i < n; ++i)
prefix[i] = prefix[i - 1] + freq[i];
// prefix[n-1] is sum of all frequencies. Generate a random number
// with value from 1 to this sum
int r = (rand() % prefix[n - 1]) + 1;
// Find index of ceiling of r in prefix array
int indexc = findCeil(prefix, r, 0, n - 1);
return arr[indexc];
}
// Driver program to test above functions
int main()
{
int arr[] = {1, 2, 3, 4};
int freq[] = {10, 5, 20, 100};
int i, n = sizeof(arr) / sizeof(arr[0]);
// Use a different seed value for every run.
srand(time(NULL));
// Let us generate 5 random numbers according to
// given distribution
for (i = 0; i < 5; i++)
printf("%d\n", myRand(arr, freq, n));
return 0;
}
Java
// Java program to generate random numbers
// according to given frequency distribution
class GFG
{
// Utility function to find ceiling of r in arr[l..h]
static int findCeil(int arr[], int r, int l, int h)
{
int mid;
while (l < h)
{
mid = l + ((h - l) >> 1); // Same as mid = (l+h)/2
if(r > arr[mid])
l = mid + 1;
else
h = mid;
}
return (arr[l] >= r) ? l : -1;
}
// The main function that returns a random number
// from arr[] according to distribution array
// defined by freq[]. n is size of arrays.
static int myRand(int arr[], int freq[], int n)
{
// Create and fill prefix array
int prefix[] = new int[n], i;
prefix[0] = freq[0];
for (i = 1; i < n; ++i)
prefix[i] = prefix[i - 1] + freq[i];
// prefix[n-1] is sum of all frequencies.
// Generate a random number with
// value from 1 to this sum
// Math.random() is in [0, 1), so r is uniform over 1..prefix[n - 1]
int r = (int)(Math.random() * prefix[n - 1]) + 1;
// Find index of ceiling of r in prefix array
int indexc = findCeil(prefix, r, 0, n - 1);
return arr[indexc];
}
// Driver code
public static void main(String args[])
{
int arr[] = {1, 2, 3, 4};
int freq[] = {10, 5, 20, 100};
int i, n = arr.length;
// Let us generate 5 random numbers according to
// given distribution
for (i = 0; i < 5; i++)
System.out.println( myRand(arr, freq, n) );
}
}
// This code is contributed by Arnab Kundu
Python3
# Python3 program to generate random numbers
# according to given frequency distribution
import random

# Utility function to find ceiling of r in arr[l..h]
def findCeil(arr, r, l, h):
    while l < h:
        mid = l + ((h - l) >> 1)  # Same as mid = (l+h)//2
        if r > arr[mid]:
            l = mid + 1
        else:
            h = mid
    if arr[l] >= r:
        return l
    else:
        return -1

# The main function that returns a random number
# from arr[] according to distribution array
# defined by freq[]. n is size of arrays.
def myRand(arr, freq, n):
    # Create and fill prefix array
    prefix = [0] * n
    prefix[0] = freq[0]
    for i in range(1, n):
        prefix[i] = prefix[i - 1] + freq[i]
    # prefix[n-1] is sum of all frequencies.
    # Generate a random number with value from
    # 1 to this sum (randint includes both ends)
    r = random.randint(1, prefix[n - 1])
    # Find index of ceiling of r in prefix array
    indexc = findCeil(prefix, r, 0, n - 1)
    return arr[indexc]

# Driver code
arr = [1, 2, 3, 4]
freq = [10, 5, 20, 100]
n = len(arr)
# Let us generate 5 random numbers according to
# given distribution
for i in range(5):
    print(myRand(arr, freq, n))
# This code is contributed by divyesh072019
C#
// C# program to generate random numbers
// according to given frequency distribution
using System;
class GFG{
// Utility function to find ceiling
// of r in arr[l..h]
static int findCeil(int[] arr, int r,
int l, int h)
{
int mid;
while (l < h)
{
// Same as mid = (l+h)/2
mid = l + ((h - l) >> 1);
if (r > arr[mid])
l = mid + 1;
else
h = mid;
}
return (arr[l] >= r) ? l : -1;
}
// The main function that returns a random number
// from arr[] according to distribution array
// defined by freq[]. n is size of arrays.
static int myRand(int[] arr, int[] freq)
{
int n = arr.Length;
// Create and fill prefix array
int[] prefix = new int[n];
int i;
prefix[0] = freq[0];
for(i = 1; i < n; ++i)
prefix[i] = prefix[i - 1] + freq[i];
// prefix[n-1] is sum of all frequencies.
// Generate a random number with
// value from 1 to this sum
Random rand = new Random();
// rand.Next(max) returns a value in 0..max-1, so add 1
int r = rand.Next(prefix[n - 1]) + 1;
// Find index of ceiling of r in prefix array
int indexc = findCeil(prefix, r, 0, n - 1);
return arr[indexc];
}
// Driver Code
static void Main()
{
int[] arr = { 1, 2, 3, 4 };
int[] freq = { 10, 5, 20, 100 };
// Let us generate 5 random numbers
// according to given distribution
for(int i = 0; i < 5; i++)
Console.WriteLine(myRand(arr, freq));
}
}
// This code is contributed by divyeshrabadiya07
JavaScript
<script>
// JavaScript program to generate random numbers
// according to given frequency distribution
// Utility function to find ceiling of r in arr[l..h]
function findCeil(arr, r, l, h)
{
let mid;
while (l < h)
{
mid = l + ((h - l) >> 1); // Same as mid = (l+h)/2
(r > arr[mid]) ? (l = mid + 1) : (h = mid);
}
return (arr[l] >= r) ? l : -1;
}
// The main function that returns a random number
// from arr[] according to distribution array
// defined by freq[]. n is size of arrays.
function myRand(arr, freq, n) {
// Create and fill prefix array
let prefix= [];
let i;
prefix[0] = freq[0];
for (i = 1; i < n; ++i)
prefix[i] = prefix[i - 1] + freq[i];
// prefix[n-1] is sum of all frequencies.
// Generate a random number with
// value from 1 to this sum
let r = Math.floor((Math.random()* prefix[n - 1])) + 1;
// Find index of ceiling of r in prefix array
let indexc = findCeil(prefix, r, 0, n - 1);
return arr[indexc];
}
// Driver code
let arr = [1, 2, 3, 4];
let freq = [10, 5, 20, 100];
let i;
let n = arr.length;
// Let us generate 5 random numbers according to
// given distribution
for (i = 0; i < 5; i++)
document.write(myRand(arr, freq, n) + "<br>");
// This code is contributed by rohitsingh07052.
</script>
Output: May be different for different runs
4
3
4
4
4
Time Complexity: O(n) per call. Building the prefix array takes O(n); the binary search for the ceiling takes only O(log n).
Auxiliary Space: O(n) for the prefix array.
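When many numbers are drawn from the same distribution, the prefix array does not have to be rebuilt for every draw. The sketch below is one possible arrangement, not the article's implementation: the class name FrequencySampler and its rng parameter are assumptions made for illustration. It builds the prefix array once in O(n), after which each draw costs O(log n).

```python
import random
from bisect import bisect_left
from itertools import accumulate

class FrequencySampler:
    """Draws values of arr with probability proportional to freq."""

    def __init__(self, arr, freq, rng=None):
        if not arr or len(arr) != len(freq):
            raise ValueError("arr and freq must be non-empty and equal in length")
        self.arr = list(arr)
        # Built once: prefix[i] is freq[0] + ... + freq[i].
        self.prefix = list(accumulate(freq))
        # rng is an optional random.Random instance (useful for seeding).
        self.rng = rng if rng is not None else random.Random()

    def sample(self):
        # r is uniform over 1..Sum, both ends included.
        r = self.rng.randint(1, self.prefix[-1])
        # First index whose prefix entry is >= r, i.e. the ceiling of r.
        return self.arr[bisect_left(self.prefix, r)]

sampler = FrequencySampler([1, 2, 3, 4], [10, 5, 20, 100])
print([sampler.sample() for _ in range(5)])  # varies from run to run
```

Python's standard library also provides random.choices(population, weights=...), which performs weighted selection directly and may be preferable when a hand-written sampler is not required.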
This article is compiled by Aashish Barnwal.