Compute a random subset

Problem statement

Write a function that takes a positive numbers n and k, eq. upper bound, size and returns an array, eq. a subset of randomly chosen numbers in 0...n of size k.

All subsets should be equally likely and, in addition, all permutations of numbers of the array should be equally likely.

EXAMPLE:

  Input: n = 9, k = 5

  Output: [1, 4, 5, 0, 6]

Solution

Here is the actual code of the algorithm that implements a solution to randomly choose a subset of k numbers in 0...n, such that any combination is equally likely possible:

func subset(n: Int, k: Int) -> [Int] {
    // Prepare state of the algorithm
    var map: [Int:Int] = [:]
    // Process each element of the subset
    for ix in 0...k - 1 {
        // Select next random element in {ix...n}
        let rx = Int.random(in: ix...n)
        // Query for existing keys in the map
        let ixv = map[ix], rxv = map[rx]
        // Insert or update `ix ~> rxv/rx` mapping pair
        map[ix] = rxv ?? rx
        // Insert or update `rx ~> ixv/ix` mapping pair
        map[rx] = ixv ?? ix
    }
    // Prepare an array to copy the resulting subset
    var seq = [Int](repeating: 0, count: k)
    // Query & copy each element of the subset in the map
    for i in 0...k - 1 { seq[i] = map[i]! }
    // Here is the answer
    return seq
}

Following is an example that the code performs as advertised and establish a baseline for any improvement work in the future:

Compute a random subset test assertions

Bits & pieces

This coding challenge introduces reader to a very ineresting technique that strikes a remarkable balance between time and space complexity characteristics and effectively solves the problem at hand.

Lets get started!

The first statement instantiates map variable used as a dictionary to keep randomly chosen numbers and as a data structure to resolve conflicts with duplicates:

// Prepare state of the algorithm
var map: [Int:Int] = [:]

Next statement is the main loop that is executed k - 1 times and randomly picks numbers in the specified range:

// Process each element of the subset
for ix in 0...k - 1 {
    ... inner block of code ...
}

Following statement, the first in the loop, is the instruction that randomly picks numbers from the given range using sliding window approach to minimize chances of chosing the same number multiple times:

// Select next random element in {ix...n}
let rx = Int.random(in: ix...n)

This statement queries the dictionary to find out whether the specified keys have been already discovered and added to the mapping, knowing about the existing keys and their values is an important part of the algorithm’s strategy used to solve conflicts with duplicates:

// Query for existing keys in the map
let ixv = map[ix], rxv = map[rx]

Next statement either creates a new or updates an existing mapping between ix and rxv/rx values. In case our random numbers generator has yielded a number that already exists, the code will use the number to fill the current element’s position (eq. ix) in the dictionary:

// Insert or update `ix ~> rxv/rx` mapping pair
map[ix] = rxv ?? rx

This statement as well creates or updates an existing mapping in the dictionary between rx and ixv/ix values. As the statement above this one uses the same conflict resolution strategy, eq. use an existing key’s value to fill the randomly generated number’s position in the dictionary:

// Insert or update `rx ~> ixv/ix` mapping pair
map[rx] = ixv ?? ix

As soon as the main loop is over, it is now time to copy all the numbers generated to an array that is returned to the calling code and this statement initializes seq variable used as an output array:

// Prepare an array to copy the resulting subset
var seq = [Int](repeating: 0, count: k)

This statement enumerates every element in 0...k - 1 range and uses the index as a key to copy the value from the dictionary into the output array:

// Query & copy each element of the subset in the map
for i in 0...k - 1 { seq[i] = map[i]! }

Finally, once all the numbers are copied the function is ready to return a k-sized subset of randomly selected numbers in 0...n range:

// Here is the answer
return seq

That’s all folks! 🍹

Here is a Swift playground for this article at Github: Compute a random subset.

Compute a random permutation

Problem statement

Write a function to generate a uniformly random permutation of {0 ... n - 1} sequence and return its subset of the specified size k. By the way, you are given a random number generator that returns integers in the set {0 ... n - 1} with equal probability, try to use as few calls to the generator as possible.

EXAMPLE:

  Input: n = 19, k = 5

  Output: [14, 7, 13, 6, 9]

Solution

Here is the actual code of the algorithm that implements a solution to generate a uniformly random permutation in the given range and return its subset of the specified size:

func random(n: Int, k: Int) -> [Int] {
    // Prepare state of the algorithm
    var seq = (0...n - 1).sorted(), perm = [Int](repeating: 0, count: k)
    // Select at most k elements to output
    for lhs in 0..<k {
        // Pick the next exchange using random generator
        let x = Int.random(in: lhs...n - 1)
        // Exchange elements
        swap(&seq, lhs, x)
        // Copy result of the exchange into the output
        perm[lhs] = seq[lhs]
    }
    // Here is the answer
    return perm
}

Following is the traditional section with assertions to validate the code performs as advertised and establish a baseline for any improvement work in the future:

Generate a random permutation test assertions

Bits & pieces

This coding challenge features a very useful technique to unformly pick random elements from a sequence.

Lets get started!

The first statement prepares state of the algorithm by initializing two sequences seq and perm, such that variable seq represents sequence of numbers from 0 to n - 1 and variable perm represents subset of chosen elements by the algorithm:

// Prepare state of the algorithm
var seq = (0...n - 1).sorted(), perm = [Int](repeating: 0, count: k)

Here is the initial state of the algorithm after both sequences are ready for further transformations:

Initial state of the variables

The following snippet is the main loop to perform selection of the elements from the sequence of numbers in seq variable:

// Select at most k elements to output
for lhs in 0..<k {
    ... inner block of code ...
}

Every iteration of the loop processes yet another number in the original sequence using current position in lhs variable to adjust random number generator’s range:

Loop iteration to generate random permutation

This statement, the first in the loop uses a technique of self-adjusting range to exclude elements selected previously and retain the equal probability when selecting the next element from sequence using random number generator:

// Pick the next exchange using random generator
let x = Int.random(in: lhs...n - 1)

As you may have noticed every iteration randomly picks position of the next element from the original sequence into x variable:

Select next random element from the original sequence

Next statement performs an actual exchange between current element in position lhs and randomly chosen element in position x:

// Exchange elements
swap(&seq, lhs, x)

To keep the distribution of elements in the output sequence uniform, the code swaps elements in the current position (lhs variable) and the next one randomly selected (x variable):

Swap elements to keep distribution uniform

The last statement in the loop simply copies the result of the previous exchange into the output sequence perm:

// Copy result of the exchange into the output
perm[lhs] = seq[lhs]

Lastly, the element randomly selected is copied to the output sequence as is without any modifications:

Copy selected element to the output sequence

As soon as the main loop is over, perm variable should have a subset of uniformly random permutation in 0...n - 1 range of the specified size:

// Here is the answer
return perm

Finally, here is how the algorithm’s resulting state looks like, when the function is about to return results to the calling code:

Final state of the algorithm's internals

That’s all folks! 💾

Here is a Swift playground for this article at Github: Compute a random permutation.

Sample online data

Problem statement

Write a function that takes a stream of elements stream, number of elements k to select from the stream and returns a sequence of k elements from the stream that are randonly chosen, such that any combination of the elements is equally likely possible (eq. uniform distribution).

EXAMPLE:

  Input: stream = [1,2,3,4,5,6,7,8,9], k = 4

  Output: [9,2,4,1]

Solution

Here is the actual code of the algorithm that implements a solution to randomly choose k elements from a stream, such that any combination of elements is equally likely possible:

func sample(stream: Stream, k: Int) -> Sequence {
    // Prepare state of the algorithm
    var seed = 0, seq = Sequence(repeating: 0, count: k)
    // Query stream for the next element (if any)
    while let next = stream.next() {
        // Select position of the element in `0...seed`
        let x = k > seed ? seed : Int.random(in: 0...seed)
        // Generator yields a value between `0` and `k - 1` is
        // a signal to replace an existing element in the sequence 
        if k > x { seq[x] = next }
        // Extend random generator's seed range by 1
        // to account for the current element
        seed += 1
    }
    // Here is the answer
    return seq
}

Following is an example that the code performs as advertised and establish a baseline for any improvement work in the future:

Sample online data test assertions

Bits & pieces

This coding challenge uses a clever technique to randomly pick items from a stream without knowing in advance how many elements there are in the stream.

Lets get started!

The first statement prepares state of the algorithm and initializes seq variable to store elements of the output sequence and seed variable used to seed the random number generator and to track count of elements read from the stream so far (since we couldn’t get estimate of the elements in the stream):

// Prepare state of the algorithm
var seed = 0, seq = Sequence(repeating: 0, count: k)

Following is the main loop that reads elements from the stream and processes each element according to the original constraint, eq. where all combinations of elements is equally likely possible:

// Query stream for the next element (if any)
while let next = stream.next() {
    ... inner block of code ...
}

This statement either seeds the random number generator and requests a number from the given range or uses its sequential index which indicates a desired (sometimes random) position of the current element in the output sequence:

// Select position of the element in `0...seed`
let x = k > seed ? seed : Int.random(in: 0...seed)

Next statement checks whether the position yielded by the random generator falls between 0 and (k - 1) range and if that’s the case the algorithm will use the current element from the stream to replace an existing item in the output sequence:

// Generator yields a value between `0` and `k - 1` is
// a signal to replace an existing element in the sequence 
if k > x { seq[x] = next }

This statement, the last within while loop adds 1 to the number elements processed so far, since the algorithm uses this data to seed the random number generator and calculate uniform distribution:

// Extend random generator's seed range by 1
// to account for the current element
seed += 1

As soon as the main loop is over, the function is ready to return the sequence of k elements randomly picked from the stream:

// Here is the answer
return seq

That’s all folks! 🍺

Here is a Swift playground for this article at Github: Sample online data.

Sample offline data

Problem statement

Write a function that takes an array of unique numbers s as input and a size k and returns a subset of the given size of the array elements. All subsets should be equally likely.

Return the result in input array itself and substitute remaining elements with 0.

EXAMPLE:

  Input:  s = [1,2,3,4,5], k = 3

  Output: s = [3,2,4,0,0]

Solution

Here is the actual code of the algorithm that implements a solution to take a random sample from the given array of unique elements:

func sample(s: inout [Int], k: Int) {
    // Prepare state of the algorithm
    let end = s.count - 1
    // Iterate thru the range
    for lhs in 0...end {
        // Clear out remaining elements with 0
        if lhs >= k { s[lhs] = 0; continue }
        // Pick the next exchange using random generator
        let rhs = Int.random(in: lhs...end)
        // Exchange elements
        swap(&s, lhs, rhs)
    }
}

Following is the traditional section with assertions to validate the code performs as advertised and establish a baseline for any improvement work in the future:

Sample offline data test assertions

Bits & pieces

This coding challenge utilizes random number generator to simulate sampling process within the specified boundaries that are adjusted with every iteration.

Lets get started!

The first statement is a simple shorthand for the array’s length, since this value is being used in a few places giving it a human-friendly name improves readability:

// Prepare state of the algorithm
let end = s.count - 1

This statement is the main loop used to iterate thru the array and process all elements according to the order of their apperance in the array:

// Iterate thru the range
for lhs in 0...end {
    ... inner block of code ...
}

This step clears out value of the current element with 0 when the algorithm is done processing the requested number of elements to sample:

Clear out remaining elements from the sequence

Here is the code that clears out the current element and immediately skips to the next iteration:

// Clear out remaining elements with 0
if lhs >= k { s[lhs] = 0; continue }

Following step uses random number generator to pick an element from the remaining sequence into rhs variable to use for the exchange operation that comes up next:

Pick a random element within lhs...end range

Here is the code for that step:

// Pick the next exchange using random generator
let rhs = Int.random(in: lhs...end)

And the last step of the algorithm actually exchanges elements in the current and the randomly chosen position as yet another sample for the resulting sequence:

Exchange chosen items to simulate sampling process

Here is the code for the last step of the algorithm:

// Exchange elements
swap(&s, lhs, rhs)

As soon as the main loop is over s argument that is marked as inout has all the values to represent the requested sample of the original sequence.

That’s all folks! 🌼

Here is a Swift playground for this article at Github: Sample offline data.

Compute the next permutation

Problem statement

Write a function that takes an array of numbers (eq. a permutation) and returns the next permutation using dictionary ordering rules.

However, if the permutation is the last permutation under the dictionary ordering, your function should return the empty array.

EXAMPLE:

  Input: p = [6,2,1,5,4,3,0]

  Output: [6,2,3,0,1,4,5]

Solution

Here is the actual code of the algorithm that implements a solution to compute the next permutation of the given sequence:

func next(p: [Int]) -> [Int] {
    // Evaluate boundaries of the suffix
    let x = edgeOfSuffix(s: p), y = p.count - 1 
    // Seems no more permutations left, hence the empty result
    if x == -1 { return [] }
    // Prepare state of the algorithm
    var perm = p
    // Find the next smallest entry after `x` in the suffix
    // and swap it with the entry in `x` position
    swapNextInSuffix(s: &perm, x: x)
    // Build the smallest suffix possible by sorting it from 
    // the smallest to the largest using reverse - since suffix
    // elements are ordered from the largest to the smallest
    reverse(seq: &perm, x + 1, y)
    // Here is the answer
    return perm
}

Following is the traditional section with assertions to validate the code performs as advertised and establish a baseline for any improvement work in the future:

Compute the next permutation test assertions

Bits & pieces

This coding challenge took mutiple iterations to get the code working correctly with edge cases and then it took a few more attempts to factor it out so that the code is readable and far easier to understand. Refactoring effort helped a lot to highlight important parts of the algorithm and turn it into something that is far more easier to read, comprehend and reason about.

Lets get started!

The first statement evaluates boundaries of the suffix (eq. sequence of numbers in ascending order) to perform the manipulations on, since that is the way to compute the next permutation under dictionary order:

// Evaluate boundaries of the suffix
let x = edgeOfSuffix(s: p), y = p.count - 1

As you have noticed, the statement above uses edgeOfSuffix helper function and below you can find its code. However, it’s worth to explain how does it work to help aid better understanding of the algorithm’s inner workings.

The current step of the algorithm evaluates boundaries of the suffix and finds out its edge, eq. first occurence where an entry on the left is less than an entry on the right:

Find an edge of the suffix

Finding the edge of the suffix is the purpose of this helper function and the result will be very useful later during find and exchange step of computing the next permutation:

public func edgeOfSuffix(s: [Int]) -> Int {
    // Put reading head `x` into the position next to the last
    // in the sequence, therefore subtract 2 from the count.
    var x = s.count - 2
    // Enumerate pairs of elements, eq. [x] and [x + 1] 
    while x >= 0 {
        // Look for a pair of elements where x < y, eq. 1 < 5
        if s[x] < s[x + 1] { break }
        // Move on to the next pair of elements
        x -= 1
    }
    // Here is the edge of the suffix or -1 if the edge is not found
    return max(x, -1)
}

Following statement will return an empty array as a result, since edge search result that equals -1 indicates the given sequence is either empty or represents the last permutation of the sequence:

// Seems no more permutations left, hence the empty result
if x == -1 { return [] }

This statement makes a local writeable copy of p argument into perm variable. so that the following steps of the algorithm will be able to make changes to the sequence during find and exchange and suffix transformation steps:

// Prepare state of the algorithm
var perm = p

Next statement is find and exchange step computes the first part of the next permutation using an in-place exchange of the suffix edge discovered earlier and an entry within the suffix that is the next smallest entry after the suffix edge’s entry:

// Find the next smallest entry after `x` in the suffix
// and swap it with the entry in `x` position
swapNextInSuffix(s: &perm, x: x)

Now, lets take a look at the internals of swapNextInSuffix helper function to better understand why it is useful in this case and how does it work.

Here is a picture outlining mechanics of the current step, eq. find the next smallest entry (starting from the end of sequence) after entry in position x and exchange the two:

Find and exchange items to get first part of the next permutation

Finding the next smallest entry after x and exchanging the two entries using in-place swap technique is the purpose of this helper function:

public func swapNextInSuffix(s: inout [Int], x: Int) {
    // Prepare state of the algorithm
    let lhs = s[x], y = s.count - 1
    // Enumerate sequence's suffix elements in reverse order,
    // eq. from n...0
    for i in (x...y).reversed() {
        // Skip entries smaller than or equal to the entry of `x`
        if lhs >= s[i] { continue }
        // Swap entry of `x` with the entry of `i`, eq. it would be
        // the first entry of `i` greater than entry of `x`
        swap(&s, x, i)
        // Break out from the loop, we're done
        break
    }
}

As soon as the exchange step is completed, the remaining part of the algorithm is suffix transform step that re-arranges elements of the suffix in reverse order (eq. from smallest to largest/left-to-right):

Suffix transformation step

Since elements of the suffix (if any) at this point are in descending order (eq. from largest to smallest/left-to-right) the code simply needs to apply reverse function to the part of the original sequence denoted by (x + 1) ... y range:

// Build the smallest suffix possible by sorting it from 
// the smallest to the largest using reverse - since suffix
// elements are ordered from the largest to the smallest
reverse(seq: &perm, x + 1, y)

After elements of the suffix are in reverse order, our function is ready to return an answer representing the next permutation of the original sequence:

// Here is the answer
return perm

That’s all folks! 🎰

Here is a Swift playground for this article at Github: Compute the next permutation.