# Sample online data

16 Sep 2018### Problem statement

Write a function that takes a stream of elements `stream`

, number of elements `k`

to select from the stream and returns a sequence of `k`

elements from the stream that are randonly chosen, such that any combination of the elements is equally likely possible (eq. uniform distribution).

```
EXAMPLE:
Input: stream = [1,2,3,4,5,6,7,8,9], k = 4
Output: [9,2,4,1]
```

### Solution

Here is the actual code of the algorithm that implements a solution to randomly choose `k`

elements from a stream, such that any combination of elements is equally likely possible:

```
func sample(stream: Stream, k: Int) -> Sequence {
// Prepare state of the algorithm
var seed = 0, seq = Sequence(repeating: 0, count: k)
// Query stream for the next element (if any)
while let next = stream.next() {
// Select position of the element in `0...seed`
let x = k > seed ? seed : Int.random(in: 0...seed)
// Generator yields a value between `0` and `k - 1` is
// a signal to replace an existing element in the sequence
if k > x { seq[x] = next }
// Extend random generator's seed range by 1
// to account for the current element
seed += 1
}
// Here is the answer
return seq
}
```

Following is an example that the code performs as advertised and establish a baseline for any improvement work in the future:

### Bits & pieces

This coding challenge uses a clever technique to randomly pick items from a stream without knowing in advance how many elements there are in the stream.

Lets get started!

The first statement prepares state of the algorithm and initializes `seq`

variable to store elements of the output sequence and `seed`

variable used to seed the random number generator and to track count of elements read from the stream so far (*since we couldn’t get estimate of the elements in the stream*):

```
// Prepare state of the algorithm
var seed = 0, seq = Sequence(repeating: 0, count: k)
```

Following is the **main loop** that reads elements from the stream and processes each element according to the original constraint, eq. where all combinations of elements is equally likely possible:

```
// Query stream for the next element (if any)
while let next = stream.next() {
... inner block of code ...
}
```

This statement either seeds the random number generator and requests a number from the given range or uses its sequential index which indicates a desired (*sometimes random*) position of the current element in the output sequence:

```
// Select position of the element in `0...seed`
let x = k > seed ? seed : Int.random(in: 0...seed)
```

Next statement checks whether the position yielded by the random generator falls between `0`

and `(k - 1)`

range and if that’s the case the algorithm will use the current element from the stream to replace an existing item in the output sequence:

```
// Generator yields a value between `0` and `k - 1` is
// a signal to replace an existing element in the sequence
if k > x { seq[x] = next }
```

This statement, the last within `while`

loop adds `1`

to the number elements processed so far, since the algorithm uses this data to seed the random number generator and calculate uniform distribution:

```
// Extend random generator's seed range by 1
// to account for the current element
seed += 1
```

As soon as the **main loop** is over, the function is ready to return the sequence of `k`

elements randomly picked from the stream:

```
// Here is the answer
return seq
```

That’s all folks! 🍺

Here is a Swift playground for this article at Github: Sample online data.