In mathematics, and in particular in combinatorics, the combinatorial number system of degree k (for some positive integer k), also referred to as combinadics, is a correspondence between natural numbers (taken to include 0) N and k-combinations, represented as strictly decreasing sequences ck > ... > c2 > c1 ≥ 0. Since the latter are strings of numbers, one can view this as a kind of numeral system for representing N, although the main utility is representing a k-combination by N rather than the other way around. Distinct numbers correspond to distinct k-combinations, and produce them in lexicographic order; moreover the numbers less than
Contents
- Ordering combinations
- Place of a combination in the ordering
- Finding the k combination for a given number
- Example
- National Lottery example in Excel
- Applications
- References
The number N corresponding to (ck,...,c2,c1) is given by
The fact that a unique sequence so corresponds to any number N was observed by D. H. Lehmer. Indeed a greedy algorithm finds the k-combination corresponding to N: take ck maximal with
The originally used term "combinatorial representation of integers" is shortened to "combinatorial number system" by Knuth, who also gives a much older reference; the term "combinadic" is introduced by James McCaffrey (without reference to previous terminology or work).
Unlike the factorial number system, the combinatorial number system of degree k is not a mixed radix system: the part
The main application of the combinatorial number system is that it allows rapid computation of the k-combination that is at a given position in the lexicographic ordering, without having to explicitly list the k-combinations preceding it; this allows for instance random generation of k-combinations of a given set. Enumeration of k-combinations has many applications, among which software testing, sampling, quality control, and the analysis of lottery games.
Ordering combinations
A k-combination of a set S is a subset of S with k (distinct) elements. The main purpose of the combinatorial number system is to provide a representation, each by a single number, of all
In order to ensure that the numbers representing the k-combinations of {0, 1, ..., n − 1} are less than those representing k-combinations not contained in {0, 1, ..., n − 1}, the k-combinations must be ordered in such a way that their largest elements are compared first. The most natural ordering that has this property is lexicographic ordering of the decreasing sequence of their elements. So comparing the 5-combinations C = {0,3,4,6,9} and C' = {0,1,3,7,9}, one has that C comes before C', since they have the same largest part 9, but the next largest part 6 of C is less than the next largest part 7 of C'; the sequences compared lexicographically are (9,6,4,3,0) and (9,7,3,1,0). Another way to describe this ordering is view combinations as describing the k raised bits in the binary representation of a number, so that C = {c1,...,ck} describes the number
(this associates distinct numbers to all finite sets of natural numbers); then comparison of k-combinations can be done by comparing the associated binary numbers. In the example C and C' correspond to numbers 10010110012 = 60110 and 10100010112 = 65110, which again shows that C comes before C'. This number is not however the one one wants to represent the k-combination with, since many binary numbers have a number of raised bits different from k; one wants to find the relative position of C in the ordered list of (only) k-combinations.
Place of a combination in the ordering
The number associated in the combinatorial number system of degree k to a k-combination C is the number of k-combinations strictly less than C in the given ordering. This number can be computed from C = { ck, ..., c2, c1 } with ck > ... > c2 > c1 as follows. From the definition of the ordering it follows that for each k-combination S strictly less than C, there is a unique index i such that ci is absent from S, while ck, ..., ci+1 are present in S, and no other value larger than ci is. One can therefore group those k-combinations S according to the possible values 1, 2, ..., k of i, and count each group separately. For a given value of i one must include ck, ..., ci+1 in S, and the remaining i elements of S must be chosen from the ci non-negative integers strictly less than ci; moreover any such choice will result in a k-combinations S strictly less than C. The number of possible choices is
and this is the index (starting from 0) of C in the ordered list of k-combinations. Obviously there is for every N ∈ N exactly one k-combination at index N in the list (supposing k ≥ 1, since the list is then infinite), so the above argument proves that every N can be written in exactly one way as a sum of k binomial coefficients of the given form.
Finding the k-combination for a given number
The given formula allows finding the place in the lexicographic ordering of a given k-combination immediately. The reverse process of finding the k-combination at a given place N requires somewhat more work, but is straightforward nonetheless. By the definition of the lexicographic ordering, two k-combinations that differ in their largest element ck will be ordered according to the comparison of those largest elements, from which it follows that all combinations with a fixed value of their largest element are contiguous in the list. Moreover the smallest combination with ck as largest element is
Example
Suppose one wants to determine the 5-combination at position 72. The successive values of
National Lottery example in Excel
For each of the
Suppose you want to find a position of a British National Lottery result 3,6,15,17,18,35 in a list of possible results. Excel function COMBIN(49,6) would show that number of results is 13983816. Now if you put numbers 3,6,15,17,18,35 each in its cell and formulas COMBIN(49-3,6), COMBIN(49-6,5), COMBIN(49-15,4), COMBIN(49-17,3), COMBIN(49-18,2), 49−35 under each of them. Use cell references instead of actual values, the actual values are provided for readability. You would get a row of numbers of 9366819,962598,46376,4960,465,14. Adding these would show particular position 10381232. Note that you do not need use formula COMBIN(49-35,1) to get 14. You can have it by subtraction 49-35 alone. Also the COMBIN function does not return 0 in case 49-X becomes less than 6. You'd need to use IF with ISNUMBER function to convert #NUM! into 0.
Now the reverse engineering is a bit trickier. You could use 49 IF statements in one cell or use solver to find maximum argument for COMBIN result to be less or equal than position number. Instead let's use a table of 6 by 49 and use MATCH function where resulting row number would be the argument and a ball number. If you make column headings from 6 to 1 (B1:G1) in descending order and row labels of 1 to 49 (A2:A50) in ascending order (vertically ascending in Excel means numbers growing from top to bottom). Then if you fill up the table with formula COMBIN($A2,B$1) starting from left top corner. $ signs will make sure that index values are always taken from heading row and label column. Replace #NUM! with zeros. You should get something like this:
6 5 4 3 2 11 0 0 0 0 0 12 0 0 0 0 1 23 0 0 0 1 3 34 0 0 1 4 6 45 0 1 5 10 10 56 1 6 15 20 15 67 7 21 35 35 21 78 28 56 70 56 28 89 84 126 126 84 36 910 210 252 210 120 45 1011 462 462 330 165 55 1112 924 792 495 220 66 12...49 13983816 1906884 211876 18424 1176 49Now if you create a new row of six cells and fill them with formulas which would find the largest values in each column which are less than or equal to position number in question: The first cell with =INDEX(B2:B50,MATCH(10381232,B2:B50)), the rest of the cells with
INDEX(C2:C50,MATCH(10381232-SUM(of previous cells),C2:C50))...INDEX(G2:G50,MATCH(10381232-SUM(of previous cells),G2:G50))This would present you with a row of values you have already seen 9366819,962598,46376,4960,465,14 Now in a next row first cell write =49-MATCH(10381232,B2:B50) and similarly
=49-MATCH(10381232-9366819,C2:C50)...=49-MATCH(10381232-9366819-962598-46376-4960-465,G2:G50)Again use the references to cells instead of actual values. You should be presented with original ball numbers of 3,6,15,17,18,35.
Now you can generate a fresh Lottery number combination from single =randbetween(1,combin(49,6)) or you could look at the list position numbers of earlier results to see if there is a trend.
Applications
One could use the combinatorial number system to list or traverse all k-combinations of a given finite set, but this is a very inefficient way to do that. Indeed, given some k-combination it is much easier to find the next combination in lexicographic ordering directly than to convert a number to a k-combination by the method indicated above. To find the next combination, find the smallest i ≥ 2 for which ci ≥ ci−1+2 (if there is no such i, take i = k+1); then increase ci−1 by one and set all cj with j < i − 1 to their minimal value j − 1. If the k-combination is represented by its characteristic vector, that is as a binary value with k bits 1, then the next such value can be computed without any loop using bitwise arithmetic: the following C++function will advance x to that value or return false:
This is called Gosper's hack; corresponding assembly code was described as item 175 in HAKMEM.
On the other hand the possibility to directly generate the k-combination at index N has useful applications. Notably, it allows generating a random k-combination of an n-element set using a random integer N with