In theoretical computer science, a small-bias sample space (also known as $\epsilon$-biased sample space, $\epsilon$-biased generator, or small-bias probability space) is a probability distribution that fools parity functions. In other words, no parity function can distinguish between a small-bias sample space and the uniform distribution with high probability, and hence, small-bias sample spaces naturally give rise to pseudorandom generators for parity functions.
The main useful property of small-bias sample spaces is that they need far fewer truly random bits than the uniform distribution to fool parities. Efficient constructions of small-bias sample spaces have found many applications in computer science, some of which are derandomization, error-correcting codes, and probabilistically checkable proofs. The connection with error-correcting codes is in fact very strong, since $\epsilon$-biased sample spaces are equivalent to $\epsilon$-balanced error-correcting codes.
Let $X$ be a probability distribution over $\{0,1\}^n$. The bias of $X$ with respect to a set of indices $I \subseteq \{1,\dots,n\}$ is defined as

$$\operatorname{bias}_I(X) = \left|\Pr_{x\sim X}\Big(\sum_{i\in I} x_i = 0\Big) - \Pr_{x\sim X}\Big(\sum_{i\in I} x_i = 1\Big)\right| = \left|2\cdot\Pr_{x\sim X}\Big(\sum_{i\in I} x_i = 0\Big) - 1\right|,$$

where the sum is taken over $\mathbb{F}_2$, the finite field with two elements. In other words, the sum $\sum_{i\in I} x_i$ equals $0$ if the number of ones in the sample $x \in \{0,1\}^n$ at the positions defined by $I$ is even, and otherwise, the sum equals $1$. For $I = \emptyset$, the empty sum is defined to be zero, and hence $\operatorname{bias}_{\emptyset}(X) = 1$.
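For a finite sample space given as a multiset of strings, this bias can be computed directly from the definition. The following sketch (with a hypothetical toy multiset, and 0-based indices rather than the 1-based indices used above) makes the quantity concrete:

```python
from fractions import Fraction

def bias(samples, I):
    """bias_I of the uniform distribution over `samples`, a multiset of
    0/1-tuples: |2 * Pr[sum_{i in I} x_i = 0 over F_2] - 1|."""
    even = sum(1 for x in samples if sum(x[i] for i in I) % 2 == 0)
    p_zero = Fraction(even, len(samples))
    return abs(2 * p_zero - 1)

# A toy multiset over {0,1}^3: the four strings of even Hamming weight.
samples = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(bias(samples, {0, 2}))     # 0: this parity is perfectly balanced
print(bias(samples, {0, 1, 2}))  # 1: the full parity is constant on the set
```

Note that a single set can have bias 0 with respect to some index sets and bias 1 with respect to others, as here.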
A probability distribution $X$ over $\{0,1\}^n$ is called an $\epsilon$-biased sample space if $\operatorname{bias}_I(X) \le \epsilon$ holds for all non-empty subsets $I \subseteq \{1,2,\dots,n\}$.
An $\epsilon$-biased sample space $X$ that is generated by picking a uniform element from a multiset $X \subseteq \{0,1\}^n$ is called an $\epsilon$-biased set. The size $s$ of an $\epsilon$-biased set $X$ is the size of the multiset that generates the sample space.
An $\epsilon$-biased generator $G : \{0,1\}^\ell \to \{0,1\}^n$ is a function that maps strings of length $\ell$ to strings of length $n$ such that the multiset $X_G = \{G(y) \mid y \in \{0,1\}^\ell\}$ is an $\epsilon$-biased set. The seed length of the generator is the number $\ell$ and is related to the size of the $\epsilon$-biased set $X_G$ via the equation $s = 2^\ell$.
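Whether a generator's output multiset is $\epsilon$-biased can be verified exhaustively for very small $n$ by enumerating all non-empty index sets. A minimal sketch (the generator table below is a hypothetical toy example with $\ell = 2$ and $n = 3$, so $s = 2^\ell = 4$):

```python
from itertools import product

def max_bias(samples, n):
    """Maximum bias over all non-empty I ⊆ {0, ..., n-1}; exponential in n,
    so only usable as a sanity check for tiny parameters."""
    worst = 0.0
    for mask in product((0, 1), repeat=n):
        if not any(mask):
            continue  # skip I = ∅, whose bias is always 1 by convention
        even = sum(1 for x in samples
                   if sum(a & b for a, b in zip(x, mask)) % 2 == 0)
        worst = max(worst, abs(2 * even / len(samples) - 1))
    return worst

# Hypothetical generator G: {0,1}^2 → {0,1}^3 given as a lookup table.
G = {(0, 0): (0, 0, 0), (0, 1): (0, 1, 1),
     (1, 0): (1, 0, 1), (1, 1): (1, 1, 0)}
X_G = list(G.values())
print(max_bias(X_G, 3))  # 1.0: the parity of all three bits is constant on
                         # this toy X_G, so it is not ε-biased for any ε < 1
```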
There is a close connection between $\epsilon$-biased sets and $\epsilon$-balanced linear error-correcting codes. A linear code $C : \{0,1\}^n \to \{0,1\}^s$ of message length $n$ and block length $s$ is $\epsilon$-balanced if the Hamming weight of every nonzero codeword $C(x)$ is between $\frac{1-\epsilon}{2}\,s$ and $\frac{1+\epsilon}{2}\,s$. Since $C$ is a linear code, its generator matrix is an $(n \times s)$-matrix $A$ over $\mathbb{F}_2$ with $C(x) = x \cdot A$.
Then it holds that a multiset $X \subseteq \{0,1\}^n$ is $\epsilon$-biased if and only if the linear code $C_X$, whose columns are exactly the elements of $X$, is $\epsilon$-balanced.
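The codeword $C_X(x)$ has one coordinate $\langle x, y \rangle$ per element $y \in X$, so its Hamming weight is exactly $s \cdot \Pr_{y \sim X}[\langle x, y\rangle = 1]$, which makes the equivalence easy to verify by brute force on toy instances. A minimal sketch (the multiset $X$ below is an arbitrary example):

```python
import itertools

def is_eps_biased(X, n, eps):
    """Check ε-bias of the uniform distribution over multiset X ⊆ {0,1}^n."""
    for x in itertools.product((0, 1), repeat=n):
        if not any(x):
            continue
        # <x, y> over F_2 for each y ∈ X; the number of ones equals the
        # Hamming weight of the codeword C_X(x) = x·A.
        wt = sum(sum(a & b for a, b in zip(x, y)) % 2 for y in X)
        if abs(2 * wt / len(X) - 1) > eps:  # bias of index set {i : x_i = 1}
            return False
    return True

def is_eps_balanced(X, n, eps):
    """Check ε-balance of the code C_X whose generator matrix has the
    elements of X as columns (block length s = |X|)."""
    s = len(X)
    for x in itertools.product((0, 1), repeat=n):
        if not any(x):
            continue
        wt = sum(sum(a & b for a, b in zip(x, y)) % 2 for y in X)
        if not (1 - eps) / 2 * s <= wt <= (1 + eps) / 2 * s:
            return False
    return True

X = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]  # toy multiset, n = 3
for eps in (0.25, 0.5, 1.0):
    assert is_eps_biased(X, 3, eps) == is_eps_balanced(X, 3, eps)
```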
Usually the goal is to find $\epsilon$-biased sets that have a small size $s$ relative to the parameters $n$ and $\epsilon$. This is because a smaller size $s$ means that the amount of randomness needed to pick a random element from the set is smaller, and so the set can be used to fool parities using few random bits.
The probabilistic method gives a non-explicit construction that achieves size $s = O(n/\epsilon^2)$. The construction is non-explicit in the sense that finding the $\epsilon$-biased set requires a lot of true randomness, which does not help towards the goal of reducing the overall randomness. However, this non-explicit construction is useful because it shows that such small $\epsilon$-biased sets exist. On the other hand, the best known lower bound for the size of $\epsilon$-biased sets is $s = \Omega(n/(\epsilon^2 \log(1/\epsilon)))$; that is, in order for a set to be $\epsilon$-biased, it must be at least that big.
There are many explicit (i.e., deterministic) constructions of $\epsilon$-biased sets with various parameter settings:
Naor & Naor (1990) achieve $s = \frac{n}{\operatorname{poly}(\epsilon)}$. The construction makes use of Justesen codes (which is a concatenation of Reed–Solomon codes with the Wozencraft ensemble) as well as expander walk sampling.
Alon et al. (1992) achieve $s = O\left(\left(\frac{n}{\epsilon \log(n/\epsilon)}\right)^2\right)$. One of their constructions is the concatenation of Reed–Solomon codes with the Hadamard code; this concatenation turns out to be an $\epsilon$-balanced code, which gives rise to an $\epsilon$-biased sample space via the connection mentioned above.
Concatenating algebraic geometry codes with the Hadamard code gives an $\epsilon$-balanced code with $s = O\left(\frac{n}{\epsilon^3 \log(1/\epsilon)}\right)$.
Ben-Aroya & Ta-Shma (2009) achieve $s = O\left(\left(\frac{n}{\epsilon^2 \log(1/\epsilon)}\right)^{5/4}\right)$.
These bounds are mutually incomparable. In particular, none of these constructions yields the smallest $\epsilon$-biased sets for all settings of $\epsilon$ and $n$.
An important application of small-bias sets lies in the construction of almost k-wise independent sample spaces.
A random variable $Y$ over $\{0,1\}^n$ is a k-wise independent space if, for all index sets $I \subseteq \{1,\dots,n\}$ of size $k$, the marginal distribution $Y|_I$ is exactly equal to the uniform distribution over $\{0,1\}^k$. That is, for all such $I$ and all strings $z \in \{0,1\}^k$, the distribution $Y$ satisfies $\Pr_Y(Y|_I = z) = 2^{-k}$.
Constructions and bounds
k-wise independent spaces are fairly well understood.
A simple construction by Joffe (1974) achieves size $n^k$.
Alon, Babai & Itai (1986) construct a k-wise independent space whose size is $n^{k/2}$.
Chor et al. (1985) prove that no k-wise independent space can be significantly smaller than $n^{k/2}$.
Joffe (1974) constructs a $k$-wise independent space $Y$ over the finite field with some prime number $n > k$ of elements, i.e., $Y$ is a distribution over $\mathbb{F}_n^n$. The initial $k$ marginals of the distribution are drawn independently and uniformly at random: $(Y_0,\dots,Y_{k-1}) \sim \mathbb{F}_n^k$. For each $i$ with $k \le i < n$, the marginal distribution of $Y_i$ is then defined as

$$Y_i = Y_0 + Y_1 \cdot i + Y_2 \cdot i^2 + \dots + Y_{k-1} \cdot i^{k-1},$$

where the calculation is done in $\mathbb{F}_n$. Joffe (1974) proves that the distribution $Y$ constructed in this way is $k$-wise independent as a distribution over $\mathbb{F}_n^n$. The distribution $Y$ is uniform on its support, and hence, the support of $Y$ forms a $k$-wise independent set. It contains all $n^k$ strings in $\mathbb{F}_n^k$ that have been extended to strings of length $n$ using the deterministic rule above.
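Joffe's construction is easy to implement and, for tiny parameters, to verify by brute force. A minimal sketch (the parameter choices $n = 5$, $k = 2$ are an arbitrary toy example):

```python
from itertools import product
from collections import Counter

def joffe_support(n, k):
    """Support of Joffe's k-wise independent space over F_n^n (n prime):
    each prefix (y_0, ..., y_{k-1}) is extended by the deterministic rule
    y_i = y_0 + y_1*i + ... + y_{k-1}*i^(k-1) mod n for k <= i < n."""
    support = []
    for prefix in product(range(n), repeat=k):
        row = list(prefix)
        for i in range(k, n):
            row.append(sum(prefix[j] * pow(i, j, n) for j in range(k)) % n)
        support.append(tuple(row))
    return support

# Tiny example: n = 5 (prime), k = 2; the support has n^k = 25 strings.
S = joffe_support(5, 2)
assert len(S) == 25

# Brute-force check of 2-wise independence: every pair of coordinates
# takes each of the 25 value pairs in F_5^2 exactly once.
for a in range(5):
    for b in range(a + 1, 5):
        counts = Counter((row[a], row[b]) for row in S)
        assert len(counts) == 25 and all(c == 1 for c in counts.values())
```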
A random variable $Y$ over $\{0,1\}^n$ is a $\delta$-almost k-wise independent space if, for all index sets $I \subseteq \{1,\dots,n\}$ of size $k$, the restricted distribution $Y|_I$ and the uniform distribution $U_k$ on $\{0,1\}^k$ are $\delta$-close in 1-norm, i.e., $\big\lVert\, Y|_I - U_k \,\big\rVert_1 \le \delta$.
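For small parameters, the smallest $\delta$ for which a given sample space is $\delta$-almost k-wise independent can be computed exhaustively. A sketch (the sample space below is a toy example: the even-weight strings of $\{0,1\}^3$, which are exactly 2-wise independent but maximally far from 3-wise independent):

```python
from itertools import combinations, product
from collections import Counter

def almost_kwise_delta(samples, n, k):
    """Smallest δ such that the uniform distribution over `samples` is
    δ-almost k-wise independent: max over |I| = k of ||Y|_I - U_k||_1."""
    worst = 0.0
    for I in combinations(range(n), k):
        counts = Counter(tuple(x[i] for i in I) for x in samples)
        dist = sum(abs(counts.get(z, 0) / len(samples) - 2 ** -k)
                   for z in product((0, 1), repeat=k))
        worst = max(worst, dist)
    return worst

Y = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(almost_kwise_delta(Y, 3, 2))  # 0.0: every pair of bits is uniform
print(almost_kwise_delta(Y, 3, 3))  # 1.0: all three bits together are not
```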
Naor & Naor (1990) give a general framework for combining small k-wise independent spaces with small $\epsilon$-biased spaces to obtain $\delta$-almost k-wise independent spaces of even smaller size. In particular, let $G_1 : \{0,1\}^h \to \{0,1\}^n$ be a linear mapping that generates a k-wise independent space and let $G_2 : \{0,1\}^\ell \to \{0,1\}^h$ be a generator of an $\epsilon$-biased set over $\{0,1\}^h$. That is, when given a uniformly random input, the output of $G_1$ is a k-wise independent space, and the output of $G_2$ is $\epsilon$-biased. Then $G : \{0,1\}^\ell \to \{0,1\}^n$ with $G(x) = G_1(G_2(x))$ is a generator of a $\delta$-almost $k$-wise independent space, where $\delta = 2^{k/2} \epsilon$.
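The composition itself is straightforward once $G_1$ is given as a matrix over $\mathbb{F}_2$ and $G_2$ as any seed-to-string map. A minimal sketch (both generators below are hypothetical toy stand-ins, not the actual constructions of Alon, Babai & Itai or Naor & Naor):

```python
def compose(G1_matrix, G2):
    """G(x) = G1(G2(x)): feed the ε-biased expansion G2(x) ∈ {0,1}^h through
    the linear generator G1, given as an h×n matrix over F_2."""
    def G(x):
        y = G2[x]  # y ∈ {0,1}^h
        return tuple(sum(y[j] & G1_matrix[j][i] for j in range(len(y))) % 2
                     for i in range(len(G1_matrix[0])))
    return G

# Toy instances with ℓ = 1, h = 2, n = 3.
G1_matrix = [(1, 0, 1),
             (0, 1, 1)]                # linear map {0,1}^2 → {0,1}^3
G2 = {(0,): (0, 1), (1,): (1, 1)}      # lookup table for {0,1}^1 → {0,1}^2
G = compose(G1_matrix, G2)
print(G((0,)), G((1,)))  # (0, 1, 1) (1, 1, 0)
```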
As mentioned above, Alon, Babai & Itai (1986) construct a generator $G_1$ with $h = \tfrac{k}{2} \log n$, and Naor & Naor (1990) construct a generator $G_2$ with $\ell = \log s = \log h + O(\log(\epsilon^{-1}))$. Hence, the concatenation $G$ of $G_1$ and $G_2$ has seed length $\ell = \log k + \log\log n + O(\log(\epsilon^{-1}))$. In order for $G$ to yield a $\delta$-almost k-wise independent space, we need to set $\epsilon = \delta 2^{-k/2}$, which leads to a seed length of $\ell = \log\log n + O(k + \log(\delta^{-1}))$ and a sample space of total size $2^\ell \le \log n \cdot \operatorname{poly}(2^k \cdot \delta^{-1})$.