Harman Patil (Editor)

Small bias sample space

Updated on
Edit
Like
Comment
Share on FacebookTweet on TwitterShare on LinkedInShare on Reddit

In theoretical computer science, a small-bias sample space (also known as ϵ -biased sample space, ϵ -biased generator, or small-bias probability space) is a probability distribution that fools parity functions. In other words, no parity function can distinguish between a small-bias sample space and the uniform distribution with high probability, and hence, small-bias sample spaces naturally give rise to pseudorandom generators for parity functions.

Contents

The main useful property of small-bias sample spaces is that they need far fewer truly random bits than the uniform distribution to fool parities. Efficient constructions of small-bias sample spaces have found many applications in computer science, some of which are derandomization, error-correcting codes, and probabilistically checkable proofs. The connection with error-correcting codes is in fact very strong since ϵ -biased sample spaces are equivalent to ϵ -balanced error-correcting codes.

Bias

Let X be a probability distribution over { 0 , 1 } n . The bias of X with respect to a set of indices I { 1 , , n } is defined as

bias I ( X ) = | Pr x X ( i I x i = 0 ) Pr x X ( i I x i = 1 ) | = | 2 Pr x X ( i I x i = 0 ) 1 | ,

where the sum is taken over F 2 , the finite field with two elements. In other words, the sum i I x i equals 0 if the number of ones in the sample x { 0 , 1 } n at the positions defined by I is even, and otherwise, the sum equals 1 . For I = , the empty sum is defined to be zero, and hence bias ( X ) = 1 .

ϵ-biased sample space

A probability distribution X over { 0 , 1 } n is called an ϵ -biased sample space if bias I ( X ) ϵ holds for all non-empty subsets I { 1 , 2 , , n } .

ϵ-biased set

An ϵ -biased sample space X that is generated by picking a uniform element from a multiset X { 0 , 1 } n is called ϵ -biased set. The size s of an ϵ -biased set X is the size of the multiset that generates the sample space.

ϵ-biased generator

An ϵ -biased generator G : { 0 , 1 } { 0 , 1 } n is a function that maps strings of length to strings of length n such that the multiset X G = { G ( y ) | y { 0 , 1 } } is an ϵ -biased set. The seed length of the generator is the number and is related to the size of the ϵ -biased set X G via the equation s = 2 .

Connection with epsilon-balanced error-correcting codes

There is a close connection between ϵ -biased sets and ϵ -balanced linear error-correcting codes. A linear code C : { 0 , 1 } n { 0 , 1 } s of message length n and block length s is ϵ -balanced if the Hamming weight of every nonzero codeword C ( x ) is between ( 1 2 ϵ ) s and ( 1 2 + ϵ ) s . Since C is a linear code, its generator matrix is an ( n × s ) -matrix A over F 2 with C ( x ) = x A .

Then it holds that a multiset X { 0 , 1 } n is ϵ -biased if and only if the linear code C X , whose columns are exactly elements of X , is ϵ -balanced.

Constructions of small epsilon-biased sets

Usually the goal is to find ϵ -biased sets that have a small size s relative to the parameters n and ϵ . This is because a smaller size s means that the amount of randomness needed to pick a random element from the set is smaller, and so the set can be used to fool parities using few random bits.

Theoretical bounds

The probabilistic method gives a non-explicit construction that achieves size s = O ( n / ϵ 2 ) . The construction is non-explicit in the sense that finding the ϵ -biased set requires a lot of true randomness, which does not help towards the goal of reducing the overall randomness. However, this non-explicit construction is useful because it shows that these efficient codes exist. On the other hand, the best known lower bound for the size of ϵ -biased sets is s = Ω ( n / ( ϵ 2 log ( 1 / ϵ ) ) , that is, in order for a set to be ϵ -biased, it must be at least that big.

Explicit constructions

There are many explicit, i.e., deterministic constructions of ϵ -biased sets with various parameter settings:

  • Naor & Naor (1990) achieve s = n poly ( ϵ ) . The construction makes use of Justesen codes (which is a concatenation of Reed–Solomon codes with the Wozencraft ensemble) as well as expander walk sampling.
  • Alon et al. (1992) achieve s = O ( n ϵ log ( n / ϵ ) ) 2 . One of their constructions is the concatenation of Reed–Solomon codes with the Hadamard code; this concatenation turns out to be an ϵ -balanced code, which gives rise to an ϵ -biased sample space via the connection mentioned above.
  • Concatenating Algebraic geometric codes with the Hadamard code gives an ϵ -balanced code with s = O ( n ϵ 3 log ( 1 / ϵ ) ) .
  • Ben-Aroya & Ta-Shma (2009) achieves s = O ( n ϵ 2 log ( 1 / ϵ ) ) 5 / 4 .
  • These bounds are mutually incomparable. In particular, none of these constructions yields the smallest ϵ -biased sets for all settings of ϵ and n .

    Application: almost k-wise independence

    An important application of small-bias sets lies in the construction of almost k-wise independent sample spaces.

    k-wise independent spaces

    A random variable Y over { 0 , 1 } n is a k-wise independent space if, for all index sets I { 1 , , n } of size k , the marginal distribution Y | I is exactly equal to the uniform distribution over { 0 , 1 } k . That is, for all such I and all strings z { 0 , 1 } k , the distribution Y satisfies Pr Y ( Y | I = z ) = 2 k .

    Constructions and bounds

    k-wise independent spaces are fairly well-understood.

  • A simple construction by Joffe (1974) achieves size n k .
  • Alon, Babai & Itai (1986) construct a k-wise independent space whose size is n k / 2 .
  • Chor et al. (1985) prove that no k-wise independent space can be significantly smaller than n k / 2 .
  • Joffe's construction

    Joffe (1974) constructs a k -wise independent space Y over the finite field with some prime number n > k of elements, i.e., Y is a distribution over F n n . The initial k marginals of the distribution are drawn independently and uniformly at random:

    ( Y 0 , , Y k 1 ) F n k .

    For each i with k i < n , the marginal distribution of Y i is then defined as

    Y i = Y 0 + Y 1 i + Y 2 i 2 + + Y k 1 i k 1 ,

    where the calculation is done in F n . Joffe (1974) proves that the distribution Y constructed in this way is k -wise independent as a distribution over F n n . The distribution Y is uniform on its support, and hence, the support of Y forms a k -wise independent set. It contains all n k strings in F n k that have been extended to strings of length n using the deterministic rule above.

    Almost k-wise independent spaces

    A random variable Y over { 0 , 1 } n is a δ -almost k-wise independent space if, for all index sets I { 1 , , n } of size k , the restricted distribution Y | I and the uniform distribution U k on { 0 , 1 } k are δ -close in 1-norm, i.e., Y | I U k 1 δ .

    Constructions

    Naor & Naor (1990) give a general framework for combining small k-wise independent spaces with small ϵ -biased spaces to obtain δ -almost k-wise independent spaces of even smaller size. In particular, let G 1 : { 0 , 1 } h { 0 , 1 } n be a linear mapping that generates a k-wise independent space and let G 2 : { 0 , 1 } { 0 , 1 } h be a generator of an ϵ -biased set over { 0 , 1 } h . That is, when given a uniformly random input, the output of G 1 is a k-wise independent space, and the output of G 2 is ϵ -biased. Then G : { 0 , 1 } { 0 , 1 } n with G ( x ) = G 1 ( G 2 ( x ) ) is a generator of an δ -almost k -wise independent space, where δ = 2 k / 2 ϵ .

    As mentioned above, Alon, Babai & Itai (1986) construct a generator G 1 with h = k 2 log n , and Naor & Naor (1990) construct a generator G 2 with = log s = log h + O ( log ( ϵ 1 ) ) . Hence, the concatenation G of G 1 and G 2 has seed length = log k + log log n + O ( log ( ϵ 1 ) ) . In order for G to yield a δ -almost k-wise independent space, we need to set ϵ = δ 2 k / 2 , which leads to a seed length of = log log n + O ( k + log ( δ 1 ) ) and a sample space of total size 2 log n poly ( 2 k δ 1 ) .

    References

    Small-bias sample space Wikipedia