In coding theory, the Wozencraft ensemble is a set of linear codes in which most of the codes satisfy the Gilbert-Varshamov bound. It is named after John Wozencraft, who proved its existence. The ensemble is described by Massey (1963), who attributes it to Wozencraft. Justesen (1972) used the Wozencraft ensemble as the inner codes in his construction of a strongly explicit asymptotically good code.
Theorem: Let $\varepsilon > 0$. For a large enough $k$, there exists an ensemble of inner codes $C_{in}^{1}, \cdots, C_{in}^{N}$ of rate $\tfrac{1}{2}$, where $N = q^{k} - 1$, such that for at least $(1 - \varepsilon)N$ values of $i$, $C_{in}^{i}$ has relative distance $\geqslant H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right)$.
Here the relative distance is the ratio of the minimum distance to the block length, and $H_q$ is the q-ary entropy function, defined as follows:

$$H_q(x) = x \log_q(q - 1) - x \log_q x - (1 - x) \log_q(1 - x).$$

To show the existence of this set of linear codes, we specify the ensemble explicitly as follows: for $\alpha \in \mathbb{F}_{q^k} - \{0\}$, define the inner code

$$C_{in}^{\alpha} : \mathbb{F}_q^{k} \to \mathbb{F}_q^{2k}, \qquad C_{in}^{\alpha}(x) = (x, \alpha x).$$

Here $x \in \mathbb{F}_q^{k}$ and $\alpha \in \mathbb{F}_{q^k}$. The multiplication $\alpha x$ makes sense because $\mathbb{F}_q^{k}$ is isomorphic to $\mathbb{F}_{q^k}$ as a vector space over $\mathbb{F}_q$: $x$ is read as a field element, and the product $\alpha x$ is read back as a vector in $\mathbb{F}_q^{k}$.
This ensemble is due to Wozencraft and is called the Wozencraft ensemble.
For all $x, y \in \mathbb{F}_q^{k}$, we have the following facts:
- $C_{in}^{\alpha}(x) + C_{in}^{\alpha}(y) = (x, \alpha x) + (y, \alpha y) = (x + y, \alpha(x + y)) = C_{in}^{\alpha}(x + y)$
- For any $a \in \mathbb{F}_q$, $a\,C_{in}^{\alpha}(x) = a(x, \alpha x) = (ax, \alpha(ax)) = C_{in}^{\alpha}(ax)$
So $C_{in}^{\alpha}$ is a linear code for every $\alpha \in \mathbb{F}_{q^k} - \{0\}$.
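The encoding map and its linearity can be tried out on a small case. The following is a minimal sketch, assuming $q = 2$, $k = 3$, and a representation of $\mathbb{F}_{2^3}$ as 3-bit integers modulo the irreducible polynomial $x^3 + x + 1$; the names `gf_mul`, `wozencraft_encode`, `K` and `MODULUS` are illustrative choices, not part of the original construction.

```python
def gf_mul(a: int, b: int, k: int, modulus: int) -> int:
    """Multiply a and b in GF(2^k); elements are k-bit ints in a polynomial basis."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if (a >> k) & 1:         # degree reached k: reduce by the modulus
            a ^= modulus
    return result


def wozencraft_encode(x: int, alpha: int, k: int, modulus: int) -> tuple[int, int]:
    """Return the codeword (x, alpha * x) as a pair of k-bit ints."""
    return x, gf_mul(alpha, x, k, modulus)


# Assumed parameters for the illustration: GF(2^3) with modulus x^3 + x + 1.
K = 3
MODULUS = 0b1011

# alpha = x, message = x^2; alpha * message = x^3 = x + 1 in this field.
print(wozencraft_encode(0b100, 0b010, K, MODULUS))  # -> (4, 3)
```

Each codeword has $2k$ symbols for $k$ message symbols, which is where the rate $\tfrac{1}{2}$ comes from. Over $\mathbb{F}_2$ the only scalars are 0 and 1, so in this sketch linearity reduces to additivity under XOR.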
We now know that the Wozencraft ensemble consists of linear codes of rate $\tfrac{1}{2}$. In the following proof, we show that at least $(1 - \varepsilon)N$ of these linear codes have relative distance $\geqslant H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right)$, i.e. they meet the Gilbert-Varshamov bound.
To prove this, we show that at most $\varepsilon N$ of the linear codes in the ensemble have relative distance $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right)$, i.e. distance $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$.
Notice that in a linear code, the distance is equal to the minimum weight over all non-zero codewords of that code. So if one non-zero codeword has weight $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$, then that code has distance $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$.
Let $P$ be the set of linear codes in the ensemble having distance $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$. Then each of the $|P|$ codes in $P$ has some non-zero codeword of weight $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$.
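The threshold $H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$ has no closed form, but it can be evaluated numerically. The sketch below is one possible way to do so, assuming that $H_q^{-1}$ denotes the inverse of $H_q$ restricted to $[0, 1 - 1/q]$, where $H_q$ increases from 0 to 1; the function names and the bisection tolerance are illustrative.

```python
import math


def q_ary_entropy(x: float, q: int) -> float:
    """H_q(x) = x log_q(q-1) - x log_q(x) - (1-x) log_q(1-x), with 0 log 0 = 0."""
    if x == 0.0:
        return 0.0
    if x == 1.0:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))


def q_ary_entropy_inverse(y: float, q: int, tol: float = 1e-12) -> float:
    """Invert H_q on [0, 1 - 1/q] by bisection (H_q is increasing there)."""
    lo, hi = 0.0, 1.0 - 1.0 / q
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q_ary_entropy(mid, q) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example values chosen only for illustration.
eps, k, q = 0.1, 64, 2
delta = q_ary_entropy_inverse(0.5 - eps, q)   # relative-distance threshold
threshold = delta * 2 * k                     # absolute threshold on a length-2k code
print(delta, threshold)
```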
Lemma. Two linear codes $C_{in}^{\alpha_1}$ and $C_{in}^{\alpha_2}$, with $\alpha_1, \alpha_2 \in \mathbb{F}_{q^k}$ distinct and non-zero, do not share any non-zero codeword.
Proof. Suppose there exist distinct non-zero elements $\alpha_1, \alpha_2 \in \mathbb{F}_{q^k}$ such that the linear codes $C_{in}^{\alpha_1}$ and $C_{in}^{\alpha_2}$ contain the same non-zero codeword $y$. Since $y \in C_{in}^{\alpha_1}$, we have $y = (y_1, \alpha_1 y_1)$ for some $y_1 \in \mathbb{F}_q^{k}$, and similarly $y = (y_2, \alpha_2 y_2)$ for some $y_2 \in \mathbb{F}_q^{k}$. Moreover, since $y$ is non-zero, we have $y_1, y_2 \neq 0$. From $(y_1, \alpha_1 y_1) = (y_2, \alpha_2 y_2)$ we get $y_1 = y_2 \neq 0$ and $\alpha_1 y_1 = \alpha_2 y_2 = \alpha_2 y_1$. Viewing $y_1$ as a non-zero element of the field $\mathbb{F}_{q^k}$ and multiplying by its inverse gives $\alpha_1 = \alpha_2$, which is a contradiction.
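For small parameters the Lemma can also be confirmed by exhaustive search. The following sketch assumes $q = 2$ and the bit-integer representation of $\mathbb{F}_{2^k}$ modulo an irreducible polynomial; `gf_mul`, `nonzero_codewords` and `codes_are_disjoint` are hypothetical helper names introduced for this illustration.

```python
def gf_mul(a: int, b: int, k: int, modulus: int) -> int:
    """Multiply a and b in GF(2^k); elements are k-bit ints in a polynomial basis."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:
            a ^= modulus
    return result


def nonzero_codewords(alpha: int, k: int, modulus: int) -> set:
    """All non-zero codewords (x, alpha * x) of the code indexed by alpha."""
    return {(x, gf_mul(alpha, x, k, modulus)) for x in range(1, 2 ** k)}


def codes_are_disjoint(k: int, modulus: int) -> bool:
    """True if no two codes with distinct non-zero alpha share a non-zero codeword."""
    seen = set()
    for alpha in range(1, 2 ** k):
        words = nonzero_codewords(alpha, k, modulus)
        if seen & words:          # a codeword already produced by an earlier alpha
            return False
        seen |= words
    return True


print(codes_are_disjoint(3, 0b1011))    # GF(2^3), modulus x^3 + x + 1
print(codes_are_disjoint(4, 0b10011))   # GF(2^4), modulus x^4 + x + 1
```

The check relies on the modulus being irreducible, which is what makes every non-zero $y_1$ invertible in the proof. With a reducible modulus such as $x^2 + 1 = (x + 1)^2$ for $k = 2$, the same search finds a shared codeword, since the quotient ring is no longer a field.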
Any linear code having distance $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$ has some non-zero codeword of weight $< H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$. The Lemma then implies that there are at least $|P|$ different $y$ such that $\mathrm{wt}(y) < H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$ (one such codeword $y$ for each linear code in $P$). Here $\mathrm{wt}(y)$ denotes the weight of the codeword $y$, which is the number of non-zero positions of $y$.
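Denote

$$S = \left\{ y : \mathrm{wt}(y) < H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k \right\}.$$

Then:

$$\begin{aligned}
|P| \leqslant |S| &\leqslant \mathrm{Vol}_q\left(H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k,\ 2k\right) && \mathrm{Vol}_q(r, n) \text{ is the volume of the Hamming ball of radius } r \text{ in } [q]^n \\
&\leqslant q^{H_q\left(H_q^{-1}\left(\frac{1}{2} - \varepsilon\right)\right) \cdot 2k} && \mathrm{Vol}_q(pn, n) \leqslant q^{H_q(p) n} \\
&= q^{\left(\frac{1}{2} - \varepsilon\right) \cdot 2k} \\
&= q^{k(1 - 2\varepsilon)} \\
&< \varepsilon\left(q^{k} - 1\right) && \text{for } k \text{ large enough} \\
&= \varepsilon N
\end{aligned}$$

So $|P| < \varepsilon N$, and therefore the set of linear codes in the ensemble having relative distance $\geqslant H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right)$, i.e. distance $\geqslant H_q^{-1}\left(\tfrac{1}{2} - \varepsilon\right) \cdot 2k$, has at least $N - \varepsilon N = (1 - \varepsilon)N$ elements.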
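The first step of the chain, $|P| \leqslant |S|$, can be observed directly on a small instance. The sketch below assumes $q = 2$, $k = 4$, and the modulus $x^4 + x + 1$; it replaces the real-valued threshold by an integer $d$ and uses hypothetical helper names. The theorem itself is asymptotic, so for such a small $k$ the count of low-weight words is a loose bound.

```python
from math import comb


def gf_mul(a: int, b: int, k: int, modulus: int) -> int:
    """Multiply a and b in GF(2^k); elements are k-bit ints in a polynomial basis."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:
            a ^= modulus
    return result


def min_distance(alpha: int, k: int, modulus: int) -> int:
    """Minimum Hamming weight over the non-zero codewords (x, alpha * x), for q = 2."""
    return min(
        bin(x).count("1") + bin(gf_mul(alpha, x, k, modulus)).count("1")
        for x in range(1, 2 ** k)
    )


def count_low_distance_codes(d: int, k: int, modulus: int) -> int:
    """|P| for threshold d: number of codes in the ensemble with distance < d."""
    return sum(1 for alpha in range(1, 2 ** k)
               if min_distance(alpha, k, modulus) < d)


def words_of_weight_below(d: int, n: int) -> int:
    """|S| for threshold d: number of binary words of length n with weight < d."""
    return sum(comb(n, i) for i in range(d))


K, MODULUS = 4, 0b10011   # assumed: GF(2^4) with modulus x^4 + x + 1
for d in range(1, 2 * K + 1):
    bad = count_low_distance_codes(d, K, MODULUS)
    print(d, bad, words_of_weight_below(d, 2 * K))
```

Each printed row lists a threshold $d$, the number of codes with distance below $d$, and the number of words of weight below $d$; by the Lemma the second number never exceeds the third.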