We briefly review the HSW coding theorem (the statement of the achievability of the Holevo information rate                     I        (        X        ;        B        )                 for communicating classical data over a quantum channel). We first review the minimal amount of quantum mechanics needed for the theorem. We then cover quantum typicality, and finally we prove the theorem using a recent sequential decoding technique.
In order to prove the HSW coding theorem, we really just need a few basic things from quantum mechanics. First, a quantum state is a unit trace, positive operator known as a density operator. Usually, we denote it by                     ρ                ,                     σ                ,                     ω                , etc. The simplest model for a quantum channel is known as a classical-quantum channel:
                    x        ↦                  ρ                      x                          .                The meaning of the above notation is that inputting the classical letter                     x                 at the transmitting end leads to a quantum state                               ρ                      x                                   at the receiving end. It is the task of the receiver to perform a measurement to determine the input of the sender. If it is true that the states                               ρ                      x                                   are perfectly distinguishable from one another (i.e., if they have orthogonal supports such that                               T          r                                  {                      ρ                          x                                            ρ                                          x                                  ′                                                              }                =        0                 for                     x        ≠                  x                      ′                                  ), then the channel is a noiseless channel. We are interested in situations for which this is not the case. If it is true that the states                               ρ                      x                                   all commute with one another, then this is effectively identical to the situation for a classical channel, so we are also not interested in these situations. So, the situation in which we are interested is that in which the states                               ρ                      x                                   have overlapping support and are non-commutative.
The most general way to describe a quantum measurement is with a positive operator-valued measure (POVM). We usually denote the elements of a POVM as                                           {                          Λ                              m                                      }                                m                                  . These operators should satisfy positivity and completeness in order to form a valid POVM:
                              Λ                      m                          ≥        0                                            ∀        m                                              ∑                      m                                    Λ                      m                          =        I        .                The probabilistic interpretation of quantum mechanics states that if someone measures a quantum state                     ρ                 using a measurement device corresponding to the POVM                               {                      Λ                          m                                }                        , then the probability                     p                  (          m          )                         for obtaining outcome                     m                 is equal to
                    p                  (          m          )                =                  Tr                          {                      Λ                          m                                ρ          }                ,                and the post-measurement state is
                              ρ                      m                                ′                          =                              1                          p                              (                m                )                                                                                        Λ                              m                                                    ρ                                            Λ                              m                                                    ,                if the person measuring obtains outcome                     m                . These rules are sufficient for us to consider classical communication schemes over cq channels.
The reader can find a good review of this topic in the article about the typical subspace.
The following lemma is important for our proofs. It demonstrates that a measurement that succeeds with high probability on average does not disturb the state too much on average:
Lemma: [Winter] Given an ensemble                               {                      p                          X                                            (            x            )                    ,                      ρ                          x                                }                         with expected density operator                     ρ        ≡                  ∑                      x                                    p                      X                                    (          x          )                          ρ                      x                                  , suppose that an operator                     Λ                 such that                     I        ≥        Λ        ≥        0                 succeeds with high probability on the state                     ρ                :
                              Tr                          {          Λ          ρ          }                ≥        1        −        ϵ        .                Then the subnormalized state                                           Λ                                    ρ                      x                                                Λ                                   is close in expected trace distance to the original state                               ρ                      x                                  :
                                          E                                X                                    {                                    ∥                                                Λ                                                            ρ                                  X                                                                              Λ                                            −                              ρ                                  X                                            ∥                                      1                                }                ≤        2                              ϵ                          .                (Note that                                           ∥            A            ∥                                1                                   is the nuclear norm of the operator                     A                 so that                                           ∥            A            ∥                                1                          ≡                Tr                              {                                                    A                                  †                                            A                                }                        .)
The following inequality is useful for us as well. It holds for any operators                     ρ                ,                     σ                ,                     Λ                 such that                     0        ≤        ρ        ,        σ        ,        Λ        ≤        I                :
The quantum information-theoretic interpretation of the above inequality is that the probability of obtaining outcome                     Λ                 from a quantum measurement acting on the state                     ρ                 is upper bounded by the probability of obtaining outcome                     Λ                 on the state                     σ                 summed with the distinguishability of the two states                     ρ                 and                     σ                .
Lemma: [Sen's bound] The following bound holds for a subnormalized state                     σ                 such that                     0        ≤        σ                 and                     T        r                  {          σ          }                ≤        1                 with                               Π                      1                                  , ... ,                               Π                      N                                   being projectors:                               Tr                          {          σ          }                −                  Tr                          {                      Π                          N                                ⋯                      Π                          1                                           σ                                 Π                          1                                ⋯                      Π                          N                                }                ≤        2                                            ∑                              i                =                1                                            N                                                    Tr                                      {                              (                I                −                                  Π                                      i                                                  )                            σ              }                                      ,                
We can think of Sen's bound as a "non-commutative union bound" because it is analogous to the following union bound from probability theory:
                    Pr                  {                                    (                              A                                  1                                            ∩              ⋯              ∩                              A                                  N                                            )                                      c                                }                =        Pr                  {                      A                          1                                      c                                ∪          ⋯          ∪                      A                          N                                      c                                }                ≤                  ∑                      i            =            1                                N                          Pr                  {                      A                          i                                      c                                }                ,                where                               A                      1                                  , ldots,                               A                      N                                   are events. The analogous bound for projector logic would be
                              Tr                          {                      (            I            −                          Π                              1                                      ⋯                          Π                              N                                      ⋯                          Π                              1                                      )                    ρ          }                ≤                  ∑                      i            =            1                                N                                    Tr                          {                      (            I            −                          Π                              i                                      )                    ρ          }                ,                if we think of                               Π                      1                          ⋯                  Π                      N                                   as a projector onto the intersection of subspaces. Though, the above bound only holds if the projectors                               Π                      1                                  , ...,                               Π                      N                                   are commuting (choosing                               Π                      1                          =                  |          +          ⟩                          ⟨          +          |                        ,                               Π                      2                          =                  |          0          ⟩                          ⟨          0          |                        , and                     ρ        =                  |          0          ⟩                          ⟨          0          |                         gives a counterexample). If the projectors are non-commuting, then Sen's bound is the next best thing and suffices for our purposes here.
We now prove the HSW theorem with Sen's non-commutative union bound. We divide up the proof into a few parts: codebook generation, POVM construction, and error analysis.
Codebook Generation. We first describe how Alice and Bob agree on a random choice of code. They have the channel                     x        →                  ρ                      x                                   and a distribution                               p                      X                                    (          x          )                        . They choose                     M                 classical sequences                               x                      n                                   according to the IID distribution                               p                                    X                              n                                                              (                      x                          n                                )                        . After selecting them, they label them with indices as                                           {                          x                              n                                                    (              m              )                        }                                m            ∈                          [              M              ]                                              . This leads to the following quantum codewords:
                              ρ                                    x                              n                                                    (              m              )                                      =                  ρ                                    x                              1                                                    (              m              )                                      ⊗        ⋯        ⊗                  ρ                                    x                              n                                                    (              m              )                                      .                The quantum codebook is then                               {                      ρ                                          x                                  n                                                            (                m                )                                              }                        . The average state of the codebook is then
where                     ρ        =                  ∑                      x                                    p                      X                                    (          x          )                          ρ                      x                                  .
POVM Construction . Sens' bound from the above lemma suggests a method for Bob to decode a state that Alice transmits. Bob should first ask "Is the received state in the average typical subspace?" He can do this operationally by performing a typical subspace measurement corresponding to                               {                      Π                          ρ              ,              δ                                      n                                ,          I          −                      Π                          ρ              ,              δ                                      n                                }                        . Next, he asks in sequential order, "Is the received codeword in the                               m                      th                                   conditionally typical subspace?" This is in some sense equivalent to the question, "Is the received codeword the                               m                      th                                   transmitted codeword?" He can ask these questions operationally by performing the measurements corresponding to the conditionally typical projectors                               {                      Π                                          ρ                                                      x                                          n                                                                            (                    m                    )                                                              ,              δ                                ,          I          −                      Π                                          ρ                                                      x                                          n                                                                            (                    m                    )                                                              ,              δ                                }                        .
Why should this sequential decoding scheme work well? The reason is that the transmitted codeword lies in the typical subspace on average:
                                          E                                              X                              n                                                              {                      Tr                                {                          Π                              ρ                ,                δ                                                                 ρ                                                X                                      n                                                                        }                    }                =                  Tr                          {                      Π                          ρ              ,              δ                                                                     E                                                      X                                  n                                                                          {                          ρ                                                X                                      n                                                                        }                    }                                            =                  Tr                          {                      Π                          ρ              ,              δ                                                       ρ                          ⊗              n                                }                                            ≥        1        −        ϵ        ,                where the inequality follows from (ef{eq:1st-typ-prop}). Also, the projectors                               Π                                    ρ                                                x                                      n                                                                    (                  m                  )                                                      ,            δ                                   are "good detectors" for the states                               ρ                                    x                              n                                                    (              m              )                                               (on average) because the following condition holds from conditional quantum typicality:
                                          E                                              X                              n                                                              {                      Tr                                {                          Π                                                ρ                                                            X                                              n                                                                                            ,                δ                                                                 ρ                                                X                                      n                                                                        }                    }                ≥        1        −        ϵ        .                Error Analysis . The probability of detecting the                               m                      th                                   codeword correctly under our sequential decoding scheme is equal to
                              Tr                          {                      Π                                          ρ                                                      X                                          n                                                                            (                    m                    )                                                              ,              δ                                                                                            Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    m                    −                    1                    )                                                              ,              δ                                ⋯                                                                      Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    1                    )                                                              ,              δ                                                       Π                          ρ              ,              δ                                      n                                                       ρ                                          x                                  n                                                            (                m                )                                                                     Π                          ρ              ,              δ                                      n                                                                                                       Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    1                    )                                                              ,              δ                                ⋯                                                                      Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    m                    −                    1                    )                                                              ,              δ                                            Π                                          ρ                                                      X                                          n                                                                            (                    m                    )                                                              ,              δ                                }                ,                where we make the abbreviation                                                         Π              ^                                      ≡        I        −        Π                . (Observe that we project into the average typical subspace just once.) Thus, the probability of an incorrect detection for the                               m                      th                                   codeword is given by
                    1        −                  Tr                          {                      Π                                          ρ                                                      X                                          n                                                                            (                    m                    )                                                              ,              δ                                                                                            Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    m                    −                    1                    )                                                              ,              δ                                ⋯                                                                      Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    1                    )                                                              ,              δ                                                       Π                          ρ              ,              δ                                      n                                                       ρ                                          x                                  n                                                            (                m                )                                                                     Π                          ρ              ,              δ                                      n                                                                                                       Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    1                    )                                                              ,              δ                                ⋯                                                                      Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    m                    −                    1                    )                                                              ,              δ                                            Π                                          ρ                                                      X                                          n                                                                            (                    m                    )                                                              ,              δ                                }                ,                and the average error probability of this scheme is equal to
                    1        −                              1            M                                    ∑                      m                                    Tr                          {                      Π                                          ρ                                                      X                                          n                                                                            (                    m                    )                                                              ,              δ                                                                                            Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    m                    −                    1                    )                                                              ,              δ                                ⋯                                                                      Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    1                    )                                                              ,              δ                                                       Π                          ρ              ,              δ                                      n                                                       ρ                                          x                                  n                                                            (                m                )                                                                     Π                          ρ              ,              δ                                      n                                                                                                       Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    1                    )                                                              ,              δ                                ⋯                                                                      Π                  ^                                                                                    ρ                                                      X                                          n                                                                            (                    m                    −                    1                    )                                                              ,              δ                                            Π                                          ρ                                                      X                                          n                                                                            (                    m                    )                                                              ,              δ                                }                .                Instead of analyzing the average error probability, we analyze the expectation of the average error probability, where the expectation is with respect to the random choice of code:
Our first step is to apply Sen's bound to the above quantity. But before doing so, we should rewrite the above expression just slightly, by observing that
                    1        =                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          ρ                                                X                                      n                                                                    (                  m                  )                                                      }                    }                                            =                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                      }                    +                      Tr                                {                                                                                Π                    ^                                                                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                      }                    }                                            =                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                                    Π                              ρ                ,                δ                                            n                                      }                    }                +                              1            M                                    ∑                      m                                    Tr                          {                                                                      Π                  ^                                                                    ρ              ,              δ                                      n                                                          E                                                      X                                  n                                                                          {                          ρ                                                X                                      n                                                                    (                  m                  )                                                      }                    }                                            =                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                                    Π                              ρ                ,                δ                                            n                                      }                    }                +                  Tr                          {                                                                      Π                  ^                                                                    ρ              ,              δ                                      n                                            ρ                          ⊗              n                                }                                            ≤                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                                    Π                              ρ                ,                δ                                            n                                      }                    }                +        ϵ                Substituting into (3) (and forgetting about the small                     ϵ                 term for now) gives an upper bound of
                                          E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                                    Π                              ρ                ,                δ                                            n                                      }                    }                                            −                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                                                ρ                                                            X                                              n                                                                                    (                      m                      )                                                                      ,                δ                                                                                                          Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      m                      −                      1                      )                                                                      ,                δ                                      ⋯                                                                                Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      1                      )                                                                      ,                δ                                                                 Π                              ρ                ,                δ                                            n                                                                 ρ                                                X                                      n                                                                    (                  m                  )                                                                                 Π                              ρ                ,                δ                                            n                                                                                                                       Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      1                      )                                                                      ,                δ                                      ⋯                                                                                Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      m                      −                      1                      )                                                                      ,                δ                                                    Π                                                ρ                                                            X                                              n                                                                                    (                      m                      )                                                                      ,                δ                                      }                    }                .                We then apply Sen's bound to this expression with                     σ        =                  Π                      ρ            ,            δ                                n                                    ρ                                    X                              n                                                    (              m              )                                                Π                      ρ            ,            δ                                n                                   and the sequential projectors as                               Π                                    ρ                                                X                                      n                                                                    (                  m                  )                                                      ,            δ                                  ,                                                                         Π                ^                                                                        ρ                                                X                                      n                                                                    (                  m                  −                  1                  )                                                      ,            δ                                  , ...,                                                                         Π                ^                                                                        ρ                                                X                                      n                                                                    (                  1                  )                                                      ,            δ                                  . This gives the upper bound                                           E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                2                                    [                              Tr                                            {                                  (                  I                  −                                      Π                                                                  ρ                                                                              X                                                          n                                                                                                            (                            m                            )                                                                                              ,                      δ                                                        )                                                  Π                                      ρ                    ,                    δ                                                        n                                                                    ρ                                                            X                                              n                                                                                    (                      m                      )                                                                                        Π                                      ρ                    ,                    δ                                                        n                                                  }                            +                              ∑                                  i                  =                  1                                                  m                  −                  1                                                            Tr                                            {                                  Π                                                            ρ                                                                        X                                                      n                                                                                                    (                          i                          )                                                                                      ,                    δ                                                                    Π                                      ρ                    ,                    δ                                                        n                                                                    ρ                                                            X                                              n                                                                                    (                      m                      )                                                                                        Π                                      ρ                    ,                    δ                                                        n                                                  }                            ]                                      1                              /                            2                                }                .                 Due to concavity of the square root, we can bound this expression from above by
                    2                              [                                          E                                                              X                                      n                                                                                      {                                                1                  M                                                            ∑                                  m                                                            Tr                                            {                                  (                  I                  −                                      Π                                                                  ρ                                                                              X                                                          n                                                                                                            (                            m                            )                                                                                              ,                      δ                                                        )                                                  Π                                      ρ                    ,                    δ                                                        n                                                                    ρ                                                            X                                              n                                                                                    (                      m                      )                                                                                        Π                                      ρ                    ,                    δ                                                        n                                                  }                            +                              ∑                                  i                  =                  1                                                  m                  −                  1                                                            Tr                                            {                                  Π                                                            ρ                                                                        X                                                      n                                                                                                    (                          i                          )                                                                                      ,                    δ                                                                    Π                                      ρ                    ,                    δ                                                        n                                                                    ρ                                                            X                                              n                                                                                    (                      m                      )                                                                                        Π                                      ρ                    ,                    δ                                                        n                                                  }                            }                        ]                                1                          /                        2                                                      ≤        2                              [                                          E                                                              X                                      n                                                                                      {                                                1                  M                                                            ∑                                  m                                                            Tr                                            {                                  (                  I                  −                                      Π                                                                  ρ                                                                              X                                                          n                                                                                                            (                            m                            )                                                                                              ,                      δ                                                        )                                                  Π                                      ρ                    ,                    δ                                                        n                                                                    ρ                                                            X                                              n                                                                                    (                      m                      )                                                                                        Π                                      ρ                    ,                    δ                                                        n                                                  }                            +                              ∑                                  i                  ≠                  m                                                            Tr                                            {                                  Π                                                            ρ                                                                        X                                                      n                                                                                                    (                          i                          )                                                                                      ,                    δ                                                                    Π                                      ρ                    ,                    δ                                                        n                                                                    ρ                                                            X                                              n                                                                                    (                      m                      )                                                                                        Π                                      ρ                    ,                    δ                                                        n                                                  }                            }                        ]                                1                          /                        2                          ,                where the second bound follows by summing over all of the codewords not equal to the                               m                      th                                   codeword (this sum can only be larger).
We now focus exclusively on showing that the term inside the square root can be made small. Consider the first term:
                                          E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          (              I              −                              Π                                                      ρ                                                                  X                                                  n                                                                                            (                        m                        )                                                                              ,                  δ                                            )                                      Π                              ρ                ,                δ                                            n                                                    ρ                                                X                                      n                                                                    (                  m                  )                                                                    Π                              ρ                ,                δ                                            n                                      }                    }                                            ≤                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          (              I              −                              Π                                                      ρ                                                                  X                                                  n                                                                                            (                        m                        )                                                                              ,                  δ                                            )                                      ρ                                                X                                      n                                                                    (                  m                  )                                                      }                    +                                    ∥                              ρ                                                      X                                          n                                                                            (                    m                    )                                                              −                              Π                                  ρ                  ,                  δ                                                  n                                                            ρ                                                      X                                          n                                                                            (                    m                    )                                                                              Π                                  ρ                  ,                  δ                                                  n                                            ∥                                      1                                }                                            ≤        ϵ        +        2                              ϵ                          .                where the first inequality follows from (1) and the second inequality follows from the Gentle Operator Lemma and the properties of unconditional and conditional typicality. Consider now the second term and the following chain of inequalities:
                              ∑                      i            ≠            m                                                E                                              X                              n                                                              {                      Tr                                {                          Π                                                ρ                                                            X                                              n                                                                                    (                      i                      )                                                                      ,                δ                                                                 Π                              ρ                ,                δ                                            n                                                                 ρ                                                X                                      n                                                                    (                  m                  )                                                                                 Π                              ρ                ,                δ                                            n                                      }                    }                                            =                  ∑                      i            ≠            m                                    Tr                          {                                    E                                                      X                                  n                                                                          {                          Π                                                ρ                                                            X                                              n                                                                                    (                      i                      )                                                                      ,                δ                                      }                                           Π                          ρ              ,              δ                                      n                                                                     E                                                      X                                  n                                                                          {                          ρ                                                X                                      n                                                                    (                  m                  )                                                      }                                           Π                          ρ              ,              δ                                      n                                }                                            =                  ∑                      i            ≠            m                                    Tr                          {                                    E                                                      X                                  n                                                                          {                          Π                                                ρ                                                            X                                              n                                                                                    (                      i                      )                                                                      ,                δ                                      }                                           Π                          ρ              ,              δ                                      n                                                       ρ                          ⊗              n                                                       Π                          ρ              ,              δ                                      n                                }                                            ≤                  ∑                      i            ≠            m                                    2                      −            n                          [              H                              (                B                )                            −              δ              ]                                                         Tr                          {                                    E                                                      X                                  n                                                                          {                          Π                                                ρ                                                            X                                              n                                                                                    (                      i                      )                                                                      ,                δ                                      }                                           Π                          ρ              ,              δ                                      n                                }                        The first equality follows because the codewords                               X                      n                                    (          m          )                         and                               X                      n                                    (          i          )                         are independent since they are different. The second equality follows from (2). The first inequality follows from (ef{eq:3rd-typ-prop}). Continuing, we have
                    ≤                  ∑                      i            ≠            m                                    2                      −            n                          [              H                              (                B                )                            −              δ              ]                                                                     E                                              X                              n                                                              {                      Tr                                {                          Π                                                ρ                                                            X                                              n                                                                                    (                      i                      )                                                                      ,                δ                                      }                    }                                            ≤                  ∑                      i            ≠            m                                    2                      −            n                          [              H                              (                B                )                            −              δ              ]                                                         2                      n                          [              H                              (                B                                  |                                X                )                            +              δ              ]                                                                  =                  ∑                      i            ≠            m                                    2                      −            n                          [              I                              (                X                ;                B                )                            −              2              δ              ]                                                                  ≤        M                           2                      −            n                          [              I                              (                X                ;                B                )                            −              2              δ              ]                                      .                The first inequality follows from                               Π                      ρ            ,            δ                                n                          ≤        I                 and exchanging the trace with the expectation. The second inequality follows from (ef{eq:2nd-cond-typ}). The next two are straightforward.
Putting everything together, we get our final bound on the expectation of the average error probability:
                    1        −                              E                                              X                              n                                                              {                                    1              M                                            ∑                          m                                            Tr                                {                          Π                                                ρ                                                            X                                              n                                                                                    (                      m                      )                                                                      ,                δ                                                                                                          Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      m                      −                      1                      )                                                                      ,                δ                                      ⋯                                                                                Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      1                      )                                                                      ,                δ                                                                 Π                              ρ                ,                δ                                            n                                                                 ρ                                                X                                      n                                                                    (                  m                  )                                                                                 Π                              ρ                ,                δ                                            n                                                                                                                       Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      1                      )                                                                      ,                δ                                      ⋯                                                                                Π                    ^                                                                                                ρ                                                            X                                              n                                                                                    (                      m                      −                      1                      )                                                                      ,                δ                                                    Π                                                ρ                                                            X                                              n                                                                                    (                      m                      )                                                                      ,                δ                                      }                    }                                            ≤        ϵ        +        2                              [                          (              ϵ              +              2                                                ϵ                                            )                        +            M                                       2                              −                n                                  [                  I                                      (                    X                    ;                    B                    )                                    −                  2                  δ                  ]                                                      ]                                1                          /                        2                          .                Thus, as long as we choose                     M        =                  2                      n                          [              I                              (                X                ;                B                )                            −              3              δ              ]                                              , there exists a code with vanishing error probability.