Math graphic
📐 Concept diagram

### 13.7 — Channel Capacity

Phase: Information Theory Prerequisites: 13-03-mutual-information, 13-01-entropy

Learning Objectives

By the end of this subject, you will be able to:

  1. Define a communication channel model and channel capacity
  2. Compute capacity for simple channels (BSC, BEC)
  3. State Shannon's noisy-channel coding theorem and its implications
  4. Explain the relationship between capacity, bandwidth, and SNR
  5. Interpret capacity as the maximum reliable communication rate

Core Content

⚠️ CRITICAL: The Communication Channel

A channel is defined by its conditional distribution $P(Y \mid X)$: - $X$: channel input (what we transmit) - $Y$: channel output (what is received)

Channel capacity is the maximum rate at which information can be transmitted reliably:

$$C = \max_{P(X)} I(X; Y)$$

The maximisation is over all possible input distributions $P(X)$.

Binary Symmetric Channel (BSC)

The simplest noisy channel: bits are flipped with probability $p$.

$$P(Y=0 \mid X=0) = 1-p, \quad P(Y=1 \mid X=0) = p$$ $$P(Y=0 \mid X=1) = p, \quad P(Y=1 \mid X=1) = 1-p$$

Capacity: $C_{\text{BSC}} = 1 - H_b(p)$ bits per channel use

The capacity is achieved by a uniform input distribution: $P(X=0) = P(X=1) = 0.5$.

Binary Erasure Channel (BEC)

Bits are either received correctly or "erased" (lost, denoted ?) with probability $\epsilon$.

$$P(Y=0 \mid X=0) = 1-\epsilon, \quad P(Y=\text{?} \mid X=0) = \epsilon, \quad P(Y=1 \mid X=0) = 0$$ $$P(Y=1 \mid X=1) = 1-\epsilon, \quad P(Y=\text{?} \mid X=1) = \epsilon, \quad P(Y=0 \mid X=1) = 0$$

Capacity: $C_{\text{BEC}} = 1 - \epsilon$ bits per channel use

Erasures are "easier" than errors — you at least KNOW when you're uncertain! The BEC at $\epsilon = 0.5$ still has capacity 0.5, while BSC at $p = 0.5$ has capacity 0.

⚠️ CRITICAL: Shannon's Noisy-Channel Coding Theorem

Theorem: For any discrete memoryless channel with capacity $C$:

  1. Achievability: For any rate $R < C$, there exists a code that achieves arbitrarily low error probability.
  2. Converse: For any rate $R > C$, reliable communication is impossible — the error probability is bounded away from zero.

Key insight: Reliable communication IS possible over noisy channels, up to a fundamental limit $C$. Error-correcting codes exist that can achieve rates arbitrarily close to $C$ (though the codes may be extremely long and complex).

This theorem launched the field of coding theory. Modern codes (Turbo codes, LDPC, polar codes) approach the Shannon limit in practice.

Continuous-Time Channel: AWGN

For the Additive White Gaussian Noise channel (continuous-time):

$$C = B \log_2\left(1 + \frac{S}{N}\right) \quad \text{bits per second}$$

Where: - $B$ = bandwidth (Hz) - $S/N$ = signal-to-noise ratio (linear, not dB)

This is the Shannon-Hartley theorem — the fundamental limit of any communication system.

Example: Telephone line: $B = 3400$ Hz, SNR = 30 dB = 1000.

$C = 3400 \cdot \log_2(1 + 1000) \approx 3400 \cdot 9.97 \approx 33,900$ bps.

This is the theoretical maximum for a dial-up modem — real modems hit ~56 kbps by exploiting the digital backbone, which circumvents the analog local-loop limit.



Key Terms

Worked Examples

Example 1: BSC capacity calculation

BSC with flip probability $p = 0.2$.

$H_b(0.2) = -0.2\log_2 0.2 - 0.8\log_2 0.8 = 0.464 + 0.258 = 0.722$

$C = 1 - 0.722 = 0.278$ bits per channel use.

Per 1000 channel uses, you can reliably transmit at most ~278 bits of information.

Example 2: Comparing BSC and BEC

Both have their "damage" parameter at 0.3:

BSC ($p=0.3$): $C = 1 - H_b(0.3) = 1 - 0.881 = 0.119$ bits

BEC ($\epsilon=0.3$): $C = 1 - 0.3 = 0.700$ bits

BEC capacity is MUCH higher because erasures tell you where the uncertainty is — you can use erasure codes efficiently. Errors are worse because you don't know which bits are wrong.

Example 3: AWGN capacity

WiFi channel: $B = 20$ MHz, SNR = 15 dB = 31.62.

$C = 20 \times 10^6 \cdot \log_2(1 + 31.62) = 20 \times 10^6 \cdot \log_2(32.62) \approx 20 \times 10^6 \cdot 5.028 \approx 100.6$ Mbps.

Real WiFi achieves much less due to practical coding gaps, overhead, and interference — but the theoretical limit is instructive.



Quiz

Q1: What does the concept of Channel capacity primarily refer to in this subject?

A) A visual representation of Channel capacity B) A historical anecdote about Channel capacity C) A computational error related to Channel capacity D) The definition and application of Channel capacity

Correct: D)

Q2: Which of the following is the key formula discussed in this subject?

A) An unrelated formula from a different topic B) P(Y \mid X) C) The inverse operation of the formula in question D) A simplified version of P(Y \mid X)...

Correct: B)

Q3: What is the primary purpose of Shannon-Hartley theorem?

A) It is used only in advanced research contexts B) It replaces all other methods in this domain C) It is used to shannon-hartley theorem in mathematical analysis D) It is primarily a historical notation system

Correct: C)

Q4: Which statement about ⚠️ Critical: The Communication Channel is TRUE?

A) ⚠️ Critical: The Communication Channel is mentioned only as a historical footnote B) ⚠️ Critical: The Communication Channel is an advanced topic beyond this subject's scope C) ⚠️ Critical: The Communication Channel is not related to this subject D) ⚠️ Critical: The Communication Channel is a fundamental concept covered in this subject

Correct: D)

Q5: Based on the worked examples in this subject, what is the correct result?

A) \max_{P(X)} I(X; Y)$ — maximum reliable communicatio B) A different result from a common mistake C) An unrelated numerical value D) The inverse of the correct answer

Correct: A)

Q6: How are ⚠️ Critical: The Communication Channel and Binary Symmetric Channel (Bsc) related?

A) ⚠️ Critical: The Communication Channel is the inverse of Binary Symmetric Channel (Bsc) B) ⚠️ Critical: The Communication Channel and Binary Symmetric Channel (Bsc) are completely unrelated topics C) ⚠️ Critical: The Communication Channel and Binary Symmetric Channel (Bsc) are closely related concepts D) ⚠️ Critical: The Communication Channel is a special case of Binary Symmetric Channel (Bsc)

Correct: C)

Q7: What is a common pitfall when working with Binary Erasure Channel (Bec)?

A) A common mistake is confusing Binary Erasure Channel (Bec) with a similar concept B) The main error with Binary Erasure Channel (Bec) is using it when it is not needed C) Binary Erasure Channel (Bec) has no common misconceptions D) Binary Erasure Channel (Bec) is always computed the same way in all contexts

Correct: A)

Q8: When should you apply ⚠️ Critical: Shannon'S Noisy-Channel Coding Theorem?

A) Avoid ⚠️ Critical: Shannon'S Noisy-Channel Coding Theorem unless explicitly instructed B) Apply ⚠️ Critical: Shannon'S Noisy-Channel Coding Theorem to solve problems in this subject's domain C) Use ⚠️ Critical: Shannon'S Noisy-Channel Coding Theorem only in pure mathematics contexts D) ⚠️ Critical: Shannon'S Noisy-Channel Coding Theorem is not practically useful

Correct: B)

Practice Problems

  1. What is the capacity of a BSC with $p = 0.01$?

    Click for answer $H_b(0.01) \approx 0.081$ bits $C = 1 - 0.081 = 0.919$ bits per channel use Even 1% error rate reduces capacity by ~8%.

  2. Why is $C = 0$ for a BSC with $p = 0.5$?

    Click for answer $H_b(0.5) = 1$, so $C = 0$. With $p=0.5$, the output is independent of the input — $I(X; Y) = 0$ for any input distribution. No matter what you transmit, the receiver sees pure noise. No reliable communication is possible.

  3. Channel capacity is defined as $C = \max_{P(X)} I(X; Y)$. Why maximise over $P(X)$?

    Click for answer The channel is defined by $P(Y \mid X)$, which is fixed (it's the physical medium). But we CAN choose how we use the channel — the input distribution $P(X)$. For example, a BSC achieves maximum $I(X; Y)$ when inputs are equally likely. If we always sent 0, $I(X; Y) = 0$. Capacity is the best we can do by designing the input optimally.

  4. The AWGN capacity formula $C = B \log_2(1 + S/N)$ suggests increasing bandwidth helps. Is there a limit?

    Click for answer As $B \to \infty$, $C \to \frac{S}{N_0} \log_2 e \approx 1.44 \cdot \frac{S}{N_0}$ (where $N_0$ is noise power spectral density). Infinite bandwidth does NOT give infinite capacity — it approaches a finite limit determined by signal power and noise density.

  5. If a channel has capacity $C = 0.5$ bits per use, can you transmit 100 bits reliably in 150 channel uses?

    Click for answer Rate $R = 100/150 = 0.667$ bits per use. Since $R > C = 0.5$, reliable transmission is IMPOSSIBLE according to the noisy-channel coding theorem (converse). You need at least $100/0.5 = 200$ channel uses. The error probability will be bounded away from zero.


Summary

Key takeaways:


Pitfalls



Next Steps

Next up: 13-08-differential-entropy.md