2025-01-14 19:14:22 +01:00
|
|
|
# Congestion Avoidance
|
|
|
|
|
Internet can be represented as a huge number
|
|
|
|
|
of links and buffers, where each buffer is
|
|
|
|
|
technically a `router`.
|
|
|
|
|
|
|
|
|
|
Each time we send a packet, the `time` it
|
|
|
|
|
gets to go from `Point S` to `Point D` is
|
|
|
|
|
called `One Way Delay` or `Forward Delay`
|
|
|
|
|
$T_{fw}$.
|
|
|
|
|
|
|
|
|
|
From `Point D` to `Point S` is
|
|
|
|
|
called `Backward Delay` $T_{bw}$
|
|
|
|
|
|
|
|
|
|
The `Round Trip Time` (`RTT`) is:
|
|
|
|
|
$RTT = T_{fw} + T_{bw}$
|
|
|
|
|
|
|
|
|
|
## Phil Karn Algorithm
|
|
|
|
|
To get an RTT estimation, we sum all the
|
|
|
|
|
delays introduced to physically transfer
|
|
|
|
|
packets $T_{pi}$ and the queue time of those
|
|
|
|
|
$T_{qi}$ on each node $N_{i}$.
|
|
|
|
|
$$
|
|
|
|
|
RTT = \sum_i
|
|
|
|
|
\underbrace{T_{pi}}_{
|
|
|
|
|
\text{propagation delay}
|
|
|
|
|
} +
|
|
|
|
|
\sum_i \underbrace{T_{qi}}_{
|
|
|
|
|
\text{queueing delay}
|
|
|
|
|
}
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
While we can't control any $T_{pi}$ as it is
|
|
|
|
|
a physical constraint, we can control $T_{qi}$
|
|
|
|
|
|
|
|
|
|
## Internet Delay Model
|
|
|
|
|
|
|
|
|
|
<!-- TODO: Add image here-->
|
|
|
|
|
|
|
|
|
|
### Variables
|
|
|
|
|
- $Bw_Delay = e^{-sT_{bw}}$ : Backward Delay
|
|
|
|
|
- $Fw_Delay = e^{-sT_{fw}}$ : Forward Delay
|
|
|
|
|
- $q(t)$ : receiving data $bits$
|
|
|
|
|
- $d(t)$ : disturbance or draining of the buffer $\frac{bits}{s}$
|
|
|
|
|
- $AdW$ : Advertised Window (AKA free buffer space)
|
|
|
|
|
- $G_c(s)$ : Controller that checks how much data can be sent to the receiver
|
|
|
|
|
|
|
|
|
|
### Control
|
|
|
|
|
In order to control this, we need to model a Smith Predictor,
|
|
|
|
|
working without delay first and then equaling the two systems
|
|
|
|
|
later.
|
|
|
|
|
|
|
|
|
|
## TCP Models
|
|
|
|
|
TCP was modeled in order to be a `Greedy Protocol`, thus using as much as bandwidth
|
|
|
|
|
it could grab. In order to have congestions (when a flow can't be transmitted)
|
|
|
|
|
TCP had to be controlled. The difficulty in controlling such system lies in the
|
|
|
|
|
inability to know many variables such as intermediate buffers.
|
|
|
|
|
|
|
|
|
|
Tee way we control TCP has several `flavours` like:
|
|
|
|
|
- TCP Tahoe
|
|
|
|
|
- TCP Reno
|
|
|
|
|
- TCP New Reno
|
|
|
|
|
- TCP Santa Cruz
|
|
|
|
|
- TCP New Vegas
|
|
|
|
|
- TCP Westwood+
|
|
|
|
|
|
|
|
|
|
### TCP Tahoe
|
|
|
|
|
This `flavour` has 2 phases to control the single TCP flow.
|
|
|
|
|
It introduces concepts used in many other `flavours`:
|
|
|
|
|
- $cwndw$ : Congestion Window | If you send data more than this window, you'll likely
|
|
|
|
|
end up having a congestion and lose packets
|
|
|
|
|
- $t_k$ : Time when you received $k$ RTTs
|
|
|
|
|
- $sstresh$: Slow Start Treshold | after this value, we need to avoid having a
|
|
|
|
|
congestion and so we increase the $cwndw$ linearly
|
|
|
|
|
|
|
|
|
|
#### Slow Start (SS)
|
|
|
|
|
This is the fastest phase as it is exponential.
|
|
|
|
|
|
|
|
|
|
Each time all packets sent are successfully received, the $cwndw$ is exponentially increased:
|
|
|
|
|
$$
|
|
|
|
|
cwndw = cwndw(k - 1) * 2 = cwndw(0) * 2^k
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
Or in other words, on each successful `ACK` :
|
|
|
|
|
$$
|
|
|
|
|
cwndw_{n} = cwndw_{n-1} + 1
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
#### Congestion Avoidance
|
|
|
|
|
We are in this phase once $cwndw \geq sshtresh$. This phase `probes`
|
|
|
|
|
(check the max capacity) the available bandwidth.
|
|
|
|
|
|
|
|
|
|
Here on each successful `ACK`:
|
|
|
|
|
$$
|
|
|
|
|
cwndw_{n} = cwndw_{n-1} + \frac{1}{cwndw_{n-1}}
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
Once we receive a `3DUPACK` or an `RTO`, we get that the bandwidth is somewhere
|
|
|
|
|
full and we are losing packets.
|
|
|
|
|
|
|
|
|
|
#### Congestion Phase
|
|
|
|
|
Eithe rwe receive a `3DUPACK` or an `RTO`:
|
|
|
|
|
- $sstresh = cwndw / 2$
|
|
|
|
|
- $cwndw = 2$
|
|
|
|
|
|
|
|
|
|
### TCP Reno / New Reno[^tcp-new-reno-freebsd]
|
|
|
|
|
They differ from TCP Tahoe only on the Congestion Phase
|
|
|
|
|
|
|
|
|
|
#### `3DUPACK` Congestion
|
|
|
|
|
- $sstresh = cwndw / 2$
|
|
|
|
|
- $cwndw = sshtresh$
|
|
|
|
|
|
|
|
|
|
#### `RTO` Congestion
|
|
|
|
|
- $sstresh = cwndw / 2$
|
|
|
|
|
- $cwndw = 2$
|
|
|
|
|
|
|
|
|
|
#### TCP New Reno Throughput Formula (Tuenis Ott)
|
|
|
|
|
$$
|
|
|
|
|
\frac{W}{2}\frac{W}{2} + \frac{1}{2}\frac{W}{2}\frac{W}{2} =
|
|
|
|
|
\frac{3}{8}W^2
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
This formula is equal to how much $window$ is acked before a loss.
|
|
|
|
|
Therefore,
|
|
|
|
|
$
|
|
|
|
|
\frac{3}{8}W^2 = \frac{1}{p} \leftrightarrow W = \frac{2\sqrt{2}}{\sqrt{3p}}
|
|
|
|
|
$
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### TCP Westwood
|
|
|
|
|
This algorithm tries to estimate the available bandwidth without blindly `probing`:
|
|
|
|
|
$$
|
|
|
|
|
bw_k = \frac{
|
|
|
|
|
acked_{k} - acked_{k-1}
|
|
|
|
|
}{
|
|
|
|
|
t_{k} - t_{k-1}
|
|
|
|
|
} = \frac{data_k}{time\_interval}
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
The thing is that if you try to sample in such a short period, you'll
|
|
|
|
|
have spikes that can't be filtered because of the high frequency
|
|
|
|
|
components.
|
|
|
|
|
|
|
|
|
|
So, we should sample at $f_c \geq 2f_{max}$ as per Nyquist/Shannon
|
|
|
|
|
Theorem:
|
|
|
|
|
|
|
|
|
|
$$
|
|
|
|
|
\begin{cases}
|
|
|
|
|
bw_k = \frac{
|
|
|
|
|
acked_{k} - acked_{k-1}
|
|
|
|
|
}{
|
|
|
|
|
t_{k} - t_{k-1}
|
|
|
|
|
} = \frac{data_k}{time\_interval} \\
|
|
|
|
|
\hat{bw}_k = \alpha bw_k + (1 - \alpha) \hat{bw}_{k-1} \rightarrow
|
|
|
|
|
\text{this is also called moving average}
|
|
|
|
|
\end{cases} \\
|
|
|
|
|
cwndw = \hat{bw}_k RTT_{min} \rightarrow
|
|
|
|
|
\text{we are assuming $T_q = 0$}
|
|
|
|
|
$$
|
|
|
|
|
|
|
|
|
|
#### `3DUPACK` Congestion
|
|
|
|
|
- $sstresh = \hat{bw}_k RTT_{min}$
|
|
|
|
|
- $cwndw = sshtresh$
|
|
|
|
|
|
|
|
|
|
#### `RTO` Congestion
|
|
|
|
|
- $sstresh = \hat{bw}_k RTT_{min}$
|
|
|
|
|
- $cwndw = 2$
|
|
|
|
|
|
|
|
|
|
### TCP Cubic
|
|
|
|
|
This method has many differences with all preceding ones[^tcp-cubic-linux].
|
|
|
|
|
Here this algorithm is `real-time` dependent and not `RTT` one.
|
|
|
|
|
|
|
|
|
|
This algorithm is slow at start, after $t=K$ it becomes exponential and
|
|
|
|
|
`probes` the bandwidth
|
|
|
|
|
|
|
|
|
|
#### Parameters
|
|
|
|
|
- $\beta$ : Multiplicative decrease factor | Set to $0.7$, $0.75$, $0.8$
|
|
|
|
|
- $W_{max}$ : Windows size `right-before` its last reduction
|
|
|
|
|
- $T$ : Time elapsed since last reduction
|
|
|
|
|
- $C$ : Scaling constant | Set to $0.4$
|
|
|
|
|
- $cwndw$: Congestion Window at current time
|
|
|
|
|
|
|
|
|
|
# Congestion
|
|
|
|
|
- $cwndw = C(t-k)^3+ W_{max}$
|
|
|
|
|
- $K = \sqrt[3]{\frac{(1 -\beta) W_{max}}{C}}$
|
|
|
|
|
|
|
|
|
|
## See also
|
|
|
|
|
- Statistical Multiplexing
|
|
|
|
|
- Congestion Avoidance: Van Jacobson ACM CCR 2004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[^tcp-new-reno-freebsd]: TCP New Reno is used as the default algorithm on FreeBSD
|
|
|
|
|
[^tcp-cubic-linux]: TCP Cubic is used as the default in Linux
|