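The Linux kernel's network stack has network traffic control and shaping features. The iproute2 package installs the tc command to control these from the command line.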
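This article shows how to shape traffic by using queueing disciplines. For instance, if you ever had to forbid downloads or torrents on a network that you administer because users were abusing the bandwidth, you can instead use queueing disciplines to throttle that kind of traffic so that a single user cannot slow down the entire network.
This article assumes some familiarity with network devices and iptables.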
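Queueing
Queueing controls how data is sent; receiving data is much more reactive, with fewer network-oriented controls. However, since TCP/IP packets are sent using slow start, the system begins by sending packets slowly and keeps sending them faster and faster until packets start getting rejected. It is therefore possible to control how much traffic is received on a LAN by dropping packets that arrive at a router before they get forwarded. There are more relevant details, but they do not touch directly on queueing logic. In order to fully control the shape of the traffic, we need to be the slowest link of the chain. That is, if the connection has a maximum download speed of 500k and you do not limit the output to 450k or below, it is going to be the modem shaping the traffic instead of us. Each network device has a root where a qdisc can be set. By default, this root has a fq_codel qdisc (more information below).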
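The "slowest link" rule above is simple arithmetic. As a minimal sketch, the hypothetical helper shaped_rate (it is not part of tc or iproute2) takes a measured link speed and a percentage and prints the rate to configure. The 90% figure is only an assumption chosen to match the 500k to 450k example.

```shell
# shaped_rate LINK_KBIT PERCENT
# Prints PERCENT % of LINK_KBIT, rounded down, using integer arithmetic.
# Hypothetical helper for illustration; not part of iproute2.
shaped_rate() {
    echo $(( $1 * $2 / 100 ))
}

shaped_rate 500 90   # prints 450
```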
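There are two kinds of qdiscs: classful and classless.
Classful qdiscs allow you to create classes, which work like branches on a tree. You can then set rules to filter packets into each class. Each class can itself be assigned other classful or classless qdiscs.
Classless qdiscs do not allow more qdiscs to be added to them. Before starting to configure qdiscs, we first need to remove any existing qdisc from the root. This removes any qdisc from the eth0 device: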
# tc qdisc del root dev eth0
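Classless Qdiscs
These are queues that do basic traffic management by reordering, slowing or dropping packets. These qdiscs do not allow the creation of classes.
fifo_fast
This was the default qdisc up until systemd 217. In every network device where no custom qdisc configuration has been applied, fifo_fast is the qdisc set on the root. fifo means First In First Out, that is, the first packet to get in is the first to be sent. This way, no packet gets special treatment.
Token Bucket Filter (TBF)
This qdisc allows bytes to pass as long as a certain rate limit is not exceeded. It works by creating a virtual bucket and adding tokens to it at a fixed rate. Each packet takes a virtual token from the bucket and uses it to get permission to pass. If too many packets arrive, the bucket runs out of tokens and the remaining packets wait a certain time for new tokens. If the tokens do not arrive fast enough, the packets are dropped. In the opposite case (too few packets to send), the accumulated tokens can be used to allow some bursts (upload spikes).
That means this qdisc is useful for slowing down an interface.
Example:
Uploading can fill a modem's queue, and as a result interactivity is destroyed while you are uploading a huge file.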
# tc qdisc add dev ppp0 root tbf rate 220kbit latency 50ms burst 1540
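Note that the above upload speed should be changed to your upload speed minus a few percent (so that you are the slowest link in the chain). This configuration sets a TBF on the ppp0 device, limits the upload speed to 220 kbit/s, lets a packet wait at most 50 ms before it is dropped, and sets a burst of 1540 bytes.
It works by keeping the queueing on the Linux machine (where it can be shaped) instead of on the modem.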
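The token bucket behaviour described in this section can be sketched as a toy simulation. This is only an illustration under simplifying assumptions, not how tc implements TBF: time advances in whole ticks, one token pays for one byte, and a packet that finds too few tokens is simply reported as waiting instead of being queued. The function name token_bucket and all numbers are hypothetical.

```shell
# token_bucket RATE BURST SIZE...
#   RATE  = tokens added to the bucket per tick
#   BURST = bucket capacity (maximum stored tokens)
#   SIZE  = bytes offered at each successive tick
# Toy model for illustration only; not part of iproute2.
token_bucket() {
    rate=$1; burst=$2; shift 2
    tokens=$burst                      # the bucket starts full
    for size in "$@"; do
        if [ "$size" -le "$tokens" ]; then
            tokens=$(( $tokens - $size ))
            echo "pass $size"
        else
            echo "wait $size"
        fi
        tokens=$(( $tokens + $rate ))  # refill for the next tick
        if [ "$tokens" -gt "$burst" ]; then
            tokens=$burst              # never store more than the capacity
        fi
    done
}

token_bucket 100 300 250 250 50
```

With these inputs the first 250-byte packet uses the stored burst, the second has to wait because only 150 tokens are available, and the small 50-byte packet passes.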
Stochastic Fairness Queueing (SFQ)
This is a round-robin qdisc. Each conversation is placed on a fifo queue, and on each round every conversation gets a chance to send data. That is why it is called "Fairness". It is called "Stochastic" because it does not really create a queue for each conversation; instead it uses a hashing algorithm to assign conversations to a limited number of queues. With hashing, several sessions may end up in the same bucket. To keep this from becoming noticeable, SFQ changes its hashing algorithm often.
Example:
This configuration sets SFQ on the root on the eth0 device, configuring it to perturb (alter) its hashing algorithm every 10 seconds.
# tc qdisc add dev eth0 root sfq perturb 10
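To make the hashing idea more concrete, here is a toy sketch of how flows could be mapped to a small number of queues and how changing a salt reshuffles that mapping. It uses the POSIX cksum utility as a stand-in hash, which is an assumption for illustration and not the hash SFQ uses; the helper name bucket_for and the flow labels are hypothetical.

```shell
# bucket_for FLOW SALT NBUCKETS
# Maps a flow label to one of NBUCKETS queues. SALT plays the role of the
# perturbation: changing it changes which flows share a queue.
# Toy stand-in using cksum (CRC); NOT the hash function SFQ uses.
bucket_for() {
    crc=$(printf '%s|%s' "$1" "$2" | cksum | cut -d' ' -f1)
    echo $(( $crc % $3 ))
}

# Two hypothetical flows, hashed into 8 queues with two different salts.
for salt in 1 2; do
    for flow in "10.0.0.1:5000>4.3.2.1:80" "10.0.0.2:6000>4.3.2.1:22"; do
        echo "salt=$salt flow=$flow bucket=$(bucket_for "$flow" "$salt" 8)"
    done
done
```

If two flows collide under one salt, they are likely to land in different buckets after the salt changes, which is the effect perturb is meant to achieve.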
CoDel and Fair Queueing CoDel
Since systemd 217, fq_codel is the default qdisc. CoDel (Controlled Delay) is an attempt to limit bufferbloat and minimize latency in saturated network links by distinguishing good queues (which empty quickly) from bad queues (which stay saturated and slow). Fair Queueing CoDel (fq_codel) uses fair queues to distribute the available bandwidth more evenly between flows, each managed by CoDel. The configuration options are intentionally limited, since the algorithm is designed to work with dynamic networks. Some corner cases, including issues on very large switches and sub-megabit connections, are discussed on the bufferbloat wiki.
Additional information is available in the tc-codel(8) and tc-fq_codel(8) man pages.
Classful Qdiscs
Classful qdiscs are very useful if you have different kinds of traffic which should have differing treatment. A classful qdisc allows you to have branches. The branches are called classes.
Setting a classful qdisc requires that you name each class. To name a class, the classid parameter is used. The parent parameter, as the name indicates, points to the parent of the class.
All the names should be set as x:y where x is the name of the root, and y is the name of the class. Normally, the root is called 1: and its children are things like 1:10.
Hierarchical Token Bucket (HTB)
HTB is well suited for setups where you have a fixed amount of bandwidth which you want to divide for different purposes, giving each purpose a guaranteed bandwidth, with the possibility of specifying how much bandwidth can be borrowed. Here is an example with comments explaining what each line does:
# This line sets an HTB qdisc on the root of eth0 and specifies that
# class 1:30 is used by default. It names the root 1: for future reference.
tc qdisc add dev eth0 root handle 1: htb default 30
# This creates a class called 1:1, a direct descendant of the root (its
# parent is 1:). The class is also handled by HTB and gets a maximum rate
# of 6mbit with a burst of 15k.
tc class add dev eth0 parent 1: classid 1:1 htb rate 6mbit burst 15k
# The previous class has these branches:
# Class 1:10, which has a rate of 5mbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 5mbit burst 15k
# Class 1:20, which has a rate of 3mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 3mbit ceil 6mbit burst 15k
# Class 1:30, which has a rate of 1kbit. This one is the default class.
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 1kbit ceil 6mbit burst 15k
# Martin Devera, author of HTB, recommends SFQ beneath these classes:
tc qdisc add dev eth0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:20 handle 20: sfq perturb 10
tc qdisc add dev eth0 parent 1:30 handle 30: sfq perturb 10
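When dividing bandwidth like this, it can help to check how the guaranteed rates of the child classes compare with the parent's rate. A common HTB guideline, stated here as an assumption rather than something this article sets out, is that the children's rates should add up to no more than the parent's rate. The hypothetical helper htb_budget below does that sum for plain kbit values (tc treats 1mbit as 1000kbit).

```shell
# htb_budget PARENT_KBIT CHILD_KBIT...
# Prints how the summed child rates compare with the parent rate.
# Hypothetical helper; all values are plain integers in kbit.
htb_budget() {
    parent=$1; shift
    sum=0
    for child in "$@"; do
        sum=$(( $sum + $child ))
    done
    if [ "$sum" -le "$parent" ]; then
        echo "ok: $sum of $parent kbit guaranteed"
    else
        echo "oversubscribed: $sum of $parent kbit guaranteed"
    fi
}

# Rates from the example above: parent 6mbit; children 5mbit, 3mbit, 1kbit.
htb_budget 6000 5000 3000 1
```

For the rates in the example this reports that the children together guarantee more than the parent's 6mbit, so you may want to lower the child rates when adapting the example to your own link.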
Filters
Once a classful qdisc is set on root (which may contain classes with more classful qdiscs), it is necessary to use filters to indicate which packets should be processed by which class.
In a classless-only environment, filters are not necessary.
You can filter packets by using tc, or a combination of tc + iptables.
Using tc only
Here is an example explaining a filter:
# This command adds a filter to the qdisc 1: of dev eth0, sets the
# priority of the filter to 1, matches packets with a
# destination port 22, and makes the class 1:10 process the
# packets that match.
tc filter add dev eth0 protocol ip parent 1: prio 1 u32 match ip dport 22 0xffff flowid 1:10
# This filter is attached to the qdisc 1: of dev eth0, has a
# priority of 2, matches the source IP address 4.3.2.1 exactly and
# a source port of 80, then makes class 1:11 process the
# packets that match.
tc filter add dev eth0 parent 1: protocol ip prio 2 u32 match ip src 4.3.2.1/32 match ip sport 80 0xffff flowid 1:11
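When several ports should go to the same class, the first filter above can be repeated once per port. The sketch below uses a hypothetical helper, port_filters, that only prints the tc commands so they can be reviewed before being run as root; the parent 1: and priority 1 are assumptions copied from the example.

```shell
# port_filters DEV FLOWID PORT...
# Prints one "tc filter add" command per destination port, modelled on the
# first filter above. Hypothetical helper; it prints commands and does not
# change any kernel state.
port_filters() {
    dev=$1; flowid=$2; shift 2
    for port in "$@"; do
        echo "tc filter add dev $dev protocol ip parent 1: prio 1 u32 match ip dport $port 0xffff flowid $flowid"
    done
}

port_filters eth0 1:10 22 443
```

The printed lines can be pasted into a root shell or piped to sh once they look right.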
Using tc + iptables
iptables has a method called fwmark that can be used to mark packets across interfaces.
First, this filter makes packets marked with 6 be processed by the 1:30 class:
# tc filter add dev eth0 protocol ip parent 1: prio 1 handle 6 fw flowid 1:30
This iptables rule sets mark 6 on the packets:
# iptables -A PREROUTING -t mangle -i eth0 -j MARK --set-mark 6
You can then use iptables normally to match packets and then mark them with fwmark.
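As an illustration of pairing a mark with a class, the sketch below prints an fw filter and a matching iptables rule for one source address. The helper name mark_to_class, the choice of the POSTROUTING chain, and the address are assumptions made for this example; the helper only prints the commands.

```shell
# mark_to_class DEV SRC_IP MARK FLOWID
# Prints a tc fw filter that sends packets carrying MARK to FLOWID, and an
# iptables rule that sets MARK on packets from SRC_IP leaving DEV.
# Hypothetical helper; it prints commands and does not run them.
mark_to_class() {
    dev=$1; src=$2; mark=$3; flowid=$4
    echo "tc filter add dev $dev protocol ip parent 1: prio 1 handle $mark fw flowid $flowid"
    echo "iptables -t mangle -A POSTROUTING -o $dev -s $src -j MARK --set-mark $mark"
}

mark_to_class eth0 192.168.1.50 6 1:30
```

Keeping the mark value in one place avoids the filter and the iptables rule drifting apart.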
Example of ingress traffic shaping with SNAT
Qdiscs on ingress traffic provide only policing with no shaping. In order to shape ingress, the IFB (Intermediate Functional Block) device has to be used. However, another problem arises if SNAT or MASQUERADE is in use, as all incoming traffic has the same destination address. The Qdisc intercepts the incoming traffic on the external interface before reverse NAT translation so it can only see the router's IP as destination of the packets.
The following solution is implemented on OpenWRT and can be applied to Arch Linux. First, the outgoing packets are marked with MARK and the corresponding connections (and related connections) with CONNMARK. On the incoming packets, an ingress u32 filter redirects the traffic to IFB (action mirred) and also retrieves the mark of the packet from CONNTRACK (action connmark), thus providing information as to which IP behind the NAT initiated the traffic.
This function has been integrated in the kernel since Linux 3.19 and in iproute2 since 4.1.
The following is a small script with only 2 HTB classes on ingress to demonstrate it. Traffic defaults to class 3:30. Outgoing traffic from 192.168.1.50 (behind NAT) to the Internet is marked with "3" and thus incoming packets from the Internet going to 192.168.1.50 are marked also with "3" and are classified on 3:33.
#!/bin/sh -x
# Maximum allowed downlink. Set to 90% of the achievable downlink in kbits/s
DOWNLINK=1800
# Interface facing the Internet
EXTDEV=enp0s3
# Load IFB, all other modules are loaded automatically
modprobe ifb
ip link set dev ifb0 down
# Clear old queuing disciplines (qdisc) on the interfaces and the MANGLE table
tc qdisc del dev $EXTDEV root    2> /dev/null > /dev/null
tc qdisc del dev $EXTDEV ingress 2> /dev/null > /dev/null
tc qdisc del dev ifb0 root       2> /dev/null > /dev/null
tc qdisc del dev ifb0 ingress    2> /dev/null > /dev/null
iptables -t mangle -F
iptables -t mangle -X QOS
# Appending "stop" (without quotes) after the name of the script stops here.
if [ "$1" = "stop" ]
then
        echo "Shaping removed on $EXTDEV."
        exit
fi
ip link set dev ifb0 up
# HTB classes on IFB with rate limiting
tc qdisc add dev ifb0 root handle 3: htb default 30
tc class add dev ifb0 parent 3: classid 3:3 htb rate ${DOWNLINK}kbit
tc class add dev ifb0 parent 3:3 classid 3:30 htb rate 400kbit ceil ${DOWNLINK}kbit
tc class add dev ifb0 parent 3:3 classid 3:33 htb rate 1400kbit ceil ${DOWNLINK}kbit
# Packets marked with "3" on IFB flow through class 3:33
tc filter add dev ifb0 parent 3:0 protocol ip handle 3 fw flowid 3:33
# Outgoing traffic from 192.168.1.50 is marked with "3"
iptables -t mangle -N QOS
iptables -t mangle -A FORWARD -o $EXTDEV -j QOS
iptables -t mangle -A OUTPUT -o $EXTDEV -j QOS
iptables -t mangle -A QOS -j CONNMARK --restore-mark
iptables -t mangle -A QOS -s 192.168.1.50 -m mark --mark 0 -j MARK --set-mark 3
iptables -t mangle -A QOS -j CONNMARK --save-mark
# Forward all ingress traffic on internet interface to the IFB device
tc qdisc add dev $EXTDEV ingress handle ffff:
tc filter add dev $EXTDEV parent ffff: protocol ip \
        u32 match u32 0 0 \
        action connmark \
        action mirred egress redirect dev ifb0 \
        flowid ffff:1
exit 0