QoS Queuing/Congestion Management (Part 4)
Queuing and scheduling are closely related terms. Queuing dictates the number of queues and the length of each queue. A longer queue reduces packet loss but increases delay, because a packet may have to wait longer in the queue before it is transmitted.
Scheduling defines the algorithm that selects which queue to take the next packet from and place it on the wire. A scheduler can round-robin through the queues in an equal 1:1 ratio, but it can also give some queues more preference than others. The scheduler can also give certain queues guaranteed bandwidth.
In general, when people refer to queuing algorithms they are talking about the queuing and scheduling together. Queuing is really only useful during times of congestion - when more packets are trying to egress the interface than the line rate of the interface can carry. Therefore queuing and scheduling are part of congestion management. They are responsible for giving some packets better treatment than others, by placing them onto the wire “ahead” of less important traffic.
You may be surprised to learn that when packets are taken from queues, they do not actually get placed directly on the wire. Instead they are put in a hardware queue, called the transmit queue. The transmit queue simply sends packets out the interface whenever there is bandwidth available. The transmit queue does not have a concept of slowing down. It is a basic FIFO queue that simply sends the next packet in line when there is room.
So in the diagram above, the queuing method is responsible for the number of queues and management of the queues. The scheduler is responsible for placing packets from these queues onto the hardware transmit queue. The queues created by the queueing method are software queues. They are simply pointers to places in buffers where the packets are held.
FIFO is the most basic queuing method. There is only a single queue, so no scheduling logic is needed. Whenever there is space available to send a packet, the next packet in line is removed from the queue. Tail drop is used if the single queue becomes full. You can adjust the length of the FIFO queue on an interface using the command hold-queue # out.
Adjusting the output queue on a CSR1000v from its default of 40 to 100.
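A minimal sketch of that change, assuming the queue is being adjusted on GigabitEthernet1:

interface GigabitEthernet1
 ! raise the output hold queue from the default of 40 packets to 100
 hold-queue 100 out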
Remember that increasing the queue size increases the delay for any packets that are now held in the queue that would have been dropped before. So it is not necessarily beneficial to increase the queue size.
PQ uses up to four individual queues, with a scheduler that services higher-priority queues unconditionally. The scheduler first looks at the highest-priority queue. If there are any packets in that queue, they are placed on the wire. If the queue is empty, the scheduler moves to the next highest-priority queue and serves it, and so forth. The lowest-priority queue only ever gets serviced if all higher-priority queues are empty.
The downside to using PQ is that the highest priority queues can use 100% of the bandwidth and completely starve out lower priority queues.
CQ allows for up to 16 queues. The scheduler reserves a percentage of the link bandwidth for each queue and performs round-robin service across the queues. The scheduler guarantees a minimum bandwidth level to each queue, and if other queues are using less bandwidth, a queue can get more bandwidth than what is configured.
With CQ no single queue has better delay treatment over another. Each queue only gets bandwidth reserved. Therefore CQ is not good for voice traffic which needs low delay.
To reserve bandwidth for each queue, the scheduler takes X bits from each queue during each round-robin pass, where X is proportional to that queue’s configured percentage of the link bandwidth. For example, on a 1Gbps link, a queue with 20% of the bandwidth would have 200 million bits serviced each second. If one of the queues is currently empty, the scheduler can immediately move to the next queue in line. Because the scheduler can effectively “skip over” an empty queue, the other queues have their bit counts serviced faster, which results in more bandwidth for those queues.
MDRR is very similar to CQ but allows a deficit to accumulate on each queue. With CQ, if a packet is bigger than the remaining bit count, CQ sends the packet unfragmented and does not penalize the queue at all.
An example will help explain this. Let’s say there are four queues with 25% bandwidth each, and the scheduler removes 1000 bits from each queue in each interval. If queue four has an 800-bit packet followed by a 400-bit packet, the scheduler removes both in a single interval. The second packet “overran” the bit count and used up an additional 200 bits of bandwidth, at the expense of the other three queues.
With MDRR, queue four would now have a 200-bit deficit. When it is serviced in the next pass, MDRR would only allow 800 bits for this queue’s bit allowance, not 1000 bits.
Now that we understand the basic queuing/scheduling algorithms (FIFO, PQ, CQ and MDRR), we can look at queuing methods that are implemented in IOS.
Weighted Fair Queuing is sometimes called flow-based WFQ. This is because WFQ does not use classes at all for classification of traffic. Instead it automatically detects flows and puts each and every flow in its own queue. The scheduler then serves each queue based on its flow attributes such as the IP Precedence value.
WFQ uses a modified tail drop if a queue becomes full. WFQ will not drop the last packet to arrive, instead it will drop the packet in the queue with the least preference. WFQ gives a packet preference based on its IPP value and the length of a packet. This means a bigger packet in a flow is more likely to be dropped, even if it is not at the end of the queue.
A flow is identified by the layer 3 and layer 4 information, as well as the IP Precedence. Each flow gets its own queue, and by default all queues receive an equal share of bandwidth on the interface. Low-bandwidth flows then experience better quality than high-bandwidth flows, because low-bandwidth flows can better handle just a fraction of the link bandwidth. For example, if there are currently 20 flows for a T1 interface, a call can handle 75Kbps just fine, whereas a web download will suffer.
WFQ gives more bandwidth to flows with a higher IPP value. Bandwidth is shared in proportion to (IPP value + 1), so an IPP 7 flow gets 8 times more bandwidth than an IPP 0 flow, and an IPP 3 flow gets twice the bandwidth of an IPP 1 flow. In our previous example with 20 flows, if 2 flows had an IPP of 5, they would get 270Kbps, and all 18 other flows with no IPP value set (which is IPP 0) would get 54Kbps.
You don’t see WFQ used by itself much these days. On serial interfaces slower than E1, WFQ is enabled by default, but on newer versions of IOS-XE it doesn’t seem that you can even enable WFQ on an interface. Nevertheless, the command to enable WFQ on an interface is fair-queue. Because WFQ isn’t class-based (it is flow-based), there is no policy-map and class-map configuration necessary. You simply enable WFQ on an interface.
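As a minimal sketch, assuming a slow serial interface where the command is still supported:

interface Serial0/0
 ! enable flow-based WFQ on this interface
 fair-queue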
On newer equipment today, you would only use WFQ on a particular class when performing class-based WFQ, which we will look at next.
CBWFQ is what normally comes to mind when people think about QoS on Cisco IOS. CBWFQ uses the concepts of CQ on a per-class basis, in which minimum bandwidth is reserved for each class.
CBWFQ supports both tail drop (the default) and WRED for handling congestion avoidance. CBWFQ supports 64 queues. By default, the class-default queue is always present. In older versions of IOS it seems that WFQ was used by default in the class-default, but on CSR1000v I do not find this to be the case. But if you use WFQ for class-default, this would mean that any traffic not matching a user-configured class uses WFQ. FIFO is used by default in all other queues, in terms of managing that single queue itself. CQ principles are used in terms of scheduling among all queues.
Let’s look at a basic configuration of CBWFQ.
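The sketch below assumes WEB_CLASS and FTP_CLASS match traffic with extended ACLs named WEB and FTP (the ACLs themselves are not shown and the names are placeholders); the 20 and 70 percent reservations match the output described next:

class-map match-all WEB_CLASS
 match access-group name WEB
class-map match-all FTP_CLASS
 match access-group name FTP
!
policy-map CBWFQ_POLICY
 class WEB_CLASS
  ! reserve a minimum of 20% of the interface bandwidth for web traffic
  bandwidth percent 20
 class FTP_CLASS
  ! reserve a minimum of 70% for FTP traffic
  bandwidth percent 70
!
interface GigabitEthernet1
 service-policy output CBWFQ_POLICY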
We can examine the policy-map applied to Gi1 using the following show command:
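Assuming the policy is attached to GigabitEthernet1 as in the sketch above:

show policy-map interface GigabitEthernet1 output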
We can see that each class gets a percentage of the bandwidth configured on the interface. WEB_CLASS gets 200Mbps and FTP_CLASS gets 700Mbps. Also notice that the class-default uses a per-flow queue. By default all classes are given a queue limit of 416 packets; I believe this is platform-dependent.
Let’s see how we can change some of these values:
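A sketch of the kinds of changes described next; the specific values (a 512-packet queue limit and a 500Mbps interface bandwidth) are assumptions for illustration:

policy-map CBWFQ_POLICY
 class FTP_CLASS
  ! change this class's queue depth from the platform default
  queue-limit 512 packets
!
interface GigabitEthernet1
 ! changing the configured interface bandwidth changes what the
 ! percentage reservations translate to in kbps
 bandwidth 500000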
Above we see that the queue limit has been altered for FTP_CLASS, and the bandwidth reserved for each of the classes has been changed because we changed the configured bandwidth of Gi1.
When reserving bandwidth per-class, we have three options:
bandwidth kbps
bandwidth percent percentage
bandwidth remaining percent|ratio value
When using explicit bandwidth on a class in a policy-map such as bandwidth 100000, you cannot use bandwidth percent or bandwidth remaining on another class:
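A sketch of the rejected combination (the policy and class names are placeholders); IOS refuses the second bandwidth statement because absolute and percentage-based reservations cannot be mixed in the same policy-map:

policy-map MIXED_POLICY
 class FTP_CLASS
  bandwidth 100000
 class WEB_CLASS
  ! rejected: this policy-map already uses an absolute bandwidth value
  bandwidth percent 20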
Likewise when using bandwidth percent and bandwidth remaining, you cannot use an explicit bandwidth value.
LLQ is not a separate queuing tool, but instead it is an option for queuing in a specific class with CBWFQ. LLQ is like PQ but with a policer for the queue to prevent the class from starving out other classes.
LLQ enables a class as a priority queue, which will always be serviced first if there are any packets in the queue. To avoid the possibility of this priority queue starving out other classes, a policer is implemented which will drop packets in the queue if the queue exceeds the configured CIR. LLQ is named as such because it gives the priority queue low latency. A packet in the queue does not experience any scheduling delay, because the scheduler services any packets in this queue immediately.
The process to configure a class for LLQ is as follows:
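A minimal sketch, assuming a voice class that should receive strict-priority treatment (the class names and rates are placeholders):

policy-map LLQ_EXAMPLE
 class VOICE_CLASS
  ! priority makes this the LLQ; the value is the policing rate in kbps
  priority 512
 class FTP_CLASS
  ! other classes still use normal CBWFQ bandwidth reservations
  bandwidth percent 40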
As you can see, LLQ is just an option for a class when using CBWFQ. You enable LLQ using the priority keyword and you must specify the bandwidth for the policer.
You can configure multiple classes for LLQ, but this effectively places all traffic in these classes in a single queue.
When using LLQ on a class, you cannot configure WRED or the bandwidth command (which reserves minimum bandwidth for a CBWFQ-based queue).
This is not a queuing mechanism, but instead a congestion avoidance mechanism. Nevertheless it seemed fitting to mention it here. To enable WRED on a class, you use the keyword random-detect. There are several additional options you have:
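The main variants, all entered under a class within a policy-map, are shown below; per-DSCP or per-precedence minimum and maximum thresholds can also be tuned:

random-detect
random-detect precedence-based
random-detect dscp-based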
The most common parameter is probably dscp-based. This makes WRED drop packets with lower DSCP values more aggressively as the queue fills. precedence-based operates on the same principle and is actually the default if you don’t specify anything after random-detect.
Using the topology below, we will test basic CBWFQ configuration.
The server runs both an Apache web server and an FTP server. HostA will download a file via HTTP, and HostB will download a file via FTP.
Because the downloads flow from the server towards the hosts, we can queue outbound on either R2 Gi2 or R1 Gi1. We will manage congestion outbound on R2's Gi2.
We will use the same ACLs as before but add the source ports to be more precise. We’ll reserve 80% of bandwidth for FTP and 20% for web traffic. Additionally we’ll configure Gi2’s bandwidth as 1M. By default the CSR1000v limits throughput to 2Mbps anyway, but by configuring 1M we’ll be able to do calculations more easily.
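A sketch of the lab configuration, assuming the policy is named LAB_POLICY and the ACLs match on the TCP source ports of the server (ACL, class-map, and policy names are placeholders):

ip access-list extended WEB
 permit tcp any eq www any
ip access-list extended FTP
 permit tcp any eq ftp any
 permit tcp any eq ftp-data any
!
class-map match-all WEB_CLASS
 match access-group name WEB
class-map match-all FTP_CLASS
 match access-group name FTP
!
policy-map LAB_POLICY
 class WEB_CLASS
  bandwidth percent 20
 class FTP_CLASS
  bandwidth percent 80
!
interface GigabitEthernet2
 ! 1M configured bandwidth to make the math easy
 bandwidth 1000
 service-policy output LAB_POLICY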
First we’ll see what happens when we only have a single download. We’ll start an HTTP transfer from HostA and leave HostB doing nothing.
As we can see, HostA is achieving very close to 1Mbps throughput. The bandwidth reservation of 20% is not a maximum; it is a guaranteed minimum.
While this download continues, we’ll start an FTP transfer from HostB.
Immediately the FTP transfer takes up 90KB/s * 8 = 720Kbps, around 80% of the interface bandwidth. The web download on HostA has dropped to 23KB/s * 8 = 184Kbps, which is around 20% of the interface bandwidth.
We can check statistics on our policy-map on Gi2 to verify that traffic is hitting our configured classes:
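Assuming the interface and policy names from the sketch above:

show policy-map interface GigabitEthernet2 output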
This output shows us that there are currently 56 packets in the queue for both web and FTP traffic. Web traffic has experienced more drops (323), which makes sense. Its bandwidth is being limited more aggressively, which results in more drops.
If I wait a few more minutes, we can see under the offered rate line of output that FTP is approaching four times the bandwidth of web traffic.
Let’s experiment with LLQ. We’ll configure the web traffic class for LLQ and give it 20 percent bandwidth as the policing rate.
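A sketch of that change, converting WEB_CLASS from a bandwidth reservation to a priority class (a class cannot have both configured at once):

policy-map LAB_POLICY
 class WEB_CLASS
  ! remove the 20% reservation, then enable LLQ policed to 20% of bandwidth
  no bandwidth percent 20
  priority percent 20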
Note that if we try to configure more than 20 percent, IOS gives us an error that we have allocated more than 100 percent of bandwidth in the policy-map:
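For example, with FTP_CLASS already reserving 80 percent, asking for a 30 percent priority rate is rejected because the total would exceed 100 percent of the interface bandwidth:

policy-map LAB_POLICY
 class WEB_CLASS
  ! rejected: 80% + 30% over-allocates the interface
  priority percent 30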
This policy-map is not much different than what we previously configured. Web traffic is still around 22KB/s. The only difference is that the web queue is getting served first, but as soon as the rate is above 20% of bandwidth in the given interval, packet loss occurs.
We can also configure PQ. The configuration simply uses the priority command with no policing rate, optionally together with a priority level. We’ll configure a new policy-map.
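A sketch of that policy-map, assuming web traffic is placed at priority level 1 and FTP at priority level 2 (the policy name is a placeholder):

policy-map PQ_POLICY
 class WEB_CLASS
  ! strict priority with no policer
  priority level 1
 class FTP_CLASS
  priority level 2
!
interface GigabitEthernet2
 no service-policy output LAB_POLICY
 service-policy output PQ_POLICY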
I’ll stop both downloads and only start the FTP download. Even though it is in a lower level priority queue (level 2), it achieves full bandwidth because it is not competing with web traffic right now.
I then stop the FTP transfer, start the web transfer on HostA, then begin the FTP transfer again on HostB. The FTP transfer can barely even get going, because there is so much packet loss. The web transfer has priority so FTP traffic can never get through. Web traffic is starving out the FTP class.
For this reason, PQ is not often implemented.
Cisco QoS Exam Certification Guide, Wendell Odom and Michael Cavanaugh, Ch. 5