QoS Queuing/Congestion Management (Part 4)

Queuing and scheduling are closely related terms. Queuing is responsible for dictating the number of queues and the length of the queues. A longer queue reduces packet loss but increases delay. (A packet might have to wait longer in the queue when the length is longer).

Scheduling defines the algorithm that selects which queue to take the next packet from and places it on the wire. A scheduler can round-robin through the queues in an equal 1:1 ratio, but it can also give some queues more preference than others. The scheduler can give certain queues guaranteed bandwidth as well.

In general, when people refer to queuing algorithms they are talking about queuing and scheduling together. Queuing is really only useful during times of congestion - when more packets are trying to egress the interface than its line rate can carry. Therefore queuing and scheduling are part of congestion management. They are responsible for giving some packets better treatment than others by placing them onto the wire “ahead” of worse traffic.

You may be surprised to learn that when packets are taken from queues, they do not actually get placed directly on the wire. Instead they are put in a hardware queue, called the transmit queue. The transmit queue simply sends packets out the interface whenever there is bandwidth available. The transmit queue does not have a concept of slowing down. It is a basic FIFO queue that simply sends the next packet in line when there is room.
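
On some older hardware platforms the depth of this transmit ring can even be tuned. A sketch, assuming a legacy platform and interface type that support the tx-ring-limit command (many do not):

interface Serial0/0
 tx-ring-limit 3

A smaller transmit ring pushes more of the backlog into the software queues, where the queuing policy can actually act on it.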

So in the diagram above, the queuing method is responsible for the number of queues and management of the queues. The scheduler is responsible for moving packets from these queues onto the hardware transmit queue. The queues created by the queuing method are software queues: they are simply pointers to places in buffers where the packets are held.

Queuing Algorithms

FIFO (First in First out)

FIFO is the most basic queuing method. There is only a single queue, so no scheduling logic is needed. Whenever there is space available to send a packet, the next packet in line is removed from the queue. Tail drop is used if the single queue becomes full. You can adjust the length of the FIFO queue on an interface using the command hold-queue # out.

R1(config)#int gi1
R1(config-if)#hold-queue ?
  <0-240000>  Queue length

R1(config-if)#do sho int gi1 | in queue
  Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0
  Output queue: 0/40 (size/max)

R1(config-if)#hold-queue 100 out

R1(config-if)#do sho int gi1 | in queue
  Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0
  Output queue: 0/100 (size/max)
  • Adjusting the output queue on a CSR1000v from its default of 40 to 100.

Remember that increasing the queue size increases the delay for packets that are now held in the longer queue instead of being dropped. So it is not necessarily beneficial to increase the queue size.

PQ (Priority Queuing)

PQ uses up to four individual queues, with a scheduler that services higher-priority queues unconditionally. The scheduler first looks at the highest priority queue. If there are any packets in that queue, they are placed on the wire. If the queue is empty, the scheduler moves to the next highest priority queue and serves that queue, and so forth. The lowest priority queue will only ever get serviced if all higher priority queues are empty.

The downside to using PQ is that the highest priority queues can use 100% of the bandwidth and completely starve out lower priority queues.
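
For reference, PQ was configured on classic IOS with the legacy priority-list command and applied to an interface with priority-group. A minimal sketch, assuming a legacy IOS image and an ACL 100 that matches the traffic to prioritize (modern IOS-XE no longer supports this):

priority-list 1 protocol ip high list 100
priority-list 1 default low
!
interface Serial0/0
 priority-group 1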

CQ (Custom Queuing)

CQ allows for up to 16 queues. The scheduler reserves a percentage of the link bandwidth for each queue and performs round-robin service across the queues. The scheduler guarantees a minimum bandwidth level to each queue, and if other queues are using less bandwidth, a queue can get more bandwidth than what is configured.

With CQ no single queue gets better delay treatment than another. Each queue only gets a bandwidth reservation. Therefore CQ is not good for voice traffic, which needs low delay.

To reserve bandwidth for each queue, the scheduler takes X bits from each queue during each round-robin pass, where X = percentage × link bandwidth. For example, on a 1Gbps link, a queue with 20% bandwidth would have 200M bits serviced each second. If one of the queues is currently empty, the scheduler can immediately move to the next queue in line. Because the scheduler can effectively “skip over” an empty queue, the remaining queues have their bit counts serviced faster, which results in more bandwidth for those queues.
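
On classic IOS, CQ used the legacy queue-list syntax, where the per-pass allowance described above is expressed as a byte count per queue. A rough sketch, assuming a legacy image and ACLs 100 and 101 for classification:

queue-list 1 protocol ip 1 list 100
queue-list 1 protocol ip 2 list 101
queue-list 1 default 3
queue-list 1 queue 1 byte-count 3000
queue-list 1 queue 2 byte-count 1500
!
interface Serial0/0
 custom-queue-list 1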

MDRR (Modified Deficit Round-Robin)

MDRR is very similar to CQ but allows a deficit to accumulate on each queue. With CQ, if a packet is bigger than the remaining bit count, CQ sends the packet unfragmented and does not penalize the queue at all.

An example will help explain this. Let’s say there are four queues with 25% bandwidth each, and the scheduler removes 1000 bits from each queue for each interval. If queue four has an 800 bit packet and then 400 bit packet, the scheduler removes both in a single interval. The second packet that “overran” the bit count used up additional bandwidth (200 bits) which the three other queues will suffer from.

With MDRR, queue four would now have a -200bit deficit. When it is serviced in the next pass, MDRR would only use 800 bits for this queue’s bit allowance, not 1000 bits.
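
Using the numbers above, the per-pass accounting for queue four looks like this:

Pass 1: allowance = 1000 bits; sent 800 + 400 = 1200 bits; deficit = 200 bits
Pass 2: allowance = 1000 - 200 = 800 bits; deficit repaid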

Queuing Methods and Configuration

Now that we understand the basic queuing/scheduling algorithms (FIFO, PQ, CQ and MDRR), we can look at queuing methods that are implemented in IOS.

WFQ (Weighted Fair Queuing)

Weighted Fair Queuing is sometimes called flow-based WFQ. This is because WFQ does not use classes at all for classification of traffic. Instead it automatically detects flows and puts each and every flow in its own queue. The scheduler then serves each queue based on its flow attributes such as the IP Precedence value.

WFQ uses a modified tail drop if a queue becomes full. WFQ will not necessarily drop the last packet to arrive; instead it drops the packet with the least preference. WFQ gives a packet preference based on its IPP value and its length. This means a bigger packet in a flow is more likely to be dropped, even if it is not at the end of the queue.

A flow is identified by the layer 3 and layer 4 information, as well as the IP Precedence. Each flow gets its own queue, and by default all queues receive an equal share of bandwidth on the interface. Low-bandwidth flows then experience better quality than high-bandwidth flows, because low-bandwidth flows can better handle just a fraction of the link bandwidth. For example, if there are currently 20 flows on a T1 interface, each flow gets roughly 75Kbps (about 1.5Mbps ÷ 20); a voice call can handle 75Kbps just fine, whereas a web download will suffer.

WFQ gives more bandwidth to flows with a higher IPP value. Each flow's share is proportional to (IPP + 1), so IPP 7 gets 8 times more bandwidth than IPP 0, and IPP 3 gets twice the bandwidth of IPP 1. In our previous example with 20 flows, if 2 flows had an IPP of 5, the link would be split into 18×1 + 2×6 = 30 shares of roughly 50Kbps each: the two IPP 5 flows would get about 300Kbps, and the 18 other flows with no IPP value (which is IPP 0) would get about 50Kbps.

You don’t see WFQ used by itself much these days. On serial interfaces slower than E1 speed, WFQ is enabled by default, but on newer versions of IOS-XE it doesn’t seem that you can even enable WFQ on an interface. Nevertheless, the command to enable WFQ on an interface is fair-queue. Because WFQ is flow-based rather than class-based, there is no policy-map or class-map configuration necessary. You simply enable WFQ on an interface.
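
On a platform that still supports it, enabling and tuning WFQ would look something like this. This is a sketch; the optional parameters are the congestive discard threshold, the number of dynamic flow queues, and the number of reservable queues:

interface Serial0/0
 fair-queue 64 256 0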

On newer equipment today, you would only use WFQ on a particular class when performing class-based WFQ, which we will look at next.

CBWFQ (Class-Based Weighted Fair Queuing)

CBWFQ is what normally comes to mind when people think about QoS on Cisco IOS. CBWFQ uses the concepts of CQ on a per-class basis, in which minimum bandwidth is reserved for each class.

CBWFQ supports both tail drop (the default) and WRED for congestion avoidance. CBWFQ supports 64 queues, and the class-default queue is always present. In older versions of IOS it seems that WFQ was used by default in class-default, but on the CSR1000v I do not find this to be the case. If you do use WFQ for class-default, any traffic not matching a user-configured class uses WFQ. FIFO is used by default within every other queue, in terms of managing that single queue itself, while CQ principles are used for scheduling among all the queues.

Let’s look at a basic configuration of CBWFQ.

ip access-list extended WEB_TRAFFIC
 permit tcp any any eq 80
 permit tcp any any eq 443
!
ip access-list extended FTP_TRAFFIC
 permit tcp any any eq 20
 permit tcp any any eq 21
!
class-map WEB_CLASS
 match access-group name WEB_TRAFFIC
!
class-map FTP_CLASS
 match access-group name FTP_TRAFFIC
!
policy-map CBWFQ
 class WEB_CLASS
  bandwidth percent 20
 class FTP_CLASS
  bandwidth percent 70
 class class-default
  fair-queue
!
int Gi1
 service-policy output CBWFQ

We can examine the policy-map applied to Gi1 using the following show command:

R1#show policy-map int gi1
 GigabitEthernet1 

  Service-policy output: CBWFQ

    Class-map: WEB_CLASS (match-all)  
      0 packets, 0 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: access-group name WEB_TRAFFIC
      Queueing
      queue limit 416 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0
      bandwidth 20% (200000 kbps)

    Class-map: FTP_CLASS (match-all)  
      0 packets, 0 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: access-group name FTP_TRAFFIC
      Queueing
      queue limit 416 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0
      bandwidth 70% (700000 kbps)
          
    Class-map: class-default (match-any)  
      273 packets, 22850 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: any 
      Queueing
      queue limit 416 packets
      (queue depth/total drops/no-buffer drops/flowdrops) 0/0/0/0
      (pkts output/bytes output) 50/3242
      Fair-queue: per-flow queue limit 104 packets

We can see that each class gets a percentage of the bandwidth configured on the interface: WEB_CLASS gets 200Mbps and FTP_CLASS gets 700Mbps. Also notice that class-default uses a per-flow queue. By default all classes are given a queue limit of 416 packets; I believe this is platform-dependent.

Let’s see how we can change some of these values:

int Gi1
 bandwidth 500000
!
policy-map CBWFQ
 class FTP_CLASS
  queue-limit 500 packets

R1#show policy-map int gi1
 GigabitEthernet1 

  Service-policy output: CBWFQ

    Class-map: WEB_CLASS (match-all)  
      0 packets, 0 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: access-group name WEB_TRAFFIC
      Queueing
      queue limit 416 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0
      bandwidth 20% (100000 kbps)

    Class-map: FTP_CLASS (match-all)  
      0 packets, 0 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: access-group name FTP_TRAFFIC
      Queueing
      queue limit 500 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0
      bandwidth 70% (350000 kbps)
          

    Class-map: class-default (match-any)  
      337 packets, 28265 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: any 
      Queueing
      queue limit 416 packets
      (queue depth/total drops/no-buffer drops/flowdrops) 0/0/0/0
      (pkts output/bytes output) 64/4189
      Fair-queue: per-flow queue limit 104 packets


Above we see that the queue limit has been altered for FTP_CLASS, and the bandwidth reserved for each class has changed because we changed the configured bandwidth of Gi1.

When reserving bandwidth per-class, we have three options:

  • bandwidth kbps

  • bandwidth percent percentage

  • bandwidth remaining percent|ratio value

When using an explicit bandwidth value on a class in a policy-map, such as bandwidth 100000, you cannot use bandwidth percent or bandwidth remaining on another class:

R1(config)#policy-map test
R1(config-pmap)#class FTP_CLASS
R1(config-pmap-c)#bandwidth 1000
R1(config-pmap-c)#class WEB_CLASS
R1(config-pmap-c)#bandwidth percent 40
Mixed bandwidth types are not supported. Pls configure bandwidth command either in kbps, percent, remaining percent or remaining ratio but not mixed
R1(config-pmap-c)#bandwidth remaining percent 40
Mixed bandwidth types are not supported. Pls configure bandwidth command either in kbps, percent, remaining percent or remaining ratio but not mixed

Likewise, when using bandwidth percent or bandwidth remaining, you cannot use an explicit bandwidth value.
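
A valid policy must therefore stick to a single form throughout. For example, a sketch using bandwidth remaining percent on both classes (the policy-map name and percentages are illustrative):

policy-map REMAINING_TEST
 class FTP_CLASS
  bandwidth remaining percent 40
 class WEB_CLASS
  bandwidth remaining percent 40

The remaining keyword divides whatever bandwidth is left over after any priority classes have been served.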

LLQ (Low Latency Queuing)

LLQ is not a separate queuing tool, but instead it is an option for queuing in a specific class with CBWFQ. LLQ is like PQ but with a policer for the queue to prevent the class from starving out other classes.

LLQ enables a class as a priority queue, which will always be serviced first if there are any packets in the queue. To avoid the possibility of this priority queue starving out other classes, a policer is implemented which will drop packets in the queue if the queue exceeds the configured CIR. LLQ is named as such because it gives the priority queue low latency. A packet in the queue does not experience any scheduling delay, because the scheduler services any packets in this queue immediately.

The process to configure a class for LLQ is as follows:

policy-map CBWFQ
 class voice-class
  priority {bandwidth-kbps | percent percentage} [burst]

As you can see, LLQ is just an option for a class when using CBWFQ. You enable LLQ using the priority keyword and you must specify the bandwidth for the policer.

You can configure multiple classes for LLQ, but this effectively places all traffic in these classes in a single queue.
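
On IOS-XE you can instead assign separate priority levels so that two priority classes keep distinct queues, as we will do in the lab below. A sketch, assuming hypothetical VOICE and VIDEO classes:

policy-map MULTI_PQ
 class VOICE
  priority level 1
 class VIDEO
  priority level 2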

When using LLQ on a class, you cannot configure WRED or the bandwidth command (which reserves minimum bandwidth for a CBWFQ-based queue).

Enabling WRED

This is not a queuing mechanism but a congestion avoidance mechanism. Nevertheless, it seems fitting to mention it here. To enable WRED on a class, you use the keyword random-detect. There are several additional options:

R2(config-pmap-c)#random-detect ?
  discard-class                   parameters for each discard-class value
  discard-class-based             Enable discard-class-based WRED as drop
                                  policy
  dscp                            parameters for each dscp value
  dscp-based                      Enable dscp-based WRED as drop policy
  ecn                             explicit congestion notification
  exponential-weighting-constant  weight for mean queue depth calculation
  precedence                      parameters for each precedence value
  precedence-based                Enable precedence-based WRED as drop policy
  <cr>                            <cr>

The most common parameter is probably dscp-based. This will drop packets with lower DSCP values more aggressively as the queue fills. precedence-based operates on the same principle and is actually the default if you don’t specify anything after random-detect.
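
For example, a sketch enabling DSCP-based WRED on the FTP class and overriding the thresholds for AF21 (the minimum threshold, maximum threshold, and mark-probability denominator shown are illustrative values, not recommendations):

policy-map CBWFQ
 class FTP_CLASS
  random-detect dscp-based
  random-detect dscp af21 30 64 10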

Lab

Using the topology below, we will test basic CBWFQ configuration.

The server runs both an Apache web server and an FTP server. HostA will download a file via HTTP, and HostB will download a file via FTP.

On R2 we will manage congestion on Gi2 outbound. Because the downloads flow from the server towards the hosts, we could queue outbound on either R2 Gi2 or R1 Gi1.

We will use the same ACLs as before but add matches on the source ports as well, to be more precise. We’ll reserve 80% of bandwidth for FTP and 20% for web traffic. Additionally we’ll configure Gi2’s bandwidth as 1M. By default the CSR1000v limits throughput to 2Mbps anyway, but by configuring 1M we’ll be able to do calculations more easily.

#R2
ip access-list extended FTP_TRAFFIC
 10 permit tcp any any eq ftp-data
 20 permit tcp any any eq ftp
 30 permit tcp any eq ftp-data any
 40 permit tcp any eq ftp any
ip access-list extended WEB_TRAFFIC
 10 permit tcp any any eq www
 20 permit tcp any any eq 443
 30 permit tcp any eq www any
 40 permit tcp any eq 443 any
!
class-map match-all WEB_CLASS
 match access-group name WEB_TRAFFIC
class-map match-all FTP_CLASS
 match access-group name FTP_TRAFFIC
!
policy-map CBWFQ
 class WEB_CLASS
  bandwidth percent 20 
 class FTP_CLASS
  bandwidth percent 80
!
int Gi2
 bandwidth 1000
 service-policy output CBWFQ

First we’ll see what happens when we only have a single download. We’ll start an HTTP transfer from HostA and leave HostB doing nothing.

cisco@hostA:~$ wget http://10.0.2.2/file.exe
--2022-11-12 15:00:12--  http://10.0.2.2/file.exe
Connecting to 10.0.2.2:80... connected.

file.exe.4            0%[                    ]   1.92M   114KB/s    eta 45m 30s

As we can see, HostA is achieving close to 1Mbps of throughput (114KB/s × 8 ≈ 912Kbps). The bandwidth reservation of 20% is not a maximum; it is a minimum guaranteed bandwidth.

While this download continues, we’ll start an FTP transfer from HostB.

wget ftp://cisco:cisco@10.0.2.2:21/file.exe --no-passive-ftp
--2022-11-12 15:01:31--  ftp://cisco:*password*@10.0.2.2/file.exe
           => ‘file.exe.4’
Connecting to 10.0.2.2:21... connected.

file.exe.4            0%[                    ]   1.51M  90.2KB/s    eta 61m 18s

#On HostA
file.exe.4            3%[                    ]  10.80M  23.4KB/s    eta 77m 50s

Immediately the FTP transfer takes up 90*8 = 720Kbps, around 80% of the interface bandwidth. The web download on HostA has dropped to 23*8 = 184Kbps, which is around 20% of the interface bandwidth.

We can check statistics on our policy-map on Gi2 to verify that traffic is hitting our configured classes:

R2#show policy-map int gi2
 GigabitEthernet2 

  Service-policy output: CBWFQ

    Class-map: WEB_CLASS (match-all)  
      9056 packets, 13707896 bytes
      5 minute offered rate 231000 bps, drop rate 13000 bps
      Match: access-group name WEB_TRAFFIC
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 56/323/0
      (pkts output/bytes output) 8733/13218874
      bandwidth 20% (200 kbps)

    Class-map: FTP_CLASS (match-all)  
      8781 packets, 13271592 bytes
      5 minute offered rate 277000 bps, drop rate 12000 bps
      Match: access-group name FTP_TRAFFIC
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 56/231/0
      (pkts output/bytes output) 8550/12921858
      bandwidth 80% (800 kbps)
          
    Class-map: class-default (match-any)  
      40 packets, 4560 bytes
      5 minute offered rate 0000 bps, drop rate 0000 bps
      Match: any 
      
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0

This output shows us that there are currently 56 packets in the queue for both web and FTP traffic. Web traffic has experienced more drops (323), which makes sense: its bandwidth is being limited more aggressively, which results in more drops.

If I wait a few more minutes, we can see from the offered rate lines of the output that FTP is approaching four times the bandwidth of web traffic.

R2#show policy-map int gi2 | in Class-map|offered
    Class-map: WEB_CLASS (match-all)  
      5 minute offered rate 222000 bps, drop rate 13000 bps
    Class-map: FTP_CLASS (match-all)  
      5 minute offered rate 734000 bps, drop rate 16000 bps
    Class-map: class-default (match-any)  
      5 minute offered rate 0000 bps, drop rate 0000 bps

Let’s experiment with LLQ. We’ll configure the web traffic class for LLQ and give it 20 percent bandwidth as the policing rate.

policy-map CBWFQ
 class WEB_CLASS
  no bandwidth percent 20
  priority percent 20

Note that if we try to configure more than 20 percent, IOS gives us an error that we have allocated more than 100 percent of bandwidth in the policy-map:

R2(config-pmap-c)#priority percent 30
Sum total of class bandwidths exceeds 100 percent

This policy-map does not behave much differently from what we previously configured. Web traffic is still around 22KB/s. The only difference is that the web queue is served first, but as soon as its rate exceeds 20% of the bandwidth in a given interval, the policer drops packets.

We can also configure PQ-like behavior. The configuration simply uses the priority command with a priority level and no policing rate. We’ll configure a new policy-map.

policy-map PQ_TEST
 class WEB_CLASS
  priority level 1
 class FTP_CLASS
  priority level 2
!
int Gi2
 no service-policy output CBWFQ
 service-policy output PQ_TEST

I’ll stop both downloads and start only the FTP download. Even though it is in a lower priority queue (level 2), it achieves full bandwidth because it is not competing with web traffic right now.

cisco@hostB:~$ wget ftp://cisco:cisco@10.0.2.2:21/file.exe --no-passive-ftp
--2022-11-12 15:30:07--  ftp://cisco:*password*@10.0.2.2/file.exe
           => ‘file.exe.5’
Connecting to 10.0.2.2:21... connected.

file.exe.5            1%[                    ]   4.55M   114KB/s    eta 46m 33s

I then stop the FTP transfer, start the web transfer on HostA, and then begin the FTP transfer again on HostB. The FTP transfer can barely get going because there is so much packet loss. The web transfer has priority, so FTP traffic can never get through: web traffic is starving out the FTP class.

cisco@hostB:~$ wget ftp://cisco:cisco@10.0.2.2:21/file.exe --no-passive-ftp
--2022-11-12 15:34:32--  ftp://cisco:*password*@10.0.2.2/file.exe
           => ‘file.exe.6’
Connecting to 10.0.2.2:21... connected.
Logging in as cisco ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD not needed.
==> SIZE file.exe ... 337641472
==> PORT ... done.    ==> RETR file.exe ... done.
Length: 337641472 (322M) (unauthoritative)

file.exe.6            0%[                    ]  14.14K  --.-KB/s    eta 2d 2h

For this reason, PQ is not often implemented.

Further Reading/Watching

Cisco QoS Exam Certification Guide, Wendell Odom and Michael J. Cavanaugh, Ch. 5

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/qos_conmgt/configuration/xe-17/qos-conmgt-xe-17-book/qos-conmgt-oview.html

https://youtu.be/hIVx_0qStGM?t=1441
