PIM-SM SPT Switchover

In the previous article, we covered PIM-SM. We saw that in certain cases traffic does not follow the optimal path, because it must flow through the RP. With the SPT (shortest-path tree) switchover feature, an LHR (last-hop router) can switch over to a shortest-path tree rooted at the source.

To learn the source, the LHR must first receive traffic via the RP. The LHR discovers the source when the first packet arrives via the shared tree. Once it knows the source, it can switch over to a source-based tree. This allows the LHR to bypass the RP and solves the non-optimal path issue.

Lab

We’ll reuse our topology from the last article. To start, ensure that no host has an ip igmp join-group statement configured on its Gi0/0 interface.
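
As a sketch of that cleanup, the following checks a host for leftover joins and removes one. The group address is a placeholder, since the groups joined in the previous lab are not listed here; substitute whatever the show command reports.

#Host1 (repeat on each host)
show running-config interface gi0/0 | include join-group
conf t
int gi0/0
 no ip igmp join-group <group-address>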

We’ll re-create the issue we found at the end of the previous article. Configure Host4 to join the group 239.100.100.100.

#Host4
int gi0/0
 ip igmp join-group 239.100.100.100

If Source1 sends a ping to this group address, the traffic must first flow to the RP, and then from the RP down the shared tree. With SPT switchover enabled on R5, R5 switches to a source-based tree as soon as it receives the first packet, bypassing the RP.

SPT switchover is actually enabled by default. We turned it off in the previous lab so that we could understand basic PIM-SM behavior on the shared tree. To turn SPT switchover back on, use the following command:

#R5
ip pim spt-threshold 0

The SPT threshold is the traffic rate at which the router switches to the source-based tree. In older versions of IOS you could specify a rate in kbps. In modern IOS-XE, you can only set the threshold to 0 or infinity. 0 means switch over to the source-based tree as soon as the first packet is received from a source. Infinity means never switch over. The default is 0.
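
If you want to compare the two behaviors later, the threshold can be flipped back and forth on R5. This is a sketch assuming the previous lab disabled switchover with the infinity keyword, as described above; lines starting with ! are comments.

#R5
! Never leave the shared tree
ip pim spt-threshold infinity
! Switch to the SPT on the first packet (the default)
ip pim spt-threshold 0

Only one value is active at a time, so the last one entered is the one in effect.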

Ping the group from Source1 again:

Source#ping 239.100.100.100 repeat 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 239.100.100.100, timeout is 2 seconds:

Reply to request 0 from 10.10.200.10, 7 ms
Reply to request 1 from 10.10.200.10, 7 ms
Reply to request 1 from 10.10.200.10, 38 ms
Reply to request 2 from 10.10.200.10, 4 ms

R5 now has both the (*, G) and (S, G) entries.

R5#show ip mroute 239.100.100.100

(*, 239.100.100.100), 02:11:25/stopped, RP 6.6.6.6, flags: SJC
  Incoming interface: GigabitEthernet3, RPF nbr 10.4.5.4
  Outgoing interface list:
    GigabitEthernet2, Forward/Sparse, 02:11:25/00:02:55

(10.10.10.10, 239.100.100.100), 00:00:08/00:02:51, flags: JT
  Incoming interface: GigabitEthernet1, RPF nbr 10.1.5.1
  Outgoing interface list:
    GigabitEthernet2, Forward/Sparse, 00:00:08/00:02:55

Packet Walkthrough

The following steps took place just now:

Step 1

The RP forwards the ping down the shared tree. R5 receives the packet on Gi3. R5 learns the source IP from this first packet.

Step 2

R5 immediately switches over to the source-based tree. R5 sends a PIM Join out Gi1 for (10.10.10.10, 239.100.100.100), as Gi1 is the RPF interface for 10.10.10.10.
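
To confirm which interface R5 will use for that Join, you can check the RPF lookup for the source. Based on the mroute output above, the expected result is Gi1 with RPF neighbor 10.1.5.1. The exact output format varies by release, so it is not reproduced here.

#R5
show ip rpf 10.10.10.10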

Packet capture taken on Gi1:

Step 3

ICMP sequence 1 (the 2nd packet) arrives on both Gi3 and Gi1, which is why the ping output above shows two replies to request 1. R5 can now Prune itself from the shared tree, but only for the source 10.10.10.10. If R5 did not Prune itself, it would keep receiving duplicate traffic on both Gi3 and Gi1. R5 may still need to receive traffic to this group from other sources, so it can’t Prune itself from the shared tree altogether.

R5 sends a PIM Prune for source 10.10.10.10 with the RP-bit set. Since the RP-bit is set, this means the Prune is for the shared (*, G) tree. Because the source is specified, it means that R5 wants to be Pruned off the shared tree just for the source 10.10.10.10. In practice this means that routers along the shared tree path will remove their downstream interface facing the LHR from the (10.10.10.10, 239.100.100.100) entry. These routers will leave this interface in their OIL for the parent (*, 239.100.100.100) entry.
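
If you want to watch these messages without a packet capture, PIM debugging on R4 is one option. Treat this as a lab-only sketch. The debug output format differs between releases, so none is shown here.

#R4
debug ip pim 239.100.100.100
! ping from Source1 again, then turn debugging off
undebug all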

Step 4

R4 removes Gi4 from the OIL for (10.10.10.10, 239.100.100.100). R4 has no other interfaces in the OIL for this entry, so it sends a PIM Prune up to R6. Gi4 is still in the OIL for the parent (*, G) entry.

R4#show ip mroute 239.100.100.100

(*, 239.100.100.100), 01:56:18/00:03:20, RP 6.6.6.6, flags: S
  Incoming interface: GigabitEthernet3, RPF nbr 10.4.6.6
  Outgoing interface list:
    GigabitEthernet4, Forward/Sparse, 01:56:18/00:03:20

(10.10.10.10, 239.100.100.100), 00:05:53/00:00:23, flags: PR
  Incoming interface: GigabitEthernet3, RPF nbr 10.4.6.6
  Outgoing interface list: Null

Step 5

R6 removes Gi3 from the OIL for (10.10.10.10, 239.100.100.100).

R6#show ip mroute 239.100.100.100

(*, 239.100.100.100), 01:56:56/stopped, RP 6.6.6.6, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet3, Forward/Sparse, 01:56:56/00:02:43
          
(10.10.10.10, 239.100.100.100), 00:00:44/00:02:19, flags: PT
  Incoming interface: GigabitEthernet1, RPF nbr 10.1.6.1
  Outgoing interface list: Null

Traffic continues to flow from the source to R5 via the shortest path tree.

Non-LHR Switchover

What happens if the "diversion point" is upstream of the LHR? For example, join 239.101.101.101 on Host1. When traffic is initiated from the source, it flows via the RP as normal. The RP sends the traffic down the shared tree to R2. R2 sends the traffic down the shared tree to R3. When R3 (the LHR) learns of the source, it switches to an (S, G) tree. However, it still receives the traffic on the same interface (Gi1) whether it is receiving on the SPT or the shared tree.

The "diversion point" here is R2 instead of R3 (the LHR). When R2 receives the (S, G) Join from R3, it sends its own PIM Join out Gi1 for (S, G). Duplicate traffic now arrives via both Gi1 (SPT) and Gi4 (RPT or shared tree). It is now R2's responsibility to prune itself off the (*, G) tree for this source.

With the spt-threshold set to 0 on R3, the source begins sending to 239.101.101.101:

  • R3 learns of the source from the first packet it receives via the shared tree (the traffic reached the RP through the PIM Register process).

  • R3 initiates an SPT switchover and sends an (S, G) PIM Join for (10.10.10.10, 239.101.101.101) out Gi1.

  • R3 does not prune itself off the RPT for the (S, G), because both entries have the same incoming interface. (R3 receives the traffic on Gi1 either way.)

  • R2 receives the PIM Join, finds that the RPF interface for the source 10.10.10.10 is Gi1, and sends a PIM Join for (10.10.10.10, 239.101.101.101) out Gi1.

  • R2 also notices that this same group has a (*, G) entry with an incoming interface of Gi4. R2 sends an (S, G) RPT Prune for (10.10.10.10, 239.101.101.101) on the shared tree towards the RP, out Gi4. This happens whether or not the spt-threshold is set to 0 on R2.

So we can say that the (S, G) RPT Prune only happens on routers where the (S, G) entry has a different incoming interface than the associated (*, G) entry.
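
One way to check this rule in the lab is to compare the incoming interfaces of the two entries on each router. Based on the walkthrough above, R3 should show Gi1 for both entries, while R2 should show Gi4 for (*, G) and Gi1 for (S, G).

#R2 and R3
show ip mroute 239.101.101.101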

Conclusion

Without SPT switchover, traffic would follow a non-optimal path via the RP. SPT switchover allows an LHR to bypass the RP in the data path.

SPT switchover is on by default. This means that the RP is really only used for the initial traffic flow. After the first packet, traffic flows via a shortest path tree rooted at the source.
