PIM-SM was developed for intradomain operation. If two separate autonomous systems want to route multicast traffic between their domains, they must use MSDP (Multicast Source Discovery Protocol) to solve an issue with source discovery.
The inherent issue with interdomain multicast is that an AS should have control over its own RPs; it does not want its internal routers to use the RP of a remote domain. This poses a problem for source discovery. In intradomain PIM-SM, the FHR Registers the source with its RP. In interdomain PIM-SM, we need another mechanism for source registration, because we don’t want FHRs to Register with an RP in another domain.
With MSDP, the RPs in each domain peer with each other. When an RP sees a PIM Register, it sends an SA (Source Active) message to all of its MSDP peers to notify them of the source. This allows the other RP(s) to create an SPT rooted at the source and pull the traffic in across the interdomain boundary. In effect, the SA replaces the PIM Register; in fact, the multicast traffic is encapsulated in the SA message just as in a PIM Register message. The SA message is re-sent every 60 seconds for as long as the source remains active, to keep the entry from timing out on remote RPs.
MSDP uses TCP port 639 and functions sort of like a "lite" version of BGP: peers are explicitly configured, and the session is maintained via keepalives.
In addition to MSDP, you can optionally use MP-BGP to exchange reachability information for the sources across the domain boundary. The AFI is ipv4 multicast. The prefixes of the sources are advertised, allowing routers in the receiving domain to create an (S, G) SPT rooted at a source in another domain. The prefixes learned over ipv4 multicast are used for the RPF check. These prefixes are not present in the unicast RIB; instead they are present in a separate mcast table. Using MP-BGP is not required for interdomain multicast to work. As long as a route to the source is present, traffic will work. The RPF check can use either the unicast table or the mcast table (which is populated via BGP).
This article will cover interdomain PIM-SM. Interdomain PIM-SSM is a much easier alternative: you don’t need to run MSDP because RPs are not required in the first place. You simply need reachability to the sources for PIM-SSM to work.
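For comparison only (SSM is not used in the rest of this lab), here is a minimal PIM-SSM sketch. The ip pim ssm default command enables SSM for the default 232.0.0.0/8 range, and receiver-facing interfaces need IGMPv3 so hosts can signal (S, G) memberships directly:
#Any router (sketch, not part of this lab)
ip pim ssm default
!
int Gi2
! Receiver-facing interface
ip pim sparse-mode
ip igmp version 3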
Lab
We will use the following topology. Domain 1 runs OSPF and Domain 2 runs ISIS. Configure R3 and R6 as the RP for their domains using BSR.
Here are the startup configs:
#R1
hostname R1
!
ip multicast-routing distributed
!
int Gi1
ip address 10.10.13.1 255.255.255.0
no shut
ip ospf network point-to-point
ip pim sparse-mode
!
int Gi2
ip address 10.10.100.1 255.255.255.0
ip pim sparse-mode
no shut
!
int Lo0
ip address 1.1.1.1 255.255.255.255
!
router ospf 1
network 0.0.0.0 255.255.255.255 area 0
passive-interface Gi2
#R2
hostname R2
!
ip multicast-routing distributed
!
int Gi1
ip address 10.10.10.1 255.255.255.0
no shut
ip pim sparse-mode
!
int Gi2
ip address 10.10.23.2 255.255.255.0
ip pim sparse-mode
ip ospf network point-to-point
no shut
!
int Lo0
ip address 2.2.2.2 255.255.255.255
!
router ospf 1
network 0.0.0.0 255.255.255.255 area 0
passive-interface Gi1
#R3
hostname R3
!
ip multicast-routing distributed
!
int Gi1
ip address 10.10.23.3 255.255.255.0
no shut
ip pim sparse-mode
ip ospf network point-to-point
!
int Gi2
ip address 10.10.13.3 255.255.255.0
ip pim sparse-mode
ip ospf network point-to-point
no shut
!
int Gi3
ip address 10.10.34.3 255.255.255.0
ip pim sparse-mode
ip ospf network point-to-point
no shut
!
int Lo0
ip address 3.3.3.3 255.255.255.255
ip pim sparse-mode
!
ip pim bsr-candidate lo0
ip pim rp-candidate lo0
!
router ospf 1
network 0.0.0.0 255.255.255.255 area 0
#R4
hostname R4
!
ip multicast-routing distributed
!
int Gi1
ip address 10.10.34.4 255.255.255.0
no shut
ip pim sparse-mode
ip ospf network point-to-point
!
int Gi2
ip address 10.4.5.4 255.255.255.0
no shut
!
int Lo0
ip address 4.4.4.4 255.255.255.255
!
router ospf 1
network 0.0.0.0 255.255.255.255 area 0
passive-interface Gi2
#R5
hostname R5
!
ip multicast-routing distributed
!
int Gi1
ip address 10.20.56.5 255.255.255.0
no shut
ip pim sparse-mode
ip router isis
isis network point-to-point
!
int Gi2
ip address 10.4.5.5 255.255.255.0
no shut
!
int Lo0
ip address 5.5.5.5 255.255.255.255
ip router isis
!
router isis
net 49.0001.0000.0000.0005.00
is-type level-2-only
passive-interface Gi2
#R6
hostname R6
!
ip multicast-routing distributed
!
int Gi1
ip address 10.20.56.6 255.255.255.0
no shut
ip pim sparse-mode
ip router isis
isis network point-to-point
!
int Gi2
ip address 10.20.67.6 255.255.255.0
no shut
ip pim sparse-mode
ip router isis
isis network point-to-point
!
int Lo0
ip address 6.6.6.6 255.255.255.255
ip router isis
ip pim sparse-mode
!
ip pim bsr-candidate lo0
ip pim rp-candidate lo0
!
router isis
net 49.0001.0000.0000.0006.00
is-type level-2-only
#R7
hostname R7
!
ip multicast-routing distributed
!
int Gi1
ip address 10.20.67.7 255.255.255.0
no shut
ip pim sparse-mode
ip router isis
isis network point-to-point
!
int Gi2
ip address 10.20.100.1 255.255.255.0
ip pim sparse-mode
no shut
!
int Lo0
ip address 7.7.7.7 255.255.255.255
ip router isis
!
router isis
net 49.0001.0000.0000.0007.00
is-type level-2-only
passive-interface Gi2
#Source1
hostname Source1
!
no ip domain lookup
!
int Gi1
ip address 10.10.10.10 255.255.255.0
no shut
!
ip route 0.0.0.0 0.0.0.0 10.10.10.1
#Host1
hostname Host1
!
int Gi0/0
ip address 10.10.100.10 255.255.255.0
ip igmp join-group 239.1.1.1
no shut
!
ip route 0.0.0.0 0.0.0.0 10.10.100.1
#Host2
hostname Host2
!
int Gi0/0
ip address 10.20.100.10 255.255.255.0
ip igmp join-group 239.1.1.1
no shut
!
ip route 0.0.0.0 0.0.0.0 10.20.100.1
Both Host1 and Host2 have joined the group 239.1.1.1. However, traffic from Source1 is only reaching Host1 right now.
Source1#ping 239.1.1.1 repeat 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:
Reply to request 0 from 10.10.100.10, 47 ms
Reply to request 1 from 10.10.100.10, 6 ms
Reply to request 1 from 10.10.100.10, 39 ms
Reply to request 2 from 10.10.100.10, 5 ms
In order for Host2 to receive the traffic, we must do two things:
Advertise routes for the RPs and sources into each domain via eBGP at the ASBRs (R4 and R5)
Configure MSDP peering between the two RPs
First we’ll enable eBGP between R4 and R5 and advertise the RP and Source prefixes. Configure R3 and R6 to be the RR in each domain.
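Here is one possible version of that configuration. This is a sketch: the AS numbers (100 for Domain 1, 200 for Domain 2) match the MSDP output shown later, the iBGP sessions run between loopbacks, and the ASBRs set next-hop-self; the exact neighbor statements in your lab may differ. R1 and R2 could peer with R3 in the same way R7 peers with R6, if they also need the interdomain routes.
#R4
router bgp 100
neighbor 10.4.5.5 remote-as 200
neighbor 3.3.3.3 remote-as 100
neighbor 3.3.3.3 update-source lo0
neighbor 3.3.3.3 next-hop-self
network 10.10.10.0 mask 255.255.255.0
network 3.3.3.3 mask 255.255.255.255
#R5
router bgp 200
neighbor 10.4.5.4 remote-as 100
neighbor 6.6.6.6 remote-as 200
neighbor 6.6.6.6 update-source lo0
neighbor 6.6.6.6 next-hop-self
network 6.6.6.6 mask 255.255.255.255
#R3
router bgp 100
neighbor 4.4.4.4 remote-as 100
neighbor 4.4.4.4 update-source lo0
neighbor 4.4.4.4 route-reflector-client
#R6
router bgp 200
neighbor 5.5.5.5 remote-as 200
neighbor 5.5.5.5 update-source lo0
neighbor 5.5.5.5 route-reflector-client
neighbor 7.7.7.7 remote-as 200
neighbor 7.7.7.7 update-source lo0
neighbor 7.7.7.7 route-reflector-client
#R7
router bgp 200
neighbor 6.6.6.6 remote-as 200
neighbor 6.6.6.6 update-source lo0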
Next we need to configure R3 and R6 as MSDP peers. This configuration is quite simple. The connect-source is analogous to BGP's update-source.
#R3
ip msdp peer 6.6.6.6 connect-source lo0
#R6
ip msdp peer 3.3.3.3 connect-source lo0
Confirm the MSDP session is up:
R3#show ip msdp summary
MSDP Peer Status Summary
Peer Address AS State Uptime/ Reset SA Peer Name
Downtime Count Count
6.6.6.6 200 Up 00:00:36 0 0 ?
R3#show ip msdp peer
MSDP Peer 6.6.6.6 (?), AS 200
Connection status:
State: Up, Resets: 0, Connection source: Loopback0 (3.3.3.3)
Uptime(Downtime): 00:00:32, Messages sent/received: 1/1
Output messages discarded: 0
Connection and counters cleared 00:01:32 ago
SA Filtering:
Input (S,G) filter: none, route-map: none
Input RP filter: none, route-map: none
Output (S,G) filter: none, route-map: none
Output RP filter: none, route-map: none
SA-Requests:
Input filter: none
Peer ttl threshold: 0
SAs learned from this peer: 0
Number of connection transitions to Established state: 1
Input queue size: 0, Output queue size: 0
MD5 signature protection on MSDP TCP connection: not enabled
Message counters:
RPF Failure count: 0
SA Messages in/out: 0/0
SA Requests in: 0
SA Responses out: 0
Data Packets in/out: 0/0
Begin sending multicast traffic again from Source1. When R3 learns of the source, it will send a Source Active message to R6. R6 will then join an (S, G) tree toward the source. This is the same process as intradomain multicast; the source just happens to be in a different domain.
Here is the MSDP SA message sent from R3 to R6. Notice that the multicast packet is encapsulated in the SA message. This is similar to a PIM Register, except the peer will never send a “stop” message.
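While the source is active, this SA should also appear in R6's SA cache. You can check it with the command below (output not shown; we would expect an entry for (10.10.10.10, 239.1.1.1) learned from peer 3.3.3.3):
R6#show ip msdp sa-cache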
Source1#ping 239.1.1.1 repeat 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:
Reply to request 0 from 10.10.100.10, 38 ms
Reply to request 1 from 10.10.100.10, 6 ms
Reply to request 1 from 10.10.100.10, 42 ms
Reply to request 2 from 10.10.100.10, 6 ms
Why don’t we see replies from Host2 right now? R6 has joined an (S, G) tree rooted at Source1 (output below). Shouldn’t this be working?
R6#show ip mroute 239.1.1.1
(*, 239.1.1.1), 00:00:01/stopped, RP 6.6.6.6, flags: S
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
GigabitEthernet2, Forward/Sparse, 00:00:01/00:03:28
(10.10.10.10, 239.1.1.1), 00:00:01/00:02:58, flags: M
Incoming interface: GigabitEthernet1, RPF nbr 10.20.56.5
Outgoing interface list:
GigabitEthernet2, Forward/Sparse, 00:00:01/00:03:28
Spend a few minutes troubleshooting on your own before scrolling down to see the answer.
The (10.10.10.10, 239.1.1.1) entry has no usable RPF path to the source: R6's RPF neighbor is R5, but R5 itself has no PIM neighbor toward 10.10.10.10, because R4 and R5 are not running PIM on the link between them. This is an important point about interdomain multicast: the ASBRs must run PIM with each other. Configure PIM sparse-mode on the interfaces connecting R4 and R5.
#R4, R5
int Gi2
ip pim sparse-mode
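Before retesting, you can optionally confirm that R4 and R5 now see each other as PIM neighbors on the interdomain link (output not shown; each router should list the other's Gi2 address):
R4#show ip pim neighbor
R5#show ip pim neighbor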
Traffic is now reaching Host2 as well:
Source1#ping 239.1.1.1 repeat 3
Type escape sequence to abort.
Sending 3, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:
Reply to request 0 from 10.10.100.10, 5 ms
Reply to request 0 from 10.20.100.10, 8 ms
Reply to request 1 from 10.10.100.10, 5 ms
Reply to request 1 from 10.20.100.10, 103 ms
Reply to request 2 from 10.10.100.10, 5 ms
Reply to request 2 from 10.20.100.10, 7 ms
We have one issue though. Examine the RP mappings on R1 and R7.
R1#show ip pim rp map
PIM Group-to-RP Mappings
Group(s) 224.0.0.0/4
RP 6.6.6.6 (?), v2
Info source: 6.6.6.6 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:00:19, expires: 00:02:07
RP 3.3.3.3 (?), v2
Info source: 6.6.6.6 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:34:49, expires: 00:02:09
R1#show ip pim rp
Group: 239.1.1.1, RP: 6.6.6.6, uptime 00:00:25, expires 00:02:01
R7#show ip pim rp mapping
PIM Group-to-RP Mappings
Group(s) 224.0.0.0/4
RP 6.6.6.6 (?), v2
Info source: 6.6.6.6 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:27:11, expires: 00:01:34
RP 3.3.3.3 (?), v2
Info source: 6.6.6.6 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:00:52, expires: 00:01:36
R7#show ip pim rp
Group: 239.1.1.1, RP: 6.6.6.6, uptime 00:27:13, expires 00:01:33
Both routers are learning the other domain’s RP information. In fact, R1 is currently using R6 as the RP for 239.1.1.1! (Both candidates advertise equal priority, so RP selection falls to the BSR hash function, with the highest IP address as the final tie-breaker, and R6 wins here.) How did this happen? When we configured PIM between R4 and R5, the Bootstrap messages leaked into each domain. To fix this we need to configure Gi2 on each ASBR as a BSR boundary.
#R4, R5
int Gi2
ip pim bsr-border
! The above command prevents PIM Bootstrap messages from being sent or received on the link
Wait a few minutes and the interdomain RP mappings should time out on each router.
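To verify, check the bootstrap and mapping state on a router in each domain; once the stale entries expire, each router should list only its own domain's BSR and RP (output not shown):
R1#show ip pim bsr-router
R1#show ip pim rp mapping
R7#show ip pim bsr-router
R7#show ip pim rp mapping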
Using MP-BGP
Currently the route for Source1's subnet (10.10.10.0/24) is present in the unicast RIB of all routers in Domain 2.
R6#show ip route | in 10.10.10.
B 10.10.10.0/24 [200/3] via 5.5.5.5, 00:25:14
If this is not desirable, you can exchange this route over BGP ipv4 multicast instead of BGP ipv4 unicast. This will remove the prefix from the unicast routing table, but still allow the route to be used for the RPF check.
Using BGP ipv4 multicast can also be a traffic engineering technique. Perhaps you want one interdomain link to be used for multicast and another interdomain link for unicast traffic. You can configure separate peering sessions over each link in order to have greater control.
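As a sketch of that idea (hypothetical: it assumes a second R4-R5 link numbered 10.4.6.0/24, which is not part of this lab's topology), you would activate the unicast AFI on one session and the multicast AFI on the other:
#R4 (hypothetical second link)
router bgp 100
neighbor 10.4.5.5 remote-as 200
neighbor 10.4.6.5 remote-as 200
address-family ipv4
! Unicast routes (and therefore unicast traffic) use the 10.4.5.0/24 link
neighbor 10.4.5.5 activate
no neighbor 10.4.6.5 activate
address-family ipv4 multicast
! RPF lookups (and therefore multicast traffic) use the 10.4.6.0/24 link,
! which must also run PIM
neighbor 10.4.6.5 activate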
Back in our lab, let’s configure an ipv4 multicast session between R4 and R5 and advertise the source prefix over this session. We will also need to run the ipv4 multicast AFI on the internal routers in Domain 2.
#R4
router bgp 100
no network 10.10.10.0 mask 255.255.255.0
address-family ipv4 multicast
neighbor 10.4.5.5 activate
network 10.10.10.0 mask 255.255.255.0
#R5
router bgp 200
address-family ipv4 multicast
neighbor 10.4.5.4 activate
! Within Domain 2, we also need to run the ipv4 multicast AFI
neighbor 6.6.6.6 activate
neighbor 6.6.6.6 next-hop-self
#R6
router bgp 200
address-family ipv4 multicast
neighbor 5.5.5.5 activate
neighbor 5.5.5.5 route-reflector-client
neighbor 7.7.7.7 activate
neighbor 7.7.7.7 route-reflector-client
#R7
router bgp 200
address-family ipv4 multicast
neighbor 6.6.6.6 activate
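The source prefix should now sit in the BGP multicast table on the Domain 2 routers; you can confirm with the command below (output not shown):
R6#show bgp ipv4 multicast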
The prefix is now gone from the unicast RIB, but the RPF check still works by using the BGP multicast table.
R6#show ip route | in 10.10.10.
R6#
R6#show ip rpf 10.10.10.10
RPF information for ? (10.10.10.10)
RPF interface: GigabitEthernet1
RPF neighbor: ? (10.20.56.5)
RPF route/mask: 10.10.10.0/24
RPF type: multicast (bgp 200)
Doing distance-preferred lookups across tables
RPF topology: ipv4 multicast base, originated from ipv4 unicast base
It is important to realize that the BGP multicast table is only used for RPF checks on multicast sources. You do not advertise any multicast group addresses (224/4) into BGP. The RPF check is not actually used for forwarding traffic; it is used to determine a valid upstream interface so that an (S, G) tree can be built. The traffic then flows down this tree. (Traffic arrives from the source; traffic is not sent to the source. We nevertheless need a route toward the source to perform the RPF check.)
Traffic from Source1 still arrives at Host2; however, Domain 2 no longer has a unicast route back to the source, so the pings time out. The multicast traffic itself is nevertheless working.
Source1#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:
Reply to request 0 from 10.10.100.10, 47 ms
Host2#debug ip icmp
*Sep 17 21:36:48.600: ICMP: echo reply sent, src 10.20.100.10, dst 10.10.10.10, topology BASE, dscp 0 topoid 0
*Sep 17 21:36:48.601: ICMP: dst (10.20.100.10) host unreachable rcv from 10.20.100.1
R7 sends a host unreachable to Host2 because 10.10.10.0/24 is no longer in the unicast RIB.
Conclusion
Interdomain multicast requires either PIM-SSM (which is the simplest deployment) or PIM-SM with MSDP. MSDP (Multicast Source Discovery Protocol) allows an RP to discover a source in another domain. When a source goes active, the RP local to the source’s domain sends a Source Active message to all of its MSDP peers. This allows the RPs in other domains to discover the source, just as a PIM Register message would. The RPs, if they have state for a matching (*, G) shared tree, join an (S, G) tree rooted at the source in order to “pull in” the traffic.
Interdomain multicast also requires routes to the sources in other domains in order to pass the RPF check. This can be done using unicast routes, or by running BGP with the ipv4 multicast AFI. Additionally, you must run PIM between the ASBRs on the interdomain link. Take precautions to ensure these routers do not leak RP information between domains.