802.3ah (Ethernet OAM)
Ethernet OAM is a protocol used for monitoring and troubleshooting Ethernet on a last-mile link. Ethernet OAM reminds me of BFD, but for layer 2 Ethernet: both protocols detect failures, BFD at layer 3 and Ethernet OAM at layer 2. However, Ethernet OAM has many other features as well, which we will see in this article. Ethernet OAM is sometimes called “Link OAM” or “Ethernet in the First Mile OAM.”
Ethernet OAM was created to give Ethernet the same OAM tools that traditional WAN technologies like ATM and SONET already had. Ethernet was invented as a LAN protocol and did not originally have the requirements that we impose on it today as a WAN technology.
Ethernet OAM is only supported on full-duplex, point-to-point links. Ethernet OAM PDUs use the slow protocols destination MAC of 0180.c200.0002. You may recognize this because it is the same MAC that LACP uses! Slow protocols cannot transmit more than 10 frames per second. The idea behind “slow protocols” is that they have very low impact on the overall bandwidth of the link. The ethertype for slow protocols is 0x8809. LACP is subtype 1 and Ethernet OAM is subtype 3.
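As a rough illustration of the addressing described above, the sketch below classifies a raw Ethernet frame as LACP or Ethernet OAM by checking the slow protocols destination MAC, the ethertype, and the subtype byte. The function name and the sample frames are made up for this example, and it assumes an untagged frame (no VLAN header) without the FCS.

```python
from typing import Optional

# Values taken from the text above.
SLOW_PROTOCOLS_MAC = bytes.fromhex("0180c2000002")
SLOW_PROTOCOLS_ETHERTYPE = 0x8809
SUBTYPES = {1: "LACP", 3: "Ethernet OAM"}


def classify_slow_protocol(frame: bytes) -> Optional[str]:
    """Return the slow protocol name, or None if this is not a slow protocols frame.

    Hypothetical helper: assumes dst MAC (6) + src MAC (6) + ethertype (2)
    followed directly by the one-byte subtype.
    """
    if len(frame) < 15:
        return None
    dst_mac = frame[0:6]
    ethertype = int.from_bytes(frame[12:14], "big")
    if dst_mac != SLOW_PROTOCOLS_MAC or ethertype != SLOW_PROTOCOLS_ETHERTYPE:
        return None
    return SUBTYPES.get(frame[14], "other slow protocol")


# Sample frames with an arbitrary source MAC and a zeroed payload.
src_mac = bytes.fromhex("001122334455")
header = SLOW_PROTOCOLS_MAC + src_mac + SLOW_PROTOCOLS_ETHERTYPE.to_bytes(2, "big")
oam_frame = header + bytes([3]) + bytes(3)
lacp_frame = header + bytes([1]) + bytes(3)
```

Calling classify_slow_protocol(oam_frame) returns "Ethernet OAM" and classify_slow_protocol(lacp_frame) returns "LACP", since the two frames differ only in the subtype byte.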
You typically run Ethernet OAM on a point-to-point link between a PE and the demarcation device at the customer premises. The main features of Ethernet OAM are: link monitoring (detecting errors and acting on them based on a threshold), remote link fault detection (notifying the peer of faults), and the ability to put the remote partner's port into loopback. We can easily lab Ethernet OAM using CSR1000v. Ethernet OAM is not supported on XRv or XR9kv in my testing.
Throughout this lab we will simply use two CSR1000v routers called CE1 and PE1 connected back-to-back on Gi1. We will also have CE2, which is the Z end of an E-Line service. We will only use CE2 at the very end of the lab.
To configure Ethernet OAM you simply use the following command under the interface:
I will also put CE1 in passive mode. By default it is in active mode. Active/passive mode works just like in LACP. The active side tries to initiate a session by actively sending OAMPDUs, and the passive side only sends PDUs in response. Two devices in passive mode will not become OAM peers.
PE1 was configured first, and sends OAM PDUs describing its capabilities. We also see here that Remote Evaluating and Remote Stable are both False because a remote partner has not been seen yet.
In the next frame, CE1 has seen PE1, so it sets Local Stable to True but leaves Remote Stable at False. This is because it does not yet know whether PE1 has seen it. It waits to receive a PDU with Local Stable set from PE1. CE1 also includes the Remote Information TLV, which carries the same information as PE1’s Local Information TLV.
PE1 now sets Local Stable and Remote Stable, and also includes the Remote Information TLV:
CE1 can now set Remote Stable:
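The four-frame exchange above can be sketched as a tiny simulation. This is a deliberately simplified model, not the full discovery state machine from the standard: it only tracks the Local Stable and Remote Stable flags, assumes a peer is accepted as soon as one PDU from it is seen, and ignores timers and the Evaluating flags. The Peer class and discover function are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    active: bool                 # active mode initiates, passive only responds
    local_stable: bool = False   # "I have seen my partner"
    remote_stable: bool = False  # "my partner says it has seen me"

    def build_pdu(self) -> dict:
        return {
            "from": self.name,
            "local_stable": self.local_stable,
            "remote_stable": self.remote_stable,
        }

    def receive(self, pdu: dict) -> None:
        # Simplification: seeing any PDU from the partner makes us stable.
        self.local_stable = True
        # Our Remote Stable mirrors the partner's Local Stable.
        self.remote_stable = pdu["local_stable"]


def discover(a: Peer, b: Peer, max_frames: int = 10) -> list:
    """Return the PDUs exchanged until both sides have sent a fully stable PDU."""
    trace = []
    if not (a.active or b.active):
        return trace  # two passive devices never start talking
    sender, receiver = (a, b) if a.active else (b, a)
    finished = set()
    for _ in range(max_frames):
        pdu = sender.build_pdu()
        trace.append(pdu)
        receiver.receive(pdu)
        if pdu["local_stable"] and pdu["remote_stable"]:
            finished.add(sender.name)
        if len(finished) == 2:
            break
        sender, receiver = receiver, sender
    return trace


trace = discover(Peer("PE1", active=True), Peer("CE1", active=False))
for pdu in trace:
    print(pdu)
```

With PE1 active and CE1 passive, the trace contains four PDUs whose (Local Stable, Remote Stable) pairs follow the same progression as the frames described above. With both peers passive, the trace is empty.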
In the CLI we see the following syslog message on each router:
Using the following show command we can see some of the capabilities that we see in the pcaps above:
This command shows the MTU of the remote partner which can be handy.
The only capability supported right now (and by default) is link monitor. This is also seen in the output of show ethernet oam summary.
If we enable support for remote loopback on PE1, this is seen as a capability of the partner on CE1. The show ethernet oam summary output is information about the neighbor.
Using the show ethernet oam status command we can see timeout values for various parameters configured locally.
By default OAMPDUs are sent once per second. This is the min-rate you see above, and it is the lowest value this timer accepts.
The min-rate is used for the OAMPDU interval under normal operations. The max-rate is only used for PDU flooding during critical events. The max-rate can be no faster than 10 frames per second (one frame every 100 ms) and no slower than 1 frame per second.
So by default, the OAMPDU intervals are already at the quickest interval you can configure. The timeout value is the only setting that you can make more aggressive. By default it is 5 seconds and you can set it as low as 2 seconds.
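To put the timers in perspective, here is a small back-of-the-envelope helper that works out roughly how many consecutive OAMPDUs must go missing before the session times out. The function name is made up, and the bounds it checks are only the ones mentioned in this article (a 1 second minimum interval and a 2 second minimum timeout); treat it as an approximation rather than a statement of how any platform counts missed PDUs.

```python
def missed_pdus_before_timeout(timeout_s: int = 5, interval_s: int = 1) -> int:
    """Approximate number of consecutive missed OAMPDUs before the peer times out.

    Defaults follow the article: 5 second timeout, one OAMPDU per second.
    """
    if timeout_s < 2:
        raise ValueError("timeout cannot be lower than 2 seconds")
    if interval_s < 1:
        raise ValueError("OAMPDU interval cannot be lower than 1 second")
    return timeout_s // interval_s


default_budget = missed_pdus_before_timeout()               # 5 s timeout, 1 s interval
aggressive_budget = missed_pdus_before_timeout(timeout_s=2)  # lowest timeout
```

With the defaults, about five OAMPDUs in a row have to be lost before the peer is declared down. With the most aggressive timeout, that drops to about two.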
Link monitoring is used to monitor the quality of the link. If certain errors are detected, and the “low” threshold is crossed, the device sends an Event Notification PDU to the OAM peer. If the “high” threshold is crossed, you can errdisable the interface. By default, there is no high threshold setting. You can configure the various settings under ethernet oam link-monitor. By default link-monitor is already on. show ethernet oam status shows the configured parameters for each setting.
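The low/high threshold behaviour can be summarised in a few lines. This is only a sketch of the decision described above: the function name and the returned strings are invented for illustration, and it assumes that "crossing" a threshold means reaching or exceeding it within one monitoring window.

```python
from typing import Optional


def link_monitor_action(error_count: int, low: int, high: Optional[int] = None) -> str:
    """Decide what to do with the error count seen in one monitoring window.

    high defaults to None because, per the article, no high threshold
    is configured by default.
    """
    if high is not None and error_count >= high:
        return "errdisable"
    if error_count >= low:
        return "send-event-notification"
    return "none"
```

For example, link_monitor_action(100, low=10) only results in an Event Notification PDU because no high threshold is set, while link_monitor_action(100, low=10, high=50) results in the interface being errdisabled.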
Ethernet OAM provides a way to indicate to a peer that local faults have occurred. These faults include a link fault or a critical event (which is vendor specific). My favorite fault is called the “dying gasp.” This indicates that an unrecoverable condition has occurred. I like to imagine the device literally dying and, in its final gasp for air, revealing to its peer that something bad has happened and saying its final goodbyes. A dying gasp is sent when you gracefully shut down an interface via the shutdown command.
A device can take action upon receiving any of these three remote failure indications (RFIs): link fault, dying gasp, or critical event. The only action available is to errdisable the interface.
I’ve set the dying-gasp action to errdisable on PE1, and shut down Gi1 on CE1.
CE1 immediately sends the OAMPDU with the dying gasp flag set:
PE1 sees this and immediately errdisables the interface:
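The receiving side's logic can be sketched as decoding the OAMPDU flags field and checking whether an action is configured for any fault it carries. The bit positions below (Link Fault in bit 0, Dying Gasp in bit 1, Critical Event in bit 2) reflect my reading of the 802.3 OAM flags field, and the function names and the configured-actions set are hypothetical.

```python
from typing import List, Set

# Assumed bit positions within the 16-bit OAMPDU flags field.
RFI_BITS = {
    0x0001: "link-fault",
    0x0002: "dying-gasp",
    0x0004: "critical-event",
}


def decode_rfi(flags: int) -> List[str]:
    """Return the remote failure indications set in the flags field."""
    return [name for bit, name in RFI_BITS.items() if flags & bit]


def should_errdisable(flags: int, errdisable_on: Set[str]) -> bool:
    """True if any received RFI has the errdisable action configured."""
    return any(rfi in errdisable_on for rfi in decode_rfi(flags))


# PE1 in this lab: only dying-gasp has an action configured.
pe1_actions = {"dying-gasp"}
# Example flags value with the dying gasp bit plus two higher, non-RFI bits set.
received_flags = 0x0052
```

Here decode_rfi(received_flags) ignores the higher bits and reports only the dying gasp, so should_errdisable(received_flags, pe1_actions) is True. A PDU carrying only a link fault would not trigger any action with this configuration.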
Remote loopback is a very handy feature that allows one device to turn the remote device’s port into a loopback. By default this capability is turned off. To enable it you must use the following command on both devices:
Only a device in active mode can initiate a remote loopback session. This means that CE1 cannot initiate the session. Only PE1 can initiate the session and turn CE1 Gi1 into a loopback.
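The two preconditions for remote loopback (support enabled on both devices, and the initiator in active mode) can be expressed as a simple check. The OamPort class and the function name are hypothetical and only model the rules stated in this article.

```python
from dataclasses import dataclass


@dataclass
class OamPort:
    name: str
    active: bool               # True for active mode, False for passive mode
    loopback_supported: bool   # remote loopback support enabled on this port


def can_start_remote_loopback(initiator: OamPort, target: OamPort) -> bool:
    """The initiator must be active, and both ends must have loopback support enabled."""
    return initiator.active and initiator.loopback_supported and target.loopback_supported


pe1 = OamPort("PE1 Gi1", active=True, loopback_supported=True)
ce1 = OamPort("CE1 Gi1", active=False, loopback_supported=True)
```

In this lab, can_start_remote_loopback(pe1, ce1) is True, while can_start_remote_loopback(ce1, pe1) is False because CE1 is in passive mode.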
Start the remote loopback session with the following command:
On CE2 (the Z end of the E-Line service in this lab) I run a ping to CE1. The ARP messages are literally looped back and seen twice on a pcap:
Besides the syslog messages, you can also see that the loopback session is running from show ethernet oam summary and show ethernet oam discovery.
Run this command to stop the loopback session: