Model-Driven Telemetry

Model-Driven Telemetry (MDT) is the process of streaming YANG-modeled data to a data collector. YANG in general can be used for both configuration and operational data. With MDT we use YANG for operational data.

MDT is somewhat like SNMP but with very short update intervals. With MDT you can go as low as updating every 1 second. SNMP uses a pull operation in which an SNMP management server sends a poll requesting information from the device, and the device sends a response. MDT uses a push operation in which the device periodically pushes the data to the data collector.

MDT solves several problems that exist with SNMP polling. First, SNMP polling is quite slow. A typical polling interval is five minutes, which is very long in today’s networks. Second, SNMP polling is very inefficient in terms of data collection. The NMS must “walk” the MIB sequentially, and the router must “re-arrange” its internal data into SNMP format before replying to the NMS.

In comparison, MDT can stream data only when needed (on-change) or at very short intervals. Processing data into an MDT stream is much more efficient for a router than formatting SNMP data and replying to SNMP GETs. MDT can also easily stream the same data to multiple collectors for redundancy. The data is collected and formatted once and then a copy is sent to each collector. With SNMP, every additional poller increases CPU load, because the device needs to create a separate SNMP GET reply for every SNMP poller.
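
The "format once, send to many" point can be illustrated with a toy model. This is only a sketch: the Device class, its method names, and the collector names are hypothetical, and it counts formatting passes rather than measuring real CPU usage.

```python
class Device:
    """Toy device that counts how often it has to collect and format data."""

    def __init__(self):
        self.format_calls = 0

    def _format(self, counters):
        # Stand-in for the expensive collect-and-format step on the router.
        self.format_calls += 1
        return dict(counters)

    def answer_snmp_get(self, counters):
        # Pull model: every poller's request triggers its own formatting pass.
        return self._format(counters)

    def push_mdt_update(self, counters, collectors):
        # Push model: format once, then hand the same payload to each collector.
        payload = self._format(counters)
        return {collector: payload for collector in collectors}


counters = {"five-seconds": 3}
collectors = ["collector-a", "collector-b", "collector-c"]

snmp_device = Device()
for _ in collectors:
    snmp_device.answer_snmp_get(counters)

mdt_device = Device()
mdt_device.push_mdt_update(counters, collectors)

print(snmp_device.format_calls, mdt_device.format_calls)
```

With three collectors, the SNMP-style device formats the data three times per cycle while the MDT-style device formats it once.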

SNMP and MDT comparison chart

| | SNMP | MDT |
| --- | --- | --- |
| On-event updates possible? | Yes, using traps and informs | Yes, using “on-change” updates |
| Pull/push model? | Primarily a pull model (SNMP polling), but traps/informs are push | Only a push model. Data is pushed “on-change” or on an interval. |
| Ability to manage devices? | Yes, but rarely used | No, MDT only streams operational data |
| Data model? | MIB | YANG models |
| Object reference method? | An OID references objects in the MIB | XPath is used to specify which information in the YANG model to stream |
| Data encoding? | ASN.1 data types are used for SNMP GET responses and traps/informs | JSON, KV-GPB or Compact GPB |

MDT often uses gRPC, an open-source remote procedure call (RPC) framework originally developed at Google that is fast and low latency. gRPC serializes data with Protocol Buffers (Protobuf), which fill the same role as XML or JSON but produce smaller messages that are quicker to encode and decode.

gRPC can use key-value GPB (Google Protocol Buffers) or compact GPB. Key-value GPB (KV-GPB) is a “self-describing” format like JSON, where each value is sent together with its key (e.g. "interface_name":"GigabitEthernet1"). Compact GPB, on the other hand, needs a .proto file to decode the data. Compact GPB saves a lot of bandwidth because the key is replaced with a short field number, e.g. "1":"GigabitEthernet1". The receiving side needs to know that “1” means “interface_name.”
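
To make the trade-off concrete, here is a rough sketch of the two ideas. It uses JSON as a stand-in, not the real protobuf wire format, and the FIELD_NUMBERS mapping (playing the role of a .proto file) and the bytes_sent field are made up for illustration.

```python
import json

# Hypothetical schema, standing in for what a .proto file would provide.
FIELD_NUMBERS = {"interface_name": 1, "bytes_sent": 2}
FIELD_NAMES = {number: name for name, number in FIELD_NUMBERS.items()}


def encode_self_describing(sample: dict) -> bytes:
    """KV-GPB idea: every value travels together with its full key name."""
    return json.dumps(sample, separators=(",", ":")).encode()


def encode_compact(sample: dict) -> bytes:
    """Compact GPB idea: keys are replaced by short field numbers."""
    numbered = {FIELD_NUMBERS[name]: value for name, value in sample.items()}
    return json.dumps(numbered, separators=(",", ":")).encode()


def decode_compact(payload: bytes) -> dict:
    """The receiver can only restore the names if it has the same schema."""
    numbered = json.loads(payload)
    return {FIELD_NAMES[int(number)]: value for number, value in numbered.items()}


sample = {"interface_name": "GigabitEthernet1", "bytes_sent": 1500}
kv_payload = encode_self_describing(sample)
compact_payload = encode_compact(sample)

print(len(kv_payload), len(compact_payload))
print(decode_compact(compact_payload))
```

The compact payload is shorter, but decode_compact only works because both sides share FIELD_NUMBERS, which is the role the .proto file plays for Compact GPB.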

MDT can also use gNMI, which stands for gRPC Network Management Interface. gNMI is a management protocol like NETCONF, but it uses gRPC as the transport, whereas NETCONF uses SSH. gNMI is often used for streaming telemetry, I believe because it is lightweight and fast. gNMI is an open specification maintained by the OpenConfig project. It is not formally derived from NETCONF, but it covers a similar, smaller set of operations (Capabilities, Get, Set, and Subscribe). Because gRPC runs on HTTP/2, so does gNMI.

Subscriptions

A telemetry session is defined by a subscription. A subscription can be dynamically created (dial-in mode) or statically configured on the network device (dial-out mode). Currently 100 subscriptions are supported on a single device on IOS-XE.

Dial-In vs. Dial-Out

When a subscription is dynamically created, it is called dial-in mode. A data collector “dials in” to a network device and subscribes to the YANG models it wishes to receive data for. The session needs to remain up in order to keep the stream active. A restart of the network device means the session needs to be restarted by the data collector once the network device comes back up.

When a subscription is statically configured on the network device, it is called dial-out mode. The network device “dials out” to the data collector named in the subscription. You can see the dial-out subscription in the running config, and the subscription is persistent across device reboots.

It is important to understand that for both dial-in and dial-out, the network device always pushes data to the collector. Dial-in vs. dial-out simply dictates whether the device or the collector initiates the subscription.

(If you are having trouble remembering this, you can think of it like a magazine subscription. Dial-in would be me signing up for the magazine subscription. Once they receive my subscription request, they start sending me a magazine each month. Dial-out would be like the magazine service just sending me a magazine each month automatically without me ever requesting it. They automatically enrolled me based on finding my name/address from the phonebook or voter data. In both cases, the magazine company is always the one “pushing the data” towards me.)

MDT Protocols

You can use three different protocols with MDT: NETCONF, gNMI, and gRPC.

NETCONF and gNMI (gRPC NMI) only support dial-in mode. NETCONF only supports XML encoding, and gNMI only supports JSON encoding.

gRPC can only be used for dial-out, not dial-in, and it uses protobufs (Protocol Buffers) for encoding. Later in this article we will configure gRPC dial-out mode on CSR1000v and XR9000v.

You can differentiate a dial-in vs. dial-out subscription using the following CLI show command. A dial-out subscription is type Configured. A dial-in subscription was created dynamically, and is type Dynamic.

Router#show telemetry ietf subscription all          
  Telemetry subscription brief

  ID               Type        State       Filter type      
  --------------------------------------------------------
  101              Configured  Valid       xpath
  2147483648       Dynamic     Valid       xpath

| | NETCONF | gNMI | gRPC |
| --- | --- | --- | --- |
| Dial-In | Supported | Supported | Not supported |
| Dial-Out | Not supported | Not supported | Supported |
| Encoding | XML | JSON | Protobufs |
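
If you script your lab, the support matrix above can be captured in a small lookup. This is a sketch only: the SUPPORT dictionary and the is_supported helper are hypothetical names, and the values simply restate what is described in this section.

```python
# Protocol support as described in this article (IOS-XE MDT).
SUPPORT = {
    "netconf": {"modes": {"dial-in"}, "encoding": "XML"},
    "gnmi": {"modes": {"dial-in"}, "encoding": "JSON"},
    "grpc": {"modes": {"dial-out"}, "encoding": "Protobuf"},
}


def is_supported(protocol: str, mode: str) -> bool:
    """Return True if the protocol can be used with the given subscription mode."""
    entry = SUPPORT.get(protocol.lower())
    return entry is not None and mode.lower() in entry["modes"]


def protocols_for(mode: str) -> list[str]:
    """List the protocols that can carry a subscription of the given mode."""
    return sorted(name for name, entry in SUPPORT.items() if mode.lower() in entry["modes"])


print(is_supported("gRPC", "dial-out"))
print(is_supported("NETCONF", "dial-out"))
print(protocols_for("dial-in"))
```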

yang-push vs. yang-notif-native

There are two types of streams you can use for MDT. A yang-push stream contains data which is described by a supported YANG model. This stream supports an XPath filter so that you can specify only a subset of data in the model to stream out to the collector. You can push updates out when the value changes or on a periodic interval.

yang-notif-native streams use native IOS technology rather than supported YANG models. The stream supports an XPath filter as well, but update notifications are only sent on-change. You cannot send periodic updates based on a time interval. yang-notif-native streams are not supported when using gRPC.

Creating a Dial-Out Subscription on IOS-XE

I know that was a lot of information. I encourage you to lab this up to get some hands-on practice with MDT, and then re-read the sections above to make sure you understand everything.

Before you can configure a dial-out subscription, you must have NETCONF-YANG enabled on the device (the netconf-yang global configuration command).

Use the following CLI commands to create a subscription:

telemetry ietf subscription <index>
 ! Key-value Google Protocol Buffers (KV-GPB) encoding
 encoding encode-kvgpb
 filter xpath /process-cpu-ios-xe-oper:cpu-usage/cpu-utilization/five-seconds
 source-address 10.0.0.1
 ! yang-notif-native is an option, but gRPC dial-out only supports yang-push
 stream yang-push
 ! The period is in hundredths of a second, so 500 is 5 seconds
 update-policy periodic 500
 receiver ip address 10.0.0.2 57500 protocol grpc-tcp

The update-policy periodic command specifies how often data is pushed to the receiver. The minimum usable value is 100 which is every 1 second. You can also use update-policy on-change to only send data when it changes instead of periodically.
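
Because the period is expressed in hundredths of a second, it is easy to be off by a factor of 10 or 100. A small helper can do the conversion for you. This is a sketch: the function name is made up, and the minimum of 100 is the value stated above.

```python
MIN_PERIOD_CS = 100  # minimum usable period (1 second), per the text above


def period_from_seconds(seconds: float) -> int:
    """Convert seconds to the centisecond value used by update-policy periodic."""
    centiseconds = round(seconds * 100)
    if centiseconds < MIN_PERIOD_CS:
        raise ValueError(f"{seconds}s is below the 1 second minimum")
    return centiseconds


print(f"update-policy periodic {period_from_seconds(5)}")  # update-policy periodic 500
```

Asking for anything shorter than 1 second, such as period_from_seconds(0.5), raises a ValueError instead of producing a value the device would not use.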

One tricky thing about defining dial-out subscriptions is that multiple receivers can be configured, but only the first one configured will be used. For example, if I add a second receiver at 1.1.1.1 to the subscription above, you can see it in the show run output:

Router#show run | sec tele                           
telemetry ietf subscription 101
 encoding encode-kvgpb
 filter xpath /process-cpu-ios-xe-oper:cpu-usage/cpu-utilization/five-seconds
 source-address 10.0.0.1
 stream yang-push
 update-policy periodic 500
 receiver ip address 1.1.1.1 57500 protocol grpc-tcp
 receiver ip address 10.0.0.2 57500 protocol grpc-tcp

However, if you look at the show telemetry ietf subscription <number> receiver output, you see that it is not actually in use:

Router#show telemetry ietf subscription 101 receiver 
Telemetry subscription receivers detail:

  Subscription ID: 101
  Address: 1.1.1.1
  Port: 57500
  Protocol: grpc-tcp
  Profile: 
  Connection: 65535
  State: Invalid
  Explanation: Multi-receivers not supported

  Subscription ID: 101
  Address: 10.0.0.2
  Port: 57500
  Protocol: grpc-tcp
  Profile: 
  Connection: 0
  State: Connected
  Explanation:
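
If you collect this output with a script, a few lines of Python can pick out the receiver that is really in use. This is a sketch that assumes the output keeps the "Key: value" layout shown above. The function names are hypothetical, and other software versions may format the output differently.

```python
SAMPLE_OUTPUT = """\
Telemetry subscription receivers detail:

  Subscription ID: 101
  Address: 1.1.1.1
  Port: 57500
  Protocol: grpc-tcp
  Profile:
  Connection: 65535
  State: Invalid
  Explanation: Multi-receivers not supported

  Subscription ID: 101
  Address: 10.0.0.2
  Port: 57500
  Protocol: grpc-tcp
  Profile:
  Connection: 0
  State: Connected
  Explanation:
"""


def parse_receivers(output: str) -> list[dict]:
    """Split the receiver detail output into one dict per receiver."""
    receivers = []
    current = None
    for line in output.splitlines():
        key, separator, value = line.strip().partition(":")
        if not separator:
            continue  # blank lines and the CLI prompt carry no key/value pair
        key, value = key.strip(), value.strip()
        if key == "Subscription ID":
            current = {}  # each receiver block starts with its subscription ID
            receivers.append(current)
        if current is not None:
            current[key] = value
    return receivers


def connected_addresses(receivers: list[dict]) -> list[str]:
    """Return the addresses of receivers whose state is Connected."""
    return [r["Address"] for r in receivers if r.get("State") == "Connected"]


print(connected_addresses(parse_receivers(SAMPLE_OUTPUT)))  # ['10.0.0.2']
```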

We can also see the configuration settings for the subscription by using the following command:

Router#show telemetry ietf subscription all detail 
Telemetry subscription detail:

  Subscription ID: 101
  Type: Configured
  State: Valid
  Stream: yang-push
  Filter:
    Filter type: xpath
    XPath: /process-cpu-ios-xe-oper:cpu-usage/cpu-utilization/five-seconds
  Update policy:
    Update Trigger: periodic
    Period: 500
  Encoding: encode-kvgpb
  Source VRF: 
  Source Address: 10.0.0.1
  Notes: 

  Receivers:
    Address                                    Port     Protocol         Protocol Profile      
    -----------------------------------------------------------------------------------------
    1.1.1.1                                    57500    grpc-tcp                               
    10.0.0.2                                   57500    grpc-tcp
  • Note that there is no way to know which receiver is currently in-use from this output.

For the rest of the lab, we need a data collector in order to visualize the data we are streaming. I used the excellent docker container that Jeremy Cohoe has provided here: https://hub.docker.com/r/jeremycohoe/tig_mdt

To set it up, I followed the README on his GitHub repo here: https://github.com/jeremycohoe/cisco-ios-xe-mdt

If you have been following along with the previous labs, you have an Ubuntu server connected to your CSR1000v. All you should need to do is install docker on the Ubuntu server using this doc: https://docs.docker.com/engine/install/ubuntu/

Next, download the container using sudo docker pull jeremycohoe/tig_mdt and then run it by following the README on GitHub. I kept everything at the defaults, logged into Grafana using admin/Cisco123 and then added a Panel with a query that looks like this:

You can see above that the data is being collected very frequently. Default SNMP polling would probably show a graph that only has data points every 5 minutes.

The great thing about the docker container is that Telegraf, InfluxDB, and Grafana are already set up for you. This would take a few hours to install and configure yourself. Telegraf accepts the YANG data stream from the CSR1000v and writes the data to InfluxDB, which is a time series database. Grafana performs database queries against InfluxDB in order to graph the data like you see above.

Creating a Dial-Out Subscription on IOS-XR

Only 64-bit IOS-XR platforms support gRPC for dial-out sessions. 32-bit IOS-XR platforms only support TCP. You must use an XR9000v in order to test MDT in a virtualized lab.

On IOS-XR the telemetry destination, sensor, and subscription are all configured separately. You configure the destination as follows, under a destination-group.

telemetry model-driven
 destination-group TELEGRAF
  address-family ipv4 10.0.0.2 port 57500
   encoding self-describing-gpb
   protocol grpc no-tls

You configure the sensor as such:

telemetry model-driven
 sensor-group TELEGRAF
  sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters

You tie the sensor group and destination group together under a subscription:

telemetry model-driven
 subscription TELEGRAF
  sensor-group-id TELEGRAF sample-interval 5000
  destination-id TELEGRAF
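
Since the three pieces always go together, you may want to generate them from one set of parameters. The sketch below only renders text using the same commands shown above. The function name and its arguments are hypothetical, and it reuses one name for all three groups, as this lab does.

```python
def render_xr_dial_out(name: str, collector_ip: str, port: int,
                       sensor_path: str, interval_ms: int) -> str:
    """Render the destination-group, sensor-group, and subscription as one config."""
    lines = [
        "telemetry model-driven",
        f" destination-group {name}",
        f"  address-family ipv4 {collector_ip} port {port}",
        "   encoding self-describing-gpb",
        "   protocol grpc no-tls",
        f" sensor-group {name}",
        f"  sensor-path {sensor_path}",
        f" subscription {name}",
        f"  sensor-group-id {name} sample-interval {interval_ms}",
        f"  destination-id {name}",
    ]
    return "\n".join(lines)


config = render_xr_dial_out(
    name="TELEGRAF",
    collector_ip="10.0.0.2",
    port=57500,
    sensor_path=("Cisco-IOS-XR-infra-statsd-oper:infra-statistics/"
                 "interfaces/interface/latest/generic-counters"),
    interval_ms=5000,
)
print(config)
```

Note that sample-interval is in milliseconds on IOS-XR, unlike the hundredths of a second used by update-policy periodic on IOS-XE.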

I moved the cable connecting the Ubuntu server to CSR1000v to the XR9000v and set Gi0/0/0/0 as 10.0.0.1/30. If you change the query in Grafana you should see data from the XR9000v now.

  • This query monitors bytes sent out Gi0/0/0/0. The sharp line upward is when I sent a flood of large pings.
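
To turn the streamed byte counts into a throughput figure, you compare two consecutive samples. This is a minimal sketch, assuming the bytes-sent value is a running total and that samples arrive 5 seconds apart (the sample-interval of 5000 ms configured above). The function name and the sample values are made up.

```python
def bits_per_second(previous_bytes: int, current_bytes: int,
                    interval_seconds: float) -> float:
    """Convert two cumulative byte-counter samples into an average bit rate."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    delta = current_bytes - previous_bytes
    if delta < 0:
        # A smaller value usually means the counters were cleared or wrapped.
        raise ValueError("counter went backwards")
    return delta * 8 / interval_seconds


# 625,000 bytes sent during a 5 second sample interval
print(bits_per_second(1_000_000, 1_625_000, 5))  # 1000000.0
```

In a Grafana panel the same idea would normally be expressed in the query itself rather than in Python.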

We can also verify the subscription using the following show command:

RP/0/RP0/CPU0:XR9K#show telemetry model-driven subscription 
Mon Oct 24 20:33:45.253 UTC
Subscription:  TELEGRAF                 State: ACTIVE
-------------
  Sensor groups:
  Id                               Interval(ms)        State     
  TELEGRAF                         5000                Resolved  

  Destination Groups:
  Id                 Encoding            Transport   State   Port    Vrf                               IP                                            
  TELEGRAF           self-describing-gpb grpc        Active  57500                                     10.0.0.2                                      
    TLS :             False

To configure Dial-In services you simply need to enable gRPC server functionality on the router:

grpc
 port <number>

There are many grpc server options such as no-tls, max-request-total, etc.

Further Reading

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/prog/configuration/176/b_176_programmability_cg/m_176_prog_ietf_telemetry.html

https://www.cisco.com/c/en/us/td/docs/iosxr/asr9000/telemetry/b-telemetry-cg-asr9000-61x/b-telemetry-cg-asr9000-61x_chapter_010.html

https://developer.cisco.com/network-automation/detail/5d6bbd08-7099-11eb-aa41-aa8fea613d8b/

https://www.youtube.com/watch?v=p94yetSTXdc&ab_channel=CiscoCatalystTV

https://www.cisco.com/c/en/us/support/docs/routers/asr-9000-series-aggregation-services-routers/215321-asr9k-model-driven-telemetry-whitepaper.html

https://www.youtube.com/watch?v=99hyTLPMAQQ&ab_channel=TechFieldDay

https://blogs.cisco.com/sp/the-limits-of-snmp

https://xrdocs.io/telemetry/tutorials/2016-07-21-configuring-model-driven-telemetry-mdt/

https://xrdocs.io/telemetry/blogs/2017-01-20-model-driven-telemetry-dial-in-or-dial-out/
