Hosting Telegraf On A Cisco Catalyst Switch

Yusuf Shalaby

Introduction

We built a telemetry system for the Toronto Internet Exchange using Cisco Model-Driven Telemetry (MDT). In our original design, network switches sent telemetry to a time series database. The shape of these metrics, that is, the information contained in each telemetry payload, is defined in data models published by Cisco for each operating system that supports MDT. Before these metrics can be ingested into the time series database, they often require some processing. Examples of processing operations include:

  • renaming fields or tags
  • changing data types
  • adding tags to a metric
  • joining multiple incoming metrics together
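To make the four operations listed above concrete, here is a minimal sketch in plain Python. It is not how Telegraf implements its processors; the metric layout (a dict with name, tags and fields) and the sample field names such as in-octets and oper-status are assumptions chosen for illustration.

```python
from typing import Any, Callable

# Hypothetical in-memory metric: {"name": str, "tags": {...}, "fields": {...}}
Metric = dict[str, Any]


def rename_field(metric: Metric, old: str, new: str) -> Metric:
    """Return a copy of the metric with one field renamed."""
    fields = dict(metric["fields"])
    if old in fields:
        fields[new] = fields.pop(old)
    return {**metric, "fields": fields}


def cast_field(metric: Metric, field: str, to_type: Callable[[Any], Any]) -> Metric:
    """Return a copy of the metric with one field converted to another type."""
    fields = dict(metric["fields"])
    if field in fields:
        fields[field] = to_type(fields[field])
    return {**metric, "fields": fields}


def add_tag(metric: Metric, key: str, value: str) -> Metric:
    """Return a copy of the metric with an extra tag."""
    return {**metric, "tags": {**metric["tags"], key: value}}


def join_metrics(left: Metric, right: Metric, on: str) -> Metric:
    """Merge the fields of two metrics that share the same value for tag `on`."""
    if left["tags"].get(on) != right["tags"].get(on):
        raise ValueError(f"metrics do not share the same {on!r} tag")
    return {
        "name": left["name"],
        "tags": {**right["tags"], **left["tags"]},
        "fields": {**left["fields"], **right["fields"]},
    }


# Two made-up payloads describing the same interface.
counters = {"name": "interface", "tags": {"name": "Gi1/0/1"},
            "fields": {"in-octets": "1024"}}
state = {"name": "interface_state", "tags": {"name": "Gi1/0/1"},
         "fields": {"oper-status": "up"}}

metric = rename_field(counters, "in-octets", "in_octets")
metric = cast_field(metric, "in_octets", int)
metric = add_tag(metric, "site", "example-site")
metric = join_metrics(metric, state, on="name")
print(metric)
```

Each helper returns a new dict rather than mutating its input, which keeps the steps easy to chain and to test in isolation.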

The basic setup for a telemetry ingestion pipeline is to process all incoming data on a single node. As the amount of incoming telemetry increases, vertical or horizontal scaling may become necessary. Vertical scaling means increasing the memory and CPU resources of the server. Horizontal scaling means adding more servers. A third option is to process the data on the system that produces it: the switch itself. This is possible through Cisco’s application-hosting framework for Cisco IOS-XE devices. In theory, Telegraf can run as a Docker container on the device and process the data before it is sent to the database.

Figure 0 - Telemetry processing options

The diagram above shows the different types of scaling we discussed, alongside the new edge processing option that is available with Cisco’s application-hosting framework. But how does application hosting work in practice?

The rest of this article will walk through the steps we took to set up Telegraf on a Cisco Catalyst switch, specifically the Catalyst 9300 series, which we received from Cisco to test this setup.

Docker image

First, prepare your Telegraf Docker image. To simplify the setup, we bake the configuration file, and with it the plugins we need, directly into the image. Here is the simplest Dockerfile to get you started:

FROM telegraf:latest
COPY ./configs/telegraf/telegraf.conf /etc/telegraf/telegraf.conf
ENTRYPOINT ["telegraf", "--config", "/etc/telegraf/telegraf.conf"]

Make sure to replace the path to the configuration file with the actual path in your project. The telegraf.conf file should contain the plugins you want to use, such as the cisco_telemetry_mdt input plugin for receiving telemetry data from the Catalyst switch, and the influxdb_v2 output plugin for sending the processed data to your InfluxDB instance.

Before you build the container, verify your Docker Engine version. Docker containers built using Docker Engine v25 and up are incompatible with Cisco IOx. To check your Docker Engine version, run docker version in your shell. We used v24.0.9 for this demo. The docker build command looks like this:

docker build -t telegraf-catalyst:iox .
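The version check described above can be scripted so that a build fails early on an unsuitable engine. The sketch below is one possible way to do it, assuming the docker CLI is on the PATH; the helper names and the MAX_SUPPORTED_MAJOR constant are ours, and the threshold simply encodes the v25 limit stated in this article.

```python
import re
import subprocess

# Per this article: images built with Docker Engine v25 and up did not work with IOx.
MAX_SUPPORTED_MAJOR = 24


def major_version(version: str) -> int:
    """Extract the major version from a string such as '24.0.9' or 'v24.0.9'."""
    match = re.match(r"v?(\d+)\.", version.strip())
    if match is None:
        raise ValueError(f"unrecognised Docker version string: {version!r}")
    return int(match.group(1))


def is_iox_compatible(version: str) -> bool:
    """True if the engine version is below the v25 limit described in the article."""
    return major_version(version) <= MAX_SUPPORTED_MAJOR


def local_engine_version() -> str:
    """Ask the local Docker daemon for its server (Engine) version."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(is_iox_compatible("24.0.9"))
print(is_iox_compatible("25.0.0"))
```

In a build script you would call is_iox_compatible(local_engine_version()) and abort when it returns False.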

You also need to ensure that the CPU architecture of the resulting container matches the architecture of the target switch. If you’re building the container for an x86_64 CPU but your local machine has an ARM CPU, pass the --platform flag to the docker build command:

docker buildx build --platform linux/amd64 -t telegraf-catalyst:iox .
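The choice between the two build commands can be automated by comparing the build host's architecture with the target's. The following sketch is illustrative only: the DOCKER_PLATFORMS table covers just the two architectures mentioned here, and build_command is a hypothetical helper, not part of any Docker tooling.

```python
import platform
from typing import Optional

# Machine names as reported by the OS, mapped to Docker platform strings.
DOCKER_PLATFORMS = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "aarch64": "linux/arm64",
    "arm64": "linux/arm64",
}


def docker_platform(machine: str) -> str:
    """Translate a machine name such as 'x86_64' into a Docker platform."""
    try:
        return DOCKER_PLATFORMS[machine.lower()]
    except KeyError:
        raise ValueError(f"unsupported architecture: {machine!r}") from None


def build_command(tag: str, target_machine: str,
                  host_machine: Optional[str] = None) -> list[str]:
    """Return the docker build command, adding --platform only when cross-building."""
    host_machine = host_machine or platform.machine()
    target = docker_platform(target_machine)
    if docker_platform(host_machine) == target:
        return ["docker", "build", "-t", tag, "."]
    return ["docker", "buildx", "build", "--platform", target, "-t", tag, "."]


# Building for an x86_64 switch from an ARM laptop.
print(" ".join(build_command("telegraf-catalyst:iox", "x86_64", host_machine="arm64")))
```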

Next, save the container into a tar file:

docker save -o telegraf-catalyst.tar telegraf-catalyst:iox

SSD Setup

A Cisco Catalyst SSD is required for application hosting. Some specific Cisco-signed applications can run on the bootflash that comes with the switch, but if you want to build your own app you’ll need an SSD. The installation guide can be found here.

The contents of the SSD can be checked using dir usbflash1:.

After the SSD is installed, enable IOx on the Catalyst:

configure terminal
iox
end

Note: if you enabled IOx before installing the SSD, first disable it with no iox, then enable it again.

Finally, copy the tar file to the SSD using scp:

scp -O -o "StrictHostKeyChecking=no" ./telegraf-catalyst.tar <username>@<hostaddress>:/usbflash1:/telegraf-catalyst.tar

Next we’ll configure and install the application.

Application configuration

We used the following configuration for app-hosting on the Catalyst:

app-hosting appid tats_torix
 app-vnic management guest-interface 0
 guest-ipaddress 10.135.25.100 netmask 255.255.255.0
 app-default-gateway 10.135.25.1 guest-interface 0
 app-resource docker
  !run-opts 1 "--entrypoint /bin/sleep infinity"
  run-opts 1 "--entrypoint '/usr/bin/telegraf --config /etc/telegraf/telegraf.conf'"
 app-resource profile custom
  cpu 7400
  memory 2048
  persist-disk 1024
  vcpu 2
 name-server0 208.67.222.222

Let’s highlight the key points:

app-hosting appid tats_torix
 app-vnic management guest-interface 0
 guest-ipaddress 10.135.25.100 netmask 255.255.255.0
 app-default-gateway 10.135.25.1 guest-interface 0

First we create an application with the ID tats_torix. Next we assign a management virtual network interface (vNIC) to guest interface 0. The management vNIC is used to connect the application to the management VRF of the switch. Finally we set the IP address and default gateway for this guest interface.

app-resource docker
 !run-opts 1 "--entrypoint /bin/sleep infinity"
 run-opts 1 "--entrypoint '/usr/bin/telegraf --config /etc/telegraf/telegraf.conf'"

Here we pass docker run options for the application. The line starting with ! is commented out; its /bin/sleep entrypoint is useful for debugging. We used it initially to verify that the installation worked, by attaching to the application’s console and making sure the container was built as expected. Once this was verified, we switched to the proper container entrypoint, which calls the /usr/bin/telegraf binary.

Once that’s done, exit configuration mode and install the application:

app-hosting install appid tats_torix package usbflash1:telegraf-catalyst.tar
app-hosting activate appid tats_torix
app-hosting start appid tats_torix

In Cisco app-hosting, install unpacks the application, activate sets up the application according to your configuration, and start runs the application.

The following command allows you to connect to the application console:

app-hosting connect appid tats_torix console

To deploy a changed application, first run the inverse commands in the opposite order, then install, activate, and start it again:

app-hosting stop appid tats_torix
app-hosting deactivate appid tats_torix
app-hosting uninstall appid tats_torix

app-hosting install appid tats_torix package usbflash1:telegraf-catalyst.tar
app-hosting activate appid tats_torix
app-hosting start appid tats_torix
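Because the order of these commands matters, it can help to generate the sequence instead of typing it by hand. The sketch below only builds the command strings; how they are sent to the switch (SSH, an automation tool, and so on) is left out, and the function names are ours.

```python
def install_commands(app_id: str, package: str) -> list[str]:
    """Commands that bring an application from a package file to running."""
    return [
        f"app-hosting install appid {app_id} package {package}",
        f"app-hosting activate appid {app_id}",
        f"app-hosting start appid {app_id}",
    ]


def teardown_commands(app_id: str) -> list[str]:
    """The inverse commands, in the opposite order."""
    return [
        f"app-hosting stop appid {app_id}",
        f"app-hosting deactivate appid {app_id}",
        f"app-hosting uninstall appid {app_id}",
    ]


def redeploy_commands(app_id: str, package: str) -> list[str]:
    """Full sequence for replacing a running application with a new package."""
    return teardown_commands(app_id) + install_commands(app_id, package)


for command in redeploy_commands("tats_torix", "usbflash1:telegraf-catalyst.tar"):
    print(command)
```

Keeping teardown and install as separate functions makes it easy to run only the first half when you want to remove the application entirely.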

Setting up telemetry

Now that Telegraf is set up, let’s stream some telemetry to the new Docker endpoint. Below is the telemetry config we used for the Catalyst:

no telemetry ietf subscription 3308
telemetry ietf subscription 3308
 encoding encode-kvgpb
 filter xpath /interfaces-ios-xe-oper:interfaces/interface
 source-address 10.135.25.130
 source-vrf Mgmt-vrf
 stream yang-push
 update-policy periodic 3000
 receiver ip address 10.135.25.100 57000 protocol grpc-tcp

Note that the receiver IP address, 10.135.25.100, is the guest-ipaddress of the application we set up. The port, 57000, is the one we configured for the cisco_telemetry_mdt plugin in our Telegraf configuration.
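The subscription has to agree with two other pieces of configuration: the receiver address must be the application's guest IP, and the port must be the one the Telegraf plugin listens on. One way to keep them in sync is to render the subscription from shared values, as in this sketch. The function and constant names are hypothetical. IOS-XE expresses the periodic interval in centiseconds, so 30 seconds becomes 3000.

```python
def subscription_config(
    sub_id: int,
    xpath: str,
    source_address: str,
    source_vrf: str,
    receiver_ip: str,
    receiver_port: int,
    period_seconds: float,
) -> list[str]:
    """Render a periodic yang-push subscription as IOS-XE CLI lines."""
    # The update-policy period is given in centiseconds (1/100 s).
    period_cs = round(period_seconds * 100)
    return [
        f"telemetry ietf subscription {sub_id}",
        " encoding encode-kvgpb",
        f" filter xpath {xpath}",
        f" source-address {source_address}",
        f" source-vrf {source_vrf}",
        " stream yang-push",
        f" update-policy periodic {period_cs}",
        f" receiver ip address {receiver_ip} {receiver_port} protocol grpc-tcp",
    ]


# Shared with the app-hosting config (guest-ipaddress) and telegraf.conf (listen port).
GUEST_IP = "10.135.25.100"
TELEGRAF_PORT = 57000

config = subscription_config(
    sub_id=3308,
    xpath="/interfaces-ios-xe-oper:interfaces/interface",
    source_address="10.135.25.130",
    source_vrf="Mgmt-vrf",
    receiver_ip=GUEST_IP,
    receiver_port=TELEGRAF_PORT,
    period_seconds=30,
)
print("\n".join(config))
```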

Considerations

Now that we know it’s possible, the question is whether this is a good idea. There are a few reasons why you might want to reconsider this approach:

IOS-XE lock-in

Only Cisco IOS-XE devices support IOx application hosting. This means that you cannot use this approach on NX-OS devices like the Nexus 9000 series.

SSD cost

The SSD required for application hosting is an additional cost. It is not included with the switch and must be purchased separately.

CI/CD complexity

Updating the application requires a few steps, including building the Docker image, copying it to the switch, and then installing it. This can be cumbersome and error-prone across a large fleet of devices. There will also be some downtime while the application is being updated, which may not be acceptable in production environments.