Hosting Telegraf On A Cisco Catalyst Switch
Yusuf Shalaby
Introduction
We built a telemetry system for the Toronto Internet Exchange using Cisco MDT. In our original design, network switches sent telemetry to a time series database. The shapes of these metrics, i.e. the information contained in each telemetry payload, are defined in data models published by Cisco for each operating system that supports MDT. Before these metrics can be ingested into the time series database, they often require some amount of processing. Some examples of processing operations done on these metrics include:
- renaming fields or tags
- changing data types
- adding tags to a metric
- joining multiple incoming metrics together
The basic setup for a telemetry ingestion pipeline is to process all incoming data on a single node. As the amount of incoming telemetry increases, vertical or horizontal scaling may be necessary. Vertical scaling means increasing the memory and CPU resources of the server. Horizontal scaling means adding more servers. A third option is to process the data on the system that produces it: the switch itself. This is possible through Cisco’s application-hosting framework for Cisco IOS-XE devices. In theory, Telegraf can run as a Docker container on the device to process the data before it is sent to the database.
The diagram above shows the different types of scaling we discussed, alongside the new edge processing option that is available with Cisco’s application-hosting framework. But how does application hosting work in practice?
The rest of this article will walk through the steps we took to set up Telegraf on a Cisco Catalyst switch, specifically the Catalyst 9300 series, which we received from Cisco to test this setup.
Docker image
First, prepare your Telegraf Docker image. To simplify the setup, we bake the configuration file and the plugins we need directly into the image. Here is the simplest Dockerfile to get you started:
FROM telegraf:latest
COPY ./configs/telegraf/telegraf.conf /etc/telegraf/telegraf.conf
ENTRYPOINT ["telegraf", "--config", "/etc/telegraf/telegraf.conf"]
Make sure to replace the path to the configuration file with the actual path in your project. The telegraf.conf file should contain the plugins you want to use, such as the cisco_telemetry_mdt input plugin for receiving telemetry data from the Catalyst switch, and the influxdb_v2 output plugin for sending the processed data to your InfluxDB instance.
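As a starting point, the snippet below writes a minimal telegraf.conf to the path used in the Dockerfile above. Treat it as a sketch rather than our production configuration: the InfluxDB URL, token, organization, and bucket are placeholders, the site tag and the renamed field are hypothetical, and port 57000 is the receiver port used later in this article.

```shell
#!/bin/sh
# Write a minimal Telegraf configuration to the path the Dockerfile copies from.
# All <...> values, the "site" tag, and the renamed field are illustrative
# assumptions; replace them with your own before building the image.
mkdir -p configs/telegraf
cat > configs/telegraf/telegraf.conf <<'EOF'
# Receive Cisco model-driven telemetry over gRPC dial-out.
[[inputs.cisco_telemetry_mdt]]
  transport = "grpc"
  service_address = ":57000"

  # Add a tag to every metric from this input (hypothetical tag).
  [inputs.cisco_telemetry_mdt.tags]
    site = "<site-name>"

# Rename a field before it is written (hypothetical field names).
[[processors.rename]]
  [[processors.rename.replace]]
    field = "<original-field>"
    dest = "<new-field>"

# Send the processed metrics to InfluxDB 2.x.
[[outputs.influxdb_v2]]
  urls = ["<influxdb-url>"]
  token = "<influxdb-token>"
  organization = "<organization>"
  bucket = "<bucket>"
EOF
```

Because the heredoc delimiter is quoted, the shell writes the file verbatim without expanding anything inside it.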
Before you build the container, verify your Docker Engine version. Docker containers built using Docker Engine v25 and up are incompatible with Cisco IOx. To check your Docker Engine version, run docker version in your shell. We used v24.0.9 for this demo. The docker build command looks like this:
docker build -t telegraf-catalyst:iox .
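To guard against the version incompatibility mentioned above, a build script can check the engine version before building. The following is a minimal sketch: the function name docker_version_ok is made up for this example, and the threshold of 25 simply encodes the limitation described in this article.

```shell
#!/bin/sh
# Succeed (exit status 0) only when the given Docker Engine version string,
# e.g. "24.0.9", has a major version below 25.
docker_version_ok() {
  major=${1%%.*}                 # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 2 ;;     # empty or non-numeric: cannot decide
  esac
  [ "$major" -lt 25 ]
}

if command -v docker >/dev/null 2>&1; then
  # Ask the Docker daemon for its engine version; empty if it is unreachable.
  version=$(docker version --format '{{.Server.Version}}' 2>/dev/null) || version=""
  if docker_version_ok "$version"; then
    echo "Docker Engine $version: OK for building IOx images"
  else
    echo "Docker Engine '$version': not usable for IOx, use v24 or older" >&2
  fi
else
  echo "docker not found in PATH; skipping version check" >&2
fi
```

The check only reports its result and does not abort, so it can be sourced from a larger build script that decides what to do next.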
You also need to ensure that the CPU architecture of the resulting container matches the architecture of the target switch. If you’re building the container for an x86_64 CPU but your local machine has an ARM CPU, then pass the --platform flag to the build command:
docker buildx build --platform linux/amd64 -t telegraf-catalyst:iox .
Next, save the container into a tar file:
docker save -o telegraf-catalyst.tar telegraf-catalyst:iox
SSD Setup
A Cisco Catalyst SSD is required for application hosting. Some very specific Cisco-signed applications can run on the bootflash card that comes with the switch, but if you want to build your own app you’ll need an SSD. The installation guide can be found here.
To check the contents of the SSD, run dir usbflash1: on the switch.
After the SSD is installed, enable IOx on the Catalyst:
configure terminal
iox
end
Note: if you enabled IOx before installing the SSD, first disable it with no iox, then enable it again.
Finally, copy the tar file to the SSD using scp:
scp -O -o "StrictHostKeyChecking=no" ./telegraf-catalyst.tar <username>@<hostaddress>:/usbflash1:/telegraf-catalyst.tar
Next we’ll configure and install the application.
Application configuration
We used the following configuration for app-hosting on the Catalyst:
app-hosting appid tats_torix
 app-vnic management guest-interface 0
  guest-ipaddress 10.135.25.100 netmask 255.255.255.0
 app-default-gateway 10.135.25.1 guest-interface 0
 app-resource docker
  !run-opts 1 "--entrypoint /bin/sleep infinity"
  run-opts 1 "--entrypoint '/usr/bin/telegraf --config /etc/telegraf/telegraf.conf'"
 app-resource profile custom
  cpu 7400
  memory 2048
  persist-disk 1024
  vcpu 2
 name-server0 208.67.222.222
Let’s highlight the key points:
app-hosting appid tats_torix
 app-vnic management guest-interface 0
  guest-ipaddress 10.135.25.100 netmask 255.255.255.0
 app-default-gateway 10.135.25.1 guest-interface 0
First, we create an application with the ID tats_torix. Next, we assign a management virtual network interface (vNIC) to guest interface 0. The management vNIC is used to connect the application to the management VRF of the switch. Finally, we set the IP address and default gateway for this guest interface.
 app-resource docker
  !run-opts 1 "--entrypoint /bin/sleep infinity"
  run-opts 1 "--entrypoint '/usr/bin/telegraf --config /etc/telegraf/telegraf.conf'"
Here we pass docker run options for the application. The commented-out /bin/sleep entrypoint is useful for debugging. We used it initially to verify that the installation worked by attaching to the application’s console and making sure the container was built as expected. Once this was verified, we switched to the proper container entrypoint, which calls the /usr/bin/telegraf binary.
Once that’s done, exit configuration mode and install the application:
app-hosting install appid tats_torix package usbflash1:telegraf-catalyst.tar
app-hosting activate appid tats_torix
app-hosting start appid tats_torix
In Cisco app-hosting, install unpacks the application package, activate sets up the application according to your configuration, and start runs the application.
The following command allows you to connect to the application console:
app-hosting connect appid tats_torix console
If you make changes to the application, first run the inverse commands in the opposite order, then install, activate, and start the application again:
app-hosting stop appid tats_torix
app-hosting deactivate appid tats_torix
app-hosting uninstall appid tats_torix
app-hosting install appid tats_torix package usbflash1:telegraf-catalyst.tar
app-hosting activate appid tats_torix
app-hosting start appid tats_torix
Setting up telemetry
Now that Telegraf is set up, let’s stream some telemetry to the new Docker endpoint. Below is the telemetry config we used for the Catalyst:
no telemetry ietf subscription 3308
telemetry ietf subscription 3308
 encoding encode-kvgpb
 filter xpath /interfaces-ios-xe-oper:interfaces/interface
 source-address 10.135.25.130
 source-vrf Mgmt-vrf
 stream yang-push
 update-policy periodic 3000
 receiver ip address 10.135.25.100 57000 protocol grpc-tcp
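Note that the receiver IP address is the IP address of the application we set up. The port, 57000, is the one we used in our Telegraf configuration for the cisco_telemetry_mdt plugin. The periodic update interval is expressed in centiseconds, so 3000 means one update every 30 seconds.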
Considerations
Now that we know it’s possible, the question is whether this is a good idea. There are a few reasons why you might want to reconsider this approach:
IOS-XE lock-in
Only Cisco IOS-XE devices support IOx application hosting. This means that you cannot use this approach on NX-OS devices like the Nexus 9000 series.
SSD cost
The SSD required for application hosting is an additional cost. It is not included with the switch and must be purchased separately.
CI/CD complexity
Updating the application requires a few steps, including building the Docker image, copying it to the switch, and then installing it. This can be cumbersome and error-prone across a large fleet of devices. There will also be some downtime while the application is being updated, which may not be acceptable in production environments.
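To give a sense of what automating the on-switch part of an update might involve, here is a rough sketch of a redeploy loop in shell. It rests on several assumptions: each switch accepts one exec command per non-interactive SSH session, the new tar file has already been copied to usbflash1:, and the host names are hypothetical. The script defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Print the app-hosting commands for a full redeploy, one per line:
# tear down the old application, then install and start the new package.
redeploy_commands() {
  appid=$1
  package=$2
  cat <<EOF
app-hosting stop appid $appid
app-hosting deactivate appid $appid
app-hosting uninstall appid $appid
app-hosting install appid $appid package $package
app-hosting activate appid $appid
app-hosting start appid $appid
EOF
}

# Run the redeploy commands against one switch, one SSH session per command.
# With DRY_RUN=1 (the default) the ssh invocations are printed, not executed.
redeploy_switch() {
  host=$1
  redeploy_commands "$2" "$3" | while IFS= read -r cmd; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf 'ssh -n %s "%s"\n' "$host" "$cmd"
    else
      # -n keeps ssh from consuming the remaining commands on stdin.
      ssh -n "$host" "$cmd" || exit 1   # stop this switch on the first failure
    fi
  done
}

# Hypothetical fleet; replace with your own inventory.
for host in "admin@switch-1.example.net" "admin@switch-2.example.net"; do
  redeploy_switch "$host" tats_torix usbflash1:telegraf-catalyst.tar
done
```

The sketch does not wait for one step to finish before issuing the next, so a real pipeline would likely need to poll show app-hosting list between steps and handle partial failures across the fleet.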