What a title to this post, right?
With the recent release of Nornir 3.0, I wanted to explore its capabilities, and I already know I will probably never use Ansible for network automation again. 😉 The reason for this post, however, is to give a high-level overview of Nornir 3.0 and provide a guide for converting Nornir 2.x NETCONF scripts to 3.0.
Some of the topics explored in this post include, but are not limited to, the following:
- Infrastructure as Code (Jinja2 template rendering and YAML-defined network state)
- Nornir 3.0
- Directory Structure
How to Follow Along
I’d recommend downloading the code from my GitHub and reviewing the repo. Once you are familiar with the code, you should be ready to read along.
Review the topology below. You will become the operator of this network throughout this journey. The solution you are implementing will ease the workload of the deployment engineers and possibly save your company some money. Depending on the problem XYZ company is trying to solve, it’s becoming clear that not everyone requires a high-dollar solution, such as a vendor-specific NMS or orchestrator, to automate their network.
For those of you new to Nornir, it’s an automation framework written in Python. If you are familiar with Ansible, you can adapt quite easily to Nornir as long as you know your way around Python. You will quickly realize how flexible it is. One of my favorite features of Nornir is multithreading, which allows concurrent connections and in turn makes this framework incredibly fast. We will discuss workers/threads a little more later in this post.
Begin by installing Nornir with a simple **pip3 install nornir**
Let’s discuss the directory structure. You can see here there is quite a bit going on.
NOTE: All of the following files and directories have to be created manually. They are not auto-created. Take a minute and re-create the folders and files under a path of your choice. I started a git repo and created all my folders there.
We’ve created defaults, groups and hosts yml files under our ‘inventory‘ directory. We also have a config.yml file which specifies the path of these files. This config file is later passed into the Nornir class that’s instantiated inside our Python runbook, norconf.py. As always, our ‘templates‘ folder contains the Jinja2 template files used to render the configuration of the L3VPN and VPRNs for our multivendor environment. These are named according to their corresponding host platform and function.
Template Naming Example:
Additional files in here, such as nc_tasks.py, are adopted from Nick Russo’s project, which uses Nornir 2.x. He wrote some custom NETCONF tasks at a time when NETCONF was first being introduced into Nornir. The log file is self-explanatory.
At the time of this writing, the nornir_netconf plugin is not yet available for Nornir 3.0 as a direct pip download/install. Getting it to work took a series of trial-and-error attempts, most of them failures, and I had to take a step back and understand a lot of what’s happening under the hood of Nornir. I cloned the repo at https://github.com/nornir-automation/nornir_netconf@first and tried to install it via Poetry, but this was mostly a huge waste of time and nothing worked, particularly the plugin configuration of Nornir. I removed that installation and went the pip route straight from git.
I was able to install the code by using pip + git with the following:
pip3 install git+https://github.com/nornir-automation/nornir_netconf@first
However, during the process I got the exception “AttributeError: module ‘enum’ has no attribute ‘IntFlag’”. From some searching around, it’s due to a conflict with the enum34 package. I ran the following to check whether the package was present so I could remove it.
pip freeze | grep enum34
➜ nornir_netconf-first pip3 freeze | grep enum34
enum34==1.1.10
Looks like I do have it installed … After a quick ‘pip3 uninstall enum34’, I re-ran the original pip3 install from git and the installation was successful. I wonder what I broke by removing enum34 😉
Installing collected packages: nornir-netconf
Successfully installed nornir-netconf-1.0.0
Python 3.8.2 (v3.8.2:7b3ab5921f, Feb 24 2020, 17:52:18)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nornir_netconf
print("SO FAR SO GOOD!")
I was originally having an issue with the Nornir NETCONF plugin and had to investigate how to manually register a plugin. That was before I found out how to get around the hurdle and install via git + pip. Here is the code I used to register the plugin manually, directly in my runbook, in case anyone ever wants to register a new plugin, although a lot has to happen for any of this to work.

    from nornir.core.plugins.connections import ConnectionPluginRegister
    from nornir_netconf.plugins.connections import Netconf

    # Register the Netconf connection class under the name "netconf".
    ConnectionPluginRegister.register("netconf", Netconf)

By importing it, I am actually able to receive the rpc_reply from a successful RPC operation. This is critical to the operation of my script, as I write conditional statements that depend on the output returned by task.run.
The Host File:

    R3_CSR:
        hostname: 192.168.0.223
        groups:
            - CSR
    R3_SROS_PE:
        hostname: 192.168.0.222
        groups:
            - NOKIA
        data:
            region: west-region
    R8_IOSXR_PE:
        hostname: 192.168.0.182
        groups:
            - IOSXR
        data:
            region: west-region

The Group File
- The data.target key is inherited by each host and is used during the edit-config RPC to point the operation at the correct NETCONF datastore
- These connection options can make or break the process
The Config File
- A config.yml file must specify the location of the hosts, groups and defaults files.
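For reference, the same settings can also be handed to InitNornir as keyword arguments instead of a file. What follows is a minimal sketch, not the contents of the repo’s config.yml: the three inventory file names are assumed from the directory layout described earlier, and `build_settings` and `init_from_settings` are illustrative helper names.

```python
from typing import Any, Dict


def build_settings(inventory_dir: str = "inventory", workers: int = 100) -> Dict[str, Any]:
    """Return InitNornir keyword arguments mirroring a typical config.yml.

    The file names below are assumptions; adjust them to match your repo.
    """
    return {
        "inventory": {
            "plugin": "SimpleInventory",
            "options": {
                "host_file": f"{inventory_dir}/hosts.yml",
                "group_file": f"{inventory_dir}/groups.yml",
                "defaults_file": f"{inventory_dir}/defaults.yml",
            },
        },
        "runner": {
            "plugin": "threaded",
            "options": {"num_workers": workers},
        },
    }


def init_from_settings():
    """Build a Nornir object from the settings above (requires nornir installed)."""
    from nornir import InitNornir  # imported lazily so the sketch loads without nornir

    return InitNornir(**build_settings())


if __name__ == "__main__":
    print(build_settings()["runner"])
```

Keeping the settings in a plain dictionary makes them easy to inspect or override in a test before any device is touched.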
We have specified num_workers: 100, which means we can have up to 100 concurrent, multithreaded sessions to devices. The way I think about a Nornir run is that everything you’re doing sits inside a giant ‘for loop’. The task runs through all the devices in the inventory (unless you specify a filter) one by one. Although there isn’t a for statement written anywhere visible, you’re looping through all the devices in your inventory. With threads, however, you’re actually doing this in parallel. You could technically specify ‘plugin: serial’ for the runner and not take advantage of threads.
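The ‘giant for loop’ picture can be illustrated with nothing but the standard library. This is only an analogy, not Nornir’s own code: `fake_task`, `run_serial` and `run_threaded` are made-up names, and the task just returns a string instead of talking to a device.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_serial(task: Callable[[str], str], hosts: List[str]) -> Dict[str, str]:
    """The 'giant for loop': one host at a time."""
    return {host: task(host) for host in hosts}


def run_threaded(
    task: Callable[[str], str], hosts: List[str], num_workers: int = 100
) -> Dict[str, str]:
    """Same loop, but up to num_workers hosts are handled concurrently."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = pool.map(task, hosts)  # results come back in input order
        return dict(zip(hosts, results))


def fake_task(host: str) -> str:
    """Stand-in for a real task such as a NETCONF edit-config."""
    return f"{host}: done"


hosts = ["R3_CSR", "R3_SROS_PE", "R8_IOSXR_PE"]
# Both strategies produce the same per-host results; only the timing differs.
assert run_serial(fake_task, hosts) == run_threaded(fake_task, hosts)
```

With real devices the per-host work is dominated by waiting on the network, which is why running the hosts concurrently pays off.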
Run Book (Python Script of Compiled ‘tasks’)

    from nornir import InitNornir
    from nornir_netmiko.tasks import netmiko_send_command
    from nornir_utils.plugins.functions import print_result
    from nornir_utils.plugins.tasks.data import load_yaml
    from nornir_jinja2.plugins.tasks import template_file
    from nornir_netconf.plugins.tasks import netconf_edit_config
    # NOTE: this import rebinds netconf_edit_config, so the nc_tasks version is the one used below.
    from nc_tasks import netconf_edit_config, netconf_commit
    import xmltodict, json, pprint

    __author__ = 'Hugo Tinoco'
    __email__ = 'email@example.com'

    # Specify a custom config yaml file.
    nr = InitNornir(config_file='config.yml')

What are filters and how do we create them? A filter is a selection of hosts against which you want to execute a runbook. For our main example in this post, we are an operator in charge of deploying an L3VPN/VPRN in a multi-vendor environment at the core. This will include the Nokia 7750 SR and Cisco IOSxR. However, our hosts file contains ALL of the devices available in our network. The L3VPN we are deploying only spans our ‘west-region’, pictured on the bottom left of the topology above. There are two CPEs, one attached to the Nokia 7750 and one to the Cisco IOSxR. In order to deploy this service, we want to tell Nornir that we only need to execute the tasks against these two specific routers. The rest of the network doesn’t need to know about this service. Below is a snippet of the ‘hosts.yml’ file, which has a custom ‘region’ key with the value ‘west-region’. You can see the same entry on the R8_IOSXR_PE device. That’s it! We’ve identified common ground between these devices: both are in the ‘west-region’ of our network.
Now let’s write some code so Nornir can use this as a filter.
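Here is a small sketch of the idea. The `HOSTS` dictionary mirrors the hosts file from this post, `match_hosts` is a plain-Python picture of what a key/value filter does (an illustrative helper, not Nornir code), and `west_region` shows the one-line call you would make on a real Nornir object.

```python
from typing import Any, Dict

# Same data as the hosts file shown earlier in this post.
HOSTS: Dict[str, Dict[str, Any]] = {
    "R3_CSR": {"hostname": "192.168.0.223", "groups": ["CSR"]},
    "R3_SROS_PE": {
        "hostname": "192.168.0.222",
        "groups": ["NOKIA"],
        "data": {"region": "west-region"},
    },
    "R8_IOSXR_PE": {
        "hostname": "192.168.0.182",
        "groups": ["IOSXR"],
        "data": {"region": "west-region"},
    },
}


def match_hosts(hosts: Dict[str, Dict[str, Any]], **criteria: Any) -> Dict[str, Dict[str, Any]]:
    """Keep only the hosts whose data matches every key/value in criteria."""
    return {
        name: host
        for name, host in hosts.items()
        if all(host.get("data", {}).get(k) == v for k, v in criteria.items())
    }


def west_region(nr):
    """With a real Nornir object, the equivalent selection is a one-liner."""
    return nr.filter(region="west-region")


print(sorted(match_hosts(HOSTS, region="west-region")))
# prints ['R3_SROS_PE', 'R8_IOSXR_PE']
```

The object returned by `nr.filter(...)` can then be used in place of `nr` when running tasks, so only the two west-region routers are touched.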
Infrastructure as Code
We’ll be extracting information from our YAML files, which hold the variables supplied by the user, and combining it with our Jinja2 templates, which contain our YANG models. We use Jinja2 to place the correct variables into our YANG models for proper rendering. To distribute the configurations via NETCONF across our core network, we enlist the help of Nornir to manage all of these tasks. We’re letting Nornir handle the flow and procedures to ensure proper deployment.
Below is the YAML file containing the vars that will be used to render the J2 template. The following is for the Nokia platform:
Jinja 2 – yang:sr:conf
There are so many important pieces that go into this automation project. The J2 template file must include everything that is necessary to create this service. Below is the example for the Nokia device. Please see my code via the GitHub repo at the top of this document to review the IOSxR J2 template file. There are also supporting documents at the end of this document if you need more information on Jinja2.
Our overall goal is to deploy the VPRN/L3VPN. We start by creating a few custom functions.
We create get_vrfcli and get_vprncli. These two functions take advantage of the netmiko_send_command plugin and use platform-specific CLI commands. We will use these two commands to retrieve the service status. Then we wrap the two functions inside cli_stats. We load the YAML file using the load_yaml plugin from Nornir. Once the task is executed, we drill into our vars file and extract the service name as a variable from our loaded dictionary (the YAML file). This variable is then passed into the get_vprncli/get_vrfcli functions to execute against our devices. At this point, if we execute the cli_stats task against our west-region, we can use conditional statements to run the correct command against the correct platform. We access the platform by simply reading the task.host.platform attribute, which returns the platform value defined in the inventory.
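The platform dispatch described above might look roughly like the sketch below. Treat it as an illustration under assumptions rather than the code from the repo: the two show commands, the `servicename` key and the helper names `build_command` and `cli_stats_sketch` are all guesses made for this example.

```python
from typing import Dict

# Assumed show commands; the post does not list the exact strings it uses.
SHOW_COMMANDS: Dict[str, str] = {
    "alcatel_sros": "show service id {servicename} base",
    "iosxr": "show vrf {servicename}",
}


def build_command(platform: str, servicename: str) -> str:
    """Pick the CLI command that matches the host platform."""
    try:
        template = SHOW_COMMANDS[platform]
    except KeyError:
        raise ValueError(f"no show command defined for platform {platform!r}") from None
    return template.format(servicename=servicename)


def cli_stats_sketch(task, vars_file: str) -> None:
    """Nornir task: load the vars, then run the platform-specific command."""
    # Imported lazily so the sketch loads without the Nornir plugins installed.
    from nornir_netmiko.tasks import netmiko_send_command
    from nornir_utils.plugins.tasks.data import load_yaml

    data = task.run(task=load_yaml, file=vars_file)
    servicename = data.result["servicename"]  # hypothetical key name
    command = build_command(task.host.platform, servicename)
    # The sub-task result is attached to this task's results automatically.
    task.run(task=netmiko_send_command, command_string=command)


print(build_command("iosxr", "CUST-A"))
```

Using a lookup table instead of an if/elif chain means supporting a third platform is a one-line change.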
I am working on a video tutorial and demonstration of Nornir 3.0. During the video, I will create additional tasks that verify L3 connectivity via simple ping commands.
Bulk of the Code:
Let’s review the iac_render function. We simply load our YAML vars and render our J2 templates. Pay special attention to the following:
At this point we have our payload to deploy against our devices. One thing to note: the value returned by the Nornir template_file plugin is a Nornir result object, not a plain string. Make sure it gets converted to a str: “payload = str(vprn.result)”. We will pass this into our netconf_edit_config task as the payload to deploy via NETCONF.
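To make the load-and-render step concrete, here is a rough sketch of just that part. It is not the author’s iac_render: `render_payload`, `vars_path` and `template_name` are hypothetical helpers, and the vars directory and the `<platform>-<function>.j2` naming scheme are assumptions made for illustration only.

```python
def vars_path(platform: str, vars_dir: str = "vars") -> str:
    """Hypothetical layout: one YAML vars file per platform."""
    return f"{vars_dir}/{platform}.yml"


def template_name(platform: str, function: str = "vprn") -> str:
    """Hypothetical naming scheme: <platform>-<function>.j2."""
    return f"{platform}-{function}.j2"


def render_payload(task, template_dir: str = "templates") -> str:
    """Nornir task: load the YAML vars and render the matching J2 template."""
    # Imported lazily so the sketch loads without the Nornir plugins installed.
    from nornir_jinja2.plugins.tasks import template_file
    from nornir_utils.plugins.tasks.data import load_yaml

    platform = task.host.platform
    data = task.run(task=load_yaml, file=vars_path(platform))
    vprn = task.run(
        task=template_file,
        path=template_dir,
        template=template_name(platform),
        data=data.result,  # available inside the template as `data`
    )
    # Convert to a plain string before handing it to the NETCONF task.
    return str(vprn.result)


print(vars_path("alcatel_sros"), template_name("alcatel_sros"))
```

Deriving both file names from task.host.platform keeps a single task usable for every vendor in the filter.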
deploy_config = task.run(task=netconf_edit_config, target=task.host['target'], config=payload)
Let’s examine this line of code. We assign the task’s returned output to the variable ‘deploy_config’. The task we execute is the ‘netconf_edit_config‘ function. Again, this is a wrapper around ncclient, which I hope you’re familiar with; if not, please give it a Google search or review the additional resources at the bottom of the doc. The ‘target=task.host['target']’ argument is the datastore to use during our NETCONF RPC call. We specified this for our host inside our groups file. See below:

    NOKIA:
        username: 'admin'
        password: 'admin'
        platform: alcatel_sros
        port: 22
        data:
            target: candidate

NETCONF defines three configuration datastores that we can target with configuration changes: running, candidate, and startup.
In my opinion, candidate is the most valuable datastore. We are able to input a config change, validate it, and commit it only once we are sure of the changes. As the operator of this network, we must be sure not to cause any outages or create any ripple effects from our automation. We will inspect the RPC reply to ensure all is good and, if so, commit our changes for the customer.
Line 23 has a conditional statement where we check the platform of the host that is running within our task. We simply compare it to alcatel_sros or iosxr, as those are our two core devices in this example. We extract a couple of different items from the result of our loaded YAML file, which we use to return some output to the screen and present results in a readable format. We do the same with our iosxr results.
At this point, the netconf_edit_config wrapper for ncclient should have executed the NETCONF RPC and edited the configuration.
We store the reply in a variable called rpcreply by extracting the .result attribute from our original deploy_config variable. This gives us the XML reply, and we can check its outcome by using ‘if rpcreply.ok:’
Line 42 gives us simple feedback to let us know the result has returned OK. We now run the netconf_commit task and confirm the change.
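The edit, check, then commit flow can be sketched as a small helper. This is an outline under assumptions, not the script from the repo: `deploy_and_commit` is a made-up name, the two task functions are passed in as arguments, and `FakeTask` exists only so the flow can be dry-run without a device.

```python
from types import SimpleNamespace


def deploy_and_commit(task, payload: str, edit_task, commit_task) -> bool:
    """Push the payload, then commit only if the rpc-reply is OK."""
    deploy_config = task.run(
        task=edit_task, target=task.host["target"], config=payload
    )
    rpcreply = deploy_config.result
    if not rpcreply.ok:
        return False  # leave the change uncommitted for the operator to inspect
    task.run(task=commit_task)
    return True


class FakeTask:
    """Dry-run stand-in for a Nornir Task; records which sub-tasks were run."""

    def __init__(self, reply_ok: bool):
        self.host = {"target": "candidate"}
        self.calls = []
        self._reply_ok = reply_ok

    def run(self, task, **kwargs):
        self.calls.append(task)
        return SimpleNamespace(result=SimpleNamespace(ok=self._reply_ok))


dry_run = FakeTask(reply_ok=True)
print(deploy_and_commit(dry_run, "<config/>", "edit", "commit"), dry_run.calls)
```

In the runbook the call would presumably look like `deploy_and_commit(task, payload, netconf_edit_config, netconf_commit)`. Note that a commit only makes sense when the target is the candidate datastore.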
Finally, let’s validate some of the services applied by running our custom function ‘nc_getvprn’ against our Nokia 7750.
nc_getvprn(task, serviceid=serviceid, servicename=servicename, customerid=customerid)
Earlier, we extracted some vars from our YAML file and loaded them into the script as ‘serviceid’, ‘servicename’ and ‘customerid’. We use these variables to execute the task and get some information by parsing the result of the netconf_get_config RPC call. We process this information by using xmltodict.parse to convert the XML into a Python dictionary. We compare the values found inside our running configuration against the desired state of our network element. Infrastructure as code is fun, right? Once we have compared our items, we return meaningful output to the screen to let us, the operator, know that everything is configured as expected.
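The comparison step can be pictured with the short sketch below. It is not the nc_getvprn function itself: `find_mismatches` and `parse_reply` are illustrative names, and the dictionary keys in the example are placeholders rather than the real Nokia YANG paths.

```python
from typing import Any, Dict, List


def find_mismatches(running: Dict[str, Any], desired: Dict[str, Any]) -> List[str]:
    """Compare desired values against what was read back from the device.

    Values are compared as strings, because values parsed from XML arrive
    as text while values loaded from YAML may be integers.
    """
    mismatches = []
    for key, want in desired.items():
        have = running.get(key)
        if str(have) != str(want):
            mismatches.append(f"{key}: expected {want!r}, found {have!r}")
    return mismatches


def parse_reply(xml_text: str) -> Dict[str, Any]:
    """Convert a NETCONF XML reply into a dictionary (requires xmltodict)."""
    import xmltodict  # imported lazily so the sketch loads without xmltodict

    return xmltodict.parse(xml_text)


# Placeholder keys; the real ones depend on the device's YANG model.
running = {"service-id": "100", "service-name": "CUST-A", "customer": "1"}
desired = {"service-id": 100, "service-name": "CUST-A", "customer": 2}
for line in find_mismatches(running, desired):
    print(line)
```

An empty list means the running configuration matches the desired state for every key that was checked.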
If you are not familiar with xmltodict, I will provide additional references at the bottom of this document.
At the time of this writing I have only completed the compliance check against the Nokia SR OS device. I will most likely extend this code to do the same against the IOSxR device. Below is the custom ‘nc_getvprn’ function which we just described.
It’s that easy. We run our tasks against our filter, west_region, to narrow down our hosts for our multi-vendor environment. Let’s review the output!
From the output above, we deployed our L3VPN on the IOSxR device and our VPRN on the Nokia device. After taking full advantage of IaC plus Nornir, we return to CLI-scraping automation and rely on Netmiko to run simple show commands, confirming that the VRF is actually present and validating the services.