Regular Expressions for Show Output

.split() handles output where fields sit in predictable positions. Real show output isn't always that polite. Regular expressions describe the SHAPE of what you want — an IP, a MAC, a serial — and pull it out of any text that contains it.

In this lesson you will:
  • Read and write the core regex vocabulary — classes, quantifiers, anchors
  • Extract single facts with re.search() and capture groups
  • Harvest every IP or MAC in a blob of output with re.findall()
  • Handle the no-match case without the classic NoneType crash
  • Know when regex is the wrong tool — and what the pros use instead

The problem .split() can’t solve

Lesson 1 extracted a serial with line.split()[3] — which works while the serial is always the fourth word. But show output shifts: fields go missing, columns vary by platform, and the thing you want floats somewhere in a paragraph. What stays constant is the shape of the data — an IPv4 address is always four number-groups joined by dots, no matter where it sits. Regular expressions let you describe that shape:

import re

arp_line = "Internet  10.20.30.1   4   0024.c4e9.48ae  ARPA   Vlan10"
m = re.search(r"\d+\.\d+\.\d+\.\d+", arp_line)
if m:
    print(m.group())        # 10.20.30.1

Two things to lock in immediately:

  1. re.search() scans the whole string for the first place the pattern fits, and returns a match object — or None if nothing fit.
  2. Patterns are raw strings: r"\d+", with the r prefix. Regex is built on backslashes, and so are Python string escapes; the r tells Python to pass your backslashes through untouched instead of interpreting them first.
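
Both points can be seen in a few lines. The sketch below uses \b, which means "word boundary" to the regex engine but "backspace" to an ordinary Python string; the variable names raw_hit and plain_hit are made up for illustration.

```python
import re

line = "Internet  10.20.30.1   4   0024.c4e9.48ae  ARPA   Vlan10"

# Raw string: the regex engine receives backslash + b, a word boundary.
raw_hit = re.search(r"\bVlan\d+", line)

# Ordinary string: Python converts "\b" to a backspace character first,
# so the regex hunts for a literal backspace and finds nothing.
plain_hit = re.search("\bVlan\\d+", line)

print(raw_hit.group() if raw_hit else None)   # Vlan10
print(plain_hit)                              # None
print(len(r"\b"), len("\b"))                  # 2 1
```

The second search fails silently by returning None rather than raising an error, which is why a missing r prefix can be hard to spot.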

The vocabulary (you need less than you think)

Pattern           Means                                             Network example
\d                a digit                                           \d+: a VLAN ID
\w                letter, digit, or _                               part of a hostname
\s / \S           whitespace / NON-whitespace                       \S+: “one word”, your workhorse
+ / *             one-or-more / zero-or-more of the previous thing  \d+: 1 to many digits
?                 the previous thing is optional                    Gig?
{4}               exactly 4 of the previous thing                   [0-9a-f]{4}: one MAC chunk
[abc] / [0-9a-f]  any one character from the set                    hex digits
^ / $             start / end of the string                         ^interface: line opens a stanza
.                 any single character                              the dot in an IP must be \. (escaped!)
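
To see several of these pieces working, here is a small sketch run against a made-up config snippet (the hostname and interface names are illustrative, not from a real device). One detail it brings out: ^ anchors at the start of the whole string by default, so matching at the start of each line in multi-line output needs the re.MULTILINE flag.

```python
import re

config = """hostname edge-sw01
interface GigabitEthernet1/0/1
 switchport access vlan 10
interface Gi1/0/2
 switchport access vlan 20
"""

# Without a flag, ^ only matches at the very start of the string,
# and this string starts with "hostname".
no_flag = re.findall(r"^interface (\S+)", config)

# re.MULTILINE lets ^ match at the start of every line.
stanzas = re.findall(r"^interface (\S+)", config, flags=re.MULTILINE)

# \d+ : one or more digits (VLAN IDs here).
vlans = re.findall(r"vlan (\d+)", config)

# {4} : exactly four hex digits, one chunk of a Cisco-format MAC.
chunk = re.search(r"[0-9a-f]{4}", "0024.c4e9.48ae")

# ? : the "g" is optional, so both spellings match.
both = [bool(re.search(r"Gig?", s)) for s in ("Gi1/0/2", "GigabitEthernet1/0/1")]

print(no_flag)        # []
print(stanzas)        # ['GigabitEthernet1/0/1', 'Gi1/0/2']
print(vlans)          # ['10', '20']
print(chunk.group())  # 0024
print(both)           # [True, True]
```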

Three patterns worth memorizing because you’ll type them for the rest of your career:

ip_pattern  = r"\d+\.\d+\.\d+\.\d+"                       # IPv4 (practical form)
mac_pattern = r"[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}"   # Cisco MAC
word        = r"\S+"                                       # "the next field"
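
"Practical form" means the IP pattern checks shape only, so it will also accept something like 999.1.2.3. When that matters, one option is to let the regex find candidates and let the standard library's ipaddress module judge them. The sketch below takes that approach; the sample text and the helper name is_real_ipv4 are hypothetical.

```python
import ipaddress
import re

ip_pattern = r"\d+\.\d+\.\d+\.\d+"
mac_pattern = r"[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}"

# Illustrative text with one real address and one impossible one.
text = "Internet 10.20.30.1 4 0024.c4e9.48ae ARPA Vlan10 (bogus: 999.1.2.3)"

candidates = re.findall(ip_pattern, text)
macs = re.findall(mac_pattern, text)

def is_real_ipv4(candidate):
    """Return True if the string is a valid IPv4 address (hypothetical helper)."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False

valid = [ip for ip in candidates if is_real_ipv4(ip)]

print(candidates)   # ['10.20.30.1', '999.1.2.3']
print(valid)        # ['10.20.30.1']
print(macs)         # ['0024.c4e9.48ae']
```

For device output you trust, the plain pattern is usually enough; the extra check earns its place when the text comes from user input or free-form logs.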

Capture groups: extracting, not just finding

Wrap part of a pattern in parentheses and the match object remembers what that part matched. This is how a find becomes an extract:

line = "Processor board ID FOC2217A0AB"
m = re.search(r"Processor board ID (\S+)", line)
if m:
    serial = m.group(1)     # 'FOC2217A0AB'

group(0) (or plain group()) is everything the pattern matched; your parentheses are numbered from 1, left to right. Two groups pull two fields at once — here, interface name and IP from show ip interface brief:

line = "GigabitEthernet1/0/1   10.20.30.1      YES manual up      up"
m = re.search(r"^(\S+)\s+(\d+\.\d+\.\d+\.\d+)", line)
if m:
    intf, ip = m.group(1), m.group(2)
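
The if m: guard in these examples is what keeps a non-matching line from crashing the script. Here is a minimal sketch of both outcomes, using a sample uptime line and a hypothetical helper named parse_uptime:

```python
import re

UPTIME = r"^(\S+) uptime is (.+)$"

def parse_uptime(line):
    """Return (hostname, uptime) or None if the line does not match."""
    m = re.search(UPTIME, line)
    if m is None:
        return None
    return m.group(1), m.group(2)

good = parse_uptime("edge-sw01 uptime is 3 weeks, 2 days")
bad = parse_uptime("Processor board ID FOC2217A0AB")

# Skipping the guard: re.search() returns None, and None has no .group().
try:
    re.search(UPTIME, "no match here").group(1)
except AttributeError as exc:
    crash = str(exc)

print(good)    # ('edge-sw01', '3 weeks, 2 days')
print(bad)     # None
print(crash)
```

Returning None from the helper pushes the decision to the caller, which can skip the line, log it, or raise, depending on how strict the script needs to be.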

re.findall(): harvest everything at once

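Where search finds the first match, findall returns a list of all of them. You get plain strings rather than match objects: the whole match when the pattern has no groups, or the captured text when it has exactly one group. (With two or more groups, each item is a tuple of the captured strings.) With no groups: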

output = """
Internet  10.20.30.1    4   0024.c4e9.48ae  ARPA  Vlan10
Internet  10.20.30.45   12  6c41.0e9a.1f02  ARPA  Vlan10
Internet  10.20.31.1    8   0024.c4e9.51bb  ARPA  Vlan20
"""
ips = re.findall(r"\d+\.\d+\.\d+\.\d+", output)
# ['10.20.30.1', '10.20.30.45', '10.20.31.1']

One line, every IP in an ARP table. Combine with Lesson 4 and set(re.findall(...)) dedupes as it harvests.
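
As a sketch of how groups change what findall returns, the example below reuses the same ARP output. One group gives a list of strings, two groups give a list of tuples, and those tuples can be fed straight into dict(). The variable names are illustrative.

```python
import re

output = """
Internet  10.20.30.1    4   0024.c4e9.48ae  ARPA  Vlan10
Internet  10.20.30.45   12  6c41.0e9a.1f02  ARPA  Vlan10
Internet  10.20.31.1    8   0024.c4e9.51bb  ARPA  Vlan20
"""

# One group: findall returns only the captured text for each match.
vlans = re.findall(r"Vlan(\d+)", output)

# set() removes duplicates; sorted() puts them back in a stable order.
unique_vlans = sorted(set(vlans), key=int)

# Two groups: findall returns one (ip, mac) tuple per match.
pairs = re.findall(r"(\d+\.\d+\.\d+\.\d+)\s+\d+\s+(\S+)", output)

# A list of 2-tuples converts directly into an IP-to-MAC lookup table.
arp = dict(pairs)

print(vlans)                 # ['10', '10', '20']
print(unique_vlans)          # ['10', '20']
print(pairs[0])              # ('10.20.30.1', '0024.c4e9.48ae')
print(arp["10.20.30.45"])    # 6c41.0e9a.1f02
```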


Exercises (graded)

cd labs/python-foundations/lesson05
pytest -q

First lab that needs an import — put import re at the top of exercises.py. Five functions:

  1. find_serial(text) — the serial from a Processor board ID line, or None
  2. find_all_ips(text) — every IPv4 address in a blob, in order
  3. find_macs(text) — every Cisco-format MAC (aaaa.bbbb.cccc)
  4. interface_ip(line) returns a (name, ip) tuple from a show ip int brief line, or None
  5. is_valid_hostname(name) — enforce the naming standard: starts with a lowercase letter, then lowercase letters, digits, or hyphens only
✅ Check your understanding

Why are regex patterns written as raw strings, like r"\d+"?


Summary

Regex describes the shape of data instead of its position: \d, \S, quantifiers, and anchors cover most of what network text demands, written always as raw strings. re.search() plus a capture group turns finding into extracting — guarded by if m:, because the NoneType-has-no-group crash is the most common regex bug there is. re.findall() harvests every match in one pass, and knowing when to graduate from hand-rolled patterns to TextFSM templates is itself a professional skill — one the flagship course builds directly on this lesson. Next up: functions — packaging the parsers you’ve been writing into tools you can reuse.