Regular Expressions for Show Output

.split() handles output where fields sit in predictable positions. Real show output isn't always that polite. Regular expressions describe the SHAPE of what you want — an IP, a MAC, a serial — and pull it out of any text that contains it.

The problem .split() can’t solve

Lesson 1 extracted a serial with line.split()[3] — which works while the serial is always the fourth word. But show output shifts: fields go missing, columns vary by platform, and the thing you want floats somewhere in a paragraph. What stays constant is the shape of the data — an IPv4 address is always four number-groups joined by dots, no matter where it sits. Regular expressions let you describe that shape:

import re

arp_line = "Internet  10.20.30.1   4   0024.c4e9.48ae  ARPA   Vlan10"
m = re.search(r"\d+\.\d+\.\d+\.\d+", arp_line)
if m:
    print(m.group())        # 10.20.30.1

Two things to lock in immediately:

re.search() scans the whole string for the first place the pattern fits, and returns a match object — or None if nothing fit.
Patterns are raw strings — r"\d+", with the r prefix. Regex is built on backslashes, and so are Python string escapes; the r tells Python to pass your backslashes through untouched instead of interpreting them first.
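
Both points can be checked in a few lines. This is a minimal sketch: the sample strings are invented, and \b (a regex word boundary) is not otherwise covered in this lesson. It appears here only because Python also treats "\b" as a string escape (backspace), which makes the effect of the r prefix visible.

```python
import re

line = "Internet  10.20.30.1   4   0024.c4e9.48ae  ARPA   Vlan10"

# 1. re.search() returns None when nothing fits, so test the result
#    before calling .group() on it.
m = re.search(r"\d+\.\d+\.\d+\.\d+", "no address on this line")
print(m)                                   # None

# 2. Without the r prefix, Python turns "\b" into a single backspace
#    character before the regex engine sees it. With the prefix, the
#    engine receives the two characters \ and b (a word boundary).
print(len("\b"), len(r"\b"))               # 1 2
raw_hit = re.search(r"\bVlan10\b", line)   # matches 'Vlan10'
plain_hit = re.search("\bVlan10\b", line)  # looks for backspaces, finds none
print(raw_hit is not None, plain_hit is None)   # True True
```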

The vocabulary (you need less than you think)

Pattern	Means	Network example
`\d`	a digit	`\d+` — a VLAN ID
`\w`	letter, digit, or `_`	part of a hostname
`\s` / `\S`	whitespace / NON-whitespace	`\S+` — “one word”, your workhorse
`+` / `*`	one-or-more / zero-or-more of the previous thing	`\d+` — 1 to many digits
`?`	the previous thing is optional	`Gig?` (matches `Gi` or `Gig`)
`{4}`	exactly 4 of the previous thing	`[0-9a-f]{4}` — one MAC chunk
`[abc]` / `[0-9a-f]`	any one character from the set	hex digits
`^` / `$`	start / end of the string	`^interface` — line opens a stanza
`.`	any single character (except a newline, by default)	the dot in an IP must be `\.` (escaped!)
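
To see several rows of the table in action, here is a small sketch. The input strings and variable names are made up for illustration and are not taken from any particular device.

```python
import re

# Quantifier: one or more digits
vlan = re.search(r"\d+", "switchport access vlan 120").group()

# Optional character: the trailing g may or may not be present
short_name = re.search(r"Gig?", "Gi1/0/1").group()
long_name = re.search(r"Gig?", "Gig1/0/1").group()

# Character set plus exact count: each 4-hex-digit chunk of a MAC
chunks = re.findall(r"[0-9a-f]{4}", "0024.c4e9.48ae")

# Anchor: the pattern must sit at the very start of the string
opens_stanza = bool(re.search(r"^interface", "interface Vlan10"))
mid_line = bool(re.search(r"^interface", "no interface Vlan10"))

# An unescaped dot matches any character, so the first pattern is too loose
loose = bool(re.search(r"^\d+.\d+.\d+.\d+$", "10x20y30z1"))
strict = bool(re.search(r"^\d+\.\d+\.\d+\.\d+$", "10x20y30z1"))

print(vlan)                     # 120
print(short_name, long_name)    # Gi Gig
print(chunks)                   # ['0024', 'c4e9', '48ae']
print(opens_stanza, mid_line)   # True False
print(loose, strict)            # True False
```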

Three patterns worth memorizing because you’ll type them for the rest of your career:

ip_pattern  = r"\d+\.\d+\.\d+\.\d+"                       # IPv4 (practical form)
mac_pattern = r"[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}"   # Cisco MAC
word        = r"\S+"                                       # "the next field"
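
A short sketch of the three patterns applied to one ARP line follows. It also shows what "practical form" costs: the IPv4 pattern checks shape only, so it accepts octets above 255. The is_real_ipv4 helper is hypothetical, and using the standard library's ipaddress module as a second check is one possible approach, not something this lesson prescribes.

```python
import ipaddress
import re

ip_pattern = r"\d+\.\d+\.\d+\.\d+"
mac_pattern = r"[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}"
word = r"\S+"

arp_line = "Internet  10.20.30.1   4   0024.c4e9.48ae  ARPA   Vlan10"

# The line is known to match here, so .group() is called directly;
# on real device output, guard each result with "if m:" first.
ip = re.search(ip_pattern, arp_line).group()
mac = re.search(mac_pattern, arp_line).group()
fields = re.findall(word, arp_line)   # every whitespace-separated field


def is_real_ipv4(candidate):
    """Hypothetical helper: True only for a valid IPv4 address."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False


# The shape matches even though 999 is not a legal octet
loose = re.search(ip_pattern, "version 999.1.1.1").group()

print(ip, mac, fields[-1])           # 10.20.30.1 0024.c4e9.48ae Vlan10
print(loose, is_real_ipv4(loose))    # 999.1.1.1 False
print(is_real_ipv4(ip))              # True
```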

Capture groups: extracting, not just finding

Wrap part of a pattern in parentheses and the match object remembers what that part matched. This is how a find becomes an extract:

line = "Processor board ID FOC2217A0AB"
m = re.search(r"Processor board ID (\S+)", line)
if m:
    serial = m.group(1)     # 'FOC2217A0AB'

group(0) (or plain group()) is everything the pattern matched; your parentheses are numbered from 1, left to right. Two groups pull two fields at once — here, interface name and IP from show ip interface brief:

line = "GigabitEthernet1/0/1   10.20.30.1      YES manual up      up"
m = re.search(r"^(\S+)\s+(\d+\.\d+\.\d+\.\d+)", line)
if m:
    intf, ip = m.group(1), m.group(2)
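
The group numbering described above can be inspected directly. This sketch reuses the same line and pattern, and adds m.groups(), a standard match-object method that returns all captured groups as one tuple.

```python
import re

line = "GigabitEthernet1/0/1   10.20.30.1      YES manual up      up"
m = re.search(r"^(\S+)\s+(\d+\.\d+\.\d+\.\d+)", line)

if m:
    whole = m.group(0)       # everything the pattern matched, spaces included
    first = m.group(1)       # first pair of parentheses
    second = m.group(2)      # second pair of parentheses
    intf, ip = m.groups()    # all groups at once, as a tuple

    print(repr(whole))       # 'GigabitEthernet1/0/1   10.20.30.1'
    print(first, second)     # GigabitEthernet1/0/1 10.20.30.1
    print(m.groups())        # ('GigabitEthernet1/0/1', '10.20.30.1')
```

Unpacking m.groups() is a compact alternative to calling m.group(1) and m.group(2) separately when every group is wanted.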

re.findall(): harvest everything at once

Where search finds the first match, findall returns a list of all of them. There are no match objects, just strings: the full match if the pattern has no groups, the captured text if it has exactly one group, or a tuple of captured texts per match if it has several:

output = """
Internet  10.20.30.1    4   0024.c4e9.48ae  ARPA  Vlan10
Internet  10.20.30.45   12  6c41.0e9a.1f02  ARPA  Vlan10
Internet  10.20.31.1    8   0024.c4e9.51bb  ARPA  Vlan20
"""
ips = re.findall(r"\d+\.\d+\.\d+\.\d+", output)
# ['10.20.30.1', '10.20.30.45', '10.20.31.1']

One line, every IP in an ARP table. Wrap the call in a set (Lesson 4), as in set(re.findall(...)), and duplicates are dropped as you harvest, though the original order is lost.
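
How findall behaves once capture groups are involved is easiest to see side by side. The sketch below reuses the ARP sample; the variable names and the choice to build a dict from the pairs are illustrative assumptions.

```python
import re

output = """
Internet  10.20.30.1    4   0024.c4e9.48ae  ARPA  Vlan10
Internet  10.20.30.45   12  6c41.0e9a.1f02  ARPA  Vlan10
Internet  10.20.31.1    8   0024.c4e9.51bb  ARPA  Vlan20
"""

# One group: findall returns only the captured text for each match
vlans = re.findall(r"Vlan(\d+)", output)
unique_vlans = sorted(set(vlans))      # set() dedupes, sorted() restores an order

# Two groups: each match becomes a (ip, mac) tuple.
# The unparenthesized \d+ in the middle skips the age column.
pairs = re.findall(r"(\d+\.\d+\.\d+\.\d+)\s+\d+\s+(\S+)", output)
arp_table = dict(pairs)                # IP -> MAC lookup

print(vlans)                      # ['10', '10', '20']
print(unique_vlans)               # ['10', '20']
print(pairs[0])                   # ('10.20.30.1', '0024.c4e9.48ae')
print(arp_table["10.20.31.1"])    # 0024.c4e9.51bb
```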

🖥 Patterns that read show output

▶ Try it yourself

import re

output = """GigabitEthernet1/0/1   10.20.30.1    YES manual up      up
GigabitEthernet1/0/2   unassigned    YES unset  down    down
TenGigE1/1/1           10.20.99.5    YES manual up      up"""

# Every IP in the output
print(re.findall(r"\d+\.\d+\.\d+\.\d+", output))

# Interface + IP pairs, line by line
for line in output.splitlines():
    m = re.search(r"^(\S+)\s+(\d+\.\d+\.\d+\.\d+)", line)
    if m:
        print(f"{m.group(1):<22} {m.group(2)}")

# Your turn:
# 1. Why didn't Gi1/0/2 print? (Check its second field against the pattern)
# 2. Find every interface NAME instead — pattern: r"^\S+"


Exercises (graded)

cd labs/python-foundations/lesson05
pytest -q

First lab that needs an import — put import re at the top of exercises.py. Five functions:

find_serial(text) — the serial from a Processor board ID line, or None
find_all_ips(text) — every IPv4 address in a blob, in order
find_macs(text) — every Cisco-format MAC (aaaa.bbbb.cccc)
interface_ip(line) — (name, ip) tuple from a show ip int brief line, or None
is_valid_hostname(name) — enforce the naming standard: starts with a lowercase letter, then lowercase letters, digits, or hyphens only

✅ Check your understanding

Why are regex patterns written as raw strings, like r"\d+"?

Summary

Regex describes the shape of data instead of its position: \d, \S, quantifiers, and anchors cover most of what network text demands, always written as raw strings. re.search() plus a capture group turns finding into extracting, guarded by if m: because calling .group() on None raises AttributeError ('NoneType' object has no attribute 'group'), one of the most common regex bugs. re.findall() harvests every match in one pass. Knowing when to graduate from hand-rolled patterns to TextFSM templates is itself a professional skill, one the flagship course develops by building directly on this lesson. Next up: functions, which package the parsers you’ve been writing into tools you can reuse.