Simple Packet Inspection with AI/LLM
Why I Started
I wanted to see how far a large-language model (LLM) could take me when I’m trying to understand raw network traffic.
Two ideas came to mind:
- Export a tiny hex-dump from Wireshark and let the model classify it.
- Use the OpenCode TUI to capture a few thousand packets, run a handful of shell commands, and have the model stitch the results together.
Both approaches promise the same thing: the AI does the heavy pattern recognition while I stay in control of the data. I ran everything against an on-premises (local) LLM.
1. What I Needed
| Tool | Reason I chose it | How I installed it |
|---|---|---|
| Wireshark | Quick visual capture; easy hex-dump export. | apt-get install wireshark |
| tcpdump | Script-friendly, works without a GUI. | apt-get install tcpdump |
| OpenCode | Provides a terminal UI that can call LLMs and run shell commands in one place. | See the “Installing OpenCode” section below. |
| API key | Required to talk to the local models. | Grab one from your provider and replace the placeholder <your-api-key> in the config file. |
2. Installing OpenCode (the TUI I used)
# 1️⃣ Create a local bin directory
mkdir -p ~/.opencode/bin
# 2️⃣ Pull the installer and run it
curl -L https://opencode.ai/install.sh | sh
# 3️⃣ Put OpenCode on my PATH
echo 'export PATH="$HOME/.opencode/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
2.1 My Provider Configuration
I created ~/.opencode/config.json and pasted the following (only the baseURL and apiKey placeholders need editing):
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"educloud": {
"npm": "@ai-sdk/openai-compatible",
"name": "Educloud Provider",
"options": {
"baseURL": "<your-local-api-endpoint>",
"apiKey": "<your-api-key>"
},
"models": {
"Qwen3-Coder-480B-A35B-Instruct": {
"name": "Qwen3-Coder-480B-A35B-Instruct",
"options": {
"contextSize": 128000
}
}
}
}
},
"permission": {
"edit": "allow",
"bash": "ask",
"webfetch": "deny"
}
}
I kept the permission block simple: the model may edit files without asking, it has to ask before running any Bash command, and it is not allowed to fetch anything from the web.
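Before starting OpenCode it can be worth checking that no placeholder survived in the config. The helper below is a minimal sketch: the function name `has_placeholders` is my own, and it only assumes that placeholders follow the `<your-...>` pattern used in the config above.

```shell
# has_placeholders FILE
# Prints every line that still contains a <your-...> placeholder.
# Exit status is 0 if at least one placeholder is left, non-zero otherwise.
has_placeholders() {
  grep -n '<your-[^>]*>' "$1"
}

# Example usage (path taken from the text above):
#   if has_placeholders ~/.opencode/config.json; then
#     echo "replace the placeholders listed above first" >&2
#   fi
```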
3. Capturing Packets
3.1 Tiny Demo – Wireshark → Hex Dump
- I opened Wireshark, started a capture on my primary NIC, and stopped after a few seconds.
- I exported the capture as “Plain Text (Hex Dump)”, deliberately stripping timestamps and column headers.
- I saved the file as sample.hex.
A few lines of the file (with made-up values) look like this:
0000 01 00 5e 00 00 fb 00:11:22:33:44:55 5e 00:11:22:33:44:55 0f d5 08 00 45 00
0010 01 0d 22 7c c0 00 00 01 11 f4 9c 192.0.2.10 00 24 198.51.100.20 00
...
I replaced the real MAC and IP addresses with readable placeholders, so the lines above illustrate the layout of the file and are not a valid byte stream.
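To make the layout concrete: in an Ethernet II frame the first 14 bytes are the destination MAC (6 bytes), the source MAC (6 bytes), and the EtherType (2 bytes). The sketch below decodes those fields from the first row of a dump. The function name `ethernet_header` and the sample row are hypothetical, and the row is assumed to contain an offset followed by plain two-digit hex bytes.

```shell
# ethernet_header
# Reads hex-dump rows on stdin and, for each row at offset 0000, prints the
# destination MAC, source MAC, and EtherType of the frame.
ethernet_header() {
  awk '
    $1 == "0000" && NF >= 15 {
      dst = $2 ":" $3 ":" $4 ":" $5 ":" $6 ":" $7
      src = $8 ":" $9 ":" $10 ":" $11 ":" $12 ":" $13
      printf "dst=%s src=%s ethertype=0x%s%s\n", dst, src, $14, $15
    }
  '
}

# Made-up row: multicast destination, placeholder source, EtherType 0x0800 (IPv4).
printf '%s\n' '0000 01 00 5e 00 00 fb 00 11 22 33 44 55 08 00 45 00' | ethernet_header
# prints: dst=01:00:5e:00:00:fb src=00:11:22:33:44:55 ethertype=0x0800
```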
3.2 Larger Capture – OpenCode + tcpdump
Inside the OpenCode TUI I asked it to list interfaces and capture 2000 packets.
> ip link show
> sudo tcpdump -i enp3s0 -c 2000 -w network_capture.pcap
OpenCode printed the interface list, then captured exactly 2000 packets on enp3s0.
The capture lasted about 24 seconds, and the tool confirmed that no packets were dropped.
4. Getting the Data Ready for the LLM
4.1 Turning the hex dump into a single string
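awk '{ for (i = 2; i <= NF - 1; i++) printf "%s ", $i; print "" }' sample.hex > raw.hex
raw.hex now contains only the hexadecimal bytes: the offset column and the last whitespace-separated field (the ASCII column, as long as it contains no spaces) are dropped. Each output line corresponds to one row of the dump, not to a whole packet.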
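If one line per packet is wanted, the rows have to be joined. The following is a sketch under a loud assumption: the export uses fixed columns, with a four-digit offset, two spaces, and then up to 16 bytes in the next 47 characters, and every packet restarts at offset 0000. The function name `join_packets` is hypothetical; if your export is laid out differently, the `substr` positions need adjusting.

```shell
# join_packets FILE
# Prints one line of space-separated hex bytes per packet.
join_packets() {
  awk '
    # A row at offset 0000 starts a new packet: flush the previous one.
    /^0000 / { if (line != "") print line; line = "" }
    # Any row that starts with a four-digit hex offset carries bytes.
    /^[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F] / {
      bytes = substr($0, 7, 47)       # assumed byte columns, ASCII column excluded
      gsub(/^ +| +$/, "", bytes)      # trim padding on short rows
      if (bytes == "") next
      line = (line == "") ? bytes : line " " bytes
    }
    END { if (line != "") print line }
  ' "$1"
}

# Example usage:
#   join_packets sample.hex > raw.hex
```

Cutting by column position rather than by field keeps an ASCII column that happens to contain spaces or hex-looking text out of the result.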
4.2 Summarising the pcap file
I let OpenCode run a couple of quick tcpdump pipelines and saved the results to two files: first20.txt gives me a glimpse of the first 20 packets, while top-talkers.txt lists the most frequent source addresses.
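The exact pipelines are not reproduced here, so the following is only a sketch of what they might look like, not the commands OpenCode actually ran. The helper name `top_sources` is hypothetical. It assumes tcpdump's default one-line text output, where field 2 is `IP` or `IP6` and field 3 is the source (address plus port for TCP/UDP).

```shell
# top_sources [N]
# Reads tcpdump text output on stdin and prints the N most frequent
# sources (default 10) as "count source" lines, most frequent first.
top_sources() {
  awk '$2 == "IP" || $2 == "IP6" { print $3 }' \
    | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Example usage (-nn disables name and port resolution, -r reads a capture file):
#   tcpdump -nn -r network_capture.pcap -c 20 > first20.txt
#   tcpdump -nn -r network_capture.pcap | top_sources 10 > top-talkers.txt
```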
5. Asking the AI to Analyse
5.1 The Prompt I Fed the Model
You are a network-analysis assistant.
Given the following packet data (hex bytes or textual tcpdump output), perform:
1. Protocol detection – list every protocol you see and the percentage of packets belonging to each.
2. Host discovery – extract MAC addresses, IP addresses, hostnames (if any).
3. Top talkers – rank source→destination pairs by packet count.
4. Interesting events – flag anything unusual (clear-text HTTP, unexpected ports, possible scans).
Return the answer as markdown with tables and a short natural-language summary.
5.2 Running the Model
opencode run \
--model educloud.Qwen3-Coder-480B-A35B-Instruct \
--prompt-file prompt.txt \
--data-files raw.hex first20.txt top-talkers.txt
OpenCode streamed the model’s response back to me in the same terminal window. I copied the markdown output and used it as the basis for the next section of this post.
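If the runner at hand cannot attach several files, one fallback is to concatenate the prompt and the data into a single input and pass that instead. This is a sketch of that idea only; the function name `build_input` and the `--- name ---` separator format are my own choices, not something OpenCode requires.

```shell
# build_input PROMPT_FILE DATA_FILE...
# Prints the prompt, then each data file preceded by a header naming it.
build_input() {
  prompt=$1
  shift
  cat "$prompt"
  for f in "$@"; do
    printf '\n--- %s ---\n' "$(basename "$f")"
    cat "$f"
  done
}

# Example usage:
#   build_input prompt.txt raw.hex first20.txt top-talkers.txt > full-prompt.txt
```

Naming each file in a header line lets the model say which input a finding came from.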
6. What the Model Told Me (My Merged Results)
6.1 Protocol Breakdown
| Protocol | Packets | % of total |
|---|---|---|
| TCP (SSH, port 22) | 1 914 | 95.7 % |
| ICMPv6 (Neighbor Discovery) | 42 | 2.1 % |
| TLS/HTTPS (port 443) | 18 | 0.9 % |
| Other / Unidentified | 26 | 1.3 % |
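Numbers like these are easy to cross-check without the model. The sketch below recomputes a rough breakdown from tcpdump's text output. The function name `protocol_breakdown` is hypothetical, and the classification is deliberately crude: it goes by port 22, port 443, and the `ICMP6` keyword only, so it will not match a full dissector.

```shell
# protocol_breakdown
# Reads tcpdump text output on stdin and prints "protocol packets percent"
# lines, largest group first.
protocol_breakdown() {
  awk '
    !NF { next }                                   # skip blank lines
    { total++ }
    /ICMP6/ { n["ICMPv6"]++; next }
    $3 ~ /\.22$/  || $5 ~ /\.22:$/  { n["SSH"]++; next }
    $3 ~ /\.443$/ || $5 ~ /\.443:$/ { n["TLS/HTTPS"]++; next }
    { n["Other"]++ }
    END { for (p in n) printf "%s %d %.1f\n", p, n[p], 100 * n[p] / total }
  ' | sort -k2,2nr
}

# Example usage:
#   tcpdump -nn -r network_capture.pcap | protocol_breakdown
```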
6.2 Top Talkers
| Rank | Source → Destination | Packets |
|---|---|---|
| 1 | 192.0.2.10:22 → 198.51.100.20:60530 | 1 027 |
| 2 | 198.51.100.20:60530 → 192.0.2.10:22 | 887 |
| 3 | fe80::aabb:ccdd:eeff → 2001:db8::1 (ICMPv6) | 31 |
| 4 | 203.0.113.5:443 → 192.0.2.10:46158 (TLS) | 4 |
| 5 | [2001:db8::2]:443 → [2001:db8::3]:35712 (TLS) | 4 |
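The pair ranking can be reproduced the same way. This is a sketch, again assuming tcpdump's default text output, where field 3 is the source and field 5 is the destination followed by a colon; `top_pairs` is a hypothetical name.

```shell
# top_pairs [N]
# Reads tcpdump text output on stdin and prints the N most frequent
# "source -> destination" pairs (default 5) with their packet counts.
top_pairs() {
  awk '
    $2 == "IP" || $2 == "IP6" {
      dst = $5
      sub(/:$/, "", dst)          # drop the colon tcpdump appends to the destination
      print $3 " -> " dst
    }
  ' | sort | uniq -c | sort -rn | head -n "${1:-5}"
}

# Example usage:
#   tcpdump -nn -r network_capture.pcap | top_pairs 5
```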
6.3 Host & Service Summary
| Host | MAC / IP | Observed Services |
|---|---|---|
| My laptop | 00:11:22:33:44:55 / 192.0.2.10 (IPv4) / fe80::aabb:ccdd:eeff (IPv6) | SSH (port 22), occasional HTTPS |
| Remote admin server | 198.51.100.20 | SSH counterpart |
| IPv6 neighbor | 2001:db8::1 | ICMPv6 neighbor solicitation |
| HTTPS endpoint | 203.0.113.5 | TLS/SSL (port 443) |
7. Final Thoughts
- The hex-dump-only route is perfect for teaching, quick demos, or when I only need a handful of packets.
- The OpenCode TUI lets me capture a realistic amount of traffic, run shell commands, and keep everything inside a single terminal session, which is great for automation or when I’m working on a headless server.
- By combining both methods I now have a flexible toolbox that scales from a few lines of Wireshark output to multi-megabyte pcap files, while the data stays on my own infrastructure and I approve every shell command the model wants to run.