Email is one of the oldest forms of digital communication and is still prevalent today. Spam filtering is a mostly solved problem, but what remains in the inbox is still a chaotic mix of bank statements, temporary login codes and newsletters that Bayesian filtering cannot sort.
Why an LLM?
An LLM can infer context from a message in a way that conventional spam filtering cannot. It allows zero-shot sorting without first being trained on a user's past sorting decisions, and it can act on relationships that are only implied in the message contents.
Since emails contain either private personal or confidential professional messages, sending them to a remote API for sorting is not an option. Fortunately, local models can now run on a CPU alone and fit into the memory of consumer-grade hardware.
Goals
Since the point is to make email sorting easier without compromising privacy, we will keep the entire application simple. Local LLMs will be handled by ollama, which automates fetching and running models as well as CPU/GPU offloading. The rest is handled by a single Python script with no dependencies other than the standard library.
It is intended to run once and sort all mail in the inbox into predefined folders, creating missing folders on the fly.
Errors should stop the script in place, so it is designed to copy messages before removing the old ones: in the worst case, you will have two copies of the same message, but never lose one. It should be able to run blindly against a mailbox by only checking the inbox and moving processed messages out of it, preventing infinite loops by design.
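The copy-before-delete ordering described above can be sketched in isolation. This is a minimal illustration, not the script itself: `move_message` and `FakeConn` are hypothetical names, and `FakeConn` is only a stand-in that records calls so the ordering can be checked without a mail server.

```python
def move_message(conn, message_id, target):
    """Copy first; flag the original as deleted only if the copy succeeded."""
    status, _ = conn.copy(message_id, target)
    if status != "OK":
        # the original stays untouched in the inbox, so a re-run picks it up again
        raise RuntimeError(f"Failed to copy message {message_id!r} to {target}")
    conn.store(message_id, "+FLAGS", "\\Deleted")


class FakeConn:
    """Hypothetical stand-in for an imaplib connection that records calls."""

    def __init__(self, copy_status="OK"):
        self.copy_status = copy_status
        self.calls = []

    def copy(self, message_id, target):
        self.calls.append(("copy", message_id, target))
        return self.copy_status, [b""]

    def store(self, message_id, command, flags):
        self.calls.append(("store", message_id, command, flags))
        return "OK", [b""]
```

If the copy fails, nothing has been flagged for deletion yet, which is why a failed run can leave a duplicate behind but should not lose a message.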
Since users may want to run the tool against multiple inboxes, all configurable parameters should come from a simple JSON config file.
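Because everything comes from one JSON file, a typo in a key name would only surface halfway through a run. As an optional addition, a small check like the following could report problems up front. It is a sketch under assumptions: `validate_config` and `REQUIRED` are hypothetical names that are not part of the script, and the key names are the ones used in the sample config shown later in this article.

```python
import json

# sections and keys the sorting script reads from the config file
REQUIRED = {
    "imap": ("host", "port", "user", "pass", "use_ssl"),
    "ollama": ("host", "model"),
}


def validate_config(config):
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for section, keys in REQUIRED.items():
        if not isinstance(config.get(section), dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if key not in config[section]:
                problems.append(f"missing key: {section}.{key}")
    folders = config.get("folders")
    if not isinstance(folders, dict) or not folders:
        problems.append("folders must be a non-empty object of name: description pairs")
    return problems
```

The function only checks that the expected keys are present. It does not verify that the credentials or the model name are valid.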
Technical setup
You only need two things: install ollama, and save this script to a file with executable permissions:
imap_sort.py
#!/usr/bin/env python3
import sys
import json
import imaplib
import email
import urllib.request

# parse config
if len(sys.argv) != 2:
    raise RuntimeError("No config file provided")
with open(sys.argv[1], "r") as file:
    config = json.load(file)

# connect to imap (the port may be given as a number or a numeric string)
if config["imap"]["use_ssl"]:
    conn = imaplib.IMAP4_SSL(config["imap"]["host"], int(config["imap"]["port"]))
else:
    conn = imaplib.IMAP4(config["imap"]["host"], int(config["imap"]["port"]))
conn.login(config["imap"]["user"], config["imap"]["pass"])

# ensure sorting folders exist
# note: this assumes the server uses "/" as its folder hierarchy delimiter
remote_folders = set()
status, folders = conn.list()
if status != "OK":
    raise RuntimeError("Failed to list remote folders")
for line in folders:
    remote_folders.add(line.decode().rsplit(' ', 1)[-1].strip('"'))
for folder in config["folders"]:
    folder = "INBOX/"+folder
    if folder not in remote_folders:
        status, _ = conn.create(folder)
        if status != "OK":
            raise RuntimeError(f"Failed to create remote folder {folder}")

# read inbox messages
conn.select("INBOX")
status, data = conn.search(None, "ALL")
if status != "OK":
    raise RuntimeError("Failed to read inbox contents")
message_ids = list()
if data[0]:
    message_ids.extend(data[0].split())

# the folder list is built outside the f-string below, so the script
# also runs on Python versions older than 3.12
folder_list = "".join(f"{k}: {v}\n" for k, v in config["folders"].items())

# loop through inbox messages
for message_id in message_ids:
    status, data = conn.fetch(message_id, "(RFC822)")
    if status != "OK":
        raise RuntimeError(f"Failed to read mail id {message_id.decode()}")
    message = email.message_from_bytes(data[0][1])
    # use the first text/plain part as the body
    body = ""
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload(decode=True).decode("utf-8", errors="replace")
            break

    # prompt ollama api for email classification
    prompt = f"""You are an email classifier. Given the email details below, decide which folder it belongs in.
Available folders:
{folder_list}
Reply with ONLY the exact folder name from the list above (e.g. "Work"). No explanation, no punctuation, just the folder name.
----- BEGINNING OF EMAIL DETAILS -----
From: {message['From']}
Subject: {message['Subject']}
Body snippet:
{body[:500]}
"""
    req = urllib.request.Request(
        f"{config['ollama']['host']}/api/generate",
        headers={"Content-Type": "application/json"},
        data=json.dumps({
            "model": config["ollama"]["model"],
            "prompt": prompt,
            "stream": False,
        }).encode("UTF-8")
    )
    with urllib.request.urlopen(req, timeout=60) as response:
        target_folder = json.loads(response.read().decode("UTF-8")).get("response", "").strip()
    if target_folder not in config["folders"]:
        raise RuntimeError(f"Received invalid target folder {target_folder}")

    # move message to target folder
    # fetching marked the message as read, so clear the flag before copying
    conn.store(message_id, "-FLAGS", "\\Seen")
    result, _ = conn.copy(message_id, "INBOX/"+target_folder)
    if result != "OK":
        raise RuntimeError("Failed to move message to target folder")
    conn.store(message_id, "+FLAGS", "\\Deleted")

# cleanup
conn.expunge()
conn.logout()

The script simply reads the config JSON file passed as the first argument, connects to the IMAP account, creates any missing folders and looks for messages in the inbox. For each one found, it passes a limited snippet of the content along with the sender and subject to the local ollama model for categorization. To prevent data loss on error, it first copies messages to the new sorting location before deleting the old copy.
It is designed to error in place but leave the IMAP contents in a state that is safe to re-run the application against without any cleanup.
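The "limited snippet" step can also be looked at on its own. The following sketch is a variation on what the script does, not a copy of it: `extract_snippet` is a hypothetical helper, and unlike the script it decodes the body with the charset declared by the message part, falling back to UTF-8 when none is declared or the label is unknown.

```python
import email


def extract_snippet(raw_message, limit=500):
    """Return the first `limit` characters of the first text/plain part, or "" if there is none."""
    message = email.message_from_bytes(raw_message)
    for part in message.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # unknown charset label: fall back to utf-8
            text = payload.decode("utf-8", errors="replace")
        return text[:limit]
    return ""
```

Messages that only contain an HTML part yield an empty snippet here, just as they do in the script, so the model then has to decide from the sender and subject alone.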
Sorting emails manually
The script is designed to work with JSON files of this format:
config.json
{
    "imap": {
        "host": "my.mailserver.tld",
        "port": "993",
        "user": "my@mail.tld",
        "pass": "my_password",
        "use_ssl": true
    },
    "ollama": {
        "host": "http://localhost:11434",
        "model": "phi4"
    },
    "folders": {
        "Work": "Work-related emails, professional correspondence, business matters, project updates, meetings, deadlines",
        "Newsletters": "Newsletters, mailing lists, digests, subscriptions, blog updates, announcements from services",
        "Finance": "Bank statements, invoices, receipts, payment confirmations, billing, tax documents, financial alerts",
        "Social": "Social media notifications, friend requests, comments, likes, messages from social platforms",
        "Shopping": "Order confirmations, shipping notifications, delivery updates, product recommendations, store promotions",
        "Travel": "Flight bookings, hotel reservations, travel itineraries, transportation confirmations",
        "Personal": "Personal correspondence from friends and family, private matters",
        "Spam-Likely": "Suspicious emails, unsolicited offers, scams, phishing attempts, junk",
        "Misc": "Emails that don't fit into any other folder"
    }
}

Replace the values under imap with your inbox credentials.
Since message contents and topics vary widely between users and small local models are limited, you will have to test multiple models to find what works best for you.
The default model in the sample file above is phi4 for its multilingual comprehension capabilities, but you can change it for a different one, for example mistral:7b if you are memory constrained or gemma4:e4b for a more modern alternative.
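When switching models, it can help to confirm that the configured one has actually been pulled before starting a run. The sketch below assumes ollama's `/api/tags` endpoint, which lists locally available models under a `models` array with a `name` field per entry. The helper names are hypothetical and not part of the script.

```python
import json
import urllib.request


def installed_models(tags_response):
    """Extract model names from a decoded /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]


def has_model(wanted, names):
    """True if `wanted` is installed, treating a missing tag as ':latest'."""
    if ":" not in wanted:
        wanted += ":latest"
    return wanted in names


def fetch_tags(host):
    """Query a running ollama instance for its local models (needs ollama to be up)."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With ollama running, `has_model(config["ollama"]["model"], installed_models(fetch_tags(config["ollama"]["host"])))` would tell you whether the configured model still needs to be pulled.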
Lastly, the folders section contains a map of all folders and their descriptions. Simply adding a folder name will create it in the IMAP account if missing, and the description tells the model what messages go into it.
Be careful to add a final folder for miscellaneous or unsortable messages, or the model may get stuck on a message it cannot place anywhere.
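The script stops with an error when the model replies with a name that is not in the folder list. If you would rather have such messages land in the catch-all folder, one possible variation is sketched below. `resolve_folder` is a hypothetical helper, not something the script contains, and the `"Misc"` default is taken from the sample config.

```python
def resolve_folder(model_reply, folders, fallback="Misc"):
    """Map a raw model reply to a configured folder name, using `fallback` if nothing matches."""
    # drop surrounding whitespace, quotes and trailing periods
    reply = model_reply.strip().strip("\"'.")
    if reply in folders:
        return reply
    # tolerate differences in letter case, e.g. "finance" instead of "Finance"
    by_lower = {name.lower(): name for name in folders}
    if reply.lower() in by_lower:
        return by_lower[reply.lower()]
    if fallback in folders:
        return fallback
    raise ValueError(f"Invalid folder {model_reply!r} and no fallback folder configured")
```

The trade-off is that a misbehaving model then fills the fallback folder silently instead of stopping the run, so this suits unattended use better than first-time testing of a new model.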
Running the script is incredibly simple:
python3 ./imap_sort.py config.json

Assuming you have named your script imap_sort.py and the config file config.json, the process will sort messages in the inbox one by one and stop when finished. You won't see much output, which is intended: no output means it is still working.
Sorting may take a few minutes depending on your hardware and inbox contents. If you have multiple inboxes to sort, you can create different config JSON files for each of them.
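With several config files, a small wrapper can run the script once per inbox. This is only a sketch: `sort_all` is a hypothetical helper, it assumes all config files sit in one directory with a `.json` extension, and the `runner` parameter exists purely so the loop can be exercised without a mail server.

```python
import subprocess
import sys
from pathlib import Path


def sort_all(script, config_dir, runner=subprocess.run):
    """Run the sorting script once per *.json file in config_dir; return the configs that failed."""
    failed = []
    for config in sorted(Path(config_dir).glob("*.json")):
        result = runner([sys.executable, str(script), str(config)])
        if result.returncode != 0:
            # keep going so one broken inbox does not block the others
            failed.append(config.name)
    return failed
```

Calling `sort_all("imap_sort.py", "/path/to/configs")` would process the inboxes one after another and return the names of the config files whose run exited with an error.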
Automatic background sorting
Since the script stops on errors and leaves the mailbox in a state that is safe to re-run against, you can run it as a scheduled background job.
To add it as a cron job, run
crontab -e

Then append the line:
0 5 * * * python3 /path/to/imap_sort.py /path/to/config.json

Replace the paths for the Python script and config file with the real locations. Now the script will automatically sort your inbox at 5am every day, with no further intervention needed.