Bulk mailbox cleanup with Microsoft Graph: CSV-driven, dry-run first, throttle-aware.
Legal asks you to remove every message from a handful of senders, across two hundred mailboxes, by Friday. The classic answer is Compliance Search plus PurgeRequest. It works, but it cannot tell you exactly what got deleted, cannot be re-run safely, and the older Security and Compliance PowerShell modules it relies on are being retired. Here is the Graph-based approach I use instead, and why it ships with a dry run, retry logic for throttling, and a built-in safety valve for noisy mailboxes.
why this matters
Compliance Search plus New-ComplianceSearchAction is the official way
to bulk-delete email across mailboxes. It works. It also has limits that bite when
an auditor is involved: terse summaries, a cap on items per action, no clear
per-message audit trail, and partial failures that are hard to investigate. Microsoft
is also retiring the old Security and Compliance PowerShell modules, so existing
scripts that rely on them are increasingly broken or gated.
For a one-off discovery export, Compliance Search is fine. For a repeatable cleanup that an auditor needs to verify line by line, you want a real report and a process you can re-run.
The script in graph-exchange-mailbox-cleanup does exactly that. Two CSVs go in: mailboxes and senders. A timestamped report comes out, one row per matched message, with the action taken and the result. It runs against Graph using app-only auth, and it assumes throttling will happen because in any tenant of consequence, it will.
before you touch anything
- An app registration with application permissions. In Entra, create an app registration and grant it the Mail.ReadWrite Graph permission as an application permission (not delegated). Application permissions act as the app itself rather than a signed-in user, which is what you need for unattended bulk operations. Click "Grant admin consent" after adding the permission.
- Scope the consent. Application permissions reach every mailbox in the tenant by default. Attach an Application Access Policy if you can, so the app can only touch the mailboxes you actually want it to. This is a separate setup step in Exchange Online PowerShell, and it limits the blast radius if anything goes wrong.
- Clean inputs. One CSV with a UPN column (the mailboxes to clean), one with a Sender column (the senders to remove). The script trims whitespace, lower-cases entries, and removes duplicates, but it will not guess what you meant. Distribution list addresses will not match, because Graph resolves from: against the actual sender header on each message. Use the real sending mailboxes in the senders list.
- Always start with a dry run. Same parameters, just set -DeleteItems N. The report tells you exactly which messages would have been touched, with no actual deletes.
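For the app-only auth in the first bullet, here is a minimal sketch of the OAuth 2.0 client-credentials token request that an application permission relies on. The function name New-AppTokenRequest is hypothetical and not taken from the script; the endpoint and form fields are the standard Microsoft identity platform ones. It only builds the request, so you can inspect it before anything goes over the wire.

```powershell
# Hypothetical helper (not part of the script): builds the client-credentials
# token request for app-only access to Microsoft Graph.
function New-AppTokenRequest {
    param(
        [string] $TenantId,
        [string] $AppId,
        [string] $AppSecret
    )
    [pscustomobject]@{
        Uri  = "https://login.microsoftonline.com/$TenantId/oauth2/v2.0/token"
        Body = @{
            grant_type    = 'client_credentials'
            client_id     = $AppId
            client_secret = $AppSecret
            # .default asks for whatever application permissions were consented
            scope         = 'https://graph.microsoft.com/.default'
        }
    }
}

# Usage (needs network access and a real app registration):
# $req   = New-AppTokenRequest -TenantId $tenant -AppId $app -AppSecret $secret
# $token = (Invoke-RestMethod -Method Post -Uri $req.Uri -Body $req.Body).access_token
```

Passing the hashtable as -Body on a POST makes Invoke-RestMethod send it form-encoded, which is what the token endpoint expects.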
step 1 · stage the inputs
Two flat CSVs, each with a single column and a single header row. The shape is intentionally boring so you can hand it to a non-engineer and get it back uncorrupted.
# Mailboxes.csv
UPN
itops@northshore.example
hr@northshore.example
shared-admin@northshore.example
# Senders.csv
Sender
ceo@northshore.example
cfo@northshore.example
executive.office@northshore.example
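The trim, lower-case, and de-duplicate handling of these inputs can be sketched roughly like this. ConvertTo-NormalizedList is a hypothetical name, and the real script may well do it differently; this only illustrates the behaviour described above.

```powershell
# Hypothetical helper (not part of the script): trims, lower-cases, and
# de-duplicates a list of addresses, dropping blank entries.
function ConvertTo-NormalizedList {
    param([string[]] $Values)
    $Values |
        Where-Object { -not [string]::IsNullOrWhiteSpace($_) } |
        ForEach-Object { $_.Trim().ToLowerInvariant() } |
        Select-Object -Unique
}

# Usage against the files above:
# $mailboxes = @(ConvertTo-NormalizedList -Values (Import-Csv .\Mailboxes.csv).UPN)
# $senders   = @(ConvertTo-NormalizedList -Values (Import-Csv .\Senders.csv).Sender)
```

Lower-casing before Select-Object -Unique matters, because that cmdlet compares strings case-sensitively.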
step 2 · dry run, every time
First pass is read-only. The script enumerates each mailbox, runs a Graph
$search="from:<sender>" for every entry in the senders CSV,
de-duplicates by message id, and writes a row per match into the timestamped report.
Nothing is deleted.
.\Cleanup-MailboxItemsBySender-Graph.ps1 `
-TenantId "00000000-0000-0000-0000-000000000000" `
-AppId "00000000-0000-0000-0000-000000000000" `
-AppSecret "<secret>" `
-MailboxesCsv .\Mailboxes.csv `
-SendersCsv .\Senders.csv `
-DeleteItems N `
-MaxPasses 1
Total mailboxes loaded: 214
Total senders loaded : 6
DeleteItems : N
PASS 1 complete. Total matched this pass: 11,847
Open the report in Excel, sort by mailbox, and sanity-check the subjects against what
legal actually asked for. If a sender on the list is generating false positives
(most often a shared mailbox or a generic noreply@), pull it out of the
senders CSV and re-run. Cheaper to fix the input than the explanation later.
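For reference, the dry-run search described in this step boils down to two small pieces: building the $search URL for one mailbox and one sender, and de-duplicating results by message id. A minimal sketch, assuming the v1.0 endpoint and a page size of 250; both function names are hypothetical and not taken from the script.

```powershell
# Hypothetical helpers (not part of the script).

# Builds the Graph request URL for one mailbox and one sender.
# The KQL expression must be wrapped in double quotes, then URL-encoded.
function New-SenderSearchUri {
    param([string] $Mailbox, [string] $Sender, [int] $PageSize = 250)
    $kql = [uri]::EscapeDataString(('"from:{0}"' -f $Sender))
    $template = 'https://graph.microsoft.com/v1.0/users/{0}/messages?$search={1}&$top={2}'
    $template -f $Mailbox, $kql, $PageSize
}

# Emits each message once, keyed on its id (ids are case-sensitive).
function Select-UniqueMessage {
    param([object[]] $Messages)
    $seen = New-Object 'System.Collections.Generic.HashSet[string]'
    foreach ($m in $Messages) {
        if ($seen.Add([string]$m.id)) { $m }
    }
}
```

De-duplicating matters once the senders list overlaps, for example when the same message would otherwise be reported twice across searches.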
step 3 · soft delete vs permanent delete
Two delete paths, and the difference matters.
- DELETE /messages/{id}: the soft delete. Item moves to Deleted Items, then to Recoverable Items, and the user can pull it back from "Recover deleted items" until retention runs. This is the right default if there's any chance you've got a false positive.
- POST /messages/{id}/permanentDelete: the hard delete. The item skips the user-recoverable path, so nobody can pull it back from Outlook. Use it when legal has explicitly asked for an unrecoverable purge and you're confident in the matches. Once it runs, the item is gone unless a hold on the mailbox keeps a copy in Purges.
The script flips between the two with a single switch. It does not let you toggle mid-run.
.\Cleanup-MailboxItemsBySender-Graph.ps1 `
-TenantId "..." -AppId "..." -AppSecret "..." `
-MailboxesCsv .\Mailboxes.csv `
-SendersCsv .\Senders.csv `
-DeleteItems Y `
-UsePermanentDelete Y `
-MaxPasses 10 `
-WaitSecondsBetweenPasses 300
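To make the two paths concrete, here is a small sketch that picks the HTTP method and URL for one message. Get-DeleteRequest and its -Permanent switch are hypothetical (the script itself takes -UsePermanentDelete Y), and the v1.0 base URL is an assumption.

```powershell
# Hypothetical helper (not part of the script): returns the method and URL
# for either the soft delete or the permanent delete of one message.
function Get-DeleteRequest {
    param(
        [string] $Mailbox,
        [string] $MessageId,
        [switch] $Permanent
    )
    $base = "https://graph.microsoft.com/v1.0/users/$Mailbox/messages/$MessageId"
    if ($Permanent) {
        [pscustomobject]@{ Method = 'POST'; Uri = "$base/permanentDelete" }
    } else {
        [pscustomobject]@{ Method = 'DELETE'; Uri = $base }
    }
}

# Usage (needs a valid bearer token in $token):
# $req = Get-DeleteRequest -Mailbox $upn -MessageId $id -Permanent
# Invoke-RestMethod -Method $req.Method -Uri $req.Uri -Headers @{ Authorization = "Bearer $token" }
```

Deciding the path once, up front, is what keeps a run from mixing soft and permanent deletes.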
step 4 · why multi-pass exists
Graph's $search is fast but eventually consistent. After a delete pass,
a re-run against the same mailbox can still surface a handful of items that the index
had not caught up on yet. Compounded across two hundred mailboxes, "almost zero" is
not "zero." Setting -MaxPasses 10 and -WaitSecondsBetweenPasses 300 tells the script
to keep going until a pass returns zero matches or the pass limit is reached, with a
five-minute cool-off between runs. Each pass appends to the same report, so you get a complete trail.
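The loop behind those two parameters can be sketched as follows. Invoke-CleanupPasses and its -RunPass script block are hypothetical stand-ins for whatever the script does internally; the parameter names and defaults mirror the command line above.

```powershell
# Hypothetical sketch (not the script's own code): repeat a cleanup pass
# until one pass matches nothing, or until MaxPasses is reached.
function Invoke-CleanupPasses {
    param(
        [scriptblock] $RunPass,                 # receives the pass number, returns the match count
        [int] $MaxPasses = 10,
        [int] $WaitSecondsBetweenPasses = 300
    )
    for ($pass = 1; $pass -le $MaxPasses; $pass++) {
        $matched = [int](& $RunPass $pass)
        Write-Host "PASS $pass complete. Total matched this pass: $matched"
        if ($matched -eq 0) {
            return [pscustomobject]@{ PassesRun = $pass; ReachedZero = $true }
        }
        # Give the search index time to catch up, but do not wait after the last pass
        if ($pass -lt $MaxPasses) { Start-Sleep -Seconds $WaitSecondsBetweenPasses }
    }
    [pscustomobject]@{ PassesRun = $MaxPasses; ReachedZero = $false }
}
```

Returning ReachedZero separately from PassesRun makes it obvious in the caller whether the run ended clean or simply ran out of passes.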
step 5 · the per-mailbox transient ceiling
What is throttling? When you call Graph too often, or hit one mailbox too hard, Microsoft pushes back. The Graph response comes back as HTTP 429 (Too Many Requests) or a 5xx server error. Normally the right move is to back off and retry a few seconds later.
The trap: a single noisy mailbox, usually one that has been a busy distribution
target for years, can sit on 429s for the entire run window and starve the rest of
the fleet. The script counts these transient errors (429, 500, 502, 503, 504) per
mailbox. When that count exceeds -MaxTransientErrorsPerMailbox (default 40),
the script gives up on that mailbox, marks it skipped, and moves on. The run keeps
going for everyone else.
Transient error 429. Retry 3/8 in 6s
Transient error 503. Retry 4/8 in 8s
Transient error 429. Retry 5/8 in 10s
ERROR in mailbox archive-2014@northshore.example:
MailboxTransientThresholdExceeded transientCount=41 lastCode=429
===== Processing mailbox: hr@northshore.example =====
The skipped mailbox lands in the report as a MailboxSkipped row, so
nothing gets silently lost.
Fix for skipped mailboxes: re-run the script with just those mailboxes in the input CSV, ideally outside business hours when Graph throttling is less aggressive. They almost always complete on the second pass. This skip-and-move-on behaviour is the single most useful thing in the script: without it, one bad mailbox stalls a six-hour run.
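Here is a rough sketch of the retry-plus-ceiling idea, under a few stated assumptions: the -Request script block and the $MailboxState hashtable are hypothetical, the two-seconds-per-attempt delay is inferred from the sample log above rather than read from the script, and a production version would probably honour a Retry-After header when Graph sends one.

```powershell
# Hypothetical sketch (not the script's own code): retry transient errors,
# but give up on the mailbox once its running transient count passes the ceiling.
function Invoke-WithTransientRetry {
    param(
        [scriptblock] $Request,          # returns an object with a StatusCode property
        [hashtable]   $MailboxState,     # carries TransientCount across calls for one mailbox
        [int] $MaxRetries = 8,
        [int] $MaxTransientErrorsPerMailbox = 40,
        [int] $DelayStepSeconds = 2      # assumption: linear delay, 2s per attempt
    )
    $transientCodes = 429, 500, 502, 503, 504
    for ($attempt = 1; $attempt -le $MaxRetries; $attempt++) {
        $response = & $Request
        if ($transientCodes -notcontains $response.StatusCode) { return $response }

        $MailboxState.TransientCount++
        if ($MailboxState.TransientCount -gt $MaxTransientErrorsPerMailbox) {
            throw "MailboxTransientThresholdExceeded transientCount=$($MailboxState.TransientCount) lastCode=$($response.StatusCode)"
        }
        $delay = $DelayStepSeconds * $attempt
        Write-Host "Transient error $($response.StatusCode). Retry $attempt/$MaxRetries in ${delay}s"
        Start-Sleep -Seconds $delay
    }
    throw "Retries exhausted after $MaxRetries attempts"
}

# The caller keeps one state per mailbox and turns the exception into a report row:
# $state = @{ TransientCount = 0 }
# try { ... } catch { <# write a MailboxSkipped row, move to the next mailbox #> }
```

The counter lives outside the function on purpose: the ceiling is per mailbox, not per request, so it has to survive across calls.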
The valuable part of the run is not the DELETE call. It's the audit trail.
common gotchas
- HTTP 403 on the very first mailbox. You wired up the app registration with delegated permissions instead of application. Delegated permissions act as the signed-in user, who almost certainly does not own all the target mailboxes. Fix: change the permission type to Application in the app registration, re-grant admin consent, and try again.
- Application Access Policy seems to do nothing. A new New-ApplicationAccessPolicy can take up to an hour to propagate. Fix: test with Test-ApplicationAccessPolicy before assuming the scope is enforced. If Test confirms the policy applies, give it the time it needs.
- The 250-result page limit on $search. Graph caps a single search response at 250 items, even if you ask for 1000. Fix: the script already paginates correctly using @odata.nextLink. Do not try to "fix" the page size up to 1000 yourself; Graph silently clamps it and you just rate-limit yourself faster.
- Distribution lists in the senders CSV. They will not match anything. The Graph from: filter resolves to the actual sender header on each message, which is the user mailbox, not the DL. Fix: put the real sending mailboxes in the senders list, not the distribution list addresses.
- Recoverable Items quota fills up. A soft-delete-heavy run on a small mailbox can fill the recoverable items dumpster. Fix: process those mailboxes with permanent delete instead, or wait for the dumpster retention to roll items out before re-running.
- Items still appear after delete on hold mailboxes. If a mailbox is on Litigation Hold or In-Place Hold, deleted items stay in the Purges folder regardless of which delete option you used. This is by design for compliance, not a fix you can apply: the items are not user-visible and they will not return to the inbox. They cannot be removed without an eDiscovery action by a compliance officer.
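On the pagination gotcha: following @odata.nextLink is a short loop. A minimal sketch, where Get-AllPages and the -Fetch script block are hypothetical; in a real run -Fetch would wrap Invoke-RestMethod with your auth header.

```powershell
# Hypothetical sketch (not the script's own code): walk a paged Graph
# collection by following @odata.nextLink until it is absent.
function Get-AllPages {
    param(
        [string] $FirstUri,
        [scriptblock] $Fetch    # takes a URI, returns the parsed JSON page
    )
    $uri = $FirstUri
    while ($uri) {
        $page = & $Fetch $uri
        $page.value                       # emit this page's items to the pipeline
        $uri = $page.'@odata.nextLink'    # $null on the last page, which ends the loop
    }
}

# Usage (needs a valid bearer token in $headers):
# $all = @(Get-AllPages -FirstUri $searchUri -Fetch { param($u) Invoke-RestMethod -Uri $u -Headers $headers })
```

The next link is used exactly as Graph returns it, so there is no need to rebuild the query or track an offset yourself.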
when to skip this and use Compliance Search
If the request is "find every message containing this phrase tenant-wide and purge it," that's Compliance Search territory: content search syntax, hold-aware, designed for the eDiscovery use case. The script in this repo is for the narrower, more common ask: these specific senders, these specific mailboxes, give me the list of what got removed. When that's the brief, a CSV in and a CSV out is the right shape.