Sortable mailbox size reports, without the Excel cleanup ritual.

Day one of a tenant-to-tenant migration and someone needs to know which mailboxes are above 50 GB, which have an archive, and how the long tail looks. Get-MailboxStatistics will tell you, but it'll also hand you "12.43 GB (13,348,294,144 bytes)" as a string. Here's the small script I keep on hand to skip the Excel cleanup and get straight to a sortable CSV.

why this matters

Exchange Online does not return mailbox sizes as numbers. It returns TotalItemSize as a ByteQuantifiedSize, which is Microsoft's name for a pretty-printed string with the byte count tucked into parentheses. It reads beautifully in the console: "12.43 GB (13,348,294,144 bytes)". The trouble starts as soon as you export it.

Pipe it to Export-Csv and you get the whole "12.43 GB (13,348,294,144 bytes)" string in a single cell. Sort that column in Excel and 9 GB lands above 12 GB, because Excel sorts text alphabetically and "1" comes before "9". Every consultant has a Find and Replace recipe to clean it up, but most of those recipes silently break the moment one mailbox crosses a TB and the unit changes from GB to TB.

The script in exo-mailbox-size-report does the conversion in PowerShell where it belongs: pulls the bytes out of the formatted string, divides by 1 GB, rounds to two decimals, and writes a real numeric column you can sort, sum, and pivot on. It's small. That's the point.

before you run it

  • ExchangeOnlineManagement v3 module installed. The script uses Get-EXOMailbox and Get-EXOMailboxStatistics, the REST-backed cmdlets, not the legacy ones. They're noticeably faster and they're what Microsoft is keeping current.
  • An account that can read mailbox stats tenant-wide. View-Only Recipients is enough; you don't need anything write-capable for a size report.
  • The input CSV. One column, header UPN. Trim, lower-case, deduplicate before you hand it over: the script does the same thing internally but it's easier to fix bad input than read past it in the report.

the trick that matters

TotalItemSize renders as text, but it always carries the byte count in parentheses. Pull it out with a single regex and the rest is arithmetic.

// run this
function Get-BytesFromSizeString {
    param([string]$SizeString)
    if ($SizeString -match "\(([\d,]+)\s+bytes\)") {
        return [double]($matches[1] -replace ",", "")
    }
    return 0
}

That's the whole conversion. Once you have a real number of bytes, dividing by 1GB gives you a proper double, Round trims the noise, and Excel sorts it correctly the first time. No formula tricks, no Power Query, no "split on space and cast the first token."

running the report

// run this
.\Invoke-EXOMailboxSizeReport.ps1 `
    -InputCsv .\migrationmailboxes.csv `
    -CSVPath  .\MailboxSizeReport_20260505_1408.csv
// output
  Connecting to Exchange Online...
  Scanning mailboxes  10 / 412 : alex.lee@northshore.example
  Scanning mailboxes  20 / 412 : shared.finance@northshore.example
  ...
  Completed. Report exported to: .\MailboxSizeReport_20260505_1408.csv

For each UPN the script pulls primary stats, then, only if the mailbox actually has an archive, pulls archive stats with -Archive. Each mailbox is wrapped in its own try/catch so a single failed lookup doesn't kill the run; the error message lands on the row instead, with empty size columns. You get a clean report and a list of the things that need a closer look, in the same file.

archive detection, the right way

Most homemade size reports get archive detection wrong, and the symptom is always the same: the report shows zero archive GB across the entire fleet, even though plenty of mailboxes have archives.

What goes wrong: the script checks $mbx.ArchiveStatus and skips the archive size call when it comes back blank.

Why: by default, Get-EXOMailbox does not include archive properties in its response. The field comes back empty whether or not an archive actually exists.

Fix: ask explicitly for archive properties using -PropertySets Archive, then check ArchiveGuid rather than ArchiveStatus. ArchiveGuid is populated only when an archive really exists, so it is a reliable signal.

// run this
$mbx = Get-EXOMailbox -Identity $upn -PropertySets Archive
if ($mbx.ArchiveGuid -and $mbx.ArchiveGuid -ne [guid]::Empty) {
    $a = Get-EXOMailboxStatistics -Identity $upn -Archive
    # ... build report row using $a.TotalItemSize
}

ArchiveGuid is the source of truth: present and non-empty means the archive exists. Anything else (status, quota, displayname) is downstream of that.

what the output looks like

// output (sample report)
  UserPrincipalName              PrimaryTotalItemSize  PrimarySizeGB  ArchiveEnabled  ArchiveSizeGB
  ----------------               ---------------       ------         -----           ------
  alex.lee@northshore.example    48.21 GB              48.21          True            72.40
  shared.finance@northshore...   12.04 GB              12.04          False           0
  ops@northshore.example         3.88 GB               3.88           False           0
  ceo@northshore.example         84.62 GB              84.62          True            128.94

Two size columns per mailbox: a friendly text version for humans, and a numeric one for sorting and summing. Drop it in Excel, sort by PrimarySizeGB descending, and the conversation about migration batches writes itself: top 10 in their own slow batch, the long tail in everything-at-once batches, the archives flagged for the licensing review you didn't realise you needed.

One detail worth keeping. TotalItemSize includes Recoverable Items in some configurations and not others depending on retention. For migration sizing, what's actually moving over the wire, that usually doesn't matter. For licensing decisions ("do they need a 100 GB plan?"), it can. If the number is borderline, pull TotalDeletedItemSize too and look at the split.

common gotchas

  • Slow runs or timeouts on large tenants. You used the legacy Get-Mailbox cmdlet instead of Get-EXOMailbox. The legacy cmdlet works for small tenants but falls over once you loop across a few thousand mailboxes. Fix: use the Get-EXOMailbox and Get-EXOMailboxStatistics cmdlets from the ExchangeOnlineManagement v3 module. They are REST-based and far faster.
  • "User not found" errors mid-run. The input CSV contains soft-deleted mailboxes (mailboxes that have been removed but not fully purged). Fix: the script catches the error per row and keeps going, so the run completes regardless. To clean things up, filter soft-deleted mailboxes out of the input CSV before running.
  • Shared mailboxes. No special handling needed. Same UPN shape, works identically. Just include them in the input CSV.
  • Inactive mailboxes return nothing. Get-EXOMailbox only returns live mailboxes. Inactive ones (used for litigation-hold retention) need a separate query path. Fix: if you need inactive mailbox sizing for a hold review, run a parallel script with the -InactiveMailboxOnly flag. That is a different report.
  • Throttled by Exchange Online on huge tenants. EXO PowerShell will throttle on long-running loops. Fix: split the input CSV into chunks of a few thousand UPNs and run them sequentially. The script does not bake in retry logic because, for sizing runs, throttling is rarely the bottleneck.

when to use this and when to use the portal

For a one-off "how big is this one mailbox" question, the admin centre's mailbox usage tile or the Microsoft 365 Admin reports section will answer in three clicks. The script earns its keep when you need the data for something: a migration batch plan, a licensing analysis, a chargeback report, a quarterly review of the largest mailboxes, and you want it sortable, scriptable, and reproducible against the same input next month.

Sizing a tenant-to-tenant migration, or trying to make sense of which mailboxes are eating your storage budget? That's most of what I do. Book a 30-minute call and we can talk through the right approach for your environment.
← all posts // himanshu @ aroramsp