Splitting Large CSV and Excel Files: When and How to Do It Right
"Just send me the data" is how you end up with a 240 MB Excel file that Outlook refuses to attach, Numbers refuses to open, and your BI tool truncates silently. Splitting it is the fix — but only if you split it the right way for the next step in the pipeline.
Three legitimate reasons to split a file
- System limits. Email attachments, SharePoint, and many SaaS importers often cap uploads at around 25–100 MB. An Excel worksheet can't hold more than 1,048,576 rows, so anything longer gets cut off when the file is opened.
- Per-stakeholder distribution. One file per region, per account manager, per supplier — so each recipient sees only their own rows.
- Parallel processing. Splitting into equal chunks lets you fan out work across team members or workers without anyone duplicating effort.
The two splitting strategies
1. Split by column value
One output file per unique value in a chosen column ("Country", "Account Owner", "Supplier"). This is what you want for distribution — every recipient gets a single, complete file containing only their rows. Watch for:
- Null or blank values in the split column. Decide upfront whether they go to a single "_unassigned" file or get dropped.
- High-cardinality columns. Splitting by "Customer Email" on a 500k-row file where nearly every email is unique gives you close to 500k tiny files. That's almost never useful.
- Whitespace and case differences. "UK", "uk", and "UK " will become three files unless you normalise first.
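As a rough illustration of the three points above, here is a minimal Python sketch using only the standard library. The function name split_by_column, the "_unassigned" default, and the choice to normalise by trimming and upper-casing are assumptions made for this example, not a description of how any particular tool works. It groups rows under a normalised key but writes the original cell values unchanged.

```python
import csv
import io
from collections import defaultdict


def split_by_column(csv_text, column, unassigned="_unassigned"):
    """Return {normalised_key: csv_text}, one entry per value in `column`."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    idx = header.index(column)

    groups = defaultdict(list)
    for row in rows:
        if not row:
            continue  # skip completely blank lines
        value = row[idx] if idx < len(row) else ""
        # Normalise only the grouping key so "UK", "uk" and "UK " share a file.
        # Blank values fall back to the unassigned bucket (an assumed policy).
        key = value.strip().upper() or unassigned
        groups[key].append(row)

    outputs = {}
    for key, group_rows in groups.items():
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)  # every output keeps the header row
        writer.writerows(group_rows)
        outputs[key] = buf.getvalue()
    return outputs


sample = "Country,Amount\nUK,1\nuk ,2\nFR,3\n,4\n"
files = split_by_column(sample, "Country")
print(sorted(files))  # one key per normalised value, plus the unassigned bucket
```

Checking len(files) before writing anything to disk is a cheap way to catch a high-cardinality column early.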
2. Split into equal chunks
Either a fixed number of files or a fixed number of rows per file. Use this for ingest into systems with attachment limits or for parallel processing. Keep the header row in every chunk: most ingesters require it, and reattaching headers later is fiddly.
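A fixed-rows-per-file split that repeats the header in every chunk could be sketched as follows. This is a standard-library Python example under simple assumptions: the whole file fits in memory and the first row is the header. The names split_into_chunks and _render are made up for illustration.

```python
import csv
import io


def _render(header, rows):
    """Serialise a header plus rows back to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def split_into_chunks(csv_text, rows_per_file):
    """Return a list of CSV texts, each with the header and at most rows_per_file data rows."""
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    chunks, current = [], []
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        current.append(row)
        if len(current) == rows_per_file:
            chunks.append(_render(header, current))
            current = []
    if current:  # final, possibly shorter chunk
        chunks.append(_render(header, current))
    return chunks


sample = "id,v\n1,a\n2,b\n3,c\n4,d\n5,e\n"
for part in split_into_chunks(sample, 2):
    print(part)
```

For a fixed number of files instead, one option is to count the data rows first and derive rows_per_file by dividing and rounding up.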
Things that quietly break when you split
- Pivot tables and formulas that reference the full sheet. They don't survive — flatten to values before splitting.
- Multi-row records. Invoice exports sometimes spread one logical record across several rows. A row-count split will guillotine these mid-record.
- Sorted order. If you split by column value, the original sort order is not guaranteed to survive inside each output file; it depends on how the tool groups rows. Sort before, or sort each output.
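For the multi-row record problem, one possible approach is to place chunk boundaries only where the record changes. The sketch below assumes that all rows of a logical record are contiguous and share the same value in one column (for example a hypothetical invoice ID column); real exports may mark continuation rows differently, for instance with a blank ID, and would need a different key. The function name chunk_records is illustrative.

```python
from itertools import groupby


def chunk_records(rows, key_index, max_rows):
    """Pack contiguous records into chunks of up to max_rows rows
    without ever splitting a record across two chunks."""
    chunks, current = [], []
    # groupby only merges adjacent rows, which matches the contiguity assumption.
    for _, record in groupby(rows, key=lambda r: r[key_index]):
        record = list(record)
        # Start a new chunk if this record would push the current one over the limit.
        if current and len(current) + len(record) > max_rows:
            chunks.append(current)
            current = []
        current.extend(record)
    if current:
        chunks.append(current)
    return chunks


rows = [
    ["A", "line 1"], ["A", "line 2"],
    ["B", "line 1"],
    ["C", "line 1"], ["C", "line 2"], ["C", "line 3"],
]
for chunk in chunk_records(rows, key_index=0, max_rows=3):
    print([r[0] for r in chunk])
```

A record longer than max_rows ends up alone in an oversized chunk rather than being cut, which is usually the safer failure mode.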
Doing it in the browser
Our Excel Splitter (Splitter Buddy) handles both strategies. Select an .xlsx, .xls or .csv, preview the data, then split either by the unique values in any column (one file per value) or into a fixed number of files / fixed rows per file. The result downloads as a single ZIP, and the parsing runs entirely in your browser, so nothing is uploaded to a server. That last point matters when the data contains anything you wouldn't paste into a public form.
After the split
If you split by column value, you almost always want the reverse operation later — consolidating returns from each recipient back into one master file with their source preserved. That's exactly what the consolidation workflow covers, and pairing the two closes the loop on most distribute-collect-analyse cycles.
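To make the reverse step concrete, here is a minimal sketch of merging returned files into one master while recording where each row came from. It assumes every file shares exactly the same header, and the function name consolidate and the "source" column name are invented for this example; it is not the consolidation workflow referred to above.

```python
import csv
import io


def consolidate(files):
    """files maps a source name to CSV text; returns one CSV text
    with a leading 'source' column identifying the origin of each row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for name, text in files.items():
        rows = csv.reader(io.StringIO(text))
        file_header = next(rows)
        if header is None:
            header = file_header
            writer.writerow(["source"] + header)
        elif file_header != header:
            # Refuse to merge silently when a recipient changed the columns.
            raise ValueError(f"{name}: header differs from the first file")
        for row in rows:
            if row:  # skip completely blank lines
                writer.writerow([name] + row)
    return out.getvalue()


returned = {
    "UK.csv": "Country,Amount\nUK,1\n",
    "FR.csv": "Country,Amount\nFR,3\n",
}
print(consolidate(returned))
```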
