Splitting Large JSON Files: Methods and Best Practices
You've got a 500MB JSON file and need to upload it somewhere with a 10MB limit. Or you're processing records and your system chokes on anything over 10,000 items at once. Time to split that file up.
There are a few ways to approach this depending on your constraints. Let's walk through them.
Why Split JSON Files?
Common reasons you'd want to break up a JSON file:
- API limits. Many services cap request sizes at 5MB, 10MB, or some arbitrary number.
- Email attachments. Most email servers reject files over 20-25MB.
- Memory constraints. Loading a huge file into memory crashes your script.
- Parallel processing. Split the work across multiple workers or threads.
- Version control. Git doesn't handle huge files well. Smaller chunks diff better.
- Database imports. Batch operations work better with manageable chunk sizes.
Splitting Strategies
There are two main approaches:
1. Split by file size
Each resulting file stays under a certain KB or MB limit. Good when you have hard size constraints (API limits, upload restrictions).
2. Split by item count
Each file contains exactly N items. Good for batch processing where you want consistent chunk sizes for load balancing.
Which should you use? Depends on the constraint:
- Upload limit of 10MB? Split by size.
- Process 1000 records at a time? Split by count.
- Not sure? Split by count; it's simpler.
Split by File Size
This is trickier than it sounds. JSON has overhead (brackets, commas, keys), so you can't just divide bytes evenly. You need to serialize each chunk and check its size.
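You can see that overhead directly by measuring a chunk's serialized byte size; a minimal Node sketch (the sample users here are made up):

```javascript
// Measure the serialized size of a chunk in bytes.
// Buffer.byteLength counts bytes, not characters, which is what
// upload limits are specified in (multi-byte UTF-8 matters here).
const chunk = [
  { name: "Ada", email: "ada@example.com" },
  { name: "Grace", email: "grace@example.com" },
];

const itemBytes = chunk.reduce(
  (sum, item) => sum + Buffer.byteLength(JSON.stringify(item)),
  0
);
const chunkBytes = Buffer.byteLength(JSON.stringify(chunk));

// chunkBytes > itemBytes: the array brackets and separating commas add overhead
console.log({ itemBytes, chunkBytes });
```

This is why the size check has to happen on the serialized chunk, not on the raw items.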
The algorithm
- Start with an empty chunk
- Add items one by one
- After each addition, check the serialized size
- If over the limit, start a new chunk
- Repeat until done
Example output
If you have an array of 1000 users and a 100KB limit, you might get:
- users_part_1.json (98KB, 150 users)
- users_part_2.json (97KB, 148 users)
- users_part_3.json (95KB, 145 users)
- ...and so on
Notice the item counts vary. That's because some users have more data than others (longer names, more fields filled in).
Split by Item Count
Simpler approach: just take N items at a time. File sizes will vary, but each chunk has the same number of records.
Example output
With 1000 users and a 100 item limit:
- users_part_1.json (100 users)
- users_part_2.json (100 users)
- ...
- users_part_10.json (100 users)
For objects vs arrays
If your JSON is an array, split by index. If it's an object with many keys, split by key count. Same principle.
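The keyed case can be sketched with Object.entries; splitObjectByKeyCount is a hypothetical helper name for illustration, not a library function:

```javascript
// Split an object into smaller objects, each with at most `chunkSize` keys.
// Key order follows Object.entries (insertion order for string keys).
function splitObjectByKeyCount(obj, chunkSize) {
  const entries = Object.entries(obj);
  const chunks = [];
  for (let i = 0; i < entries.length; i += chunkSize) {
    chunks.push(Object.fromEntries(entries.slice(i, i + chunkSize)));
  }
  return chunks;
}

// { a: 1, b: 2, c: 3 } with chunkSize 2 → [{ a: 1, b: 2 }, { c: 3 }]
```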
Quick Tool Solution
Don't want to write code? Our JSON splitter tool handles both strategies. Paste your JSON, choose your method (size or count), set the limit, and download the parts.
Each resulting chunk gets a download button with its size displayed. Everything happens in your browser, so large files stay on your machine.
Code Examples
JavaScript: split by count
```javascript
const fs = require("node:fs");

function splitArrayByCount(array, chunkSize) {
  const chunks = [];
  for (let i = 0; i < array.length; i += chunkSize) {
    chunks.push(array.slice(i, i + chunkSize));
  }
  return chunks;
}

// Usage
const users = [...]; // Your big array
const chunks = splitArrayByCount(users, 100);

// Save each chunk
chunks.forEach((chunk, index) => {
  const filename = `users_part_${index + 1}.json`;
  fs.writeFileSync(filename, JSON.stringify(chunk, null, 2));
});
```
JavaScript: split by size
```javascript
function splitArrayBySize(array, maxSizeKB) {
  const maxBytes = maxSizeKB * 1024;
  const chunks = [];
  let currentChunk = [];

  for (const item of array) {
    currentChunk.push(item);

    // Check the serialized size of the current chunk
    const size = Buffer.byteLength(JSON.stringify(currentChunk));

    if (size > maxBytes) {
      // Over the limit: remove the last item and start a new chunk.
      // (A single item larger than maxBytes still ends up alone in an
      // over-limit chunk of its own; there's no way to shrink it.)
      currentChunk.pop();
      if (currentChunk.length > 0) {
        chunks.push(currentChunk);
      }
      currentChunk = [item];
    }
  }

  // Don't forget the last chunk
  if (currentChunk.length > 0) {
    chunks.push(currentChunk);
  }

  return chunks;
}

// Usage, with the same `users` array as above. Write compact JSON here:
// pretty-printing would inflate the files past the size just checked.
const sizeChunks = splitArrayBySize(users, 100);
sizeChunks.forEach((chunk, index) => {
  fs.writeFileSync(`users_part_${index + 1}.json`, JSON.stringify(chunk));
});
```
Python: split by count
```python
import json

def split_by_count(data, chunk_size):
    """Split a list into chunks of specified size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

# Usage
with open('large_file.json') as f:
    users = json.load(f)

chunks = split_by_count(users, 100)
for i, chunk in enumerate(chunks):
    with open(f'users_part_{i + 1}.json', 'w') as f:
        json.dump(chunk, f, indent=2)
```
Command line with jq
If you have jq installed:
```bash
# Emit one array element per line, then split into files of 100 lines each
jq -c '.[]' large.json | split -l 100 - chunk_
# Creates chunk_aa, chunk_ab, etc.

# Wrap each back into an array
for f in chunk_*; do
  jq -s '.' "$f" > "${f}.json"
done
```
Putting It Back Together
Splitting is only useful if you can reassemble later. Here's how to merge the chunks back:
JavaScript
```javascript
const merged = chunks.flat();
// or, equivalently:
// const merged = [].concat(...chunks);
```
Python
```python
merged = []
for chunk in chunks:
    merged.extend(chunk)
```
For more on merging, see our guide on merging JSON objects.