list-dedup
A list deduplicator removes duplicate items from a list while keeping one copy of each unique item. The simplest case is exact match — two identical lines collapse to one. Real-world dedupe is fuzzier: "Alice@gmail.com" and "alice@gmail.com" should usually count as duplicates, as should " apple " and "apple". The ZTools List Deduplicator tool runs in the browser, supports case-sensitive / insensitive comparison, whitespace trim, blank-line removal, and "keep first / keep last" when items match.
Use cases
Clean an email list before sending
A merged contact import has 5,000 rows but only 4,200 unique emails. Deduplicate first to avoid double-mailing the same person.
Remove duplicate log lines
The same warning fires 1,000 times. Deduping leaves one copy plus a count, which is useful for triage.
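As a rough sketch of the "one copy plus a count" idea, the Python below collapses repeated log lines and reports how often each occurred. The function name `dedupe_with_counts` and the sample log lines are made up for illustration; this is not the tool's own code.

```python
from collections import Counter

def dedupe_with_counts(lines):
    """Return (line, count) pairs, one per distinct line, in first-seen order."""
    # Counter is a dict subclass, so it keeps first-insertion order (Python 3.7+).
    counts = Counter(lines)
    return list(counts.items())

log = ["WARN disk low", "INFO started", "WARN disk low", "WARN disk low"]
for line, count in dedupe_with_counts(log):
    print(f"{count}x {line}")
```

Sorting the pairs by count, highest first, would surface the noisiest warnings for triage.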
Build a vocabulary from text
Tokenise the prose, dedupe, and sort to get a unique-word list for analysis or Anki cards.
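A minimal sketch of that tokenise, dedupe, sort pipeline, assuming a very simple tokeniser (lowercase ASCII letters with an optional apostrophe part). Real text analysis usually needs a more careful tokeniser, and the function name `vocabulary` is hypothetical.

```python
import re

def vocabulary(text):
    """Return the sorted list of distinct lowercase words in text."""
    # Simplistic tokeniser: runs of a-z, optionally followed by 'xyz (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return sorted(set(words))

print(vocabulary("The cat saw the other cat. Don't worry!"))
```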
Find ID conflicts in a migration
After deduping, compare the original count with the unique count. The difference is the number of duplicate IDs; investigate where they came from.
How it works
- Paste the list — One item per line. Tool counts input lines and shows the result count side by side.
- Choose comparison rules — Case-sensitive / case-insensitive, trim whitespace, ignore blank lines, and optionally treat the number "1" as the same as "01".
- Pick keep policy — Keep the first occurrence (default, preserves original order) or keep the last (a later occurrence replaces the earlier one).
- Copy result — Output deduplicated list plus a stats panel: {input: 5000, unique: 4200, duplicates removed: 800}.
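The steps above can be sketched in Python as follows. This is an illustration under assumptions, not the tool's actual implementation: the function name, the parameter names, and the exact shape of the stats dictionary are all invented here, and "keep last" is assumed to place each item where its final copy appeared.

```python
def dedupe(lines, case_sensitive=True, trim=False, ignore_blanks=False, keep="first"):
    """Return (unique_items, stats). All names here are illustrative."""
    kept = {}       # comparison key -> (position, item as it will be output)
    considered = 0  # lines that took part in the comparison
    for pos, raw in enumerate(lines):
        if ignore_blanks and raw.strip() == "":
            continue
        considered += 1
        item = raw.strip() if trim else raw
        key = item if case_sensitive else item.casefold()
        # "first": only store a key the first time; "last": overwrite every time.
        if key not in kept or keep == "last":
            kept[key] = (pos, item)
    # Order by position so "keep last" puts each item where its final copy was.
    unique = [item for _, item in sorted(kept.values(), key=lambda pair: pair[0])]
    stats = {
        "input": len(lines),
        "unique": len(unique),
        "duplicates_removed": considered - len(unique),
    }
    return unique, stats

unique, stats = dedupe(["a", "b", "a", "c", "b"])
print(unique, stats)
```

The dictionary lookup is what keeps the pass linear on average: each line is hashed once rather than compared against every earlier line.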
Examples
Input: ["apple","Apple","apple "]
Output: Default (case-sensitive, no trim): all three remain. Case-insensitive + trim: ["apple"] only.
Input: Duplicate count: ["a","b","a","c","b"]
Output: Unique: ["a","b","c"]. Removed: 2.
Input: Blank-line policy
Output: Input has 50 blank lines from copy-paste artifacts. Toggle "ignore blanks" — the result drops them all.
Frequently asked questions
Does it preserve order?
Yes. The first occurrence stays where it was and later duplicates are removed. Toggle "keep last" to keep the final occurrence instead.
Can I see WHICH lines were duplicates?
Yes — toggle "show duplicates" mode to output only the duplicates instead of the deduped list. Useful for finding sources of duplication.
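One way such a mode could work, shown as a hedged sketch: emit every line that repeats something already seen. Whether the tool lists each extra copy or each duplicated value only once is not stated here, so the behaviour below (every extra copy, in input order) is an assumption, as is the name `find_duplicates`.

```python
def find_duplicates(lines):
    """Return every line that repeats an earlier line, in input order."""
    seen = set()
    dupes = []
    for line in lines:
        if line in seen:
            dupes.append(line)  # a second or later copy
        else:
            seen.add(line)
    return dupes

print(find_duplicates(["a", "b", "a", "c", "b", "a"]))
```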
How does it compare numbers vs strings?
By default, items are compared as strings, so "1" ≠ 1 and "1" ≠ "01". Toggle "numeric coercion" to canonicalise numeric forms before comparing.
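A sketch of what numeric canonicalisation might look like, assuming only plain decimal forms such as "01" or "1.0" count as numeric. The tool's actual rules are not documented here; `numeric_key` and `dedupe_numeric` are hypothetical names.

```python
import re
from decimal import Decimal

# Assumption: optional sign, digits, optional fractional part. No exponents.
_NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?")

def numeric_key(item):
    """Map numeric-looking text to a Decimal key; leave other text as-is."""
    text = item.strip()
    if _NUMERIC.fullmatch(text):
        return ("num", Decimal(text))  # Decimal("01") == Decimal("1.0")
    return ("str", item)

def dedupe_numeric(lines):
    """Keep the first occurrence of each item, comparing numbers by value."""
    seen = set()
    unique = []
    for line in lines:
        key = numeric_key(line)
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique

print(dedupe_numeric(["1", "01", "1.0", "one", "1"]))
```

`Decimal` is used rather than `float` so that long IDs and exact decimal values are not merged by rounding.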
Will it handle Unicode normalisation?
Optional — toggle NFC normalisation if your inputs come from mixed sources (one system uses combining accents, another uses precomposed forms).
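To illustrate why normalisation matters, the sketch below builds "café" two ways, once with a precomposed é and once with a plain e plus a combining accent. The strings differ as raw text but match after NFC. It uses Python's standard `unicodedata.normalize`; the function name `dedupe_nfc` is illustrative.

```python
import unicodedata

def dedupe_nfc(lines):
    """Keep the first occurrence of each line, comparing NFC-normalised forms."""
    seen = set()
    unique = []
    for line in lines:
        key = unicodedata.normalize("NFC", line)
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique

precomposed = "caf\u00e9"   # é as a single code point
combining = "cafe\u0301"    # e followed by a combining acute accent

print(precomposed == combining)                  # raw strings differ
print(len(dedupe_nfc([precomposed, combining]))) # one item after NFC
```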
How big a list can it handle?
Set-based dedupe is O(n) on average. Million-line inputs work fine in modern browsers.
Does it work with JSON arrays?
Paste a JSON array; toggle JSON mode. Output is a deduped JSON array — same elements, no duplicates.
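A hedged sketch of JSON-array dedupe. It assumes elements are compared by their serialised form with object keys sorted, so `{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` count as equal while `1` and `"1"` stay distinct. The tool may define equality differently, and `dedupe_json_array` is a made-up name.

```python
import json

def dedupe_json_array(text):
    """Parse a JSON array and return it as JSON with duplicate elements removed."""
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    seen = set()
    unique = []
    for item in items:
        # Serialise to get a hashable key; sort_keys makes object key order irrelevant.
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return json.dumps(unique)

print(dedupe_json_array('[1, "1", 1, {"a": 1, "b": 2}, {"b": 2, "a": 1}]'))
```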
Tips
- Always trim whitespace — the most common source of "duplicates that don't match" is invisible spaces.
- Email lists: use case-insensitive comparison plus lowercase normalisation. Strictly, RFC 5321 makes only the domain case-insensitive and leaves the local part to the receiving server, but in practice most providers treat the local part as case-insensitive too.
- Combine with sort to produce a sorted unique list in one step.
- For "near-duplicates" (typos, partial matches), use a fuzzy-dedupe tool — exact dedup won't catch them.
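Several of these tips combine naturally. As a small sketch, the function below trims and lowercases each address, drops blanks, and returns a sorted unique list in one step. Lowercasing the whole address is a practical assumption rather than something every mail server guarantees, and the name `sorted_unique_emails` is illustrative.

```python
def sorted_unique_emails(addresses):
    """Trim, lowercase, drop blanks, dedupe, and sort a list of email addresses."""
    cleaned = (addr.strip().lower() for addr in addresses)
    return sorted({addr for addr in cleaned if addr})

emails = [" Alice@gmail.com", "alice@gmail.com ", "", "bob@example.com"]
print(sorted_unique_emails(emails))
```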
Try it now
The full list-dedup runs in your browser at https://ztools.zaions.com/list-dedup — no signup, no upload, no data leaves your device.
Last updated: 2026-05-06 · Author: Ahsan Mahmood