number-extractor
A number extractor scans text for numeric values (integers, decimals, percentages, currency amounts, scientific notation) and outputs a clean list, skipping numeric patterns that match dates, phone numbers, or version strings to avoid noise. The ZTools Number Extractor runs entirely in the browser, supports US ($1,234.56), European (1.234,56), and ISO formats, classifies each match (integer, decimal, percent, currency, scientific), and exports a sorted, deduplicated list with optional unit / context columns.
Use cases
Invoice / receipt total extraction
Paste an OCRed receipt and the extractor pulls all currency amounts, so you can quickly check that subtotal plus tax matches the expected total.
Scientific paper data extraction
Extract every numeric value (statistics, p-values, sample sizes) from a methods section into a CSV for meta-analysis.
Bulk number cleanup
Take a customer-feedback survey with numeric ratings buried in free text and extract the ratings to compute averages and distributions.
Product spec extraction
Paste a product description; extract dimensions, weights, capacities for tabular comparison.
How it works
- Paste text: Free-form text or pasted document. Numbers in any format.
- Apply numeric patterns: Integers, decimals, scientific (1e6), currency ($1,234.56), percent (45%), units (10 kg).
- Classify each match: Integer / decimal / percent / currency / scientific. Tag with unit if detected.
- Heuristic filter: Skip patterns that look like dates (5/1/24), phones (555-1234), or version strings (1.2.3) to reduce false positives.
- Export: Plain list, or CSV with classification + raw match + parsed value + context columns.
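The pattern-matching and classification steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the names `Match`, `NUMBER`, and `extract` are hypothetical, only US-style separators are handled, and the heuristic filter for dates, phones, and versions is left out.

```python
import re
from typing import NamedTuple


class Match(NamedTuple):
    raw: str      # text exactly as it appeared in the input
    value: float  # parsed numeric value
    kind: str     # currency / scientific / percent / decimal / integer


# One alternation, tried left to right so the most specific form wins.
# Assumption: US-style separators (comma for thousands, period for decimals).
NUMBER = re.compile(
    r"""
    (?P<currency>[$€£¥₹]\s?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?)
  | (?P<scientific>\d+(?:\.\d+)?[eE][+-]?\d+)
  | (?P<percent>\d+(?:\.\d+)?\s?%)
  | (?P<decimal>\d{1,3}(?:,\d{3})+\.\d+|\d+\.\d+)
  | (?P<integer>\d{1,3}(?:,\d{3})+|\d+)
    """,
    re.VERBOSE,
)


def extract(text: str) -> list[Match]:
    """Return every numeric match in order, tagged with its class."""
    results = []
    for m in NUMBER.finditer(text):
        raw = m.group(0)
        kind = m.lastgroup  # name of the alternative that matched
        # Drop symbols and thousands separators before parsing.
        value = float(re.sub(r"[^\d.eE+-]", "", raw))
        results.append(Match(raw, value, kind))
    return results


for match in extract("$1,234.56 invoice with 8.5% tax = $1,339.50"):
    print(match.kind, match.value)
# currency 1234.56
# percent 8.5
# currency 1339.5
```

Ordering the alternatives from most to least specific keeps "8.5%" from being split into a decimal plus a stray percent sign.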
Examples
Input: "$1,234.56 invoice with 8.5% tax = $1,339.50"
Output: 1234.56 (currency), 8.5 (percent), 1339.50 (currency).
Input: "Sample n=42, p<0.001, age 24.3 ± 3.1 years"
Output: 42, 0.001, 24.3, 3.1.
Input: "Released v1.2.3 in 2024 with 10000 users"
Output: Skips "1.2.3" (version) and "2024" (year). Keeps 10000.
Frequently asked questions
How does it skip dates and phones?
Heuristics: 4-digit standalone integers (year-like), patterns like "555-1234" and "(555) 123-4567", and explicit time formats are filtered out. The strictness is adjustable if the filter is too aggressive.
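One way to picture these heuristics is as a masking pass that blanks out noise before any numeric pattern runs. The sketch below is an assumption about how such a filter could work, not the tool's published rules: the `NOISE` list, `strip_noise`, and `surviving_numbers` are hypothetical names, and the year rule is narrowed to 1900-2099 for illustration.

```python
import re

# Assumed noise patterns, most specific first so the long phone form
# is masked before the short one can match inside it.
NOISE = [
    re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}"),       # (555) 123-4567
    re.compile(r"\b\d{3}-\d{4}\b"),               # 555-1234
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),   # 5/1/24
    re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b"),  # 14:30 or 14:30:05
    re.compile(r"\bv?\d+(?:\.\d+){2,}\b"),        # 1.2.3 or v1.2.3
    # Standalone year-like integer, not part of a longer number or amount.
    re.compile(r"(?<![\d.,$€£¥₹])(?:19|20)\d{2}(?![\d%]|[.,]\d)"),
]


def strip_noise(text: str) -> str:
    """Replace each noise match with spaces, preserving character offsets."""
    for pattern in NOISE:
        text = pattern.sub(lambda m: " " * len(m.group(0)), text)
    return text


def surviving_numbers(text: str) -> list[str]:
    """Plain digit clusters that remain once the noise is masked."""
    return re.findall(r"\d+(?:\.\d+)?", strip_noise(text))


print(surviving_numbers("Released v1.2.3 in 2024 with 10000 users"))
# ['10000']
```

Masking with spaces rather than deleting keeps offsets stable, so a context column could still point back into the original text.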
What about scientific notation?
Both "1.5e6" and "1.5×10^6" are extracted. Scientific-notation matching can be toggled off.
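As a rough illustration of how both spellings can be read as the same value, here is a small sketch. The `SCI` pattern and `parse_scientific` are hypothetical names, and accepting a plain "x" as the multiplication sign is an assumption that may be too permissive for some inputs.

```python
import re

# Mantissa followed by either an e-exponent or a "×10^n" exponent.
SCI = re.compile(
    r"(?P<mantissa>\d+(?:\.\d+)?)"
    r"(?:[eE](?P<e_exp>[+-]?\d+)"
    r"|\s?[×x]\s?10\^(?P<p_exp>[+-]?\d+))"
)


def parse_scientific(text: str) -> list[float]:
    """Return the value of every scientific-notation number in text."""
    values = []
    for m in SCI.finditer(text):
        exponent = int(m.group("e_exp") or m.group("p_exp"))
        # Rebuild a canonical "1.5e6" string and let float() parse it,
        # which avoids rounding drift from multiplying by a power of ten.
        values.append(float(f"{m.group('mantissa')}e{exponent}"))
    return values


print(parse_scientific("1.5e6 and 1.5×10^6"))
# [1500000.0, 1500000.0]
```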
Currency: what currencies are recognised?
$, €, £, ¥, ₹, plus 3-letter codes (USD, EUR). Currency-symbol detection populates the unit column.
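A sketch of how symbol and code detection might feed a unit column follows. It is illustrative only: `CURRENCY` and `currency_amounts` are hypothetical names, only the two codes named above are listed, and allowing the unit on either side of the amount is an assumption.

```python
import re

# US-style amount: optional thousands separators, optional decimals.
AMOUNT = r"(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?"
# A currency symbol, or one of the 3-letter codes as a whole word.
UNIT = r"[$€£¥₹]|\b(?:USD|EUR)\b"

CURRENCY = re.compile(
    rf"(?P<pre>{UNIT})\s?(?P<amount>{AMOUNT})"      # $1,234.56 or USD 10
    rf"|(?P<amount2>{AMOUNT})\s?(?P<post>{UNIT})"   # 99 EUR
)


def currency_amounts(text: str) -> list[tuple[float, str]]:
    """Return (value, unit) pairs; unit is the symbol or code as written."""
    rows = []
    for m in CURRENCY.finditer(text):
        unit = m.group("pre") or m.group("post")
        amount = m.group("amount") or m.group("amount2")
        rows.append((float(amount.replace(",", "")), unit))
    return rows


print(currency_amounts("Paid $1,234.56, then 99 EUR and USD 10"))
# [(1234.56, '$'), (99.0, 'EUR'), (10.0, 'USD')]
```

The unit is kept exactly as written instead of being mapped to a code, since a symbol such as "$" is shared by several currencies.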
How are thousands separators handled?
Auto-detected by locale. US: 1,234.56 → 1234.56. EU: 1.234,56 → 1234.56. Toggle locale to change.
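The locale rule amounts to swapping which character is the thousands separator and which is the decimal mark. A minimal sketch, with an explicit locale argument in place of auto-detection (the function name `parse_localized` and the locale keys "us" and "eu" are assumptions made for illustration):

```python
from decimal import Decimal


def parse_localized(raw: str, locale: str = "us") -> Decimal:
    """Parse a number written with 'us' (1,234.56) or 'eu' (1.234,56) rules."""
    if locale == "us":
        thousands, decimal_mark = ",", "."
    elif locale == "eu":
        thousands, decimal_mark = ".", ","
    else:
        raise ValueError(f"unknown locale: {locale!r}")
    # Remove grouping first, then normalise the decimal mark to a period.
    cleaned = raw.replace(thousands, "").replace(decimal_mark, ".")
    return Decimal(cleaned)


print(parse_localized("1,234.56", "us"))  # 1234.56
print(parse_localized("1.234,56", "eu"))  # 1234.56
print(parse_localized("1,234", "eu"))     # 1.234
```

The last call shows why the locale matters: the same string "1,234" is one thousand two hundred thirty-four under US rules and just over one under European rules.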
Is the input uploaded?
No. All processing happens client-side in the browser.
Why are some numbers missing?
Heuristic filters err on the side of avoiding false positives (e.g. dates). If you need every digit cluster regardless of context, switch to "permissive" mode.
Tips
- Pick the right locale before extracting: the wrong locale flips the meaning of comma and period.
- Use the context column (raw surrounding text) to verify high-stakes extractions.
- For financial data, validate sums against expected totals: an extraction that misses a single digit still produces a clean-looking but wrong total.
- Combine classification filters (only currency, only percent) for cleaner downstream analysis.
- Run permissive mode then manually filter when context-aware mode is too aggressive.
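The sum-validation tip can be automated once the amounts are extracted. The check below is a sketch under stated assumptions: `totals_match` is a hypothetical helper, tax is taken to be a simple percentage of the subtotal, and the one-cent tolerance is an arbitrary allowance for rounding.

```python
from decimal import Decimal


def totals_match(
    subtotal: Decimal,
    tax_rate_percent: Decimal,
    total: Decimal,
    tolerance: Decimal = Decimal("0.01"),
) -> bool:
    """True if subtotal plus tax agrees with total within the tolerance."""
    expected = subtotal * (1 + tax_rate_percent / 100)
    return abs(expected - total) <= tolerance


# Values from the invoice example: $1,234.56 with 8.5% tax = $1,339.50.
print(totals_match(Decimal("1234.56"), Decimal("8.5"), Decimal("1339.50")))
# True

# A total that lost its leading digit during extraction fails the check.
print(totals_match(Decimal("1234.56"), Decimal("8.5"), Decimal("339.50")))
# False
```

`Decimal` is used instead of `float` so that currency arithmetic does not pick up binary rounding error.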
Try it now
The full number-extractor runs in your browser at https://ztools.zaions.com/number-extractor with no signup and no upload; no data leaves your device.
Last updated: 2026-05-05 · Author: Ahsan Mahmood