keyword-extractor

A keyword extractor analyses a body of text and surfaces the most informative words and phrases (the ones that distinguish the document from generic English), using algorithms like TF-IDF (term frequency–inverse document frequency) or RAKE (Rapid Automatic Keyword Extraction) to weigh significance. The ZTools Keyword Extractor runs entirely in the browser and supports single-word and multi-word phrase extraction, configurable stopword filtering (English / Spanish / French / German), and an n-gram range of 1-3 words. It outputs a ranked list with relevance scores.

Use cases

SEO content audit

Paste a draft blog post; the extractor surfaces the dominant keywords. Compare them to your target keyword: is the post actually about what you intended?

Document summarisation

Paste a long report or article. The top keywords give you the gist in about 30 seconds, without reading the whole document.

Tag suggestion for a CMS

Extract candidate tags for blog posts or knowledge-base articles. For large content libraries, this tends to be more consistent than human-only tagging.

Theme analysis across documents

Paste each document separately; compare top keywords to see which themes recur and which are unique.
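
The comparison step can be done by hand, but for more than a few documents a short script helps. The sketch below is only an illustration: it assumes you have copied each document's top keywords out of the tool into a dictionary, and the function name `compare_keywords` and the document labels are hypothetical, not part of the tool.

```python
from collections import Counter


def compare_keywords(keywords_by_doc: dict[str, list[str]]):
    """Split keywords into recurring themes and per-document unique ones.

    keywords_by_doc maps a document label to its top keywords
    (assumed to be copied from the extractor's ranked list).
    """
    # Count in how many documents each keyword appears (not how often).
    doc_counts = Counter(
        kw for kws in keywords_by_doc.values() for kw in set(kws)
    )
    recurring = sorted(kw for kw, n in doc_counts.items() if n > 1)
    unique = {
        doc: sorted(kw for kw in set(kws) if doc_counts[kw] == 1)
        for doc, kws in keywords_by_doc.items()
    }
    return recurring, unique


# Hypothetical top keywords for two travel reviews.
recurring, unique = compare_keywords({
    "review_a": ["boutique hotel", "rooftop bar", "old town"],
    "review_b": ["old town", "walking distance", "street food"],
})
print("Recurring:", recurring)
print("Unique:", unique)
```

Here "old town" shows up as the shared theme, while the remaining phrases are specific to one review each.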

How it works

  1. Paste text: a single document. For multi-document analysis, run each document separately and compare.
  2. Pick algorithm: TF-IDF works best with a reference corpus. RAKE works on a single document and favours phrases over single words.
  3. Configure: stopword language, n-gram range (1 = single words, 2-3 = phrases), max results.
  4. Tokenise + score: lowercase, drop stopwords, tokenise into words / n-grams, score by the chosen algorithm.
  5. Export: ranked list with score + frequency, as plain text or CSV.
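
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the stopword list is a tiny stand-in, raw frequency is used in place of a real TF-IDF or RAKE score, and all function names are hypothetical.

```python
import csv
import io
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); real lists are much longer.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def tokenise(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens: list[str], n_min: int, n_max: int) -> list[str]:
    """Build n-grams of n_min..n_max words that contain no stopword."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in window):
                grams.append(" ".join(window))
    return grams


def extract(text: str, n_min: int = 1, n_max: int = 3,
            max_results: int = 10) -> list[tuple[str, int]]:
    """Rank candidates by raw frequency (a stand-in for a real score)."""
    counts = Counter(ngrams(tokenise(text), n_min, n_max))
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max_results]


def to_csv(rows: list[tuple[str, int]]) -> str:
    """Serialise the ranked list as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["keyword", "frequency"])
    writer.writerows(rows)
    return buf.getvalue()


sample = ("Machine learning models need training data. "
          "Training data is the fuel of machine learning.")
top = extract(sample, n_min=2, n_max=3, max_results=2)
print(to_csv(top))
```

With phrase mode (2-3 words) the two repeated phrases, "machine learning" and "training data", rank first. Note that this simple tokeniser ignores sentence boundaries, so it can also produce candidates that span two sentences.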

Examples

Input: 500-word article on machine learning

Output: Top keywords: "machine learning", "neural network", "training data", "model accuracy", ranked by score.


Input: Product description

Output: Top: "ergonomic", "adjustable height", "lumbar support". Surfaces the standout features.


Input: Multi-paragraph travel review

Output: Top: "boutique hotel", "rooftop bar", "old town", "walking distance". Captures the experience.

Frequently asked questions

TF-IDF vs RAKE: which to use?

TF-IDF needs a corpus to compute IDF (e.g. all documents on your blog) and surfaces what makes this document unique. RAKE works on a single document and surfaces dense phrases. Use RAKE for one-off analysis and TF-IDF for cross-document analysis.
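
To make the "dense phrases" idea concrete, here is a minimal sketch of RAKE-style scoring on a single document. It assumes the commonly described scheme (split the text into candidate phrases at stopwords and punctuation, score each word as degree / frequency, and sum word scores per phrase); the stopword list is a small stand-in and the function names are hypothetical, so treat it as an illustration rather than the tool's code.

```python
import re
from collections import defaultdict

# Small illustrative stopword list (assumption).
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on",
             "the", "to", "with"}


def candidate_phrases(text: str) -> list[list[str]]:
    """Split text into phrases, breaking at punctuation and stopwords."""
    phrases = []
    # Punctuation ends a phrase, so cut the text into fragments first.
    for fragment in re.split(r"[^\w\s'-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake(text: str) -> list[tuple[str, float]]:
    """Return phrases ranked by the sum of their word scores."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # total length of phrases containing it
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = rake("The boutique hotel is in the old town. "
              "The hotel is quiet, and the rooftop bar is popular.")
for phrase, score in ranked:
    print(f"{score:.1f}  {phrase}")
```

In this example "old town" and "rooftop bar" score 4.0, "boutique hotel" scores 3.5 (because "hotel" also appears on its own), and the single words score lowest, which is the phrase bias mentioned above.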

Should I use single words or phrases?

Phrases (2-3 words) are usually more informative ("machine learning" beats "machine" alone). Single words help for very short texts.

Stopwords: what are they?

Stopwords are common words (the, is, of, and) that appear everywhere and provide little discrimination. They are removed before scoring. Custom stopword lists let you remove domain-specific noise.
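
As a rough illustration of how a custom list changes the candidates, the sketch below filters tokens against a small base list plus optional extra stopwords. The base list, the `filter_tokens` helper, and the brand name "Acme" are all hypothetical and chosen only for the example.

```python
import re

# Small illustrative base list (assumption); real lists are much longer.
BASE_STOPWORDS = {"a", "and", "in", "is", "of", "the", "to"}


def filter_tokens(text: str, extra_stopwords=frozenset()) -> list[str]:
    """Lowercase, tokenise, and drop base plus custom stopwords."""
    stop = BASE_STOPWORDS | set(extra_stopwords)
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [tok for tok in tokens if tok not in stop]


text = "The Acme chair is the best of the Acme range"

# Default list: the brand name survives and can rank as a keyword.
print(filter_tokens(text))

# Brand added as a custom stopword: only topic words remain.
print(filter_tokens(text, extra_stopwords={"acme"}))
```

The first call keeps "acme" among the candidates; the second leaves only "chair", "best", and "range".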

How accurate is keyword extraction?

The algorithms are fast and largely language-agnostic (only the stopword list depends on the language), but they are purely statistical and miss semantic synonyms (AI ↔ artificial intelligence). For deep semantic analysis, switch to embedding-based tools.

Is the input uploaded?

No. Processing is client-side only.

Why are some "obvious" keywords missing?

If they appear too uniformly across a corpus (TF-IDF) or lack co-occurring related words (RAKE), their score is low. Tune the algorithm parameters or supply a different corpus.
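
The TF-IDF case is easy to see in a small worked example. The sketch below assumes the plain, unsmoothed variant (relative term frequency times log(N / df)); other variants add smoothing, so exact numbers will differ from any particular tool. The corpus and function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenise(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tf_idf(doc: str, corpus: list[str]) -> dict[str, float]:
    """Score each term of doc against corpus.

    Assumes doc is itself part of corpus, so every term has a
    document frequency of at least 1 and the division is safe.
    """
    tokens = tokenise(doc)
    tf = Counter(tokens)
    doc_sets = [set(tokenise(d)) for d in corpus]
    scores = {}
    for term, count in tf.items():
        df = sum(1 for s in doc_sets if term in s)
        idf = math.log(len(corpus) / df)  # 0 when the term is in every doc
        scores[term] = (count / len(tokens)) * idf
    return scores


# Invented three-document corpus: "coffee" and "review" appear everywhere.
corpus = [
    "coffee grinder review burr grinder",
    "coffee machine review espresso",
    "coffee beans review arabica",
]
scores = tf_idf(corpus[0], corpus)
for term, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{score:.3f}  {term}")
```

"coffee" looks like the obvious keyword, but because it occurs in every document its IDF is log(3/3) = 0, so it scores zero, while "grinder" ranks first. Supplying a broader corpus in which not every document is about coffee would bring it back.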

Tips

  • Always check the top 10: they tell you what your document is "really" about, which often differs from your intent.
  • Use 2-3 word phrase mode for blog and product content; single-word mode for short texts.
  • Remove brand names from stopwords if you want them as keywords; add them as stopwords if you want underlying-topic analysis.
  • For SEO, compare extracted keywords to your target search keyword. If they don't match, the article needs editing.
  • Run on chunks (per section) for long documents, because global keywords miss local themes.

Try it now

The full keyword-extractor runs in your browser at https://ztools.zaions.com/keyword-extractor with no signup and no upload; no data leaves your device.


Last updated: 2026-05-05 · Author: Ahsan Mahmood