Killer-Skills

blog-voice-analyzer

v1.0.0
GitHub

About this Skill

blog-voice-analyzer is a skill that uses Natural Language Processing (NLP) to detect AI-sounding patterns in blog articles and provide diagnostics for improvement. Perfect for NLP agents needing advanced blog content analysis and human-like writing pattern detection.

Features

Invoked by the user with the `/blog-voice-analyzer` command
Analyzes blog articles for AI-sounding patterns using NLP
Provides diagnostics to improve writing before publishing
Uses `pipenv run python3 scripts/blog/ai_voice_analyzer.py` for execution
Accepts `<path-to-mark>` as input for blog post analysis
Helps avoid being flagged by Google's helpful content signals

Abhi5415
Updated: 3/3/2026

Quality Score

Top 5% / 42 (Excellent), based on code quality & docs

Installation

Universal install (auto-detect), supported in Cursor, Windsurf, and VS Code:

```bash
npx killer-skills add Abhi5415/clarido-marketing/blog-voice-analyzer
```

Agent Capability Analysis

The blog-voice-analyzer MCP Server by Abhi5415 is an open-source community integration for Claude and other AI agents, enabling seamless task automation and capability expansion.

Ideal Agent Persona

Perfect for NLP Agents needing advanced blog content analysis and human-like writing pattern detection.

Core Value

Empowers agents to analyze blog articles for AI-sounding patterns using NLP, providing diagnostics to improve writing and avoid Google's helpful content signals, through Python scripts run via pipenv.

Capabilities Granted for blog-voice-analyzer MCP Server

Analyzing blog posts for AI-generated content detection
Improving blog article writing style to sound more human-like
Debugging content to avoid Google's helpful content signals

Prerequisites & Limits

  • Requires Python 3 execution environment
  • Needs access to blog article text data
  • Limited to analyzing text-based blog content
Project

  • SKILL.md (12.0 KB)
  • .cursorrules (1.2 KB)
  • package.json (240 B)

SKILL.md

Blog Voice Analyzer

Analyze blog articles for AI-sounding patterns using NLP, then use the diagnostics to improve the writing before publishing. The goal is content that reads more human than virtually all other blog articles — and avoids being flagged by Google's helpful content signals.

Invoked by the user with /blog-voice-analyzer or when asked to "check" or "analyze" a blog post for AI patterns.

Quick Start

```bash
pipenv run python3 scripts/blog/ai_voice_analyzer.py <path-to-markdown-file>
```

The script accepts any text file (markdown, plain text). It strips frontmatter, HTML, and markdown formatting before analysis.
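The frontmatter-stripping step can be pictured with a minimal sketch. This is illustrative only: `strip_frontmatter` is a hypothetical helper, and the actual script additionally strips HTML and markdown formatting before analysis.

```python
import re

def strip_frontmatter(text: str) -> str:
    """Drop a leading YAML frontmatter block delimited by '---' lines.
    Hypothetical sketch; the real script also strips HTML and markdown."""
    return re.sub(r"\A---\n.*?\n---\n", "", text, flags=re.DOTALL)
```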

What It Detects

The analyzer runs 18 independent checks organized into four categories. Each scores 0-100 (higher = more human), and the overall score is a weighted average.

Structural Patterns

| Check | What It Measures | AI Signature | Human Signature |
|---|---|---|---|
| Sentence length variance | Std dev of word counts per sentence | Clusters around 15-20 words (std dev < 4) | Wild variation: 3-word fragments mixed with 30-word sentences (std dev 8+) |
| Sentence opener diversity | POS patterns of first 2 tokens in each sentence | 40%+ start with "The" or "This" | Fragments, questions, conjunctions, inversions, prepositional phrases |
| Clause depth variety | Max dependency tree depth per sentence | Uniform depth across sentences | Mix of flat simple sentences and deeply nested complex ones |
| Paragraph size variety | Coefficient of variation of paragraph word counts | Every paragraph roughly the same length | One-liners mixed with long blocks |
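A simplified version of the sentence length variance check, using a regex sentence split and whitespace tokenization in place of spaCy's segmenter (function and variable names here are illustrative, not the script's):

```python
import re
import statistics

def sentence_length_stdev(text: str) -> float:
    """Std dev of words per sentence. Simplified: the analyzer uses
    spaCy's sentence segmenter, not this regex split."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

monotone = "The tool runs checks. The tool scores text. The tool prints results."
varied = "Stop. The analyzer runs eighteen independent checks across four categories before it prints a single line of output."
```

Three identical 4-word sentences give a std dev of zero, while a 1-word fragment next to a 17-word sentence lands at the 8+ human signature.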

Vocabulary & Word Choice

| Check | What It Measures | AI Signature | Human Signature |
|---|---|---|---|
| Vocabulary diversity (TTR) | Type-token ratio of content words | Low TTR (< 45%) — recycles same words | Higher TTR, though some writers deliberately use simple vocabulary |
| Hedge/filler phrases | Exact match against ~50 AI-marker phrases + ~35 signal words | "It's important to note", "multifaceted", "leverage", "delve", "cornerstone" | Zero matches |
| Weak adverbs | Density of "really", "very", "literally", "significantly", etc. | > 1% density | Replaced with stronger verbs or cut entirely |
| Nominalization density | Nouns ending in -tion, -ment, -ness, -ity, -ence, -ance | > 5% — "reduction", "transition", "consumption" instead of active verbs | < 3% — prefers "reduce", "shift to", "consume" |
| Vague verb phrases | "contributes to", "remains a", "poses a", "provides a", "aims to", etc. | 4-6+ per article | Zero — uses direct assertions |
| Word repetition | Content words exceeding expected frequency (topic words get higher threshold) | "Substantial" 4x in 300 words | Topic words may repeat naturally; non-topic words stay varied |
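The TTR check reduces to a few lines. This simplified sketch counts all words, whereas the real check restricts to content words via spaCy POS tags:

```python
import re

def type_token_ratio(text: str) -> float:
    """Unique words / total words. Simplified: the analyzer counts only
    content words, filtered by spaCy POS tags."""
    tokens = [w.lower() for w in re.findall(r"[a-zA-Z']+", text)]
    return len(set(tokens)) / len(tokens) if tokens else 0.0
```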

Voice & Personality

| Check | What It Measures | AI Signature | Human Signature |
|---|---|---|---|
| Personal voice | First person ("I", "we"), second person ("you"), contractions ("don't", "it's") | Zero of all three | First person for opinions, second person for engagement, contractions for warmth |
| Questions asked | Sentences ending with ? | Zero questions — pure declaration | 5-10% of sentences are questions (rhetorical or direct) |
| Concrete specifics (NER) | Named entities: people, places, dates, numbers, orgs | Zero — everything abstract and generic | Names, dates, numbers, real examples |
| Readability register | Flesch-Kincaid grade + avg syllables per word | Grade 14+ (academic), avg syllables > 1.8 | Grade 6-10 (conversational), avg syllables < 1.6 |
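A rough regex sketch of the personal voice check (the analyzer inspects spaCy tokens instead, and the word lists here are small samples):

```python
import re

def voice_markers(text: str) -> dict:
    """Counts the three personal-voice signals. Simplified regex
    version with sample word lists; not the analyzer's implementation."""
    lowered = text.lower()
    return {
        "first_person": len(re.findall(r"\b(?:i|we|my|our)\b", lowered)),
        "second_person": len(re.findall(r"\b(?:you|your)\b", lowered)),
        "contractions": len(re.findall(r"\b\w+'(?:t|s|re|ve|ll|d)\b", lowered)),
    }
```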

Micro Patterns

| Check | What It Measures | AI Signature | Human Signature |
|---|---|---|---|
| Passive voice | Dependency labels nsubjpass / auxpass | > 15% of sentences | < 10% |
| Transition word openers | "However", "Furthermore", "Additionally" at sentence start | > 0.5 per paragraph | Let ideas flow without signposting |
| Triple-item lists | "X, Y, and Z" coordinated patterns | > 2 per 1000 words | Not everything comes in threes |
| Paired adjective cliches | "ADJ and ADJ" via dependency parse ("smooth and swift", "widespread and uniform") | > 3 per 1000 words | Picks the stronger word |
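The transition-opener check reduces to a rate per paragraph. A simplified sketch with a sample word set (the analyzer's list is longer, and it works on spaCy sentences):

```python
import re

TRANSITIONS = {"however", "furthermore", "additionally", "moreover"}

def transition_openers_per_paragraph(text: str) -> float:
    """Transition-word sentence openers per paragraph (> 0.5 flags as AI).
    Sketch only: sample word set, regex sentence split."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if not paragraphs:
        return 0.0
    hits = 0
    for p in paragraphs:
        for s in re.split(r"(?<=[.!?])\s+", p):
            words = s.split()
            if words and words[0].strip(".,;:").lower() in TRANSITIONS:
                hits += 1
    return hits / len(paragraphs)
```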

How to Read the Output

Overall Score

| Range | Interpretation |
|---|---|
| 75-100 | Reads naturally. Minor tweaks on flagged items. |
| 55-74 | Some AI patterns visible. Targeted rewrites recommended. |
| 35-54 | Clear AI voice. Significant rewriting needed. |
| 0-34 | Strongly AI-generated. Full rewrite recommended. |
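The bands above map directly to a trivial lookup, sketched here for illustration:

```python
def interpret(score: float) -> str:
    """Maps an overall 0-100 score to the interpretation bands above."""
    if score >= 75:
        return "Reads naturally"
    if score >= 55:
        return "Some AI patterns visible"
    if score >= 35:
        return "Clear AI voice"
    return "Strongly AI-generated"
```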

Flagged Sentences

Only sentences with 2+ issues are flagged (reduces noise). Each is labeled HIGH/MED severity with specific diagnostics like:

```
[HIGH] #13:
"The variability in charging station availability, especially in rural areas, poses a challenge..."
  → Vague verb: "poses a"
  → Nominalization-heavy (4): variability, station, availability, distance
  → Length (20w) ≈ average (21w)
```

These are the sentences to rewrite first.

Priority Fixes

The summary lists the 5 worst-scoring dimensions with specific actions. Address these in order.

How to Use the Output to Improve Content

Step 1: Run the Analyzer

```bash
pipenv run python3 scripts/blog/ai_voice_analyzer.py path/to/article.md
```

Step 2: Fix Priority Items Top-Down

Work through the priority fixes list. The most impactful improvements by category:

If Personality scores low (< 70):

  • Add contractions throughout: "do not" → "don't", "it is" → "it's", "you are" → "you're"
  • Add first person for opinions: "I find...", "In my experience...", "We've all been there"
  • Add second person to engage the reader: "You know that feeling when..."

If Questions score low (< 50):

  • Turn topic sentences into questions: "Decision fatigue depletes willpower" → "Why does making too many choices drain you?"
  • Add rhetorical questions that make the reader feel seen: "Ever notice how the simplest decisions feel impossible by 8pm?"
  • End sections with a question that leads to the next point

If Readability scores low (< 70):

  • Replace latinate words with shorter ones: "utilize" → "use", "approximately" → "about", "demonstrate" → "show"
  • Break long sentences at semicolons and em dashes
  • Read it out loud — if you wouldn't say it to a friend, simplify it
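The analyzer relies on the textstat package for this dimension. As a self-contained approximation, the Flesch-Kincaid grade formula with a crude vowel-group syllable counter (numbers will differ from textstat's; the sample strings are made-up examples):

```python
import re

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59.
    Vowel-group syllable counting is a rough heuristic; textstat is more careful."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[a-zA-Z]+", text)
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

simple = "I like short words. They read fast."
academic = "The operationalization of multidimensional conceptualizations necessitates comprehensive institutional standardization."
```

Short, conversational sentences land far below the grade-14 academic register that gets flagged.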

If Nominalization scores low (< 70):

  • Find every -tion/-ment/-ness noun and convert to active verb:
    • "The reduction of stress" → "Reducing stress"
    • "The implementation of the system" → "Implementing the system"
    • "His contribution to the project" → "He contributed to the project"

If Hedge Phrases are detected:

  • Delete them. Every single one. They add no meaning:
    • "It's important to note that X" → "X"
    • "In today's fast-paced world" → delete entirely
    • "multifaceted" → "complex" or just describe the specific facets

If Vague Verbs are detected:

  • Replace with direct assertions:
    • "X contributes to Y" → "X causes Y" or "X increases Y"
    • "X remains a significant challenge" → "X is still hard" or explain specifically why
    • "X aims to achieve Y" → "X does Y" or "X tries to Y"

If Sentence Variance is low (< 70):

  • Add 3-5 word fragment sentences: "That's the point." / "And it works." / "Here's why."
  • Break one paragraph's longest sentence into two
  • Start a sentence with "But" or "And"

Step 3: Re-run and Compare

```bash
pipenv run python3 scripts/blog/ai_voice_analyzer.py path/to/article-v2.md
```

Target: overall score > 80, zero HIGH-severity flagged sentences.

Step 4 (Optional): Run the Model Benchmark

To compare how different AI models perform on similar prompts:

```bash
pipenv run python3 scripts/blog/benchmark_models.py
```

This generates 3 articles per model (Claude Sonnet, Haiku; GPT-4o, 4o-mini, o3-mini) on psychology/productivity topics and scores them all. Edit TOPICS and MODELS in the script to customize.

Known Limitations

Things the analyzer catches well:

  • Vocabulary patterns (hedge phrases, signal words, nominalizations, vague verbs)
  • Structural monotony (sentence length, opener repetition, clause depth)
  • Voice absence (no personality, no questions, no contractions)
  • Register mismatch (academic grade level for blog content)

Things the analyzer does NOT catch:

  • Semantic emptiness — a sentence can sound varied and use no hedge phrases but still say nothing substantive. The tool can't judge whether content is insightful vs. generic platitudes.
  • Template structure at the section level — Topic → Explanation → Topic → Explanation repeating throughout a piece is a strong AI pattern, but the tool only looks at sentence and paragraph level, not section-level templates.
  • Cliche ideas — the tool checks how you write, not what you write. An article full of obvious advice ("get enough sleep", "exercise regularly") will score fine structurally.
  • Over-hedging without trigger phrases — a passage can be wishy-washy and non-committal without using any of the specific phrases in our list.
  • Images, formatting, and layout — the tool only analyzes prose text.
  • Short text — articles under ~300 words produce unreliable scores because statistical measures need volume. Some dimensions (TTR, sentence variance) are especially noisy on short texts.
  • Topic word inflation — the word repetition detector tries to exclude topic words (extracted from the title and first paragraph), but it's imperfect. If your article's subject uses common words, they may still get flagged.
  • Style vs. defect — some writers deliberately use simple vocabulary (low TTR), avoid questions (declarative style), or write uniformly-sized paragraphs. The tool treats these as weaknesses because they correlate with AI output, but they may be intentional style choices. Use judgment.
  • False paired adjective flags — some paired adjectives are natural and effective ("fast and powerful"). The tool applies a per-1000-words density threshold but can still flag good writing.

Benchmark Reference Scores

From testing across known human and AI text:

| Text | Score | Notes |
|---|---|---|
| Paul Graham essays | 85-86 | Gold standard for clear, human prose |
| Clarido blog post (AI-written, edited) | 84 | Well-crafted AI output with personality |
| Claude Haiku (default prompt) | 82 avg | Best out-of-the-box AI model |
| Claude Sonnet (default prompt) | 77 avg | Higher grade level, more nominalizations |
| GPT-4o-mini (default prompt) | 70 avg | Low personality, no contractions |
| o3-mini (default prompt) | 68 avg | Zero contractions, impersonal |
| GPT-4o (default prompt) | 66 avg | Worst personality, fewest questions |
| Raw ChatGPT (no prompting) | 40 | Clear AI voice across all dimensions |

Target for published content: 80+ with zero HIGH-severity flagged sentences.

Technical Details

Dependencies: spacy, textstat (Python, installed via pipenv). Requires the en_core_web_sm spaCy model.

How scoring works: Each of the 18 dimensions scores 0-100. The overall score is a weighted average with personality (0.12), hedge phrases (0.10), entity density (0.08), sentence variance (0.08), and vague verbs (0.07) weighted highest. Full weight table is in scripts/blog/ai_voice_analyzer.py in the analyze() function.
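The weighting reduces to a normalized weighted mean. The sketch below uses only the five weights quoted above, and the dimension scores are made-up examples (the full 18-entry table lives in `analyze()` in the script):

```python
def overall_score(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension 0-100 scores, normalized by the
    weights present. Sketch: only the five quoted weights are shown."""
    total = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total

weights = {"personality": 0.12, "hedge_phrases": 0.10, "entity_density": 0.08,
           "sentence_variance": 0.08, "vague_verbs": 0.07}
scores = {"personality": 90, "hedge_phrases": 70, "entity_density": 50,
          "sentence_variance": 80, "vague_verbs": 60}
```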

Sentence flagging threshold: Only sentences with 2+ co-occurring issues are flagged. Single-issue sentences are not surfaced to reduce noise. The "length close to average" flag only triggers when the document's overall sentence length std dev is below 6 (indicating genuine monotony).
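A minimal sketch of that threshold logic (names are illustrative):

```python
def flagged_indices(issues_per_sentence):
    """Indices of sentences with 2+ co-occurring issues; single-issue
    sentences are suppressed, mirroring the noise threshold above."""
    return [i for i, issues in enumerate(issues_per_sentence) if len(issues) >= 2]
```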

Files

| File | Purpose |
|---|---|
| scripts/blog/ai_voice_analyzer.py | Main analyzer — run on any text file |
| scripts/blog/benchmark_models.py | Generate + score articles across Claude and GPT models |
| tmp/benchmark/ | Cached generated articles and results.json from benchmark runs |

Related Skills

Looking for an alternative to blog-voice-analyzer or building a community AI agent? Explore these related open-source MCP Servers:

  • widget-generator: an open-source AI agent skill for creating widget plugins that are injected into prompt feeds on prompts.chat. It supports two rendering modes: standard prompt widgets using default PromptCard styling and custom render widgets built as full React components.
  • chat-sdk (lobehub): a unified TypeScript SDK for building chat bots across multiple platforms, providing a single interface for deploying bot logic.
  • zustand (lobehub)
  • data-fetching (lobehub)