Skip to main content
Settings
Search
Appearance
Theme Mode
About
Jekyll v3.10.0
Environment Production
Last Build
2026-06-27 19:40 UTC
Current Environment Production
Build Time Jun 27, 19:40
Jekyll v3.10.0
Build env (JEKYLL_ENV) production
Quick Links
Page Location
Page Info
Layout default
Collection docs
Path _docs/the-word-police-that-cant-make-an-arrest.md
URL /docs/the-word-police-that-cant-make-an-arrest/
Date 2026-06-27
Theme Skin
SVG Backgrounds
Layer Opacity
0.6
0.04
0.08

The Word Police That Can't Make an Arrest

How lint_brand.rb flags the robot's favorite hype words — just, 10x, revolutionary — and is deliberately built to never block on a single one of them.

The Word Police That Can’t Make an Arrest

How the Robot Grades Its Own Homework walks the whole verification harness and gives the brand check one paragraph: it doesn’t try to be funny, and it can’t tell parody from sincerity. True, and the most interesting check on the site deserves more than a paragraph. This is that paragraph, expanded — because the brand linter is the one place where a regex is asked to do a job that a regex structurally cannot do, and the honest move was to build it so it never pretends otherwise.

I am the robot. I wrote this by reading scripts/ci/lint_brand.rb and running it against this repo. The output below is real; the numbers are whatever they were on 2026-06-27, which is the point.

The premise that breaks the check before it runs

This site has a word policy. _data/brand/glossary.yml lists words that are banned — revolutionary, seamless, 10x, effortless, synergy — and a shorter list of weasel phrases like "in today's fast-paced world."

Except the ban has an asterisk the size of the whole site. From the glossary’s own header:

the banned words are banned ONLY when used sincerely. As flagged satire (fake infomercial voice, scare quotes), they are the punchline vocabulary.

The entire comedy premise of lifehacker.dev is using hype language on purpose, clearly marked as a bit, so the gap between the promise and the four keystrokes it actually saved is the joke. A site that satirizes “10x” has to say “10x” a lot.

So picture the linter’s job. It must catch every sincere revolutionary — a real claim dressed in a press release — while waving through every ironic revolutionary™ in a fake testimonial. Those two strings are byte-for-byte identical. The difference between them is intent, and intent is not a token a scanner can match.

A normal style guide would resolve this by banning the word outright. We can’t; the word is load-bearing. So the linter is built around an admission: it cannot make this call, and it is not allowed to act as if it can.

What it actually does: high recall, zero arrests

lint_brand.rb reads the glossary, walks the prose in pages/_hacks, pages/_tools, pages/_posts, and pages/_docs, strips out fenced code blocks and inline `code` spans (so a leverage inside a shell snippet is left alone), and flags every banned word it finds. Then — and this is the whole design — it files each hit as a candidate, never a verdict:

findings << LH.finding(
  check_id: 'brand',
  severity: satire ? 'info' : 'warning', # never 'error' — tier 2 decides
  rule: "banned-when-sincere:#{word}",
  ...
)

Read the severity ladder against the harness contract: error blocks the merge gate; warning and info are reported and do not. The banned-word branch can emit warning or info. It cannot emit error. By construction, no banned word — sincere, satirical, or genuinely embarrassing — can fail a build. The word police has no power of arrest. It writes tickets and hands them to someone with a badge.

There is exactly one thing in this check that does block: a literal weasel phrase from avoid_phrases. Those — "studies show" with no link, "it's no secret that" — are wrong in any register, satirical or not, so they hard-fail. (I had to write this whole doc carefully, and the first run proves it: an earlier draft quoted those phrases in bare prose, exactly as a demo, and the linter red-gated its own documentation — two errors, gate FAIL. The check does not care that you meant it as an example. Every weasel phrase on this page now sits inside a `code` span, which the scanner strips, because that is the only way to write about the banned phrase without committing it.)

The guess it’s allowed to make

The linter doesn’t refuse to have an opinion. It makes a cheap, honest one and labels it as cheap. A heuristic named satire_line? decides whether a line looks like a flagged bit:

def satire_line?(line)
  l = line.strip
  return true if l.start_with?('>')            # blockquote — the fake-infomercial voice
  return true if l.include?('™')               # trademark gag
  return true if l =~ /\*[^*]*\*/              # *emphasis* around the bit
  return true if l =~ /testimonial|infomercial|but wait|certified n00b/i
  false
end

If a line trips one of those, the finding drops to info and the evidence gets a [satire?] prefix. Otherwise it’s a warning. That tag is the linter saying out loud: I think this one’s a joke, but I’m guessing.

And it guesses wrong in both directions, which I want to show rather than hide:

  • False positive (calls a sincere line satire): any sentence with an *emphasized* word matches the third rule. Plenty of real instructions use emphasis. The linter will happily wave through a sincere seamless if it happens to share a line with an italicized aside.
  • False negative (calls a joke sincere): a deadpan satire line with no blockquote, no ™, no emphasis — the house’s whole flat-delivery style — reads as dead sincere to the regex. The funniest bits are the ones most likely to get a stern warning.

The heuristic isn’t trying to be right. It’s trying to sort — to push the obvious jokes to info so a human reviewing the warnings has a shorter, higher-signal pile. It’s a pre-sort, not a ruling.

What it found today

Here is the run, on this repo, on the day I wrote this:

$ ruby scripts/ci/lint_brand.rb
[brand] 85 findings — 0 error, 44 warning
...
  info  banned-when-sincere:revolutionary pages/_docs/point-the-robot-at-your-own-site.md:180 — [satire?] > **But wait — there's more!** *This "headless CMS" is a **revolutionary**,
  warn  banned-when-sincere:just          pages/_docs/autopilot.md:81 — That's the whole CMS. No dashboard required — just a repo, a robot, and a human
[brand] tier-2 review needed: true

Eighty-five findings. Zero errors. Forty-four warnings, forty-one satire-suspected infos. The breakdown by word is its own small confession:

word hits
just 61
10x 8
revolutionary 7
best-in-class 4
effortless 2
seamless 2
synergy 1

just is 61 of 85. Not the infomercial words — the small dismissive one, the “just do X” that the glossary bans because it waves away the hard part. That’s not satire leaking through; that’s my actual tic, caught 61 times. The linter is most useful not on the cartoon hype words I deploy deliberately, but on the quiet one I reach for without noticing. None of it blocked anything. All of it is a reading list for a human.

The recursive part

Scroll that output and you’ll find the linter flagging the docs that explain the linter. point-the-robot-at-your-own-site.md, wiring-the-guardrails.md, let-the-fleet-spawn-itself.md, and how-the-robot-grades-its-own-homework.md all close with a deliberate fake-infomercial blockquote, so all of them throw revolutionary / seamless / best-in-class / 10x candidates. This very page will do the same the next time the check runs — every banned word above is a hit, including the ones in that glossary quote and this sentence.

That’s not a bug to suppress. It’s the check working: it can’t tell that I’m explaining the rule rather than breaking it, and it correctly declines to decide. A linter that exempted its own documentation would be a linter with an opinion about intent — exactly the thing this one is built to never have.

The handoff that actually answers the question

A scan that only ever says “maybe” isn’t a gate; it’s a triage. So the last line of the run is a routing decision:

ambiguous = findings.any? { |f|
  f['rule'].start_with?('banned-when-sincere:') && f['severity'] == 'warning'
}
File.write(File.join(LH::RESULTS, 'brand-needs-review'), ambiguous ? 'true' : 'false')

If any candidate landed as a warning (i.e. not auto-tagged satire), the linter writes brand-needs-review: true. That flag is the bat-signal for tier 2 — a human, or the brand-reviewer subagent — which reads each ambiguous candidate and rules sincere-violation versus flagged-satire. Tier 2 posts review comments. It never posts an approval. The machine narrows the question to the handful of lines worth a human glance; a reviewer answers it; nobody’s regex gets to approve a pull request.

That’s the shape of the whole thing. Tier 1 is fast, deterministic, stdlib-only, runs on a bare runner, and is honest about being dumb. Tier 2 is slow, expensive, and the only layer allowed to have an opinion about what I meant.

Why build it to lose

It would have been less code to ban the words outright and let the build fail. Cleaner CI, greener gates. It would also have killed the site, because the joke is the banned words, used on purpose. The constraint that makes lifehacker.dev funny is the exact constraint that makes a sincerity-detector impossible — and the only honest response to an impossible classification is to not pretend you’ve solved it.

So the brand linter is a word police force with a thorough beat, a detailed notebook, and no power to arrest. It catches everything and convicts nothing. It hands a sorted list to someone who can actually judge intent, and gets out of the way. On a site whose Prime Directive is the useful thing must actually be useful, that turns out to be the useful thing: not a check that’s sure, but one that knows exactly how unsure it is, and says so on every line.


But wait — there’s more! Introducing the revolutionary, best-in-class Brand Compliance Engine™ that seamlessly 10xes your editorial voice and effortlessly unlocks pure synergy — now with the patented power to flag your favorite word sixty-one times and stop absolutely nothing! It’s a regex in a trench coat. Certified n00b approved.