aislop research should be repeatable. Public scans and benchmark writeups need pinned repositories, pinned scanner versions, raw JSON output, false-positive review, and detector changes that ship with regression tests.

Goals

  • Turn repeated AI-agent failure modes into deterministic rules.
  • Keep rule quality honest by scanning real repositories, not only fixtures.
  • Publish methods and limits so research posts are credible.
  • Feed product decisions with evidence about which rules matter, where noise appears, and what teams need to govern AI-written code.

Public scan protocol

For every public research run:
1. Define the cohort before scanning

Record the selection rule, such as GitHub Trending by language, top npm packages, benchmark tasks, framework repositories, or a publicly nominated list. Do not swap repositories after seeing results unless the reason is disclosed.
2. Pin every repository

Capture owner/repo, default branch, commit SHA, primary language, package manager, and whether install/build was attempted.
3. Pin the scanner

Capture aislop version, Node version, OS, config file, enabled engines, and exact command.
4. Store raw output

Keep the JSON result for each repository before writing a summary. Do not publish private source.
5. Classify findings

Sample top findings per rule and mark each as true positive, false positive, needs context, or setup/toolchain failure.
6. Convert learning into product changes

Tighten noisy detectors, add regression tests, improve source filtering, or document setup failures.
7. Publish method and limits

Include cohort, command, version, high-level results, representative examples, what changed in the CLI, and what the scan does not prove.
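The pinning steps above can be sketched as a small run manifest with a completeness check. This is only an illustration: the type names, field names, and `manifestProblems` helper are hypothetical and are not part of aislop or its JSON output. The SHA check assumes SHA-1 object IDs (40 hex characters).

```typescript
// Hypothetical manifest shapes; field names are illustrative, not aislop's schema.
interface PinnedRepo {
  ownerRepo: string; // "owner/repo"
  defaultBranch: string;
  commitSha: string; // full commit SHA, assumed to be 40 hex characters
  primaryLanguage: string;
  packageManager: string;
  installAttempted: boolean;
}

interface PinnedScanner {
  aislopVersion: string;
  nodeVersion: string;
  os: string;
  configFile: string;
  enabledEngines: string[];
  command: string; // the exact command that was run
}

interface RunManifest {
  selectionRule: string;
  scanner: PinnedScanner;
  repos: PinnedRepo[];
}

// Returns a list of human-readable problems; an empty list means the
// manifest records everything this sketch checks for.
function manifestProblems(m: RunManifest): string[] {
  const problems: string[] = [];
  if (m.selectionRule.trim() === "") problems.push("selection rule is empty");
  if (m.scanner.aislopVersion.trim() === "") problems.push("aislop version is missing");
  if (m.scanner.command.trim() === "") problems.push("exact command is missing");
  if (m.repos.length === 0) problems.push("cohort has no repositories");
  for (const r of m.repos) {
    if (!/^[^/\s]+\/[^/\s]+$/.test(r.ownerRepo)) {
      problems.push(`${r.ownerRepo}: expected owner/repo`);
    }
    if (!/^[0-9a-f]{40}$/.test(r.commitSha)) {
      problems.push(`${r.ownerRepo}: commit SHA is not a full 40-character hex SHA`);
    }
  }
  return problems;
}
```

Writing a manifest like this before the scan starts makes it harder to swap repositories or scanner versions silently after seeing results.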

Preferred command

For a published run, prefer a pinned scanner version:
npx aislop@<version> scan . --json
For local source-tree research against the current checkout:
AISLOP_NO_TELEMETRY=1 DO_NOT_TRACK=1 CI=1 NO_COLOR=1 node dist/cli.js scan "<repo>" --json
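One way to enforce the pinned-version preference in a harness is to refuse anything that is not an exact version before building the command string. The sketch below is an assumption about how such a guard might look; `isExactVersion` and `pinnedScanCommand` are hypothetical helpers, the version pattern is a simplified form of semantic versioning (no build metadata), and `0.0.0` in the usage line is a placeholder rather than a real aislop release.

```typescript
// Accepts exact versions such as "1.2.3" or "1.2.3-beta.1".
// Rejects dist-tags ("latest") and ranges ("^1.2.0", "1.x").
function isExactVersion(version: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(version);
}

// Builds the published-run command shown above for a pinned version.
// Assumes `target` is a path without spaces or shell metacharacters.
function pinnedScanCommand(version: string, target: string = "."): string {
  if (!isExactVersion(version)) {
    throw new Error(`"${version}" is not an exact version; pin the scanner for a published run`);
  }
  return `npx aislop@${version} scan ${target} --json`;
}

// Placeholder version, for illustration only.
const command = pinnedScanCommand("0.0.0");
```

The returned string is the same value that belongs in the "Command" field of the report, so the report and the run cannot drift apart.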

Report template

# Title

## Cohort

- Selection rule:
- Repositories:
- Date scanned:
- aislop version:
- Command:

## Headline Findings

- Finding 1
- Finding 2
- Finding 3

## Rule-Level Results

| Rule | Findings | Sampled | True positives | False positives | Action |
|---|---:|---:|---:|---:|---|

## What Changed

- Detector change:
- Tests added:
- Docs updated:

## Limits

- What this scan does not measure:
- Known setup failures:
- Follow-up cohort:
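
The Rule-Level Results table in the template can be filled from human review labels rather than by hand. The following is a minimal sketch under stated assumptions: `ReviewedFinding` is a hypothetical record that a reviewer produces alongside the raw scanner JSON (it is not aislop's output format), and the verdict values mirror the four classifications in the protocol.

```typescript
type Verdict =
  | "true positive"
  | "false positive"
  | "needs context"
  | "setup/toolchain failure";

// Hypothetical review record, kept separate from raw scanner output.
interface ReviewedFinding {
  rule: string; // rule identifier as reported by the scanner
  verdict: Verdict | null; // null means the finding was not sampled
}

interface RuleRow {
  rule: string;
  findings: number;
  sampled: number;
  truePositives: number;
  falsePositives: number;
}

// Tallies findings per rule, in first-seen order.
function ruleRows(findings: ReviewedFinding[]): RuleRow[] {
  const rows = new Map<string, RuleRow>();
  for (const f of findings) {
    let row = rows.get(f.rule);
    if (row === undefined) {
      row = { rule: f.rule, findings: 0, sampled: 0, truePositives: 0, falsePositives: 0 };
      rows.set(f.rule, row);
    }
    row.findings += 1;
    if (f.verdict !== null) row.sampled += 1;
    if (f.verdict === "true positive") row.truePositives += 1;
    if (f.verdict === "false positive") row.falsePositives += 1;
  }
  return Array.from(rows.values());
}

// Renders one table row; the Action column is left empty for the author.
function toMarkdownRow(r: RuleRow): string {
  return `| ${r.rule} | ${r.findings} | ${r.sampled} | ${r.truePositives} | ${r.falsePositives} | |`;
}
```

Findings marked "needs context" or "setup/toolchain failure" count as sampled but fall into neither positive column, which keeps the sampled total honest when true and false positives do not add up to it.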

Current research tracks

| Track | Purpose |
|---|---|
| GitHub Trending quality sweep | Scan trending public repositories by language to find noisy rules before users do |
| Agent output benchmark | Run the same tasks across coding agents and score the generated repositories |
| Benchmark-to-rule translation | Convert external benchmark signals into deterministic scanner rules |
| Rule provenance | Tie first-party AI-slop rules to motivating patterns, detector strategy, and legitimate exceptions |

What not to do

  • Do not publish leaderboards without pinned versions and a repeatable harness.
  • Do not claim a repository is bad because of a single scan.
  • Do not tune rules only to make one public report look better.
  • Do not use private customer code in public research.
  • Do not mix LLM judgment into scanner output. Label human review separately.