Asking GitHub to Recognize Lateralus
There's a moment in every language's life when it stops being a script you run locally and starts being a thing other people find on GitHub. For most languages, that moment coincides with GitHub adding your file extension to a YAML file in a Ruby gem called linguist. Until that happens, GitHub renders your code as plain text, doesn't show it in the language bar at the top of a repo, doesn't count it toward your profile's language statistics, and — most painfully — doesn't syntax-highlight it in pull-request diffs.
Today we've staged everything needed to get Lateralus through that door. Here's the full plan, in the open.
◉ What Linguist wants
Linguist's bar for a new language has four real rungs:
- A TextMate grammar (or tree-sitter grammar) with a stable scope name, kept in a public repo licensed permissively enough to be submoduled into Linguist.
- A file extension that isn't already claimed by something else.
- A sample corpus — ten-plus representative files — for Linguist's Bayesian classifier to train on.
- Evidence the language is actually in use: at least 200 unique :user/:repo repositories on GitHub containing files in that language.
Rungs 1–3 are engineering. Rung 4 is community.
◉ Rungs 1 and 2: the grammar
The grammar already existed — it has been shipping in our VS Code extension since v1.0. What it didn't have was a dedicated home Linguist could submodule. So we carved it out into bad-antics/lateralus-grammar, a standalone repo with the grammar under grammars/lateralus.tmLanguage.json, a sample corpus under samples/Lateralus/, an MIT license, and nothing else. That's the exact shape Linguist submodules like.
The scope name is the important bit: source.ltl. Stable, shipped, consumed by a marketplace extension with thousands of downloads, and — critically — what the languages.yml entry will reference via tm_scope: source.ltl.
The extension, .ltl, is unclaimed. We checked. No other language in lib/linguist/languages.yml registers it, which means adding Lateralus can't regress anyone else.
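The "we checked" step is easy to reproduce mechanically. Below is a minimal sketch, assuming languages.yml has already been parsed (with whatever YAML library you prefer) into a dict that maps each language name to its entry. The function name claimants and the sample data are illustrative only; they are not part of Linguist.

```python
def claimants(languages: dict, extension: str) -> list[str]:
    """Return the names of languages whose entry lists `extension`."""
    wanted = extension.lower()
    return sorted(
        name
        for name, entry in languages.items()
        if wanted in (ext.lower() for ext in entry.get("extensions", []))
    )


# Tiny stand-in for the parsed file; the real one has hundreds of entries.
sample = {
    "Python": {"type": "programming", "extensions": [".py", ".pyi"]},
    "Ruby": {"type": "programming", "extensions": [".rb"]},
    "Dockerfile": {"type": "programming", "filenames": ["Dockerfile"]},
}

print(claimants(sample, ".ltl"))  # an empty list means nobody registers it
print(claimants(sample, ".py"))
```

An empty result for ".ltl" against the real file is what "unclaimed" means here. Entries with no extensions key (languages matched by filename only) are simply skipped.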
◉ Rung 3: the sample corpus
A classifier is only as good as its training set, and the Linguist reviewers don't merge a language whose samples are all toy hello-world.ltl files. So we picked for domain diversity. The corpus now sitting in samples/Lateralus/ has 29 files across seven domains:
- Networking — SNMP agent, LoRa mesh, HTTP client
- Crypto — Ed25519 signatures, AES-CCM
- Retro — Game Boy ROM parser, PostScript interpreter
- Bioinformatics — FASTA streaming parser
- Industrial — Modbus RTU, CAN bus framing
- Compiler — constant-folding pass, capability tracker
- Audio & time — MIDI pipeline, IANA tz parser, GPX tracks
Every file in that list is production code pulled straight from our public lateralus-* repos, not something we wrote for the PR. That provenance matters: reviewers can click through to the commit history and watch the code evolve.
◉ Rung 4: the 200-repo bar
This is the one you can't engineer around. Linguist requires the language be in active use, not just technically valid. The current count — our own 7 first-party repos plus 40 community-written ones we've been able to audit by hand — sits at 47. Short of the bar, but on a trajectory.
The growth plan:
- Ship the 20-odd queued internal lateralus-* repos publicly.
- Land the lateralus new <template> scaffolder: every scaffolded project is a new repo on the Internet.
- Write a tutorial a week through the spring.
- Hit conferences.
Tracker page with a live progress bar: lateralus.dev/linguist. Counter script: count-ltl-repos.sh, runs daily.
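For the curious, the arithmetic behind a counter like this is small. The sketch below is not count-ltl-repos.sh itself. It assumes the input is a plain list of user/repo slugs, however they were collected, and every name in it is hypothetical.

```python
TARGET = 200  # the repo bar quoted above


def unique_repos(slugs: list[str]) -> set[str]:
    """Deduplicate user/repo slugs, ignoring case and stray whitespace."""
    return {slug.strip().lower() for slug in slugs if "/" in slug}


def progress(count: int, target: int = TARGET) -> str:
    """Format a count as 'N/target (P%)', capping the percentage at 100."""
    pct = min(100.0, 100.0 * count / target)
    return f"{count}/{target} ({pct:.1f}%)"


# Hypothetical slugs; duplicates and case variants collapse to one repo.
hits = ["alice/ltl-demo", "Alice/LTL-demo", "bob/mesh", "bob/mesh", "not-a-slug"]
print(len(unique_repos(hits)))  # 2
print(progress(47))             # 47/200 (23.5%)
```

Deduplicating on the lowercased slug matters because one repository can surface many times, once per matching file, and the bar counts repositories rather than files.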
◉ The .gitattributes trick
Here's the thing Linguist's CONTRIBUTING.md doesn't advertise loudly enough: Linguist reads .gitattributes on each repo, and those overrides beat the default detection rules. Which means, right now, today, before any upstream PR has been filed, every one of our repos can display "Lateralus" in the language bar by committing two lines:
*.ltl linguist-language=Lateralus
*.ltl linguist-detectable=true
We've done this across six first-party Lateralus repos: lateralus-lang, lateralus-stdlib, lateralus-compiler, lateralus-examples, lateralus-os, lateralus-grammar. Push, wait thirty seconds, refresh, and the language bar turns pink. It's a small thing, but it's the single most effective visibility multiplier on the entire plan, because the repos now appear correctly in GitHub's "languages" facet of search, in trending lists if they land there, and in contributors' polyglot stats.
If you run a repo containing Lateralus code, drop the same two lines into your .gitattributes today.
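If you maintain several repos, the change can be applied idempotently. This is a minimal sketch, assuming each repo is checked out locally; the helper name ensure_overrides is made up for illustration.

```python
from pathlib import Path

# The two override lines quoted above.
OVERRIDES = [
    "*.ltl linguist-language=Lateralus",
    "*.ltl linguist-detectable=true",
]


def ensure_overrides(repo_root: str) -> list[str]:
    """Append any missing override lines to .gitattributes; return the lines added."""
    path = Path(repo_root) / ".gitattributes"
    existing = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing}
    missing = [line for line in OVERRIDES if line not in present]
    if missing:
        # Keep whatever was already there and add only what is absent.
        path.write_text("\n".join(existing + missing) + "\n")
    return missing
```

Calling ensure_overrides(".") from a repo root creates the file if needed. A second call returns an empty list and leaves the file untouched, so it is safe to run in a loop over checkouts.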
◉ The PR itself
It's written. It sits in docs/linguist/pr-checklist.md with every box ticked except the 200-repo one. The languages.yml stanza is there too:
Lateralus:
  type: programming
  color: "#FF2A6D"
  extensions:
  - ".ltl"
  aliases:
  - lateralus-lang
  tm_scope: source.ltl
  ace_mode: text
The color #FF2A6D is the hot pink from the official palette. It clears the WCAG AA 3:1 contrast threshold for non-text UI elements against both white and black, and nobody else on the Linguist palette uses it.
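The contrast claim can be checked by hand. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio; the function names are our own, and it assumes plain six-digit hex colors.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color given as '#RRGGBB'."""
    digits = hex_color.lstrip("#")
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve to get linear-light channel values.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio; always >= 1 regardless of argument order."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


for background in ("#FFFFFF", "#000000"):
    print(background, round(contrast_ratio("#FF2A6D", background), 2))
```

This works out to roughly 3.6:1 against white and 5.8:1 against black. WCAG AA asks for 3:1 for large text and non-text UI elements and 4.5:1 for normal-size body text, so the swatch passes comfortably as a language-bar color, while pink body text on white would not.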
◉ Why we're being this public about it
Because the 200-repo bar is, fundamentally, a community bar. It's there precisely so no single person can hammer a language into Linguist by themselves. The way past it is to be useful to enough people that they, unprompted, push .ltl files to their own repos. The best we can do is make the language worth writing in, make the tools fit for purpose, and explain — in a post like this one — exactly what the bar is and how to help clear it.
Star the lang repo. Try the playground. If you write a module in .ltl, push it publicly. Every repo moves the counter.
Track the progress
Live status page with a progress bar and the daily counter: lateralus.dev/linguist. Grammar repo: bad-antics/lateralus-grammar.