How to Create Your Own npm init and Get Off npmjs.com

After struggling with npm init I figured out a way to avoid it entirely that ends up being easier.

Reposted from Learn JS the Hard Way By Zed A. Shaw

One of the largest blind spots for programmers today is their dependence on singular platforms run by giant companies. They have all of their code on github.com, put all of their projects on npmjs.com, and brag about these sites being their "resume" of their accomplishments. They obsess over the stars and likes and downloads per week. Then they're shocked when one day it all goes away, or when Microsoft exploits their kindness to sell their code without attribution.

Don't believe me? Here's a post by someone who was blocked for "star farming" but they weren't the culprit, they were the victim. What happened is they signed up for a 3rd party site named NopeCha, and that site abused the victim's account to add a fake star to NopeCha's projects. Github then banned this person rather than banning NopeCha's accounts.

Imagine if this person had their entire professional career on github? It was their "resume" and one simple mistake, and done. They have nothing. Oddly, programmers claim this never happens despite numerous instances of companies doing this and multiple employees being caught and arrested for exactly this kind of fraud. It's entirely possible for some rogue employee to flag your account and your packages on npmjs.com to get banned. One accidental mistaken identity and all of your hard work is gone. One poorly written moderation bot and you're homeless.

A lot of what I want to do going forward is to be as independent as I possibly can. I don't want to feed any more of my code into Microsoft's Copyright avoidance technology. Thankfully, it's not too difficult to host all of your own code and npm supports almost enough features to make this seamless. You just have to work around npm init, and I'll tell you how.

You Can Self-host 90% of NPM

If you want to host your own git repository it's very easy. I use gitea on a simple VPS and it works very well. It used to have a fixed list of licenses, and it used to make you pick only the licenses gitea approved, but now you can easily add your own. I simply added a CopyrightAllRightsReserved "license" and done. It's viewable by anyone just like any of my blog posts, but nobody owns it...just like my blog posts (more on that in another post). A few things you can do with gitea:

  1. Create an organization that's configured to be "public" and any projects you want to be viewable can be transferred to it. Make your user the administrator and then you have the easiest way to create a public viewable list of projects while still keeping your own private.
  2. You can change the first thing people see to this group with LANDING_PAGE = explore and as long as your public organization is the only one then it'll show all of that organization's projects.
  3. You can set it to disable registration and now you have a nice personal publishing system for your works of art.
  4. If you want to let people sign up to collaborate then gitea offers various authentication methods, CAPTCHA features, and other ways to weed out trash signups.

Once you have your own git service running, and you're able to assert your rights as a copyright holder, then npm will work for most of its operations with a few modifications.

Installing From Your Git

The next problem is getting npm to install your code from your git repository. If you want to let people install your software then npm install supports installing from git URLs but you need a special syntax:

npm install git+HTTPS_URL.git

For example, if you want to install the code I'm discussing later in this post you do this:

npm install git+https://git.learnjsthehardway.com/learn-javascript-the-hard-way/ljsthw-bandolier.git

When someone does that it'll then show up in their npm list in what I feel is a more informative format:

zedshaw@ /Users/zedshaw
├── commander@9.4.1
├── csv@6.2.4
├── http-server@14.1.1
├── ljsthw-bandolier@0.3.1 (git+https://git.learnjsthehardway.com/learn-javascript-the-hard-way/ljsthw-bandolier.git#HEXCODE)
├── prompts@2.4.2
└── readline-sync@1.4.10

This shows you exactly where that module comes from--which incidentally would reduce the problem of typo-squatting on npmjs.com. Updating even works with npm upgrade as it will install a new version of the module from the original git URL, and if the version number changes then it will update like normal.
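If you end up writing install instructions for several repositories, the git+ syntax above is easy to generate. Here is a minimal sketch, assuming a Node.js script; the gitInstallSpec name and the v0.3.1 tag are made up for illustration and are not part of ljsthw-bandolier. The optional #ref suffix is npm's way of pinning a branch, tag, or commit, which is what the #HEXCODE in the npm list output records.

```javascript
// Build the argument for `npm install` from a plain HTTPS clone URL.
// gitInstallSpec is a hypothetical helper name, not part of npm.
function gitInstallSpec(httpsUrl, ref) {
  const url = new URL(httpsUrl); // throws on malformed input
  if (url.protocol !== "https:") {
    throw new Error(`Expected an https:// URL, got ${url.protocol}`);
  }
  // Drop trailing slashes, then make sure the path ends in .git
  url.pathname = url.pathname.replace(/\/+$/, "");
  if (!url.pathname.endsWith(".git")) {
    url.pathname += ".git";
  }
  const spec = `git+${url.href}`;
  // An optional "#ref" pins a branch, tag, or commit hash.
  return ref ? `${spec}#${ref}` : spec;
}
```

Calling gitInstallSpec("https://git.learnjsthehardway.com/learn-javascript-the-hard-way/ljsthw-bandolier") gives back the same string used in the npm install command above. The same string also works as a dependency value in package.json.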

Everything is looking good except...

The npm init Problem

It's great to be able to install modules but what really helps is if people can quickly install example projects using your modules. In my course I want people to run a few simple commands to get demo code, course exercise code, and be able to start projects quickly. The npm init command seemed like the winner, but it turns out to be very problematic in how it's implemented.

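First, the documentation for npm init is really bad. It clearly describes everything without actually showing you how to make an npm init project for other people. Instead of clear instructions you get this mapping from npm to npx:

npm init foo -> npm exec create-foo
npm init @usr/foo -> npm exec @usr/create-foo
npm init @usr -> npm exec @usr/create
npm init @usr@2.0.0 -> npm exec @usr/create@2.0.0
npm init @usr/foo@2.0.0 -> npm exec @usr/create-foo@2.0.0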

Sooooo, npm init just maps to npm exec in 5 different ways? Wait, so isn't npm exec just the npx command? So we're now at 2 levels of indirection? No, it gets worse because you then have to create a create-foo package in the npmjs.com repository. Now 3 levels of indirection but we're not done yet, oh no, architecture astronauts are never done with indirection.

This create-foo package that's required to be on npmjs.com (that's run by npm init (that's just running npm exec (that's really just the long form of npx))) has an additional level of indirection:

initializer in this case is an npm package named create-<initializer>, which will be installed by npm-exec and then have its main bin executed

Its... "main bin"? Is that the main: or the bin: key of the package.json? There are two, so "main bin" means nothing. Turns out you need to create one bin: entry and also set main: to that entry in order for npm init (I mean npm exec (I mean npx)) to run the right command.

The convolution in this is astounding for something that's basically just running some code out of a module. In theory this should be nothing more than a reference to a module npm install can use, and a command to run in that module. None of this triple quadruple obfuscated nested routing to 5 different commands is necessary.

npm init Always Queries the Registry

While trying to figure out how to configure package.json I ran into the final blocker: Every time I tried to use npm init the npm command tried to find the package in npmjs.com, no matter how I installed it. I can't test the command if I always need npmjs.com, and that also defeats the entire purpose of this whole exercise. If there is a way to use npm init then it's far too difficult to document for my users given I can't figure it out.

The documentation for npm init is very vague on when it's supposed to look in the npmjs.com registry:

Note: if a user already has the create-<initializer> package globally installed, that will be what npm init uses.

Nope, this is a lie. I tried packages in all kinds of configurations, globally, locally, in package.json files, everything I could think of and npm init always tried to find the package in the registry. That means you can't simply publish straight out of your personal git repository even though npm install can install directly out of a git repository.

If this documentation is not technically false--and I missed exactly how you make this line of documentation work--then the documentation fails to explain exactly what the condition is for running npm init without talking to npmjs.com. That's what makes this documentation so bad. It's all written as if it's just tiny notes reminding someone who wrote npm init how their own code works, not an explanation for other people who want to use it.

Why Do I Need npm init?

The purpose of a command like npm init is to get people started quickly with a new project. Many projects require a lot of boilerplate setup that can be automated, so these "template builder" commands save everyone time. They also help people avoid mistakes in configuration because you're not accidentally manually copying errors into your project.

To create an alternative to npm init we don't really need much:

  1. A way for someone to install a module. We have this already with npm install using our git repository.
  2. A command they can run that does the setup. We have this already with plain old npx. It'll run any command in the package.json file's bin: section.

That's it, and that's effectively what npm init does, just in an insanely convoluted way. Once I realized this it was easy to create a project that did installs for the course. People use it like this:

npm install git+https://git.learnjsthehardway.com/learn-javascript-the-hard-way/ljsthw-bandolier.git

npx bando-up create my-first-project

The first command installs my little ljsthw-bandolier tool. This tool has the following in the package.json:

"main": "bando.js",
"bin": {
  "bando-up": "bando.js"
},

The bando-up command simply runs the bando.js script which is a command runner implementing different commands I'll provide users. The first command is a create.js command that knows how to check out the code for my "educational web framework", configure it, and help the user get started. It does the following:

  1. Uses git to checkout the project into the user's chosen directory, using --depth 1 to keep it small.
  2. Copies over any "template" files that the user will need to change to configure the project.
  3. Deletes the .git directory so the user can do their own git setup.
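The three steps above can be sketched in a few lines of Node.js. This is not the actual create.js from ljsthw-bandolier, only a minimal illustration of the same idea. It assumes git is on the PATH and that template files live in a templates/ directory inside the checkout. That directory name and every function name here are hypothetical.

```javascript
const { execFileSync } = require("node:child_process");
const fs = require("node:fs");
const path = require("node:path");

// Step 1: shallow clone so only the latest commit is downloaded.
function shallowClone(repoUrl, targetDir) {
  execFileSync("git", ["clone", "--depth", "1", repoUrl, targetDir], {
    stdio: "inherit",
  });
}

// Step 2: copy template files into the project root without overwriting
// anything that already exists. "templates" is an assumed directory name.
function copyTemplates(targetDir, templateDir = "templates") {
  const from = path.join(targetDir, templateDir);
  if (!fs.existsSync(from)) return [];
  const copied = [];
  for (const name of fs.readdirSync(from).sort()) {
    const src = path.join(from, name);
    const dest = path.join(targetDir, name);
    if (fs.statSync(src).isFile() && !fs.existsSync(dest)) {
      fs.copyFileSync(src, dest);
      copied.push(name);
    }
  }
  return copied;
}

// Step 3: remove the git history so the user can run their own git init.
function removeGitDir(targetDir) {
  fs.rmSync(path.join(targetDir, ".git"), { recursive: true, force: true });
}

// Run all three steps; `clone` can be swapped out, which makes testing easy.
function createProject(repoUrl, targetDir, clone = shallowClone) {
  if (fs.existsSync(targetDir)) {
    throw new Error(`${targetDir} already exists, refusing to overwrite it`);
  }
  clone(repoUrl, targetDir);
  const copied = copyTemplates(targetDir);
  removeGitDir(targetDir);
  return copied; // names of the template files the user should edit
}
```

Passing the clone step in as a parameter is only a convenience: it lets you try the copy and cleanup logic against a fake checkout without touching the network.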

I'm using the bando.js command runner since I expect to add more commands to support the course. If you only want to let people install your software then just replace this bando.js with your own installer script.

Possible Attacks

No matter what you do Microsoft gets its pound of flesh out of your hard work. Yes, you can run your own registry. Yes, you can simplify that by using a simple git repository. Yes, you can make your own alternative to npm init to work around their registry demands. Seems all good, right?

When you run npx it tries to find a module that implements the command you want to run. If you run npx bando-up it should search through your installed modules for that command, and then run the correct one. What happens when two modules list the same command? How does npx figure out which one you mean?

We can use npx bando-up --using ljsthw-bandolier attack-test to actually test this. I run that command, then I modify the package.json and give it a name that starts with "a" so it sorts before ljsthw-bandolier:

npx bando-up --using ljsthw-bandolier attack-test
cd attack-test
vim package.json # change the name to attack-test
cd ..
npm install ./attack-test
npx bando-up --help

So which one is run? I have two projects with bando-up as commands, and I have no idea which one npx just ran. Let's modify the attack-test package so we can tell:

cd attack-test
vim package.json # change the version to 9.9.9
npm remove attack-test
cd ..
npm install ./attack-test
npx bando-up version
0.4.2
npm remove ljsthw-bandolier
npx bando-up version
9.9.9

As you can see, even though I have two packages implementing bando-up, npx seems to choose "randomly"? It's not alphabetical, otherwise bando-up version would print 9.9.9. After I remove the ljsthw-bandolier project I get the expected outcome.
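Since npx won't tell you about the clash, you can at least detect it yourself. The following is a minimal sketch, assuming a standard node_modules layout where each installed package has its own package.json. The function names are made up for this example. It reads every top-level package's bin: section and reports any command name that more than one package claims.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Commands a package declares. A string "bin" means a single command named
// after the package itself, without any @scope/ prefix.
function binNames(pkg) {
  if (!pkg.bin) return [];
  if (typeof pkg.bin === "string") {
    return [String(pkg.name || "").split("/").pop()];
  }
  return Object.keys(pkg.bin);
}

// Top-level package directories, descending one level into @scope folders.
function packageDirs(nodeModules) {
  const dirs = [];
  for (const entry of fs.readdirSync(nodeModules)) {
    if (entry.startsWith(".")) continue; // skips .bin and other dot entries
    const full = path.join(nodeModules, entry);
    if (entry.startsWith("@")) {
      for (const scoped of fs.readdirSync(full)) {
        dirs.push(path.join(full, scoped));
      }
    } else {
      dirs.push(full);
    }
  }
  return dirs;
}

// Map each contested command to the sorted list of packages declaring it.
function findBinCollisions(nodeModules) {
  const owners = new Map();
  for (const dir of packageDirs(nodeModules)) {
    const file = path.join(dir, "package.json");
    if (!fs.existsSync(file)) continue;
    const pkg = JSON.parse(fs.readFileSync(file, "utf8"));
    for (const cmd of binNames(pkg)) {
      if (!owners.has(cmd)) owners.set(cmd, []);
      owners.get(cmd).push(pkg.name);
    }
  }
  const contested = [...owners].filter(([, names]) => names.length > 1);
  return Object.fromEntries(contested.map(([cmd, names]) => [cmd, names.sort()]));
}
```

Running findBinCollisions(path.join(process.cwd(), "node_modules")) in the directory from the experiment above should list bando-up as claimed by both attack-test and ljsthw-bandolier. It only tells you that a clash exists, not which package npx will pick.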

There isn't any documentation I could find on how npx decides which command to run. It's entirely possible for someone to exploit this with the following attack:

  1. See that I have a project with a command bando-up.
  2. Find a project that I'm also telling people to install.
  3. Typo-squat that project and wait for students to fatfinger the spelling.
  4. Now their attack project installs, and potentially npx runs their version of bando-up instead of mine.

That's a fairly small attack surface, but still totally possible. The only mitigation is to tell people to always name the ljsthw-bandolier package in the npx command:

npx --package=ljsthw-bandolier bando-up version
0.4.2
npx --package=attack-test bando-up version
0.4.2
npm remove ljsthw-bandolier
npx --package attack-test bando-up version
9.9.9

What the hell? The --package option doesn't even work?! I tried --package=attack-test, -p attack-test, and nearly everything else mentioned in the npx documentation, and not a single option worked.

I guess we're just screwed, and this is the best we can do with what we're given.

Suggestions for Improvement

If I could give out a wish list for the trillionaire Microsoft that's failing to run npm correctly then it would be these:

  1. Leave npm init to die in a pool of its own vomit.
  2. Turn npm create into a new command that is universal and has zero dependence on any registry.
  3. This new npm create would accept only direct URLs for installer projects to install. These URLs are anything compatible with npm install.
  4. npm create will use a totally separate create.json file with exactly what is necessary to make this feature work, rather than infecting package.json with convoluted options.
  5. This create.json would specify the .js file to use as the create script, plus additional options for things like removing the .git directory, template patterns, and post-install operations.
  6. The npm create wouldn't install the project, only use it to make everything work, that way it won't infect the system with additional commands.
  7. The npx command should prevent duplicate commands: if two projects provide the same command, npx should refuse to run it until you pick one with the -p option.

No, I won't submit a pull request. I don't give trillionaires free labor.

