22 March 2021

Automating local environment variables

Can we improve the experience of using environment variables locally?

By Jon Short

In this article I'll give some general info around environment variables, then I go into my experiences making "cenv" (A CLI to change env vars for me).

Feel free to skip the first few sections if you're already familiar with environment variables! 👩‍🎓👨‍🎓

Environment variables

Environment variables are great. Code can do different things or perform different actions simply by taking into account the environment it's running within.

Generally there are two different types of environment variables:

global
process

Global

Global env vars belong to the operating system, and are often used when a variable needs to be persisted across processes (e.g. a path to a shared file).

Example - on linux-based systems printenv shows the current env vars

$ printenv
  ZSH=/Users/abc/.oh-my-zsh
  GOPATH=/Users/abc/go
  NVM_DIR=/Users/abc/.nvm

$ echo $GOPATH
  /Users/abc/go

Process

Process env vars only associate themselves with one command, usually used when the same command could be run multiple times with different options.

Example - here we set a process-specific env var of GREETING, which only affects the process it is run alongside

$ GREETING=hello bash -c 'echo "$GREETING"'
  hello

$ GREETING=goodbye bash -c 'echo "$GREETING"'
  goodbye

Now if we check the global env vars, we won't see an entry for GREETING, because it was set per-process

Example - here passing an argument to printenv shows the value for that specific env var

$ printenv GREETING

Handling lots of environment variables

As projects get larger sometimes the amount of environment variables that need to be passed to each process can become a bit unwieldy.

Javascript-based projects often use a .env file to solve this, with all vars being added to the file, and everything within it being made available to the running process as environment variables (similar to process-based vars).

Example - an example .env file that might be used for local development

API_URL=http://localhost:6000
AUTH_PROVIDER=http://localhost:4000
HTTP_TIMEOUT=60000
NODE_ENV=development

This is developer-friendly as all our variables are stored in a single place, and we can easily see what settings will be used when we run the process.

Switching out environment variables

Looking at the example in the previous section, let's imagine that we're asked to test a production build, and run against live URLs rather than localhost.

With a .env file we just need to update the URLs within the file to the correct live URLs:

- API_URL=http://localhost:6000
+ API_URL=https://api.jonshort.me
- AUTH_PROVIDER=http://localhost:4000
+ AUTH_PROVIDER=http://auth.jonshort.me

Now our .env file looks like this:

API_URL=https://api.jonshort.me
AUTH_PROVIDER=http://auth.jonshort.me
HTTP_TIMEOUT=60000
NODE_ENV=development

😀 Nice! We've switched out the URLs and should be ready to go.

...

😳 Oops, we almost forgot to update the NODE_ENV to simulate a production build:

- NODE_ENV=development
+ NODE_ENV=production

Let's take another look at our .env file:

API_URL=https://api.jonshort.me
AUTH_PROVIDER=http://auth.jonshort.me
HTTP_TIMEOUT=60000
NODE_ENV=production

🥳 Great! Let's run our application

$ node ./my-application.js
  - - - - - - - - - - - - - -
  Starting production mode...
  - - - - - - - - - - - - - -
  ERROR - unable to connect to auth API over HTTP - please use HTTPS

...

🤔 Hmm, looks like we made a mistake in the .env file, let's update that AUTH_PROVIDER value:

- AUTH_PROVIDER=http://auth.jonshort.me
+ AUTH_PROVIDER=https://auth.jonshort.me

😅 Phew! Ok here's the final .env file:

API_URL=https://api.jonshort.me
AUTH_PROVIDER=https://auth.jonshort.me
HTTP_TIMEOUT=60000
NODE_ENV=production

Here's where our problem appears

When we were asked to "test a production build, and run against live URLs rather than localhost" we knew where to update the values, but we didn't know what to update the values to.

This is one of the problems with env vars in general - unless they're documented by developers, the values need to be manually updated, and even then misspellings or typos can cause problems.

Imagine you were working on a project for the first time - there's no way you'd know what values to use! 😱

How I approached this problem when I faced it

In my day job at Experian I was having to update the local env vars for my primary project multiple times a day; often going into the file just to check what the current configuration was.

There were two use-cases:

Target a locally-mocked backend
Target a hosted, integration-testing backend

Switching these involved changing a number of env vars, and it was easy to accidentally miss one, even when doing it so often.

I had the idea to create a command-line interface (CLI) which would update the individual env vars to the correct value and allow me to choose between mock and int.

My idea for usage was something like this:

Example - Switching to mocked backend

$ update-env mock
  env is now mock

Example - Switching to integration-test backend

$ update-env int
  env is now int

Writing the CLI

When I was thinking about the env var problem, my thoughts were "we have these few vars that need to be updated to switch between mock and int".

Following this, I wrote a CLI which included the values for both, and when executed would go through the .env file and update the vars which required it.

So within the CLI code, I created a list of the vars, and their value for both mock and int, something like:

VAR_A

- mock = http://localhost:6000
- int = https://api.jonshort.me
  VAR_B
- mock = http://localhost:4000
- int = https://auth.jonshort.me

Then when the CLI was executed, the code would read each var in the .env file and if it matched one in the list above, update it to the value based on the current argument.

So usage-wise it fit the original idea:

$ update-env int
  env is now int

Inflexibility

The CLI worked well, and I no longer had to worry about updating env vars, since i could just run update-env and I'd know that the vars would be valid.

Some annoying aspects did start to appear over time though:

If an env var needed to change, it required a change to the CLI code
I could only use it on one project
Other developers would only be able to use the same mock and int options with it

For a while I just accepted these issues since the problem "change these env vars" was solved.

But as I worked on more projects outside of the one which I'd made the CLI for, it became increasingly annoying to remember all the different combinations of env vars.

I decided there must be a way to have the same functionality but written in a way that any project could use.

Re-framing the problem

Now that we need a more "generic" way to handle this functionality, let's think about where the information the CLI needs to function should live.

All possible vars need to exist somewhere in the target project, not the CLI
The "keyword" should be whatever the developer chooses
Developers that don't use the CLI shouldn't be forced to use it or be affected by the changes

With these restrictions in mind, we could allow projects to include a separate configuration file with env vars for the keywords they need, e.g:

{
  "mock": {
    "VAR_A": "http://localhost:6000",
    "VAR_B": "http://localhost:4000"
  },
  "int": {
    "VAR_A": "https://api.jonshort.me",
    "VAR_B": "https://auth.jonshort.me"
  }
}

This would be good for our CLI (since we have a reliable / predictable structure of values) but for users this would be annoying - they're being asked to introduce a separate file, just to help the CLI work out which env vars to use.

For developers who don't know about the CLI, they'd have to deal with this random file without knowing what it was for, and the file would inevitably become outdated.

Comments to the rescue!

Let's see if we can somehow store the data needed for the CLI alongside the actual vars. Out of the box .env files allow comments, which are an easy way for documentation or notes to be added to a .env file.

Example - Comments giving a bit more information about what the vars do

# Main API URL
VAR_A=http://localhost:6000

# Auth API URL
VAR_B=http://localhost:4000

Could we leverage that comments don't affect the actual usage of the .env file, and let developers include their vars for each keyword within it?

We could allow developers to include "blocks" of vars, for each keyword, e.g:

# mock
# VAR_A=http://localhost:6000
# VAR_B=http://localhost:4000

# int
# VAR_A=https://api.jonshort.me
# VAR_B=https://auth.jonshort.me

Then when we execute the CLI, it could uncomment the relevant vars, e.g:

$ update-env int

# mock
# VAR_A=http://localhost:6000
# VAR_B=http://localhost:4000

# int
VAR_A=https://api.jonshort.me
VAR_B=https://auth.jonshort.me

Patterns to the rescue!

We're almost there with the CLI, we just need a clear way for developers to signal which env vars to update.

Previously we added comment "blocks" with the vars for each keyword in. This would probably work, but the CLI would struggle to differentiate between a regular comment and a keyword, e.g:

# mock
# VAR_A=http://localhost:6000
# VAR_B=http://localhost:4000

# my name
NAME=jon

# int
# VAR_A=https://api.jonshort.me
# VAR_B=https://auth.jonshort.me

Here we'd have to think up some way to parse "mock" as a keyword, but "my name" as a regular comment.

...

Here's where patterns come in handy.

We can add an uncommon pattern to the start of keyword comments, allowing the CLI to easily interpret them as keywords!

Let's try it out:

# ++ mock ++
# VAR_A=http://localhost:6000
# VAR_B=http://localhost:4000

# my name
NAME=jon

# ++ int ++
# VAR_A=https://api.jonshort.me
# VAR_B=https://auth.jonshort.me

Now when we check each line, we know that any comment that starts with "++" is probably a keyword.

To parse this using regex it'd be something like:

/^#+ *\\+\\+ *(\\w+)/g

With this regex we can use the capture group to give us the keyword without much hassle 👍

Testing it out

$ update-env mock

# ++ mock ++
VAR_A=http://localhost:6000
VAR_B=http://localhost:4000

# my name
NAME=jon

# ++ int ++
# VAR_A=https://api.jonshort.me
# VAR_B=https://auth.jonshort.me

$ update-env int

# ++ mock ++
# VAR_A=http://localhost:6000
# VAR_B=http://localhost:4000

# my name
NAME=jon

# ++ int ++
VAR_A=https://api.jonshort.me
VAR_B=https://auth.jonshort.me

Nice! This works, and we're not seeing any issues with the "my name" area 🎉

Looking back

So now we have a solution that:

works on any project
can accept any keyword
entirely developer-configured, no need for CLI changes
can be totally ignored by developers that don't use it

We were only able to achieve this because we re-framed our problem from "how can I update these vars in my project" to "how can we enable anyone to update env vars on any project".

There's often a balance with this type of re-framing, as it can easily fall into "scope creep" whereby we spend so long thinking about related issues that the solution becomes bloated or unmanageable.

Generally it's worth manipulating your problem when starting a task, as it's less harmful at that stage if the solution needs to be approached in a different way.

Keep in mind the balance between generic / tailored solutions, as generic solutions aren't always the best way to go about something if the tailored solution would offer better performance, maintainability, or flexibility.

Quick plug - cenv

I tried not to go into the actual code behind the CLI too much in this article, mostly because it's focused on the problem / solution relationship.

The code around the CLI as referred in the updated solution is available on github as "cenv" (short for "change env") - https://github.com/JonShort/cenv

I use it on most of my projects now, as a quick way to switch between envs without having to remember a load of values 😅

Please check it out, it's written in rust, but feel free to copy the idea into another language (it'd be nice if it was a bit easier for users).

Thanks for reading,

Jon 😃