Best Practices for Bootstrapping a Node.js Application Configuration
Follow these best practices to bootstrap a Node.js application configuration in a robust and maintainable way using env-schema.
Bootstrapping a Node.js application often requires loading configuration, whether from environment variables, or static configuration files. In a previous article, I shared my opinions on environment variables and configuration anti patterns in Node.js applications and in this article I’d like to share my approach to avoid those anti patterns.
Traits of robust Node.js configuration
What makes a good configuration architecture for a Node.js application? Regardless if your choice is working with process.env
environment variables or other mechanisms, here are some traits I personally look for when evaluating an approach for configuration management:
- Async — I want to be able to load configuration asynchronously. This is especially important when loading configuration from external sources, such as a database or a remote API. However most configuration patterns I see in Node.js applications are synchronous, such as:
const config = require('./config.json')
- Schema defined — I want to be able to define the types of my configuration values, and have them validated at application bootstrap, before the application is entered into a ready state. Without a schema, a Node.js app might bootstrap successfully, but fail at runtime due to a missing or invalid configuration value.
- Configuration manager — I want to be able to access my configuration values from anywhere in my application, without having to expose implementation details such as the configuration file path or environment variable names. The following is an example of Node.js configuration anti-pattern:
const db = require('db')
db.connect({
host: process.env.DB_HOST,
port: process.env.DB_PORT,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: process.env.DB_NAME
})
- Configuration factory — I want to be able to create multiple configuration instances, for example for different environments, or for different parts of my application. This is especially important when writing tests, where I might want to create a configuration instance with different values than the production configuration. Often the anti pattern followed by Node.js developers is to use a CJS configuration module and passing it around, which forces a singleton pattern.
Practical approach to Node.js configuration
In this second part of the article I’ll demonstrate the proposed configuration manager approach in a step-by-step process which outlines the libraries in-use, and outlines which anti-patterns to avoid.
Step 1: Use configuration data sourced from .env
files
The first step is to load configuration data from a .env
file. This is a common pattern in Node.js applications, so any of you whom are fans of the dotenv library will be familiar with this approach. However, note, I won’t be using dotenv
directly.
The .env
file is a simple text file, which contains key-value pairs, such as:
PORT=3000
LOG_LEVEL=debug
DB_HOST=localhost
DB_PORT=5432
Important to note:
- ✅ the
.env
file is not committed to source control - ✅ the
.env
file is added to.gitignore
(along with its permutations such as.env.local
,.env.development
,.env.test
, etc) - ✅ In production, the configuration values are sourced from environment variables, not from the
.env
file. - ✅ I do not have any other environment designation such as
.env.development
,.env.test
, etc. I only have a single.env
file.
Step 2: Use configuration schema with env-schema
Configuration details are defined, validated, and enforced using a schema.
I’m using the npm package env-schema to the define the configuration schema. For example:
import EnvSchema from "env-schema";
const schema = {
type: "object",
required: [
"PORT",
"LOG_LEVEL",
"DB_HOST",
"DB_PORT",
],
properties: {
PORT: {
type: "number",
default: 3000,
},
LOG_LEVEL: {
type: "string",
default: "info",
},
DB_HOST: {
type: "string",
default: "localhost",
},
DB_PORT: {
type: "number",
default: 5432,
},
},
};
This allows to define configuration details and enforce:
- A specific strongly typed value, such as
number
orstring
for a configuration item. - The configuration item is required, and must be present in the configuration data. Otherwise, the application crashes at bootstrap when configuration is initialized. This is a good practice, often known as fail fast.
I previously written an article titled Configuration Decoded: Lesser-Known Tips for Working with env-schema in Node.js which goes into more details about the env-schema
library and how to use custom validators in env-schema.
Step 3: Use a configuration manager instead of a singleton
Node.js module system, commonly known as CommonJS (CJS), follows a singleton pattern. This means that when a module is imported, it is cached and subsequent imports of the same module return the same instance. This could lead to challenges introduced during tests, where you might want to create a new instance of the configuration for each test, with different values.
Avoid the following anti-pattern:
// config.js
const config = require('./config.json')
module.exports = config
Instead, create an abstraction wrapper around the configuration data, ideally a class, which can be instantiated multiple times and hold its state internally instead of holding it via global variables.
The following is a simple class blueprint for a configuration manager that is responsible for loading the configuration data, validating it against the schema, and returning the configuration data.
class Config {
constructor() {
this.config = null;
}
async load({ envSchemaOptions = {}, overrideConfigData = null } = {}) {
// load configuration..
// ..
return this.config;
}
}
export { Config };
Step 4: Load configuration asynchronously
It is common for Node.js applications to load configuration synchronously, because developers rely on the Node.js module system (config = require('./config')
). This is also quite popular in codebases that use dotenv
which recommends a similar pattern: require('dotenv').config()
.
There’s nothing inherently wrong with configuration data that is loaded synchronously, but as you’ve probably experienced with JavaScript, a single function’s update to asynchronous pattern (Hi async/await!) can lead to a cascading effect of code refactors across the rest of the codebase.
Therefore, I recommend to always load configuration asynchronously, even if it is a simple .env
file. If in the future you’d need to load secrets via remote AWS KMS (Key Management System) or similar external resource use-cases, you wouldn’t have to pay the cost of code refactor.
Whether your choice is class-based or function-based, make sure you load configuration asynchronously. For example:
class Config {
// [...]
async load() {
// load configuration..
// ..
return this.config;
}
}
Step 4: Don’t leak configuration properties
If you use dotenv
with .env
files, you might be familiar with the following pattern:
// Load the configuration:
// (which really means it populates process.env with the configuration)
require('dotenv').config()
// Then later:
const dbHost = process.env.DB_HOST
No matter what, never access configuration data via process.env
directly. This is a common anti-pattern in Node.js applications, and it is dangerous because it exposes the implementation details of the configuration data, such as the environment variable names. It also exposes the entire process.env
object which might contain sensitive data that you don’t want to leak.
In this example, the configuration is loaded via dotenv
and the configuration values are accessed via process.env
. This is dangerous because the data in process.env
holds more than just the configuration data, it also holds the environment variables of the current process and potentially other sensitive data that might end up leaking.
Instead, make sure that you are always forcing that the configuration data you have in the application’s state is only scoped to the configuration data that you defined in the schema, and not the entire process.env
object.
If you work with the env-schema
library, you get this by default when working with a defined schema. If you work with dotenv
, make sure you return the config object from dotenv.config()
such as:
import dotenv from 'dotenv'
const config = dotenv.config()
// your configuration data is now available via
// `config.parsed` property
Note, that config.parsed
in the above configuration loading example doesn’t enforce or validated the configuration data against a schema, it only scopes the configuration data to the .env
file.
Summary
In concluding thoughts, if you are using a different library than env-schema
and your own configuration wrappers by upholding the above configuration best practices then you are doing it right. My personal choice is env-schema
with Fastify
.
As an addendum, I also use Awilix for dependency injection and here’s how I would use it to inject the configuration manager into my application:
// Load configuration
const configManager = new Config();
const config = await configManager.load();
// Other initializations [...]
// Initialize DI
const diContainer = await initDI({ config, database });