This weekend, I was checking Google Webmaster Tools for my blog when I found out that the site property was missing. The problem was that Google had failed to verify my site because it could not find a Google Analytics tracking code (I chose the GA tracking method to verify the site in Webmaster Tools).
I was pretty sure the tracking code had been installed. It must be a bug on Google's side, I thought.
I clicked the button to check my site again. To my surprise, it showed an error saying that my site had failed to load. I opened a new tab and entered my site URL, and all I got was the spinner until, a few minutes later, Google Chrome displayed an error saying the connection had timed out, or something like that.
That's weird. I use `now` to deploy my blog, and I thought it would restart the server if there was an error. I searched the Zeit community Slack channel and confirmed that, indeed, `now` will restart my app if it crashes.
But it didn't.
I started investigating this weird behavior. The only clue I had was the latest error report from Sentry.
Error: connection timeout
node_modules/f13b11854ffeb4f8d2920fd5cedc61d358741cfd/lib/drivers/node-mongodb-native/connection.js in Db.<anonymous> at line 168:17
I think I know what crashed my server. Sometimes my (free) MongoDB server becomes unresponsive, which causes a timeout whenever someone requests a page that needs a DB connection.
The problem is: why didn't `now` restart my app after the crash?
Local Reproduction
To make sure it wasn't `now`'s fault, I tried to reproduce this behavior locally.
I assumed `now` uses something similar to `nodemon` or `pm2` to manage services and restart them in the event of an error. I installed both binaries on my local machine and started the server. Once the server was up, I killed my local `mongod` instance and then visited the page. Both `nodemon` and `pm2` printed the error when the page was visited after `mongod` was killed.
Interestingly, neither of them tried to restart the server after the error. Even after I brought `mongod` back up and visited the page again, the server was still broken.
Here's the server code that connects to the DB and creates an HTTP server.
    import mongoose from 'mongoose';
    import express from 'express';

    const app = express();

    mongoose.connect(MONGODB_URL, opts, (err) => {
      if (err) {
        console.error(err);
        process.exit(1);
      }
      app.listen(HTTP_PORT);
    });
As you can see, I had already handled the error case by exiting the process when there is an error. So if there is an error, the app should quit and `now` should restart it.
It turns out I had only added error handling for the initial connection to the MongoDB server. Any connection failure that happens after the initial connection will crash the server with no way to recover unless it's restarted manually.
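To make that gap concrete, here is a toy model of it. This is only a sketch: `FakeDriver` and `dropConnection` are hypothetical names built on Node's `EventEmitter`, not Mongoose's real internals. It shows that a connect callback reports the first attempt only, so a failure that comes later never reaches it.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a database driver; NOT Mongoose's implementation.
class FakeDriver extends EventEmitter {
  // Mimics a connect(callback) API: the callback reports only the
  // outcome of the first connection attempt.
  connect(callback) {
    callback(null); // pretend the initial connection succeeded
    return this;
  }

  // Simulates the database going away after the app has started.
  dropConnection() {
    this.emit('disconnected');
  }
}

const calls = [];
const driver = new FakeDriver();

driver.connect((err) => {
  // This is the only place the original server code looked for errors.
  calls.push(err ? 'connect failed' : 'connect ok');
});

// Nobody listens for 'disconnected', so nothing reacts to the later failure.
driver.dropConnection();

console.log(calls); // the callback ran for the initial attempt only
```

In this model the `process.exit(1)` branch from the server code above would never be reached after startup, which matches what I saw locally.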
Working on a Fix
The fix is quite straightforward; I found it by searching for Express and Mongoose integration. I didn't know that `mongoose.connect` returns an object with a `connection` property that behaves like an `EventEmitter`. This connection emits various events that you can listen for, such as `error`, `disconnected`, and `open`.
The initial error handling moves into the callback for the `error` event. Starting the Express listener moves into the callback for the `open` event. The only handler missing from the previous code is the one for the `disconnected` event. This is where we handle connection failures that happen mid-request: we simply reconnect to the MongoDB server.
Here's the code after the fix:
    function connectToDatabase() {
      // return connection property from mongoose.connect call
      return mongoose.connect(MONGO_URL, options).connection;
    }

    connectToDatabase()
      .on('error', console.error.bind(console))
      // reconnect
      .on('disconnected', connectToDatabase)
      // we use once because we don't want to add new listener after disconnected
      .once('open', () => app.listen(PORT));
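The wiring can be tried without a database by swapping the Mongoose connection for a plain `EventEmitter`. This is a sketch under assumptions: `connection`, `connectCalls`, and `listenCalls` are hypothetical stand-ins, and the events are emitted by hand instead of by a real driver. Note also that newer Mongoose versions return a promise from `mongoose.connect` and expose the connection as `mongoose.connection`, so the `.connection` access above depends on the version in use.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the Mongoose connection, so this runs offline.
const connection = new EventEmitter();
let connectCalls = 0; // how many times we tried to (re)connect
let listenCalls = 0;  // how many times the HTTP server would have started
const errors = [];

function connectToDatabase() {
  connectCalls += 1;
  // Every call hands back the same emitter, so listeners are attached once.
  return connection;
}

connectToDatabase()
  .on('error', (err) => errors.push(err.message))
  // reconnect whenever the connection drops
  .on('disconnected', connectToDatabase)
  // stands in for app.listen(PORT); `once` keeps it from running on reconnect
  .once('open', () => { listenCalls += 1; });

connection.emit('open');                                   // first connection
connection.emit('error', new Error('connection timeout')); // logged, not fatal
connection.emit('disconnected');                           // triggers a reconnect
connection.emit('open');                                   // reconnected

console.log({ connectCalls, listenCalls, errors });
```

The `once` on `open` matters because a second `app.listen` on the same port would fail with an address-in-use error, so the HTTP server should only be started on the first successful connection.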
Honestly, I still don't know why these crashes are unrecoverable. But at least for now, my blog is up and running again. I also added health check monitoring just in case.
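For anyone wondering what a health check could look like, here is one possible shape, not necessarily the one I set up. It assumes the app exposes an endpoint that reports the database state based on Mongoose's `connection.readyState` (0 disconnected, 1 connected, 2 connecting, 3 disconnecting). `healthReport`, `createHealthServer`, and `getReadyState` are hypothetical names.

```javascript
const http = require('http');

// Names for the values of Mongoose's `connection.readyState`.
const STATES = ['disconnected', 'connected', 'connecting', 'disconnecting'];

// Hypothetical helper: maps a readyState number to an HTTP status and body.
function healthReport(readyState) {
  const state = STATES[readyState] || 'unknown';
  return {
    statusCode: readyState === 1 ? 200 : 503, // only "connected" is healthy
    body: JSON.stringify({ database: state }),
  };
}

// Hypothetical wiring: in the real app, `getReadyState` would return
// mongoose.connection.readyState. The server is created but not started here.
function createHealthServer(getReadyState) {
  return http.createServer((req, res) => {
    const { statusCode, body } = healthReport(getReadyState());
    res.writeHead(statusCode, { 'Content-Type': 'application/json' });
    res.end(body);
  });
}

console.log(healthReport(1));
console.log(healthReport(0));
```

An external monitor can then poll that endpoint and alert on any non-200 response, which would have caught this outage much earlier than Webmaster Tools did.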