A 404 maybe - A Heisenbug Story

Vagmi Mudumbai

We were deploying our webapp on one of the client's servers today. Our webapp is fronted by Nginx, which then proxies to a bunch of thin servers. Initially everything just worked. We cloned the repo from GitHub, copied the Nginx config to sites-enabled, ran the rake task to compile our Sass files to CSS and started the thin servers. The app ran without a hitch. All this time we had been logged in to the system, watching htop, tailing logs and such. Once we were convinced that everything was working fine, we shot an email to the client saying they could start using the server, and logged out of the ssh session.

The moment we logged out, Nginx stopped serving the stylesheets and other static assets. The client called up and said that the UI looked screwy. We visited the site and could not believe that all the static resources were throwing up 404s. We logged in to the machine again, wondering if any of us had accidentally deleted the files, but they were all there. We checked the site again and found everything was working. We looked at the logs and found that while we were logged in, the requests were served directly by Nginx. Once we logged out of the ssh session, Nginx could no longer serve them and, as a fallback, passed them on to the thin cluster, which failed as well.

This was really odd. They were just static assets: images, JS and CSS. And the worst part was that any attempt to study the bug rectified it. We knew we were dealing with a Heisenbug. We figured that there had to be something that started when we logged in and was killed when we logged out. We looked at .bashrc, .profile and everywhere else. We could not find it. Out of sheer guesswork, we looked at /etc/mtab. There it was: the home directory of the user was encrypted using ecryptfs. An encrypted home directory is mounted when the user logs in and unmounted when they log out, so the files were only visible while one of us had a session open. We maintain different apps on the servers under different user accounts. During the Ubuntu server installation, the person who installed it on the client's end had chosen to encrypt the home folder by mistake. Thin was serving fine as it had already loaded the code into memory, while Nginx read files on demand and was unable to serve any static assets.
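The /etc/mtab check we did by eye can be sketched in a few lines of Python. This is only a minimal sketch, not something we ran on the server: the function names, the `appuser` account and the paths in the sample mount table are all made up for illustration. It parses mtab-style text and reports the filesystem type of the mount that contains a given path.

```python
import posixpath

# Hypothetical mount table, in the same format as /etc/mtab.
# The user name and paths are invented for this example.
SAMPLE_MTAB = """\
/dev/sda1 / ext4 rw,errors=remount-ro 0 0
proc /proc proc rw,noexec,nosuid,nodev 0 0
/home/appuser/.Private /home/appuser ecryptfs rw,ecryptfs_sig=0123456789abcdef 0 0
"""


def parse_mounts(mtab_text):
    """Parse mtab-style text into a list of (mount_point, fs_type) pairs."""
    mounts = []
    for line in mtab_text.splitlines():
        fields = line.split()
        # Each entry is: device mount_point fs_type options dump pass
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        # Spaces inside a mount point are escaped as the octal sequence \040
        mount_point = fields[1].replace("\\040", " ")
        mounts.append((mount_point, fields[2]))
    return mounts


def fs_type_for(path, mounts):
    """Return the filesystem type of the deepest mount point containing path."""
    path = posixpath.normpath(path)
    best_point, best_type = "", None
    for mount_point, fs_type in mounts:
        point = posixpath.normpath(mount_point)
        inside = path == point or path.startswith(point.rstrip("/") + "/")
        # ">=" lets a later entry for the same mount point win over an earlier one
        if inside and len(point) >= len(best_point):
            best_point, best_type = point, fs_type
    return best_type


mounts = parse_mounts(SAMPLE_MTAB)
print(fs_type_for("/home/appuser/app/public/stylesheets/main.css", mounts))
print(fs_type_for("/var/www/app/public/stylesheets/main.css", mounts))
```

With the sample table, the first path comes back as ecryptfs and the second as ext4. On a real machine you would pass in the contents of /etc/mtab instead of the sample text, and you would run it while logged in, since the ecryptfs entry is only present while the home directory is mounted.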

All we had to do was move the code out of the home folder and it started serving fine. One Heisenbug successfully squashed. Or as Cecelia likes to put it, we squished it with a 10 ton hammer.

Comic shamelessly stolen from PhD Comics.

Posted on 2010-12-24T14:07:30Z by Vagmi Mudumbai