Kill Your Email Server

Are you running an email server? Postfix? Exim? qmail? Kill it.

Cyrus IMAP? UW IMAP? POP3d? Kill, kill, kill. RoundCube, SquirrelMail, Horde? Kill.

Ask yourself one simple question:

“Do I want to be an expert in email hosting, or do I want to get back to coding my app?”

Unless your business is email hosting — it’s time to kiss your email server goodbye.

Make it Someone Else’s Problem™

Use Google Apps. For $5/user/month, you get the great Gmail interface, simple management tools, and no spam, EVER. You cannot beat Google at email hosting. Why?

  1. Email sucks.
  2. Hosting email eats your time — blacklists, spam control, and security patches are just the start of your woes.
  3. As an entrepreneur, you have no time to spend screwing around with email.

The best devops automation in the world isn’t going to make email suck any less.

When crap breaks, I want to focus on my applications. I don’t want to troubleshoot mail. I happily pay $5/month to NEVER have to think about it.

Do I take email for granted? Absolutely. I want to open my browser and get my mail and never have to think about how it gets there.

There are only two questions to answer:

  1. How does mail get to me? and
  2. How do I send mail?

Inbound Mail

Google Apps. Google Apps. Google Apps.

“But I don’t trust Google!” says someone.

Last I checked, Google buys companies they’re interested in, like Blogger. I doubt their competitive advantage is from reading private email.

“BUT what about encryption and privacy!?” You do know that the courts will simply order you to decrypt your email, right? If your business requires secrecy from the US government, then go ahead and close this tab; this isn’t the article for you.

Stripe uses Google Apps. They make lots of money. Google Apps even lets them do some amazing things with internal communications.

Could Stripe hire a top-notch sysadmin to do their email? Yep. Would the new hire, plus servers cost more than $50/user/year? Yep. Is “great email hosting” a core of Stripe’s business model? Hell no.

Outbound Mail

Use SendGrid, Mandrill, or anything else. You can start with Google Apps, but they cap outbound messages at around 500/day, which is easy to hit with a growing SaaS.

Kill your mailer daemon and use nullmailer. Postfix, Exim, etc., are fine as well, but complete overkill. You don’t want to run a public email server, remember?
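
If your app sends its own mail, handing it to a hosted relay can be as small as the sketch below. It uses only Python's standard library. The relay host, port, and credentials are placeholders I made up for illustration; your provider's documentation has the real values.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_relay(msg, host, port, username, password):
    """Hand the message to a hosted SMTP relay over STARTTLS.

    host, port, username and password are placeholders supplied by
    whichever relay service you signed up for.
    """
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.starttls()
        smtp.login(username, password)
        smtp.send_message(msg)
```

Something like send_via_relay(build_message(...), "smtp.example.com", 587, user, password) is the whole integration. No queue, no daemon, no public mail server on your side.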

TL;DR: Life is too short to run an email server.

You have a product to build, customers to win over, family and friends to enjoy time with, and your own health and well being to look after. Pay someone else to worry about email. Focus on what’s important.

To mistreat an aphorism: “No person, on his deathbed, says, ‘I wish I spent more time troubleshooting email.’”

Tagged as: email priorities basics

10 Minutes to Increase Your Customer Satisfaction during Downtime

If you haven’t yet, you should sign up for a Pingdom account. I don’t care how good your Nagios setup is, just remember: “who watches the watchers?” I can’t count the number of times that Pingdom has noticed and alerted on our downtime first, BEFORE the monitoring system.

Yep, it’s $10 a month, but it’s dead-simple to set up and provides an automated and much-needed “down for everyone or just me?” sanity check on your servers. That’s not an affiliate link above, by the way. I don’t care whether or not you use Pingdom, but I’ve found it to be a very helpful tool.
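
To make the “down for everyone or just me?” idea concrete, here is a rough Python sketch of an outside-in check. This is not how Pingdom works internally, just the general shape of the thing. It only means anything when it runs from a machine outside your own infrastructure, and the URL you pass in is up to you.

```python
import urllib.request


def site_is_up(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the URL answers with HTTP 200 within the timeout.

    opener is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

Run it from cron on a box at a different provider and you have a very poor man's second opinion. A hosted service is still less work.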

Customer Satisfaction?

OK, so I promised increased customer satisfaction. What gives? Well, Pingdom has another trick up their sleeves: hosted status pages, which easily justify the price of the service:

http://royal.pingdom.com/2011/04/01/public-status-pages-under-your-own-custom-domain/

With the addition of a DNS entry, suddenly you have status.myapp.com available for public consumption. As I’ve written about before, your customers don’t care about your uptime… until you’re down. Suddenly, you’ll find that they care a lot about what the %#@$ is going on!

Downtime Visibility

Downtime Visibility is the name of the game. If your app goes down and your customers don’t know why, or when it might be back, then you can kiss them goodbye — either to the immediate “SCREW THIS” factor or the long slow death of a thousand frustrations.

It’s simple: when your app’s down, tell your customers right away. Executed correctly, it looks like this:

@myapp (3 minutes ago) The site’s currently down. We’re working to diagnose and fix the problem, updates here and at status.myapp.com.

This communicates three key things:

  1. You know that there’s a problem.
  2. You care that your customers know about it, enough to have a status page and tweet about it right away.
  3. You’re working to fix it.

If your downtime lasts longer than 30 minutes, then tweet every 30 minutes, even if just to say “Still working on it.” In the absence of communication, people will assume that you don’t care. Obviously, you need to fix the problem, but it’s worth the 20 seconds (MAX!) it takes to send a quick tweet.

Other Tools

OK, I hear you, Pingdom’s status page is a bit spartan and you’d like something more. Here are some ideas:

  • S3 static site hosting. Throw up a static HTML or text file.
  • GitHub pages.
  • A cheap hosting account. All you need is index.html, no dynamic scripting necessary.
  • Blogger, WordPress, etc.
  • A redirect to your @myapp (or @myapp_status / @myapp_ops) Twitter feed (make sure that the redirect’s hosted by your DNS company).

I wouldn’t recommend:

  • Putting your status page on your app server, or even your primary web server. This defeats the purpose.
  • Using a name other than status.myapp.com. People know to check there, and it’s easy to remember. See http://status.foxycart.com, http://status.linode.com, https://status.github.com, etc.
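
For the static options above (S3, GitHub Pages, a cheap hosting account), the page itself can be tiny. Here is a minimal Python sketch that renders one. The function name and the page layout are my own invention, so treat it as a starting point rather than a recommendation.

```python
import html
from datetime import datetime, timezone


def render_status_page(app_name, message, when=None):
    """Return a self-contained HTML status page as a string.

    when must be a timezone-aware datetime; it defaults to the current time.
    """
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y-%m-%d %H:%M UTC")
    return (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset=\"utf-8\">"
        f"<title>{html.escape(app_name)} status</title></head>\n"
        f"<body><h1>{html.escape(app_name)} status</h1>\n"
        f"<p>{html.escape(message)}</p>\n"
        f"<p>Last updated: {stamp}</p></body></html>\n"
    )
```

Write the returned string to index.html and upload it to whichever host you picked. The escaping matters: during an outage you will be typing fast, and a stray angle bracket shouldn't break the page.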

Now if you want to get fancy, you can: use a static site generator, a blog engine, etc. Style that page, brand it, go hog wild.

Just BE SURE to get VERY comfortable with your tools before the crap hits the fan. If your site goes down, and you’re having trouble updating your status page, or your status page gets overloaded… What’s the old saw? “Now you have two problems.”

Keep it simple.

In conclusion…

  1. Plan for downtime.
  2. Know how to use your status tools AHEAD of time. Make sure you can quickly and easily post to Twitter and update your status page.

Further Reading

http://www.kalzumeus.com/2010/04/20/building-highly-reliable-websites-for-small-companies/

Patrick has an internal status page that exercises all of the parts of his system, and has Pingdom check THAT page. Much more complete than just a “hey, is http://myapp.com up?”
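
I haven't seen Patrick's code, so the following is only my guess at the general shape of such a page, sketched in Python. Each check is a function that raises when its part of the system is broken. The check names are hypothetical. The page answers 200 only when everything passes, so any monitor that treats a non-200 response as “down” will catch a partial failure.

```python
def run_health_checks(checks):
    """Run every named check and summarize the result for a monitor.

    checks maps a name to a zero-argument callable that raises on failure.
    Returns (http_status, body): 200 if all checks pass, otherwise 503.
    """
    lines = []
    healthy = True
    for name, check in checks.items():
        try:
            check()
            lines.append(f"{name}: ok")
        except Exception as exc:
            healthy = False
            lines.append(f"{name}: FAILED ({exc})")
    return (200 if healthy else 503), "\n".join(lines)
```

Wire the returned status and body into whatever web framework you already use, and point your monitor at that URL instead of the home page.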

Tagged as: downtime pingdom basics status

My First 5 Minutes on a Server (with Ansible)

So today, Bryan Kennedy wrote this nice post about how to set up some basic security on a fresh Ubuntu Lucid box.

Cut to Hacker News, with this extremely helpful comment:

The premise of this thing is not good advice.

1) Your first couple minutes on a server should be used to install a configuration management client, if your bootstrap policies somehow don’t already install one.

This stuff isn’t hard. It’s worth doing right.

Nothing like the express lane from “here’s some practical advice that you can use to secure your server today” to “NOPE, you need to be a big-shot sysadmin to perform basic configuration on the box!”

This helpful statement garnered this reply:

This stuff isn’t hard. It’s worth doing right.

Can you provide an article as equally succinct as the OP’s that provides this information? Your list is painfully devoid of anything of true value. Since it’s not hard, and worth doing right, I imagine something should already be written.

Let’s see if we can find some middle ground.

Introducing Ansible

I LOVE Ansible. It’s a nice little tool that falls somewhere on the spectrum between “Do everything manually” and “Learn a ridiculous configuration language and beat your head against the Puppet manual for three hours in order to make a small change.”

(I don’t think I’m a stupid person, but DAMN do I feel dumb when I’m working with Puppet configs. And before you ask, I worked with Puppet for a year and found it consistently painful.)

Key differences between Ansible and just about anything else:

  • Ansible is small, just a couple thousand lines of well-tested code.
  • There is no central server.
  • There is no software to install on managed servers.
  • There is no configuration database.
  • Configurations are text files that can be read by humans.
  • Ansible connects over SSH and uploads small Python scripts to do its work.

NOT included

I’m not going to explain how to install Ansible on your workstation. Sorry, that’s a topic for another post. There’s decent documentation over at the Ansible site.

I’m not going to discuss how to write an Ansible playbook. Again, see documentation and maybe my pedantic playbook example.

I’m not going to talk in depth about my personal security preferences versus the original author’s. My purpose is simply to show that it’s equally easy (and possibly better) to automate server configurations.

(On that note, I’m going to skip firewalling for now, because it’s trivial to add, and because it’s a broader topic. fail2ban is a great start on locking down the server.)

So, without further ado…

First 5 minutes: “use configuration management” edition

Let’s restate the original post’s requirements:

“I like to have a single deploy user for people to log in with. This user has a complex password stored somewhere safe. Regular users log in using their SSH keys. No one is allowed to SSH in as root, they can only use sudo as the deploy user.

I want to use fail2ban to stop SSH scanning robots from trying users and passwords all the livelong day.

I also like to automatically install security updates so I don’t get hit unawares by a new exploit, and so that I don’t have to think about updates often.

I like to have logwatch email me every night with a summary of the day’s logs.”

As a bullet list

Here’s what I think you should do to a new server:

  • Set the root password.
  • Create the deploy user.
  • Lock down SSH using fail2ban, disable root access & password authentication.
  • Add my team’s SSH keys to the deploy user.
  • Set up automatic security updates.
  • (Lock down the firewall)
  • Set up logwatch to run nightly (and also configure outbound mail).

The first thing we have to do manually, but everything else can be easily done with an Ansible playbook.

A playbook?

Think of a “playbook” as a shell script that does server deployment. “Do this, then do that. If we changed the ssh config file, then restart the ssh service.”
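
If the “only restart ssh when the config changed” part sounds abstract, here is a toy model of the idea in Python. To be clear, this is neither Ansible's code nor its syntax. Every name in it is made up, and real handlers have more rules than this.

```python
def run_playbook(tasks, handlers, state):
    """Run tasks in order; fire a handler only if its task changed something.

    Each task is (name, function, handler_name_or_None). The function takes
    the state dict and returns True if it had to change anything.
    """
    notified = []
    for name, apply, notify in tasks:
        changed = apply(state)
        print(f"{name}: {'changed' if changed else 'ok'}")
        if changed and notify and notify not in notified:
            notified.append(notify)
    # Handlers run once, after all tasks, in this simplified model.
    for handler_name in notified:
        handlers[handler_name](state)
    return notified


def disable_root_login(state):
    """Idempotent task: only report a change when the setting was different."""
    if state.get("PermitRootLogin") == "no":
        return False
    state["PermitRootLogin"] = "no"
    return True


def restart_ssh(state):
    """Stand-in handler that just counts how often ssh would be restarted."""
    state["ssh_restarts"] = state.get("ssh_restarts", 0) + 1
```

Run it twice against the same state and the second run changes nothing and restarts nothing. That property is what makes it safe to re-run a playbook on a server you have already set up.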

Ansible uses YAML for its playbooks, which means you can read and edit them in any text editor, even Notepad.

The end result is that the configurations are readable by mere mortals. In fact, a playbook looks very much like the commands the original article prescribed.

Let’s proceed.

Step 1: Set the root password

Run:

yourmachine$ ssh root@server

Enter the initial root password from your hosting provider, then run:

root@server# passwd

Step 2: Fetch the bootstrap recipe.

https://github.com/phred/5minbootstrap/

yourmachine ~$ git clone https://github.com/phred/5minbootstrap.git
yourmachine ~$ cd 5minbootstrap

Step 3: Edit hosts.ini

Ansible needs to know about the servers you want to manage. There is no fancy central database, just a text file with a list of servers. Oh, it’s called an “inventory file.”

Edit the hosts.ini that came with the repository. Replace 127.0.0.1 with your IP address, and :2222 with your SSH port (or leave it off if it’s port 22).

[newservers]
127.0.0.1:2222

For convenience I made a newservers server group. The idea is that when I get a new server, I put it in that group temporarily and run the bootstrap.yml playbook.
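
To show just how little is in that file, here is a minimal Python sketch that reads the same format. It is not Ansible's parser and it handles only what appears above: [group] headers plus host or host:port lines. Real inventories support a good deal more.

```python
def parse_inventory(text):
    """Parse a minimal INI-style inventory into {group: [(host, port), ...]}.

    Port defaults to 22. Hosts listed before any [group] header land in
    a group called "ungrouped".
    """
    groups = {}
    current = "ungrouped"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            groups.setdefault(current, [])
            continue
        host, _, port = line.partition(":")
        groups.setdefault(current, []).append((host, int(port) if port else 22))
    return groups
```

Feeding it the two lines above gives {"newservers": [("127.0.0.1", 2222)]}.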

Step 4: Update the SSH public key.

yourmachine ~/5minbootstrap$ cp ~/.ssh/id_dsa.pub ./fred.pub

For simplicity I provided my public key in the repo. Unless you want to grant me login access to your server, you probably want to change that. :-)

Step 5: Run the playbook.

This is the needed invocation for Vagrant:

yourmachine ~/5minbootstrap$ ansible-playbook -i hosts.ini bootstrap.yml --ask-pass --sudo

Correction 6 Mar 2013: If you are logging into a fresh Linode, or another system where you only have the root user, you need to run this command:

yourmachine ~/5minbootstrap$ ansible-playbook -i hosts.ini bootstrap.yml --user root --ask-pass

I have updated the 5minbootstrap repo with a couple small changes to make that work.

Step 6: Go get a cup of coffee.

You’re DONE. I prefer hand-ground French pressed coffee myself. Tea is also fine.
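
If you want to confirm the SSH lockdown while the coffee brews, here is a rough Python sketch that audits the text of an sshd_config file. PermitRootLogin and PasswordAuthentication are the standard OpenSSH option names. The function itself is my own illustration, and it deliberately ignores Match blocks and Include directives.

```python
REQUIRED_SSHD_SETTINGS = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
}


def audit_sshd_config(text, required=REQUIRED_SSHD_SETTINGS):
    """Return the required settings that the config does not satisfy.

    Simplified on purpose: Match blocks and Include directives are ignored.
    """
    seen = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        # sshd uses the first value it finds for each keyword.
        seen.setdefault(keyword, value)
    return sorted(k for k, v in required.items() if seen.get(k) != v)
```

An empty list means both settings are explicitly set to "no". Anything else is the list of settings to go look at.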

What?!? Are you lying?

A little. Ansible takes a little bit of work to get going locally. But it takes ZERO server-side configuration.

It did take me some time to debug this playbook, about an hour. If it takes “5 minutes” to do the original set up steps, and about 2 minutes to do these… I break even on time investment after ~20 servers.

But when I consider running commands by hand on 20 servers, the fat-fingered mistakes I’d likely make, and that in practice those simple tasks might take much longer than 5 minutes, the playbook seems worthwhile.
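
The break-even arithmetic, spelled out as a small Python sketch using the rough numbers above (an hour of debugging, 5 minutes by hand, 2 minutes with the playbook). Plug in your own estimates.

```python
import math


def break_even_servers(setup_minutes, manual_minutes, automated_minutes):
    """Number of servers after which the automation has paid for itself."""
    saved_per_server = manual_minutes - automated_minutes
    if saved_per_server <= 0:
        raise ValueError("automation must be faster than the manual steps")
    return math.ceil(setup_minutes / saved_per_server)


print(break_even_servers(60, 5, 2))  # 60 / (5 - 2) = 20 servers
```

If the manual steps really take 15 minutes once you count the typos, the same hour of debugging pays for itself after 5 servers.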

Checklists have been studied in hospitals and they are proven to reduce errors and improve surgical outcomes.

A playbook is a checklist that you can EXECUTE. Even better.

Zealotry

“Why aren’t you using (XYZ configuration management tool) for EVERYTHING!!! ZOMG you’re doing it wrong.”

Let me take you aside and say for a minute that doing things by hand IS FINE. Especially if you’re not an expert sysadmin, and you’re only managing N servers, where N is approximately 2.

I’ve found that when N > 2 the task gets miserable, and the amount of work needed multiplies with each server.

Before I did configuration management, I kept meticulous notes (a topic for another post). I had a file called /root/JOURNAL.textile on each server and appended an entry with a date every time that I did something significant. Simple, stupid, reliable and repeatable. Manual management + meticulous notetaking: a fine solution.
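
I typed my journal entries by hand, but the habit is easy to script. A minimal Python sketch follows. The entry layout (a Textile h2 heading with the date) is only an assumption for illustration, not a record of what my files looked like.

```python
from datetime import date


def append_journal_entry(path, text, when=None):
    """Append a dated entry to a plain-text server journal."""
    when = when or date.today()
    with open(path, "a", encoding="utf-8") as journal:
        # "h2." is Textile heading markup; the layout is a guess.
        journal.write(f"h2. {when.isoformat()}\n\n{text.strip()}\n\n")
```

Opening the file in append mode means older entries are never rewritten, which is the whole point of a journal.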

After configuration management, I still keep meticulous notes, but now my knowledge is also captured and “re-playable” in an Ansible playbook.

TL;DR: Ansible manages configs, they’re human readable, and (mostly) one file.

Here’s the 64-line playbook that does all of this:

https://github.com/phred/5minbootstrap/blob/master/bootstrap.yml

I think that it’s human readable. Disagree? Hit me up on Twitter.

Tagged as: ansible bootstrap basics configuration management

Hey — thanks for reading!

My name is Fred, and I'm a web developer by trade, Linux sysadmin by necessity. I want you to win at hosting your own web applications.

Server administration doesn't have to come with a side of stomach ulcer.

As a developer you've got most of the skills you need, all you need are some practical ways to up your server game.

Questions? Email me.