Holy Sheet, Google is Fast

After posting my last post, I figured I'd do what I should have done before I typed it up -- see if someone else had ever done the same thing.

Google had already indexed my post. Jesus, they are creepy fast.

(It is not on Bing or DuckDuckGo yet. A victory for the new evil empire.)

(Screenshot: the Google search for "use safari reading list to send to instapaper".)

Using Safari's Reading List to Feed Instapaper

A week or so back, I started thinking that it might be nice to get away from using the Instapaper bookmarklet (which is great, don't get me wrong) and see if there was a way to use Safari's Reading List (which works on the iPad, iPhone, Mac, and presumably the PC) to send links to Instapaper.

This is about the Mac implementation, but presumably, a PC implementation would work similarly.

Getting the Reading List info to feed into Instapaper was surprisingly easy. I used Ruby, but you could certainly use pretty much any scripting language. The Reading List entries are stored in the Bookmarks.plist file that lives in your ~/Library/Safari directory. It's a binary plist, so you have to run it through plutil (plutil -convert xml1 Bookmarks.plist) to convert it into a text plist.

At that point, if you want to, you can run it through a regular expression, or feed it into a plist/XML parser to deal with it like an object.

You find the set of entries that define the Reading List links, parse the URLs out of them, and then you can send the links to the Instapaper Simple API to store in your Instapaper account.
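The script itself isn't posted yet, but the shape of it is easy to sketch in Ruby. A hedged version follows: the URLString key is, as far as I know, what Safari uses for Reading List entries, and the Instapaper Simple API endpoint (/api/add, taking username, password, and url) is from Instapaper's public documentation. Treat the details as assumptions, not the author's actual code.

```ruby
require "net/http"
require "uri"

# Step 1 (macOS only, done outside Ruby or via system()):
#   plutil -convert xml1 -o /tmp/Bookmarks.xml ~/Library/Safari/Bookmarks.plist

# Step 2: pull the Reading List URLs out of the converted plist.
# Each entry carries a URLString key; simple text matching is enough here.
def reading_list_urls(plist_xml)
  plist_xml.scan(%r{<key>URLString</key>\s*<string>([^<]+)</string>}).flatten.uniq
end

# Step 3: hand each link to Instapaper's Simple API.
# A 201 response means the link was saved to your account.
def submit_to_instapaper(link, username, password)
  Net::HTTP.post_form(URI("https://www.instapaper.com/api/add"),
                      "username" => username,
                      "password" => password,
                      "url"      => link)
end

# Only run the submission pass if the converted plist actually exists.
if File.exist?("/tmp/Bookmarks.xml")
  xml = File.read("/tmp/Bookmarks.xml")
  reading_list_urls(xml).each do |link|
    submit_to_instapaper(link, "you@example.com", "your-password")
  end
end
```

The parsing half is plain text matching, which is really all a quick sync script needs; a plist parser would be more robust, but the regex keeps the sketch short.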

I've got some extra jazz in there that keeps track of the last time it ran, only grabs links newer than that, and after submitting, throws up a little Growl message that lets me know the link was submitted successfully.

It's about 83 lines of code (and I'm a pretty crappy Ruby programmer) and, so far, it works pretty flawlessly.

But, what good is that if I have to run a script every time I want to sync up my Reading List and Instapaper? Wouldn't it be awesome if there was a way to automatically run the script whenever the Reading List is updated?

It would be.

And there is.

launchd is a nifty cron-like thing that runs on the Mac. You tell it to do automatic stuff and it does it. One of the automatic things you can tell it to do is "watch this file, and if it changes, run this command."
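The LaunchAgent itself isn't shown in the post; for reference, a minimal agent of that shape would look something like this. The label and both paths are placeholders -- you'd save it in ~/Library/LaunchAgents and load it with launchctl load:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.readinglist-to-instapaper</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/ruby</string>
        <string>/Users/you/bin/reading_list_sync.rb</string>
    </array>
    <key>WatchPaths</key>
    <array>
        <string>/Users/you/Library/Safari/Bookmarks.plist</string>
    </array>
</dict>
</plist>
```

WatchPaths is the "watch this file, and if it changes, run this command" part: launchd fires the ProgramArguments command whenever the watched file is touched.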

After a bunch of trial and error, I got my little LaunchAgent working, and now, every time I add something to the Reading List in Safari, I wait about 10 seconds, and I get my little Growl notification that the link has been submitted to Instapaper.

Awesome, right?

Even better, now I can add things to my Reading List on Safari on my iPad or iPhone. When my computer next syncs up its bookmarks, it'll submit those links to Instapaper.

Even better than that, if I put the little touch file (that keeps track of when I last updated) onto something like Dropbox, I can run this utility on any computer I want, adding links to the Reading List all the while, and things should appropriately get added to Instapaper.

All in all, it's a nifty little system.

I'm going to throw the whole thing up on GitHub at some point this week, so you can download it and play with it yourself (and probably make it better).

iOS 5.1 and iTunes Match, Part 2: It Mostly Works Now

As I mentioned earlier this week, the recent iOS 5.1 (and iTunes 10.6) releases fixed a handful of the lingering issues that existed with iTunes Match.

The big remaining issue was really the fact that Smart Playlists that rely on the "limit" option in iTunes still didn't work. (There were a couple of other issues, with tracks that were tricky to get into iTunes Match, and sporadic play count updates.) Well, I spent some time this weekend poking around with things to see if there were any other improvements in the latest release.

And there are. Big ones.

First, I had two tracks that I had never been able to get into iTunes Match. They are some old mp3s I ripped in college (legal rips, even!) and burned to an archive CD during one of my moves between computers. The media seems to have corrupted a bit over time, and no magic mp3 repair could get them into a state that would make iTunes Match happy. But, you know, new iTunes, new iOS, why not give it a shot?

Well, it worked. They were added and uploaded to iTunes Match in about 30 seconds. No errors, no problems.

So, if you've got some old tracks hanging around that you were never able to get into iTunes Match, it's probably worth another shot.

That brings us to those dreaded Smart Playlists. My big one that had been nerfed by iTunes Match was my "Best Recent Adds", which is the best 50 songs (by rating) that were most recently added to my iTunes. It was useless on iTunes Match, as it seemed to go off of the date the songs were added to my iPhone (or iPad) rather than the date they were added to my collection.

Googling around, I found this Apple support discussion where someone mentioned turning iTunes Match off and back on on your iOS device seemed to make these Smart Playlists work. "What the hell?", I figured.

Well, again, it worked.

I went onto my iPhone, turned off iTunes Match, force killed the Music app (probably not necessary, but I'm impatient), and then turned iTunes Match back on. I waited about 15 minutes for things to load again (here's a tip: go to your Podcasts screen, where there should be a nice iCloud cloud and progress bar -- that'll show you how far you are). Once it had loaded, I checked out my smart playlists -- and now they worked!

Most of them, at least. There's still one or two that seem to act a bit weird (those based off of updating play counts). But the ones based off limits seem to work.

With iTunes 10.6 and iOS 5.1, iTunes Match now seems to:

  • Handle a much larger variety of "difficult" tracks
  • Handle artwork much more reliably
  • Restore Genius Mixes and Genius Playlists
  • Fix the handling of (most) Smart Playlists

The only thing left, as far as I can tell, is getting play count syncing working. It still only seems to sync the first track you play, never updating the rest of the play counts (which is why some smart playlists won't look like they are working). Something to hope for in iTunes 10.6.1 or iOS 5.1.1, I suppose.

I would say that, if you were waiting for iTunes Match to sort itself out, it's pretty darn close. For my money ($24.99/year, in fact), it's easily worth it, just for the off-site backup of my music (and the streaming to my Apple TV).

iOS 5.1 and iTunes Match (Genius is back, baby!)

As mentioned in previous posts, there were a few things (Genius, better artwork syncing) that would probably have required an iOS update to get working.

Well, iOS 5.1 is here and Genius is back and it is glorious. Right under playlists is "Genius Playlist", like it used to be, pre-iTunes Match. One thing to be aware of, though, is that it will pull in music that isn't local to your phone, so if you're somewhere without an internet connection, your Genius playlist might not have quite as much music as you'd hoped. And, your phone will download the songs that are in the cloud, so if you keep a carefully cultivated music library on your iOS device, you'll be littering it with anything that gets thrown into your Genius playlist.

The jury is still out a bit on artwork, but my iPhone does seem to be far more responsive downloading artwork, and it will load the artwork without having to leave and come back to the song. And once downloaded, at least in my very, very early testing, the phone seemed to be doing a better job of caching that artwork. That's at least promising.

Slowly, Apple is ironing out the wrinkles in iCloud and iTunes Match.

A Quick Bit About WordPress Caching

WordPress, being a PHP-heavy, database-driven CMS, can hit some performance issues. Namely, without some basic caching turned on, WordPress can be dog slow.

Working for a shared web host, I see WordPress performance problems all the time. WordPress sort of hits the sweet spot of everything that can possibly go wrong with shared hosting: loads of PHP file opens, a bunch of database requests, and very little that can be easily cached upstream with a caching proxy like Varnish or Squid.

So, both out of curiosity and a desire to make things better for customers, I did some quick benchmarks of WordPress with no caching, with Hyper Cache, and with W3 Total Cache.
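The post doesn't name the benchmarking tool, but numbers of this shape (N requests, C at a time, with average/longest/80th-percentile times) are exactly what ApacheBench reports from a command like ab -n 1000 -c 20 http://your-site/. As a toy illustration of the same measurement, here's a self-contained Ruby sketch; the tiny built-in server just stands in for the WordPress install so the sketch runs anywhere:

```ruby
require "socket"
require "net/http"

# Minimal one-purpose HTTP server; a real benchmark would point at
# your actual WordPress site instead.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
Thread.new do
  loop do
    Thread.new(server.accept) do |c|
      # Drain the request headers, then answer with a fixed body.
      while (line = c.gets) && line != "\r\n"; end
      body = "hello"
      c.write("HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n\r\n#{body}")
      c.close
    end
  end
end

# Issue `requests` total GETs, `concurrency` at a time, like
# `ab -n requests -c concurrency`, and collect per-request timings.
def benchmark(host, port, requests, concurrency)
  times = Queue.new
  threads = concurrency.times.map do
    Thread.new do
      (requests / concurrency).times do
        start = Time.now
        Net::HTTP.get(host, "/", port)
        times << (Time.now - start)
      end
    end
  end
  threads.each(&:join)
  all = Array.new(times.size) { times.pop }.sort
  { average: all.sum / all.size,   # mean request time
    longest: all.last,             # worst request
    p80:     all[(all.size * 0.8).floor] } # 80% handled within this time
end

stats = benchmark("127.0.0.1", port, 100, 20)
puts format("avg %.4fs  longest %.4fs  80%% in %.4fs",
            stats[:average], stats[:longest], stats[:p80])
```

Against a local no-op server the numbers are tiny, but the same three statistics are what the tables below report for the real WordPress configurations.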

Based off my past experience (and, really, based off of little more than a bit of speculation), I figured Hyper Cache might perform best on shared hosting. It's simple, uses disk-based caching, and doesn't seem to require a lot of overhead. I speculated it would perform a bit better than W3 Total Cache, as I thought the overhead of W3 Total Cache might slow it down on a shared host.

I was very wrong.

I ran a couple of tests. First, a sort of low traffic test (1000 requests, 20 at a time). The details:

Test 1, Low Traffic (1000 requests, 20 at a time)

No Caching

Metric                       Time (seconds)
Average Request              0.8
Longest Request              3.9
80% of Requests Handled In   1.2

That's not so bad, but this is a pretty light traffic load.

Hyper Cache

Metric                       Time (seconds)
Average Request              0.4
Longest Request              3.7
80% of Requests Handled In   0.4

Well, that's a good bit better, isn't it? Caching, not surprisingly, helps.

W3 Total Cache

Metric                       Time (seconds)
Average Request              0.07
Longest Request              0.3
80% of Requests Handled In   0.09

Crikey. That's not just better, that's basically as fast as you can load static content. That's insanely good.

Ok, well, under low traffic load, W3 Total Cache wins. What happens when we crank it up a bit?

Test 2, High Traffic (5000 requests, 100 at a time)

No Caching

Metric                       Time (seconds)
Average Request              1.7
Longest Request              11.3
80% of Requests Handled In   2.6

As expected, performance goes down as we start getting our big traffic influx. 11 seconds to handle the longest request, and it's about twice as bad as under light load.

Hyper Cache

Metric                       Time (seconds)
Average Request              1.8
Longest Request              15.3
80% of Requests Handled In   3.5

The first real surprise: Hyper Cache basically tips over under this load. I'm not sure why this would be (maybe Hyper Cache doesn't play well with NFS), but this is bad news for making our site run quickly.

W3 Total Cache

Metric                       Time (seconds)
Average Request              0.3
Longest Request              3.5
80% of Requests Handled In   0.4

Wow. W3 Total Cache just crushes it. Sure, it's slower than it was under light load, but it's basically still performing as well as it would if it was a static site.

That's pretty unreal performance. And, that's out of the box with almost no tuning. I'm guessing if I played a bit more, I could squeeze a bit more performance out of W3 Total Cache.

There are a million other caching plugins. I'm not sure, though, that I'd look much further. Even if another plugin could perform better than W3 Total Cache, it can't perform that much better. Given how widely used W3 Total Cache is (meaning loads of tips, tricks, and support on the web), I'm pretty sure this is my recommended caching plugin (and something that we might start installing automatically in our hosted WordPress install at work).

Website Truck Day

(This will get a little navel gazy, so if you don’t care about how this site is hosted, feel free to skip it.)

After a little more than six years hosting this site at DreamHost, I packed up and moved it, whole hog, to a VPS instance at Linode. It was less work than I expected (all in, maybe 8 hours), but probably more work than the average joe managing a blog would want to take on. I’ll explain a bit more later about exactly what went into the move, but first …

(Truck Day is the day that a baseball team loads all their gear into trucks and heads down to Florida/Arizona for Spring Training.)

Why move?

The decision to move off of shared hosting onto my own VPS is one that I’d been thinking about for a while. Honestly, I didn’t need to move for any traffic reasons. I get a handful of hits, depending on what I write about, and don’t go out of my way to pimp anything. I was a bit frustrated with the performance of my site on shared hosting (the fact that I don’t get a lot of traffic is sort of irrelevant when you are hosted on shared hosting – you’re all, basically, sharing the same set of resources).

Mostly, though, I’m a dork and I do web stuff for a living, so I wanted a place to mess around. I wanted a box I could log in to and do development work, or write occasional scripts, or host code repositories for my own work. That’s something that just isn’t terribly easy/advisable to do on a shared account. DreamHost, actually, is a shared host where that sort of thing is possible, but it’s just a little icky to do that sort of stuff on a shared box (where, if security isn’t perfect, you’re potentially exposing your stuff to other folks, and where performance is never going to be as good as on your own slice of a server).

On top of all that, DreamHost has had a bad run lately. I don’t blame them (well, I do, but I don’t hold it against them), but they’ve had outages and some security issues, and then general problems that seem to come with growth and adding complexity to a company that used to be very narrowly focused on just hosting. Not that I know anything about that. I just didn’t feel like hanging around and letting them figure it out, on my dime. It happens. Sometimes you just grow apart. Anyway …

Linode is one of the premier VPS providers (VPS being where you are renting a dedicated slice of a server, just for yourself, rather than an entire server). They’ve got good pricing, good documentation, and a good reputation.

So I bit the bullet and bought a VPS to start the process of moving.

Moving Day

Moving Day was never intended to be Moving Day. I was going to spend a week or two getting things ready, copying over data, testing, making sure everything was happy, and then cut over one service at a time (one site, a second site, mail).

Once I had provisioned my new VPS, I started by just getting things updated with the latest patches (much easier than you might guess, given all of the fine package managers that exist these days). Then, using those same package managers, I set about installing a web server, various scripting languages, and MySQL. All of the bits that you need to run a website can be installed reasonably easily with a few commands.

(This isn’t news to me, I’m just sharing.)

The other nice bit of doing things the system way, rather than rolling your own, is that you get to take advantage of system level patches and security updates. If you’re not doing anything crazy, where you need a lot of custom modules or code compiled in, using system packages is a very, very nice way to keep yourself up-to-date.

After throwing up a firewall and doing a few other security things, I kinda felt like I was done with my setup. In about 3 or 4 hours, I was seeing a test page, had a database, and (hopefully) had things secured reasonably well.

A Quick Note About Security

One of the great things about managing your own site on your own server, virtual or otherwise, is that you are in complete control. Want better performance? Throw up a reverse proxy or caching server. Want some cool new tool that your host won’t install? Install it yourself.

It’s a great amount of freedom. It’s enough freedom that, well, let me rewrite that previous sentence:

One of the worst things about managing your own site on your own server, virtual or otherwise, is that you are in complete control. Forget to set up a firewall? Enjoy having all of your data deleted and your site used for malware. Leave your mail server open for relaying mail? Spend a few weeks wondering why your mail to everyone is flagged as spam.

It cuts both ways. I do this stuff for a living and even I’m paranoid I’ve done stupid stuff. Google around, ask friends, turn off things you don’t need running.

Never log in to your server with a password. Always set up SSH keys and use those. Just trust me.

Back to Moving

With the server now, in theory, all configured, I figured “Well, might as well move over my static HTML site.” I keep a backup of my stuff locally (you keep a backup of your site, locally, right?), so I just SCP’d everything up to the server, and there it was.

(Well, when I say “there it was,” what I really mean is “There it was, in raw HTML source, connecting through HTTP Client on my Mac,” since the way some servers work (including the one I set up) is that you need to connect using both the IP and a “host”, which is hard to do through a browser until you’ve cut over DNS, and you don’t want to cut over DNS until your site is ready, and you can’t tell if your site is ready until you’ve seen it … confusing enough? Google around for adding things to your hosts file, which may be easiest for most folks.)
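The hosts-file trick mentioned in that parenthetical is the easier route: temporarily point the domain at the new server's IP on your own machine, browse the site normally in any browser, then delete the line once you've cut DNS over. The IP and domain below are placeholders:

```
# /etc/hosts on a Mac (or C:\Windows\System32\drivers\etc\hosts on a PC)
203.0.113.10    example.com www.example.com
```

Your browser will then resolve the domain to the new server while the rest of the world still sees the old one, which lets you test the site in place before touching DNS.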

Since my static site worked, I went to DreamHost to cut over the DNS. I will keep my vitriol to a minimum here. For all of the good things DreamHost does, their control panel/management area blows. It really, truly does.

To change DNS, I had to turn off hosting for my website. Which means my website would go down for as long as it took DNS to show up changed around the world. That could be anywhere from a few minutes to a few hours. Awesome. Appreciate that, DreamHost.

I bit the bullet, broke my site momentarily, and waited for DNS to change. After about 30 minutes, I could see that my site was up and running on my VPS. Success!

Once that was done, I moved my blog. It’s WordPress, it’s got some plugins, and I figured I’d need to do some reconfiguring (just to make things work perfectly in the new environment). Again, I uploaded my site (local backups, friends), clearing out the statically cached files from my caching plugin of choice. There’s the fun extra step of running a database backup (to copy it to my new site), and a couple of config changes.

A couple of quick things of note:

  • Remember to download copies of anything you upload via your blog client, which you may not have backed up locally
  • Take this opportunity to clean up old plugins, old junk data, bad database tables, etc.

Again, I do the HTTP Client check, go to DreamHost, break my site temporarily, change DNS, wait for the changeover, and do the Snoopy dance of success.

All that was left was mail.

Mail is confusing as hell, there’s very little good documentation for it, and if you get it wrong, you can end up in very bad shape. My advice: outsource it to Google or to a shared web host, where you can probably do it for free.

I ended up forwarding my mail (which I think I’ve gotten secured down to where it works, but won’t say for sure that I have a clue what I’m doing). This was another amazing process with DreamHost’s shit-tastic DNS tools that required an email to support, where they (without asking me) disabled my mail to point it to my new site, leaving my mail broken for 4 hours.

Not my favorite experience (especially since I was dealing with it at a basketball game), but once it was done, I was officially on my VPS.

The End?

In the end, it’s been worth it so far. It’s costing me a bit more each month (about $10/mo more, plus another $10/year for my domain renewal, which used to be free with my hosting), but my site is performing considerably faster (a few seconds per page load, on average), I’ve set up some handy things that let me manage my life a bit easier, and I’ve got a server I can pop into when I need to do some work.

It’s great for me.

I wouldn’t recommend it for most people. In fact, I wouldn’t recommend it to nearly anyone. For the average person, I’d recommend you stick with a shared web host (email me for some recommendations, if you need one). Let someone else deal with all this server crap. You focus on your site.

For me, the server crap is what I kinda enjoy.

The Live Music, as the Kids Say

Over the past few months, my concert-going (attending "the live music") has ticked up. And it'll pick up even more over the next couple of months with shows by Kaiser Chiefs, Fanfarlo, Delta Spirit, and We Were Promised Jetpacks (and I think a couple of other shows I can't remember).

Since I don't have much else to say, here's some video from some recent shows.

The Sheila Divine

The Parkington Sisters (covering Radiohead!)

Fanfarlo

We Were Promised Jetpacks

The Self-Fulfilling Prophecy of Bad Release Processes

If you need to wait any non-trivial amount of time between completing something and seeing how well it’s
performing, you’re not going to be working on that project by the time you get your answer. When you do get your
answer, you’re not only going to have to refresh your memory on what you had been working on, but you’re going
to have to do the same on whatever else you had started working on.[1]

I will admit, I’m currently a bit frustrated with our team’s development and release process. In the name of stability, we have given up speed, agility, and, honestly, stability.

Ironic, eh?

When you’re working on web stuff, having a long cycle between when something leaves your keyboard and when it is live on production servers being poked at by real, live customers is a bad idea for a whole bunch of reasons.

There’s always something different about production than your staging environment. It doesn’t matter how well you prepare; it just happens. Different security, some small schema change, a firewall rule. Something. The longer you wait between the last time you worked on your code and when it is in front of customers, the more likely something else will change, too. And the less likely you’ll remember how your stuff worked.

Somehow, handing over code to the QA team is considered to be a sanctified handoff. “My part is done. I’ve given the code to QA.” The fact that it doesn’t quite work, has significant issues, or blows up is irrelevant. The division of labor between developing and deploying makes it far too easy to pass the buck and make the brokenness someone else’s problem.

And, I think worst of all, you take your eye off the ball. If you’re engaged in your release, and feel responsible for it, you can do smart things like push it live, make sure it’s working, and take a look at some logs to see if your assumptions are right. Maybe you speculated on some language or colors, and it’s just not converting. Change it.

When you’ve decoupled your development and release processes, you don’t have that fine-grained control, that immediate responsiveness. Instead, you need to get someone off of another project, have them dive back into the code, put it through another release process, and repeat it all over again. A change you could have made in five minutes, and that might have benefited your company or customers, ends up in another week-long QA process where, inevitably, some inexperienced QA person digs up some completely unrelated bug, and you watch your life just waste away …

Yes, long, rigid release cycles are necessary for software products that actually have to ship code, or might destroy data, or blow away finances, or do really bad things™.

But if you’re working on the web, all they do is make things worse. I sincerely believe that. They don’t make things more stable (too many other things changing; too much context switching to remember why you had that odd bit of code), or make your releases smoother (too much passing of the buck; too many people involved; too many places where someone can say “stop the presses” because their nose is out of joint that day). They just make people feel better, and they let people cover their asses, so that the self-inflicted pain of choosing that release style can’t possibly be blamed on them. It’s a self-fulfilling prophecy. By building the process to let people cover their asses, you ensure that they’ll have something to cover them for.


  1. Andrew Morrison from the above-linked post  ↩