<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>That Not So Fresh Feeling &#187; Web hosting</title>
	<atom:link href="http://blog.ryantoohil.com/category/web-hosting/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.ryantoohil.com</link>
	<description>Stuff.</description>
	<lastBuildDate>Wed, 08 Sep 2010 05:36:00 +0000</lastBuildDate>
	<language> </language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<!-- podcast_generator="podPress/8.8" - maintenance_release="8.8.6.3" -->
	<copyright>2006-2007 </copyright>
	<managingEditor>ryan@ryantoohil.com (That Not So Fresh Feeling)</managingEditor>
	<webMaster>ryan@ryantoohil.com (That Not So Fresh Feeling)</webMaster>
	<category>posts</category>
	<ttl>1440</ttl>
	<itunes:subtitle></itunes:subtitle>
	<itunes:summary>Just another place for just another jackass to rant about sports, politics, entertainment, technology, and life.</itunes:summary>
	<itunes:keywords></itunes:keywords>
	<itunes:category text="Society &amp; Culture" />
	<itunes:author>That Not So Fresh Feeling</itunes:author>
	<itunes:owner>
		<itunes:name>That Not So Fresh Feeling</itunes:name>
		<itunes:email>ryan@ryantoohil.com</itunes:email>
	</itunes:owner>
	<itunes:block>no</itunes:block>
	<itunes:explicit>no</itunes:explicit>
	<itunes:image href="http://blog.ryantoohil.com/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<item>
		<title>No Need to Panic</title>
		<link>http://blog.ryantoohil.com/2009/09/no-need-to-panic.php</link>
		<comments>http://blog.ryantoohil.com/2009/09/no-need-to-panic.php#comments</comments>
		<pubDate>Tue, 08 Sep 2009 20:19:07 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[DreamHost]]></category>
		<category><![CDATA[Web hosting]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=395</guid>
		<description><![CDATA[My site has been flaky over the last bit as my web host is doing who knows what. Presumably they&#8217;ll get it figured out. Though this made me laugh &#8230;. Well, it would have made people laugh if I had the ability to upload a file &#8230; so instead, I&#8217;ll just paste the text in: [...]]]></description>
			<content:encoded><![CDATA[<p>My site has been flaky over the last bit as my web host is doing who knows what. Presumably they&#8217;ll get it figured out.</p>
<p>Though this made me laugh &#8230;.</p>
<p>Well, it would have made people laugh if I had the ability to upload a file &#8230; so instead, I&#8217;ll just paste the text in:</p>
<p>12:35:39 up 8 min,  3 users,  load average: 413.01, 255.54, 113.59</p>
<p>Good times.</p>
<p>Oh &#8230; it&#8217;s getting worse:</p>
<p>[flower]$ uptime<br />
 12:46:20 up 18 min,  3 users,  load average: 742.20, 567.02, 336.31</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2009/09/no-need-to-panic.php/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twitter Broke the Web</title>
		<link>http://blog.ryantoohil.com/2009/08/twitter-broke-the-web.php</link>
		<comments>http://blog.ryantoohil.com/2009/08/twitter-broke-the-web.php#comments</comments>
		<pubDate>Thu, 06 Aug 2009 15:04:46 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[Web hosting]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=388</guid>
		<description><![CDATA[With Twitter being down, and feeling the need to spew out characters to the Twinterweb, I opened up trusty MarsEdit and decided to update my good ol&#8217; blog. That&#8217;s when I realized how much Twitter had taken over. I hadn&#8217;t logged into my web interface in a bit, so I had a WordPress update, 4 [...]]]></description>
			<content:encoded><![CDATA[<p>With <a href="https://www.twitter.com">Twitter</a> being down, and feeling the need to spew out characters to the Twinterweb, I opened up trusty <a href="http://www.red-sweater.com/marsedit/">MarsEdit</a> and decided to update my good ol&#8217; blog.</p>
<p>That&#8217;s when I realized how much Twitter had taken over.</p>
<p>I hadn&#8217;t logged into my web interface in a bit, so I had a WordPress update, 4 plugin updates, and a bunch of pending spam. I cleared that out and then started crafting a post in MarsEdit. As I went to post <a href="http://blog.ryantoohil.com/2009/08/twitters-down-oh-no.php">those fateful words</a>, MarsEdit choked on an XML-RPC error complaining about bad content.</p>
<p>I quickly scanned through my recent posts, looking for bad characters. I used the Googles to try to find an explanation. Finally, I noticed that each page of my blog was spitting out a PHP warning (because, you know, PHP is pretty dumb, and I&#8217;m too lazy to have turned off warnings) that it couldn&#8217;t download my latest tweets from Twitter.</p>
<p>One quick click of &#8220;Disable&#8221; and MarsEdit sprang back to life.</p>
<p>Twitter was breaking my site.</p>
<p>At which point it dawned on me that, these days, Twitter being down is like having part of the internet&#8217;s routing go down. It&#8217;s tied into so many systems, and so much traffic/content flows through it, that when it goes kablooey, all that content has to route elsewhere. That floods those systems, which start to struggle and burst at the seams a bit, and then folks overflow into another system, and so on.</p>
<p>Until the end result, of course, is that Twitter was breaking my site.</p>
<p>Here&#8217;s a graphical representation:</p>
<p><img src="http://www.ryantoohil.com/images/twitter-down-20090806-110343.jpg" alt="Twitter Down" title="Twitter Down" /></p>
<p>It&#8217;s fixed now.</p>
<p>My site, that is, not Twitter, which is <a href="http://downforeveryoneorjustme.com/twitter.com">still down</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2009/08/twitter-broke-the-web.php/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Awesome tool: HTTP Client for HTTP Debugging</title>
		<link>http://blog.ryantoohil.com/2008/12/awesome-tool-http-client-for-http-debugging.php</link>
		<comments>http://blog.ryantoohil.com/2008/12/awesome-tool-http-client-for-http-debugging.php#comments</comments>
		<pubDate>Sat, 13 Dec 2008 17:11:36 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[Mac]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[Web hosting]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=322</guid>
		<description><![CDATA[HTTP Client &#8211; Mac Developer Tool for HTTP Debugging If you do any web development, this tool is awesome.]]></description>
			<content:encoded><![CDATA[<p><a href="http://ditchnet.org/httpclient/">HTTP Client &#8211; Mac Developer Tool for HTTP Debugging</a></p>
<p>If you do any web development, this tool is awesome.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/12/awesome-tool-http-client-for-http-debugging.php/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hey Twitter! Your problems are my problems!</title>
		<link>http://blog.ryantoohil.com/2008/06/hey-twitter-your-problems-are-my-problems.php</link>
		<comments>http://blog.ryantoohil.com/2008/06/hey-twitter-your-problems-are-my-problems.php#comments</comments>
		<pubDate>Tue, 03 Jun 2008 03:33:24 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Web hosting]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=275</guid>
		<description><![CDATA[If you&#8217;ve been following along at home, you&#8217;ll have noticed I work for a web hosting company. We&#8217;re pretty big, with hundreds of thousands of customers, and we&#8217;ve got some interesting operation efficiencies/differences from traditional hosts that give us some fun advantages. Those same advantages come at the cost of having to do some interesting [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve been following along at home, you&#8217;ll have noticed I work for a web hosting company. We&#8217;re pretty big, with hundreds of thousands of customers, and we&#8217;ve got some interesting operation efficiencies/differences from traditional hosts that give us some fun advantages. Those same advantages come at the cost of having to do some interesting thinking about scalability and performance. This is where the stuff currently going on at Twitter starts to sound *really* familiar.</p>
<p>But, let&#8217;s start with the basics &#8230;</p>
<p><strong>Typical Web Host</strong><br />
A typical web host has a &#8220;pizza box&#8221; architecture. They pile a bunch of servers in a rack in a data center. Each server is running some MySQL, some Apache (or IIS), some email app (Exim or Qmail), and some FTP. Each server is usually running another instance of Apache to run the customer control panel. The box usually runs off its own local storage. Thus, each box is an individual entity running its own versions of applications, with its own individual issues. Depending on which box you&#8217;re on, you might have MySQL slowness, or Apache slowness, or disk slowness, all depending on what your neighbors are doing. Rolling out upgrades becomes a bit more arduous as you&#8217;ve got to upgrade each box. This is decidedly old-school, but very common and works fairly well. This architecture only starts to be a problem as you grow; each time you add X customers (where X is the number each box can support), you have to add a new box. Each time you add a new box, you need more power, space, backup storage, etc. That gets hard to scale.</p>
<p>So, then there was the idea of not using local box storage, but instead using networked storage. Now you can just add more disks without having to add entirely new boxes, and backups become less of an issue. This means you can grow a little bit more without having to add more boxes.</p>
<p><strong>Atypical Web Host</strong><br />
An <strong>a</strong>typical web host (like us and some of our competitors) might do things a bit differently. For instance, rather than having a single LAMP (Linux-Apache-MySQL-PHP/Perl/Python/(P)Ruby) stack per set of customers (on a box), you build a pool of boxes that can all serve up services for a customer. This means that:</p>
<ul>
<li>You&#8217;ve got some redundancy</li>
<li>You don&#8217;t have to scale all of your services as you add new customers</li>
</ul>
<p>In this model, more like a typical web service than a web host, you&#8217;ve got pools of servers that perform certain tasks, and customer data is kind of abstracted away to all just live on big storage arrays. Things scale much more easily and cost-effectively.</p>
<p>But &#8230;.</p>
<p><strong>What does this have to do with Twitter?</strong></p>
<p>See, when you&#8217;re running centralized services, you end up with a lot of data sitting in your database, and it&#8217;s getting accessed by Y thousands (tens of, hundreds of) users. The &#8220;pizza box&#8221; model doesn&#8217;t have this issue since the data is distributed across thousands of servers. But that&#8217;s not feasible in something like Twitter. You don&#8217;t only want to be able to see tweets from people who are on the same server as you. You want to see tweets from everybody. That&#8217;s the power of Twitter.</p>
<p>Having lots of data in a single database instance solves that problem. But it introduces a whole new set of issues: database locking, slave synchronization, disk performance.</p>
<p>Let&#8217;s take an educated guess at what Twitter&#8217;s database schema might be like:</p>
<blockquote>
<pre>Table: Users
UserID     int(14) autoincrement
Username   varchar(50)

Table: Tweets
TweetID    int(14) autoincrement
UserID     int(14)
Tweet      varchar(140)
DateStamp  datetime

Table: Followers
UserID     int(14)
FollowerID int(14)

Table: UserTweets
UserID     int(14)
TweetID    int(14)</pre>
</blockquote>
<p>In a nutshell, we&#8217;ve got a table that keeps track of our users. We&#8217;ve got a table of tweets, which maps back to give us a user based off of the UserID. Right there, you get a pretty easy query to get all of the tweets for a user:</p>
<blockquote>
<pre>SELECT t.Tweet FROM Tweets t INNER JOIN Users u USING (UserID) WHERE u.UserID = ?;</pre>
</blockquote>
<p>When Twitter was just a baby, that query was pretty darn fast. It&#8217;ll give you all of the tweets written by any particular user. Reads will be fast, since UserID will be indexed and unique. Writes will be fast, since we&#8217;re not writing a ton of data and updating the indexes should be pretty quick (since Twitter is still just a baby).</p>
<p>Now, we add the functionality of &#8220;following&#8221; another Twitter user (good thing we thought ahead and added that to our schema!). Now a user can follow another user, which simply sticks a row in the Followers table with the ID of the user and the ID of whoever is following them.</p>
<p>Keeping track of all the tweets flowing into the user from folks they&#8217;re following is easy, since we planned ahead. We made a table that lets us map a bunch of tweets to a user. Every time we add a tweet, we stick a row in that says &#8220;this user has this tweet&#8221;&#8211;it doesn&#8217;t matter if the tweet was by that user, or someone they were following, they all go in that mapping table.</p>
<p>Again, when Twitter was a baby, this was all pretty easy and quick:</p>
<blockquote>
<pre>SELECT t.Tweet FROM Tweets t INNER JOIN UserTweets u USING (TweetID) WHERE u.UserID = ?;</pre>
</blockquote>
<p>We&#8217;re going to get back all the tweets assigned to that user, whether written by the user or someone they are following. Simple and fast, since again, we&#8217;re hitting some pretty easily indexed and unique fields.</p>
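<p>For what it&#8217;s worth, here&#8217;s a rough sketch of the indexes I&#8217;m assuming exist when I call those fields easily indexed. The key and index names are my own guesses against the made-up schema above, not anything Twitter has published:</p>
<blockquote>
<pre>-- Hypothetical: one row per (user, tweet) pair, so a timeline lookup
-- by UserID can read a single range of the primary key.
ALTER TABLE UserTweets ADD PRIMARY KEY (UserID, TweetID);

-- Hypothetical: lets "all tweets written by one user" avoid a full table scan.
ALTER TABLE Tweets ADD INDEX idx_tweets_user (UserID, DateStamp);</pre>
</blockquote>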
<p>Of course, I&#8217;ve made one assumption here. I&#8217;m assuming that, when I start following someone, their new tweets start getting associated with my user (in the UserTweets table). I&#8217;m guessing this is the case since, when you start following someone, all of their old tweets don&#8217;t magically show up in your history (and conversely, when you stop following someone, their tweets don&#8217;t miraculously disappear; I don&#8217;t think so, at least). Either Twitter sticks stuff in the mapping table (my guess), or they go look up the date you started following each user and build that index on the fly.</p>
<p>In other words, the alternative version is that Twitter, when you view a page that displays your tweets along with those of folks you follow, would have to go look up the date when you started following someone, then find all of their tweets after that date, bring them together, and put them in order.</p>
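<p>To make that alternative concrete, here&#8217;s roughly what the on-the-fly query might look like. Note that it needs a follow date, which my guessed schema doesn&#8217;t have, so the FollowedSince column below is purely hypothetical:</p>
<blockquote>
<pre>-- Assumes a hypothetical Followers.FollowedSince datetime column.
-- Followers.UserID is the person being followed; FollowerID is the reader.
SELECT t.Tweet
FROM Followers f
INNER JOIN Tweets t ON t.UserID = f.UserID
WHERE f.FollowerID = ?
  AND t.DateStamp &gt;= f.FollowedSince
ORDER BY t.DateStamp DESC;</pre>
</blockquote>
<p>That&#8217;s a join plus a sort across every person you follow, on every page view, which is why I doubt they do it this way.</p>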
<p>Mapping table seems a whole lot more likely. And much faster. Which is kind of how Twitter got big.</p>
<p>Anyway, since we&#8217;ve got this nice mapping table, we need to know how to fill it up. If I&#8217;m following Robert Scoble (like the rest of Western Civilization), every time he writes something, it needs to make it to UserTweets associated with me.</p>
<p>Again, this is pretty easy, if you&#8217;ve got a handy loop. First, get all the followers:</p>
<blockquote>
<pre>SELECT FollowerID from Followers where UserID = SCOBLES_ID;</pre>
</blockquote>
<p>Then, loop through all those folks and post it:</p>
<blockquote>
<pre>INSERT INTO UserTweets (UserID, TweetID) values (FOLLOWER_ID, SCOBLES_TWEET_ID);</pre>
</blockquote>
<p>That loop will run as many times as it takes to update Scoble&#8217;s followers. None of these queries are complex. They&#8217;re all easily indexable. They should all be pretty fast. When Twitter is still in baby-state.</p>
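<p>As an aside, MySQL could fold the lookup and the loop into a single statement. This is just a sketch against my guessed schema, using the same placeholder IDs as above:</p>
<blockquote>
<pre>-- One statement instead of a SELECT plus a loop of INSERTs.
-- SCOBLES_ID and SCOBLES_TWEET_ID are placeholders, as in the queries above.
INSERT INTO UserTweets (UserID, TweetID)
SELECT FollowerID, SCOBLES_TWEET_ID
FROM Followers
WHERE UserID = SCOBLES_ID;</pre>
</blockquote>
<p>It saves the round trips, but it still writes one row per follower, so the total amount of work doesn&#8217;t shrink.</p>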
<p>But now Twitter is growing. It&#8217;s a toddler. It&#8217;s got many many users and some of them have loads of followers. Things are starting to slow down. Twitter&#8217;s parents look at it and say &#8220;Well, one obvious reason you&#8217;re starting to slow down is that the database is starting to lock. See, now that we&#8217;ve got some data, lookups are taking a little bit longer, and inserts are taking a little bit longer, and when they happen at the same time, we don&#8217;t want data to get out of sync, so the database locks up. When it locks up rapidly enough and often enough, we run out of threads and our database likes to go down. We&#8217;re going to teach you to read and write at the same time.&#8221;</p>
<p>And that&#8217;s what they do.</p>
<blockquote><p>We currently use one database for writes with multiple slaves for read queries. As many know, replication of MySQL is no easy task, so we&#8217;ve brought in MySQL experts to help us with that immediately. (<a href="http://blog.twitter.com/2008/05/its-not-rocket-science-but-its-our-work.html">blog.twitter.com</a>)</p></blockquote>
<p>Now we&#8217;ve got a master database, where all of our writes (inserts) go, and a couple of replicas where all of our reads (selects) go. Perfect. Things are fast again. Users are happy. Twitter is moving along.</p>
<p>Twitter grows into the tween stage. And it&#8217;s not pretty. Databases are constantly crashing. Things are slow. What the hell? Didn&#8217;t we just fix this?</p>
<p>Well, no. We just hid the problem. Replication isn&#8217;t a pretty solution. MySQL replication is flawed. It&#8217;s not instantaneous. It can fall behind. Further, it breaks. A lot. And when it breaks, we lose half of our read capacity, which then overloads the other server, and everything goes down. Resynchronizing can be painful. Adding more replicas helps, to a point (since each replica adds a little bit of overhead to the master).</p>
<p>Big users (those with lots of followers) cause more load, since now we&#8217;ve got to do a big lookup to get thousands of rows (followers) and then do thousands of inserts every time one of those users posts. That&#8217;s not good for our database.</p>
<p><em><strong>NOTE:</strong> The Twitter folks claim they don&#8217;t copy the tweet around for each follower. I&#8217;m sure they don&#8217;t. But they don&#8217;t say anything about copying the ID around (which, as I&#8217;ve stated, makes a good amount of sense if you were architecting Twitter 18 months ago).</em></p>
<blockquote><p><em>13:03: I ask about how Twitter&#8217;s engine works internally and I ask if Tweets are copied for each Twitter message. For instance, do my Tweets get copied 23,000 times? EV answers that the service does NOT do that. (<a href="http://scobleizer.com/2008/05/31/clearing-the-air-with-twitter/">scobleizer.com</a>)</em></p></blockquote>
<p><em>Also, it&#8217;s what Dare Obasanjo <a href="http://www.25hoursaday.com/weblog/2008/05/23/SomeThoughtsOnTwittersAvailabilityProblems.aspx">posits</a>, building on what <a href="http://assetbar.wordpress.com/2008/02/08/twitter-proxy-any-interest/">Israel says on the Assetbar blog</a>, and they&#8217;re both way smarter than me.</em></p>
<p>And you can&#8217;t really add a second master database. So writes are going to be as fast as our server can handle them.</p>
<p>This is where we are today (or at least recently). There are enough users on Twitter that updates are probably starting to slow, since the tables and indexes are so large that there&#8217;s just not much MySQL can do. Reads are slow because replication is fragile.</p>
<p><strong>What do we do? </strong><br />
Without completely re-architecting things, what can we do? There are a couple of things, right off the top of my novice brain:</p>
<ul>
<li>Cache, Cache, Cache</li>
<li>Make the database smaller</li>
</ul>
<p><em>Cache, Cache, Cache</em><br />
The less we have to go to the database, the better things perform, and the better they scale. There are lots of things we can cache, things which don&#8217;t update very often. For instance, the list of followers. If we stored the follower list in memory for each user (as it was loaded), we&#8217;d cut down on a query that gets run every time a user loads Twitter.</p>
<p>That&#8217;s *a lot* of queries.</p>
<p>It sounds like the folks at Twitter are already heading down this path.</p>
<blockquote><p>A: We&#8217;ve mitigated much of this issue by using memcached, as many sites do, to minimize our reliance on a database. (<a href="http://blog.twitter.com/2008/05/its-not-rocket-science-but-its-our-work.html">blog.twitter.com</a>)</p></blockquote>
<p><em>Make the database smaller</em><br />
It doesn&#8217;t sound like Twitter wants to try this. Basically, this is sharding. You take all the users with a UserID &lt; 100000 and put them in one database, all the users with a UserID &gt;= 100000 and &lt; 200000 and put them in a second database, and so on. Then you have a master lookup that lets you know where to find the data you&#8217;re looking for (which you can cache!).</p>
<p>This makes your selects and inserts fast again, since the tables and indexes are smaller. But now you&#8217;ve got to manage the multiple shards, adjust them when they get too big, and deal with the added layer of abstraction.</p>
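<p>A minimal sketch of what that master lookup could be, assuming a small table of UserID ranges. The ShardMap table and the db1/db2 host names are made up for illustration:</p>
<blockquote>
<pre>-- Hypothetical lookup table living on a small "master" database.
CREATE TABLE ShardMap (
  MinUserID INT NOT NULL,          -- inclusive lower bound
  MaxUserID INT NOT NULL,          -- inclusive upper bound
  ShardHost VARCHAR(50) NOT NULL,  -- which database holds these users
  PRIMARY KEY (MinUserID)
);

INSERT INTO ShardMap (MinUserID, MaxUserID, ShardHost) VALUES
  (0,      99999,  'db1'),
  (100000, 199999, 'db2');

-- Find the shard for one user, then send the real query to that host.
SELECT ShardHost FROM ShardMap WHERE ? BETWEEN MinUserID AND MaxUserID;</pre>
</blockquote>
<p>The map hardly ever changes, so it&#8217;s exactly the kind of thing you&#8217;d keep in a cache rather than query every time.</p>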
<p>It&#8217;d help, but it&#8217;s a big task.</p>
<p>If you&#8217;ve been reading my blog, you might be asking yourself:</p>
<p><strong>Why Does This Sound Familiar?</strong><br />
Why does this sound familiar? Because our unknownish web host ran into similar issues. See my post <a href="http://blog.ryantoohil.com/2008/02/thoughts-on-mysql-scalability-from-a-certified-mysql-moron.php">Thoughts on MySQL Scalability From a Certified MySQL Moron</a> from February.</p>
<p>All of the problems Twitter is facing seem to be directly related to building the service in a logical fashion, but not foreseeing the problems that massive growth would cause for the system (a single tweet spawning tens of thousands of database updates). Quite frankly, it&#8217;s completely reasonable. Frustrating, but reasonable. As Michael Kowalchik (founder of <a href="http://www.grazr.com">Grazr</a>) states:</p>
<blockquote><p>As a startup there&#8217;s only so much energy you have, and you must apportion your resources carefully. The truth is, we like to talk about scaling, but without steady growth and something people find compelling, all the scaling in the world won&#8217;t help you. (<a href="http://www.mathewingram.com/work/2008/06/01/twitter-and-the-importance-of-architecture/#comment-567723">mathewingram.com</a>)</p></blockquote>
<p>I&#8217;m curious about Twitter&#8217;s solutions both as a <a href="http://twitter.com/ryant">user</a> and as an engineery web dork at a company hitting the same problems. We&#8217;re starting to use memcached, looking at using smarter non-table locking databases, and thinking about utilizing the file system more than a database.</p>
<p>Twitter has some technological mountains to climb to be able to scale to support the rate at which they are growing. I think that, sooner or later, they&#8217;ll have to bite the bullet and use database shards. They&#8217;re also likely going to have to build out memcached clusters large enough to allow them to cache nearly everything about a user. That&#8217;s loads of data (gigs and gigs and gigs), but machines and memory are cheap for a company like Twitter, and the payoff will be worth it. I have no idea if Twitter&#8217;s growth is slowing during these performance issues, but throwing some funds behind a massive memcached cluster would be money well spent now, and would surely be useful even after any sort of re-architecture.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/06/hey-twitter-your-problems-are-my-problems.php/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Heh, Skitch is fun and phpBB definitely sucks balls</title>
		<link>http://blog.ryantoohil.com/2008/04/heh-skitch-is-fun-and-phpbb-definitely-sucks-balls.php</link>
		<comments>http://blog.ryantoohil.com/2008/04/heh-skitch-is-fun-and-phpbb-definitely-sucks-balls.php#comments</comments>
		<pubDate>Mon, 21 Apr 2008 02:14:44 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[Apple]]></category>
		<category><![CDATA[Apps]]></category>
		<category><![CDATA[Web hosting]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=268</guid>
		<description><![CDATA[I&#8217;ve meant to use Skitch for a while, and finally got around to using it today. It&#8217;s pretty cool. For instance, you can do a search for &#8220;phpbb is a piece&#8221; and find some fun links: Then, if you want, you can do some fun stuff to it. Like add some comments: Tada! Awesome. Takes [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve meant to use <a href="http://www.skitch.com">Skitch</a> for a while, and finally got around to using it today. It&#8217;s pretty cool. For instance, you can do a search for &#8220;<a href="http://www.google.com/search?hl=en&#038;safe=off&#038;client=firefox-a&#038;rls=org.mozilla%3Aen-US%3Aofficial&#038;hs=tpQ&#038;q=phpbb+is+a+piece&#038;btnG=Search">phpbb is a piece</a>&#8221; and find some fun links:</p>
<p><img src="http://www.ryantoohil.com/images/phpbb_is_a_piece_-_Google_Search_-_Mozilla_Firefox_3_Beta_5_%28Build_2008032619%29-20080420-221004.jpg" alt="google results" title="phpbb is a piece of crap" /></p>
<p>Then, if you want, you can do some fun stuff to it. Like add some comments:</p>
<p><img src="http://www.ryantoohil.com/images/phpbb_is_a_piece_-_Google_Search_-_Mozilla_Firefox_3_Beta_5_%28Build_2008032619%29-20080420-221346.jpg" alt="google results with some color" title="phpbb is a piece of feces" /></p>
<p>Tada!</p>
<p>Awesome. Takes about 2 seconds.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/04/heh-skitch-is-fun-and-phpbb-definitely-sucks-balls.php/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>phpBB is a Piece of Feces and May Be the Bane of My Existence</title>
		<link>http://blog.ryantoohil.com/2008/04/phpbb-is-a-piece-of-feces-and-may-be-the-bane-of-my-existance.php</link>
		<comments>http://blog.ryantoohil.com/2008/04/phpbb-is-a-piece-of-feces-and-may-be-the-bane-of-my-existance.php#comments</comments>
		<pubDate>Sun, 20 Apr 2008 20:20:46 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Web hosting]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=267</guid>
		<description><![CDATA[I haven&#8217;t been posting very much recently because I&#8217;ve sadly been working my tail off. I very much enjoy what I do, but there are just weeks (months &#8230; years &#8230;) where it&#8217;s just a non-stop grind to get everything done. Recently, it&#8217;s been working on launching a new VPS platform, but that was interrupted [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t been posting very much recently because I&#8217;ve sadly been working my tail off. I very much enjoy what I do, but there are just weeks (months &#8230; years &#8230;) where it&#8217;s just a non-stop grind to get everything done. Recently, it&#8217;s been working on launching a new VPS platform, but that was interrupted by a breakdown of our customer MySQL infrastructure.</p>
<p>Our setup is a bit different than most. Since we don&#8217;t run a typical box-by-box web hosting architecture, we don&#8217;t simply have a thousand boxes with each one running Apache and MySQL. Instead, we have a really robust pooled architecture for everything <strong>except</strong> MySQL, which just isn&#8217;t something that&#8217;s very <em>poolable</em>. For MySQL, we&#8217;ve got some big boxes with a bunch of memory and some fast disks that handle our MySQL load. But, slowly over time, performance had degraded.</p>
<p>When you&#8217;d hop onto a box and look at the transactions per second or number of queries, nothing looked terribly out of the ordinary. Yet the load would be huge, and performance would be pretty bad. Our team brought up some new boxes, shuffled customers between them to even the load out, moved our backup processing onto the hot spare replicated boxes (to reduce even more load on the disks) and things were better.</p>
<p>But they weren&#8217;t better enough. (I know, awesome English, eh?)</p>
<p>We started just watching the processlist, looking for the culprit. And after about 5 minutes, it was obvious.</p>
<p>Motherfrakking phpBB spam.</p>
<p>phpBB is written in a really shitty way. Not the forum part, necessarily, which works when it&#8217;s not being exploited. But the search part is awful. For every word in every post (unless you&#8217;ve got a smart list of words to ignore), it throws entries in some big tables so that when you search for &#8220;foobar&#8221;, it can tell you every post that contains that word. That&#8217;s a fine design for a small board with a tiny amount of traffic. But as your board grows, even legitimately, that table can become hundreds of thousands of rows long (or more!) and inserts and selects can become extremely slow.</p>
<p>It&#8217;s ten times worse when the only thing putting content into it is spammers who are just flooding it with huge wordlists multiple times per second. Now, all of a sudden, you&#8217;ve got this single board showing up in your processlist five times, with each entry running for 30, 40, 50 seconds. One of those boards can cause some extra load on a server.</p>
<p>When you&#8217;ve got ten or twenty, it can bring the server to a halt. Literally. I popped onto a server where the load was near <em>10</em>. I turned off 40 phpBB boards getting spammed. The load dropped to less than <em>1</em> and stayed there.</p>
<p>After some quick thinking, we identified a bunch of boards that were getting spammed and turned them off. One of our engineers built a brilliant little monitoring script that can identify phpBB boards in the processlist and shut them off if they show up at a high enough frequency with those awful queries (you know them when you see them, believe me). All told, we&#8217;ve turned off maybe 12k boards in the past 2 weeks, and haven&#8217;t heard a single complaint.</p>
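<p>This isn&#8217;t the actual script, but here&#8217;s a rough sketch of the kind of check something like it could run. It assumes a MySQL version that exposes the processlist as information_schema.PROCESSLIST, and the thresholds (30 seconds, five queries) are just the numbers I mentioned above, not tuned values:</p>
<blockquote>
<pre>-- Databases with several long-running queries at once are the suspects.
SELECT DB,
       COUNT(*)  AS running_queries,
       MAX(TIME) AS longest_seconds
FROM information_schema.PROCESSLIST
WHERE COMMAND = 'Query'
  AND TIME &gt;= 30
GROUP BY DB
HAVING COUNT(*) &gt;= 5
ORDER BY running_queries DESC;</pre>
</blockquote>
<p>A real version would also want to look at the query text to confirm it&#8217;s the phpBB search tables getting hammered before shutting anything off.</p>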
<p>Why? Because these boards were set up by users who then forgot about them. And there they sat, for months or years, collecting spam, draining resources. Basic negligence on the part of users caused a huge server load, which then caused those same customers to call in and complain.</p>
<p>It feels like we&#8217;ve got this mostly under control, except for the user side. We need to figure out a way to get people to realize that the things they install on their site can be exploited and lead to security issues (on their site), performance issues (for everyone), and can suck up the resources they pay for.</p>
<p>But yeah, it sucks when you work about an 80 hour week because people forget about their phpBB install, and the folks who wrote phpBB decided that they&#8217;d build the most stupidly designed search setup of all time.</p>
<p>So, when on April 8th my Twitter looked like this:<br />
<img src="http://www.ryantoohil.com/images/Twitter___ryant_-_Mozilla_Firefox_3_Beta_5_(Build_2008032619)-20080420-161832.jpg" alt="php is feces" title="php is feces" /></p>
<p>now you know why.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/04/phpbb-is-a-piece-of-feces-and-may-be-the-bane-of-my-existance.php/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Upgrading to WordPress 2.5</title>
		<link>http://blog.ryantoohil.com/2008/03/upgrading-to-wordpress-25.php</link>
		<comments>http://blog.ryantoohil.com/2008/03/upgrading-to-wordpress-25.php#comments</comments>
		<pubDate>Sun, 30 Mar 2008 00:36:06 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[Web hosting]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/?p=265</guid>
		<description><![CDATA[I figured I&#8217;d log what I did when I upgraded my blog to WordPress 2.5. First, I disabled some plugins I figured I wouldn&#8217;t necessarily need post-upgrade. The two I disabled were Kramer, which grabs Technorati links back to the blog (newly built into the WordPress dashboard) and SpotMilk, a customized dashboard (which I wasn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>I figured I&#8217;d log what I did when I upgraded my blog to <a href="http://wordpress.org/">WordPress 2.5</a>.</p>
<p>First, I disabled some plugins I figured I wouldn&#8217;t necessarily need post-upgrade. The two I disabled were Kramer, which grabs Technorati links back to the blog (newly built into the WordPress dashboard) and SpotMilk, a customized dashboard (which I wasn&#8217;t even sure would work).</p>
<p>Then I upgraded.</p>
<p>So far, so good.</p>
<p>Poking around the settings, I decided to turn on the global Gravatar usage, rather than using the Gravatar plugin. That&#8217;s a great idea, except my theme doesn&#8217;t come with Gravatar support, so I&#8217;ll need to use the built-in functions.</p>
<p>Then my MacBook crashed for the second time today (I think it&#8217;s Twitterrific, but we&#8217;ll see). Awesomely, <a href="http://www.red-sweater.com/marsedit/">MarsEdit</a> earned its keep by having autosaved my work. So back to it.</p>
<p>After poking around, I got the built-in functionality to work, but since it returns an entire image tag, and not just the URL to the avatar image, it&#8217;s actually less useful to me than the plugin is. I turned the plugin back on. Good enough.</p>
<p>Next, I noticed the <a href="http://mowser.com/">Mowser</a> plugin had a new version. Perfect chance to try the new built-in plugin updating. Clicked the link and that was pretty much it &#8212; the plugin was up-to-date. Nifty. You can see the <a href="http://mowser.com/web/blog.ryantoohil.com">Mowser-fied version of my site here</a>. Not perfect, but pretty good work from a <a href="http://www.russellbeattie.com/blog/">one</a> &#8230; <a href="http://www.thisismobility.com/blog">two</a> person company.</p>
<p>Took this as the opportunity to clean up my plugins page. Gone are the aforementioned <a href="http://dev.wp-plugins.org/wiki/Kramer">Kramer</a> and <a href="http://www.ceprix.net/archives/spotmilk-admin-theme-for-wordpress/">SpotMilk</a>, along with the Hello Dolly and WP-flv plugin I&#8217;d installed a while ago.</p>
<p>Now, I wanted to turn some of my hard-coded plugin links into widget usage, to make switching themes easier. I started adding widgets to my left sidebar, expecting that I&#8217;d need to go disable them in the code. Nope! Nice, it must use a different bit of sidebar code when you use widgets. Very cool. This allows me to dump a couple more plugins (<a href="http://www.jimmyoliver.net/mynetflix-plugin/">MyNetflix</a> and a <a href="http://sevennine.net/projects/wp-audioscrobbler/">Last.fm</a> one). </p>
<p>Also, turn off WP-Cache when you&#8217;re testing, or you&#8217;ll be annoyed out of your mind.</p>
<p>One missing widget: I was previously using the Google Shared Items widget, but now I&#8217;ll just use the RSS feed for it. Let&#8217;s see how that looks &#8230; ugly. But, good enough for now. Maybe there&#8217;s a WordPress widget for it. Wow, I&#8217;m digging the widgets. They make my life a whole lot easier. I should have tried this a long time ago. Even added a little About Me text widget.</p>
<p>Turning WP-Cache back on.</p>
<p>Finally, testing to see if MarsEdit can still post &#8230; huzzah! Success. And with that, I&#8217;m done.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/03/upgrading-to-wordpress-25.php/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How I Get Stuff Done</title>
		<link>http://blog.ryantoohil.com/2008/03/how-i-get-stuff-done.php</link>
		<comments>http://blog.ryantoohil.com/2008/03/how-i-get-stuff-done.php#comments</comments>
		<pubDate>Sun, 02 Mar 2008 23:45:22 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[Apple]]></category>
		<category><![CDATA[Mac]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[Web hosting]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/2008/03/how-i-get-stuff-done.php</guid>
		<description><![CDATA[I&#8217;ve been meaning for a while to sort of document how I get stuff done at work. It was just over a year ago that I bought my MacBook Pro. Within a week or so, I started using it at work. Probably within the first month, I&#8217;d completely moved to my MacBook as my sole [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been meaning for a while to sort of document how I get stuff done at work. It was just over a year ago that <a href="http://blog.ryantoohil.com/2007/02/the-much-awaited-stuff.php">I bought my MacBook Pro</a>. Within a week or so, I started using it at work. Probably within the first month, I&#8217;d completely moved to my MacBook as my sole work machine. After a year, and particularly since the upgrade to Leopard, I&#8217;ve kind of worked out how I get stuff done.</p>
<p>Let&#8217;s start with my environment.</p>
<p><a href="http://www.flickr.com/photos/75975029@N00/2276505444" title="View 'IMG_0337' on Flickr.com"><img src="http://farm3.static.flickr.com/2203/2276505444_480a1943f6.jpg" alt="IMG_0337" border="0" width="500" height="375" /></a><br />
The front wall of my office with pictures taken by a former co-worker. And, of course, the famous &#8220;Dwight&#8221; flasher flyer from &#8220;The Office&#8221;</p>
<p><a href="http://www.flickr.com/photos/75975029@N00/2275708813" title="View 'IMG_0334' on Flickr.com"><img src="http://farm3.static.flickr.com/2132/2275708813_889db4bfe7.jpg" alt="IMG_0334" border="0" width="500" height="375" /></a><br />
The shelf behind me containing random stuff I&#8217;ve gotten from eating kid&#8217;s meals and ice cream sundaes. And some Yankee Swap gifts. Oh, and I have some windows. That makes my life nicer.</p>
<p><a href="http://www.flickr.com/photos/75975029@N00/2276496998" title="View 'IMG_0332' on Flickr.com"><img src="http://farm3.static.flickr.com/2148/2276496998_0c9fbbeeaa.jpg" alt="IMG_0332" border="0" width="500" height="375" /></a><br />
My white board and busted ass bookshelf. And my cool VT light switch cover from Matt, and some random stuff I&#8217;ve collected and hung up.</p>
<p><a href="http://www.flickr.com/photos/75975029@N00/2275707195" title="View 'IMG_0333' on Flickr.com"><img src="http://farm3.static.flickr.com/2073/2275707195_b97b648668.jpg" alt="IMG_0333" border="0" width="500" height="375" /></a><br />
The view of where I sit. I used to use that big ass monitor to do a dual-monitor display, but since I&#8217;ve been moving around so much each day, now it&#8217;s just there to keep people from having a good look at me.</p>
<p><a href="http://www.flickr.com/photos/75975029@N00/2276503038" title="View 'IMG_0335' on Flickr.com"><img src="http://farm3.static.flickr.com/2252/2276503038_640e9c618c.jpg" alt="IMG_0335" border="0" width="500" height="375" /></a><br />
Finally, the MacBook Pro, my Motorola Q, my 30GB 5G IPod, my noise canceling head phones, and my phone that I don&#8217;t ever answer or use. And yeah, that&#8217;s Win2k running in Parallels. More on that in a bit.</p>
<p>So that&#8217;s where I do my work. </p>
<p>My Mac is set up in a very particular way. The upgrade to Leopard with Spaces has made my life considerably easier. It&#8217;s probably easiest to roll through how my Spaces are set up.</p>
<p><strong>Space 1</strong><br />
This is where I use my browser, which is currently <a href="http://www.mozilla.com/en-US/firefox/all-beta.html">Firefox 3 Beta 3</a>, and sometimes <a href="http://www.apple.com/safari/">Safari 3</a>.</p>
<p>Also running on this space are my &#8220;chat&#8221; clients. We use Jabber at work, which works nicely with <a href="http://www.adiumx.com/">Adium</a>. I&#8217;ve also got <a href="http://iconfactory.com/software/twitterrific">Twitterrific</a> running on this space.</p>
<p><strong>Space 2</strong><br />
Here&#8217;s where my Terminal lives, which is just the default Leopard terminal. Tabbed Terminals make me happy, particularly once I made the default tab switching hot keys to be Command+Left and Command+Right.</p>
<p>It&#8217;s all command line and vim and mysql. Good times.</p>
<p><strong>Space 3</strong><br />
Space 3 is where iCal and Mail live. Mail is just downloading my mail from Gmail. iCal is doing some cool stuff. I have most of my life in Google Calendar. iCal subscribes to my calendar feeds from GCal (including my work Outlook calendar&#8211;more on that in a second).</p>
<p>With all of my stuff in iCal, I then use the <a href="http://www.markspace.com/missingsync_windowsmobile.php">Missing Sync for Windows Mobile</a> to sync my calendars to the previously mentioned Motorola Q (which also connects to my work Exchange server, so it&#8217;s almost as good as a Blackberry).</p>
<p><strong>Space 4</strong><br />
It&#8217;s the Windows space! I&#8217;ve got Win2k (don&#8217;t ask, I had a license lying around) running in <a href="http://www.parallels.com/en/products/desktop/">Parallels</a>, in Full Screen mode. Parallels runs pretty much just so I can run Outlook (for work email and calendaring) and so I can occasionally test stuff in IE6.</p>
<p>My Outlook runs a plugin called <a href="http://syncmycal.com/">SyncMyCal</a> to sync my Outlook calendar off to Google Calendar (which then gets sync&#8217;d down to iCal, as previously described).</p>
<p>Other software that occasionally comes in handy:</p>
<ul>
<li><a href="http://www.neooffice.org">NeoOffice</a> (though it&#8217;s slow and bulky and I&#8217;d switch if there was a viable alternative)</li>
<li><a href="http://www.apple.com/itunes/">iTunes</a> (obvs)</li>
<li><a href="http://www.red-sweater.com/marsedit/">MarsEdit</a> (for doing this sort of stuff)</li>
</ul>
<p>That&#8217;s how I get my work done. Anything else I should be using?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/03/how-i-get-stuff-done.php/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on MySQL Scalability From a Certified MySQL Moron</title>
		<link>http://blog.ryantoohil.com/2008/02/thoughts-on-mysql-scalability-from-a-certified-mysql-moron.php</link>
		<comments>http://blog.ryantoohil.com/2008/02/thoughts-on-mysql-scalability-from-a-certified-mysql-moron.php#comments</comments>
		<pubDate>Wed, 13 Feb 2008 22:43:35 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Web hosting]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/2008/02/thoughts-on-mysql-scalability-from-a-certified-mysql-moron.php</guid>
		<description><![CDATA[While I sit here and wait for myisamchk to finish and tell me that the table that various folks have spent the better part of the day trying to restore is either healthy or once again dead (how&#8217;s that for a run-on sentence), I wanted to dump out some of the things we&#8217;ve done to [...]]]></description>
			<content:encoded><![CDATA[<p>While I sit here and wait for myisamchk to finish and tell me that the table that various folks have spent the better part of the day trying to restore is either healthy or once again dead (how&#8217;s that for a run-on sentence), I wanted to dump out some of the things we&#8217;ve done to try to make our MySQL backend scale. It&#8217;s not been pretty, but given that it&#8217;s strung together with some Perl, some MySQL, and a bunch of paper clips, I think the folks around me have proven themselves brilliant (I just sit around and pretend to know what&#8217;s going on).</p>
<p>Oh, and this is all without a cluster. That&#8217;s probably the next step. And &#8220;Oh: part 2,&#8221; myisamchk finished checking 28 million rows. All is good. I&#8217;ve copied the data over to the main server and brought it up and everything is happy. Back to MySQL scalability &#8230;</p>
<p><strong>First issue:</strong> <em>We&#8217;ve got too much data</em></p>
<p>This one was easy to solve. We got rid of it. Sort of. We started archiving off data that we no longer needed chronologically. Old support incidents, logging, anything that had a timestamp that we aren&#8217;t looking at gets sliced off into an archive so that the main tables can be as tidy and fast as we can get them. Which for us is like 8GB of data and not fast at all. But it&#8217;s better than 12GB of data.</p>
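<p>A minimal sketch of that kind of archiving, assuming a hypothetical <code>incidents</code> table with a <code>created_at</code> column and using Python&#8217;s built-in SQLite as a stand-in for MySQL. The table, column, and function names are invented for the example.</p>

```python
import sqlite3


def archive_old_rows(conn, cutoff):
    """Move incidents created before cutoff into incidents_archive."""
    with conn:  # one transaction: copy the old rows, then delete them
        conn.execute(
            "INSERT INTO incidents_archive "
            "SELECT * FROM incidents WHERE ? > created_at",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM incidents WHERE ? > created_at", (cutoff,)
        ).rowcount
    return moved


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidents (id INTEGER PRIMARY KEY, created_at TEXT, note TEXT)")
conn.execute("CREATE TABLE incidents_archive (id INTEGER PRIMARY KEY, created_at TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO incidents (created_at, note) VALUES (?, ?)",
    [("2006-01-15", "old"), ("2007-11-02", "old"), ("2008-02-01", "recent")],
)
conn.commit()

moved = archive_old_rows(conn, "2008-01-01")
print(moved)  # 2
```

<p>Doing the copy and the delete in one transaction means a failure halfway through leaves the main table untouched.</p>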
<p><strong>Second Issue:</strong> <em>We&#8217;ve got too many connections</em></p>
<p>When you&#8217;re small(ish), it makes sense to throw a bunch of dbs on the same server. As you grow, those connections start to swamp MySQL. MySQL starts to get all panicked, and it doesn&#8217;t know how to handle all of the people asking for data, so it starts to get sloppy about closing old handles. Then it&#8217;s basically like thermal runaway in a transistor. The server can&#8217;t close old connections, new ones open up, adding more overhead, and all of a sudden your nice server has 5000 open connections and is hosed. Again, this was a pretty easy one to fix. Bring up a new box, move some databases to it, and hope that you&#8217;ve built your code layer to make that swap pretty easy (ours was). Presto. Now both of your servers are happy.</p>
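<p>What &#8220;making that swap easy&#8221; can look like, as a rough sketch: keep the database-to-server mapping in one place so moving a database is a one-line change. The host and database names here are hypothetical, and this is not a description of our actual code.</p>

```python
# Hypothetical host and database names, for illustration only
DEFAULT_HOST = "db1.example.internal"
DB_HOSTS = {
    "support": "db2.example.internal",
    "logging": "db2.example.internal",
}


def host_for(database):
    """Return the server that currently holds the given database."""
    return DB_HOSTS.get(database, DEFAULT_HOST)


print(host_for("support"))  # db2.example.internal
print(host_for("billing"))  # db1.example.internal
```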
<p><strong>Third Issue:</strong> <em>We lock up the damn tables all the time</em></p>
<p>We&#8217;ve got a lot of customers who are constantly accessing their sites. We&#8217;ve got nearly 1000 support agents across the globe who are using our tools to look at customer configuration to make sure there&#8217;s no issues. This puts a whole chunk of load and repetitive queries on the database. That&#8217;s easily handled.</p>
<p>Except when you add in a bunch of data updates. Agents, customers, new signups adding and editing data in the database. All of a sudden those hundred pending SELECT statements are stuck because one big select locked the data when an UPDATE came in. Now you&#8217;ve got a bunch of web users who think your stuff is slow and/or broken. We&#8217;ve tried to attack this in a few ways:</p>
<ol>
<li>Fix your queries &#8212; We watch our slow queries and try to make them faster. We look at our most often called queries and try to make them faster. Sounds simple, but bad queries are the biggest cause of problems.</li>
<li>Add indexes &#8212; This goes with &#8220;fixing&#8221; queries. Add indexes and make sure your queries use them.</li>
<li>Perform fewer queries &#8212; Can you cache your data? Can you make fewer queries and do more in your language of choice? Can you make your users smarter (maybe without them knowing) about when they need to request data? Do it. The fewer queries you have, the less likely you are to lock things up.</li>
<li>Split your reads and your writes &#8212; If you can split your reads and writes at the code layer, then you can shuttle reads off to one (or more) boxes and writes off to the primary box, and you should lock up a lot less. We accomplished this by having a couple of boxes replicate the main database, and having one of our smart engineers subclass Perl&#8217;s DBI to look for SELECT statements and swap the database handle over to the read replica. It helps more than you might think (but it&#8217;s not a silver bullet).</li>
</ol>
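<p>The read/write split in the last item can be sketched roughly like this. It&#8217;s in Python rather than Perl, the class and method names are hypothetical, and plain strings stand in for real database handles, so treat it as an illustration of the idea and not the DBI subclass itself.</p>

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs to read replicas and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none
        self._replicas = itertools.cycle(replicas or [primary])

    def handle_for(self, sql):
        """Pick a handle based on the first keyword of the statement."""
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica1", "replica2"])
print(router.handle_for("SELECT * FROM customers"))      # replica1
print(router.handle_for("UPDATE customers SET plan=2"))  # primary
```

<p>One caveat with any scheme like this: replicas can lag behind the primary, so a read that has to see a write that just happened probably still belongs on the primary.</p>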
<p>Most of this sounds like common sense. It is. But it still matters. We&#8217;re trying to do a lot with a little, and every ounce of performance you can squeeze out matters, when your users are super demanding and will use any slowness as an excuse.</p>
<p>There are some other things we should and probably will try:</p>
<ul>
<li>Denormalization to bring data back together and cut down on costly joins</li>
<li>Sharding to split our data up into smaller chunks and thus cut down on long table scans and huge indexes</li>
<li>A real MySQL cluster to optimize reads and writes and spread traffic out to many nodes</li>
</ul>
<p>I wish I knew more. I&#8217;m still barely up the curve compared to some of the engineers and admins I work with. Thankfully, they&#8217;ve been able to keep our many million row tables (and many GB data and index files) humming along with few interruptions.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2008/02/thoughts-on-mysql-scalability-from-a-certified-mysql-moron.php/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Affiliate Marketing is the Drizzling Shits of the Internet</title>
		<link>http://blog.ryantoohil.com/2007/12/affiliate-marketing-is-the-drizzling-shits-of-the-internet.php</link>
		<comments>http://blog.ryantoohil.com/2007/12/affiliate-marketing-is-the-drizzling-shits-of-the-internet.php#comments</comments>
		<pubDate>Fri, 28 Dec 2007 03:16:27 +0000</pubDate>
		<dc:creator>Ryan Toohil</dc:creator>
				<category><![CDATA[Advertising]]></category>
		<category><![CDATA[Affiliate]]></category>
		<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Web hosting]]></category>

		<guid isPermaLink="false">http://blog.ryantoohil.com/2007/12/affiliate-marketing-is-the-drizzling-shits-of-the-internet.php</guid>
		<description><![CDATA[Preface: These comments are mine and not those of my employer or anyone else. Ok, maybe they also represent the voices in my head. Advertising on the Web There&#8217;s lots of advertising on the web. The biggest web company in the world (Google, ever heard of them?) generates pretty much all of their revenue from [...]]]></description>
			<content:encoded><![CDATA[<p><em>Preface: These comments are mine and not those of my employer or anyone else. Ok, maybe they also represent the voices in my head.<br />
</em></p>
<p><strong>Advertising on the Web</strong></p>
<p>There&#8217;s lots of advertising on the web. The biggest web company in the world (Google, ever heard of them?) generates pretty much all of their revenue from text (and now some image ads) based on the context of your web searches, email, or web page content. Big media sites like ESPN or ABC or NBC generate some revenue and awareness through the old late 90s staple of the banner ad. Blogs, podcasts, and video sites get in on the action with pre/post-roll ads, typical interstitials, and sponsors. Tons and tons of ways for sites to generate some income, but they&#8217;re all pretty much based on getting a large number of eyeballs.</p>
<p>With the recent growth of blogs, forums, and just the general smaller sites run by individuals rather than corporations, folks have wanted to cash in on some of that free internet money. But banner ads and AdSense cash really don&#8217;t work too well unless you get loads of traffic. Now, granted, there&#8217;s tons of ways to do that (which is why a bunch of the junk that fills up sites like Digg and Reddit these days are obvious <a href="http://en.wikipedia.org/wiki/Linkbait#Link_bait">linkbait</a> bullshit attempts to generate lots of traffic), but the internet in general&#8211;through better algorithms and crowdsourcing and such&#8211;has gotten pretty good at weeding those things out. Besides, only a few linkbaiting attempts can work in a given time period, so even this is not a surefire way to get yourself that sack with a dollar sign on it.</p>
<p><strong>Affiliate Marketing: Good<br />
</strong></p>
<p>This gave way to a new niche: affiliate marketing. In all honesty, this started a while back with Amazon. And, to this day, Amazon&#8217;s method isn&#8217;t really bullshit. Amazon&#8217;s affiliate system is basically a way for people who review stuff or talk about different things to link to those items on Amazon and get a kickback if someone purchases through their link. Everybody wins in this situation; the buyer gets an object they wanted, the site owner gets a few bucks back for setting up Amazon and the buyer, and Amazon gets to sell an object.</p>
<p>Plus Amazon gets some search engine love from having lots of sites link to them.</p>
<p>I&#8217;ve used the Amazon affiliate system when doing my not-very-often-updated podcast. I&#8217;ve never made a dime, but that&#8217;s because I don&#8217;t do it very often and I&#8217;m not exactly linking to highly sought after stuff. But disclosure is important, because the lack of disclosure is a big reason that affiliate marketing is currently the &lt;insert your horrific disease here&gt; of the interweb.</p>
<p><strong>Affiliate Marketing: Bad </strong></p>
<p>Somewhere along the way, the affiliate stuff got bastardized. It was, of course, inevitable. We live in a world of pyramid and get rich quick schemes broadcast in half-hour increments on late night TV. But the internet&#8217;s version is far more nefarious. The BS seemed to start in earnest with <a href="http://www.bzzagent.com/" title="BzzAgent" rel="nofollow">BzzAgent</a>, a Boston-based marketing company that paid &#8220;agents&#8221; to go around talking up products&#8211;products that the agency had been paid to promote. There was no disclosure; &#8220;Hey, I&#8217;ve never actually tried this product! I&#8217;m getting PAID to tell you that I think it&#8217;s awesome!&#8221; Of course, BzzAgent <a href="http://www.nytimes.com/2004/12/05/magazine/05BUZZ.html?ei=5090&amp;en=6dc3f3878659a642&amp;ex=1259989200&amp;partner=rssuserland&amp;pagewanted=all">took</a> <a href="http://brandautopsy.typepad.com/brandautopsy/2005/01/more_bad_bzz_fo.html">some</a> <a href="http://many.corante.com/archives/2005/05/03/bzzagent_and_creative_commons_a_cultural_chasm.php">heat</a> for their arguably deceptive marketing. It was not intentionally deceptive, but they were implicitly offering people incentives to be deceptive. After some bad press and some backlash, <a href="http://www.bzzagent.com/pages/Page.do?page=Code_of_Conduct" rel="nofollow">BzzAgent claims to be all about the disclosure</a>.</p>
<p><strong>Affiliate Marketing: Worse </strong></p>
<p>Similarly, a company called <a href="http://payperpost.com" title="PayPerPost is Evil" rel="nofollow">PayPerPost</a> sprang up. Here&#8217;s a group that will pay you to write about a product, service, or other site, right on your own website! You write up a few paragraphs, throw in a few links, and you make some money. The sponsoring company gets some search engine juice and some good word of mouth. PayPerPost gets some money for bringing the two parties together.</p>
<p>Not so different from the Amazon model, right? Sure, except that there was no required disclosure that the post was, basically, just a paid advertisement. Posts from PayPerPost folks weren&#8217;t required to be tagged as advertising, the way that those fake magazine articles are. The writer never even needed to try or use the product they were writing about. It was obvious to everyone what was going on: blatant link buying in an attempt to game the search engines.</p>
<p>(For a more complete story on PayPerPost, try <a href="http://www.techcrunch.com/?s=payperpost" title="TechCrunch on PayPerPost">TechCrunch</a>.)</p>
<p>So what&#8217;s the difference between this and the Amazon model? The end-user. The buyer. They&#8217;re getting hosed in that they&#8217;re just the commodity being traded in the middle. Taking someone to Amazon, where they&#8217;re then exposed to any number of other reviews for the product in question (which, by the way, is almost always a consumer product that&#8217;s a tangible good) is incredibly different than linking them to a web hosting company (more on that later) or some other digital good that is not quite as easily identified as something a user does or doesn&#8217;t want.</p>
<p>Particularly when all of the reviews for said product are paid for, and thus biased, by the aforementioned affiliate system.</p>
<p>So BzzAgent and PayPerPost started paying people to write about products and services, without a requirement of disclosure, and in many cases, without actually even trying out the product they were promoting. Sure, they were just trading on their online identity&#8211;burn people enough and your opinions are worthless. That is, of course, unless you can create endless domains and identities. The two companies were rightfully shat upon by the honest folks on the web. Both have started to talk about honest disclosure and transparency in an attempt to stay relevant and to ensure their clients don&#8217;t run away for fear of being painted with the same dishonest brush.</p>
<p>It hasn&#8217;t worked for PayPerPost, whose <a href="http://blog.wired.com/business/2007/11/payperpost-user.html" title="Bye PayPerPost">business was rightfully crippled by Google</a> when Google basically dropped the rank of any site found to be working with PayPerPost. If PayPerPost&#8217;s business were as honest as they claim it is, this wouldn&#8217;t have mattered. The paying companies would still be lining up to get reviews and links. They&#8217;re not. They wanted search juice. And that&#8217;s not for sale, well, not through PayPerPost, at least.</p>
<p><strong>Affiliate Marketing: The Drizzling Shits</strong></p>
<p>Which brings me to my biggest pet peeve, and the one that hits closest to home: bullshit web hosting review sites. There&#8217;s tons of them. They claim to review web hosts. They don&#8217;t. They rank sites based on who pays them the most money per hosting sign up. It&#8217;s, quite frankly, a pox on the hosting industry. Each web host offers the affiliate a little more money. In return, the affiliate gives them good links for SEO, some traffic and new sign ups, and a couple of web STDs.</p>
<p>From some of our internal research, somewhere around 20% of all sign ups that come through affiliates are fraudulent. Most of them have a life span significantly shorter than a typical sign up. Many of the sign ups that make it through the front end fraud checks are still BS accounts. They sign up, collect the affiliate fee, and then cancel. With most web hosts, if you did that 10 times a day, you&#8217;d make in the neighborhood of $100k a year.</p>
<p>I&#8217;m not kidding.</p>
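<p>To show the back-of-the-envelope math: working backwards from those two figures (10 sign ups a day, roughly $100k a year) gives the per-sign-up payout they imply. The payout is derived here, not a quoted rate from any host.</p>

```python
signups_per_day = 10
yearly_take = 100_000  # dollars, the rough figure quoted above

signups_per_year = signups_per_day * 365
implied_bounty = yearly_take / signups_per_year  # dollars per sign up

print(f"{signups_per_year} sign ups a year, about ${implied_bounty:.2f} each")
# 3650 sign ups a year, about $27.40 each
```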
<p>Why is this so bad? Again, it comes down to disclosure. None of these sites reveal they&#8217;re doing this for pay. Most of them layer some arbitrary, made up review score on top of their listings, depending on which host is paying the most that month. The affiliate doesn&#8217;t care that it&#8217;s slimy&#8211;they&#8217;re getting paid. The web host  doesn&#8217;t care that it&#8217;s slimy&#8211;they&#8217;re getting new &#8220;real&#8221; hosting accounts. Who cares? The actual honest person who did hit Google or their search engine of choice to look for a web host to open a blog or a place to host their pictures of their grandkids. They find a review site, sign up with the top rated host (&#8220;oh my, this host is rated the top on ten different sites!&#8221;), and then find out it&#8217;s a completely crappy host. The poor grandma doesn&#8217;t realize that ten different review sites were all run by the same person/group/company. She didn&#8217;t realize that the top host was paying these affiliates so much because their service is so bad they&#8217;re hemorrhaging customers.</p>
<p>I&#8217;ll admit, my company pays affiliates. Slimy ones, at that. We&#8217;re not hemorrhaging customers. We&#8217;ve actually stepped up our game, I think, and have started to deliver a better hosting experience for most of our customers. But growing organically by word of mouth isn&#8217;t good enough for us, so we put on the full body web condom and deal with the underbelly of the internet.</p>
<p>It&#8217;s disgusting and immoral and we shouldn&#8217;t do it. Many of us have made that case. But, unfortunately, the dollars trump us. So we build in workarounds and special rules to pay off certain affiliates to make sure they get the conversions they want so they&#8217;ll keep sending us traffic. And keep linking to us.</p>
<p>Affiliate marketing isn&#8217;t inherently bad. But, as with anything, when you mix it with the internet, it ends up being more bad than good. It&#8217;s the drizzling shits of the internet.</p>
<p>Soon enough, Google will step up and kill this trend. And it&#8217;ll be a great day when we can focus on stuff that matters and not spend thousands of man-hours building search algorithms to weed out fake sites, building fraud detection to weed out the fake sign ups, and trying to convince ourselves that just because other folks are doing it, we need to do it to keep up.</p>
<p>Yuck.</p>
<p><strong>Save Us Obi-wan </strong></p>
<blockquote><p>Dear <a href="http://www.mattcutts.com/blog/" title="Matt Cutts, Spam Savior">Matt Cutts</a>,</p>
<p>Can Google please <a href="http://www.mattcutts.com/blog/how-to-report-paid-links/" title="Report Paid Links">fix the fake review sites</a>? It would be awesome.</p>
<p>Thanks.</p>
<p>Your pal,<br />
The Interweb</p></blockquote>
<p>Affiliate marketing is everywhere now. Google it. You&#8217;ll find hundreds of blogs devoted to how to get a spammy, content-less site ranked high in the search results, get people to click your links to generate conversions, and how to basically make money being dishonest. Granted, all marketing is somewhat dishonest&#8211;promote the good stuff, hide the bad stuff. But when it&#8217;s a first party doing it, you know to take what they say with a grain of salt (which is why good companies are transparent and talk about their occasional foibles &#8230; it makes the marketing spin look less spinny). When a supposed neutral third party is hiding the fact that they&#8217;re making money off of their &#8220;review,&#8221; it&#8217;s not easy to discern that. It&#8217;s ugly and stupid and dishonest. And it makes people loads of money.</p>
<p>Again, it&#8217;s why affiliate marketing is the drizzling shits.</p>
<p><strong>Examples</strong></p>
<p>Just in case you&#8217;re wondering, here&#8217;s what a bullshit review site looks like. I shouldn&#8217;t claim this to be authoritative. I don&#8217;t know with 100% certainty that these are fake review sites. But they fit the mold. They cloak their affiliate links, bring you over to the web host with an affiliate cookie, and have surprisingly similar reviews. I won&#8217;t link to them, but you can paste them into your address bar.</p>
<p>http://www.best-webhosting2007.com/</p>
<p>http://www.web-hosting-review.toptenreviews.com/</p>
<p>http://www.web-hosting-reviews.org/</p>
<p>http://www.web-hosting-top.com/</p>
<p>http://www.webhostingtoplist.com/</p>
<p>http://www.webhostingfever.com/</p>
<p>http://www.websitehostingreviews.com/</p>
<p>http://www.100best-free-web-space.com/</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ryantoohil.com/2007/12/affiliate-marketing-is-the-drizzling-shits-of-the-internet.php/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
