You think you want a distributed Twitter, but you don’t
Introduction
Anyone who’s used Twitter or follows Web 2.0 knows about Twitter’s problems. They became a victim of their own success when they became the new “it” thing, gaining hundreds of thousands of new users, but were unable to scale adequately to keep up the level of service. Since then, the “Fail Whale” graphic they use to show users that they’re having problems has become a famous inside joke amongst users and pundits about the problems Twitter and other Web 2.0 platforms are running into, including my own Hunting the Fail Whale post (buy a shirt). Since the problems started, numerous suggestions have been put forth for how Twitter can scale their service, both plausible and implausible, well thought out and not. One particular idea that’s caught hold recently is decentralizing and distributing the Twitter service, much like email servers are run by all ISPs. The technology being put forward to support this is a new extension to the Jabber XMPP instant messaging spec (the standard used by Google’s GTalk), Microblogging over XMPP. You may think this is a great idea, and that you want a distributed Twitter-like service, but you don’t, and I’ll tell you why. But first let’s look at the advantages of a distributed Twitter…
Advantages of a Distributed Twitter
Twitter goes down. It’s almost Twitter’s defining characteristic at this point. For people addicted to Twitter for communicating and keeping in touch with friends, it’s a frustrating but all too frequent occurrence to see the Fail Whale mocking them from their screen. Twitter has become the most famous single point of failure in software history. The problem Twitter has is that it can’t scale up by simply throwing more hardware at it. The reasons why are the topic of another post, but the fact is that as the number of messages going through Twitter goes up, it reaches a breaking point where it can no longer serve those messages. They can move that point up or down by disabling and re-enabling some resource-intensive services, but the breaking point remains.
The obvious solution to this problem is to spread out the load and let other sites handle part of it. In a decentralized system like email, any one email server going down does not affect the rest of the servers, and email is still delivered for everyone else. Each mail server can scale its own box or boxes to match its load, and no one mail host has to handle the entire load.
A distributed microblogging infrastructure like this could work, eventually, but is it the right direction?
The problem with obvious solutions is that they copy old models, whether they’re a good fit or not, and they ignore the non-obvious, innovative ones…
Disadvantages of a Distributed Twitter
Why aren’t we just using email and mailing lists if this model is a good fit? The problems with a distributed microblogging system are the same as the problems with email. While there’s no single point of failure, there are many points of partial failure. The lag time of a distributed system makes email less realtime and gives a different feel to the communication. But these are minor points compared to the real issues.
Innovation
What was the last innovation in email? I’m not talking about email clients or contact management, I’m talking about the email infrastructure. What was it? IMAP? How long ago was that? Email is built on a set of standards, and is implemented by a number of different servers, clients, etc. All this makes innovation in the email space move at a glacial pace. Any change has to go through a standards body, which immediately makes it a long-term process, and then it has to go through an adoption and implementation cycle through servers and clients which may take years as not all servers will feel the need to upgrade immediately.
Microblogging’s potential for innovation has not been mined out yet. It’s barely scratched the surface, and been stalled for over a year by Twitter’s technical issues. Standardizing and distributing microblogging would kill innovation in the space and we’d be left with exactly what we have now and nothing more. That would be a shame, as there are several new ideas I want to implement in microblogging.
Rogue Nodes and Users
In fact, the most innovation in email has been around identifying and correcting misbehaving users and servers, i.e. spam filtering. Distributed systems are fundamentally susceptible to rogue elements using the infrastructure for their own purposes. Think of how much spam you get in your email inbox. Think of how many spams you don’t get on Twitter. An open, distributed system will be open for spammers to set up their own servers and ignore normal subscription rules to send messages directly to users. It will be the spam wars all over again, with the spammers working to stay one step ahead of the filters. With a centralized system, there is much more control over subscriptions, and they cannot be gamed from the outside.
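To make the difference concrete, here is a rough sketch of the two delivery models. It is only an illustration under simplifying assumptions: the class and method names (CentralService, FederatedInbox, receive, and so on) are made up for this post and do not describe Twitter’s code or any real XMPP server. The point is just who controls the subscriber list.

```python
# Hypothetical sketch: names and structure are assumptions for illustration,
# not Twitter's or any XMPP server's actual implementation.

class CentralService:
    """One operator owns the subscription table; only a reader can add an entry."""

    def __init__(self):
        self.followers = {}  # author -> set of readers who chose to follow

    def follow(self, reader, author):
        self.followers.setdefault(author, set()).add(reader)

    def post(self, author, text):
        # Delivery is driven only by the service's own table.
        return {reader: text for reader in self.followers.get(author, set())}


class FederatedInbox:
    """A receiving node that naively trusts the sender's claimed subscriber list."""

    def __init__(self, local_users):
        self.local_users = set(local_users)

    def receive(self, claimed_subscribers, text):
        # Without an extra verification step, a rogue sending node can
        # simply list every address it knows about.
        return {u: text for u in claimed_subscribers if u in self.local_users}


central = CentralService()
central.follow("alice", "bob")
spam_central = central.post("spammer", "buy now")   # nobody followed the spammer

inbox = FederatedInbox(["alice", "carol"])
spam_federated = inbox.receive(["alice", "carol", "dave"], "buy now")
```

In the centralized case the spammer reaches no one, because nobody chose to follow them. In the naive federated case both local users get the message. A real federated design could add verification on the receiving side, but that is exactly the kind of filter-versus-spammer arms race described above.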
No Single View of the System
This isn’t so much a problem with email, since it never existed there, but with a distributed Twitter system you’d lose the ability to look at all of the activity happening on the platform. That means not only is there no single user registry, but there’s no room for 3rd party add-on apps. There’s no public timeline, no Summize, no Tweetscan, no WhoShouldIFollow, no Twellow, etc. Those things are only possible because there’s a single place to look for the information. What’s more, these services have only begun to scratch the surface of mining the information in the social graph. With a distributed system, all that value is thrown away.
So What Do People Really Want?
Users like Twitter. That is, they like Twitter when it’s working. Turning off the features that made Twitter easy and convenient to use, like Jabber IM support, has really hurt Twitter’s reputation. Users want Twitter’s ease of use AND the large number of users, because the network effect of more users is what really makes Twitter valuable. Unfortunately, Twitter doesn’t seem capable of giving us both reliably. Users also want more features. I’m not going to go into all of my plans for adding features, but if you’ve been following the discussions about Twitter over the last year you’ve heard some of them. Twitter is busy trying to get their current features all running again, though, so don’t expect new features any time soon.
So what’s needed? What’s needed is a scalable microblogging platform. It needs a sound system architecture that scales linearly by adding more hardware. It also needs a scalable business model that actually includes a revenue model that involves making more money as the number of users goes up, instead of losing more money (but making it up in volume!). It needs some innovation in technology, features, and business. In short, it needs Twitter done right.
It’s called Babelnote, and I’m building it.

Interesting post. I happen to be one of those cats who called out for a distributed twitter:
http://webcraftstudios.com/blog/2008/06/27/call-distributed-open-twitter/
I might try to collect my thoughts and form a better reply as a post later, but in short… I’m still leaning towards the distributed model.
Just a note: when I think of twitter, I always think in an IRC direction and not in a micro blogging direction. To me, one of the key aspects affecting user experience is the ability to choose who you listen to - a white list of sorts. Otherwise, it’s a massive scale chat room.
Email, IRC, and blogs have all had to deal with spam. I’ve actually seen some slip through the cracks on Twitter and I know of folks looking for ways to exploit twitter in various ways - some white hat and other black. However, the “choose who you listen to” nature of Twitter - in my opinion - is a HUGE component in keeping the system relatively low noise… and not the fact that it is a closed system. I would love to see a deeper technical analysis of this.
Email hasn’t significantly evolved because it simply does not need to. It does exactly what it needs to do and it does it very well. It’s only because the platform is stable that software developers can reliably spend time evolving the clients that interface with it.
IRC and Email are stable platforms. Business can count on them to fill needs. I can deploy an irc server with a web interface for chat support on a website and know that the service will be stable. I can communicate with clients via email and feel confident that the service will work the way it is intended. Twitter… well, it’s a toy.
The business model discussion has come up from time to time. My immediate response is the existence of business models built around email. Just because the technology is free and open does not mean a business can not be built on it.
The true failure of the current state of microblogging (see, I’m catching on) is the diversity of platforms. If I’m on Twitter, you’re on Babelnote, and my sister is on Brightkite… none of us can talk to each other… Epic Fail… The system with the most users will eventually win, everyone else will go out of business, and the need to innovate will go away… no one will be able to touch the space because one service already dominates. This is exactly what happened when Microsoft killed Netscape… and the Web browser did not evolve until Firefox came on the scene. Market dominance through a closed system always stifles innovation.
Finally, all other topics are really side items in my unimportant opinion. I believe in open ideas, open information, and open technology. This is its own discussion that is far too long for a blog comment. In my mind, I believe that it is not a matter of *if* the twitter concept goes open… it’s just a matter of *when*…
The reason you don’t get much spam on Twitter is that you can’t send messages to groups of people… You can post statuses, and anyone who’s chosen to follow you will see them, but it was their choice to follow you. You can send individual direct messages, but that’s one-to-one communication and doesn’t scale the way spammers like. With Email I can just send messages to multiple addresses. Similarly with a Jabber server using pubsub, if I run my own and enter false subscriptions, I can spam a whole slew of Jabber addresses. The distributed model would make this trivial, and spammers would certainly take advantage.
Yes, Email and IRC are stable… but what could they have been? What if you’d been able to easily define email forms that people could fill out and return? All of the things built on Lotus Notes were an attempt to turn Email into a more valuable business tool, but the standards made those innovations a lock-in to one platform (and a poor one at that). Those platforms are stable and locked into what they are because of standards. Standards are important, but only after the market is stable and mature. Premature standardization kills innovation and locks down the current implementation.
Believe me, I understand the issue of not being able to see other networks and their posts… It’s something I’m looking at and I’m working on a partial solution. It’s not something that requires a distributed model or a standard to solve.
I also understand open systems and open source… better than others in the Web 2.0 world from the ignorant posts I’ve seen touting it as a panacea. I’ve been an open source contributor for like 8 years, I’ve co-founded/run a successful open source project, etc. Open source means nothing for a web service, and especially when you’re re-implementing the same failed architecture you’re trying to replace.
One point you didn’t address is the biggest one, in my opinion, and that’s the inability to have the add-on value provided by 3rd party apps. With a distributed model there’s no Summize, Tweetscan, etc. It becomes a lot less valuable when you can’t get the big picture.
You know, you have a really good argument with standards stifling innovation - I do have to admit. The best I can retort with is web standards. I’ll be the first to admit that we did a lot of things on the Web wrong. However, we do have a nice steady cycle of upgrades to web standards rolling out of the W3C. It’s slow and imperfect, but there is progress and innovation. To that end, I would suggest that - with the right approach - we could overcome that hurdle. I too am sad that e-mail isn’t a little bit better.
I’m not completely sold on the “one message per person” aspect being the sole spam killer. Blog comments are one to one and comment spam is still an ongoing battle. So long as there’s an API, I think spammers will find a way to exploit Twitter in time. Honestly, I don’t think spammers see Twitter as a viable platform because of a couple of reasons. First, not a lot of people use Twitter, at least compared to the number of people who use e-mail and the Web. Second, Twitter users are early adopters who happen to be rather tech savvy. Tech savvy users are going to be far less likely to respond to spam. To this end, I don’t think Twitter has really experienced the hell of full on spammer attacks - yet.
To respond to your most significant point, these systems depend on the fact that they can see everybody. I think Wordpress is a good example in this case. I can download and install wordpress on my server and still get tied into systems such as the one that attached my avatar to this comment. It also gives the user or system admin the option of being part of an external system or not. So long as there’s a web interface that goes along with server installs, 3rd party apps could still harvest information the same way systems harvest from RSS feeds now. I do see your point, it’s much more viable to build extensions to a single system. However, how valuable are these 3rd party apps if the general populace decides to shift from Twitter over to Plurk? With no standard, everything has to be re-engineered.
To some degree, I’m playing devil’s advocate. Your points are very valid.
You make valid points, although your solution is not an ideal solution, either. No proposal on the table is perfect, nor do I expect one to be. After placing my trust in one privately-controlled service, I am not eager to trust a second one. I vastly prefer having access to the source code, which means I can build a Laconi.ca instance that works best for me (in development now), and someone else can build one that works best for them.
Just because something is open source doesn’t mean it’s going to scale better than anything else. But it certainly leaves the door open for improvement and innovation by anyone with the interest and skills, which is not the case with a closed microblogging platform. As is often the case, your idea of what process/featureset/etc. is best is going to be different from mine. I want choices without sacrificing my community.
I like @Zaskoda’s comparison to an IRC or email server. Email’s not perfect, but it [usually] gets there. That’s far more important than having email responses filtered into a form, which is an interesting idea but could largely be replaced today with a link to a Google form or a regular HTML form. Email itself doesn’t have to be the be-all, end-all if we use it in combination with other tools that achieve our means. If I have a survey, I can send a link to a form instead of expecting email to be a form processor for me, too.
I’m excited by Identi.ca and prefer to support something that I can personally contribute to and influence, if only in my own interactions with it.
Yes, but what will you think when someone else sets up their own custom Laconica that doesn’t play nice with subscription rules and blasts out spam? It’s a very real possibility, and it’s 1000x easier than trying to spam on Twitter, so expect it to happen, just like it’s happened with Email.
Is accessibility to the source really more important than scalability? As a long-time open source developer I’d say no, it’s not. An open API and a scalable and performant service are important. A solid architecture that can scale to handle millions of users and billions of messages is infinitely more important, and Laconica doesn’t have it.
Which is a better solution for your housing, me giving you the plans for building a 19th century log cabin, or me giving you a new, modern, up to code house without giving you the plans? I think the answer is obvious.
But thanks for the comment… I’m trying to get people to at least think about this stuff before it bites them in the ass.
You’ve definitely got me thinking. But I still prefer Laconi.ca. I don’t want house plans because I don’t have the resources/skill to build a house from the ground up, but I can certainly (perhaps with a little help - I’m not superb with PHP) install and modify existing source code on my own server. Also, if you built me a house, I would expect a copy of the plans.
What I do NOT want is to trust the community I’ve built to a single company, same as I don’t want all email in the world routed through a single company. If Twitter can fix a spam problem then the Laconi.ca dev community can fix it, too. You’re still choosing whom to follow, so (unless I’m missing something) it would be a bug if a spammer could send out spam blasts to unwilling users.