The Real-Time Web and its Future
The following report is based largely on insights generously shared
by these interviewees:
Aardvark
Adrian Chan
AlertSite
AllVoices
Amber Case
Backtype
Bernardo A. Huberman, HP Social Computing Lab
Beth Kanter
Black Tonic
Brad Fitzpatrick, Google
Brett Slatkin, Google
Chris Messina
CitySourced
Cliqset
Collecta
DeWitt Clinton, Google
Evri
Factery Labs
Faroo
FirstRain
Jay Rosen, NYU
John Borthwick, BetaWorks
JS-Kit
Kaazing
Kevin Marks, BT
Lexalytics
Marnie Webb, Compumentor
Mendeley
Nomee
Notify.me
Nozzl Media
OLark
OneRiot
OrSiSo
PBWorks
Pipio
Postrank
Steve Gillmor
Superfeedr
Sysomos
Ted Roden, NY Times/EnjoysThings
The American Red Cross
Threadsy
Tibco
Tweetmeme
Twingly
Urban Airship
Warner Bros.
Wowd
YourVersion
Contents
1. a. What is the real-time Web? Beyond Twitter and Facebook
b. Matrix of issues and companies
2. Case studies
a. Ted Roden puts real-time into enjoysthin.gs and the New York Times
b. Superfeedr: Transforming the legacy Web into real-time
c. Real-time as a trigger: Evri’s news-parsing technology
d. How Warner Brothers uses the real-time Web in the music business
e. Urban Airship does real-time mobile push
f. Nozzl Media: Bringing real-time to old media
g. Aardvark and the real-time Web of people
h. Mendeley and the real-time Web of science
i. Black Tonic re-imagines the real-time Web as a controlled experience
j. At the Red Cross, the real-time Web saves lives
3. Key players
a. John Borthwick: thoughtful prince of the real-time Web
b. Chris Messina: Rebel with a proposed technical standard
c. Brett Slatkin, Brad Fitzpatrick and PubSubHubbub
d. Steve Gillmor: The real-time Web’s leading journalist
e. Another 15 important people to follow to understand the real-time Web
4. Sector overviews
a. Stream readers: Interfaces for the real-time flow
b. Real-time search: Challenges old and new
c. Text analysis and filtering the real-time Web
5. Visualizations
a. The path to value
b. Real-time in conjunction with the static or slower Web
c. Information overload
6. Selected background articles on real-time technology
ReadWriteWeb | The Real-Time Web and its Future | 1
What is the Real-Time Web?
Beyond Twitter and Facebook
Dave Winer defines the real-time Web in four words: “It
Happens Without Waiting.”1 That’s true, and appropriately
vague. The phrase “real-time Web” means different things
to different people, and it’s too early in the game to have
anything but a loose, inclusive definition.
Many of the different forms the real-time Web takes do have some common benefits, user experience
elements, lessons learned, pitfalls and possibilities. This is what we explore in this report.
It’s definitely a whole lot more than just Twitter and Facebook, though these are the best-known
instances of what’s referred to as the real-time Web. Someday Facebook may open up its user data and
play a larger role in the real-time Web than it does today, when it serves mainly as an introduction to the stream model.
Someday Twitter may grow, discover how to retain users and effectively encourage participation from more than the
small number of people who today create the vast majority of content on that service. Today engineers
estimate that Twitter sees about 1,000 messages published per second and between 5 and 10
million links shared per day, before de-duplication. That sounds like a lot, but the real-time Web as a
whole is already much, much larger than Twitter.
For infrastructure provider Kaazing, the real-time Web is using HTML5 WebSocket technology to push
live financial information to the Web browsers of banking customers, a kind of delivery that had always
been limited to desktop applications for security reasons.
For consumer web app Pip.io, the real-time Web is creating an XMPP-powered chat-like experience
for users to communicate with friends around objects like a Google Map or a streaming Netflix video
playing in the Pip.io web OS.
For semantic recommendation company Evri, the real-time Web is the ebb and flow of traffic
data on Wikipedia. That data points to hot topics for which Evri needs to build topic pages to serve its
publisher customers.
1 http://www.scripting.com/stories/2009/09/22/whatIsTheRealtimeWeb.html
For search engine OneRiot, the real-time Web is made up of the links people share on Twitter, as well
as on Digg and Delicious, and the click-streams of more than a million users who have opted in to exposing
what they see online through the OneRiot toolbar.
For Q&A service Aardvark, the real-time Web is the people inside a user’s social circle who
happen to be available online at a given moment and interested in the topic of that user’s question.
There are hundreds of thousands of blogs that now deliver updated content to any other application
that subscribes to a PubSubHubbub or RSSCloud feed, immediately after that content is published.
NYU Journalism Professor Jay Rosen says the real-time Web creates a sense of flow for users that’s
comparable to the way television holds our attention. Google’s Brett Slatkin, developer of the
PubSubHubbub real-time protocol, says the real-time Web is a foundation for efficient computing and
use cases we can’t yet even imagine.
In writing this report we interviewed 50 people who work on technologies that power or leverage
what they consider to be the real-time Web. Those people have had a very diverse array of experiences,
but they articulate a common story. It’s a story of increased computational efficiency and of software
that struggles to keep users from feeling overwhelmed. It’s a story of radically new possibilities,
but also of strategies based on adding value in conjunction with more traditional, slower-moving online
resources.
We hope you enjoy reading this overview of the emerging real-time Web. We believe this phenomenon
will play a major role in the Web and world of the future. The page-based model of
destination sites, created by centralized expertise and navigated through authority-based search and
link-by-link clicking, is being transcended. We think this survey of current strategies and experiences
to date will prove very useful in helping you participate in and build the future of the
real-time Web.
Matrix of Issues and Companies
This matrix allows you to navigate the contents of this report by topic. For example, if you are a User
Experience expert, the second column shows where to find the content most relevant to you.
Columns, in order: BENEFITS OF REAL-TIME; USER EXPERIENCE; ANALYTICS & ADVERTISING; STANDARDS, DATA NORMALIZATION & TEXT ANALYSIS; CHANGING OLDER ORGANIZATIONS; REAL-TIME AS A SERVICE
CASE STUDIES
enjoysthin.gs/NYT • • •
SuperFeedr • •
Evri • • •
Warner Bros. • • •
Urban Airship • •
Nozzl Media • • •
Aardvark • •
Mendeley • • •
Black Tonic • • •
Red Cross • •
PEOPLE PROFILES
John Borthwick • • • •
Chris Messina • • • •
Slatkin/Fitzpatrick • • • •
Steve Gillmor •
SECTOR OVERVIEWS
Stream Readers • •
Search • • • •
Text Analysis • • •
Ted Roden puts real-time into enjoysthin.gs
and the New York Times
By day, Ted Roden works on the very top floor of the New York
Times building, in the R&D department. The Times has a great
team of engineers: it does cutting-edge work in APIs, data
visualization and computer-assisted reporting. Roden does
work with real-time data at his day job, but he gets full creative
freedom when working on a side project called enjoysthin.gs.
The primary contributions that Ted Roden makes to understanding the real-time Web include
articulating the following:
• The material benefits of going real-time;
• The importance of user experience; and,
• The changing landscape in analytics and advertising.
Roden is also writing a book about real-time for O’Reilly Media. We had a conversation with him
about what happened after he added a real-time feed to enjoysthin.gs. He articulates well some of the
biggest advantages of a real-time infrastructure.
enjoysthin.gs is a visual bookmarking site, like Delicious for images and other media. Even
bookmarked text snippets are highlighted visually. User experience is a key consideration in all of the
site’s developments, and the service is a lot of fun to use.
This summer Roden added a premium subscription option to the site, called Joy accounts. A Joy account
costs $20 per year for access to all current and forthcoming premium features, or users can pay $5 for an
individual premium feature, such as disabling ads on the site or being able to view NSFW content.
One of the features that Joy account holders get is access to a real-time view of new shared content. That
real-time stream can be viewed in any browser but may be best served up in a Firefox sidebar. A real-time
feed as an up-sell value add? That’s remarkable, and Roden says the response has been positive.
The sidebar is simple but compelling. New content, including images, is pushed live into the side of the
browser as soon as it’s shared on the site.
At first, Roden said, he used AJAX to poll his site every few seconds. Then he switched to a Comet
implementation.
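The difference between the two approaches can be illustrated with a little arithmetic. The sketch below is ours, not Roden's code: the function names are hypothetical, and the five-second poll interval, eight-hour session and 40 new items are assumed numbers chosen only for illustration.

```python
def polling_requests(session_seconds: int, poll_interval_seconds: int) -> int:
    """Requests a client makes when it asks for news on a fixed timer."""
    return session_seconds // poll_interval_seconds


def push_messages(new_items: int) -> int:
    """Messages a server sends when it pushes only what actually changed."""
    return new_items


session = 8 * 60 * 60  # sidebar left open for eight hours (assumed)
polls = polling_requests(session, poll_interval_seconds=5)
pushes = push_messages(new_items=40)  # assumed number of new shared items

print(polls)   # requests made by polling, most of which return nothing new
print(pushes)  # messages sent by push, one per new item
```

With polling, the load grows with the number of open browsers and the length of time they stay open. With a push model such as Comet, the work done grows with the amount of new content instead.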
enjoysthin.gs is still very small, but the benefits of adding real-time to this site could likely apply to
sites of any size.
1. INCREASED TIME-ON-SITE
“People leave it open all day long,” Roden said of the sidebar. “Time-on-site has seen a huge increase.
It’s like when the new content comes in on the Facebook Live Feed: if you know it’s about to pop in five
seconds, you’ll stick around.”
A number of different factors are making time-on-site an increasingly important metric on the Web,
compared to page views. Increased consumption of video is the best known, but as real-time streams
of aggregated content become increasingly common, increased time-on-site will be an important
measurement of how successful an implementation is.
2. DECREASED SERVER COSTS
After implementing real-time infrastructure, Roden reports that “my site runs a lot more smoothly. I’ll
probably move the whole site to that technology, because deep down it’s much easier on the database
for me.”
“ I used to get hit by Stumbleupon and [the site] would start to crawl. Then
I changed to some of this real-time stuff, and I’ve reduced the number of
servers. Instead of the users sitting on the page and refreshing, I push it out
to them. My EC2 bill has gone way down.”
Roden’s experience complements the story that Google’s Brad Fitzpatrick told us about using
PubSubHubbub to push shared items from Google Reader to FriendFeed. Changing
from polling to real-time push cut traffic between the two sites by 85%. Likewise, magazine-style feed
reader Feedly says that the part of its service that now consumes PubSubHubbub from Google Reader
has seen a 72% reduction in bandwidth.
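For readers who have not seen the protocol, here is a minimal sketch of the subscriber's side of PubSubHubbub, based on our reading of the 0.3 draft of the specification. The URLs and function names are hypothetical, and the code only builds and parses messages; it does not contact a real hub.

```python
from urllib.parse import urlencode, parse_qs


def build_subscription_request(topic_url: str, callback_url: str) -> str:
    """Form-encoded body a subscriber POSTs to a hub to ask for pushes."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want pushed to us
        "hub.callback": callback_url,  # where the hub should POST new entries
        "hub.verify": "sync",          # verification mode in the 0.3 draft
    })


def answer_verification(query_string: str, expected_topic: str):
    """Echo the hub's challenge if it names a topic we asked for, else None."""
    params = parse_qs(query_string)
    if params.get("hub.topic") != [expected_topic]:
        return None
    challenge = params.get("hub.challenge")
    return challenge[0] if challenge else None


body = build_subscription_request(
    "http://example.com/feed.atom",        # hypothetical feed
    "http://subscriber.example.com/push",  # hypothetical callback
)
print(body)
```

Once the hub has verified the subscription by having its challenge echoed back, it POSTs new entries to the callback as they are published, so the subscriber never has to poll the feed.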
3. ADVERTISING COMPLICATIONS
“Analytics totally change,” Roden told us. “If you never click around off the home page, then Google
Analytics says it’s one page view. Now if you’re pushing stories to the top of the page, then you don’t
know how many stories people have seen unless you start measuring differently.”
“ Measuring user engagement totally changes. People use enjoysthin.gs in a
sidebar in Firefox: do I count that as a whole page view? Do I count it as one,
even though some people have it open for eight hours? Can you convince an
advertiser that they are going to see an ad 100 times while looking at a page
just once, and do they want that? For projects like enjoysthin.gs, it’s going to
be a scary world out there for advertising for a while.”
Roden has been placing display ads in the real-time feed and prioritizing the attractiveness of the
creative. That’s been somewhat effective so far, but he says it’s very early days in advertising in a real-
time model. He says that real-time won’t be an effective differentiator for ad sales in the future because
everything will be real-time. “Otherwise it’s like looking at a Word doc in a Web browser. It has to be
real-time,” Roden says.
Ted Roden says that at the root of the change towards real-time is a long list of emerging technologies
that make it easy. “It’s blowing my mind how quickly the tech is coming out,” he told us. What’s Roden
most excited about now? Tornado, the highly scalable, open-source real-time infrastructure released
by Facebook after its acquisition of FriendFeed. He’s switched all his prototypes at the New York Times
to it. “I’ll be really interested to see if people pick that up as quickly as they did Django,” he says. “It’s an
easy framework to work with.” The technology is becoming easier and easier; now it’s largely just the
frame of mind that has to change. “It’s not hard to write real-time code,” says Roden, “but if you’re in a
LAMP mindset, that doesn’t scale in real-time.”
See also:
• Ted Roden’s shared items on enjoysthin.gs at http://tedroden.enjoysthin.gs;
• Roden’s Delicious bookmarks (technical) at http://delicious.com/tedroden;
• Roden on Twitter at http://twitter.com/tedroden;
• New York Times Labs on Twitter at http://twitter.com/myoung/nytlabs.
Superfeedr: Transforming the
Legacy Web into real-time
Superfeedr’s slogan is, “We’re doing something stupid so
that you don’t have to.” Julien Genestoux’s Superfeedr is a
service that pulls in content feeds from around the Web
and then offers updates for those feeds in XMPP or
PubSubHubbub format.
Superfeedr’s primary contributions to understanding the real-time Web include articulating the
following:
• The opportunity to add value through technological transformation of legacy resources into
real-time;
• The ease of leveraging real-time, normalized data through use of services such as Superfeedr; and
• How consumer markets may not be as prepared for real-time data as developers are.
With Superfeedr, instead of polling feed publishers over and over again to check for new updates, a
feed-consuming service can just sit and wait for Superfeedr to deliver updates automatically as they
become available. The publisher doesn’t even have to publish real-time feeds: Superfeedr takes care of
that. It’s real-time-as-a-service.
“We don’t just do polling,” Genestoux says. “For each feed, we actually try to determine what is the
most appropriate way to get the updates: PubSubHubbub, RSSCloud, SUP, specific APIs (Twitter
stream, etc). We do polling as a failover.”
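To make that decision concrete, here is a rough sketch of how a service might choose a delivery method for a feed. It is our illustration, not Superfeedr's code: the function name is hypothetical, it checks only for a PubSubHubbub hub link and an rssCloud element, and it leaves out SUP and service-specific APIs.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def choose_delivery_method(feed_xml: str) -> str:
    """Pick push when the feed advertises it; otherwise fall back to polling."""
    root = ET.fromstring(feed_xml)
    # PubSubHubbub: the feed carries an Atom <link rel="hub" href="...">.
    for link in root.iter(ATOM_NS + "link"):
        if link.get("rel") == "hub":
            return "pubsubhubbub"
    # rssCloud: an RSS 2.0 channel carries a <cloud> element.
    if root.find("./channel/cloud") is not None:
        return "rsscloud"
    return "polling"


# Hypothetical feeds used only to exercise the function.
atom_feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<link rel="hub" href="http://hub.example.com/"/>'
    '<link rel="self" href="http://example.com/feed.atom"/>'
    "</feed>"
)
plain_rss = '<rss version="2.0"><channel><title>t</title></channel></rss>'

print(choose_delivery_method(atom_feed))
print(choose_delivery_method(plain_rss))
```

Polling remains the last resort, which matches the failover role Genestoux describes.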
One year ago, Julien Genestoux launched a service called Notifixious. It delivered real-time updates
from any feed to a user’s IM client or email. Ten thousand people signed up for it, but 90% of them
were having just one blog delivered, usually by email. Not an inspiring predicament for Genestoux.
A very small subset of users were using the service to follow thousands of blogs. Genestoux inquired
and learned that they were using the service like an API. “The vast majority said they would pay to do
this, too,” Genestoux told us, “as long as it was cheaper than doing it themselves.”
Superfeedr now offers just that: transformation of feeds into real-time, at lower than the cost of your
current feed-parsing system and in 15 minutes or less after publication – or your money back.
The company is working on lowering that to 3 minutes or less. Your first 1000 feeds are free; if you
want to consume more than that, the company charges $1 for every 2000 items it delivers. Superfeedr
pings feeds once and shares updates with all subscribed customers, dramatically lowering the
polling overhead in the RSS ecosystem. “We also do feed normalization to make things easier for the
subscriber and avoid the hassle of dealing with RSS/Atom + namespaces,” says Genestoux.
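The saving from fetching once and sharing the result can be sketched in a few lines. This is an illustration under our own assumptions, not Superfeedr's implementation: the class, the callbacks and the example feed URL are all hypothetical.

```python
from collections import defaultdict


class FanOutHub:
    """Fetch each feed once and hand the result to every subscriber."""

    def __init__(self, fetch):
        self._fetch = fetch                    # callable: feed URL -> list of new items
        self._subscribers = defaultdict(list)  # feed URL -> subscriber callbacks
        self.fetch_count = 0

    def subscribe(self, feed_url, callback):
        self._subscribers[feed_url].append(callback)

    def refresh(self):
        for feed_url, callbacks in self._subscribers.items():
            items = self._fetch(feed_url)      # one request per feed...
            self.fetch_count += 1
            for deliver in callbacks:          # ...shared by all subscribers
                deliver(feed_url, items)


received = []
hub = FanOutHub(fetch=lambda url: [f"new item from {url}"])


def make_inbox(customer):
    """Build a callback that records what one customer was sent."""
    return lambda url, items: received.append((customer, url, items))


for customer in ("customer-a", "customer-b", "customer-c"):
    hub.subscribe("http://example.com/feed", make_inbox(customer))

hub.refresh()
print(hub.fetch_count, len(received))
```

Three customers following the same feed cost the publisher one request instead of three, and the gap widens as more customers subscribe to the same feeds.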
Google’s Brett Slatkin, the primary developer of PubSubHubbub, is very supportive of what Superfeedr
is doing. Genestoux says the companies using his service so far include SixApart, Adobe, Twitterfeed,
StatusNet and a number of small services such as Webwag, EventVue, Quub, AppNotifications, Excla.im
and SmackSale.
“So many services fetch feeds from other services,” Genestoux says. “The market is huge. In the end,
everyone’s going to need real-time. It’s going to be the differential between services.”
Genestoux firmly believes that the real-time Web will have the biggest impact on developers, not
consumers. “The fact that services do not need to poll over and over, as well as have access to
‘normalized’ data, considerably lowers the bar to allow ‘free data’ to flow from one service to another.
Up until now, if you wanted your app to include data from other apps, you had to massively invest in
that (see Friendfeed), and maintaining such a component was a nightmare. If you make this ‘data flow/
stream’ transparent to the services, you start seeing richer mashups and apps that integrate data from
others. I sincerely think that more than for end users, the real-time will eventually change how Web
apps are built and interact together.”
What’s the downside? Genestoux admits that not all companies are comfortable relying on a third-
party service for this kind of functionality. Superfeedr went down for several hours one evening in
November. Genestoux wrote a blog post discussing the problem and his solution.2
Superfeedr isn’t the only real-time-as-a-service company online. Others we’ve spoken to include
Notify.me and Kaazing. Surely, there are many more. But when it comes to lightweight feed-
transformation services that are developer-friendly and engaged in cutting-edge Web technology
conversations, Superfeedr certainly fits the bill.
See also:
Julien Genestoux’s lifestream of links and bookmarks at http://www.ouvre-boite.com/
Genestoux is on Twitter at http://twitter.com/julien51
His circle on Twitter3 includes:
• http://twitter.com/ilan Ilan Abehassera, NY entrepreneur
• http://twitter.com/sdelbecque Stephane Delbecque, SF entrepreneur
• http://twitter.com/lawouach Sylvain Hellegouarch, French developer
• http://twitter.com/romefort Johann Romefort, CTO at Seesmic
• http://twitter.com/guillaume_ Guillaume Dumortier, SF entrepreneur
2 http://blog.superfeedr.com/Memcache/MySQL/post-mortem/post-mortem-02-11/
3 http://twitter.mailana.com/profile.php?person=julien51&
Real-time as a Trigger: Evri’s
News-Parsing Technology
Evri is a semantic Web recommendation service for online
publishers. The company tracks the real-time Web to know
when it needs to create or update a topic page for one of its
emerging news topics.
The primary contributions that Evri makes to understanding the real-time Web include articulating
the following:
• Creative ways that real-time and slower moving data sources can be used together to create value;
• Wikipedia as a source of real-time data beyond Twitter and Facebook. We’ve seen Wikipedia used by
other services before for disambiguation, but not as a source of real-time trending topics data;
• Another example of text analysis as a very important part of a service provider working on time-
sensitive content delivery; and
• Struggles experienced by forward-looking startup companies seeking to bring real-time services to
older businesses, in this case publishers.
Evri watches news sources to see when a news topic is trending. Those sources include Wikipedia
articles whose page views, according to publicly available data, have leaped. Then it visits structured databases
like Wikipedia and Freebase to check for updates to entries about related entities. It then creates or
updates a topic page with news links, photos and Twitter search results. The language used in those
Twitter posts is analyzed, and the names of news entities in the posts are linked to other Evri topic
pages, which act like pivots.
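The trigger step in that pipeline can be sketched as a simple spike detector. Evri did not describe its actual method to us, so everything here is an assumption made for illustration: the function names, the threshold of three times the recent average, and the page-view numbers.

```python
from statistics import mean


def is_spiking(hourly_views, threshold=3.0):
    """True when the latest hour exceeds `threshold` times the earlier average."""
    if len(hourly_views) < 2:
        return False
    *history, latest = hourly_views
    baseline = mean(history)
    return baseline > 0 and latest > threshold * baseline


def topics_to_refresh(page_views, threshold=3.0):
    """Names of topics whose page views just spiked, in sorted order."""
    return sorted(
        topic for topic, views in page_views.items()
        if is_spiking(views, threshold)
    )


page_views = {
    "Topic A": [120, 130, 110, 125, 2400],  # made-up numbers
    "Topic B": [900, 950, 870, 910, 940],
}
print(topics_to_refresh(page_views))
```

Only the topics returned by the detector would then be sent on to the slower steps of querying structured databases and rebuilding the topic page.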
“We’ve got it down to 15 minutes from when an event happens to when facts get updated,” Deep
Dhillon, CTO of Evri, told us. “Nothing is manual.” That may have been true of Patrick Swayze’s death,
as Dhillon pointed out in our interview, but it was not true of the death of anthropologist Claude
Levi-Strauss. The Levi-Strauss topic page was filled with news of his death, but for hours afterward the
excerpt from Wikipedia on his date of birth and death had not been updated to match the information
about his death that Wikipedia and Freebase contained.
“Another example is emergent entities,” Dhillon said. “The day after Michael Jackson died, there
was a bunch of info online about Conrad Murray, the physician. Within minutes, we had structured
information for a page but also for the rest of the system to link his ID with things like physician,
Michael Jackson. It ripples through our whole system. We have some API customers that are all about
emergent entities – we’re not just going to say that Conrad Murray is a person and a male.”
It’s a work in progress and Dhillon acknowledges that more work has to be done, on text analysis in
particular. Evri is working with the publishers that it draws content from (it’s wider than just a Web
search) on matters such as structured data and push notifications. The publishing industry has a lot
of catching up to do, though, in moving on from old content management systems that did little to
create metadata. The content that Evri receives for analysis comes in various forms (News Industry
Text Format is one of the most common), and it has a wide variety of problems, but Dhillon
says that publishers have a motive to make sure their product is annotated.
The incentive to do push notifications is more obvious, Dhillon says: timeliness is an advantage for
Google ranking.
In the future, then, everyday publishers may push highly structured content out to aggregators for
analysis, but today Evri is watching the real-time Web for news spikes, then using those as a trigger to
go out and query other parts of the Web.
See also:
• Deep Dhillon on Twitter http://twitter.com/Zang0
• Deep Dhillon’s blog http://chalobolo.blogspot.com/
How Warner Brothers uses
the real-time Web in the Music Business
Ethan Kaplan is VP of Technology at Warner Brothers Records,
and he’s a pretty savvy guy. He has built a real-time dashboard
to display the number of people who visit each Warner
Brothers artist website at any given time. When a site spikes
on the dashboard, the team can hover over that part of the
bar graph and see search results from blogs, Twitter and
elsewhere to determine what caused the increase in traffic and
to respond immediately.
The primary contributions that Ethan Kaplan offers to understanding the real-time Web are articulating
the following:
• A legacy industry capable of taking new forms of action based on substantially decreased delays in
information delivery;
• The value of having your own data in real-time, instead of relying entirely on third parties;
• Opportunities that arise from being able to create interfaces for real-time data display in-house; and
• Opportunities still untapped when real-time data is analyzed in bulk.
Kaplan tells us:
“ We used to be oriented around getting data only once a week, because that’s
how it was fed to us from SoundScan, Mediabase, etc. We’d then reconcile
that data against our plan for the week.”
“Now we’ve got a whole back end that exposes data in near and real-time:
purchases going through the system, site visits, visitors logged in, comments left.
The culture of that real-time environment has impacted how bands are being
marketed and products are created. People want more and more real-time.
“One day, for example, I saw a site with marginal traffic that suddenly had
7,000 people on it. We did a Twitter search, checked [celebrity blog aggregator]
WeSmirch.com and found out that the artist was having a baby. No one told us!
We immediately started planning to change the merch on the site, maybe have
a baby shower; we added a poll asking people if they thought it would be a boy
or a girl; all steps to take advantage of the traffic that was coming to the site at
that moment.
“Something like that happens every day. One of the sites might be trending more
than usual because the artist just released a record. We can correlate and react
right away. Omniture is good data, but it’s not as fast as we have here.”
Kaplan says the next step is to expose this data in ways that best suit different people throughout the
company. He views it through an Adobe AIR application that he built in dashboard form, but different
departments have different needs. He’d like to figure out effective ways to present that data all the way
up to the CEO level.
Traffic data is just one type of information that the company sees. Kaplan says Warner Brothers knows,
for example, that promotions on Twitter tend to get higher click-through rates but lower conversions
than promotions on Facebook. Authentic artist sites, even if they aren’t as contemporary in design as,
say, Facebook, remain very important to the online music ecosystem.
Kaplan says he’d like to see all data the company captures, including anonymous user-specific
data in aggregate, run through artificial intelligence systems that quantify and detect patterns of
engagement. “This user did X, Y and Z in a time period. That’s a huge amount of computation,” Kaplan
says. He told us that he’s looking at MapReduce, cluster analysis and other methods, but the big
takeaway for him is that the company can do a lot because it has the raw data and understands what
types of data it needs.
Lessons learned? “We’re still at such an early stage that we don’t have any lessons learned,” Kaplan says.
“We’re just constantly learning new things.”
See also:
Ethan Kaplan’s personal blog http://blackrimglasses.com/
Jeremy Welt, SVP of New Media at Warner Bros Records http://www.jblogg.com/
Kaplan’s circle on Twitter4 includes:
• http://twitter.com/eston Eston Bond, Fox Entertainment
• http://twitter.com/mathewi Mathew Ingram, Toronto Globe and Mail
• http://twitter.com/kneath Kyle Neath, GitHub
• http://twitter.com/andygadiel Andy Gadiel, JamBase.com
• http://twitter.com/mikaelgm Mikael Mossberg, Warner Bros.
4 http://twitter.mailana.com/profile.php?person=ethank&
Urban Airship does
real-time Mobile Push
Urban Airship is a mobile phone push-notification and in-
app sales-infrastructure provider. The company powers push
notifications for a wide variety of customers, large and small,
filling a gap created primarily by Apple’s implementation
of push in a way that’s just complicated enough for many
developers to believe it warrants outsourcing.
Urban Airship’s primary contributions to our understanding of the real-time Web include articulating
the following:
• The wide variety of potential use cases for real-time, including on mobile platforms;
• Another example of real-time service provisioning as a business; and
• Limitations introduced by delivering real-time data through networks owned by other companies.
Starting with the iPhone but aimed at cross- and multi-platform mobile services, Urban Airship told us
a number of interesting things about its experience with real-time information delivery:
• Machine-to-machine real-time messaging is now cheap and relatively easy to implement.
• You can now get updates on a wide spectrum of activities. The technologies to deliver notifications
are evolving faster than the use cases, and there remains some question of just what to do with
these real-time capabilities. A number of real-time companies have told us that the technology
is dropping in price and complexity so quickly that people are looking for particular ways to
implement a clearly compelling general concept (real-time messaging). In other words, the real-
time Web may be more tool-driven than demand-driven so far.
• Use cases that Urban Airship has seen so far range from mobile social games to reminder apps to
mobile storytelling that uses push notification to let a plot unfold over time. The company says
it has other customers in sports and medical fields that it can’t discuss publicly. One that has just
launched is a prescription drug-tracking service that pushes notifications shortly before a user’s
prescription needs to be refilled.
• Push notifications have been used most visibly by media companies to send simple messages,
but each iPhone push can carry a payload and allow recipients to take actions such as voting or
approving a purchase. “There will be richer content in the future, not just a line of text,” founder
Scott Kveton told us. “It’s going to move from alerts to real-time interactive: more personal,
more social.”
• Scaling large quantities of high-priority real-time information remains a challenge.
(Shortly after our interview, Urban Airship launched a product aimed at filling this need.)
• Clients now expect many of the developers they hire to be able to build out real-time features
quickly, which is a very new expectation.
• Push notifications on the iPhone also require a download; push can come only from apps on
the phone.
So, Urban Airship says it is a cheap and easy mobile push-notification service and that rich use cases
of the future are limited only by our imaginations.
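To show what “carrying a payload” means in practice, here is a minimal sketch of an iPhone push payload. The "aps" dictionary with its alert, badge and sound keys follows Apple's documented format, and the 256-byte ceiling reflects Apple's limit at the time of writing. The function name and the custom "action" key are hypothetical, and the sketch only builds the message; it does not send anything to Apple or use Urban Airship's API.

```python
import json

MAX_PAYLOAD_BYTES = 256  # Apple's payload limit at the time of this report


def build_push_payload(alert: str, badge: int = 0, **custom) -> bytes:
    """Serialize an APNs-style notification and check it fits the size limit."""
    payload = {"aps": {"alert": alert, "badge": badge, "sound": "default"}}
    payload.update(custom)  # app-specific keys sit beside "aps"
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(encoded) > MAX_PAYLOAD_BYTES:
        raise ValueError(
            f"payload is {len(encoded)} bytes, limit is {MAX_PAYLOAD_BYTES}"
        )
    return encoded


message = build_push_payload(
    "Your prescription is due for a refill",
    badge=1,
    action="confirm_refill",  # hypothetical custom key the app would interpret
)
print(message.decode("utf-8"))
```

The custom keys are what let an app do more than display a line of text, for example by offering the recipient a vote or a purchase approval when the notification arrives.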
Nozzl Media: Bringing real-time
to Old Media
Steve Suo and Brian Hendrickson were newspaper guys
for decades. Then the confluence of declining revenue and
institutional risk-aversion, during a period of historic change
for the industry, led them to leave those institutions and strike
out on their own. Suo has a background in automated public-
records extraction and analysis and Hendrickson in real-time.
The primary contributions to understanding the real-time Web that Nozzl Media offers are articulating
the following:
• The gap between legacy publishing and the real-time Web; that’s both opportunity and barrier;
• Another filtering strategy: user-centric, client-side and full-text, instead of strategic, programmatic
and as a pre-determined value-add; and,
• The opportunity available in transforming old data into real-time.
Early this year, another long-time newspaper guy, Steve Woodward, joined them to found a startup
called Nozzl Media. Nozzl aims to help newspapers embellish their original content with a real-time,
filterable stream of hyper-local public records, news and blog posts. The company is building a mobile
Web app and Web page widget that push that content live to readers.
Public records tend to be largely inaccessible, relegated to arcane, search-driven websites and dumb
PDFs. Nozzl Media says it has built technology to extract that information, put it in geographic context
and push it live to the Web as soon as it’s discovered.
Nozzl is doing a number of particularly interesting things.
PUBLIC RECORDS
Nozzl extracts public records of interest – including Occupational Safety and Health Administration
(OSHA) citations to businesses, approved building permits and doctors’ licensing information – from
online repositories with what Nozzl calls its “automated form-pumping robot.” Many computer-assisted
reporting specialists write scripts to perform one-off acts of data extraction for their research, but
Nozzl has built software to perform these functions systematically, regularly, reliably and behind the
scenes – and then make the information available in a published stream in real-time. The results can be
quite interesting – and could qualify as news content.
Is the raw feed of public records valuable, though? Or is a journalist with a trained eye still needed
to find the real news in the feed and put it into context? Presumably, both the raw feed and the
journalism it enables will support one another, but a raw feed of public records could possibly have a
signal-to-noise ratio that no one but a journalist would find compelling.
The fire hose is valuable, but sometimes the hand of a skilled, real-time curator is more valuable. Nozzl
Media specializes in pushing the fire hose to the public as an act of media.
Finding new forms of information that haven’t been available in real-time and making them easily
available is a meaningful addition of value. People say that information about more and more social
activities is becoming available as data – but someone has to build the infrastructure for that to
happen, and that requires more technology in certain milieus than in others. Government data – so
often made available in unsyndicated, opaque PDF files – is particularly challenging. Dislodging it into
the cloud, then, becomes a particularly valuable act.
LIVE FILTERING
The Nozzl team built its Web page widget with a live jQuery feature that allows for filtering of the
current corpus of data on the fly; items on display are filtered as each letter is typed by the user in the
filter box. It looks like Google Suggest in reverse.
Filtering the flow of data is something that every company in this space is talking about, and Nozzl has
a unique way of doing it. Real-time, on-demand, full-text filtering at the user’s fingertips may or may not
be a compelling user experience. It’s an option, though, that stands in contrast to the text analysis, entity
extraction and imposed categorization that other filtering strategies emphasize and are slowed by.
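The keystroke-by-keystroke filtering described in this section can be sketched in a few lines. Nozzl's widget is built with jQuery in the browser; the Python below only illustrates the idea of full-text, on-demand filtering, and the `filter_items` name and sample items are assumptions.

```python
def filter_items(items, query):
    """Return the items whose full text contains the query, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        return list(items)  # an empty filter box shows everything
    return [item for item in items if needle in item.lower()]


if __name__ == "__main__":
    stream = [
        "Building permit approved: 12 Oak St",
        "Restaurant health violation: Oak Grill",
        "New doctor license issued",
    ]
    typed = ""
    for key in "oak":
        typed += key  # re-filter the current corpus after every keystroke
        print(repr(typed), filter_items(stream, typed))
```

No categories or entities are computed in advance here; the user's own text is the only filter, which is the contrast the section draws with text analysis and imposed categorization.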
EMBELLISHING LEGACY CONTENT
The time-frame for freshness in publishing is shrinking rapidly. While online publishing was so much
faster than print publishing that it disrupted an entire industry, the manual creation of original content
is the slow horse in the race online. That doesn’t mean it’s not valuable; it’s the primary source of value
for the institutions in question (newspapers), but it’s not necessarily sufficient.
Embellishing the original content of newspapers with local real-time content is reminiscent of the old
newswire model of newspapers syndicating AP or Reuters content. Will it save newspapers? A dose of
Facebook newsfeed-style delivery of things like new doctor licenses, local restaurant health violations
and aggregated blog posts on a newspaper website? That could make a big difference. Time will tell.
NEWSPAPER RETICENCE
Nozzl originally intended to focus on a mobile Web app, or delivering content for newspapers that
need mobile apps. The company believes that newspapers and broadcasting organizations in general
do not yet have effective mobile implementations; spend a little time using all but a few mobile
newspaper efforts, and you’ll see the validity of this argument.
Newspapers were reticent to use Nozzl in that way, though. They wanted widgets for their websites
instead. We assumed that was because websites are more effectively monetized using display ads.
The widget economy and experience are crowded, though, and Nozzl argues that display ads have
peaked and will only decline from here. Nozzl as a stand-alone, highly functioning local-news mobile
app strikes us as incredibly compelling; Nozzl as one more widget on a Web page, less so.
Nozzl’s Steve Woodward says it was simpler than that, though. “The real reason [that newspapers were
reticent about the mobile app],” he says, “has more to do with comfort level than any direct thoughts
about monetization. Mobile is a new technology that most newspapers aren’t yet comfortable with.
On the other hand, they feel they understand the Web, and they certainly understand content. So they
are able to see value in adding real-time content to a news site, while they have a harder time seeing
the same or greater value in mobile.”
So goes the story of innovators who would break free of aging institutions only to establish businesses
that are built on adding value to those same institutions. Web widgets it is, for now at least.
Bringing content to Web pages in real-time may not be a sufficient differentiator for Nozzl indefinitely,
though. As Ted Roden of the NY Times R&D Department and enjoysthin.gs says, “Otherwise it’s like
looking at a Word doc in a Web browser. [Everything in the future] has to be real-time.”
Woodward says that the type of content Nozzl delivers will be key. “We need to step up our game to
bring in more, not fewer, public records,” he says. “That kind of content will be the thing that sets us
apart most from future competitors.”
See also:
Steve Woodward on Twitter http://twitter.com/nozzlsteve
Brian Hendrickson on Twitter http://twitter.com/brianjesse
Aardvark and the real-time Web of People
Aardvark is a social search engine that combines artificial
intelligence, natural-language processing and presence data
to create what the company calls “the real-time Web of people.”
The end result is “a magical experience,” CEO Max Ventilla says.
The primary contributions that Aardvark makes to understanding the real-time Web include:
• Leveraging presence data;
• Communicating across platforms;
• Emphasizing user experience;
• Harvesting social data from third-party profiles;
• Text analysis on-the-fly;
• Mediating human interactions with machine intelligence; and,
• Filtering the flow for both inquirers and respondents.
You can ask Aardvark any question, and it will try to find a person in your extended social circles
who knows about that topic and is available to answer at that moment. Aardvark facilitates these
conversations through a very polite IM bot, an iPhone app with push notifications, the company’s
website, Twitter or email. Instead of broadcasting your question to everyone’s stream of information,
Aardvark delivers the question only to people who are relevant and available.
Founded in 2007 but launched just this year, Aardvark’s got an all-star team of engineers from Google
and Yahoo and high-profile investors. It’s already cutting deals with major tech brands, and the use cases
are just beginning to be explored. The Web 2.0 Summit had a dedicated Aardvark circle for attendees
to answer each other’s questions, and Federated Media will soon roll out a campaign sponsored by
Microsoft in which Aardvark will facilitate a Q&A with relevant IT experts around the clock.
The company says that 90% of questions get answered in five minutes or less. During our extensive
use of the system and conversations with many other users, we found the answers that were delivered
were generally satisfactory or better. The system gets smarter the more you use it.
“When users come in and have a magical experience,” CEO Max Ventilla says, “that’s more important
than the info they get back, to know that there are people who would help you immediately. This is
social search as a complement to web search. The billions of pages on the Web are static data; that’s
just a fraction of what’s available in people’s heads.”
Aardvark goes so far as to say in a blog post about the real-time Web5 that, “What really matters is the
increased accessibility of people online, not just information online.”
Users are tagged with areas of interest or expertise by the friends who invite them to the system, and
then they add additional tags on their own. Further information about what a person knows is gleaned
by analyzing the user’s Facebook profile page or Twitter stream.
“Data gets stale, even your profile data,” Ventilla says. “We want to keep that fresh, by taking advantage
of all the data that’s passing by. The things you’re posting about [on other social sites] are things you
have recent experience with. Being able to converse with someone who just had a learning experience
adds a lot of relevance. Social graph and profile data built up over time, the fact that people are
making that info available for building value with communication tools – that’s a dramatic shift with
the Web.”
In addition to user tags and social network profiles, Aardvark analyzes the text of inquiries to find related
users to query, and it keeps track of response times and types. The service notes the vocabulary that people
use (including ‘off-color’ conversations), who likes little chats and who engages in extended conversations. It
then pairs sets of users with questions and with answers that it believes will be compatible.
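The routing described above (topic tags, presence and tracked response times) can be illustrated with a small ranking function. This is a sketch under stated assumptions, not Aardvark's algorithm: the `Person` fields, the `route_question` name and the ranking rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    tags: frozenset           # topics the person is tagged with
    available: bool           # presence signal, e.g. online in IM right now
    avg_response_secs: float  # tracked from earlier answers


def route_question(topics, people, limit=3):
    """Pick a few available people whose tags overlap the question's topics."""
    wanted = set(topics)
    candidates = [p for p in people if p.available and wanted & p.tags]
    # More shared topics first; faster responders break ties.
    candidates.sort(key=lambda p: (-len(wanted & p.tags), p.avg_response_secs))
    return candidates[:limit]


if __name__ == "__main__":
    people = [
        Person("ann", frozenset({"cooking", "squash"}), True, 120.0),
        Person("bob", frozenset({"cooking"}), True, 30.0),
        Person("cy", frozenset({"cooking", "squash"}), False, 10.0),
        Person("di", frozenset({"pizza"}), True, 5.0),
    ]
    for person in route_question({"cooking", "squash"}, people):
        print(person.name)
```

The question reaches only a short list of relevant, present people rather than everyone's stream, which is the filtering function the company describes.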
5 http://blog.vark.com/?p=201
“This is a serendipity engine,” Ventilla says. “There’s variability in people’s experience, and we have
to maximize the chance that something goes beautifully instead of bad. It’s about designing a user
experience to keep a conversation on the rails.”
Aardvark scores high on user experience for most of its interfaces, the latest iteration of its website
being one possible exception. With this service, the website isn’t that important.
Filtering the flow of information from the real-time Web is a concern that everyone who is touched by
these technologies raises. Aardvark says it performs a filtering function by limiting the broadcast of a
user’s question to relevant people they are socially connected to.
“[With Aardvark,] you have the ability to have a conversation,” CEO Max Ventilla says. “This is
fundamentally different from other forms of real-time search.”
Conversations are so easy to have on demand with Aardvark that I once instigated and conducted
three extended, simultaneous live interviews with topical experts around the world during a tech
industry event6, all through the Aardvark IM interface.
QUESTIONS THAT AARDVARK HAS ANSWERED WELL IN TESTING.
• Is there any good way to serve a butternut squash and a sweet potato in the same meal? I’m
thinking maybe I should just do the squash. [I ended up making a great soup.]
• What are some examples of publicly available real-time data still excluded from search after today’s
announcements by Bing and Google? [Best answer: commodities prices.]
• What’s a good email address for Mozilla PR? [I should have had this already, and it took one line of
explanation, but a Mozilla employee gave me contact info for the head of PR there within minutes.]
• I have 5 minutes to choose: what tech, business, news or art podcast should I load up to take on a
walk with my dogs? [Best suggestion: Monocle Weekly.]
• What’s in Arm & Hammer baking soda laundry detergent, and can I spread it on my carpet to
vacuum up? [I would have been too embarrassed to ask this in other contexts, but Aardvark
subjected just a small number of people to my cry for help.]
QUESTIONS THAT AARDVARK HAS NOT ANSWERED WELL IN TESTING.
• What’s a romantic ocean cabin rental near San Diego that I might be able to get near new year’s?
[No answer.]
• What question should I ask the founder of Blog Talk Radio podcast service in an interview?
[A 15-year-old gave me a generic question, and I didn’t resubmit.]
• Where can I get pizza delivered in North East Portland after 10pm? [To be fair, this may be an
unanswerable question. I can’t believe I bought a house in an area with such bad pizza coverage.]
6 http://www.readwriteweb.com/archives/bing_twitter_search.php
Mendeley and the real-time Web of Science
Mendeley is a service for organizing scientific research papers
and includes social features such as recommendations of
research and other scientists you might like. The company says
it’s like Last.fm or iTunes for scientific research and has backers
that include co-founders of Last.fm and Skype. The company
offers both Web and desktop software.
The primary contributions that Mendeley makes to understanding the real-time Web include
articulating the following:
• Opportunities to transform legacy institutions in qualitative ways by reducing time and harnessing
network effects;
• The importance of offering non-real-time, non-social value in order to get individual buy-in; and,
• The value of implicit data.
What’s the real-time element? Whereas scientists traditionally have had to attend events to learn about
the hot research topics in their fields and who is doing related research, Mendeley can track reading
and citation activity in real-time to provide recommendations and trending data. The company is also
considering adding a feature to its Word plugin that captures and tracks citations as they are written.
Bringing real-time, social network effects and recommendation to science? If successful, the
consequences could be profound. Effective online recommendations could change work in the lab
and the quality of the face-to-face conversations. Real-world interaction now has a whole lot more
preliminary context, thanks to the Web in general and services like Mendeley in particular.
Mendeley says it is on pace to become the largest repository of scientific literature on the Web
sometime next year. The key to adoption of the software, the company says, has been that Mendeley
offers value even when used alone: the metadata extraction and paper organizing are useful enough
on their own. There are many different kinds of software for organizing scientific papers, though, and
early versions of Mendeley had some trouble processing the content that users inputted. The software
is really aimed at social recommendations, and many scientists enjoy it for that.
Librarians interested in discovering which journals are publishing the hottest research articles also
use Mendeley; that is information that publishers of high-priced research journals haven’t had an
interest in exposing. Mendeley envisions a future when university departments use the service to
capture data about the productivity of their researchers, information that could influence hiring and
tenure decisions.
“The real benefit of real-time is for those doing the science,” Mendeley’s Research Director Jason Hoyt
told us. “The most relevant research to yours could be in a minor journal you might miss. If it’s popular
and relevant, this search process will show you that.”
“You find researchers downloading a lot of papers,” Hoyt says. “Many times people will cite bad
research; but implicit data – like opening a document several times, sharing it, etc. – that data says that
a research document is really relevant.”
Mendeley isn’t the only real-time company that derives a lot of its value from a desktop client and
the implicit behavioral data that it provides. Many of the best-known real-time search engines
leverage local software that captures implicit data. There is far more implicit data (like clickstreams)
in the world than explicit data (like shared links) – it’s just a matter of building support for software
that makes it available.
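Hoyt's point about implicit data can be made concrete with a toy scoring function that sums behavioral signals per document. The signal names and weights below are assumptions chosen for illustration; nothing here reflects how Mendeley actually weighs its data.

```python
from collections import Counter

# Hypothetical weights: how strongly each implicit signal suggests relevance.
SIGNAL_WEIGHTS = {"open": 1.0, "share": 3.0}


def score_documents(events):
    """Sum weighted implicit signals per document.

    `events` is an iterable of (document_id, signal) pairs; signals without
    a configured weight contribute nothing.
    """
    scores = Counter()
    for doc_id, signal in events:
        scores[doc_id] += SIGNAL_WEIGHTS.get(signal, 0.0)
    return scores


if __name__ == "__main__":
    events = [
        ("paper-1", "open"),
        ("paper-1", "open"),
        ("paper-1", "share"),
        ("paper-2", "open"),
    ]
    for doc_id, score in score_documents(events).most_common():
        print(doc_id, score)
```

A paper that is opened repeatedly and shared rises in the ranking even if nobody has cited it yet, which is the contrast with explicit citation data that Hoyt draws.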
Aren’t scientists famously private with their research in progress, though? “There might be some trade
off, even with anonymous aggregate data,” Hoyt told us. “But you have to communicate in science
anyway – and you have to give a little to gain a lot. You do have the option to make what you’re
reading private in Mendeley, but less than 5% of articles and citations are hidden from complete view.”
That’s a reasonable account, but some reviewers have said that Mendeley’s disposition towards sharing
creates a flow that encourages users to either share publicly or not use the service at all. (Private group
sharing isn’t yet supported, for example.) Time will tell how well Mendeley can move a market that’s
already crowded with other research organization tools that are far less social.
Hoyt says the company is still learning what to do with all the data it captures, but there are a lot of
possibilities.
“If we have a subset of research on a topic right now, we can then predict
where the research is going to take us in future. We can predict how research
topics are going to morph. Then you can know where to apply research
funds or remove funds. People could start modeling their careers based on
the data they are seeing.”
One of the next steps on a technical level, Hoyt says, will be for Mendeley to learn how to extract sets
of data from papers and offer scientists recommendations of data that are similar to what they are
working with.
This is disruptive work that Mendeley is doing.
See also:
Jason Hoyt on Twitter http://twitter.com/jasonhoyt
Jason Hoyt’s social graph on Twitter includes:
• William Gunn, scientist, http://twitter.com/mrgunn
• Daniel Mietchen, scientist, biophysics, http://twitter.com/evomri
Black Tonic Re-Imagines the real-time Web as a Controlled Experience
Black Tonic is unlike any other company covered in this report.
The Black Tonic product is a presentation tool for designers
to give controlled, remote presentations of proposed design
work to clients.
The Black Tonic experience is not public. It’s not collaborative. It’s not a lot of things we associate with
the most visible examples of real-time technology. It’s actually very controlled.
Black Tonic is a download-free, HTML- and JavaScript-only browser-synchronization and browser-
sharing application with unlimited viewership and support for broadcasting to mobile browsers. Still
pre-launch, the company says it plans to “offer prices and plans that scale from independent designers
to large agencies.”
The company calls this type of browser synchronizing technology DOMCasting.7 It’s an interesting,
relatively simple model.
A common problem for designers working for remote clients is that work tends to be sent in PDF
or PowerPoint formats, via email. The client then clicks through the presentation at their own pace,
with no explanation from the designer, well before the two parties have a phone conversation to go
through it together. Designers don’t like this very much. “It frustrates the necessary process and work
flow when reviewing work,” Black Tonic co-founder David Price says.
Black Tonic offers a way for designers to control in real-time what is displayed in the viewer’s browser,
through nothing but a Web link, and with as many remote viewers on Web or mobile browsers as they
choose to share the link with. Presentations – complete with explanations, concepts and story – can
then be given at the designer’s pace.
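The control model described here, one presenter driving what every viewer sees, can be sketched as a small broadcast object. This is not Black Tonic's DOMCasting implementation, which runs in browsers with HTML and JavaScript; the `Presentation` class, its method names and the in-process callbacks standing in for a network connection are all assumptions.

```python
class Presentation:
    """Presenter-controlled state that every connected viewer mirrors."""

    def __init__(self, slides):
        self._slides = list(slides)
        self._index = 0
        self._viewers = []

    def join(self, on_update):
        """Register a viewer callback and send it the current slide at once."""
        self._viewers.append(on_update)
        on_update(self._slides[self._index])

    def show(self, index):
        """Presenter-only action: move every viewer to the chosen slide."""
        if not 0 <= index < len(self._slides):
            raise IndexError("no such slide")
        self._index = index
        for on_update in self._viewers:
            on_update(self._slides[index])


if __name__ == "__main__":
    deck = Presentation(["intro", "concept", "logo options"])
    deck.join(lambda slide: print("client A sees:", slide))
    deck.show(1)
    deck.join(lambda slide: print("client B sees:", slide))  # late joiner
    deck.show(2)
```

Viewers have no method for changing the slide; they only receive updates, so the pacing stays with the designer.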
Black Tonic argues that on real-time social networks such as Twitter and Facebook, the emphasis is
on empowering individuals, and there’s no structure to the relationships between people. A spectrum
of options is available on the real-time Web, though, ranging from technologies that reinforce and
empower the perspective of the individual to those that force an individual to view content from a
different perspective or a larger structured context.
7 http://wolv3rin3.com/articles/2009/august/14/domcasting-technology-101-introducing-flow
“If you’re doing a remote client presentation, how do you prevent the client from having a subjective
experience of the work?” Black Tonic co-founder Phillippe Blanc asks. “First, force them to view the work
from a perspective guided by the designer. Once they understand the work and the context, you can
have a collaborative, constructive discussion about the work.”
“Conversation is the new content. And true conversation only happens when people share time
and space,” Blanc’s co-founder David Price says. “The designer’s inability to storyboard is a failure
of the process.”
Historically, the two argue, when people find the limits of a technology, they develop workarounds.
Then, when more powerful technology becomes available, people often fail to revisit the
workarounds and update the process.
The Black Tonic team believes that lightweight real-time technology is an opportunity to reconsider
remote presentations, to add some structure to them and add the necessary control over presentation
that they haven’t had with the workaround of emailing PDFs.
A whole lot of options arise when a new computing paradigm emerges. Real-time doesn’t have to only
mean delivering a chaotic or filtered stream of social information to an individual at the center of the
system. Black Tonic is a good example of looking outside the standard application of a new technology
and instead taking advantage of the opportunity to reconsider standard practices that have been
influenced by technological limitations that no longer exist.
At the Red Cross, the real-time Web Saves Lives
The real-time Web isn’t just changing our lives online; it’s
starting to make a big difference offline as well. Disaster relief
efforts at the American Red Cross have been transformed by
real-time technology. Walmart may be world famous for its
powerful inventory-control system, but some people say the
Red Cross is becoming another leading example of a highly
effective, large-scale organization co-ordinating activities
around the world in real-time.
The primary contributions that Michael Spencer’s discussion of the Red Cross makes to our
understanding of the real-time Web include articulating the following:
• The real-world consequences of real-time technology;
• Transforming a legacy institution using real-time technology;
• Strategic reliance on third-party software in a real-time context; and,
• The importance of planning, relative to technology implementation.
Michael Spencer, lead for SharePoint technology at the American Red Cross National Headquarters,
puts it like this:
“The Red Cross has been around for over 100 years. I’ve been here for 12 years,
and with what I’ve seen over the last year in terms of real-time information,
co-ordination and our dashboard overseeing everything, I think we’ve
made 50 years’ worth of advancement in a year or two because of real-time
technologies. At the Red Cross, the real-time Web saves lives.”
The national Red Cross disaster response center responds to about 350 disasters every year, whenever
a local chapter is beyond its capacity. When hurricanes strike, the organization has days to plan; with
earthquakes or aviation disasters, it has no time at all to plan.
Spencer says:
“It used to take two days to inventory our available volunteers. Now that can
be done in one or two hours. We used to call them, send them emails, try to
process all of these incoming emails. It was a struggle to get people on the
ground. Now I can see exactly who is available, trim the list down by region,
by language, by specialty skills. That’s all at my fingertips instantly.”
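The kind of lookup Spencer describes, trimming a volunteer list by region, language and specialty skill, amounts to a filtered query. The sketch below is only an illustration: the Red Cross system is built on SharePoint, and the `Volunteer` fields and `find_volunteers` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Volunteer:
    name: str
    region: str
    languages: set = field(default_factory=set)
    skills: set = field(default_factory=set)
    available: bool = True


def find_volunteers(volunteers, region=None, language=None, skill=None):
    """Return available volunteers matching every criterion that is given."""
    matches = []
    for v in volunteers:
        if not v.available:
            continue
        if region is not None and v.region != region:
            continue
        if language is not None and language not in v.languages:
            continue
        if skill is not None and skill not in v.skills:
            continue
        matches.append(v)
    return matches


if __name__ == "__main__":
    roster = [
        Volunteer("Ana", "Gulf Coast", {"en", "es"}, {"nursing"}),
        Volunteer("Ben", "Gulf Coast", {"en"}, {"logistics"}),
        Volunteer("Cal", "Midwest", {"en", "es"}, {"nursing"}, available=False),
    ]
    for v in find_volunteers(roster, region="Gulf Coast", language="es"):
        print(v.name)
```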
“We now put videos and photographs in an online disaster news room,
where victims can also go for shelter locations. We’re feeding information
into SharePoint and then posting that to newsroom.redcross.org. All that
info feeds into a public shelter database; as soon as one opens or closes, the
information is available to the public. It’s a way for the media to see what
they can publish on the radio and TV. This is critical info. With shelters, once
one is filled to capacity, people need to be sent to a different shelter.
“We also have something called ‘Safe and Well.’ We can now register people
through our website and then publish this information, so that anyone
looking for info on family can search for people’s names, addresses or phone
numbers. Displaced people can leave a message there – we can reassure so
many people that their loved ones are safe.”
The Red Cross makes sure to keep latency and downtime on that “Safe and Well” site as low as possible.
One thing the organization has to do when responding to disasters is to verify the claims of home loss
that people file. That used to take a long time, but no longer, Spencer says.
“In this last year, we’ve sent volunteers out with PDAs. We used to go around
with a car and sheet of paper to verify damage. Now we have handhelds
that let you take a picture of a house – it has GPS in it – upload it to a
satellite, and then we can do real-time monitoring from a dashboard.
“That dashboard view of houses damaged? That would have taken weeks
before. Now we can do it right away. The government can also do fly-overs
that feed rough estimates of damage from a plane into our portal, so we
can get an overview within a few hours, and then our volunteers go out with
devices. That used to take me a week and a half or two weeks, even longer.
I could never get a fly-over by the government or get my volunteers in. Now
it’s fed automatically to my dashboard. I don’t have to call people and report
our new numbers. We even used to do shelter numbers by hand for meal
ordering. Now it’s all done through the Web.”
From volunteer and shelter co-ordination to the “Safe and Well” program to sometimes millions
of dollars in donations collected online in a single day, the Red Cross is heavily dependent on its
Web presence. The organization uses a service called AlertSite to monitor its uptime. AlertSite runs
continuous automatic tests of website functionality and sends the Red Cross real-time alerts and
diagnostics whenever there’s a problem. “We were having critical problems with SharePoint going out
for 5 to 24 minutes,” Spencer says. “We can’t withstand that. AlertSite now pages all the engineers with
diagnostics, and we respond immediately, sometimes just from our BlackBerrys.”
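The monitoring loop described above, an automatic test followed by an alert with diagnostics when something goes wrong, can be sketched as a single check function. This is not AlertSite's API; the `check_site` name, the `probe` and `notify` callables and the five-second threshold are assumptions for illustration.

```python
import time


def check_site(probe, notify, max_seconds=5.0):
    """Run one availability test and notify on failure or slow response.

    `probe` is any callable that exercises the site and raises on error.
    Returns True when the check passed.
    """
    started = time.monotonic()
    try:
        probe()
    except Exception as exc:
        notify(f"check failed: {exc}")
        return False
    elapsed = time.monotonic() - started
    if elapsed > max_seconds:
        notify(f"slow response: {elapsed:.1f}s")
        return False
    return True


if __name__ == "__main__":
    def failing_probe():
        raise RuntimeError("timeout loading donation page")

    check_site(lambda: None, print)   # healthy: nothing is printed
    check_site(failing_probe, print)  # unhealthy: the alert text is printed
```

In practice a scheduler would run such a check continuously and `notify` would page the on-call engineers; both are left out to keep the sketch self-contained.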
Despite those problems, Spencer remains a big advocate of SharePoint.
“We’ve seen the evolution of SharePoint over time. The biggest problem with
SharePoint 2007 is when you fail to put a good governance plan in place.
Your work should be 80% planning, 20% implementation. It tends to be just
the opposite. People tend not to plan it out well and don’t have a good idea
of what SharePoint could do. We’re only leveraging about 15%, maybe 20%,
of its capabilities. We had to spin up a call center for Katrina, for example:
we needed to track calls, see who’s following up, etc. I was able to create a
solution in SharePoint in one day, and they are still using the same system
three years later. It’s all about training users how to use it, empowering them
to take it off IT’s shoulders.”
Another third-party service that the Red Cross uses heavily? Breaking News Online (BNO), the
international newswire on Twitter and the iPhone. BNO is an amazing story. The service was founded
two years ago by a 17-year-old from Switzerland and is now run by a plucky little crew of online
journalists around the world. It’s the fastest way to get breaking news from around the world,
around the clock. Rafat Ali of the UK Guardian’s paidContent wrote last month that BNO is eating the
mainstream media’s lunch and that someone really ought to try to buy the organization. Apparently,
BNO is so on top of things that even the Red Cross watches it closely.
Spencer says that a lot of people at Red Cross headquarters are subscribed to BNO. He told us the story
of an eight-hour work session on simultaneous disasters that the team finished late one recent night,
only to receive push notifications from BNO as soon as they closed their laptops, breaking news that
another disaster had struck.
John Borthwick: Thoughtful prince of the real-time Web
John Borthwick is a complicated, thoughtful
man. Business Week called him “perhaps
the real-time Web’s key articulator.” He has
already built, bought and invested in more
high-profile real-time Web technologies than
probably anyone else in the consumer Web
world. He’s hardly an unqualified cheerleader
for the real-time Web, though. Borthwick
is unafraid to consider different sides of a
situation or to change his mind.
(Photo: Creative Commons Attribution, Brian Solis)
In 1997, John Borthwick built and sold to AOL the content publishing company behind the site Total
New York. The New York Times focused on the irony of the deal in its coverage: Borthwick had publicly
called for independent content producers to stay independent just a month earlier. While at AOL, he
testified in the US government’s case against Microsoft – but now he says he thinks the position he
took was wrong. These days he argues instead that innovation will outpace monopoly in technology
and that regulation isn’t the solution.
Borthwick saw AOL fall from grace, but he kept in touch with many of the smartest people he met
there, and he has ties to several of their startup companies today. That circle of people includes
Gerry Campbell (http://luckyrobot.com/) of real-time search engine Collecta and the Summize crew,
which both Borthwick and Campbell invested in before it was acquired to become Twitter’s in-house
search engine.
Borthwick points to the rise of YouTube as proof that an entirely new kind of search can emerge fast.
YouTube is now the second-most popular place for people to perform searches online, after Google.
This summer he wrote, “I now see search as fragmenting and Twitter search doing to Google what
broadband did to AOL.”
These days, Borthwick is the CEO of Betaworks, the best-known investment group on the real-time
Web. After Summize went to Twitter, Bit.ly, a link-sharing and analytics tool built by Betaworks and
invested in by a constellation of Silicon Valley superstars, became the default URL shortener for
Twitter.com.
Other Betaworks investments include the most popular Twitter client (TweetDeck), Howard Lindzon’s
Twitter experience for stock traders (Stocktwits), the new database of gadget reviews (Gdgt), from
Engadget and Gizmodo founders Ryan Block and Peter Rojas, the humor site Someecards, hyper-
local news aggregator Outside.in, lightweight customer support service UserVoice, content curation
platform Tumblr and 13 others. Betaworks itself bought Twitterfeed, the service that every organization
from CNN to the White House uses to pump RSS feeds into Twitter and now into Facebook.
For all this real-timeness, Borthwick watches out vigilantly for his own ability to think and
communicate in long form:
“I write about one long blog post per quarter. I don’t show them to anyone. I’m
long-winded and verbose. I try to make it intentionally long form because
there’s a lot of things we’re touching on right now. I write about history. A lot of
the tools we’re using today are washing away history. There’s a bunch of really
profound implications of that. I try to do long form things periodically because
you can get so fragmented in our world that you never dig into the long-term
issues that we’re contributing to but not talking about.”
This leader of the real-time Web, one of the main men behind the biggest little link shortener on earth, is
worried about the consequences of rapid-fire short-form communication? Thank goodness. Thoughtful
consideration is very reassuring and too rare. Here’s Borthwick on why he does what he does:
“John Barlow said there was no Prana or life source energy in an Internet
interaction, but could there be some sense of life and of energy that
gets transmitted? Part of what’s happening in the real-time Web is the
synchronicity that takes place in a real-time conversation. There’s not time
to package and prepare the meaning around the meaning of what you’re
discussing; the liveness of the event yields an order of magnitude different
interaction, and that interaction is more human. The Web is becoming a
more human place. We’re humanizing the machine a bit. I think that’s a
good thing. I have three kids, and I see the way they interact with machines,
and this is something I strive toward. There’s a moral imperative in this – but
I don’t want to imply that for anyone else.”
Borthwick is a believer in the data portability vision; he believes that identity will be separate from
services in the future and that people will pick and choose between best-of-breed service options.
“In the early days, there was a sense that people were going to build portal sites,” Borthwick says. “Then
people thought that social networks would provide a new way to navigate.” Now he sees search as a
primary form of navigation, a way to track conversations, not pages.
“We believe things are becoming more connected. In the future, everything
will consume APIs and publish APIs. People on the business side would
say over the last 5 or 10 years, ‘That’s not a company. That’s a service’ [i.e.
services with APIs at both ends]. I would say to them, ‘If it’s just a product,
then what is the whole it should be a part of?’ They’d say Yahoo should buy
it, but in most cases they squander it. I sold a company to AOL and went
through the squandering of my company, then did that to other companies.
If the next generation is cohesive parts, the whole they belong to is the
Internet.
“[Betaworks investment] Gdgt is a database. They are aggregating user-
generated content around a structured data set. That’s central to what we
think about at Betaworks. We view it as data structuring – that fits into
our worldview of what’s important. They aren’t a gadget blog or a media
company. They understand that many of those contributions won’t happen
on their website, that the boundaries of their site need to be permeable. They
are all involved in social real-time. They are also to a greater extent sharing
open data.
“In the real-time stream, a core reason why we jumped in with TweetDeck
(we wanted to buy the company) was because Iain was articulating the
data in a column format. The Web is striving for new representations of data
types. We’re supplementing the page-based metaphor with the stream-
based metaphor. When you screw with metaphors, you destabilize things.
All the clients before TweetDeck used the heritage metaphor of instant
messaging.
“The metaphors people choose are so powerful for how people both publish
and subscribe. I think we’re just scratching the surface of this stuff. The lock-
in that we’ve had around pages has held us back in terms of innovation and
how to use this medium. When we got here [to the Web] there was nothing,
and we flopped a 500-year-old metaphor of pages, a browser that by its
name says you will browse, not touch, this content. But it was not meant to
be a one-way experience. We’re only a fragment of the way into this journey.”
ARE WE GOING TO GET BRAIN IMPLANTS?
In a recent conversation with Borthwick, I casually mentioned brain implants and what a bad idea I
think they are. He had something to say about the matter.
“The brain implant is implicitly happening. I spend seven hours a day looking
at and tied to the screen. We’ve extended ourselves into this network already;
we’ve accepted it de facto. A good piece of the revolution for me is to humanize
it more. There’s a large degree of computing and Web work that has occurred
in the last twenty years that’s dehumanizing. The transition from portal to
search to social distribution – part of that trajectory is that it’s becoming
more human. But we are also placing ourselves into the network and into the
machine. The day we wake up and realize that the network has ‘become self’
will be too late – we will have extended ourselves into the network.
“Once upon a time, people thought eyeglasses were technology. In that
Umberto Eco book ‘The Name of the Rose’, a character made eyeglasses.
People thought he was modifying sight. You read this and it’s quaint. We
embrace them as an extension of self, but we don’t think of eyeglasses as
technology. We’ve become comfortable with the technological mediation of
what we see. It’s an example of how human beings are capable of extending
sense of self and embedding technology into our sense of self.
“Filtering is already endemic to the stream. To some extent, everybody is
curating the inputs into their stream, but sharing the curation tools is not
available today or is very, very crude. Using other people’s brains to filter and
help curate that data stream in a dynamic fashion is implicit to where all this
is going. The data structuring stuff is important because we’ve got to find
ways beyond search to find things. But as one of the engineers on Summize
said, a computer science professor wouldn’t consider this search because
the axis on which we measure is time, not relevancy. To me, it’s much more
of a filtering metaphor. What we found with Summize was that people left
multiple tabs open to run concurrent searches. All of the old PubSub Wyman
stuff was coming back to the fore. Human filters, understanding how we
can share, how we can do data structuring, using search and navigation for
discovering relevant info is where this is going.
“I feel like we’ve got this concurrent stream of how we can plug into what
other people are doing, thinking, feeling and experiencing. We can bring
greater humanness to that, make the world more connected and more
understanding because we can understand other people’s context. That’s
what you’re feeding in. A lot of that is what I’m working on, what I wish for
and think is fascinating.
“That said, I have a lot of respect for the sole contributor. My brother is an
artist and has no interest in other people’s ideas. Many of the greatest works
have been created that way. There’s a tension there that’s very interesting.”
See also:
John Borthwick’s blog posts and other information are at http://www.borthwick.com/weblog/
John Borthwick’s circle on Twitter1 includes:
• Andrew Weissman, Betaworks, http://twitter.com/aweissman
• Bijan Sabet, VC at Spark Capital, http://twitter.com/bijan
• Terry Jones, CEO at Fluidinfo, http://twitter.com/terrycojones
• Nathan Folkman, Engineer at Foursquare, former Systems Architect at Betaworks,
http://twitter.com/nathanfolkman
• Mary Hodder, serial entrepreneur, http://twitter.com/maryhodder
1 http://twitter.mailana.com/profile.php?person=johnborthwick&
Chris Messina: Rebel with a proposed technical standard
Just 10 years ago, Chris Messina was a suburban teenager in New Hampshire who lost his faith in
authority, stopped doing all his homework and tried to hold his high school’s website hostage after
he was suspended for running an ad on it for a proposed gay/straight alliance student group.
Photo of Messina from Wikipedia, taken by Tara Hunt.
Since then, he’s enjoyed some impressive accomplishments. He designed the two-page ad that ran in
the New York Times announcing the launch of Firefox2; he co-founded a network of public events
(Barcamp3) in more than 350 cities; he serves on the boards of the OpenID Foundation4 and the
influential new Open Web Foundation5; and he is now one of the most closely watched players in the
world of online social networking. He’ll turn 29 years old in January.
Now working as an independent consultant, Messina is one of the leading people behind a technical
format for syndicating user activity data from one service to another in a human-readable way, called
Activity Streams6. Facebook, MySpace and Windows Live have already begun producing user data in
the Activity Streams format. Twitter does not yet do so.
2 10,000 people donated $30 each to buy that ad and it featured all their names.
3 http://barcamp.org
4 http://openid.net/foundation
5 http://openwebfoundation.org/
6 http://activitystrea.ms
WHAT IS THE ACTIVITY STREAMS FORMAT?
Everybody talks about filtering the real-time stream of information online, but the Activity Streams
community is where conversations take place between leading engineers at the world’s biggest and
smallest social networks with the goal of replacing the “walled garden” model of social networking with
an open, inter-operable communication marketplace.
If Activity Streams succeeds, you will be able to subscribe to and filter the activities of your friends
across multiple different networks, without having to sign up for or even know about those other
networks.
This is almost the equivalent of AT&T phones being able to make calls to Verizon phones, or of rail-
transport companies being able to ship goods across the country over different railroad networks –
because the rails for the trains are the same size.
It’s different, though, because of the granular filtering by type of activity. Applications built on top
of Activity Streams will allow for the equivalent of a phone that accepts phone calls only about
certain subjects from certain people... because, of course, we’re now receiving a lot more inbound
communication than we did in the telephone era.
“The real-time river of news makes information available to you as it is
created,” Messina told us, “but you need a way to consume it that respects
your time, enhances the content or makes it easier to consume. The Activity
Streams format aims to allow people to receive a stream in a way that they
can manage.”
Activity Streams is an extension of the Atom feed format. The spec explains it like this: “An activity is a description of an
action that was performed (the verb) at some instant in time by some actor (the subject), usually on
some social object (the object). An activity feed is a feed of such activities.”
In the current draft spec, you can perform such actions as Post, Share, Save, Mark as Favorite, Play,
Start Following, Make Friend, Join and Tag Object. An Object could be an Article, Blog Entry, Note, File,
Photo, Photo Album, Playlist, Video, Audio, Bookmark, Person, Group, Place or Comment. These actions
can have such contexts as Location, Mood and Annotation. Stream aggregator Cliqset publishes
Activity Streams feeds that don’t require API authentication to view. You can see a sample one at:
http://cliqset.com/feed/atom?uid=dbounds.
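The actor, verb and object model quoted from the spec can be pictured as a small data structure. What follows is a minimal sketch in Python, not the Atom-based wire format itself: the class name, the field names and the sample entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Activity:
    """One entry in an activity feed: who did what to which object, and when."""
    actor: str            # the subject, e.g. a user name
    verb: str             # the action, e.g. "post", "share", "join"
    object_type: str      # the kind of social object, e.g. "photo", "article"
    object_title: str     # a human-readable label for the object
    published: datetime   # the instant at which the action was performed

    def summary(self) -> str:
        """Render the activity as a human-readable sentence."""
        return (f"{self.actor} performed '{self.verb}' "
                f"on {self.object_type} '{self.object_title}'")


# Hypothetical sample data, not taken from any real feed.
feed = [
    Activity("alice", "post", "photo", "Sunset",
             datetime(2009, 11, 1, 18, 30, tzinfo=timezone.utc)),
    Activity("bob", "share", "article", "Real-time search",
             datetime(2009, 11, 1, 18, 45, tzinfo=timezone.utc)),
]

for activity in feed:
    print(activity.published.isoformat(), activity.summary())
```

Because every entry names its verb and object type explicitly, a reader application can treat a shared article differently from a posted photo without guessing from free text.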
The aim of Activity Streams is to have multiple social networks use a common language and have
a common understanding of what all those things mean, so that messages can be read across different
networking sites. Messina explains that both publishing and subscription technologies need to
become more sophisticated in reading and writing streams of data in order for this vision to become
a reality.
He says:
“The real-time Web is a shift towards something more like how humans
interact with the world: the information just flows right in. When it comes
to thinking about Activity Streams, how can we add a few more semantic
hints to the original info coming to our [subscription] agents? And then
how do we filter what’s relevant? Here’s an analogy. Dogs have 300 million
receptors in their noses, so they can parse smells really well. We only have
6 million receptors in our noses. Imagine if we went from having 6 million
to 300 million receptors that we could use to filter information. We haven’t
developed those sensors yet in order to create more possibilities.”
Standardized, semantic clues from feed publishers and the ability to read them in whatever application
we use to read updates are the kinds of receptors that Messina is helping to design and implement.
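To illustrate how such semantic hints could act as “receptors,” here is a rough sketch of a subscriber-side filter that keeps only activities from chosen people, with chosen verbs, about chosen kinds of objects. It assumes each activity has already been parsed into actor, verb and object-type fields; the `Activity` tuple, the `filter_stream` function and the sample stream are hypothetical names, not part of any published specification.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Activity(NamedTuple):
    actor: str
    verb: str
    object_type: str


def filter_stream(
    stream: Iterable[Activity],
    actors: Optional[Set[str]] = None,
    verbs: Optional[Set[str]] = None,
    object_types: Optional[Set[str]] = None,
) -> List[Activity]:
    """Keep activities that match every criterion given.

    A criterion left as None means "accept anything" for that field.
    """
    kept = []
    for activity in stream:
        if actors is not None and activity.actor not in actors:
            continue
        if verbs is not None and activity.verb not in verbs:
            continue
        if object_types is not None and activity.object_type not in object_types:
            continue
        kept.append(activity)
    return kept


# Hypothetical incoming stream, possibly gathered from several networks.
stream = [
    Activity("alice", "post", "photo"),
    Activity("bob", "share", "article"),
    Activity("alice", "join", "group"),
    Activity("carol", "post", "photo"),
]

# Only photo posts from two particular friends.
wanted = filter_stream(stream, actors={"alice", "carol"},
                       verbs={"post"}, object_types={"photo"})
print(wanted)
```

The filter works only because publishers label the verb and object type in a shared vocabulary; without those labels, the subscriber would be left guessing at the meaning of each update.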
THE WEB OF PEOPLE
“The thing non-geeks can understand and bring to this is their identity,” Messina
says. “We’re getting back to the individual as the primary actor in the system.
They can hook up systems to their identity providers and do things.
“Facebook is one of the first services to orient itself in this direction; it is
providing some good R&D into where this is going, and it is doing good work
in this kind of direction. You log in to your Facebook account, and everything
flows to you. Right now, that’s the best metaphor that we have.
“I think Facebook is going to play a very important role. I think it has a desire
to align itself with the Web, just as Google does.
“Video games provide a great experience about what real-time on the Web
would be like. Gaming has to be real-time to be enjoyable. Right now, most
of the Web uses interfaces from the document-centric era of the Web that
don’t scale or translate to the real-time Web.
“For example, we want to have longer conversations, but email is one of the
big linchpins that’s broken. Outlook is so entrenched. It’s clear that these
conversation systems are broken.
“But the ‘river of news’ doesn’t have handles that regular people can grasp.
The number of old people who make Facebook wall posts and think they’re
private is enormous! But there are a lot of benefits to this real-time Web, like
being able to reply immediately to a photo. My mom would like iPhone push
notifications of pictures of me or my girlfriend. How do we lead with a carrot
to get people to shift away from email and into a real-time model?”
When Messina was 13 years old, he traveled to Greece and Italy and was shocked to find out that
people in some European cultures left work in the middle of the day to have lunch with their families
and take a nap. “The fact that a whole culture could exist and be so different from mine broke all my
assumptions,” he says. That realization gave him a great sense of hope. Now, as an adult, the tagline on
his blog reads, “All of this can be made better. Ready? Begin.”
He’s been working to make the world better ever since, and now he has a whole lot of traction. Watch
his work for an important window onto the future of the Internet.
See also:
Chris Messina’s blog http://factoryjoe.com/blog
Messina on Twitter http://twitter.com/chrismessina
His Flickr collection of notable user interfaces http://www.flickr.com/photos/factoryjoe
To understand Messina and his work, pay attention to:
• David Recordon at Facebook http://www.davidrecordon.com/
• Scott Kveton, Urban Airship http://kveton.com/blog/
• Will Norris, independent software developer http://willnorris.com
• Joseph Smarr http://josephsmarr.com/ and
John McCrea http://therealmccrea.com, Plaxo/Comcast
Brett Slatkin, Brad Fitzpatrick and PubSubHubbub
Brett Slatkin has long been an idealist. “If I made a great
product, and Microsoft offered me a lot of money, I would
spit in their faces,” he told Newsweek while a brash freshman
at Columbia University in 2002. He joined Google after
completing a computer science degree in 2005. Last year,
Slatkin sprang into public view with the launch of Google
App Engine, a product that lets developers run their Web
applications on Google’s infrastructure.
Slatkin works on App Engine as his day job, but for his 20% time project he has led the creation of an
important new real-time syndication protocol called PubSubHubbub. Slatkin calls it Hubbub for short.
HOW HUBBUB WORKS
The PubSubHubbub model has three parties. There’s a Publisher (FeedBurner, for example) and a
Subscriber (perhaps Netvibes), and communication is facilitated through a Hub (Google’s AppSpot
Hub was the demo and is the most popular Hub so far). Every time the publisher publishes new
content, it notifies the hub; the hub to be notified is declared at the top of the
publisher’s document, just like an RSS feed URL. So, the publisher delivers new content to the hub, and
then the hub delivers that message immediately to all the subscribers who have subscribed to receive
updates from that particular publisher.
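The hub declaration mentioned above can be read out of a feed with a few lines of code. This is a sketch under stated assumptions: it uses Python’s standard `xml.etree.ElementTree` module, it assumes the hub is advertised as an Atom `link` element with `rel="hub"`, and the sample feed, its URLs and the `discover_hubs` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed document; all URLs are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <link rel="hub" href="http://hub.example.com/"/>
  <link rel="self" href="http://publisher.example.com/feed.atom"/>
  <entry>
    <title>First post</title>
    <id>tag:publisher.example.com,2009:1</id>
    <updated>2009-11-01T18:30:00Z</updated>
  </entry>
</feed>
"""


def discover_hubs(feed_xml: str) -> List[str]:
    """Return the URLs of every hub declared at the top level of an Atom feed."""
    root = ET.fromstring(feed_xml)
    return [
        link.get("href")
        for link in root.findall(f"{ATOM_NS}link")
        if link.get("rel") == "hub" and link.get("href")
    ]


print(discover_hubs(SAMPLE_FEED))
```

A subscriber that finds a hub this way can register with it once instead of fetching the feed over and over; a feed with no hub declaration simply yields an empty list, and the subscriber can fall back to polling.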
This is very different from the traditional model in which a subscriber polls a publisher directly every 5
to 30 minutes (or less) to see if there’s new content. There usually isn’t new content, and so that model
is inefficient and slow. Hubbub is nearly immediate and only takes action when something important
occurs. Protocol co-creator Brad Fitzpatrick says that the current system of websites polling each other
for updates is like a kid in the back seat of a car saying “Are we there yet?” over and over again. Hubbub
says, “Shut up, kid. I’ll tell you when we get there.” That’s how Fitzpatrick explains it.
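The inefficiency of polling is easy to put into numbers. The short sketch below counts how many requests a single subscriber makes per day at the two polling intervals mentioned above, and how many of them find nothing new. The publishing rate of three updates per day is a hypothetical figure chosen only for illustration, as are the function names.

```python
def polls_per_day(interval_minutes: int) -> int:
    """Number of polling requests one subscriber makes in 24 hours."""
    return (24 * 60) // interval_minutes


def wasted_polls(interval_minutes: int, updates_per_day: int) -> int:
    """Polls that find nothing new, assuming at most one poll is needed per update."""
    return max(polls_per_day(interval_minutes) - updates_per_day, 0)


UPDATES_PER_DAY = 3  # hypothetical publishing rate

for interval in (5, 30):
    total = polls_per_day(interval)
    wasted = wasted_polls(interval, UPDATES_PER_DAY)
    print(f"poll every {interval} min: {total} requests/day, {wasted} find nothing new")

print(f"push: {UPDATES_PER_DAY} notifications/day, none wasted")
```

Polling every 5 minutes costs 288 requests a day per subscriber and polling every 30 minutes costs 48, while a push model sends one notification per actual update. Polling also delays each update by up to a full interval, which push avoids.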
It’s remarkably simple, at the end points in particular. Any complexity lives in the
hub, and that’s easily available as a service if a publisher doesn’t want to host their own. The hub does
things like authenticate subscribers, check in with feeds that haven’t pinged it lately, deliver a single
update from a publisher to multiple subscribers and act as a publisher itself for other hubs to subscribe
to. Neither publishers nor subscribers have to worry about the hub’s details, though, unless they are
looking for things like subscriber analytics.
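One of the hub duties listed above, delivering a single update from a publisher to many subscribers, can be sketched as a toy in-memory model. The real protocol runs over HTTP between separate servers; this sketch collapses that into direct function calls, and the `Hub` class, its method names and the topic URL are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber is modeled as a function taking (topic_url, new_content).
Callback = Callable[[str, str], None]


class Hub:
    """Toy hub: remembers who subscribed to which topic and fans out updates."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic_url: str, callback: Callback) -> None:
        """Register a subscriber's callback for one publisher's topic."""
        self._subscribers[topic_url].append(callback)

    def publish(self, topic_url: str, new_content: str) -> int:
        """Deliver new content to every subscriber of the topic.

        Returns the number of subscribers that were notified.
        """
        callbacks = self._subscribers.get(topic_url, [])
        for callback in callbacks:
            callback(topic_url, new_content)
        return len(callbacks)


hub = Hub()
topic = "http://publisher.example.com/feed.atom"  # placeholder URL
inbox_a: List[str] = []
inbox_b: List[str] = []

hub.subscribe(topic, lambda t, content: inbox_a.append(content))
hub.subscribe(topic, lambda t, content: inbox_b.append(content))

delivered = hub.publish(topic, "New post: hello")
print(delivered, inbox_a, inbox_b)
```

The publisher makes one call no matter how many subscribers exist, which is why the end points stay simple while the hub absorbs the work of fan-out.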
Real-time PubSubHubbub feeds are already being published by FeedBurner, Blogger, LiveJournal,
LiveDoor, Google Alerts and the feed republishing service Superfeedr. Facebook’s FriendFeed,
LazyFeed and the newest version of Netvibes are consuming Hubbub feeds so far, as are a number of
small sites and services that are using the feeds for machine-to-machine communication.
Slatkin is the public face of the protocol, but he created it with Google’s Brad Fitzpatrick.
Fitzpatrick, now 29 years old, grew up in Oregon and built the popular social-networking service
LiveJournal while he was in high school in 1999. One year later, he hired Martin Atkins, then a high-
schooler in the UK and now a SixApart engineer and a leader in the online identity community. (Atkins
also had a big hand in formalizing Hubbub.) In 2003, LiveJournal grew fast and hired a number of
additional engineers, including then high-school senior and now Senior Open Programs Manager at
Facebook David Recordon. Also in 2003, Fitzpatrick’s company developed Memcached, an open-source
memory caching system that’s used today by Twitter, Digg, YouTube, Craigslist, Wikipedia, WordPress,
Flickr and more. In 2005, Fitzpatrick sold LiveJournal to SixApart. Later that year, he created the first
OpenID authentication protocol for LiveJournal. In other words, he’s been a whirlwind of technical
innovation for the last 10 years.
Fitzpatrick is now at Google working on what could become the infrastructure for distributed,
independent and inter-operable social networks, PubSubHubbub among them.
Fitzpatrick explained:
“Real-time stuff is one dependency around federated social networking. No
one would suggest a chat function that’s based on polling, for example.
You can’t compete with walled gardens that have real-time internally if you
don’t. One of the obstacles has always been real-time: engaged conversation,
news feed, etc. So in order to solve social networking we need to implement
PubSubHubbub and WebFinger [a profile-syncing technology that Fitzpatrick
is now working on: http://groups.google.com/group/webfinger].
“Things are about to get interesting. I don’t need another social networking
site – we need competition, we need the basic crap that all these sites
do [posting, commenting, sharing, etc.] to be federated and all working
together.”
So Atom-based Activity Streams may be the language in which functions such as posting, commenting
and sharing are expressed; and then PubSubHubbub may be the method of delivering the Atom feeds
of updates in real-time.
The use cases are essential to consider, but Slatkin thinks of this work mostly as creating better
building blocks that can then be used for anything. He emphasizes that engineers need to be building
now to scale for the unforeseeable use cases of the future.
“Real-time implementers need to think about consistent [application] workloads,” he told us. “That’s
the only way they can scale.”
“To sip from the fire hose you need to only get what you care about. If you
have to cut anything out, then you’ll drown. People say ‘RSS and Atom are
good enough!’ I don’t think people know where we’re going to be in 10
years. Right now our back ends can handle the load – but if we only cared
about today, then we’d just stay home. The whole point of technology is to
make new things. When people think about the real-time Web, they need to
think about new use cases that no one has considered because they seemed
technically unfeasible. If you told someone 10 years ago that you could have
15 people concurrently editing a document – that was crazy!”
Slatkin emphasizes that we can’t know what the ultimate killer apps for push will be, but he rattled off
to us a short list of ways in which he could imagine them being put to use:
“Push as compliance with SEC for filing financial reports. Real-time
monitoring of the performance of cloud services using Hubbub. Sensor
networks: tiny sensors everywhere with little bits of data, sonar modules or
IR pings. Put a thousand of those in a field and get a 3-D picture of what’s
going on. So far, that’s been done with binary, proprietary, one-off protocols,
hard to use. Open, real-time Web data could enable vast numbers of people
to consume that sensor data. It could be used on battlefields, football fields
or as road data.”
Fitzpatrick thinks Hubbub could even replace Google’s crawls of the Web. “All content should be
real-time and subscribable,” he says. “You could replace crawling with this, every page on the Web. You
could probably get most pages pretty soon, but one could imagine modifying Apache to support this
by default.”
Former Googler Paul Buchheit (Googler #23, in fact), now at Facebook after selling FriendFeed to the
company earlier this year, zooms into the smallest details. “The next step is for people to open more of
their current activities and plans,” he wrote in a recent blog post.7
7 http://paulbuchheit.blogspot.com/2009/11/open-as-in-water-fluid-necessary-for.html
“This is often referred to as ‘real-time’, but since real-time is also a technical
term, we often focus too much on the technical aspect of it. The ‘real-time’
that matters is the human part -- what I’m doing and thinking right now,
and my ability to communicate that to the world, right now... When this
activity reaches critical mass, it should be very interesting for society. It
dramatically alters the time and growth coefficients in group formation. It
enables a much higher degree of serendipity and ad hoc socializing.
“The basic pattern of openness is that better access to information and
better systems lead to better decisions and better living. This general
principle is broadly accepted, but we’re just now discovering that it also
applies to the minutiae of our lives.”
See also our May article about Buchheit, “The Man Who Made Gmail Says Real-Time Conversation is
What’s Next”:
http://www.readwriteweb.com/archives/the_man_who_made_gmail_says_real-time_conversation.php
So, matters large and small will be shared on the Internet; they’ll be marked up in standard formats,
and they’ll be pushed in real-time to anyone or any application that wants them. Then, we’ll analyze
and learn from them individually and in aggregate.
See also:
• Brett Slatkin streams his activities at http://www.onebigfluke.com/
• Brad Fitzpatrick posts frequently to Twitter at http://twitter.com/bradfitz
• Google’s DeWitt Clinton is good to follow as well for related topics: http://twitter.com/dewitt