(JOINED IN PROGRESS)
…came up after the applications it was meant to describe. Napster, SETI@home, ICQ, their
cousins. So peer-to-peer is a label and not a definition. The idea of computers
communicating with one another as peers is the founding idea of the Internet itself. So
the mere fact of peers communicating with one another cannot be the big deal. That can’t
be the full explanation of why what’s happening now is different from what happened
before. What is a big deal about what we see happening now is what and where these new
peers are. The new nodes in the peer-to-peer systems are devices, and principally PCs,
connected to the edges of the Internet cloud. In the early 90s, when the launch of
Mosaic started to drive real connection of PCs to the Internet for the first time, a PC
was really used as nothing more than a life support system for a browser. And that’s
pretty much the status quo we’ve had from then until now. PCs have always existed behind
a veil of second-class connectivity. Because you could not get a permanent IP address for
your PC, you could not get a domain name, and because you could not get a domain name
you could not really host anything. People at the network’s edges were relegated to being
always consumers of resources but never providers of resources. The veil of second-class
connectivity created a second class of users - the people at the edges of the network.
Now, in engineering terms this was not such a big problem in the early 90s. The PCs we
had then were essentially toy computers. They were flaky, they were crash-prone, they
were weak, they were slow. But look what’s happened in the last five years around the
edges of the network - the operating systems have gotten distinctly less flaky, the
applications have gotten less crash-prone. Thanks to Moore’s Law and the growth in data
density, you can now for a thousand bucks buy a server class machine and stick it under
your desk. At the edges of the Internet, thanks to both the increase in the quality of
hardware as well as the massive increase in the number of devices connected, there are
now, at a conservative estimate, ten quadrillion clock cycles per second of compute time,
there are ten thousand terabytes of storage space. These are resources you could do
something with, if you could get to them. PCs are the dark matter of the Internet, the
part of the fabric of the Internet that is there but has not yet been woven into the
whole. There is a vast array of resources behind that veil of second-class connectivity,
resources that are inaccessible because they exist in a world of variable connection and
unpredictable IP addresses, and P2P is a way of piercing that veil, P2P is a way of
aggregating those resources.
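The back-of-the-envelope arithmetic behind totals like these can be sketched as follows. The per-machine figures (100 million PCs, a 100 MHz chip and 100 MB of disk each) are assumptions chosen for illustration, not numbers given in the talk; they simply show one way to arrive at ten quadrillion cycles per second and ten thousand terabytes.

```python
# Illustrative assumptions only: none of these per-machine figures appear in the talk.
ASSUMED_PCS = 100_000_000          # PCs connected at the network's edge
ASSUMED_CLOCK_HZ = 100_000_000     # 100 MHz per PC
ASSUMED_DISK_BYTES = 100_000_000   # 100 MB of disk per PC (decimal units)


def aggregate(pcs, clock_hz, disk_bytes):
    """Return (total clock cycles per second, total storage in terabytes)."""
    total_cycles = pcs * clock_hz
    total_terabytes = pcs * disk_bytes / 10**12
    return total_cycles, total_terabytes


cycles, terabytes = aggregate(ASSUMED_PCS, ASSUMED_CLOCK_HZ, ASSUMED_DISK_BYTES)
print(f"{cycles:,} cycles per second")
print(f"{terabytes:,.0f} terabytes of storage")
```

Even with deliberately modest per-machine numbers, the totals are large because the multiplier is the number of machines at the edge.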
So here’s my working definition of P2P. An application is peer-to-peer if it aggregates
resources at the network’s edge, and those resources can be anything. It can be content,
it can be cycles, it can be storage space, it can be human presence. I’d like to echo
Tim’s picking up on Dave Winer’s point that the P in P2P is people. ICQ is an example
where instead of making variable connectivity a disadvantage, it makes it an advantage
because it tells you something important about whether that person is there or not. And
we don’t know all of the resources that are going to be aggregated in P2P systems yet.
We haven’t seen a P2P app that requires 30,000 sound cards, or 30,000 video cards but
we’re going to, someday. The other half of my definition is this: In order to get to
those resources, P2P applications have to solve what I call the addressing problem.
P2P applications have to find some way to address the nodes outside of the DNS system
that we’re used to, usually by creating an alternative namespace managed by the service
itself. Napster manages the Napster namespace, ICQ manages the ICQ namespace, and so
forth. And this is what’s required in order to be able to reach those nodes that are
variably connected, and because of this variable connectivity, the nodes themselves have
to have significant or total autonomy from any central server. This is what makes peer-
to-peer distinctive. Peer-to-peer applications create new addressing schemes for the
resources of the network’s edge and then they use those resources to create new functions.
Now I’d like - A couple of caveats about this address, about this definition. Several
people have asked about wireless and why in this version I’m so focused on PCs. There’s
no engineering reason that PCs are the important resources. It’s really just a fact of
history - to echo Willie Sutton, PCs are where the cycles are. But as Steve Burbeck from
IBM has pointed out, in a billion-device future, all of the devices are going to have to
be peer-to-peer, because we are going to have to find ways to connect them all to one
another without central management. So as wireless devices, and as things like the TiVo
and WebTV grow, they will also become part of peer-to-peer systems. The other thing
about this definition is that this is not what makes peer-to-peer applications important,
it’s just what makes them possible. What makes a peer-to-peer application important is
what it does with the resources it aggregates.
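The addressing problem can be made concrete with a small Python sketch. It is a toy, not how Napster or ICQ actually work: the class name EdgeNamespace, its methods, the timeout and the example addresses are all hypothetical. It shows only the idea of a service-managed namespace in which a stable handle maps to whatever IP address a node has right now, and in which variable connectivity doubles as presence information.

```python
import time


class EdgeNamespace:
    """Hypothetical service-managed namespace: stable handle -> current address."""

    def __init__(self, timeout_seconds=60.0):
        self.timeout_seconds = timeout_seconds
        self._entries = {}  # handle -> (ip, port, last_seen)

    def check_in(self, handle, ip, port, now=None):
        """A node announces its current address each time it connects."""
        now = time.time() if now is None else now
        self._entries[handle] = (ip, port, now)

    def resolve(self, handle, now=None):
        """Return (ip, port) if the node checked in recently, otherwise None."""
        now = time.time() if now is None else now
        entry = self._entries.get(handle)
        if entry is None:
            return None
        ip, port, last_seen = entry
        if now - last_seen > self.timeout_seconds:
            return None  # no recent check-in: treat the node as offline
        return ip, port

    def is_present(self, handle, now=None):
        """Presence falls out of the same lookup used for addressing."""
        return self.resolve(handle, now) is not None


# The handle stays fixed while the underlying IP address changes.
ns = EdgeNamespace(timeout_seconds=60)
ns.check_in("alice", "203.0.113.7", 6699, now=0)
ns.check_in("alice", "198.51.100.22", 6699, now=30)  # reconnected with a new IP
current = ns.resolve("alice", now=40)
later = ns.is_present("alice", now=200)
```

In this sketch the name outlives any particular IP address, which is the property DNS does not give a variably connected PC.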
With that general background, I’d like to turn my attention to Napster for a moment. As
Tim noted, an ocean of ink has been spilled about Napster, and another ocean of ink is
in the process of being spilled thanks to Monday’s ruling. But so much of this has been
a kind of hysteria about intellectual property or strange inquiries into the nature of
the law and popular culture and I want to leave that aside - Napster has obviously
succeeded in large part because it’s about music, which is something people love. But
Napster also has lessons for us in terms of engineering and structure. And I think the
first lesson for this group is about decentralization. And it may seem like coals to
Newcastle to talk to a group that’s come together to think about peer-to-peer and say
that Napster has lessons about decentralization, but I think the lessons aren’t the
immediately obvious ones. Napster, as has been noted, is not fully decentralized. Napster
maintains two critical central resources - a database of songs and a database of user
addresses. And there has been some criticism from a group of people I guess I would call
peerier-than-thou, people who never saw a centralized service they didn’t want to smash
into shards. And I would like to suggest that, far from being random or a mistake,
Napster’s mix of centralization and decentralization is actually
split very savvily along very particular economic lines. It’s been widely noted in the
literature regarding free markets that it is very difficult to coordinate group behavior
among anonymous, autonomous self-interested actors. The classic thought experiment for
this is the tragedy of the commons, where a group of shepherds each individually grazes
their sheep as much as possible on commonly owned land, in order to exercise their
selfish interests. And the result of this is the land is overgrazed and the group as a
whole suffers. And yet Napster has crossed the 50 million user threshold without suffering
from the tragedy of the commons. I believe that what Napster has done is that it’s
decentralized the aspects of the system that you would do for yourself anyway. It has
decentralized the aspects that can be handled by selfishness but it has centralized
the things that have to be coordinated away from the behavior of the individual actors.
You would buy that PC anyway. You would pay for that hard disk anyway. You would get
that Internet connection anyway. And you would be happy to have the music you like on
your hard drive. Those are all of the things Napster decentralizes. You would not be happy
to maintain a database of music you don’t like, or of other users whose taste
you don’t share, on your PC. You’re not going to give your resources away for that while
pursuing your own selfish goals. So Napster has centralized the things that require a kind
of organized coordination away from the autonomous actors at the edges of the network. I
believe that Napster is the best example we have of a class of applications that I would
call “decentralized enough.” Napster has mixed centralization and decentralization in a
really canny way to create what is obviously the most explosively adopted peer-to-peer
application on the Internet. And by suggesting that Napster is decentralized enough, I
hope to suggest, even of my own definition, that rather than insisting on a perfect test
for “is this in” or “is this out”, we recognize people who share our goals. Many people
have contested with me my focus on what I consider the brokenness of the DNS system, and
consider that a sideshow, and my definition I recognize doesn’t include a lot of work on
the two-way Web or dynamic DNS, but I recognize that people working on the two-way Web
and dynamic DNS are fellow travelers. Anyone thinking about ways of decentralizing power
and putting it back into the hands of users, what Larry Lessig has called the end-to-end
internet, shares a goal with me, and rather than worrying too much about whether things
are completely perfectly decentralized, I’d like to suggest that we think about whether
things are decentralized enough to achieve their goals. I think what Napster has shown us
is that decentralization is better as a tool than a goal.
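The split described here, a central index for the things that need coordination and peer-held files for everything else, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Napster's actual protocol: the names CentralIndex, Peer, publish, search, serve and fetch are hypothetical, and the "transfer" is an in-process function call standing in for a direct connection between two PCs.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central database: which users currently offer which titles."""

    def __init__(self):
        self._holders = defaultdict(set)  # title -> handles of peers offering it

    def publish(self, handle, titles):
        for title in titles:
            self._holders[title].add(handle)

    def search(self, title):
        # .get avoids creating empty entries for titles nobody offers
        return sorted(self._holders.get(title, ()))


class Peer:
    """A node at the edge: it stores only the files its owner chose to keep."""

    def __init__(self, handle, files):
        self.handle = handle
        self.files = dict(files)  # title -> file contents

    def serve(self, title):
        return self.files[title]


def fetch(index, peers, title):
    """Ask the index who has the title, then copy it directly from that peer."""
    for handle in index.search(title):
        peer = peers.get(handle)
        if peer is not None:
            return peer.serve(title)
    return None


index = CentralIndex()
alice = Peer("alice", {"song-a.mp3": b"AAAA"})
bob = Peer("bob", {"song-b.mp3": b"BBBB"})
peers = {p.handle: p for p in (alice, bob)}
for p in peers.values():
    index.publish(p.handle, p.files)  # only titles reach the center, never the data

data = fetch(index, peers, "song-b.mp3")
```

The center holds only the small, shared bookkeeping; the storage and bandwidth stay with the people who would have paid for them anyway.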
Another lesson I think Napster has for us is about usability. And this, again, might seem
almost tautologically daft - if tens of millions of people use it, it must by definition
be usable, right? But Napster has a very different kind of usability than the one that
we’re accustomed to. With the launch of the Web, and in particular with the ease of use
of HTML, average network citizens could create a user interface, and of course the result
was mostly dreck, it was badly scanned pictures of people’s pets and favorite band lists
and so forth. And this incredible literature of concern about how terrible websites are
has grown up to the point where we now think of usability as relating primarily to what
users see and interact with at the surface of an application, and where in conversation
we use the word usability to be synonymous with good interface design. But a funny thing
happens when you apply this definition to Napster, because Napster’s user interface is
completely terrible. It’s really, really dreadful. If Photoshop is a 10, Napster is about
a 3. So here we have a bit of a paradox, which is: here’s an incredibly usable
application, by definition, 50 million users can’t be wrong, with a terrible user
interface. So where is the usability in Napster? I think Napster’s main innovation is
that it provides usability for the network layer. Napster makes the Internet itself
usable. In particular, what Napster does is it lowers the barrier to network
configuration. The current deal we have with our end users is, you can do anything you
like with that hunk of silicon under your desk. You can consume any publicly available
resource on the Internet on demand without apology. But the minute you want to provide a
resource, the minute you want to create a network name for yourself, that’s some powerful
juju and you have to get some experts involved. A thought experiment I think might
illustrate this - let’s say I came to your house, and I put a PC under your desk,
connected it to the Internet and handed you a file. And said here, serve this from that.
How hard can that be, right? Build a website on that PC, take this file, serve it. What
would you have to do? You’d have to go to your ISP and you’d have to convince them to
give you a fixed IP address, the chances of which are approximately forget it, but let’s
say that they let you do that. Then you’d have to go to a registrar and you’d have to
fill in all sorts of weird information, like who’s your technical contact and what’s a
NIC handle and hello, I just wanted to serve this file. But once you get through that,
and pay the registrar for the privilege, you then have to sit around for a few days.
Actually, you have to go back to your ISP and convince them that you want to use their
DNS servers to point to your PC in your house which is sort of forget it squared, but
let’s pretend it happened. Then you have to sit around for a few days and maybe when
someone types that domain name in, packets will show up at your box. Well, hallelujah,
you’re done, right? But no, not yet. You still have to download Apache, and then you
have to put on your hip waders because you have to go into httpd.conf and the Apache
configuration files are a travesty of user hostility. So your computer, you own it, your
file, you own it, and you have to involve several other parties, pay them for the
privilege, and the time spent can be measured in both hours worked and days wasted. And
then, maybe, you can serve that file. If, however, the file is a music file, you can
download Napster and in five minutes you can be serving it on the Internet without
having to involve anyone else. Napster creates usability at the network layer. Using
Napster, I can create a human readable, permanent internet address for myself, and the
best part is, I can do it all for free without having to ask anyone else for either help
or permission. That is a revolution. And for the people working on the two-way web, make
that your benchmark. Don’t accept the current difficulties of configuring a domain name
and configuring a web server. When I can serve an HTML file from my PC as easily as I can
serve an MP3 file from my PC, then you’re done. Then you’ve really achieved something.
The third lesson that I believe Napster holds out for us is a message about how we interact
with the user base. As the network grows, the intelligence of the average user converges
on the intelligence of the average member of the population as a whole. And that’s not
sociology, that’s math. In the old days, the average computer user was in the John von
Neumann range, they are now in the Alfred E. Neuman range, and that is never going to
change. So, this is our world. This is the world we live in. The difficulties of network
configuration must go away for this to succeed. There is a parallel here. Twenty years
ago, when the PC first arrived, the mainframe people scoffed. Because they knew that no
one could run a computer without special training, and in particular no one could run a
computer on their own. And so people smuggled PCs into the enterprise through the back
door behind the backs of the people running the mainframe. That’s what’s happening with
network configuration today. People are smuggling peer-to-peer applications into the
enterprise under the noses of the IT department. And that is a critical - that has
happened because Napster and ICQ and their cousins, rather than saying, oh, we’re going
to educate the users about IP addresses and network configuration, have said instead,
we’re going to lower the threshold of adoption until the average network user can create
their own network address, their own network identity for themselves.
The fourth and last lesson Napster has for us is short and it’s bad. Despite all the
talk about how IPv6 was going to bring this unlimited new future of manageable addresses,
it was totally apparent by the late 90s that absolutely no one who had any responsibility
for the public Internet was going to lift a finger to allow users to create their own
network addresses for themselves. So Napster and ICQ and their cousins stepped in and
simply solved the problems themselves. You have to admire the entrepreneurial force
behind this, but you also have to worry about control. The WHOIS database contains 23
million addresses. The Napster database contains more than 50 million. The AOL database
for AIM and ICQ contains more than 150 million. The universe of peer-to-peer addresses
in four years is already much vaster than the universe of DNS addresses, centrally
managed DNS addresses, and it is growing at a much faster rate. So, ironically, Napster
has brought back end-to-end connectivity, it’s brought back what Larry Lessig, and Jon
Udell, and Dave Winer have been focusing on about restoring power to the people at the
edges of the nodes, I’m sorry, in the nodes at the edges of the Internet. But it has done
this by creating a privately managed address space. And the risk we face with peer-to-
peer applications is a rise in balkanization even as we get ease of use. We are seeing
private databases to public networks, and the risk of that cannot be overstated. Resist
this. Whatever else you are thinking about when you think about peer-to-peer, think about
interoperability. Don’t worry about standards yet. Take the message of the earliest days
of the DARPAnet, where they weren’t trying to force computers to adopt common standards,
they were trying to provide a layer of interoperability, and only after they did that did
things like TCP/IP become implemented at the machine level. Think about interoperability.
I believe this is both the biggest opportunity and the biggest challenge facing this
group. A little over fifty years ago, Thomas Watson from IBM said that he could foresee
a need for perhaps five computers worldwide, and we now know that that figure was wrong,
because he overestimated by four. I don’t know what the framework for that one global
computer is going to look like, but I do know two important things about it. It’s not Sun
One, and it’s not dot-net, and it’s not any other corporate press release you might have
read about. It will have elements of those things, of course, but the challenge is too
big to be done by one company and it’s too important to be owned by one company. And the
other thing I know about it is this: there are better than even odds that the people
responsible for really getting that framework going are in this room, and they are not
necessarily going to be on this stage in the next three days, so I want to echo what
Tim said: Find each other. Talk to each other. If this is going to happen, this is going
to be the group that gets it done.