Two weeks ago, I broke down and bought a Kindle. I like it:

  • It’s a good and well-designed reader, and the experience is much better than the other e-book reading I’ve done before on phones and PDAs. I like how when you bookmark a page, you can see it… the corner of the page gets a little dog-ear. [1]
  • It’s got a nice e-paper screen that uses ambient light, not backlight, which makes it readable anywhere just like a printed page — it’s even better, not worse, in direct sunlight.
  • It’s light and thin and sturdy. Sure beats carrying three or four books on a trip.
  • It has great battery life. I’ve only charged it once so far, when I first received it… since then I’ve had it for 11 days and read a full book and a half, and it still has 75% of its first charge left. (It helps that I turn the wireless off unless I’m actively using it.)
  • Fast, free wireless everywhere in the U.S., without computers or WiFi.

But today, it transformed my reading experience.

This morning, I was unsuspectingly reading my feeds in Google Reader as usual, blissfully unaware that the way I read books was about to change. Among other articles, I noticed that Slashdot ran a book review of Inside Steve’s Brain (that’s Jobs, not Wozniak or Ballmer). The review made me want to read the book. That’s when the new reality started, because I was interested in the book now, and had time to start reading it now:

  • Normally, I would have ordered it from Amazon and waited for it to arrive. But what usually happens then is that the book arrives a week later, and when it gets here I don’t have time to start it right away or I don’t quite feel like that kind of book just at the moment… and it goes on a shelf, with a 70% probability of being picked up at some point in the future.
  • Today, I searched for the book title on my Kindle, clicked “Buy”, and a few minutes later started reading Inside Steve’s Brain while eating lunch. [2]

That convenience isn’t merely instant gratification; it’s transformative. I suspect I’m going to be reading even more books now, even though I have a few little nits with the device. For example, the next and previous page buttons are a little too easy to press in some positions.

In other news, the Kindle also supports reading blogs and newspapers and some web surfing, but those are less compelling for me because I tend to do those things in the context of work, which means I’m already sitting at a computer with a bigger color screen and full keyboard. Maybe someday I’ll do it on e-paper. Until then, just living inside a virtual bookstore is plenty for me. Kindle + the Amazon Kindle store = iPod + iTunes for books. [3]

Here’s a useful summary article on Kindle features from a prospective user’s point of view.


1. The first two books I downloaded? The Design of Everyday Things, which was interestingly apropos to read on a new device like the Kindle with its nice dog-ear feedback and other well-designed features, and Foundation, which I hadn’t read in ages.

2. And it cost less than half what the dead-tree version would (though the latter was hardcover).

3. Caveat: I’m not actually an iPod owner, and I hate how Apple keeps insisting on installing iTunes on my computer just because I have Safari or QuickTime installed (mutter grumble evil product-tying monopolies mutter grumble :-) ). But apparently everyone else loves them, and they have indeed changed their industry.

Quad-core a "waste of electricity"?

Jeff Atwood wrote:

In my opinion, quad-core CPUs are still a waste of electricity unless you’re putting them in a server. Four cores on the desktop is great for bragging rights and mathematical superiority (yep, 4 > 2), but those four cores provide almost no benchmarkable improvement in the type of applications most people use. Including software development tools.

Really? You must not be using the right tools. :-) For example, here are three I’m familiar with:

  • Visual C++ 2008’s /MP flag tells the compiler to compile files in the same project in parallel. I typically get linear speedups on the compile phase. The link phase is still sequential, but on most projects compilation dominates.
  • Since Visual Studio 2005 we’ve supported parallel project builds in Batch Build mode, where you can build multiple subprojects in parallel (e.g., compile your release and debug builds in parallel), though that feature didn’t let you compile multiple files in the same project in parallel. (As I’ve blogged about before, Visual C++ 2005 actually already shipped with the /MP feature, but it was undocumented.)
  • Excel 2007 does parallel recalculation. Assuming the spreadsheet is large and doesn’t just contain sequential dependencies between cells, it usually scales linearly up to at least 8 cores (the most I heard that was tested before shipping). I’m told that customers who are working on big financial spreadsheets love it.
  • And need I mention games? (This is just a snarky comment… Jeff already correctly noted that “rendering, encoding, or scientific applications” are often scalable today.)
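To put rough numbers on the "parallel compile, sequential link" point, here is a minimal sketch using Amdahl's law. The function name and the 90% compile fraction are illustrative assumptions, not measurements from any real project.

```python
def build_speedup(compile_fraction: float, cores: int) -> float:
    """Estimate whole-build speedup via Amdahl's law.

    compile_fraction: share of single-core build time spent compiling
                      (assumed to scale linearly with cores).
    The remainder (e.g., linking) is assumed to stay sequential.
    """
    if not 0.0 <= compile_fraction <= 1.0:
        raise ValueError("compile_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    sequential_fraction = 1.0 - compile_fraction
    return 1.0 / (sequential_fraction + compile_fraction / cores)


if __name__ == "__main__":
    # Hypothetical project: 90% of build time is compilation, 10% is linking.
    for cores in (1, 2, 4, 8):
        print(f"{cores} core(s): {build_speedup(0.9, cores):.2f}x faster")
```

Under these assumptions, the sequential link phase caps the overall gain, but as long as compilation dominates, a quad-core machine still finishes the build in a fraction of the single-core time.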

And of course, even if you’re having a terrible day and not a single one of your applications can use more than one core, you can still see real improvement on CPU-intensive multi-application workloads on a multicore machine today, such as by being able to run other foreground applications at full speed while encoding a movie in the background.

Granted, as I’ve said before, we do need to see examples of mainstream applications (e.g., something your dad might use) that exploit manycore (e.g., >10 cores). But it’s overreaching to claim that there are no applications at all that exploit multicore (e.g., <10 cores), not even development tools. We may not yet have achieved the mainstream manycore killer app, but it isn’t like we have nothing to show at all. We have started out on the road that will take us there.

Welcome to Silicon Miami: The System-On-a-Chip Evolution

A lot of people seem to have opinions about whether hardware trends are generally moving things on-chip or off-chip. I just saw another discussion about this on Slashdot today. Here’s part of the summary of that article:

"In the near future the Central Processing Unit (CPU) will not be as central anymore. AMD has announced the Torrenza platform that revives the concept op [sic] co-processors. Intel is also taking steps in this direction with the announcement of the CSI. With these technologies in the future we can put special chips (GPU’s, APU’s, etc. etc.) directly on the motherboard in a special socket. Hardware.Info has published a clear introduction to AMD Torrenza and Intel CSI and sneak peaks [sic] into the future of processors."

Sloppy spelling aside (and, sigh, a good example of why not to live on spell-check alone), is this a real trend?

Of course it is. But the exact reverse trend is also real, and I happen to think the reverse trend is more likely to dominate in the medium term. I’ll briefly explain why, and support why I think the above is highlighting the wrong trend and making the wrong prediction.

Two Trends, Both Repeating Throughout (Computer) History

Those who’ve been watching, or simply using, CPUs for years have probably seen both of the following apposite [NB, this spelling is intentional] trends, sometimes at the same time for different hardware functions:

  • Stuff moves off the CPU. For example, first the graphics are handled by the CPU; then they’re moved off to a separate GPU for better efficiency.
  • Stuff moves onto the CPU. For example, first the FPU is a coprocessor; then it’s moved onto the CPU for better efficiency.

The truth is, the wheel turns. It can turn in different directions at the same time for different parts of the hardware. Just because we’re happening to look at a "move off the chip" moment for one set of components does not a trend make.

Consider why things move on or off the CPU:

  • When the CPU is already pretty busy much of the time and doesn’t have much spare capacity, people start making noises about moving this or that off "for better efficiency," and they’re right.
  • When the CPU is already pretty idle most of the time, or system cost is an issue, people start making the reverse noises "for better efficiency," and they’re right. (Indeed, if you read the Woz interview that I blogged about recently, you’ll notice how he repeatedly emphasizes his wonderful adventures in the art of the latter — namely, doing more with fewer chips. It led directly to the success of the personal computer, years before it would otherwise likely have happened. Thanks, Woz.)

Add to the mix that general-purpose CPUs by definition can’t be as efficient as special-purpose chips, even when they can do comparable work, and we can better appreciate the balanced forces in play and how they can tip one way or another at different times and for different hardware features.

What’s New or Different Now?

So now mix in the current sea change away from ever-faster uniprocessors and toward processors with many, but not as remarkably faster, cores. Will this sway the long-term trend toward on-processor designs or toward co-processor designs?

The first thing that might occur to us is that there’s still a balance of forces. Specifically, we might consider these effects that I mentioned in the Free Lunch paper:

  • On the one hand, this is a force in favor of coprocessors, thus moving work off the CPU. A single core isn’t getting faster the way it used to, and we software folks are gluttons for CPU cycles and are always asking the hardware to do more stuff; after all, we hardly ever remove software features. Therefore for many programs CPU cycles are more dear, so we’ll want to use them for the program’s code as much as we can instead of frittering them away on other work. (This reasoning applies mainly to single-threaded programs and non-scaleable multi-threaded programs, of course.)
  • On the other hand, this is also a force against coprocessors, for moving work onto the CPU. We’re now getting a bunch (and soon many bunches) of cores, not just one. Until software gets its act together and we start seeing more mainstream manycore-exploiting applications, we’re going to be enjoying a minor embarrassment of riches in terms of spare CPU capacity, and presumably we’ll be happy using those otherwise idle cores to do work that expensive secondary chips might otherwise do. At least until we have applications ready to soak up all those cycles.

So are the forces still in balance, as they have ever been? Are we just going to see more on-the-chip / off-the-chip cycles?

In part yes, but the above analysis is looking more at symptoms than at causes — the reasons why things are happening. The real point is more fundamental, and at the heart of why the free lunch is over:

  • On the gripping hand, the fundamental reason we’re getting so many cores on a chip is that CPU designers don’t know what to do with all those transistors. Moore’s Law is still happily handing out a doubling of transistors per chip every 18 months or so (and will keep doing that for probably at least another decade, thank you, despite recurring ‘Moore’s Law is dead!’ discussion threads on popular forums). That’s the main reason why we’re getting multicore parts: About five years ago, commodity CPU designers pretty much finished mining the "make the chip more complex to run single-threaded code faster" path that they had been mining to good effect for 30 years (there will be more gains there, but more incremental than exponential), and so we’re on the road to manycore instead.
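For a sense of scale, here is a quick back-of-the-envelope sketch of what "doubling every 18 months or so" compounds to over a decade. The function name is hypothetical, and the calculation assumes a perfectly steady 18-month doubling period, which real process roadmaps only approximate.

```python
def transistor_growth(years: float, doubling_months: float = 18.0) -> float:
    """Return the multiplicative growth in transistors per chip after
    `years`, assuming one doubling every `doubling_months` months."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Ten years at one doubling per 18 months is about 6.7 doublings.
    print(f"Growth over 10 years: roughly {transistor_growth(10):.0f}x")
```

Under that assumption, a decade yields on the order of a hundred times as many transistors per chip, which is the budget that designers have to find a use for.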

But we’re also on the road to doing other things with all those transistors, besides just manycore. After all, manycore isn’t the only, or necessarily the best, use for all those gates. Now, I said "all" deliberately: To be sure you don’t get me wrong, let me emphasize that manycore is a wonderful new world and a great use for many of those transistors and we should be eagerly excited about that; it’s just not the only or best use for all of those transistors.

What Will Dominate Over the Next Decade? More On-CPU Than Off-CPU

It’s no coincidence that companies like AMD are buying companies like ATI. I’m certainly not going out on much of a limb to predict the following:

  • Of course we’ll see some GPUs move on-chip. It’s a great way to soak up transistors and increase bandwidth between the CPU and GPU. Knowing how long CPU design/production pipelines are, don’t expect to see this in earnest for about 3-5 years. But do expect to see it.
  • Of course we’ll see some NICs move on-chip. It’s a great way to soak up transistors and increase bandwidth between the CPU and NIC.
  • Of course we’ll see some [crypto, security checking, etc., and probably waffle-toasting, and shirt ironing] work move on-chip.

Think "system on a chip" (SoC). By the way, I’m not claiming to make any earth-shattering observation here. All of this is based on public information and/or fairly obvious inference, and I’m sure it has been pointed out by others. Much of it already appears on various CPU vendors’ official roadmaps.

There are just too many transistors available, and located too conveniently close to the CPU cores, to not want to take advantage of them. Just think of it in real estate terms: It’s all about "location, location, location." And when you have a low-rent location (those transistors keep getting cheaper) in prime beachfront property (on-chip), of course there’ll be a mad rush to buy up the property and a construction boom to build high-rises on the beachfront (think silicon Miami) until the property values reach supply-demand equilibrium again (we get to balanced SoC chips that evenly spend those enormous transistor budgets, the same way we’ve already reached balanced traditional systems). It’s a bit like predicting that rain will fall downward. And it doesn’t really matter whether we think skyscrapers on the beach are aesthetically pleasing or not.

Yes, the on-chip/off-chip wheel will definitely keep turning. Don’t quote this five years from now and say it was wrong by pointing at some new coprocessor where some work moved off-chip; of course that will happen too. And so will the reverse. That both of those trends will continue isn’t really news, at least not to anyone who’s been working with computers for the past couple of decades. It’s just part of the normal let’s-build-a-balanced-system design cycle as software demands evolve and different hardware parts progress at different speeds.

The news lies in the balance between the trends: The one by far most likely to dominate over the next decade will be for now-separate parts to move onto the CPU, not away from it. Pundit commentary notwithstanding, the real estate is just too cheap. Miami, here we come.