Nerdblog.com: June 2010

Native Code vs. Browser (a test)

I was really happy to see this test (from themaninblue) comparing Canvas, HTML, SVG, and Flash. It says that Flash is a little bit faster at drawing a particle system, but not by much.

I ran his tests on my desktop PC, using the "1000 particle" mode:

Chrome 5 runs this Canvas test at about 30FPS. (Yes, it's faster than IE9.)
The Flash version runs about 42FPS.

And in the land of the blind, the one-eyed man is king...

So just for kicks, I coded up a native Windows particle system to compare. (It's quite safe to try, if you have Windows.)

My code runs about 350FPS on the same test.

I know that's sort of silly, but it also looked bad if I used more than 1000 particles at a time. So my simple code is 12x faster than the fastest Canvas browser, and 8x faster than Flash. I suspect some of this is the Javascript time, and some is drawing.

This is not a GPU-based app. It doesn't even use DirectDraw. It doesn't use SIMD. It's just a 32-bit bitmap that I draw stuff to, in a loop.
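For illustration, here is a minimal sketch of that kind of loop: particles plotted one pixel at a time into a plain 32-bit buffer. It is not the native Windows code from the test (that isn't shown here), and it is written in Python for brevity, so it says nothing about speed. The buffer size, the bounce-off-the-edges behavior, and the names make_particles and step_and_draw are all assumptions.

```python
import random

WIDTH, HEIGHT = 320, 240          # assumed bitmap size
BYTES_PER_PIXEL = 4               # 32-bit pixels, stored here as B, G, R, X


def make_particles(count, seed=1):
    """Return a list of [x, y, vx, vy] particles with random positions and velocities."""
    rng = random.Random(seed)
    return [[rng.uniform(0, WIDTH - 1), rng.uniform(0, HEIGHT - 1),
             rng.uniform(-2, 2), rng.uniform(-2, 2)]
            for _ in range(count)]


def step_and_draw(particles, frame, pixel=b"\xff\xff\xff\x00"):
    """Advance every particle one step and plot it as a single pixel."""
    frame[:] = bytes(len(frame))                  # clear the bitmap to black
    for p in particles:
        p[0] += p[2]
        p[1] += p[3]
        # Bounce off the edges (an assumption about how the test behaves).
        if p[0] < 0 or p[0] > WIDTH - 1:
            p[2] = -p[2]
            p[0] = min(max(p[0], 0), WIDTH - 1)
        if p[1] < 0 or p[1] > HEIGHT - 1:
            p[3] = -p[3]
            p[1] = min(max(p[1], 0), HEIGHT - 1)
        # Write one 32-bit pixel directly into the buffer.
        offset = (int(p[1]) * WIDTH + int(p[0])) * BYTES_PER_PIXEL
        frame[offset:offset + BYTES_PER_PIXEL] = pixel


frame = bytearray(WIDTH * HEIGHT * BYTES_PER_PIXEL)
particles = make_particles(1000)
for _ in range(10):                               # ten frames
    step_and_draw(particles, frame)
```

A native version would do the same thing with a pointer into the bitmap's memory, then hand the finished buffer to the window system once per frame.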

So it seems like there's a bit of room to make browsers faster without requiring a GPU. Or maybe there are better opportunities to do vector operations in browsers and make Javascript use less time.

But overall, I don't know why we're not paying attention to pixel-bound inner loops in the browsers, like people have for years in native code. There's apparently a lot of room for improvement.

Client Fragmentation: Apps making software more expensive

I'm hearing stories of companies that want to ship on multiple platforms, where the only solution for a company is to hire 10 people to deal with it. And this seems like a lot, until you think about what's happened over the last few years.

You want to ship a client for the iPhone, Android, RIM, Mac, Windows (Linux?). Of course, to make it more fun, Microsoft has managed to fragment XP against Vista...certain features just aren't available in XP.

Back in 2000, you might take VC funding to pay for expensive Sun servers and bandwidth. Now you need a pile of dollars to hire enough people to port your app to five platforms.

For the most part, all of these client platforms use separate languages, skills, toolchains, and development environments. You have to physically shift from a Mac to a PC, or you have to launch Eclipse to write Java code, or other days Xcode, or emacs, or Visual Studio. This is all relatively new, since in the past, most developers could work in one place, and they got good at it.

Now you might be expecting history to repeat itself: one platform gets all the market share, and everyone writes software for it! That would be the 1990s outcome. (It was fun when the worst bugs we had to deal with were "A LaserJet shared from NT4 to Win98 prints upside down." Seriously. Recompile.)

Of course, I left out the Web, which is the market share leader right now. And that's actually one of the reasons it's different this time.

So to me, the absolutely brilliant thing Google has done with Android is to fragment client development. (They might not see it this way.)

With Android, I have to write Java code. And honestly, it was hard enough to make my C++ stuff work in Objective C. If I really want to be portable from desktop to mobile, I need at least three code bases now, and that's ignoring RIM and separate resources for iPad, etc. When you cross language boundaries, there is no #ifdef shared code, there is no "fix this on platform X", there's just hard work.

You end up moving a code process ("this doesn't compile anymore") to a manual one ("our project manager said I should implement this feature sometime"), and that is dangerous and error-prone territory.

Google's approach looks smart for the Web, because it makes browser compatibility look like a cakewalk. In short, client fragmentation makes writing code that runs in the browser a lot more attractive than it was 5 years ago.

Everyone knows that Apple has been even more insistent on the current fragmentation, and in the short term it has helped them: they have locked up the best developers right now. Developers hate switching environments, and so more people seem to be making apps for iPhone than any other platform, Windows included.

But in the long-term, innovation will happen in the places where it's easy for a team of 2 people to get something done that everyone can use, and I don't think successful businesses will be built around a single client platform. (Good demos absolutely will run on one platform. But after that, there's hard work.)

I think fundamentally, every time you switch languages or IDEs or physical machines to do a task, you should probably have more people involved. And as we shift languages, we've moved processes that might have been "fix the code to make it compile on the Mac" to some meta-programming process, like "we have to reimplement this feature in Java for Android."

When you have 5 developers trying to coordinate a feature set, you need people to manage it, and people to re-design it, and software gets rapidly more complicated and slower to make. Bugs are slower to fix, because your RIM guy moved over to do the iPad version, and he doesn't want to shift gears this week, maybe next? So there's another project manager to keep track of that task. Big company processes, for the smallest projects.

My project f.lux has given me some of these headaches. It's a C++ app on Windows, an Objective-C app on Mac, and a C thing on Linux. And as I'm looking through Android kernel source thinking about hooks to access Snapdragon hardware that are not currently exposed, or trying to figure out how to hack in .kexts on jailbroken iPads, I realize fundamentally this is a worse world than having one environment to deploy to. It makes me want to write Windows apps ten years ago, or Web apps instead.

One possible secondary effect of this fragmentation is that open source can help solve some of the details pretty well. But it is important to say that most of open source is corporate-funded today, and only a few very important projects get seriously committed hobbyist involvement, so I don't think this is a panacea in most ways.

For apps and businesses that need to add features and fix bugs quickly, we've got an expensive mess right now.

In the meantime, you can write for the browser, and do a client or two where it makes sense.

The world has definitely changed.

A Flawed Browser UI: the case of too many TABS

I have been watching myself avoid working in a variety of clever ways.

Most of them center around the browser, and the "tabbed" browser UI is troubling me more and more. You could say most of them are flaws in my own use, but I think they are common enough that solutions should be found.

The tabbed browser is broken. It's a bunch of tabs, often ordered badly, and the overhead of navigating it all is making computers gradually less useful for me.

I use browsers for News, for Research, for Apps, and as part of a collaborative experience.

People send links via IM, Email, or Facebook, and I click on them, which immediately disconnects the content from the original discussion. I don't know who to reply to about a certain article, because it's unclear exactly why that tab is actually open. I don't know when the tab was created, and I don't know which task I was doing when I opened it.

Often I will have 100 tabs open. I have seen Safari eat my wife's Mac with way way more than this. The "hunt" through the ten windows and 100 tabs is no fun and needs to get fixed.

Some of my tabs will be half-read articles that were interrupted by another task. Some of them will be things I've mostly read but didn't close. They don't close themselves, they don't even try...

Some of the tabs I've never looked at. Some of them are lazy bookmarks, things I'm keeping open so I can compare "A vs. B" later, except it's really hard to do that, because lining them up is quite hard.

But I do notice that I have "sets" of things...stories opened from a discussion, references from a Google search on an obscure programming term. And rarely do I think in advance, "I'm going to create a new window and start a new task."

We could have sets. We could have tasks. The computer could figure it out and make it make sense. I am really sure that a 3GHz quad core CPU could figure out "these things are all links from google results pages with setTimeout in the query string."
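As a rough sketch of how little code that grouping might take: the snippet below buckets tabs by the search query (or site) they were opened from. It assumes the browser can hand over each tab's URL and the referring page, which is a hypothetical interface, not a real browser API. The names task_key and group_tabs and the example URLs are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs


def task_key(referrer):
    """Derive a task label from the page a tab was opened from (a simple heuristic)."""
    if not referrer:
        return "unknown"
    parts = urlparse(referrer)
    query = parse_qs(parts.query).get("q")
    if query and parts.path == "/search":
        # Opened from a results page: the search terms name the task.
        return "search: " + query[0]
    return "from: " + parts.netloc


def group_tabs(tabs):
    """Group (url, referrer) pairs into sets keyed by task label."""
    groups = defaultdict(list)
    for url, referrer in tabs:
        groups[task_key(referrer)].append(url)
    return dict(groups)


# Hypothetical open tabs: (url, page it was opened from)
tabs = [
    ("https://example.com/timers", "https://www.google.com/search?q=setTimeout+portable"),
    ("https://example.org/js-faq", "https://www.google.com/search?q=setTimeout+portable"),
    ("https://example.net/lightbulbs", "https://www.facebook.com/"),
    ("https://example.com/news", None),
]
groups = group_tabs(tabs)
```

With sets like these, closing or saving a whole task is one operation on a key rather than ten clicks on individual tabs.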

(And yeah, IE does some color-coding by domain, but this is just visual noise, not organization.)

I need some really amazing ways to group tabs by task, to save sets of them, and to have the UI connect back to the places I came from.

If I do a Google search to figure out how to write some portable Javascript, I will open ten tabs along the way, and then I will eventually find the single page that shows me the answer, and the whole task should go away in one click. Today, I have to close the 10 pages individually, and the Google search window that was around in case none of those pages contained an answer. Wouldn't a great UI do this grouping for me?

If I click on a link from Facebook or Twitter, and then I want to leave a comment for the person who sent me that direction, why should I have to look through tabs to do this? Wouldn't a great UI link these pages together?

Or maybe I'm shopping online for the best energy-efficient lightbulb, and I've opened 50 pages talking about that, and Windows update comes along, and I say, man, I want to come back to this later. Wouldn't a great UI let me save them as one 'library' of stuff and close 50 tabs at once?

And what about when I sit down at my computer to look something up, but instead there's some phenomenally juicy page on the top of my browser that makes me forget what I was going to accomplish and read it instead? Wouldn't a great UI somehow ask me what I wanted to do when I woke up my laptop? Google could (and should) make this fantastic.

I am seeing a ton of add-ons and random hacks that try to do things like this, but I think this set of problems is so core to the experience and so important to get right that we have to re-think the organization of the browser.

Adding tabs to the browser was a huge step forward for so many reasons, but it is time to help people organize the work they are doing on their computers, not confuse them with distractions and 100 tabs at a time, which is mostly what my computer is, today.

Today, I usually just quit the whole thing, lose all my state, just to avoid having to look through the entire mess. This can be so much better, and it doesn't even take a miracle, just some smart code and good UI.

Software Lens Fixes (barrel distortion, etc.)

I read with some interest this article on lens distortions from dpreview. I had no idea the Canon S90 was doing so much software lens correction!
http://www.dpreview.com/articles/distortion/

That's the image out of the Canon S90, natively.

It took me back 15 years to my college dorm room, when I had the original black & white QuickCam. For this old camera (~1994) Connectix had chosen a lens which allowed more light and sharpness for less money, but at a cost of terrific barrel distortion.

Back then, I posted some code (ancient now) that does a bilinear-interpolated warp of the form r' = r + ar^3 + br^5, a polynomial approximation that seems to be able to fix most lens distortion. This code caches the warp map, and applies it to video in realtime. I'm sure it barely compiles anymore, because the hardware is really ancient:

http://stereopsis.com/old/barrel.html

Today you'd probably want to use a much better interpolation than bilinear, but this one was realtime for video back in 1996, which seemed pretty cool at the time.
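For readers who want the idea without digging through the old source, here is a minimal sketch of the same technique: build the warp map once, then resample each frame through it with bilinear interpolation. It is not the code at the link above. Several details are assumptions: r is normalized to 1.0 at the image corners, the polynomial maps each output pixel back to a source position, samples outside the frame are clamped to the edge, and the coefficient 0.1 is an arbitrary example value.

```python
import math


def build_warp_map(width, height, a, b):
    """Precompute, for every output pixel, the source coordinate to sample."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    norm = math.hypot(cx, cy) or 1.0        # assumed: r == 1.0 at the corners
    warp = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = (x - cx) / norm, (y - cy) / norm
            r2 = dx * dx + dy * dy
            # (r + a*r^3 + b*r^5) / r, so no square root is needed
            scale = 1.0 + a * r2 + b * r2 * r2
            row.append((cx + dx * scale * norm, cy + dy * scale * norm))
        warp.append(row)
    return warp


def sample_bilinear(image, sx, sy):
    """Blend the four pixels around (sx, sy); clamp to the frame edges."""
    height, width = len(image), len(image[0])
    sx = min(max(sx, 0.0), width - 1.0)
    sy = min(max(sy, 0.0), height - 1.0)
    x0, y0 = int(sx), int(sy)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = sx - x0, sy - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def apply_warp(image, warp):
    """Resample one grayscale frame through a cached warp map."""
    return [[sample_bilinear(image, sx, sy) for sx, sy in row] for row in warp]


# The map is built once and reused for every frame of video.
warp = build_warp_map(5, 5, 0.1, 0.0)
image = [[float(x + 10 * y) for x in range(5)] for y in range(5)]
fixed = apply_warp(image, warp)
```

Caching the map is what makes this cheap per frame: the polynomial is evaluated once per pixel at startup, and each frame costs only four reads and a blend per pixel.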

Really all lenses should have these kinds of computations done automatically in the processing pipeline. It would give a benefit to weight, sharpness, and cost. And the software isn't that hard to write, either.

The dpreview article mentions that optical corrections for barreling tend to introduce a mustache distortion (which even the most expensive lenses aren't able to fix perfectly).

It seems to me like lenses should have some of these distortion measurements built-in, and the data could come through the processing pipeline directly.

My impression is that DxO and Photoshop CS5 are keeping external databases of this information, which is nice for the moment. It doesn't necessarily make sense to talk to a network service just to process a RAW file. But I guess it makes it possible to do this sort of correction right away rather than waiting 10 years for metadata standards to adapt.