My original plan for being featured on the iDevBlogADay blogroll was to be able to share some of the work I’m doing on the Core Audio book. I figured that as I worked through new material, that would translate into blog entries that could then get the word out about the book.
Unfortunately, I think what’s happening is that I’ve been working on iDevBlogADay entries instead of working on the book. And that’s not going to fly, at least if we want to get done in time for WWDC (two of which this book has already missed).
So, given that, and given that there are 50 other iOS developers waiting for a turn, I’m going to cede my spot on iDevBlogADay, return to the waiting list, and hopefully apply that time to getting the book done.
If you want to keep following me, please do… previously, my blogging has tended to come and go in waves of inspiration, rather than the steady schedule that comes with participation in iDevBlogADay, so just grab the RSS feed URL or create a bookmark or just follow me on Twitter.
Today’s announcement of the new features in Android 3.0 (Honeycomb) showed a feature I truly didn’t expect to see: support for HTTP Live Streaming.
Given Google’s decision to drop H.264 support from Chrome – a move that I denounced a few weeks back and would simply characterize here as batshit crazy – the idea of embracing HLS has to be seen as surprising, given that the MPEG codecs are the only commonly-used payloads in real-world HLS. The format could handle other payloads, but in practice, it’s all about the MP4s.
And that, of course, is because the target audience for HLS is iOS devices. Apple says they have an installed base of 160 million iOS devices out there now, and even the earliest iPhone can play an HLS stream. Moreover, App Store terms require the use of HLS for non-trivial streaming video applications. So there’s more and more content out there in this format. Android is wise to hop on this bandwagon, and opt in… unless of course they turn around and expect content providers to switch to WebM payloads (one would hope they’re not that dumb).
I don’t think I’d previously thought of the iOS base as a target for media providers, but found myself thinking: could the iOS base be bigger than Blu-Ray? A little searching shows it’s not even close: as of last Summer, Blu-Ray had a US installed base of just under 20 million, while iOS devices of all stripes number 40 million in the US (coincidentally making it the largest US mobile gaming platform as well). And while Blu-Ray had a good Christmas, iPad sales were insane.
Not every iOS user is going to stream video, and most content providers will need to develop custom apps to use the feature (Netflix, MLB, etc.), but those that do are already making big investments in the format. No wonder Google is opting in now… trying to get all the content providers to support an Android-specific format (other than Flash) would surely be a non-starter.
Now if Apple and the content providers could just work the kinks out…
Experiment for you… Google for iphone youtube and look how pessimistic the completions are: the first is “iphone youtube fix” and the third is “iphone youtube not working”
Now do a search for iphone youtube slow and the completions all seem to tell a common story: that it’s slow on wifi. Moreover, there are more than 4 million hits across these search terms, with about 3.6 million just for “iphone youtube slow”.
Surely something is going on.
I noticed it with my son’s YouTube habit. He often watches the various “let’s play” video game videos that people have posted, such as let’s play SSX. I keep hearing the audio start and stop, and realized that he keeps reaching the end of the buffer and starting over, just to hit the buffer again. Trying out YouTube myself, I find I often hit the same problem.
But when I was at CodeMash last week, even with a heavily loaded network, I was able to play YouTube and other videos on my iPhone and iPad much more consistently than I can at home. So this got me interested in figuring out what the problem is with my network.
Don’t get your hopes up… I haven’t figured it out. But I did manage to eliminate a lot of root causes, and make some interesting discoveries along the way.
The most common advice is to change your DNS server, usually to OpenDNS or Google Public DNS. Slow DNS is often the cause of web slowness, since many pages require lookups of many different sites for their various parts (images, ads, etc.). But this is less likely to be a problem for a pure video streaming app: you’re not hitting a bunch of different sites in the YouTube app, you’re presumably hitting the same YouTube content servers repeatedly. Moreover, I already had OpenDNS configured for my DNS lookups (which itself is a questionable practice, since it allegedly confuses Akamai).
Another suggestion that pops up in the forums is to selectively disable different bands from your wifi router. But there’s no consistency in the online advice as to whether b, g, or n is the most reliable, and dropping b and n from mine didn’t make a difference.
Furthermore, I have some old routers I need to put on craigslist, and I swapped them out to see if that would fix the problem. Replacing my D-Link DIR-644 with a Netgear WGR-614v4 or a Belkin “N Wireless Router” didn’t make a difference either.
In my testing, I focused on two sample videos, a YouTube video The Prom: Doomsday Theme from a 2008 symphonic performance of music from Doctor Who, and the Crunchyroll video The Melancholy of Haruhi Suzumiya Episode 3, played with the Crunchyroll iPhone app, so that I could try to expand the problem beyond the YouTube app and see if it applies to iOS video streaming in general.
And oh boy, does it ever. While the desktop versions of YouTube and Crunchyroll start immediately and play without pauses on my wifi laptop, their iOS equivalents are badly challenged to deliver even adequate performance. On my iPad, the “Doomsday” YouTube video takes at least 40 seconds to get enough video to start playing. Last night, it was nearly five minutes.
If anything, Crunchyroll performs worse on both iPhone and iPad. The “Haruhi” video starts almost immediately, but rarely gets more than a minute in before it exhausts the buffer and stops.
So what’s the problem? They’re all on the same network… but it turns out speeds are different. Using the speedtest.net website and the Speedtest.net Mobile Speed Test app, I found that while my laptop gets 4.5 Mbps downstream at home, the iPad only gets about 2 Mbps, and the iPhone 3GS rarely gets over 1.5 Mbps.
The top two are on my LAN, and are pretty typical. The next two after that (1/22/11, 2:45 PM) are on the public wifi at the Meijer grocery/discount store in Cascade, MI. The two on the bottom are from a Culver’s restaurant just down 28th St. Two things stand out about these results. First, once again, neither gives me a download speed over 1.5 Mbps. Second, look at the uplink speed at Culver’s: 15 Mbps! Can this possibly be right? And if it is… why? Are the people who shoot straight-to-DVD movies in Grand Rapids coming over to Culver’s to upload their dailies while they enjoy a Value Basket?
See that little white area on the scrubber just to the right of the playhead? It’s something I don’t see much on iOS: buffered data.
So what really is going on here anyways? For one thing, are we looking at progressive download or streaming? I suspect that both YouTube and Crunchyroll use HTTP Live Streaming. It’s easy to use with the iOS APIs, works with the codecs that are in the iDevices’ hardware, and optionally uses encryption (presumably any commercial service is going to need a DRM story in order to license commercial content). HLS can also automatically adjust to higher or lower bandwidth as conditions demand (well, it’s supposed to…). Furthermore, the App Store terms basically require the use of HLS for video streaming.
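To make the bandwidth-switching idea concrete, here’s a minimal sketch in Python of how a client might choose among the variants listed in an HLS master playlist. Everything specific in it is an assumption for illustration: the sample playlist, the paths, the bitrates, and the 80% headroom rule are all made up, and this is a naive heuristic, not what the iOS player actually does.

```python
import re

# Hypothetical master playlist: three variants of one stream at different
# advertised bitrates. The paths and BANDWIDTH values are invented.
SAMPLE_MASTER = """\
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=400000
low/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1200000
mid/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2500000
high/index.m3u8
"""


def parse_variants(text):
    """Return (bandwidth_bps, uri) pairs, sorted from lowest to highest."""
    variants = []
    pending_bandwidth = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attributes = line.split(":", 1)[1]
            match = re.search(r"(?:^|,)BANDWIDTH=(\d+)", attributes)
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            # The URI line right after a STREAM-INF tag names that variant.
            variants.append((pending_bandwidth, line))
            pending_bandwidth = None
    return sorted(variants)


def pick_variant(variants, measured_bps, headroom=0.8):
    """Pick the richest variant that fits within a fraction of measured speed.

    The headroom factor is an assumed safety margin, not part of HLS.
    Falls back to the lowest variant when nothing fits.
    """
    budget = measured_bps * headroom
    affordable = [v for v in variants if v[0] <= budget]
    return max(affordable) if affordable else min(variants)


variants = parse_variants(SAMPLE_MASTER)
laptop_choice = pick_variant(variants, 4_500_000)  # my laptop's 4.5 Mbps
ipad_choice = pick_variant(variants, 2_000_000)    # the iPad's roughly 2 Mbps
```

With these made-up numbers, the laptop would land on the high variant and the iPad on the middle one. The point is only that a client with a sensible estimate of its own throughput should be able to step down instead of stalling.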
whois tells us that the 65.49.43.x block is assigned to “CrunchyRoll, Inc.”, as expected, and it’s interesting to see that most of the traffic is on port 80 (which we’d expect from HLS), with the exception of one request on 443, which is presumably an https request. The fact that the phone keeps making new requests, rather than keeping one file connection open, is consistent with the workings of HLS, where the client downloads a .m3u8 playlist file that simply provides a list of 30-second segment files, which are then downloaded, queued, and played by the client. Given the consistent behavior between Crunchyroll and YouTube, and Apple’s emphasis on the technology, I’m inclined to hypothesize that we’re seeing HLS used by both apps.
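For anyone who hasn’t looked inside one of these playlists, here’s a minimal sketch in Python of what a client does with it. The sample playlist is entirely made up (the segment names and durations are my invention, not anything captured from Crunchyroll or YouTube), and a real client handles many more tags than this.

```python
# Hypothetical media playlist. Segment names and durations are invented
# for illustration only.
SAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:30
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:30.0,
segment0.ts
#EXTINF:30.0,
segment1.ts
#EXTINF:12.5,
segment2.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text):
    """Return a list of (uri, duration_seconds) pairs from a media playlist."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an M3U8 playlist")

    segments = []
    pending_duration = None
    for line in lines[1:]:
        if line.startswith("#EXTINF:"):
            # Format is "#EXTINF:<duration>,<optional title>".
            value = line[len("#EXTINF:"):]
            pending_duration = float(value.split(",", 1)[0])
        elif not line.startswith("#"):
            # A non-tag line is a segment URI; pair it with the EXTINF before it.
            if pending_duration is None:
                raise ValueError("segment URI without a preceding #EXTINF")
            segments.append((line, pending_duration))
            pending_duration = None
    return segments


segments = parse_media_playlist(SAMPLE_PLAYLIST)
total_seconds = sum(duration for _, duration in segments)
```

Each URI in the list becomes its own HTTP request, which is why a packet trace of an HLS client shows a steady series of short requests on port 80 rather than one long-lived connection.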
But oh my goodness, why does it suck so much? The experience compares very poorly with YouTube on a laptop, which starts to play almost immediately and doesn’t stop after exhausting the buffer 30 seconds later. Whether you use Flash or the HTML5 support in YouTube (I’ve opted into the latter), it always just works, which is more than can currently be said of the iOS options, at least for me (and, if the Google hit count is right, for a couple million other people).
One other thing that doesn’t wash for me right now: remember when Apple started streaming their special events again? I blogged that it was a good demo of HLS, and I watched the first one with the iPhone, iPad, and Mac Pro running side-by-side to make a point that HLS was good stuff, and it all worked. How can the live stream hold up so well on three devices, yet a single archive stream falls apart on just one of the iOS devices?
I actually really like what I’ve seen of HLS: the spec is clean and the potential is immense. I even wondered aloud about doing a book on it eventually. But I can’t do that if I can’t get it working satisfactorily for myself.
What the hell is going on with this stuff?
If I ever get a good answer, there’ll be a “Part II”.
One thing I’m coming away from CodeMash with is a desire to clean up a lot of my old habits and dig into tools and techniques I’ve long known were available, but haven’t used. In some ways, I’m still stuck in my iPhone OS 2 ways in an iOS 4 world.
Daniel Steinberg has taken a pretty extreme position, but one that makes sense: he no longer has any private instance variables in his header files, since the current SDK allows you to put them in the implementation. Combined with the use of a class extension in the .m for helper methods, this makes it possible for the header to be exactly what it’s supposed to be: an exposure of the public interface to your class, with no clues about the implementation underneath.
To my mind, Daniel was also the winner of the “mobile smackdown” session, in which one presenter each from the iOS, Windows Phone 7, and Android camps was given 15 minutes to develop a trivial Twitter app that could manage a persistent list of user names and, when tapped, navigate to that user’s twitter.com page. I say Daniel won because his iPhone app was the only one to complete all the features in time (actually, Daniel needed an extra 30 seconds to finish two lines of code). The Windows Phone presenter never made it to adding new names to the list, and the Android guy didn’t get around to showing the user’s page. One of Daniel’s wins was in using the “use Core Data for storage” checkbox: by graphically designing a data model for his “Twitterer” class, he picked up persistence and his table view in one fell swoop. Now that I think of it, I don’t remember how, or if, the other platforms persisted their user lists. I don’t use Core Data often, but after this demonstration, I’m much more inclined to do so.
There was a whole session on unit testing for iOS, something I just explored on my own for my first day tutorial (and even then, I was using it as much for illustrating the use of multiple targets in an Xcode project as for the actual testing of features). I’ve never been religious about testing, particularly given that GUIs have long proven difficult to make fully testable, but with a testing framework built into Xcode (not everyone’s favorite, but it’s a start), it’s well worth rethinking how I could use it to get some measure of test coverage and fight regressions.
All of this makes me worry about the status of the iPhone SDK Development book I wrote with Bill Dudney. That was an iPhone OS 2 book that slipped far enough to be an early iPhone OS 3 book, with the addition of new chapters for important new frameworks like Core Data and Game Kit. But with iOS 5 surely looming, some of it is starting to look pretty crusty. In particular, the arrival of Grand Central Dispatch means that it’s no longer safe to blithely ignore threads, as we did, since there are scenarios where you can have even simple code that unwittingly manages to get off the main thread, which means trouble for UIKit. Furthermore, new frameworks demand blocks for completion handlers, so that’s something that now needs to appear early (and given that the block syntax is pure C, readers will need to be acclimated to C earlier than they used to). And I’ve long wanted to move the debugging and performance chapters (my favorites, actually) much earlier, so readers can figure out their own EXC_BAD_ACCESS problems. Not that I can currently even plan on a rev to that book – I still have four chapters to go on Core Audio, and would need a long and difficult conversation with the Prags besides. But I certainly see where my guidance to new developers has changed, significantly, in the last few years.
Between Christmas break, a week of CodeMash prep and home office reorganization, and CodeMash itself, I feel like I’ve been off for a month (and my MYOB First Edge financial status would seem to agree). I feel ready to start anew this week, and making a clean break with the past suits this mood nicely.
I’m pleasantly surprised that Google’s removal of H.264 from Chrome in favor of WebM has been greeted with widespread skepticism. You’d think that removing popular and important functionality from a shipping product would be met with scorn, but when Google wraps itself with the “open” buzzword, they often seem to get a pass.
Ars Technica’s “Google’s dropping H.264 from Chrome a step backward for openness” has been much cited as a strong argument against the move. It makes the important point that video codecs extend far beyond the web, and that H.264’s deep adoption in satellite, cable, physical media, and small devices makes it clearly inextricable, no matter how popular WebM might get on the web (which, thus far, is not much). It concludes that this move makes Flash more valuable and viable as a fallback position.
And while I agree with all of this, I still find that most of the discussion has been written from the software developer’s point of view. And that’s a huge mistake, because it overlooks the people who are actually using video codecs: content producers and distributors.
And have they been clamoring for a new codec? One that is more “open”? No, no they have not. As Streaming Media columnist Jan Ozer laments in Welcome to the Two-Codec World,
I also know that whatever leverage Google uses, they still haven’t created any positive reason to distribute video in WebM format. They haven’t created any new revenue opportunities, opened any new markets or increased the size of the pie. They’ve just made it more expensive to get your share, all in the highly ethereal pursuit of “open codec technologies.” So, if you do check your wallet, sometime soon, you’ll start to see less money in it, courtesy of Google.
I’m grateful that Ozer has called out the vapidity of WebM proponents gushing about the “openness” of the VP8 codec. It reminds me of John Gruber’s jab (regarding Android) that Google was “drunk on its own keyword”. What’s most atrocious to me about VP8 is that open-source has trumped clarity, implementability, and standardization. VP8 apparently only exists as a code-base, not as a technical standard that could, at least in theory, be re-implemented by a third party. As the much-cited first in-depth technical analysis of VP8 said:
The spec consists largely of C code copy-pasted from the VP8 source code — up to and including TODOs, “optimizations”, and even C-specific hacks, such as workarounds for the undefined behavior of signed right shift on negative numbers. In many places it is simply outright opaque. Copy-pasted C code is not a spec. I may have complained about the H.264 spec being overly verbose, but at least it’s precise. The VP8 spec, by comparison, is imprecise, unclear, and overly short, leaving many portions of the format very vaguely explained. Some parts even explicitly refuse to fully explain a particular feature, pointing to highly-optimized, nigh-impossible-to-understand reference code for an explanation. There’s no way in hell anyone could write a decoder solely with this spec alone.
Remember that even Microsoft’s VC-1 was presented and ratified as an actual SMPTE standard. One can also contrast the slop of code that is VP8 with the strategic designs of MPEG with all their codecs, standardizing decoding while permitting any encoder that produces a compliant stream that plays on the reference decoder.
This matters because of something that developers have a hard time grasping: an encoder is not a state machine. Meaning that there need not, and probably should not be, a single base encoder. An obvious example of this is the various use cases for video. A DVD or Blu-Ray disc is encoded once and played thousands or millions of times. In this scenario, it is perfectly acceptable for the encode process to require expensive hardware, a long encode time, a professional encoder, and so on, since those costs are easily recouped and are only needed once. By contrast, video used in a video-conferencing style application requires fairly modest hardware, real-time encoding, and can make few if any demands of the user. Under the MPEG-LA game-plan, the market optimizes for both of these use cases. But when there is no standard other than the code, it is highly unlikely that any implementations will vary much from that code.
Developers also don’t understand that professional encoding is something of an art, that codecs and different encoding software and hardware have distinct behaviors that can be mastered and exploited. In fact, early Blu-Ray discs were often authored with MPEG-2 rather than the more advanced H.264 and VC-1 because the encoders — both the devices and the people operating them — had deeper support for and a better understanding of MPEG-2. Assuming that VP8 is equivalent to H.264 on any technical basis overlooks these human factors, the idea that people now know how to get the most out of H.264, and have little reason to achieve a similar mastery of VP8.
Also, MPEG rightly boasts that ongoing encoder improvements over time allow for users to enjoy the same quality at progressively lower bitrates. It is not likely that VP8 can do the same, so while it may be competitive (at best) with H.264, it won’t necessarily stay that way.
Furthermore, is the MPEG-LA way really so bad? Here’s a line from a review in the Discrete Cosine blog of VP8 back when On2 was still trying to sell it as a commercial product:
On2 is advertising VP8 as an alternative to the mucky patent world of the MPEG licensing association, but that process isn’t nearly as difficult to traverse as they imply, and I doubt the costs to get a license for H.264 are significantly different than the costs to license VP8.
The great benefit of ISO standards like VC-1 and H.264 is that anyone can go get a reference encoder or reference decoder, with the full source code, and hack on their own product. When it comes time to ship, they just send the MPEG-LA a dollar (or whatever) for each copy and everyone is happy.
It’s hard to understand what benefits the “openness” of VP8 will ever really provide. Even if it does end up being cheaper than licensing H.264 from MPEG-LA — and even if the licensing body would have demanded royalty payments had H.264 not been challenged by VP8 — proponents overlook the fact that the production and distribution of video is always an enormously expensive endeavor. 15 years ago, I was taught that “good video starts at a thousand dollars a minute”, and we’d expect the number is at least twice that today, just for a minimal level of technical competence. Given that, the costs of H.264 are a drop in the bucket, too small to seriously affect anyone’s behavior. And if not cost, then what else does “openness” deliver? Is there value in forking VP8, to create another even less compatible codec?
In the end, maybe what bugs me is the presumption that software developers like the brain trust at Google know what’s best for everyone else. But assuming that “open source” will be valuable to video professionals is like saying that the assembly line should be great for software development because it worked for Henry Ford.