Let me introduce you to the “Whiny Little Bitch Contingent”. This was a term I coined in the late 2000s to cover the Java developers who cried and moaned about the slow decline in Apple’s support for Java: the deprecation of the Cocoa-Java bridge, the long wait for Java 6 on Mac OS X, its absence from iOS, etc. Every time there was news on this front, they could be reliably counted on to dredge up Steve Jobs’ pledge at the JavaOne 2000 keynote to make the Mac the best Java programming environment… and to bring this up in seeming ignorance of the passage of many years, the changes in the tech world, the abject failure of Desktop Java, other companies’ broken promises (Sony’s pledge of Java on the PlayStation 2, the Java-based Phantom gaming console), etc.
The obvious trait of the Whiny Little Bitch Contingent is their sense of entitlement: companies like Apple owe us stuff. The more subtle trait is their ignorance of the basic principle that people, organizations, and companies largely operate in their own self-interest. Apple got interested in Java when it seemed like a promising way to write Mac apps (or, a promising way to get developers to write Mac apps). When that failed, they had understandably little interest in providing developers a means of writing apps for other platforms. I’m sure I’m not the only person to write a Java webapp on the Mac that I knew my employer would block Mac clients from actually using. By 2008, when Apple entered the mobile market with the iPhone, there was nothing about supporting Java that would appeal to Apple’s self-interest, outside of a small number of hardware sales to Java developers.
That’s what defines the WLBC to me: a sense of entitlement, and an ignorance of other parties’ self-interest (which leads to an expectation of charity and thus the sense of entitlement).
So, yesterday, Apple holds an event to roll out their whole big deal with Textbooks on the iPad. They look pretty, they’ve got an economic model that may make some sense for publishers (i.e., it may be in the publishers’ self-interest), etc. Also, there’s a tool for creating textbooks in Apple’s format.
And this is where the Whiny Little Bitch Contingent goes ape-shit. Because there’s a clause in the iBooks Author EULA that says if you’re going to charge for your books, you can only publish to Apple’s iBookstore.
So, let’s back up a second. The only point of this software is to feed Apple’s content chain. The only reason it is being offered, free, is to lure authors and publishers to use Apple’s stuff… which in turn sells more iPads and gives Apple a 30% cut. If you are not going to put stuff on Apple’s store, why do you even care about this? Hell, I don’t develop for Microsoft’s platforms, so if they see the need to turn Visual Studio into an adventure game… hey that’s their problem.
If you’re not authoring for Apple’s iBookstore, why do you even care what iBooks Author does, or what’s in its EULA?
In decrying the “cold cynicism” of Apple’s iBook EULA, Marshall Kirkpatrick writes:
It’s hard to wrap my brain around the cold cynicism of Apple’s releasing a new tool to democratize the publishing of eBooks today, only to include in the tool’s terms and conditions a prohibition against selling those books anywhere but through Apple’s own bookstore
“Democratize the publishing of eBooks”? Where the hell did he get that? Maybe he watched the video and fell for the grandiosity and puffery… I never actually watch these Apple dog-and-pony shows anymore, as following the Twitter discussion seems to give me the info I need. But thinking that Apple is in the business of democratizing anything is nuts: they’re in the business of selling stuff, and the only reason they’d give out a free tool is to get you to help them sell more of that stuff.
I didn’t download iBooks Author, even though you’d expect an Apple-skewing author like me to be one of the first onboard. Frankly, I’m pretty tired of writing, as the last two books have been difficult experiences, and the thought of starting another book, even with a 70% royalty instead of 5%, is not that appealing. A year ago I thought about self-publishing a book on AV Foundation, but right now I lack the will (also, I’ve failed to fall in love with AV Foundation, and blanch at its presumptions, limitations, and lack of extensibility… I much prefer the wild and wooly QuickTime or Core Audio).
So, if we’re going to talk about iBooks Author, let me know how it holds up for long documents: if it’s pretty on page 1, is it still usable when you’re 200 pages in? Does it offer useful tools for managing huge numbers of assets? Does it provide its own revision system and change tracking, or does it at least play nicely with Subversion and Git? Can it be used in a collaborative environment? These are interesting questions, at least to people who plan to use the tool to publish books on the iBookstore.
But if Apple’s not giving you a pretty, free tool you can use to write .mobi files that Amazon can sell Kindles with? Sorry, Whiny Little Bitch Contingent, I’ve got zero sympathy for you there. Call it a third party opportunity. Or just put on your big boy underwear and do it yourself.
Take it, Geddy:
You don’t get something for nothing
You can’t have freedom for free
You won’t get wise
With the sleep still in your eyes
No matter what your dreams might be
I’m surprised how fast iOS conference slides go old. I re-used some AV Foundation material that was less than a year old at August’s CocoaConf, and some of it already seemed crusty, not the least of which was a performSelectorOnMainThread: call on a slide where any decent 2011 coder would use dispatch_async() and a block. So it’s probably just as well that I did two completely new talks for last weekend’s Voices That Matter: iOS Developers Conference in Boston.
OK, long-time readers — ooh, do I have long-time readers? — know how these posts work: I do links to the slides on Slideshare and code examples on my Dropbox. Those are at the bottom, so skip there if that’s what you need. Also, I’ve put URLs of the sample code under each of the “Demo” slides.
The AV Foundation talk is completely limited to capture, since my last few talks have gone so deep into the woods on editing (and I’m still unsatisfied with my mess of sample code on that topic that I put together for VTM:iPhone Seattle in the Spring… maybe someday I’ll have time for a do-over). I re-used an earlier “capture to file and playback” example, and the ZXing barcode reader as an example of setting up an AVCaptureVideoDataOutput, so the new thing in this talk was a straightforward face-finder using the new-to-iOS Core Image CIDetector. Apple’s WWDC has a more ambitious example of this API, so go check that out if you want to go deeper.
The Core Audio talk was the one I was most jazzed about, given that Audio Units are far more interesting in iOS 5 with the addition of music, generator, and a dozen effects units. That demo builds up an AUGraph that takes mic input, a file-player looping a drum track, and an AUSampler allowing for MIDI input from a physical keyboard (the Rock Band 3 keyboard, in fact) to play a two-second synth sample that I cropped from one of the Soundtrack loops, all mixed by an AUMultichannelMixer and then fed through two effects (distortion and low pass filter) before going out to hardware through AURemoteIO. Oh, and with a simple detail view that lets you adjust the input levels into the mixer and bypass the effects.
The process of setting up a .aupreset and getting that into an AUSampler at runtime is quite convoluted. There are lots of screenshots from AULab in the slides, but I might just shoot a screencast and post to YouTube. For now, combine WWDC 2011 session #411 with Technical Note TN2283 and you have as much of a fighting chance as I did.
I’ll be doing these talks again at CocoaConf in Raleigh, NC on Dec. 1-2, with a few fix-ups and polishing. The face-finder has a stupid bug where it creates a new CIDetector on each callback from the camera, which is grievously wasteful. For the Core Audio AUGraph, I realized in the AU property docs that the mixer has pre-/post- peak/average meters, so it looks like it would be easy to add level meters to the UI. So those versions of the talks will be a little more polished. Hey, it was a tough crunch getting enough time away from client work to get the sample code done at all.
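The fix for that detector bug is nothing fancier than building the detector once and reusing it on every callback. Here’s a minimal, language-neutral sketch of the pattern in Python; it is not the Core Image code from the demo, and ExpensiveDetector and FrameHandler are hypothetical stand-ins for CIDetector and whatever object receives the camera callbacks.

```python
class ExpensiveDetector:
    """Hypothetical stand-in for a detector that is costly to construct."""

    instances_created = 0

    def __init__(self):
        ExpensiveDetector.instances_created += 1

    def find_faces(self, frame):
        # Placeholder only: a real detector would analyze pixel data here.
        return []


class FrameHandler:
    """Hypothetical stand-in for the object that receives camera callbacks."""

    def __init__(self):
        self._detector = None  # built lazily, on the first frame

    @property
    def detector(self):
        # Create the detector the first time it is needed, then keep it.
        if self._detector is None:
            self._detector = ExpensiveDetector()
        return self._detector

    def on_frame(self, frame):
        # Called once per captured frame; reuses the cached detector.
        return self.detector.find_faces(frame)


handler = FrameHandler()
for frame_number in range(300):  # roughly ten seconds of callbacks at 30 fps
    handler.on_frame(frame_number)
```

After those 300 callbacks, only one detector has been constructed, instead of 300.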
Speaking of preparation, the other thing notable about these talks is that I was able to do the slides for both talks entirely on the iPad, while on the road, using Keynote, OmniGraffle, and Textastic. Consumption-only device, my ass.
- Capturing Stills, Sounds, and Scenes with AV Foundation
- Core Audio Cranks It Up
Trying something different for my two 90-minute AV Foundation presentations at CocoaConf today, I decided to do my presentation entirely from the iPad with the VGA adapter… no laptop.
In some ways, it was an easy choice to make: I’d already done the slides in Keynote, so running the same presentation off the iPad only necessitated changing a few fonts (I usually code in Inconsolata, which required a change to Courier). If anything, the app demos worked better, since running apps in the simulator precludes showing off anything particular to the device hardware, such as accelerometers, location, or (in my case) video capture. In fact, as soon as you touch the AV Foundation capture APIs in a project, you lose the ability to build for the Simulator.
Downsides include the fact that I couldn’t hop into source on Xcode, so any important code needed to be in slides (I continue to hope for Xcode for iPad, though its painful performance on my 4GB MacBook has dashed those hopes somewhat). Still, I remain impressed that I can get so much done with just the iPad… probably my biggest disappointment today was having to use the official WordPress app to upload this entry’s picture, as WordPress always destroys user data, and even if I do need it only for uploading pictures from the iPad, I don’t need the pictures more than I need the blog itself (how can it be that the WordPress app is as bad as it is?)
One thing to keep in mind for iPad-only presentations is that you cannot charge while the VGA cable is plugged in, so you need to start with enough battery power to get through your presentation. That said, it’s not hard: between two 90-minute talks, I drained my battery from 90% to 55%. The battery certainly seemed likely to outlast both my voice and my legs, so that’s not a problem.
I’m happy to leave the laptop behind whenever possible, and will probably do so at my next conference. Speaking of which, the Voices That Matter: iOS Developers’ Conference is coming to Boston on Nov. 12-13, and you can get $150 off with my speaker code: BSSPKR5.
Tomorrow is the second and final day for CocoaConf, which has a shockingly deep and thorough collection of talks for a first-time conference. Nice to have another good Mac/iOS conference in this part of the country.
Update 8/23/11 – Here, after much delay, are links to the slides and sample code from my CocoaConf presentation.
- Introduction to AV Foundation (CocoaConf, Aug ’11) (slides from slideshare.net)
- Advanced AV Foundation (CocoaConf, Aug ’11) (slides from slideshare.net)
- VTM_Player.zip, VTM_AVRecPlay.zip, VTM_AVEditor.zip – see Slides and stuff from Voices That Matter talk
- VTM_iPodReader.zip – see From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary
- VTM_AVEffects.zip – see Wrap up from Voices That Matter iPhone, Spring 2011
- CCFScreenRecorder.zip
- Zxing – see “Barcodes” app in the zxing project on Google Code
Quick reminder: Early Bird pricing ($350) ends Friday (July 22) for CocoaConf in Columbus, Ohio. It’s a new conference, but it has put together a very impressive set of speakers, most recently adding Bill Dudney, former Apple evangelist and co-author on my revised iOS SDK Development book.
I’m doing two talks on AV Foundation. These may draw on the three talks I’ve done at the Voices That Matter conferences, but if I have time, I’m considering rebooting them from the “tour the APIs” format to a “one big project” format. Maybe something along the lines of “Let’s Write QuickTime Player for iPhone” and “Let’s Write Final Cut Pro for iPad”. We’ll see… don’t want to over-promise when I’m already buried with programming work and two books.
Oh, and Columbus has my favorite chain restaurant, a holdover from my time in Atlanta, and my former employer at One CNN Center. Just sayin’, in case plans are needed for Saturday night. Ahem.
Speaking announcement: I’ll be doing two talks at a new OSX/iOS conference, CocoaConf, being held in Columbus, OH on August 12 and 13.
I’m going to do two talks, introductory and advanced AV Foundation. Obviously, they’ll probably be a lot like what I’ve already done at Voices That Matter, except that the advanced talk I did in Seattle was about 1/3 recap, whereas this is all going to be in the same track, so it really can be a monster immersion in two parts.
It’s not settled whether sessions are going to be 60 or 90 minutes… I keep going long at VTM’s 75, so 90 would be fine. If that’s the case, then my two sessions are going to amount to a three-hour intro to AV Foundation. So we should be able to cover a lot of ground and get really deep into the tough stuff.
Ugh, this is twice in a row that I’ve done a talk for the Voices That Matter: iPhone Developer Conference and been able neither to get all my demos working perfectly in time, nor to cover all the important material in 75 minutes. Yeah, doing a 300-level talk will do that to you, but still…
This weekend’s talk in Seattle was “Advanced Media Manipulation with AV Foundation”, sort of a sequel to the intro talk I did at VTM:i Fall 2010 (Philly), but since the only people who would have been at both conferences are speakers and organizers, I spent about 25 minutes recapping material from the first talk: AVAssets, the AVPlayer and AVPlayerLayer, AVCaptureSession, etc.
Aside: AVPlayerLayer brings up an interesting point, given that it is a subclass of CALayer rather than UIView, which is what’s provided by the view property of the MPMoviePlayerController. What’s the big difference between a CALayer and a UIView, and why does it matter for video players? The difference is that UIView subclasses UIResponder and therefore responds to touch events (the one in the Media Player framework has its own pop-up controls after all), whereas a CALayer, and AVPlayerLayer, does not respond to touch input itself… it’s purely visual.
So anyways, on to the new stuff. What has interested me for a while in AV Foundation are the classes added in iOS 4.1 to do sample-level access, AVAssetWriter and AVAssetReader. An earlier blog entry, From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary, exercises both of these, reading from an iPod Library song with an AVAssetReader and writing to a .caf file with an AVAssetWriter.
Before showing that, I did a new example, VTM_ScreenRecorderTest, which uses AVAssetWriter to make an iOS screen recorder for your application. Basically, it runs an onscreen clock (so that something onscreen is changing), and then uses an NSTimer to periodically do a screenshot and then write that image as a video sample to the single video track of a QuickTime .mov file. The screenshot code is copied directly from Apple's Technical Q&A 1703, and the conversion from the resulting UIImage to the CMSampleBufferRef needed for writing raw samples is greatly simplified with the AVAssetWriterInputPixelBufferAdaptor.
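One detail worth sketching is how each timer-driven screenshot gets a timestamp. What follows is a language-neutral sketch in Python, not the code from VTM_ScreenRecorderTest: I'm assuming each frame is stamped with the wall-clock time elapsed since recording started, expressed as an integer count of ticks over a timescale (the same shape as a CMTime), and that the writer wants strictly increasing times. The 600 timescale and the function name are my own choices for illustration.

```python
from fractions import Fraction

TIMESCALE = 600  # assumed ticks per second, chosen for illustration


def frame_times(elapsed_seconds, timescale=TIMESCALE):
    """Turn elapsed-seconds readings into (value, timescale) pairs.

    A reading that would not advance past the previous frame is dropped,
    on the assumption that timestamps must be strictly increasing.
    """
    times = []
    for elapsed in elapsed_seconds:
        # Convert to whole ticks, rounding to the nearest tick.
        value = round(Fraction(elapsed) * timescale)
        if times and value <= times[-1][0]:
            continue  # timer fired twice within one tick; skip this frame
        times.append((value, timescale))
    return times


# A nominal 10 fps timer whose fourth firing arrived late.
stamps = frame_times([0.0, 0.1, 0.2, 0.35])
```

Because the stamps come from elapsed time rather than from a frame counter, a late timer firing shows up as a longer-lasting frame instead of making the movie drift out of step with the clock it is recording.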
In the Fall in Philly, I showed a cuts-only movie editor that just inserted segments up at the AVMutableComposition level. For this talk, I wanted to do multiple video tracks, with transitions between them and titles. I sketched out a very elaborate demo project, VTM_AVEffects, which was meant to perform the simple effects I used for the Running Start (.m4v download) movie that I often use as an example. In other words, I needed to overlay titles and do some dissolves.
About 10 hours into coding my example, I realized I was not going to finish this demo, and settled for getting the title and the first dissolve. So if you're going to download the code, please keep in mind that this is badly incomplete code (the massive runs of commented-out misadventures should make that clear), and it is neither production-quality, nor copy-and-paste quality. And it most certainly has memory leaks and other unresolved bugs. Oh, and all the switches and text fields? They do nothing. The only things that work are tapping "perform" and then "play" (or the subsequent "pause"). Scrubbing the slider and setting the rate field mostly work, but have bugs, particularly in the range late in the movie where there are no valid video segments, but the :30 background music is still valid.
Still, I showed it and will link to it at the end of this blog because there is some interesting working code worth discussing. Let's start with the dissolve between the first two shots. You'll notice in the code that I go with Apple's recommendation of working back and forth between two tracks ("A" and "B", because I learned on analog equipment and always think of it as A/B Roll editing). The hard part -- and by hard, I mean frustrating, soul-draining, why-the-frack-isn't-this-goddamn-thing-working hard -- is providing the instructions that describe how the tracks are to be composited together.
In AV Foundation, you provide an AVVideoComposition that describes the compositing of every region of interest in your movie (oh, I'm sorry, in your AVComposition… which is in no way related to the AVVideoComposition). The AVVideoComposition has an array of AVVideoCompositionInstructions, each covering a specific timeRange, and each containing its own AVVideoCompositionLayerInstructions to describe the opacity and affine transform (static or animated) of each video track. Describing it like that, I probably should have included a diagram… maybe I'll whip one up in OmniGraffle and post it later. Anyways, this is fairly difficult to get right, as your various instructions need to account for all time ranges across all tracks, with no gaps or overlaps, and line up exactly with the duration of the AVComposition. Like I said, I got exactly one fade-in working before I had to go pencils-down on the demo code and start preparing slides. Maybe I'll be able to fix it later… but don't hold me to that, OK?
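The "no gaps, no overlaps, same total duration" rule is easy to state and easy to violate, so here is a sketch of the check I wish I'd had while debugging. It's plain Python over integer ticks, not AV Foundation code: each instruction's time range is modeled as a hypothetical (start, length) tuple, and check_coverage is a name I made up for illustration.

```python
def check_coverage(ranges, duration):
    """Report gaps and overlaps in a list of (start, length) time ranges.

    All numbers are integer ticks on one shared timescale. Returns a list
    of problem descriptions; an empty list means the ranges tile the span
    from 0 to duration exactly.
    """
    problems = []
    cursor = 0  # end of the time covered so far
    for start, length in sorted(ranges):
        end = start + length
        if start > cursor:
            problems.append(f"gap from {cursor} to {start}")
        elif start < cursor:
            problems.append(f"overlap from {start} to {min(cursor, end)}")
        cursor = max(cursor, end)
    if cursor < duration:
        problems.append(f"gap from {cursor} to {duration}")
    elif cursor > duration:
        problems.append(f"instructions run {cursor - duration} past the end")
    return problems


# Shot A alone, a one-second dissolve, then shot B alone, at 600 ticks/sec.
good = [(0, 3000), (3000, 600), (3600, 2400)]
missing_dissolve = [(0, 3000), (3600, 2400)]
```

With a 6000-tick composition, check_coverage(good, 6000) comes back empty, while check_coverage(missing_dissolve, 6000) reports the hole where the dissolve's instruction should have been. Running something like this over your instructions' time ranges before handing them to the player at least tells you which instruction to go stare at.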
The other effect I knew I had to show off was titles. AVFoundation has a curious way to do this. Rather than add your titles and other overlays as new video tracks, as you'd do in QuickTime, AVF ties into Core Animation and has you do your image magic there. By using an AVSynchronizedLayer, you can create sublayers whose animations get their timing from the movie, rather than from the system clock. It's an interesting idea, given how powerful Quartz and Core Animation are. But it's also deeply weird to be creating content for your movie that is not actually part of the movie, but is rather just loosely coupled to the player object by way of the AVPlayerItem (and this leads to some ugliness when you want to export the movie and include the animations in the export). I also noticed that when I scrubbed past the fade-out of the title and then set the movie playback rate to a negative number to run it backward, the title did not fade back in as expected… which makes me wonder if there are assumptions in UIKit or Core Animation that time always runs forward, which is of course not true when AV Foundation controls animation time, via the AVSynchronizedLayer.
My code is badly incomplete and buggy, and anyone interested in a solid demo of AV Foundation editing would do well to check out the AVEditDemo from Apple's WWDC 2010 sample code. Still, I said I would post what I've got, so there you go. No complaints from you people, or the next sample code you get from me will be another goddamned webapp server written in Java.
Oh yeah, at one point, I dreamed of having enough time to write a demo that would process A/V capture data in real-time, using the capture session's data outputs (AVCaptureVideoDataOutput and AVCaptureAudioDataOutput), maybe showing a live audio waveform or doing an FFT. But that demo didn't even get written. Maybe next conference.
For a speaker on "advanced" AV Foundation, I find I still have a lot of unanswered questions about this framework. I'm not sure how well it supports saving an AVComposition that you're editing -- the AVF classes implement NSCopying / NSMutableCopying, but copying isn't archiving (keyed archiving needs NSCoding), and even if they could be archived, that doesn't address how you'd persist your associated Core Animation animations. I also would have to think hard about how to make edits undoable and redoable… I miss QuickTime's MovieEditState already. And to roll an edit… dig into a track's segments and dick with their timeRanges, or do you have to remove and reinsert the segment?
And what else can I do with AVSynchronizedLayer? I don't see particularly compelling transitions in AVF -- just dissolves and trivial push wipes (i.e., animation of the affine transform) -- but if I could render whatever I like in a CALayer and pick up the timing from the synchronized layer, is that how I roll my own Quartz-powered goodness? Speaker Cathy Shive and I were wondering about this idea over lunch, trying to figure out if we would subclass CAAnimation or CALayer in hopes of getting a callback along the lines of "draw your layer for time t", which would be awesome if only either of us were enough of a Core Animation expert to pull it off.
So, I feel like there's a lot more for me to learn on this, which is scary because some people think I'm an expert on the topic… for my money, the experts are the people in the AV Foundation dev forums (audio, video), since they're the ones really using it in production and providing feedback to Apple. Fortunately, these forums get a lot of attention from Apple's engineers, particularly bford, so that sets a lot of people straight about their misconceptions. I think it's going to be a long learning curve for all of us.
If you're keen to start, here are the slides and demo code:
- Slides (at slideshare.net)
- VTM_iPodReader.zip - see From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary
- VTM_Player.zip, VTM_AVRecPlay.zip, VTM_AVEditor.zip - see Slides and stuff from Voices That Matter talk
- VTMScreenRecorderTest.zip
- VTM_AVEffects.zip (caution: 24 MB)









