Philip Hodgetts e-mailed me yesterday, having found my recent CocoaHeads Ann Arbor talk on AV Foundation and searched from there to find my blog. The first thing this brings up is that I’ve been slack about linking my various online identities and outlets… it should be easier for anyone who happens across my stuff to get to the rest of it. As a first step, behold the “More of This Stuff” box at the right, which links to my slideshare.net presentations and my Twitter feed. The former is updated less frequently than the latter, but also contains fewer obscenities and references to anime.

Philip co-hosts a podcast about digital media production, and their latest episode is chock-full of important stuff about QuickTime and QTKit that more people should know (frame rate doesn’t have to be constant!), along with wondering aloud about where the hell Final Cut stands given the QuickTime/QTKit schism on the Mac and the degree to which it is built atop the 32-bit legacy QuickTime API. FWIW, between reported layoffs on the Final Cut team and their key programmers working on iMovie for iPhone, I do not have a particularly good feeling about the future of FCP/FCE.

Philip, being a Mac guy and not an iOS guy, blogged that he was surprised my presentation wasn’t an NDA violation. Actually, AV Foundation has been around since 2.2, but only became a document-based audio/video editing framework in iOS 4. The only thing that’s NDA is what’s in iOS 4.1 (good stuff, BTW… hope we see it Wednesday, even though I might have to race out some code and a blog entry to revise this beastly entry).

He’s right in the podcast, though, that iPhone OS / iOS has sometimes kept some of its video functionality away from third-party developers. For example, Safari could embed a video, but through iPhone OS 3.1, the only video playback option was the MPMoviePlayerController, which takes over the entire screen when you play the movie. 3.2 provided the ability to get a separate view… but recall that 3.2 was iPad-only, and the iPad form factor clearly demands the ability to embed video in a view. In iOS 4, it may make more sense to ditch MPMoviePlayerController and leave MediaPlayer.framework for iPod library access, and instead do playback by getting an AVURLAsset and feeding it to an AVPlayer.

One slide Philip calls attention to in his blog is where I compare the class and method counts of AV Foundation, android.media, QTKit, and QuickTime for Java. A few notes on how I spoke to this slide when I gave my presentation:

  • First, notice that AV Foundation is already larger than QTKit. But also notice that while it has twice as many classes, it only has about 30% more methods. This is because AV Foundation had the option of starting fresh, rather than wrapping the old QuickTime API, and thus could opt for a more hierarchical class structure. AVAssets represent anything playable, while AVCompositions are movies that are being created and edited in-process. Many of the subclasses also split out separate classes for their mutable versions. By comparison, QTKit’s QTMovie class has over 100 methods; it just has to be all things to all people.

  • Not only is android.media smaller than AV Foundation, it also represents the alpha and omega of media on that platform, so while it’s mostly provided as a media player and capture API, it also includes everything else media-related on the platform, like ringtone synthesis and face recognition. While iOS doesn’t do these, keep in mind that on iOS, there are totally different frameworks for media library access (MediaPlayer.framework), low-level audio (Core Audio), photo library access (AssetsLibrary.framework), in-memory audio clips (System Sounds), etc. By this analysis, media support on iOS is many times more comprehensive than what’s currently available in Android.

  • Don’t read too much into my inclusion of QuickTime for Java. It was deprecated at WWDC 2008, after all. I put it in this chart because its use of classes and methods offered an apples-to-apples comparison with the other frameworks. Really, it’s there as a proxy for the old C-based QuickTime API. If you counted the number of functions in QuickTime, I’m sure you’d easily top 10,000. After all, QTJ represented Apple’s last attempt to wrap all of QuickTime with an OO layer. In QTKit, there’s no such ambition to be comprehensive. Instead, QTKit feels like a calculated attempt to include the stuff that most developers will need. This allows Apple to quietly abandon unneeded legacies like Wired Sprites and QuickTime VR. But quite a few babies are being thrown out with the bathwater: neither QTKit nor AV Foundation currently has equivalents for the “get next interesting time” functions (which could find edit points or individual samples), or the ability to read/write individual samples with GetMediaSample() / AddMediaSample().

One other point of interest is one of the last slides, which quotes a macro seen throughout AVFoundation and Core Media in iOS 4:


__OSX_AVAILABLE_STARTING(__MAC_10_7,__IPHONE_4_0);

Does this mean that AV Foundation will appear on Mac OS X 10.7 (or hell, does it mean that 10.7 work is underway)? IMHO, not enough to speculate, other than to say that someone was careful to leave the door open.

Update: Speaking of speaking on AV Foundation, I should mention again that I’m going to be doing a much more intense and detailed Introduction to AV Foundation at the Voices That Matter: iPhone Developer Conference in Philadelphia, October 16-17. $100 off with discount code PHRSPKR.

Now that I get to skip JavaOne for the first time in five years (more on that in a couple weeks), I have a short conference schedule for this summer.

  • Apple WWDC – June 8-12 – Expensive, but so worth it. The nature of the Mac and iPhone development community is, honestly, that of a Cargo Cult: it’s primarily driven by Apple’s decisions and announcements (and I don’t think that’s a bad thing; inclusion and community sound great in theory, but sometimes the result is a four-year pissing match over closures in Java, or a competition of multiple awful Linux desktop environments, each awful in its own special way). So attendees get an advantage by having direct access to the essential APIs and frameworks, both in the form of sessions and labs with the engineers. There’s a lot in QuickTime and Core Audio that seems to come out of the blue, but you get the thinking behind it when the Apple guys present it in a session.

    There’s also a lot of information here that seems to never get out to the public. For example, last year’s Media and Graphics State of the Union announced the deprecation of QuickTime for Java, but no public announcement was ever made, and the QTJ home page, while dated, still goads developers into adopting it.

  • iPhone Camp Atlanta 2009 – July 18 – It’s a heck of a drive, but we moved out of Atlanta just last fall, and the sale of our house there is finally closing three days before, so this half-day unconference affords a chance to pick up any paperwork or forgotten personal effects, to say nothing of meeting up with other iPhone devs. I proposed via Twitter a session on low-level Core Audio, something I’ve had my head in a lot this spring.

    Right now, there are over 100 registered attendees, though I’d be surprised if this many show up (people will always register for something free, then half will flake the day of… I would have had a nominal [$25-50] registration fee just to weed out the flakes).

I’m also thinking about using one of the whiteboards at WWDC to propose the idea of a Core Audio unconference somewhere. A lot of people are digging into CA on the iPhone (probably out of necessity… in 2.0, the Audio Queue was the only way to play a flat audio file, and as of 2.2, recording still requires AQ [or audio units]), at different levels of experience and ambition. Maybe it makes sense to get together somewhere for a few days, share notes, and bang on code. We’ll see if anyone bites.

Written while sitting on my ass in the hallway near gate C4 at DFW, because there’s no decent place to recharge my iBook for the flight home.

I just got word from O’Reilly that they’ve opened up their O’Reilly Commons, basically a big O’R-branded wiki. A year or two ago, I gave them permission to put my book QuickTime for Java: A Developer’s Notebook on the site, as it has long since ceased to be commercially viable (it sells like 10 copies, worldwide, per quarter).

Anyways, I’m pleased to announce that you can now read and edit the entirety of the book at its wiki page.

I’ve long planned to do some updates to its public form, most importantly to add the capture preview stuff that I did for the Lloyd project.

And I will.

As soon as I tear myself away from the iPhone SDK.

A commonality I’ve recently noticed is that some of the Mac frameworks encourage a combination of coding and authoring, in that for some set of functionality, access is possible both with user-level authoring (via visual tools, scripting, etc.) as well as with full-on coding. And this may be something that developers overlook.

It’s the old “if the only tool you have is a hammer, everything looks like a nail” analogy. Let me give you a practical example. Lots of people post to the various QuickTime lists, seeking to do some sort of “overlay”, taking a video and putting a static image (or, less commonly, another video) on top of it. When this comes up on the QTJ list, the poster almost always assumes they need to hack into the rendering pipeline, either stacking views atop one another with a JLayeredPane or, even worse, doing some sort of nasty callback-driven repainting hack.

Almost nobody realizes that they can do it with about 20 lines of XML:

<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions">
  <head>
    <layout>
      <root-layout id="root" width="320" height="240"
      background-color="white"/>
      <region id="main" width="320" height="240"
              z-index="1" fit="meet" />
      <region id="logo-reg" width="286" height="94"
              z-index="2" left="17" top="146"/>
    </layout>
  </head>

  <body>
    <par>
      <img src="sf-logo-2.png"
            qt:composite-mode="transparent-color;white"
            region="logo-reg" dur="15"/>
      <video src="keagy-closeup-320-sv.mov" region="root" dur="15" />
    </par>
  </body>
</smil>

When opened in any QuickTime application, the result looks like this:

[Image: SMIL example 1]

The above is a simple SMIL movie, which defines areas for the movie and overlay and sets the compositing mode to punch out a transparency. The <par> tags indicate the elements are to be presented in parallel, which would allow you to add an audio file if you were so inclined. A corresponding <seq> tag lets you build sequences with script. And you can composite video too, using a fit attribute on the region to scale the overlaid movie:

[Image: SMIL example 2]

One advantage of this approach is that all the rendering takes place in QuickTime, so you don’t sacrifice performance by having to copy your pixels over to Java2D/Swing.

But the big advantage is that it’s easy. It’s well within the range of the newbie QuickTime or Java programmer, as opposed to the pipeline-rendering hacks described above. In this sense, it’s preferable to another option I’ve posted to the lists before: creating a one-sample video track with your overlay image. Creating the SMIL file does imply actually writing to disk somewhere, but I suspect that if you must do this on the fly and can’t write to disk, you could probably write the XML file in memory, wrap it with a QTHandle and then create the movie with Movie.fromHandle(). Bonus for experts: this might work faster / more reliably by creating a SMIL-specific MovieImporter and using its fromHandle(). Details…
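
If you do want to generate the SMIL on the fly, the markup side is just string building. Here is a minimal sketch in plain Java that parameterizes the example above; the class and method names (OverlaySmil, build, attr) are made up for illustration, and it stops at producing the text. Handing the result to QuickTime, whether from a file or via a QTHandle, is deliberately left out, since that path is untested here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of QTJ): builds the overlay SMIL text. */
class OverlaySmil {

    /** Escapes characters that are unsafe inside an XML attribute value. */
    static String attr(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Returns a SMIL document laying a still image over a video, as in the example above. */
    static String build(String videoSrc, String logoSrc,
                        int width, int height,
                        int logoLeft, int logoTop, int logoWidth, int logoHeight,
                        int durSeconds) {
        return String.join("\n",
            "<smil xmlns:qt=\"http://www.apple.com/quicktime/resources/smilextensions\">",
            "  <head>",
            "    <layout>",
            "      <root-layout id=\"root\" width=\"" + width + "\" height=\"" + height
                + "\" background-color=\"white\"/>",
            "      <region id=\"main\" width=\"" + width + "\" height=\"" + height
                + "\" z-index=\"1\" fit=\"meet\"/>",
            "      <region id=\"logo-reg\" width=\"" + logoWidth + "\" height=\"" + logoHeight
                + "\" z-index=\"2\" left=\"" + logoLeft + "\" top=\"" + logoTop + "\"/>",
            "    </layout>",
            "  </head>",
            "  <body>",
            "    <par>",
            "      <img src=\"" + attr(logoSrc)
                + "\" qt:composite-mode=\"transparent-color;white\" region=\"logo-reg\" dur=\""
                + durSeconds + "\"/>",
            // region="root" mirrors the hand-written example above
            "      <video src=\"" + attr(videoSrc) + "\" region=\"root\" dur=\"" + durSeconds + "\"/>",
            "    </par>",
            "  </body>",
            "</smil>",
            "");
    }

    public static void main(String[] args) throws IOException {
        String smil = build("keagy-closeup-320-sv.mov", "sf-logo-2.png",
                            320, 240, 17, 146, 286, 94, 15);
        // Simplest route: write it to a temp file and open that in a QuickTime app.
        Path out = Files.createTempFile("overlay", ".smil");
        Files.write(out, smil.getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out);
    }
}
```

Keeping the builder a pure function means the same string can go to disk or stay in memory, whichever way you end up feeding it to QuickTime.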

See, that’s one of the things with a lot of the Mac technologies: there are multiple entry points. With QuickTime, there’s a bunch of functionality that you get by authoring, which you can do with GUIs (QuickTime Player Pro), scripting languages (AppleScript, JavaScript in a browser), or even markup, as seen here. These same things, and of course a lot more, can be achieved with code, but sometimes it makes sense to put down the compiler and just author your functionality. And QuickTime isn’t the only example. Don’t we also see this pattern in the following:

  • AppleScript – as opposed to low-level approaches to inter-application communication and automation
  • WebKit – HTML could be an option for large blocks of styled text, particularly if it needs to change at runtime
  • Quartz Composer

I’m working on an article or screencast proposal on the last of these, and the “take” I’m using is that instead of writing your own fancy rendering code (yay, OpenGL), you could just use Quartz Composer to build a .qtz file with a few malleable properties, and then bind those to the values in your program that will change. Imagine a fancy, skinnable MP3 player: if you want the artist and title to be displayed with some crazy combination of gradients, reflections, 3D effects, etc., then instead of coding it, you (or better yet, a QC-savvy graphic designer) could just build those effects on dummy text with the Quartz Composer GUI, then expose those inputs as bindable values, which you’d then wire up in Interface Builder or directly in your code.

Java developers never think this way, perhaps because there are almost no examples of this kind of layered-technology approach in the Java world. About the only examples I can think of are Drools and Fitnesse, both of which are about coding your essential logic in something other than Java, usually because it’s meant for someone other than Java programmers to do. But why shouldn’t we want to see more of this kind of approach? Indeed, on the Java side, we now have all these scripting languages running on the JDK, and have a JavaScript interpreter built into JDK 6. If there’s some part of your system that’s particularly volatile or meant to be hacked on by many people, why not just define that as being written in JavaScript? Early on at Pathfire, we had to handle different business logic for different customers; today, I might well create an API for customer-specific JavaScripts to talk to, with the added bonus that there are millions of techies who might not know (or want to know) Java, but who can bang out JavaScript easily enough.
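
The hook for this in JDK 6 is the javax.script API. What follows is a rough sketch of the customer-specific-rules idea, not anything from Pathfire: the class name, the variable names, and the sample pricing expression are all invented for illustration. Note the empty-result case: whether an engine named "JavaScript" is actually present depends on the JDK you run on, so the lookup can come back null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

/** Hypothetical example: per-customer business rules written in JavaScript. */
class CustomerRules {

    /**
     * Evaluates a customer-supplied JavaScript expression with the given
     * variables bound, expecting a numeric result. Returns an empty Optional
     * if this JDK has no JavaScript engine installed.
     */
    static Optional<Double> evaluate(String script, Map<String, Object> vars)
            throws ScriptException {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        if (engine == null) {
            return Optional.empty();
        }
        // Expose each Java value to the script as a global variable.
        for (Map.Entry<String, Object> e : vars.entrySet()) {
            engine.put(e.getKey(), e.getValue());
        }
        Object result = engine.eval(script);
        // Engines differ on Integer vs. Double, so normalize through Number.
        return Optional.of(((Number) result).doubleValue());
    }

    public static void main(String[] args) throws ScriptException {
        Map<String, Object> vars = new HashMap<>();
        vars.put("quantity", 3);
        vars.put("unitPrice", 10.0);
        // Made-up rule: this customer gets 10% off.
        Optional<Double> total = evaluate("quantity * unitPrice * 0.9", vars);
        System.out.println(total.isPresent()
                ? "Total: " + total.get()
                : "No JavaScript engine available on this JDK");
    }
}
```

The Java side only defines which variables a script can see and what it must return; the rule itself lives in a string that someone who never touches Java can edit.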

Speaking of which, Quartz Composer also lets you put JavaScript inside a composition. Authoring begets scripting, and if applied correctly, allows us developers to work on the hard parts they pay us the big bucks for.

After much gnashing of teeth and ill-informed screeds against Apple, the company has released developer preview 8 of JDK 6 for Leopard. It’s surely not ready for prime time, as the small amount of public documentation notes that it is only available for 64-bit Intel Macs (see my earlier entry about the emergence of Intel-only Mac apps).

The belligerent arrogance of Java developers that I responded to before remains in full effect, as the Whiny Little Bitch contingent continues to make up facts and conveniently ignore the difference between Mac developers and Java developers. Javalobby regular Michael Urban is particularly wretched in this regard, with endless gainsaying insults of Apple and anyone who criticizes him. Here’s one recent winner:

Well, I would say the real problem here is, and always has been, Apple, and the fact that they think they can treat developers and high end power users the same way they treat their consumer end users. And the simple fact is, they can’t. Developers need information about what the future is going to look like when they are trying to target a platform. For example if I start a new project today, it’s unsafe for me to use Java 6 features because of Apple. Even though my project is just starting, and certainly Apple should [have] Java 6 released before it is finished, I still can’t plan on using Java 6 because Apple provides absolutely no information at all about what the future will look like.

I would argue that Java developers are more like end-users than they are like Mac developers. End-users consume products for their own needs. Developers, in the traditional sense, consume products (IDEs, APIs) in a software ecosystem but also create new products for that ecosystem (end-user apps, libraries, etc.) The difference is that the Java developer is rarely (if ever) developing Java apps to be run on the Mac — most are developing web apps. So from Apple’s point of view, the Java developer consumes but never produces. Which is fine: the same could be said of photographers, video professionals, artists, writers, teachers, etc. Thing is, those groups aren’t writing pissy posts about how they deserve special treatment.

Or, for that matter, vowing a mass exodus from the platform:

I’m not surprised about anything either. And I hope Apple isn’t surprised that more and more Java developers who once swore by the Mac, have abandoned it and moved to other platforms because they simply can’t deal with the way Apple has been handling Java anymore.

As I argued before, with the size and growth of the Mac platform, losing a few pissy Java developers wouldn’t even be noticed. No, not even if they told all their friends and their moms how much Apple sucks now.

I’m also waiting for some idiot to say that Apple was “threatened” by the release of Landon Fuller’s Soy Latte, a port of BSD Java to Mac OS X, and that this pressure was what got Apple to put out a new developer preview of JDK 6.

Look, we do not know what is up with Apple and Java. Maybe it’s technical, maybe it’s business priorities, maybe it’s a legal or licensing tussle with Sun. Since we don’t know, it’s foolish and pointless to speculate as to why they didn’t have a JDK 6 with Leopard, and why they put one out a few weeks later.

And I’d hate for Fuller to be dragged down into this. He was very gracious discussing the project with the Java Posse a few weeks back, and I’ve corresponded with him by e-mail and he seems very thoughtful, open, and fair. I don’t think he’s got some big axe to grind with Apple; I think he finds the project intellectually interesting, and that motivates him.

Honestly, if I could find the time this holiday season, I’d see if I could lend a hand with providing the Core Audio support his project needs. I’ve been reading up on CA. It’s nice. On the other hand, I loathe javax.sound.sampled; I dug in pretty deep for Swing Hacks, providing code to handle arbitrary-length streams in Hacks 76-78. Most JavaSound examples load an entire Clip into memory, because that API is easy to work with. Handling large audio files requires writing your own streamer to hand samples to JavaSound on a periodic basis — it’s a pain, and expecting every developer to have to baby-sit JavaSound for non-trivial sound probably explains why nobody’s real enthusiastic about doing it (indeed, when I was researching Swing Hacks, I found no tutorials, blogs, or other third-party documentation on using the library this way, so I was basically in green fields).
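
To make the “write your own streamer” point concrete, here is a stripped-down sketch of the loop. It is not the Swing Hacks code: ChunkSink and pump are names invented here, and it assumes uncompressed PCM input that the default mixer can play directly. The idea is just to read a bounded chunk, hand it to the line, and repeat, so the file never has to fit in memory the way a Clip does.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.UnsupportedAudioFileException;

/** Hypothetical minimal streamer: feeds audio to JavaSound a chunk at a time. */
class StreamingPlayer {

    /** Invented stand-in with the same shape as SourceDataLine.write(byte[], int, int). */
    interface ChunkSink {
        int write(byte[] data, int offset, int length);
    }

    /**
     * Copies audio from the stream to the sink in chunks of at most
     * framesPerChunk frames. Returns the total number of bytes handed over.
     * Assumes a format with a known frame size (i.e. PCM).
     */
    static long pump(AudioInputStream in, ChunkSink sink, int framesPerChunk)
            throws IOException {
        int frameSize = in.getFormat().getFrameSize();
        byte[] buffer = new byte[framesPerChunk * frameSize];
        long total = 0;
        int n;
        // AudioInputStream.read only returns whole frames; -1 signals end of stream.
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            sink.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException, LineUnavailableException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File(args[0]))) {
            SourceDataLine line = AudioSystem.getSourceDataLine(in.getFormat());
            line.open(in.getFormat());
            line.start();
            // SourceDataLine.write blocks until the line accepts the data,
            // which is what paces this loop during playback.
            pump(in, line::write, 4096);
            line.drain();   // let the last buffered samples play out
            line.close();
        }
    }
}
```

Even in this toy form you can see the baby-sitting: the caller owns the buffer, the loop, and the drain/close dance, none of which a Clip asks of you.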

Ultimately, I think Soy Latte — possibly as an official part of the OpenJDK project — will probably be the major Mac JVM a few years from now. If the Java developer community is willing to do this work, and it matters much more to them than it does to Apple or most (any?) of its end-users, then that’s the end-game that makes the most sense. Developers would get an up-to-date VM to work with, and even if it ended up not being installed with the OS, who cares? It’s not like end-users would notice the difference between applets that use Java SE 5 and those that use Java SE 6. Most wouldn’t even notice the difference between having and not having Java. It’s not like we’re talking Flash here or anything…

One more note on Apple’s JDK: Apple replied to an open question I posted to clarify that specifics about JDK 6 DP 8 are not to be discussed publicly, as it’s distributed through ADC, which has a binding NDA.

So while I can’t comment on specifics, I can combine publicly-known facts to postulate on a rather important concern that’s going to come up. This build, at least, is 64-bit only. Remember, 64-bit and 32-bit code can’t exist in the same process, so a 32-bit browser couldn’t use a 64-bit JVM to support applets, and a 64-bit Java application can’t make a JNI call into any 32-bit library. Like, say, all of Carbon. And given that there’s a lot of Carbon-based JNI projects out there — QuickTime for Java, the Mac version of SWT, etc. — a lot of developers could be in for a whole lot of pain if they don’t have a Cocoa migration path figured out.

I had thought that the disappearance of the QuickTime for Java docs might signal the end for this little-updated but beloved library, but Apple just announced that the docs are back up.

That QTJ isn’t long for the world is a given, since it calls into a number of deprecated technologies, most notably QuickDraw, and uses deprecated methods, like image compression using the older functions that can’t handle frame-reordering codecs (SCCompressSequenceFrame instead of the newer ICMCompressionSessionEncodeFrame), etc. And ultimately, all of Carbon is presumably meant for deprecation in the foreseeable future, as it is not supported in 64-bit mode. But more on that another time.

I think the weird thing about QTJ’s history is that rather than not living up to its potential, its potential didn’t live up to QTJ. You have to consider the time when it was introduced: for a short time in the 90s, there was this “everything’s going to be redone in Java” craze, and that included desktop apps. Until the first major AWT and Swing efforts crashed and burned (Netscape Navigator in Java, Corel’s attempt to do an Office suite in Java, etc.), there was a very real belief that Java would soon be the future. Apple was concerned about losing QuickTime developers, so it created QTJ (from the ruins of Kaleida, actually) for the express purpose of holding on to QuickTime developers while they changed languages. It was never intended to provide Java programmers with a media framework; it was meant to keep QuickTime developers in Apple’s ballpark. That’s why the documentation is so bad: the target audience already knew the native QuickTime API, so the Javadocs didn’t really need to describe methods in particular detail; they just needed to give the recent-Java-convert developer some idea what native QuickTime function he or she was really calling.

Of course, the “everything’s going to be in Java” era came to a brutal end, at least on the desktop, and while QTJ worked pretty well, the whole point behind it has been rendered moot. Understandably, it hasn’t been an Apple priority for a while, because Apple would be better off putting work into a single kick-ass native QuickTime framework (unfortunately, they currently have two, but that’s another story), and they really don’t need to be providing the Java community with a kick-ass media framework, since that’s really Sun’s problem to solve.
