In a July blog entry, I showed a gruesome technique for getting raw PCM samples of audio from your iPod library, by means of an easily-overlooked metadata attribute in the Media Player framework, along with the export functionality of AV Foundation. The AV Foundation stuff was the gruesome part — with no direct means for sample-level access to the song “asset”, it required an intermediate export to .m4a, which was a lossy re-encode if the source was in a different format (like MP3), followed by a conversion to PCM with Core Audio.

Please feel free to forget all about that approach… except for the Core Media timescale stuff, which you’ll surely see again before too long.

iOS 4.1 added a number of new classes to AV Foundation (indeed, these were among the most significant 4.1 API diffs) to provide an API for sample-level access to media. The essential classes are AVAssetReader and AVAssetWriter. Using these, we can dramatically simplify and improve the iPod converter.

I have an example project, VTM_AViPodReader.zip (70 KB), that was originally meant to be part of my session at the Voices That Matter iPhone conference in Philadelphia, but didn’t come together in time. I’m going to skip the UI stuff in this blog and leave you with a screenshot and a simple description: tap “choose song”, pick something from your iPod library, tap “done”, and tap “Convert”.

Screenshot of VTM_AViPodReader

To do the conversion, we’ll use an AVAssetReader to read from the original song file, and an AVAssetWriter to perform the conversion and write to a new file in our application’s Documents directory.

Start, as in the previous example, by calling valueForProperty: with the MPMediaItemPropertyAssetURL key to get an NSURL representing the song in a form that AV Foundation can work with.



-(IBAction) convertTapped: (id) sender {
	// set up an AVAssetReader to read from the iPod Library
	NSURL *assetURL = [song valueForProperty:MPMediaItemPropertyAssetURL];
	AVURLAsset *songAsset =
		[AVURLAsset URLAssetWithURL:assetURL options:nil];

	NSError *assetError = nil;
	AVAssetReader *assetReader =
		[[AVAssetReader assetReaderWithAsset:songAsset
			   error:&assetError]
		  retain];
	if (! assetReader) {
		NSLog (@"error: %@", assetError);
		return;
	}

Sorry about the dangling retains. I’ll explain those in a little bit (and yes, you could use the alloc/init equivalents… I’m making a point here…). Anyway, it’s simple enough to take an AVAsset and make an AVAssetReader from it.

But what do you do with that? Contrary to what you might think, you don’t just read from it directly. Instead, you create another object, an AVAssetReaderOutput, which is able to produce samples from an AVAssetReader.


AVAssetReaderOutput *assetReaderOutput =
	[[AVAssetReaderAudioMixOutput 
	  assetReaderAudioMixOutputWithAudioTracks:songAsset.tracks
				audioSettings: nil]
	retain];
if (! [assetReader canAddOutput: assetReaderOutput]) {
	NSLog (@"can't add reader output... die!");
	return;
}
[assetReader addOutput: assetReaderOutput];

AVAssetReaderOutput is abstract. Since we’re only interested in the audio from this asset, an AVAssetReaderAudioMixOutput will suit us fine. For reading video samples from an audio/video file, like a QuickTime movie, we’d want an AVAssetReaderVideoCompositionOutput instead. An important point here is that we set audioSettings to nil to get a generic PCM output. The alternative is to provide an NSDictionary specifying the format you want to receive; I ended up doing that later in the output step, so the default PCM here will be fine.
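
Incidentally, if you did want the reader output to hand you a specific flavor of PCM, you’d pass a settings dictionary here instead of nil. The following sketch is my own illustration, not part of the sample project; note that AVAssetReaderAudioMixOutput only accepts uncompressed linear PCM in its output settings:


// sketch: ask the reader output for 16-bit integer PCM explicitly,
// rather than taking the default. Keys are from AVAudioSettings.h.
NSDictionary *readerOutputSettings =
	[NSDictionary dictionaryWithObjectsAndKeys:
		[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
		[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
		[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
		[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
		nil];
AVAssetReaderOutput *pcmReaderOutput =
	[AVAssetReaderAudioMixOutput
		assetReaderAudioMixOutputWithAudioTracks:songAsset.tracks
		audioSettings:readerOutputSettings];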

That’s all we need to worry about for now for reading from the song file. Now let’s start dealing with writing the converted file. We start by setting up an output file… the only important thing to know here is that AV Foundation won’t overwrite a file for you, so you should delete the exported.caf if it already exists.


NSArray *dirs = NSSearchPathForDirectoriesInDomains 
				(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectoryPath = [dirs objectAtIndex:0];
NSString *exportPath = [[documentsDirectoryPath
				 stringByAppendingPathComponent:EXPORT_NAME]
				retain];
if ([[NSFileManager defaultManager] fileExistsAtPath:exportPath]) {
	[[NSFileManager defaultManager] removeItemAtPath:exportPath
		error:nil];
}
NSURL *exportURL = [NSURL fileURLWithPath:exportPath];

Yeah, there’s another spurious retain here. I’ll explain later. For now, let’s take exportURL and create the AVAssetWriter:


AVAssetWriter *assetWriter =
	[[AVAssetWriter assetWriterWithURL:exportURL
		  fileType:AVFileTypeCoreAudioFormat
			 error:&assetError]
	  retain];
if (! assetWriter) {
	NSLog (@"error: %@", assetError);
	return;
}

OK, no sweat there, but the AVAssetWriter isn’t really the important part. Just as the reader is paired with “reader output” objects, so too is the writer connected to “writer input” objects, which is what we’ll be providing samples to, in order to write them to the filesystem.

To create the AVAssetWriterInput, we provide an NSDictionary describing the format and contents we want to create… this is analogous to the step we skipped earlier, when we accepted the default format from the AVAssetReaderOutput. The dictionary keys are defined in AVAudioSettings.h and AVVideoSettings.h. You may need to look in these header files for the value types to provide for these keys, and in some cases, they’ll point you to the Core Audio header files. Trial and error led me to ultimately specify all of the fields that would appear in an AudioStreamBasicDescription, along with an AudioChannelLayout structure, which needs to be wrapped in an NSData in order to be added to an NSDictionary.



AudioChannelLayout channelLayout;
memset(&channelLayout, 0, sizeof(AudioChannelLayout));
channelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
NSDictionary *outputSettings =
	[NSDictionary dictionaryWithObjectsAndKeys:
		[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
		[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
		[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
		[NSData dataWithBytes:&channelLayout
			length:sizeof(AudioChannelLayout)], AVChannelLayoutKey,
		[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
		[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
		[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
		[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
		nil];

With this dictionary describing 44.1 kHz, stereo, 16-bit, interleaved, little-endian integer PCM, we can create an AVAssetWriterInput to encode and write samples in this format.


AVAssetWriterInput *assetWriterInput =
	[[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
				outputSettings:outputSettings]
	retain];
if ([assetWriter canAddInput:assetWriterInput]) {
	[assetWriter addInput:assetWriterInput];
} else {
	NSLog (@"can't add asset writer input... die!");
	return;
}
assetWriterInput.expectsMediaDataInRealTime = NO;

Notice that we’ve set the property assetWriterInput.expectsMediaDataInRealTime to NO. This will allow our transcode to run as fast as possible; of course, you’d set this to YES if you were capturing or generating samples in real-time.
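
For contrast, here’s a sketch of what the real-time case might look like: buffers arriving from an AVCaptureAudioDataOutput delegate callback rather than from a reader output. This isn’t in the sample project — it assumes assetWriterInput is an instance variable and that you’ve wired up a capture session — it just shows where expectsMediaDataInRealTime = YES would come into play:


// sketch: in the real-time case, the capture system pushes buffers to us
// via the AVCaptureAudioDataOutputSampleBufferDelegate method, and we
// drop any that arrive while the writer input isn't ready, rather than block
- (void) captureOutput: (AVCaptureOutput *) captureOutput
	didOutputSampleBuffer: (CMSampleBufferRef) sampleBuffer
	fromConnection: (AVCaptureConnection *) connection {
	if (assetWriterInput.readyForMoreMediaData) {
		[assetWriterInput appendSampleBuffer:sampleBuffer];
	}
}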

Now that our reader and writer are ready, we signal that we’re ready to start moving samples around:


[assetWriter startWriting];
[assetReader startReading];
AVAssetTrack *soundTrack = [songAsset.tracks objectAtIndex:0];
CMTime startTime = CMTimeMake (0, soundTrack.naturalTimeScale);
[assetWriter startSessionAtSourceTime: startTime];

These calls allow us to start reading from the reader and writing to the writer… but just how do we do that? The key is the AVAssetReaderOutput method copyNextSampleBuffer. This call produces a Core Media CMSampleBufferRef, which is what we need to provide to the AVAssetWriterInput’s appendSampleBuffer: method.
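
As an aside, if you wanted to inspect or process the samples on their way through, rather than just handing them off, you can reach into a CMSampleBufferRef’s underlying block buffer. A sketch of the idea — my code, not the sample project’s — given a CMSampleBufferRef called sampleBuffer, and assuming you asked the reader output for 16-bit interleaved PCM as in the earlier sketch:


// sketch: get at the raw bytes inside a sample buffer
CMBlockBufferRef blockBuffer =
	CMSampleBufferGetDataBuffer (sampleBuffer);
size_t lengthAtOffset = 0;
size_t totalLength = 0;
char *dataPointer = NULL;
OSStatus err = CMBlockBufferGetDataPointer (blockBuffer, 0,
	&lengthAtOffset, &totalLength, &dataPointer);
if ((err == kCMBlockBufferNoErr) && (lengthAtOffset == totalLength)) {
	// with 16-bit interleaved stereo PCM, these are SInt16 samples,
	// alternating left and right channels
	SInt16 *samples = (SInt16 *) dataPointer;
	NSLog (@"first sample: %d", samples[0]);
}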

But this is where it starts getting tricky. We can’t just drop into a while loop and start copying buffers over. We have to be explicitly signaled that the writer is able to accept input. We do this by providing a block to the asset writer input’s requestMediaDataWhenReadyOnQueue:usingBlock:. Once we do this, our code continues on, while the block is called asynchronously and repeatedly by Grand Central Dispatch. This explains the earlier retains: autoreleased variables created here in convertTapped: will be released before the block runs, while we need them to still be around when the block is executed. So we need to take care that stuff we need is available inside the block: objects must not be released out from under us, and local primitives that the block mutates (like the byte counter below) need the __block modifier.


__block UInt64 convertedByteCount = 0;
dispatch_queue_t mediaInputQueue =
	dispatch_queue_create("mediaInputQueue", NULL);
[assetWriterInput requestMediaDataWhenReadyOnQueue:mediaInputQueue 
										usingBlock: ^ 
 {

The block will be called repeatedly by GCD, but we still need to make sure that the writer input is able to accept new samples.


while (assetWriterInput.readyForMoreMediaData) {
	CMSampleBufferRef nextBuffer =
		[assetReaderOutput copyNextSampleBuffer];
	if (nextBuffer) {
		// append buffer
		[assetWriterInput appendSampleBuffer: nextBuffer];
		// update ui
		convertedByteCount +=
			CMSampleBufferGetTotalSampleSize (nextBuffer);
		// copyNextSampleBuffer follows the Create rule,
		// so release the buffer or it will leak
		CFRelease (nextBuffer);
		NSNumber *convertedByteCountNumber =
			[NSNumber numberWithUnsignedLongLong:convertedByteCount];
		[self performSelectorOnMainThread:@selector(updateSizeLabel:)
			withObject:convertedByteCountNumber
			waitUntilDone:NO];

What’s happening here is that while the writer input can accept more samples, we try to get a sample from the reader output. If we get one, appending it to the writer input is a one-line call. Updating the UI is another matter: since GCD has us running on an arbitrary thread, we have to use performSelectorOnMainThread: for any updates to the UI, such as updating a label with the current total byte count. We would also have to call out to the main thread to update the progress bar, which is currently unimplemented because I don’t have a good way to compute it yet.
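
One plausible way to estimate progress, for what it’s worth, would be to compare each buffer’s presentation timestamp against the asset’s duration. Here’s a sketch of the idea (a hypothetical helper of my own, not in the sample project); you’d call it inside the while loop, before releasing nextBuffer, and send the result to the main thread the same way as the byte count:


// sketch: estimate conversion progress from a sample buffer's timestamp,
// given the asset's total duration (songAsset.duration)
static float progressForBuffer (CMSampleBufferRef buffer,
		CMTime assetDuration) {
	CMTime presentationTime =
		CMSampleBufferGetPresentationTimeStamp (buffer);
	Float64 seconds = CMTimeGetSeconds (presentationTime);
	Float64 totalSeconds = CMTimeGetSeconds (assetDuration);
	return (totalSeconds > 0.0) ? (float) (seconds / totalSeconds) : 0.0f;
}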

If the writer input is ever unable to accept new samples, we fall out of the while loop and the block, though GCD will re-run the block whenever the input is ready for more data, until we explicitly finish the writer.

How do we know when to do that? When we don’t get a sample from copyNextSampleBuffer, which means we’ve read all the data from the reader.


} else {
	// done!
	[assetWriterInput markAsFinished];
	[assetWriter finishWriting];
	[assetReader cancelReading];
	NSDictionary *outputFileAttributes =
		[[NSFileManager defaultManager]
			  attributesOfItemAtPath:exportPath
			  error:nil];
	NSLog (@"done. file size is %ld",
		    [outputFileAttributes fileSize]);
	NSNumber *doneFileSize = [NSNumber numberWithLong:
			[outputFileAttributes fileSize]];
	[self performSelectorOnMainThread:@selector(updateCompletedSizeLabel:)
			withObject:doneFileSize
			waitUntilDone:NO];
	// release a lot of stuff
	[assetReader release];
	[assetReaderOutput release];
	[assetWriter release];
	[assetWriterInput release];
	[exportPath release];
	break;
}

Reaching the finish state requires us to tell the writer to finish up the file by sending finish messages to both the writer input and the writer itself. After we update the UI (again, with the song-and-dance required to do so on the main thread), we release all the objects we had to retain in order that they would be available to the block.

Finally, for those of you copy-and-pasting at home, I think I owe you some close braces:


		}
	 }];
	NSLog (@"bottom of convertTapped:");
}

Once you’ve run this code on the device (it won’t work in the Simulator, which doesn’t have an iPod Library) and performed a conversion, you’ll have converted PCM in an exported.caf file in your app’s Documents directory. In theory, your app could do something interesting with this file, like representing it as a waveform, or running it through a Core Audio AUGraph to apply some interesting effects. Just to prove that we actually have performed the desired conversion, use the Xcode Organizer to open up the “iPod Reader” application and drag its “Application Data” to your Mac:

Accessing app's documents with Xcode Organizer

The exported folder will have a Documents folder, in which you should find exported.caf. Drag it over to QuickTime Player or any other application that can show you the format of the file you’ve produced:

QuickTime Player inspector showing PCM format of exported.caf file
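
If you’d rather verify the conversion in code — or take a first step toward that waveform idea — Core Audio’s Extended Audio File Services can read the converted file back. Here’s a sketch (a hypothetical helper of my own, requiring AudioToolbox.framework) that just finds the loudest sample in the file:


#include <AudioToolbox/AudioToolbox.h>

// sketch: read exported.caf back as 16-bit stereo interleaved PCM and
// return the peak sample value, e.g., as a first step toward a waveform view
static SInt16 peakSampleOfFile (NSURL *fileURL) {
	ExtAudioFileRef audioFile = NULL;
	if (ExtAudioFileOpenURL ((CFURLRef) fileURL, &audioFile) != noErr) {
		return 0;
	}
	// ask ExtAudioFile to hand us the same format we wrote
	AudioStreamBasicDescription clientFormat = {0};
	clientFormat.mSampleRate = 44100.0;
	clientFormat.mFormatID = kAudioFormatLinearPCM;
	clientFormat.mFormatFlags =
		kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
	clientFormat.mChannelsPerFrame = 2;
	clientFormat.mBitsPerChannel = 16;
	clientFormat.mFramesPerPacket = 1;
	clientFormat.mBytesPerFrame = 4; // 2 channels x 2 bytes, interleaved
	clientFormat.mBytesPerPacket = 4;
	ExtAudioFileSetProperty (audioFile,
		kExtAudioFileProperty_ClientDataFormat,
		sizeof (clientFormat), &clientFormat);
	SInt16 sampleBuffer[4096];
	SInt16 peak = 0;
	while (1) {
		AudioBufferList bufferList;
		bufferList.mNumberBuffers = 1;
		bufferList.mBuffers[0].mNumberChannels = 2;
		bufferList.mBuffers[0].mDataByteSize = sizeof (sampleBuffer);
		bufferList.mBuffers[0].mData = sampleBuffer;
		UInt32 frameCount = 2048; // 4096 samples / 2 channels
		if (ExtAudioFileRead (audioFile, &frameCount, &bufferList) != noErr) {
			break;
		}
		if (frameCount == 0) break; // end of file
		for (UInt32 i = 0; i < (frameCount * 2); i++) {
			SInt16 amplitude = sampleBuffer[i];
			if (amplitude < 0) amplitude = -amplitude; // ignores INT16_MIN edge case
			if (amplitude > peak) peak = amplitude;
		}
	}
	ExtAudioFileDispose (audioFile);
	return peak;
}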

Hopefully this is going to work for you. It worked for most Amazon and iTunes albums I threw at it, but I found I had an iTunes Plus album, Ashtray Rock by the Joel Plaskett Emergency, whose songs throw an inexplicable error when opened, so I can’t presume to fully understand this API just yet:


2010-12-12 15:28:18.939 VTM_AViPodReader[7666:307] *** Terminating app
 due to uncaught exception 'NSInvalidArgumentException', reason:
 '*** -[AVAssetReader initWithAsset:error:] invalid parameter not
 satisfying: asset != ((void *)0)'

Still, the arrival of AVAssetReader and AVAssetWriter opens up a lot of new possibilities for audio and video apps on iOS. With the reader, you can inspect media samples, either in their original format or converted to a form that suits your code. With the writer, you can supply samples that you receive by transcoding (as I’ve done here), by capture, or even samples you generate programmatically (such as a screen recorder class that grabs the screen as often as possible and writes it to a movie file).
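
To make that last idea slightly more concrete, here’s a sketch of what the video-flavored writer input might look like — settings keys from AVVideoSettings.h, with the actual grab-the-screen-into-a-CVPixelBufferRef part left as an exercise:


// sketch: an AVAssetWriterInput for H.264 video, plus the pixel buffer
// adaptor you'd use to append programmatically-generated frames
NSDictionary *videoSettings =
	[NSDictionary dictionaryWithObjectsAndKeys:
		AVVideoCodecH264, AVVideoCodecKey,
		[NSNumber numberWithInt:320], AVVideoWidthKey,
		[NSNumber numberWithInt:480], AVVideoHeightKey,
		nil];
AVAssetWriterInput *videoWriterInput =
	[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
		outputSettings:videoSettings];
videoWriterInput.expectsMediaDataInRealTime = YES;
AVAssetWriterInputPixelBufferAdaptor *pixelBufferAdaptor =
	[AVAssetWriterInputPixelBufferAdaptor
		assetWriterInputPixelBufferAdaptorWithAssetWriterInput:videoWriterInput
		sourcePixelBufferAttributes:nil];
// then, for each frame you render into a CVPixelBufferRef:
// [pixelBufferAdaptor appendPixelBuffer:pixelBuffer
//		withPresentationTime:frameTime];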

27 Comments

  • 1. nick_king_equate replies on 25th January 2011 at 6:26 am:

    Hi,

    first off, thanks for this useful bit of code. Im still a bit fuzzy about blocks, but understand the rest surprisingly enough.

    I am working on an app at present that uses the ipod media library. I have lifted a bit of your code to do this. Unfortunately when I run instruments over it it starts generating leaks despite what seems to be good releasing of the memory involved. This problem compounds for every successive access to the library until it gets an out of memory message.

    I thought this was bound to be something stupid that I did, but after much banging of head against the screen I ran your original code through instruments and saw that it was having the same issues.

    Is this something you are aware of and possibly a pitfall of using the library in this manner. Any direction you could give at this point would make my life a little better

    Cheers

    N.

  • 2. cocell replies on 4th February 2011 at 2:19 pm:

    // release a lot of stuff
    [assetReader release];
    [assetReaderOutput release];
    [assetWriter release];
    [assetWriterInput release];
    [exportPath release];

    Yeah, I notice that these aren’t being released at all and is causing major leaks and then crashes. Anyone know a fix to this? I’m trying right now, I thought it was my AU.

  • 3. [Time code];&hellip replies at 4th March 2011 um 11:11 pm :

    […] This got easier in iOS 4.1. Please forget everything you’ve read here and go read From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary […]

  • 4. cocell replies on 6th March 2011 at 12:49 pm:

    I believe this is the right blog. I used the code from VTM_AViPodReader.zip, and I get Leaks for some reason I don’t know, And crash after converting new song from iPod library.

  • 5. Hardik Nimavat replies on 25th April 2011 at 10:07 am:

    Hello, After converting 4 to 5 songs app crashes due to memory issue. It gets memory level warning – 2. and crashes. Can you help me to solve this problem?

    Thanks,
    Hardik Nimavat

  • 6. Ahsan replies on 2nd May 2011 at 5:52 am:

    Hi there,
    Can you help me out a bit?

    I have read mp3 and I am able to get CMSampleBufferRef when I check AudioBuffer’s mData (the void pointer) by casting it in SInt16 I get some values. Does these values indicate amplitudes of the audio?
    What does mData indicates w.r.t audio?
    Thanks!

  • 7. adilsherwani replies on 27th May 2011 at 5:45 pm:

    Awesome article, thanks! A quick tip on the leaks: the CMSampleBufferRef is leaking as [assetReaderOutput copyNextSampleBuffer] follows the Create rule for memory management. A call to CFRelease after you’re done with it gets rid of the memory warnings!

  • 8. [Time code];&hellip replies at 28th May 2011 um 1:47 pm :

    […] added in 4.1 to do sample level access, AVAssetWriter and AVAssetReader. An earlier blog entry, From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary, exercises both of these, reading from an iPod Library song with an AVAssetReader and writing to a […]

  • 9. profiles.google.com/milburn.and… replies on 12th June 2011 at 8:44 am:

    Greetings! Love your work. I’m getting errors with this approach (and related approaches) in iOS5 beta. I know we aren’t free to discuss these here, so I’ve started a thread on the dev forums here:

    https://devforums.apple.com/thread/104433

  • 10. cadamson replies on 13th June 2011 at 9:26 am:

    Andy: I just filed 9597654 on bugreport.apple.com to report the iOS 5 crash. I didn’t dupe to OpenRadar because of NDA concerns. See you on the forum.

  • 11. XinQiang Liu replies on 5th July 2011 at 7:15 am:

    Great code, It running fine at iPad2. But I want to directly play it like a stream song. Do you have some suggestion? My thought is convert it to a wave file , then share it by a embed web server. But convert time is too long.

  • 12. cadamson replies on 5th July 2011 at 9:04 am:

    XinQiang Liu: If all you need to do is to play a song from the music library, ignore all this and just use MPMusicPlayerController.

    I don’t get why you want to “play it like a stream song”… since we’ve already put the PCM samples in a CAF file, you could play it from the file with AVAudioPlayer or AVPlayer, etc. If you have some reason to go to a lower level, you could feed the samples directly to Core Audio’s Audio Queue Services or the AURemoteIO Audio Unit.

    Hope this helps. Sounds like you might want to look around at the different iOS media frameworks and get a sense of what each one provides. Have you read Apple’s Multimedia Programming guide?

  • 13. XinQiang Liu replies on 11th July 2011 at 5:15 am:

    Thanks, I want play music on iPhone by other application running on PC/Mac, for example: VLC player, MediaPlayer. These application can play a music on web site. I have running a web server on iPhone, so want to share the converted music to PC client. Is it possible? By the way , can I convert music to mp3 or other format? It looks PCM is very large and hard to play on network. Thanks

  • 14. cadamson replies on 11th July 2011 at 11:26 am:

    Of course you can convert to something else with AVAssetReader, but probably not MP3: iOS reads MP3 but doesn’t write it (licensing costs). AAC is an obvious choice. PCM is not meant for streaming… it’s uncompressed! PCM is interesting here because you can perform DSP on it, or run it through an AUGraph.
    Streaming from the iPhone to the computer means you need to start by picking a protocol supported by your PC player, like Shoutcast or HTTP Live Streaming. You need to ensure that the payload for these formats is something the iPhone can provide (so, probably not MP3, since AVF and Core Audio won’t convert to it). If you find something that works, you can use AVAssetReader to get packets in that format, and then figure out how to deliver them via your network protocol. Also, to act as a server, you have a whole networking piece you will need to write, probably with CFNetwork or the classic unix networking APIs. Also, converting audio and keeping your wifi running full tilt is going to drain your battery really fast. Probably better to just write an .m4a file with AAC and serve it via your existing web server code.
    This sounds like an intellectual exercise at best, and more accurately a Fool’s Errand. Also, it sounds like you are in WAY over your head, and should find a simpler project without so many different concepts to master.

  • 15. juankmin replies on 31st October 2011 at 4:46 pm:

    Hello .. Great Blog… I was wondering if i could get some help here.. i’m trying to build a music player using the iPod Library and implement a level meter… i know i can meter de audio signal with PeakPowerForChannel and AveragePowerForChannel in the AVAudioPlayer but i don’t know if i have to do all this (convert to .caf) to play an MPMediaItem with the AVAudioplayer or if there’s another option for me to do that !!!
    Thanks!!!

  • 16. krzemienski replies on 15th December 2011 at 10:07 pm:

    So I have a question, Is there a way to just convert a certain part of the file. For instance if I have a timecode for the file like i only want to convert from 5.00 to 10.00 minutes of the track into the caf can i do that?

  • 17. How to make iPhone host l&hellip replies at 10th January 2012 um 3:52 pm :

    […] reference http://www.subfurther.com/blog/2010/12/13/from-ipod-library-to-pcm-samples-in-far-fewer-steps-than-w…, I have get AVAssetReader, how to create a url like “http:///a.mp3″, so other machine […]

  • 18. juankmin replies on 21st March 2012 at 9:50 pm:

    Please help me!! how can i play the ” exported.caf” with the AVAudioplayer ? !!

  • 19. jarryd replies on 1st August 2012 at 7:10 am:

    Hi. Thank you for this. How would I export an .aiff file instead of a .caf file?

  • 20. cadamson replies on 1st August 2012 at 7:59 am:

    jarryd: if you’ve already tried just changing the file name and that doesn’t work, go change the [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey to say YES, since AIFF can only be big-endian.

  • 21. How to stream MPMediaItem&hellip replies at 19th May 2013 um 3:02 pm :

    […] read this article From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary about how to convert music from iPod library to PCM […]

  • 22. UITableViewCell not hidin&hellip replies at 30th May 2013 um 5:02 pm :

    […] from the music library, I’m using a variation of the code posted by Chris Adamson in his blog here. The user can queue up as many imports as necessary, and code will automatically import them in […]

  • 23. alkex replies on 5th October 2013 at 1:59 pm:

    Hi Chris,

    Thanks for posting this! Awesome stuff.

    Question: Would it be easy enough to have the app automatically choose a song at random and write it to CAF without any user interaction?
    I want to design a kind of “guess the song” app
    Thanks for your help!

    Cheers
    Alex

  • 24. cadamson replies on 6th October 2013 at 7:38 pm:

    alkex: You can do the random selection by just creating an MPMediaQuery with the class method songsQuery and then not adding any predicates. Then get the items array and select an item at random.

    If you just want to play the song as-is, use the MPMusicPlayerController… no need to do any format conversions.

    I wrote a music quiz app years ago (and just pulled it from the store, actually), and the MediaPlayer framework makes that kind of thing pretty easy.

  • 25. rvega replies on 1st December 2013 at 6:34 pm:

    Hi, thanks for this! Perhaps you can give me a hand…

    I’m implementing an ios app using libpd for the DSP but libpd can only read wav or aiff (no floating point) files. I’m changing the exported filename to “exported.aiff” and changing the big endian bit as suggested in another comment but the created file is 32bit floating point aiff.

    Any thoughts on how to force exporting to 16 or 24bit integer?

    Thanks!

  • 26. rvega replies on 1st December 2013 at 11:05 pm:

    Here’s what worked for me:

    Exported as .caf, opened as raw data in audacity and looked at the waveform to figure out that the header is 4096 bytes long and opened in pd using the soundfiler object with options: “-raw 4096 2 2 b” (headersize, num channels, num bytes per sample, big endian)

  • 27. ohmg00dn3ss19 replies on 1st April 2014 at 6:22 pm:

    Can I do this with a recording I made? Basically I need the app to record audio, save said audio, and display as a waveform – can I do this with a version of this code? I ask because I noticed that this is more for IOS 4 and was wondering what sort of updates should be made to this code as such, as well as for my own purposes.
