Tutorial

Building a Full Screen Camera App Using AVFoundation


Today, we’ll be learning how to use AV Foundation, an Apple system framework that is available on iOS, macOS, watchOS, and tvOS. The goal of this tutorial is to help you build a fully functional iOS app that can capture photos and videos using the device’s cameras. We’ll also follow the principles of good object-oriented programming and design a utility class that you can reuse and extend in all your projects.

Note: This tutorial requires a physical iOS device, not the simulator. You won’t be able to run the demo app on the simulator. This tutorial also assumes that you have a relatively strong knowledge of basic UIKit concepts such as actions, Interface Builder, and Storyboards, along with a working knowledge of Swift.

What is AV Foundation?

AV Foundation is the full featured framework for working with time-based audiovisual media on iOS, macOS, watchOS and tvOS. Using AV Foundation, you can easily play, create, and edit QuickTime movies and MPEG-4 files, play HLS streams, and build powerful media functionality into your apps. – Apple

So, there you have it. AV Foundation is a framework for capturing, processing, and editing audio and video on Apple devices. In this tutorial, we’ll specifically be using it to capture photos and videos, complete with multiple camera support, front and rear flash, and audio for videos.

Do I need AV Foundation?

Before you embark on this journey, remember that AV Foundation is a complex and intricate tool. In many instances, using Apple’s default APIs such as UIImagePickerController will suffice. Make sure you actually need to use AV Foundation before you begin this tutorial.

Sessions, Devices, Inputs, and Outputs

At the core of capturing photos and videos with AV Foundation is the capture session. According to Apple, the capture session is “an object that manages capture activity and coordinates the flow of data from input devices to capture outputs.” In AV Foundation, capture sessions are managed by the AVCaptureSession object.

Additionally, the capture device is used to actually access the physical audio and video capture devices available on an iOS device. To use AVFoundation, you take capture devices, use them to create capture inputs, provide the session with these inputs, and then save the result in capture outputs. Here’s a diagram that I made that depicts this relation:

Flowchart

Example Project

As always, we want you to explore the framework by getting your hands dirty. You’ll work on an example project, but to let us focus on the discussion of the AVFoundation framework, this tutorial comes with a starter project. Before you move on, download the starter project here and take a quick look.

The example project is rather basic. It contains:

  • An Assets.xcassets file that contains all of the necessary iconography for our project. Credit goes to Google’s Material Design team for these icons. You can find them, along with hundreds of others, available for free at material.io/icons.
  • A Storyboard file with one view controller. This view controller will be used to handle all photo and video capture within our app. It contains:
    • A capture button to initiate photo / video capture.
    • A capture preview view so that you can see what the camera sees in real time.
    • The necessary controls for switching cameras and toggling the flash.
  • A ViewController.swift file that’s responsible for managing the view controller mentioned above. It contains:
    • All of the necessary outlets that connect the UI controls mentioned above to our code.
    • A computed property to hide the status bar.
    • A setup function that styles the capture button appropriately.

Build and run the project, and you should see something like this:

AVFoundation Sample Project

Cool! Let’s get started!

Working with AVFoundation

In this tutorial, we’re going to design a class called CameraController that will be responsible for doing the heavy lifting related to photo and video capture. Our view controller will use CameraController and bind it to our user interface.

To get started, create a new Swift file in your project and call it CameraController.swift. Import AVFoundation and declare an empty class, like this:
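
Here’s a bare-bones version, just the import and an empty class to fill in:

import AVFoundation

class CameraController {
}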

Photo Capture

To begin, we’re going to implement photo capture with the rear camera. This will be our baseline functionality; we’ll add camera switching, the flash, and video recording on top of it later. Since configuring and starting a capture session is a relatively expensive procedure, we’re going to decouple it from init and create a function, called prepare, that prepares our capture session for use and calls a completion handler when it’s done. Add a prepare function to your CameraController class:
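
A reasonable signature is a completion handler that receives an optional Error (nil on success):

func prepare(completionHandler: @escaping (Error?) -> Void) {
}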

This function will handle the creation and configuration of a new capture session. Remember, setting up the capture session consists of 4 steps:

  1. Creating a capture session.
  2. Obtaining and configuring the necessary capture devices.
  3. Creating inputs using the capture devices.
  4. Configuring a photo output object to process captured images.

We’ll use Swift’s nested functions to encapsulate our code in a manageable way. Start by declaring 4 empty functions within prepare and then calling them:
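
Here’s a sketch of how the body of prepare might be laid out. The queue label is arbitrary; the four nested functions are the ones we’ll fill in over the next sections:

func prepare(completionHandler: @escaping (Error?) -> Void) {
    func createCaptureSession() { }
    func configureCaptureDevices() throws { }
    func configureDeviceInputs() throws { }
    func configurePhotoOutput() throws { }

    // Do the setup work off the main thread, then report back on the main queue.
    DispatchQueue(label: "prepare").async {
        do {
            createCaptureSession()
            try configureCaptureDevices()
            try configureDeviceInputs()
            try configurePhotoOutput()
        }
        catch {
            DispatchQueue.main.async {
                completionHandler(error)
            }

            return
        }

        DispatchQueue.main.async {
            completionHandler(nil)
        }
    }
}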

In the above code listing, we’ve created boilerplate functions for performing the 4 key steps in preparing an AVCaptureSession for photo capture. We’ve also set up an asynchronously executing block that calls the four functions, catches any errors if necessary, and then calls the completion handler. All we have left to do is implement the four functions! Let’s start with createCaptureSession.

Create Capture Session

Before configuring a given AVCaptureSession, we need to create it! Add the following property to your CameraController.swift file:
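
An optional works well here, since the session doesn’t exist until prepare has run:

var captureSession: AVCaptureSession?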

Next, add the following to the body of your createCaptureSession function that’s nested within prepare:
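
It can be as simple as:

func createCaptureSession() {
    self.captureSession = AVCaptureSession()
}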

This is simple code; it simply creates a new AVCaptureSession and stores it in the captureSession property.

Configure Capture Devices

Now that we’ve created an AVCaptureSession, we need to create the AVCaptureDevice objects to represent the actual iOS device’s cameras. Go ahead and add the following properties to your CameraController class. We’re going to add the frontCamera and rearCamera properties now because we’ll be setting up the basics of multicamera capture, and implementing the ability to change cameras later.
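
Two optional properties are enough, since not every device has both cameras:

var frontCamera: AVCaptureDevice?
var rearCamera: AVCaptureDevice?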

Next, declare an embedded type within CameraController.swift. We’ll be using this embedded type to manage the various errors we might encounter while creating a capture session:
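
The exact cases are up to you; here’s one possible set that covers the failures we’ll hit in this tutorial:

enum CameraControllerError: Swift.Error {
    case captureSessionAlreadyRunning
    case captureSessionIsMissing
    case inputsAreInvalid
    case invalidOperation
    case noCamerasAvailable
    case unknown
}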

You’ll notice that there are various error types in this enum. Just add them now; we’re going to use them later.

Now it comes to the fun part! Let’s find the cameras available on the device. We can do this with AVCaptureDeviceDiscoverySession. Add the following to configureCaptureDevices:
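
A sketch of the implementation. Note that in recent Swift versions the discovery session type is spelled AVCaptureDevice.DiscoverySession:

func configureCaptureDevices() throws {
    // 1. Find all of the wide-angle cameras available on this device.
    let session = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera],
                                                   mediaType: .video,
                                                   position: .unspecified)
    let cameras = session.devices
    guard !cameras.isEmpty else { throw CameraControllerError.noCamerasAvailable }

    // 2. Sort them into front and rear, and configure the rear camera to autofocus.
    for camera in cameras {
        if camera.position == .front {
            self.frontCamera = camera
        }

        if camera.position == .back {
            self.rearCamera = camera

            try camera.lockForConfiguration()
            if camera.isFocusModeSupported(.continuousAutoFocus) {
                camera.focusMode = .continuousAutoFocus
            }
            camera.unlockForConfiguration()
        }
    }
}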

Here’s what we just did:

  1. The first code segment uses a capture device discovery session to find all of the wide-angle cameras available on the current device and converts them into an array of non-optional AVCaptureDevice instances. If no cameras are available, we throw an error.
  2. The loop in the second segment looks through the cameras found in segment 1 and determines which is the front camera and which is the rear camera. It additionally configures the rear camera to autofocus, throwing any errors that are encountered along the way.

Cool! We used AVCaptureDeviceDiscoverySession to find the available cameras on the device and configure them to meet our specifications. Let’s connect them to our capture session.

Configure Device Inputs

Now we can create capture device inputs, which take capture devices and connect them to our capture session. Before we do this, add the following properties to CameraController to ensure that we can store our inputs:
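
We’ll track which camera is currently active alongside the two inputs:

var currentCameraPosition: CameraPosition?

var frontCameraInput: AVCaptureDeviceInput?
var rearCameraInput: AVCaptureDeviceInput?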

Our code won’t compile in this state, because CameraPosition is undefined. Let’s define it. Add this as an embedded type within CameraController:
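
A simple two-case enum does the job:

enum CameraPosition {
    case front
    case rear
}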

Great. Now we have all the necessary properties for storing and managing our capture device inputs. Let’s implement configureDeviceInputs:
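
Here’s one way to write it; the numbers match the explanation below:

func configureDeviceInputs() throws {
    // 1. Make sure we actually have a capture session to configure.
    guard let captureSession = self.captureSession else { throw CameraControllerError.captureSessionIsMissing }

    // 2. Prefer the rear camera; fall back on the front camera if necessary.
    if let rearCamera = self.rearCamera {
        self.rearCameraInput = try AVCaptureDeviceInput(device: rearCamera)

        if captureSession.canAddInput(self.rearCameraInput!) { captureSession.addInput(self.rearCameraInput!) }

        self.currentCameraPosition = .rear
    }
    else if let frontCamera = self.frontCamera {
        self.frontCameraInput = try AVCaptureDeviceInput(device: frontCamera)

        if captureSession.canAddInput(self.frontCameraInput!) { captureSession.addInput(self.frontCameraInput!) }
        else { throw CameraControllerError.inputsAreInvalid }

        self.currentCameraPosition = .front
    }
    else { throw CameraControllerError.noCamerasAvailable }
}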

Here’s what we did:

  1. This line simply ensures that captureSession exists. If not, we throw an error.
  2. These if statements are responsible for creating the necessary capture device input to support photo capture. AVFoundation only allows one camera-based input per capture session at a time. Since the rear camera is traditionally the default, we attempt to create an input from it and add it to the capture session. If that fails, we fall back on the front camera. If that fails as well, we throw an error.

Configure Photo Output

Up until this point, we’ve added all the necessary inputs to captureSession. Now we just need a way to get the necessary data out of our capture session. Luckily, we have AVCapturePhotoOutput. Add one more property to CameraController:
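
Again, an optional property:

var photoOutput: AVCapturePhotoOutput?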

Now, let’s implement configurePhotoOutput like this:
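
A sketch, using the JPEG codec (spelled AVVideoCodecType.jpeg on iOS 11 and later):

func configurePhotoOutput() throws {
    guard let captureSession = self.captureSession else { throw CameraControllerError.captureSessionIsMissing }

    self.photoOutput = AVCapturePhotoOutput()
    self.photoOutput!.setPreparedPhotoSettingsArray(
        [AVCapturePhotoSettings(format: [AVVideoCodecKey: AVVideoCodecType.jpeg])],
        completionHandler: nil)

    if captureSession.canAddOutput(self.photoOutput!) { captureSession.addOutput(self.photoOutput!) }

    captureSession.startRunning()
}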

This is a simple implementation. It just configures photoOutput, telling it to expect photos in the JPEG format via the AVVideoCodecKey setting. Then, it adds photoOutput to captureSession. Finally, it starts captureSession.

We’re almost done! At this point, your CameraController.swift file should contain everything we’ve written so far: the capture session, camera, input, and output properties, the two embedded types, and the prepare function with its four nested helpers.

Note: I used extensions to segment the code appropriately. You don’t have to do this, but I think it’s good practice, because it makes your code easier to read and write.

Display Preview

Now that we have the camera device ready, it’s time to show what it captures on screen. Add another function to CameraController (outside of prepare) called displayPreview. It should have the following signature:
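
Throwing lets us report a missing or stopped session to the caller:

func displayPreview(on view: UIView) throws {
}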

Additionally, import UIKit in your CameraController.swift file. We’ll need it to work with UIView.

As its name suggests, this function will be responsible for creating a capture preview and displaying it on the provided view. Let’s add a property to CameraController to support this function:
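
One more optional property:

var previewLayer: AVCaptureVideoPreviewLayer?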

This property will hold the preview layer that displays the output of captureSession. Let’s implement the method:
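
A sketch of the implementation:

func displayPreview(on view: UIView) throws {
    guard let captureSession = self.captureSession, captureSession.isRunning else { throw CameraControllerError.captureSessionIsMissing }

    self.previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
    self.previewLayer?.videoGravity = .resizeAspectFill
    self.previewLayer?.connection?.videoOrientation = .portrait

    // Insert the preview layer below everything else so the UI controls stay visible.
    view.layer.insertSublayer(self.previewLayer!, at: 0)
    self.previewLayer?.frame = view.frame
}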

This function creates an AVCaptureVideoPreviewLayer using captureSession, sets it to the portrait orientation, and adds it to the provided view.

Wiring It Up

Cool! Now, let’s try connecting all this to our view controller. Head on over to ViewController.swift. First, add a property to ViewController.swift:
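
A single constant is all we need:

let cameraController = CameraController()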

Then, add a nested function in viewDidLoad():
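
Here’s a sketch. The capturePreviewView outlet name is assumed from the starter project, so adjust it if yours differs:

override func viewDidLoad() {
    super.viewDidLoad()

    func configureCameraController() {
        cameraController.prepare { (error) in
            if let error = error {
                print(error)
            }

            // Show the live preview once the capture session is ready.
            try? self.cameraController.displayPreview(on: self.capturePreviewView)
        }
    }

    // Keep the starter project's capture button styling call here as well.
    configureCameraController()
}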

This function simply prepares our camera controller like we designed it to.

Unfortunately, we still have one more step. This is a privacy requirement enforced by Apple: you have to provide a reason explaining why your app needs to use the camera. Open Info.plist and insert a row with the key NSCameraUsageDescription (displayed as “Privacy - Camera Usage Description”):

privacy-info-list-camera

The text you enter for this key is shown to the user when iOS asks for camera permission.

Your ViewController.swift file should now contain the cameraController property along with the updated viewDidLoad().

Build and run your project, press accept when the device asks for permission, and HOORAY! You should have a working capture preview. If not, recheck your code and leave a comment if you need help.

sample camera

Toggling the Flash / Switching Cameras

Now that we have a working preview, let’s add some more functionality to it. Most camera apps allow their users to switch cameras and enable or disable the flash. Let’s make ours do that as well. After we do this, we’ll add the ability to capture images and save them to the camera roll.

To start, we’re going to enable the ability to toggle the flash. Add this property to CameraController:
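
We’ll default the flash to off:

var flashMode = AVCaptureDevice.FlashMode.off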

Now, head over to ViewController. Add an @IBAction func to toggle the flash:
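
A minimal sketch that just flips the mode; the button outlets and icon assets in the starter project may be named differently, so updating the button image is left as a comment:

@IBAction func toggleFlash(_ sender: UIButton) {
    if cameraController.flashMode == .on {
        cameraController.flashMode = .off
    }
    else {
        cameraController.flashMode = .on
    }

    // Optionally swap the flash button's image here to reflect the new state.
}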

For now, this is all we have to do. Our CameraController class will handle the flash when we capture an image. Let’s move on to switching cameras.

Switching cameras in AV Foundation is a pretty easy task. We just need to remove the capture input for the existing camera and add a new capture input for the camera we want to switch to. Let’s add another function to our CameraController class for switching cameras:
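
It will throw if anything goes wrong:

func switchCameras() throws {
}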

When we switch cameras, we’ll either be switching to the front camera or to the rear camera. So, let’s declare 2 nested functions within switchCameras:
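
Empty stubs for now:

func switchToFrontCamera() throws { }
func switchToRearCamera() throws { }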

Now, add the following to switchCameras():
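
A sketch of the full body; the numbers match the explanation below:

func switchCameras() throws {
    // 1
    guard let currentCameraPosition = currentCameraPosition,
        let captureSession = self.captureSession, captureSession.isRunning
        else { throw CameraControllerError.captureSessionIsMissing }

    // 2
    captureSession.beginConfiguration()

    func switchToFrontCamera() throws { }
    func switchToRearCamera() throws { }

    // 3
    switch currentCameraPosition {
    case .front:
        try switchToRearCamera()

    case .rear:
        try switchToFrontCamera()
    }

    // 4
    captureSession.commitConfiguration()
}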

Here’s what we just did:

  1. This guard statement ensures that we have a valid, running capture session before attempting to switch cameras. It also verifies that there is a camera that’s currently active.
  2. This line tells the capture session to begin configuration.
  3. This switch statement calls either switchToRearCamera or switchToFrontCamera, depending on which camera is currently active.
  4. This line commits, or saves, our capture session after configuring it.

Great! All we have to do now is implement switchToFrontCamera and switchToRearCamera:
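
Here’s one possible implementation. Both functions stay nested inside switchCameras, after the guard statement, so they can use the unwrapped captureSession:

func switchToFrontCamera() throws {
    guard let rearCameraInput = self.rearCameraInput, captureSession.inputs.contains(rearCameraInput),
        let frontCamera = self.frontCamera else { throw CameraControllerError.invalidOperation }

    self.frontCameraInput = try AVCaptureDeviceInput(device: frontCamera)

    captureSession.removeInput(rearCameraInput)

    if captureSession.canAddInput(self.frontCameraInput!) {
        captureSession.addInput(self.frontCameraInput!)
        self.currentCameraPosition = .front
    }
    else { throw CameraControllerError.invalidOperation }
}

func switchToRearCamera() throws {
    guard let frontCameraInput = self.frontCameraInput, captureSession.inputs.contains(frontCameraInput),
        let rearCamera = self.rearCamera else { throw CameraControllerError.invalidOperation }

    self.rearCameraInput = try AVCaptureDeviceInput(device: rearCamera)

    captureSession.removeInput(frontCameraInput)

    if captureSession.canAddInput(self.rearCameraInput!) {
        captureSession.addInput(self.rearCameraInput!)
        self.currentCameraPosition = .rear
    }
    else { throw CameraControllerError.invalidOperation }
}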

Both functions have extremely similar implementations. They start by getting the array of inputs currently in the capture session and ensuring that it’s possible to switch to the requested camera. Next, they create the necessary capture device input, remove the old one, and add the new one. Finally, they set currentCameraPosition so that the CameraController class is aware of the change. Easy! Now go back to ViewController.swift so that we can add a function to switch cameras:
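
Something along these lines works; the action name is just an example to connect to the camera switch button in the Storyboard:

@IBAction func switchCameras(_ sender: UIButton) {
    do {
        try cameraController.switchCameras()
    }
    catch {
        print(error)
    }

    // Optionally update the switch-camera button's image based on
    // cameraController.currentCameraPosition here.
}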

Great! Open up your storyboard, connect the necessary outlets, and build and run the app. You should be able to freely switch cameras. Now we get to implement the most important feature: image capture!

Implementing Image Capture

Now we can implement the feature we’ve been waiting for this whole time: image capture. Before we get into it, let’s have a quick recap of everything we’ve done so far:

  • Designed a working utility class that can be used to easily hide the complexities of AV Foundation.
  • Implemented functionality within this class to allow us to create a capture session, use the flash, switch cameras, and get a working preview.
  • Connected our class to a UIViewController and built a lightweight camera app.

All we have left to do is actually capture the images!

Open up CameraController.swift and let’s get to work. Add a captureImage function with this signature:
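
The completion hands back either a UIImage or an Error:

func captureImage(completion: @escaping (UIImage?, Error?) -> Void) {
}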

This function, as its name suggests, will capture an image for us using the camera controller we’ve built. Let’s implement it:
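
A sketch of the implementation:

func captureImage(completion: @escaping (UIImage?, Error?) -> Void) {
    guard let captureSession = captureSession, captureSession.isRunning else {
        completion(nil, CameraControllerError.captureSessionIsMissing)
        return
    }

    // Apply the selected flash mode to this capture.
    let settings = AVCapturePhotoSettings()
    settings.flashMode = self.flashMode

    // Kick off the capture and remember the completion block so the
    // delegate callback can call it once the photo has been processed.
    self.photoOutput?.capturePhoto(with: settings, delegate: self)
    self.photoCaptureCompletionBlock = completion
}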

Great! It’s not a complicated implementation, but our code won’t compile yet, because we haven’t defined photoCaptureCompletionBlock and CameraController doesn’t conform to AVCapturePhotoCaptureDelegate. First, let’s add a property, photoCaptureCompletionBlock, to CameraController:
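
It simply stores the completion handler until the photo output calls us back:

var photoCaptureCompletionBlock: ((UIImage?, Error?) -> Void)?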

And now, let’s extend CameraController to conform to AVCapturePhotoCaptureDelegate:
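
A sketch using the iOS 11 delegate callback; projects targeting iOS 10 would implement the older didFinishProcessingPhotoSampleBuffer variant instead:

extension CameraController: AVCapturePhotoCaptureDelegate {
    func photoOutput(_ output: AVCapturePhotoOutput,
                     didFinishProcessingPhoto photo: AVCapturePhoto,
                     error: Error?) {
        if let error = error {
            self.photoCaptureCompletionBlock?(nil, error)
        }
        else if let data = photo.fileDataRepresentation(), let image = UIImage(data: data) {
            self.photoCaptureCompletionBlock?(image, nil)
        }
        else {
            self.photoCaptureCompletionBlock?(nil, CameraControllerError.unknown)
        }
    }
}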

Great. Now the compiler raises one more issue: AVCapturePhotoCaptureDelegate inherits from NSObjectProtocol, so a plain Swift class can’t adopt it directly.

We just need to make CameraController inherit from NSObject to fix this, so let’s do so now. Change the class declaration for CameraController to class CameraController: NSObject and we’ll be set!

Now, head back to ViewController one more time. First, import the Photos framework since we will use the built-in APIs to save the photo.

And then insert the following function:
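
Here’s a sketch; the action name is just an example, so connect whatever you name it to the capture button:

@IBAction func captureImage(_ sender: UIButton) {
    cameraController.captureImage { (image, error) in
        guard let image = image else {
            if let error = error { print(error) }
            return
        }

        // Save the captured photo to the user's photo library.
        try? PHPhotoLibrary.shared().performChangesAndWait {
            PHAssetChangeRequest.creationRequestForAsset(from: image)
        }
    }
}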

We simply call the captureImage method of the camera controller to take a photo, and then use the PHPhotoLibrary class to save the image to the built-in photo library.

Lastly, connect the @IBAction func to the capture button in the Storyboard, and head over to Info.plist to insert a row:

privacy-info-list-photolib

This is a privacy requirement introduced in iOS 10: you have to specify the reason why your app needs to access the photo library. The key for this row is NSPhotoLibraryUsageDescription (displayed as “Privacy - Photo Library Usage Description”).

Now build and run the app to capture a photo! After that, open your photo library. You should see the photo you just captured. Congrats, you now know how to use AV Foundation in your apps! Good luck, and stay tuned for the second part of this tutorial, where we’ll learn how to capture videos.

For the complete project, you can download it from GitHub.
