Build a custom camera application that detects rectangular objects. Learn how to process video frames from the camera, how to leverage CoreImage and CIDetector, and how to do custom drawing on the live video preview.
Length: about 2 hours
In this introductory episode I show the finished version of the application we will be building in this series.
We set up the project to take photos. We begin by disabling rotation, which is typical of a camera app, then add the appropriate Info.plist entry so we can inform the user why we need camera access. We organize the view controller into sections to keep things tidy as we add code to the project. Finally, we set up the camera capture session, which is the first step to capturing video from the camera.
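The Info.plist key used for camera access is `NSCameraUsageDescription`. As a rough sketch of the capture-session setup covered here (the `CameraViewController` class and `configureCaptureSession()` name are placeholders, not necessarily what the episode uses):

```swift
import AVFoundation
import UIKit

class CameraViewController: UIViewController {

    // MARK: - Capture Session
    let captureSession = AVCaptureSession()

    func configureCaptureSession() {
        captureSession.beginConfiguration()
        captureSession.sessionPreset = .high

        // Grab the default back camera and wire it into the session.
        guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video,
                                                   position: .back),
              let input = try? AVCaptureDeviceInput(device: camera),
              captureSession.canAddInput(input)
        else {
            captureSession.commitConfiguration()
            return
        }
        captureSession.addInput(input)
        captureSession.commitConfiguration()
        captureSession.startRunning()
    }
}
```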
We set up the preview layer so we can see the video, then we add a sample buffer queue so we can get access to the individual frames of the video coming through the capture session.
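Continuing the same hypothetical controller, the preview layer and sample buffer queue might look roughly like this (the queue label and method name are placeholders):

```swift
import AVFoundation
import UIKit

extension CameraViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

    func configurePreviewAndOutput() {
        // The preview layer renders the live camera feed on screen.
        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.videoGravity = .resizeAspectFill
        previewLayer.frame = view.bounds
        view.layer.addSublayer(previewLayer)

        // The video data output delivers each frame to us on a background queue.
        let output = AVCaptureVideoDataOutput()
        let sampleBufferQueue = DispatchQueue(label: "sample-buffer")
        output.setSampleBufferDelegate(self, queue: sampleBufferQueue)
        if captureSession.canAddOutput(output) {
            captureSession.addOutput(output)
        }
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Each video frame arrives here as a CMSampleBuffer.
    }
}
```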
We learn how to convert a CMSampleBuffer into a CIImage and how to set up our CIDetector to detect rectangles. We then draw a box around the detected rectangle and discuss why adding simple subviews won't work when the video preview is a full-screen sublayer.
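Roughly, the conversion and detection look like this; the detector options shown are typical choices rather than the episode's exact code:

```swift
import AVFoundation
import CoreImage

// Create the detector once; CIDetector is expensive to construct.
let rectangleDetector = CIDetector(ofType: CIDetectorTypeRectangle,
                                   context: nil,
                                   options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])

func detectRectangle(in sampleBuffer: CMSampleBuffer) -> CIRectangleFeature? {
    // A CMSampleBuffer wraps a CVPixelBuffer, which CIImage can consume directly.
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
    let image = CIImage(cvPixelBuffer: pixelBuffer)
    return rectangleDetector?.features(in: image).first as? CIRectangleFeature
}
```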
In the last episode it was clear that our coordinate math was wrong. But why? We answer that question and solve the problem. We also add code to hide the box if no rectangle has been detected for a few seconds.
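A common cause of this kind of mismatch is that Core Image reports feature coordinates with a bottom-left origin while UIKit layers use a top-left origin, so detected points have to be flipped and scaled before drawing. A minimal sketch of such a conversion (ignoring aspect-fill cropping, which the episode may handle differently):

```swift
import UIKit

// Core Image features use a bottom-left origin; UIKit layers use a top-left origin.
// Flip the y axis and scale from the image's extent to the preview's size before drawing.
func convert(_ point: CGPoint, fromImageExtent extent: CGRect, toViewOfSize size: CGSize) -> CGPoint {
    let scaleX = size.width / extent.width
    let scaleY = size.height / extent.height
    return CGPoint(x: point.x * scaleX,
                   y: size.height - point.y * scaleY)
}
```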
We get smarter about drawing, this time leveraging a custom path so we can display non-rectangular outlines.
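One way to do this is a `CAShapeLayer` driven by a `UIBezierPath` built from the four detected corners; a sketch (the function name is illustrative):

```swift
import UIKit

// Build a quadrilateral path from the four detected corners and hand it to a shape layer.
func updateOutline(layer: CAShapeLayer,
                   topLeft: CGPoint, topRight: CGPoint,
                   bottomRight: CGPoint, bottomLeft: CGPoint) {
    let path = UIBezierPath()
    path.move(to: topLeft)
    path.addLine(to: topRight)
    path.addLine(to: bottomRight)
    path.addLine(to: bottomLeft)
    path.close()

    layer.strokeColor = UIColor.yellow.cgColor
    layer.fillColor = nil       // stroke only, so the video stays visible inside the outline
    layer.lineWidth = 2
    layer.path = path.cgPath
}
```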
In this episode we implement features that make the app feel more like a camera. When the user taps the screen, we play the system camera shutter sound using AudioServices, then we add a small but useful flash effect to reinforce the fact that a photo was taken. We'll also talk about a strategy for capturing the image by using a flag.
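A sketch of those pieces: system sound ID 1108 is the camera shutter, the "flash" is just a white overlay that fades out, and `shouldCaptureNextFrame` is an assumed Bool stored on the hypothetical controller that the sample buffer callback checks (all names here are placeholders):

```swift
import AudioToolbox
import UIKit

extension CameraViewController {

    @objc func handleTap() {
        // 1108 is the system camera shutter sound.
        AudioServicesPlaySystemSound(1108)

        // A brief white overlay that fades out, mimicking the built-in camera's flash effect.
        let flashView = UIView(frame: view.bounds)
        flashView.backgroundColor = .white
        view.addSubview(flashView)
        UIView.animate(withDuration: 0.25, animations: {
            flashView.alpha = 0
        }, completion: { _ in
            flashView.removeFromSuperview()
        })

        // Flag checked in the sample buffer callback so it knows to keep the next frame.
        shouldCaptureNextFrame = true
    }
}
```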
In this episode we take the captured image and run the perspective correction filter on it in order to turn a skewed rect back into a flat rectangle. We then display the image on the screen for a few seconds as a preview mechanism.
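Core Image's perspective-correction filter is `CIPerspectiveCorrection`, which takes the four corners as `CIVector` parameters; a sketch of applying it with the detected feature:

```swift
import CoreImage

// Use the detected corners to "un-skew" the captured image into a flat rectangle.
func perspectiveCorrected(_ image: CIImage, using feature: CIRectangleFeature) -> CIImage {
    return image.applyingFilter("CIPerspectiveCorrection", parameters: [
        "inputTopLeft": CIVector(cgPoint: feature.topLeft),
        "inputTopRight": CIVector(cgPoint: feature.topRight),
        "inputBottomLeft": CIVector(cgPoint: feature.bottomLeft),
        "inputBottomRight": CIVector(cgPoint: feature.bottomRight)
    ])
}
```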
I got a tip from a subscriber about the method I used to capture the photo in the last episode. Since we're capturing the data using the preset we chose for processing the rectangles, we are bound to that preset when we export to an actual photo. By using AVCapturePhotoOutput, we can have much more control over the format, size, and quality of the resulting image. In addition, since we are leveraging the SDK for capturing the photo, we benefit from things like auto-flash, HDR, auto focus, and the built-in camera shutter sound. (Yay for deleting code!) The end result might not look very different on device, but if you are taking that image somewhere else to do OCR or other processing on it, a higher quality image is important.
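A sketch of the photo-output path, assuming an `AVCapturePhotoOutput` stored as `photoOutput` and already added to the session during setup (the names and the flash handling are illustrative):

```swift
import AVFoundation
import UIKit

extension CameraViewController: AVCapturePhotoCaptureDelegate {

    func capturePhoto() {
        let settings = AVCapturePhotoSettings()
        if photoOutput.supportedFlashModes.contains(.auto) {
            settings.flashMode = .auto   // let the SDK decide when to fire the flash
        }
        photoOutput.capturePhoto(with: settings, delegate: self)
    }

    func photoOutput(_ output: AVCapturePhotoOutput,
                     didFinishProcessingPhoto photo: AVCapturePhoto,
                     error: Error?) {
        guard let data = photo.fileDataRepresentation(),
              let image = UIImage(data: data) else { return }
        // `image` is full quality, independent of the video preset used for rectangle detection.
        print(image.size)
    }
}
```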
In this episode we extract our camera preview layer into its own view so we can add subviews. We’ll use this to add a flash button control to the UI, which will require us to learn about locking the device and controlling the camera’s "torch".
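Torch control goes through `AVCaptureDevice`: the device has to be locked for configuration before `torchMode` can be changed. A small sketch:

```swift
import AVFoundation

// Toggling the torch requires exclusive access to the capture device.
func setTorch(enabled: Bool, on device: AVCaptureDevice) {
    guard device.hasTorch else { return }
    do {
        try device.lockForConfiguration()
        device.torchMode = enabled ? .on : .off
        device.unlockForConfiguration()
    } catch {
        print("Could not lock device for configuration: \(error)")
    }
}
```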