Hands-on Metal: Image Processing using Apple’s GPU framework

Avinash
4 min readJun 19, 2019


In a previous post, I discussed the basics of the Metal framework as intuitively as possible. Please give it a read before starting this article.

You’ll need a physical device that supports Metal to run this code, since Metal is not supported by the Xcode simulator.

The big picture is to load an image (UIImage) from a .jpg, convert it to a texture (MTLTexture) for the GPU to work on, and get an image back once the GPU is done with the texture.

There’s an added step to this process, since converting UIImage to MTLTexture directly is not feasible: UIImage has to be converted to CGImage and then to MTLTexture, and the reverse chain is needed to display the output. Let’s look at each step sequentially.
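Spelled out with the helper names that appear later in this article, the round trip looks roughly like this (a sketch of the flow, not runnable on its own):

```swift
// UIImage → CGImage → MTLTexture (input side)
let cgInput = getCGImage(from: inputImage)!
let inputTexture = getMTLTexture(from: cgInput)

// ... the GPU kernel reads inputTexture and writes outputTexture ...

// MTLTexture → CGImage → UIImage (output side)
let cgOutput = getCGImage(from: outputTexture)!
let outputImage = getUIImage(from: cgOutput)
```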

We create a class with helper functions to do all the conversions; it will also have an applyFilter function where the actual compute call happens.

The Filter Class

Create a new Swift file named Filter.swift. Make the necessary imports and create a class named Filter.

import Metal
import MetalKit
import UIKit

class Filter {

Class Members

The first set of variables relates to GPU handling; please refer to the previous article for details. The second set is for the input image and its dimensions. The image dimensions are used across many functions, so it makes sense to keep them as properties. The last property defines how many threads we launch per threadgroup on the GPU. We divide the 2D image into an evenly split grid so that each threadgroup can be assigned one tile. You can read more here and here about more efficient means of thread grouping.

class Filter {
    var device: MTLDevice
    var defaultLib: MTLLibrary?
    var grayscaleShader: MTLFunction?
    var commandQueue: MTLCommandQueue?
    var commandBuffer: MTLCommandBuffer?
    var commandEncoder: MTLComputeCommandEncoder?
    var pipelineState: MTLComputePipelineState?
    var inputImage: UIImage
    var height, width: Int

    // most devices have a limit of 512 threads per threadgroup
    let threadsPerBlock = MTLSize(width: 16, height: 16, depth: 1)

The Constructor

We initiate the necessary GPU-handling objects and set the pipeline state. Please refer to the previous article for reference.

Since it’s just a tutorial, I am using an image from the local directory as input. You could use a UIImagePickerController to load a new input image.

Drag and drop any image file into the project directory.
init() {
    self.device = MTLCreateSystemDefaultDevice()!
    self.defaultLib = self.device.makeDefaultLibrary()
    self.grayscaleShader = self.defaultLib?.makeFunction(name: "black")
    self.commandQueue = self.device.makeCommandQueue()
    self.commandBuffer = self.commandQueue?.makeCommandBuffer()
    self.commandEncoder = self.commandBuffer?.makeComputeCommandEncoder()

    if let shader = grayscaleShader {
        self.pipelineState = try? self.device.makeComputePipelineState(function: shader)
    } else {
        fatalError("unable to make compute pipeline")
    }

    self.inputImage = UIImage(named: "spidey.jpg")!
    self.height = Int(self.inputImage.size.height)
    self.width = Int(self.inputImage.size.width)
}

The ‘black’ Shader

#include <metal_stdlib>
using namespace metal;

kernel void black(
    texture2d<float, access::write> outTexture [[texture(0)]],
    texture2d<float, access::read> inTexture [[texture(1)]],
    uint2 id [[thread_position_in_grid]]) {
    // guard against threads outside the image when the grid overshoots
    if (id.x >= inTexture.get_width() || id.y >= inTexture.get_height()) {
        return;
    }
    float3 val = inTexture.read(id).rgb;
    float gray = (val.r + val.g + val.b) / 3.0;
    float4 out = float4(gray, gray, gray, 1.0);
    outTexture.write(out, id);
}
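The kernel above uses a plain channel average. If you want a result closer to perceived brightness, you could swap in the standard Rec. 709 luma weights instead; a sketch of that variant (the kernel name luma is my own choice):

```metal
kernel void luma(
    texture2d<float, access::write> outTexture [[texture(0)]],
    texture2d<float, access::read> inTexture [[texture(1)]],
    uint2 id [[thread_position_in_grid]]) {
    float3 val = inTexture.read(id).rgb;
    // Rec. 709 luma coefficients weight green most heavily,
    // matching the eye's sensitivity
    float gray = dot(val, float3(0.2126, 0.7152, 0.0722));
    outTexture.write(float4(gray, gray, gray, 1.0), id);
}
```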

Helper Functions

There are 4 helper functions to convert images between different formats and 3 more to keep the code cleaner. Look into the gist at the bottom for the full function definitions.

// create CGImage from UIImage
func getCGImage(from uiimg: UIImage) -> CGImage?
// create MTLTexture from CGImage
func getMTLTexture(from cgimg: CGImage) -> MTLTexture
// create CGImage from MTLTexture
func getCGImage(from mtlTexture: MTLTexture) -> CGImage?
// convert CGImage to UIImage
func getUIImage(from cgimg: CGImage) -> UIImage?
// we need an empty texture to write the output value after processing of each pixel
func getEmptyMTLTexture() -> MTLTexture?
// function that uses above helpers to convert the input image to texture
func getInputMTLTexture() -> MTLTexture?
// divides the image dimensions by threads per block to determine number of blocks to launch
func getBlockDimensions() -> MTLSize
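As an illustration, here is roughly what two of these helpers might look like — a sketch assuming the rgba8Unorm pixel format (the gist has the author's full versions):

```swift
// Empty texture the kernel can write its output into
func getEmptyMTLTexture() -> MTLTexture? {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba8Unorm,
        width: self.width,
        height: self.height,
        mipmapped: false)
    // the compute kernel both reads and writes textures
    descriptor.usage = [.shaderRead, .shaderWrite]
    return self.device.makeTexture(descriptor: descriptor)
}

// Ceiling division so the grid of 16×16 blocks covers every pixel,
// even when the image size is not a multiple of 16
func getBlockDimensions() -> MTLSize {
    let blockWidth = (self.width + threadsPerBlock.width - 1) / threadsPerBlock.width
    let blockHeight = (self.height + threadsPerBlock.height - 1) / threadsPerBlock.height
    return MTLSize(width: blockWidth, height: blockHeight, depth: 1)
}
```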

ApplyFilter Function

First, we unwrap the optionals. Then we set the output and input textures at indices 0 and 1, matching the [[texture(n)]] attributes in the shader. We dispatch GPU threadgroups to work on the textures. Once the commands are encoded, the buffer is committed. We wait for the buffer to complete, then convert the output texture back to a UIImage for display.

func applyFilter() -> UIImage? {
    if let encoder = self.commandEncoder,
       let buffer = self.commandBuffer,
       let pipelineState = self.pipelineState,
       let outputTexture = getEmptyMTLTexture(),
       let inputTexture = getInputMTLTexture() {
        encoder.setComputePipelineState(pipelineState)
        encoder.setTextures([outputTexture, inputTexture], range: 0..<2)
        encoder.dispatchThreadgroups(self.getBlockDimensions(), threadsPerThreadgroup: threadsPerBlock)
        encoder.endEncoding()
        buffer.commit()
        buffer.waitUntilCompleted()

        guard let outputImage = getCGImage(from: outputTexture) else {
            fatalError("Couldn't obtain CGImage from MTLTexture")
        }
        return getUIImage(from: outputImage)
    } else {
        fatalError("optional unwrapping failed")
    }
}

Display Output

Create an imageView in the storyboard and hook it up to the view controller. In viewDidLoad(), instantiate Filter and call its applyFilter(). You can then display the result in your image view.

override func viewDidLoad() {
    super.viewDidLoad()
    let filter = Filter()
    let resultImage = filter.applyFilter()
    outputImageView.image = resultImage
}

You can read the complete code as a gist here.
