Swift World: What’s new in iOS 11 — Core ML
iOS 11 was announced at WWDC 2017 and is available to download now. One of its most inspiring features is support for machine learning at different levels of the stack. For us developers, the most direct benefit is Core ML. In this article, I will introduce Core ML with an example that detects objects in an image.
Simply speaking, a complete machine learning workflow has two phases: training on large-scale data to produce a model, and using that model to make predictions on new input data. Core ML works in the second phase. Here is a simple figure depicting the complete process.
I will not cover machine learning basics here; you can find plenty of material on the Internet. The model is the most important product of a machine learning algorithm and its training data. Core ML defines its own format for model files, so models produced with other machine learning frameworks need to be converted. Fortunately, Apple provides Core ML Tools to help us. Conversion is not our main topic, so please refer to its documentation.
Different models already exist for different areas, such as computer vision and language processing. Apple provides several popular models in its own format, including Inception v3, Places205-GoogLeNet, ResNet50, and VGG16. Let's start with an example using Inception v3, which is popular for detecting objects in images.
First, please download the Inception v3 model file from https://developer.apple.com/machine-learning/. Then drag it into our new project. In Xcode 9, we can see the information about this model.
The first part, “Machine Learning Model”, shows basic information such as the name, author, and description.
Xcode generates Swift code for this model, shown under “Model Class”. Click the right arrow and we get the Swift class definitions for the input, the output, and the model itself. Dive in if you want to learn more.
The input and output formats are shown in the third part, “Model Evaluation Parameters”. For Inception v3, the input is an image. The output is the probability of each category as a dictionary, plus the most likely category as a String.
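To make the rest of the article easier to follow, here is an abridged sketch of what that generated interface looks like. This is not the generated file itself (the real class wraps an MLModel and also exposes an Inceptionv3Input type, and the details vary by Xcode version); it only shows the shape of the API we use below, and the stub body is a placeholder.
import CoreML
import CoreVideo

class Inceptionv3Output {
    /// Probability of each category, keyed by label.
    let classLabelProbs: [String: Double]
    /// The most likely category.
    let classLabel: String

    init(classLabelProbs: [String: Double], classLabel: String) {
        self.classLabelProbs = classLabelProbs
        self.classLabel = classLabel
    }
}

class Inceptionv3 {
    /// Runs the model on a 299x299 image supplied as a CVPixelBuffer.
    func prediction(image: CVPixelBuffer) throws -> Inceptionv3Output {
        // The generated implementation forwards to the compiled MLModel;
        // it is omitted here because Xcode writes it for us.
        fatalError("Use the class Xcode generates from the .mlmodel file.")
    }
}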
For simplicity's sake, we'll use a static image. Please resize the image to 299×299, since that is the input size Inception v3 expects. Our new project is very simple: we only place an image view to preview the image and a label to show the result.
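If your image is not already 299×299, one way to resize it in code is sketched below. This helper is not part of the original project; the function name and the default size are just assumptions for illustration.
import UIKit

// Resize an arbitrary UIImage to the 299x299 size Inception v3 expects.
// (Hypothetical helper, not taken from the original project.)
func resizedForInception(_ image: UIImage,
                         to size: CGSize = CGSize(width: 299, height: 299)) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: size)
    return renderer.image { _ in
        image.draw(in: CGRect(origin: .zero, size: size))
    }
}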
Let’s look at the code.
override func viewDidLoad() {
    super.viewDidLoad()
    // Inceptionv3 is the class Xcode generates from the .mlmodel file.
    let model = Inceptionv3()
    // Load the bundled 299x299 sample image as a CGImage.
    let inputImage = UIImage(named: "sample.jpg")!.cgImage!
    // Convert it to the CVPixelBuffer the model expects.
    let pixelBuffer = getCVPixelBuffer(inputImage)
    guard let pb = pixelBuffer, let output = try? model.prediction(image: pb) else {
        fatalError("Unexpected runtime error.")
    }
    // `result` is the label outlet that shows the most likely category.
    result.text = output.classLabel
}
In this code, we first initialize an Inception v3 model using the class generated by Xcode. Next, we load the input image and convert it to a CVPixelBuffer with the helper function getCVPixelBuffer. Finally, we get the result from the model's prediction function. It's that easy with Xcode's help. I've uploaded the complete project to GitHub. If you want to run it, first download the Inception v3 model file from https://developer.apple.com/machine-learning/ and add it to the project.
The output's classLabel only gives the single best result. If you want to see all the probability values, print the output's classLabelProbs somewhere.
print(output.classLabelProbs)
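For example, here is a small sketch (not in the original project) that prints only the five most likely categories instead of the whole dictionary.
// Sort the probability dictionary and print the five most likely labels.
let topFive = output.classLabelProbs
    .sorted { $0.value > $1.value }
    .prefix(5)
for (label, probability) in topFive {
    print("\(label): \(probability)")
}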
Thanks for your time. Please click the ❤ button to get this article seen by more people.