{"id":1647,"date":"2018-11-14T16:23:34","date_gmt":"2018-11-15T00:23:34","guid":{"rendered":"https:\/\/live-cometml.pantheonsite.io\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/"},"modified":"2018-11-14T16:23:34","modified_gmt":"2018-11-15T00:23:34","slug":"real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z","status":"publish","type":"post","link":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/","title":{"rendered":"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z"},"content":{"rendered":"\n\n\n<h4 class=\"wp-block-heading\"><em>Learn how to build and train a deep learning network to recognize numbers (MNIST), how to convert it in the CoreML format to then deploy it on your iPhoneX and make it recognize numbers in real-time!<\/em><\/h4>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>This post originally appeared on the Liip blog <a href=\"https:\/\/www.liip.ch\/en\/blog\/numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\">here<\/a> and was reposted with the author\u2019s permission. 
We also recommend reading Thomas Ebermann\u2019s other posts around <a href=\"https:\/\/www.liip.ch\/en\/blog\/sentiment-detection-with-keras-word-embeddings-and-lstm-deep-learning-networks\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/sentiment-detection-with-keras-word-embeddings-and-lstm-deep-learning-networks\">sentiment analysis with Keras<\/a> and the <a href=\"https:\/\/www.liip.ch\/en\/blog\/the-data-science-stack-2018\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/the-data-science-stack-2018\">Data Science Stack<\/a>!<\/p>\n<\/blockquote>\n\n\n\n<p>\u2014 By Thomas Ebermann (On Medium as <a href=\"https:\/\/medium.com\/u\/8a9685aabfa8\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/u\/8a9685aabfa8\" data-anchor-type=\"2\" data-user-id=\"8a9685aabfa8\" data-action-value=\"8a9685aabfa8\" data-action=\"show-user-card\" data-action-type=\"hover\">plotti<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Creating a CoreML model from A-Z in less than 10\u00a0Steps<\/h3>\n\n\n\n<p>This is the third part of our deep learning on mobile phones series. In part one I have shown you <a href=\"https:\/\/www.liip.ch\/en\/blog\/poke-zoo-or-making-deep-learning-tell-oryxes-apart-from-lamas-in-a-zoo-part-1-the-idea-and-concepts\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/poke-zoo-or-making-deep-learning-tell-oryxes-apart-from-lamas-in-a-zoo-part-1-the-idea-and-concepts\">the two main tricks on how to use convolutions and pooling to train deep learning networks<\/a>. 
In part two I have shown you <a href=\"https:\/\/www.liip.ch\/en\/blog\/zoo-pokedex-part-2-hands-on-with-keras-and-resnet50\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/zoo-pokedex-part-2-hands-on-with-keras-and-resnet50\">how to train existing deep learning networks like resnet50 to detect new objects<\/a>. In part three I will now show you how to train a deep learning network, how to convert it to the CoreML format and how to deploy it on your mobile phone!<\/p>\n\n\n\n<p>TLDR: I will show you how to create your own iPhone app from A-Z that recognizes handwritten numbers:<\/p>\n\n\n\n<p>Let\u2019s get started!<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">1. How to\u00a0start<\/h4>\n\n\n\n<p>To have a fully working example I thought we\u2019d start with a toy dataset like the <a href=\"https:\/\/en.wikipedia.org\/wiki\/MNIST_database\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/en.wikipedia.org\/wiki\/MNIST_database\">MNIST set of handwritten digits<\/a> and train a deep learning network to recognize those. Once it\u2019s working nicely on our PC, we will port it to an iPhone X using the <a href=\"https:\/\/developer.apple.com\/documentation\/coreml\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/developer.apple.com\/documentation\/coreml\">CoreML standard<\/a>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">2. 
Getting the\u00a0data<\/h4>\n\n\n\n<pre class=\"wp-block-code\"><code># Importing the dataset with Keras and transforming it\nfrom keras.datasets import mnist\nfrom keras.utils import np_utils\nfrom keras import backend as K\n\ndef mnist_data():\n    # input image dimensions\n    img_rows, img_cols = 28, 28\n    (X_train, Y_train), (X_test, Y_test) = mnist.load_data()\n\n    if K.image_data_format() == 'channels_first':\n          X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)\n          X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)\n          input_shape = (1, img_rows, img_cols)\n    else:\n          X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)\n          X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)\n          input_shape = (img_rows, img_cols, 1)\n\n    # rescale [0,255] --&gt; [0,1]\n    X_train = X_train.astype('float32')\/255\n    X_test = X_test.astype('float32')\/255\n\n    # transform to one hot encoding\n    Y_train = np_utils.to_categorical(Y_train, 10)\n    Y_test = np_utils.to_categorical(Y_test, 10)\n\n    return (X_train, Y_train), (X_test, Y_test)\n\n(X_train, Y_train), (X_test, Y_test) = mnist_data()<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">3. Encoding it correctly<\/h4>\n\n\n\n<p>When working with image data we have to distinguish how we want to encode it. 
Since Keras is a high-level library that can work on multiple \u201cbackends\u201d such as <a href=\"https:\/\/www.tensorflow.org\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.tensorflow.org\/\">Tensorflow<\/a>, <a href=\"http:\/\/deeplearning.net\/software\/theano\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/deeplearning.net\/software\/theano\/\">Theano<\/a> or <a href=\"https:\/\/www.microsoft.com\/en-us\/cognitive-toolkit\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.microsoft.com\/en-us\/cognitive-toolkit\/\">CNTK<\/a>, we have to first find out how our backend encodes the data. It can be encoded either in a \u201cchannels first\u201d or in a \u201cchannels last\u201d way; the latter is the default in Tensorflow, the <a href=\"https:\/\/keras.io\/backend\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/keras.io\/backend\/\">default Keras backend<\/a>. So in our case, when we use Tensorflow, the data is a tensor of (batch_size, rows, cols, channels). 
So we first input the batch_size, then the 28 rows of the image, then the 28 columns of the image and then a 1 for the number of channels, since our image data is grey-scale.<\/p>\n\n\n\n<p>We can take a look at the first six images that we have loaded with the following snippet:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># plot first six training images\nimport matplotlib.pyplot as plt\n%matplotlib inline\nimport matplotlib.cm as cm\nimport numpy as np\n\n(X_train, y_train), (X_test, y_test) = mnist.load_data()\n\nfig = plt.figure(figsize=(20,20))\n\nfor i in range(6):\n      ax = fig.add_subplot(1, 6, i+1, xticks=[], yticks=[])\n      ax.imshow(X_train[i], cmap='gray')\n      ax.set_title(str(y_train[i]))<\/code><\/pre>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5338\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp\" alt=\"\" width=\"808\" height=\"141\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp 808w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png-300x52.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png-768x134.webp 768w\" sizes=\"auto, (max-width: 808px) 100vw, 808px\" \/><\/figure>\n<\/div>\n<p>&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">4. Normalizing the\u00a0data<\/h4>\n\n\n\n<p>We see that there are white numbers on a black background, each thickly written just in the middle, and they are quite low resolution\u200a\u2014\u200ain our case 28 pixels x 28 pixels.<\/p>\n\n\n\n<p>You have noticed that above we are rescaling each of the image pixels by dividing them by 255. This results in pixel values between 0 and 1, which is quite useful for any kind of training. 
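The two transformations inside our mnist_data() function, rescaling and one-hot encoding, can be sketched on a toy example. This is a minimal sketch in plain NumPy, with the one-hot step written out by hand instead of calling np_utils.to_categorical:

```python
import numpy as np

# toy 2x2 "image" with raw grayscale values in [0, 255]
pixels = np.array([[0, 128], [191, 255]], dtype=np.uint8)

# rescale [0,255] --> [0,1], exactly like X_train.astype('float32')/255
scaled = pixels.astype('float32') / 255
print(scaled.min(), scaled.max())  # 0.0 1.0

# one-hot encode a label the way np_utils.to_categorical(y, 10) does:
# a 7 becomes a ten-entry vector with a 1 at index 7 and 0 elsewhere
label = 7
one_hot = np.eye(10, dtype='float32')[label]
print(one_hot.argmax())  # 7
```

The same idea applies row-wise to the whole Y_train vector: to_categorical turns a vector of n labels into an n x 10 matrix.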
So each image\u2019s pixel values look like this before the transformation:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># visualize one number with pixel values\ndef visualize_input(img, ax):\n      ax.imshow(img, cmap='gray')\n      width, height = img.shape\n      thresh = img.max()\/2.5\n      for x in range(width):\n          for y in range(height):\n                 ax.annotate(str(round(img[x][y],2)), xy=(y,x),\n                             horizontalalignment='center',\n                             verticalalignment='center',\n                             color='white' if img[x][y]&lt;thresh else 'black')\n\nfig = plt.figure(figsize = (12,12))\nax = fig.add_subplot(111)\nvisualize_input(X_train[0], ax)<\/code><\/pre>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5339\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/detail.png.webp\" alt=\"\" width=\"809\" height=\"793\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/detail.png.webp 809w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/detail.png-300x294.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/detail.png-768x753.webp 768w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/figure>\n<\/div>\n\n<p>As you can see, each grey pixel has a value between 0 and 255, where 255 is white and 0 is black. Notice that here <code>mnist.load_data()<\/code> loads the original data into X_train[0]. When we write our custom mnist_data() function, we transform every pixel intensity into a value between 0 and 1 by calling <code>X_train = X_train.astype('float32')\/255\u00a0<\/code>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">5. One hot\u00a0encoding<\/h4>\n\n\n\n<p>Originally the data is encoded in such a way that the Y-vector contains the number that the X-vector (the pixel data) represents. 
So for example if the image shows a 7, the Y-vector contains the value 7. One-hot encoding turns this into a vector of ten entries that is 1 at index 7 and 0 everywhere else. We need this transformation because we want to map our output onto the 10 output neurons of our network that fire when the corresponding number is recognized.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5340\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/onehot.png.webp\" alt=\"\" width=\"754\" height=\"580\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/onehot.png.webp 754w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/onehot.png-300x231.webp 300w\" sizes=\"auto, (max-width: 754px) 100vw, 754px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">6. Modeling the\u00a0network<\/h4>\n\n\n\n<p>Now it is time to define a convolutional network to distinguish those numbers. Using the <a href=\"https:\/\/www.liip.ch\/en\/blog\/poke-zoo-or-making-deep-learning-tell-oryxes-apart-from-lamas-in-a-zoo-part-1-the-idea-and-concepts\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/poke-zoo-or-making-deep-learning-tell-oryxes-apart-from-lamas-in-a-zoo-part-1-the-idea-and-concepts\">convolution and pooling tricks from part one of this series<\/a> we can model a network that will be able to distinguish numbers from each other.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># defining the model\nfrom keras.models import Sequential\nfrom keras.layers import Dense, Dropout, Flatten\nfrom keras.layers import Conv2D, MaxPooling2D\ndef network():\n     model = Sequential()\n     input_shape = (28, 28, 1)\n     num_classes = 10\n\n     model.add(Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu', input_shape=input_shape))\n     model.add(MaxPooling2D(pool_size=2))\n     model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))\n     
model.add(MaxPooling2D(pool_size=(2, 2)))\n     model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))\n     model.add(MaxPooling2D(pool_size=(2, 2)))\n     model.add(Dropout(0.3))\n     model.add(Flatten())\n     model.add(Dense(500, activation='relu'))\n     model.add(Dropout(0.4))\n     model.add(Dense(num_classes, activation='softmax'))\n\n     # summarize the model\n     # model.summary()\n     return model<\/code><\/pre>\n\n\n\n<p>So what did we do there?<\/p>\n\n\n\n<p>Well, we started with a <a href=\"https:\/\/keras.io\/layers\/convolutional\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/keras.io\/layers\/convolutional\/\">convolution<\/a> with a kernel size of 3. This means the window is 3&#215;3 pixels. The input shape is our 28&#215;28 pixels. We then followed this layer by a <a href=\"https:\/\/keras.io\/layers\/pooling\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/keras.io\/layers\/pooling\/\">max pooling layer<\/a>. Here the pool_size is two, so we downscale everything by 2, and the input to the next convolutional layer is 14&#215;14. We then repeated this two more times, ending up with a 3&#215;3 feature map after the final pooling layer. We then use a <a href=\"https:\/\/keras.io\/layers\/core\/#dropout\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/keras.io\/layers\/core\/#dropout\">dropout layer<\/a> where we randomly set 30% of the input units to 0 to prevent overfitting during training. Finally we flatten the feature maps (in our case 3&#215;3&#215;32 = 288 values) and connect them to a dense layer with 500 nodes. After this step we add another dropout layer and finally connect everything to our output layer with 10 nodes, which corresponds to our number of classes (the numbers from 0 to 9).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">7. 
Training the\u00a0model<\/h4>\n\n\n\n<pre class=\"wp-block-code\"><code>#Training the model\nimport keras\n\nmodel = network()\nmodel.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])\n\nmodel.fit(X_train, Y_train, batch_size=512, epochs=6, verbose=1, validation_data=(X_test, Y_test))\n\nscore = model.evaluate(X_test, Y_test, verbose=0)\n\nprint('Test loss:', score[0])\nprint('Test accuracy:', score[1])<\/code><\/pre>\n\n\n\n<p>We first compile the network by defining a loss function and an optimizer: in our case we select categorical_crossentropy, because we have multiple categories (the numbers 0\u20139). There are a number of optimizers that <a href=\"https:\/\/keras.io\/optimizers\/#usage-of-optimizers\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/keras.io\/optimizers\/#usage-of-optimizers\">Keras offers<\/a>, so feel free to try out a few and stick with what works best for your case. I\u2019ve found that AdaDelta (an advanced form of AdaGrad) works fine for me.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5341\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/train.png.webp\" alt=\"\" width=\"810\" height=\"287\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/train.png.webp 810w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/train.png-300x106.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/train.png-768x272.webp 768w\" sizes=\"auto, (max-width: 810px) 100vw, 810px\" \/><\/figure>\n<figure class=\"aligncenter\"><\/figure>\n<\/div>\n\n\n\n<p>So after training I\u2019ve got a model that has an accuracy of 98%, which is quite excellent given the rather simple network architecture. In the screenshot you can also see that the accuracy was increasing in each epoch, so everything looks good to me. 
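As a quick sanity check of the layer arithmetic in our network, the feature-map sizes can be recomputed by hand. This is only a sketch of the bookkeeping, no Keras required:

```python
# Feature-map bookkeeping for the network() defined above:
# Conv2D with padding='same' preserves the spatial size,
# MaxPooling2D(pool_size=2) floor-divides it by two.
size = 28
for stage in range(3):      # three conv + pool stages
    size = size // 2        # 28 -> 14 -> 7 -> 3

# Flatten() therefore hands 3*3*32 = 288 values to the dense layer
flattened = size * size * 32
print(size, flattened)  # 3 288
```

If you change the number of pooling stages or filters, this little calculation tells you immediately how many values the Flatten() layer will produce.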
We now have a model that can predict the numbers 0\u20139 quite well from their 28&#215;28 pixel representation.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">8. Saving the\u00a0model<\/h4>\n\n\n\n<p>Since we want to use the model on our iPhone, we have to convert it to a format that our iPhone understands. There is actually an ongoing initiative from Microsoft, Facebook and Amazon (and others) to harmonize all of the different deep learning network formats into an interchangeable open neural network exchange format that you can use on any device. It\u2019s called <a href=\"https:\/\/onnx.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/onnx.ai\/\">ONNX<\/a>.<\/p>\n\n\n\n<p>Yet, as of today, Apple devices work only with the CoreML format. To convert our Keras model to CoreML, Apple luckily provides a very handy helper library called <a href=\"https:\/\/apple.github.io\/coremltools\/generated\/coremltools.converters.keras.convert.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/apple.github.io\/coremltools\/generated\/coremltools.converters.keras.convert.html\">coremltools<\/a> that we can use to get the job done. It is able to convert scikit-learn, Keras and XGBoost models to CoreML, thus covering quite a few everyday applications. 
Install it with \u201cpip install coremltools\u201d and then you will be able to use it easily.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import coremltools\n\ncoreml_model = coremltools.converters.keras.convert(model,\n                                                    input_names='image',\n                                                    image_input_names='image',\n                                                    class_labels=['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])<\/code><\/pre>\n\n\n\n<p>The most important parameters are class_labels, which defines the classes the model is trying to predict, and input_names and image_input_names. By setting them to \u201cimage\u201d, XCode will automatically recognize that this model takes in an image and tries to predict something from it. Depending on your application it makes a lot of sense to study the <a href=\"https:\/\/apple.github.io\/coremltools\/generated\/coremltools.converters.keras.convert.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/apple.github.io\/coremltools\/generated\/coremltools.converters.keras.convert.html\">documentation<\/a>, especially when you want to make sure that it encodes the RGB channels in the same order (parameter is_bgr) or that it correctly assumes all inputs are values between 0 and 1 (parameter image_scale).<\/p>\n\n\n\n<p>The only thing left is to add some metadata to your model. 
With this you are helping other developers greatly, since they don\u2019t have to guess how your model works and what it expects as input.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#entering metadata\ncoreml_model.author = 'plotti'\ncoreml_model.license = 'MIT'\ncoreml_model.short_description = 'MNIST handwriting recognition with a 3 layer network'\ncoreml_model.input_description['image'] = '28x28 grayscaled pixel values between 0-1'\ncoreml_model.save('SimpleMnist.mlmodel')\n\nprint(coreml_model)<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">9. Use it to predict something<\/h4>\n\n\n\n<p>After saving the model in the CoreML format, we can check whether it works correctly on our machine. For this we feed it an image and see whether it predicts the label correctly. You can use the MNIST training data, or you can snap a picture with your phone and transfer it to your PC to see how well the model handles real-life data.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>#Use the core-ml model to predict something\nfrom PIL import Image\nimport numpy as np\nmodel = coremltools.models.MLModel('SimpleMnist.mlmodel')\nim = Image.fromarray((np.reshape(mnist_data()[0][0][12]*255, (28, 28))).astype(np.uint8),\"L\")\n\nplt.imshow(im)\npredictions = model.predict({'image': im})\nprint(predictions)<\/code><\/pre>\n\n\n\n<p>It works, hooray! Now it\u2019s time to include it in an XCode project.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Porting our model to XCode in 10\u00a0Steps<\/h3>\n\n\n\n<p>Let me start by saying: I am by no means an XCode or mobile developer. 
I have studied <a href=\"https:\/\/github.com\/markmansur\/CoreML-Vision-demo\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/markmansur\/CoreML-Vision-demo\">quite a few<\/a> <a href=\"https:\/\/sriraghu.com\/2017\/06\/15\/computer-vision-in-ios-object-recognition\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/sriraghu.com\/2017\/06\/15\/computer-vision-in-ios-object-recognition\/\">super<\/a> <a href=\"https:\/\/www.raywenderlich.com\/577-core-ml-and-vision-machine-learning-in-ios-11-tutorial\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.raywenderlich.com\/577-core-ml-and-vision-machine-learning-in-ios-11-tutorial\">helpful tutorials<\/a>, <a href=\"https:\/\/www.pyimagesearch.com\/2018\/04\/23\/running-keras-models-on-ios-with-coreml\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.pyimagesearch.com\/2018\/04\/23\/running-keras-models-on-ios-with-coreml\/\">walkthroughs<\/a> and <a href=\"https:\/\/www.youtube.com\/watch?v=bOg8AZSFvOc\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.youtube.com\/watch?v=bOg8AZSFvOc\">videos<\/a> on how to create a simple mobile phone app with CoreML and have used those to create my app. I can only say a big thank you and kudos to the community for being so open and helpful.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">1. Install\u00a0XCode<\/h4>\n\n\n\n<p>Now it\u2019s time to really get our hands dirty. Before you can do anything you have to have XCode. So download it via the <a href=\"https:\/\/itunes.apple.com\/us\/app\/xcode\/id497799835?mt=12\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/itunes.apple.com\/us\/app\/xcode\/id497799835?mt=12\">Apple-Store<\/a> and install it. In case you already have it, make sure you have at least version 9.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">2. 
Create the\u00a0Project<\/h4>\n\n\n\n<p>Start XCode and create a single view app. Name your project accordingly; I named mine \u201cnumbers\u201d. Select a place to save it. You can leave \u201ccreate git repository on my mac\u201d checked.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5342\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/single.png.webp\" alt=\"\" width=\"809\" height=\"544\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/single.png.webp 809w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/single.png-300x202.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/single.png-768x516.webp 768w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">3. Add the CoreML\u00a0model<\/h4>\n\n\n\n<p>We can now add the CoreML model that we created using the coremltools converter. Simply drag the model into your project directory. Make sure to drag it into the correct folder (see screenshot). If you use the option \u201cadd as Reference\u201d, you won\u2019t have to drag the model into your project again whenever you update it. 
XCode should automatically recognize your model and realize that it is a model to be used for images.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5343\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/addmodel.png.webp\" alt=\"\" width=\"810\" height=\"506\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/addmodel.png.webp 810w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/addmodel.png-300x187.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/addmodel.png-768x480.webp 768w\" sizes=\"auto, (max-width: 810px) 100vw, 810px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">4. Delete the view or storyboard<\/h4>\n\n\n\n<p>Since we are going to use just the camera and display a label, we don\u2019t need a fancy graphical user interface\u200a\u2014\u200aor in other words a view layer. Since the storyboard corresponds to the view in the MVC pattern, we are going to simply delete it. In the project settings deployment info, make sure to delete the Main Interface too (see screenshot), by setting it to blank.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5344\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/storyboard.png.webp\" alt=\"\" width=\"809\" height=\"878\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/storyboard.png.webp 809w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/storyboard.png-276x300.webp 276w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/storyboard.png-768x834.webp 768w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">5. 
Create the root view controller programmatically<\/h4>\n\n\n\n<p>Instead, we are going to create the root view controller programmatically by replacing the <code>func application<\/code> in AppDelegate.swift with the following code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ create the root view controller programmatically\nfunc application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -&gt; Bool {\n    \/\/ create the user interface window, make it visible\n      window = UIWindow()\n      window?.makeKeyAndVisible()\n\n    \/\/ create the view controller and make it the root view controller\n      let vc = ViewController()\n      window?.rootViewController = vc\n\n    \/\/ return true upon success\n     return true\n}<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">6. Build the view controller<\/h4>\n\n\n\n<p>Finally, it is time to build the view controller. We will use UIKit\u200a\u2014\u200aa lib for creating buttons and labels, AVFoundation\u200a\u2014\u200aa lib to capture the camera on the iPhone, and Vision\u200a\u2014\u200aa lib to handle our CoreML model. The last is especially handy if you don\u2019t want to resize the input data yourself.<\/p>\n\n\n\n<p>In the ViewController we are going to inherit from UI and AV functionalities, so we need to override some methods later to make it functional.<\/p>\n\n\n\n<p>The first thing we will do is to create a label that will tell us what the camera is seeing. 
By overriding the <code>viewDidLoad<\/code> function we will trigger the capturing of the camera and add the label to the view.<\/p>\n\n\n\n<p>In the function <code>setupCaptureSession<\/code> we will create a capture session, grab the first camera (in our case the back-facing one) and capture its output into <code>captureOutput<\/code> while also displaying it on the <code>previewLayer<\/code>.<\/p>\n\n\n\n<p>In the function <code>captureOutput<\/code> we will finally make use of our CoreML model that we imported before. Make sure to hit Cmd+B (build) after importing it, so XCode knows it&#8217;s actually there. We will use it to predict something from the image that we captured. We will then grab the first prediction from the model and display it in our label.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ define the ViewController\nimport UIKit\nimport AVFoundation\nimport Vision<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {\n      \/\/ create a label to hold the predicted number and confidence\n     let label: UILabel = {\n         let label = UILabel()\n         label.textColor = .white\n         label.translatesAutoresizingMaskIntoConstraints = false\n         label.text = \"Label\"\n         label.font = label.font.withSize(40)\n         return label\n      }()\n\n     override func viewDidLoad() {\n     \/\/ call the parent function\n           super.viewDidLoad()\n           setupCaptureSession() \/\/ establish the capture\n           view.addSubview(label) \/\/ add the label\n           setupLabel()\n     }\n\n      func setupCaptureSession() {\n           \/\/ create a new capture session\n           let captureSession = AVCaptureSession()\n\n           \/\/ find the available cameras\n           let availableDevices = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: AVMediaType.video, position: .back).devices\n\n        do {\n  
          \/\/ select the first available camera (here the back-facing one)\n            if let captureDevice = availableDevices.first {\n                captureSession.addInput(try AVCaptureDeviceInput(device: captureDevice))\n           }\n         } catch {\n           \/\/ print an error if the camera is not available\n           print(error.localizedDescription)\n         }\n\n        \/\/ setup the video output to the screen and add output to our capture session\n        let captureOutput = AVCaptureVideoDataOutput()\n        captureSession.addOutput(captureOutput)\n        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)\n        previewLayer.frame = view.frame\n        view.layer.addSublayer(previewLayer)\n\n        \/\/ buffer the video and start the capture session\n        captureOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: \"videoQueue\"))\n        captureSession.startRunning()\n    }\n\n    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {\n        \/\/ load our CoreML MNIST model\n        guard let model = try? VNCoreMLModel(for: SimpleMnist().model) else { return }\n\n        \/\/ run an inference with CoreML\n        let request = VNCoreMLRequest(model: model) { (finishedRequest, error) in\n\n            \/\/ grab the inference results\n            guard let results = finishedRequest.results as? [VNClassificationObservation] else { return }\n\n            \/\/ grab the highest confidence result\n            guard let observation = results.first else { return }\n\n            \/\/ create the label text\n            let predclass = \"\\(observation.identifier)\"\n\n            \/\/ set the label text\n            DispatchQueue.main.async(execute: {\n                self.label.text = \"\\(predclass)\"\n           })\n        }\n\n        \/\/ create a Core Video pixel buffer which is an image buffer that holds pixels in main memory\n        \/\/ Applications generating frames, compressing or decompressing video, or using Core Image\n        \/\/ can all make use of Core Video pixel buffers\n        guard let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }\n\n        \/\/ run the request on the buffered frame\n        try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])\n    }\n\n    func setupLabel() {\n        \/\/ constrain the label in the center\n        label.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true\n\n        \/\/ constrain the label to 50 pixels from the bottom\n        label.bottomAnchor.constraint(equalTo: view.bottomAnchor, constant: -50).isActive = true\n    }\n}<\/code><\/pre>\n\n\n\n<p>Make sure that you change the model name (SimpleMnist) to the name of your own model, otherwise you will get build errors.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5345\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/modeldetails.png.webp\" alt=\"\" width=\"809\" height=\"470\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/modeldetails.png.webp 809w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/modeldetails.png-300x174.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/modeldetails.png-768x446.webp 768w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">7. 
Add Privacy\u00a0Message<\/h4>\n\n\n\n<p>Finally, since we are going to use the camera, we need to inform the user that we are going to do so, and thus add a privacy message \u201cPrivacy\u200a\u2014\u200aCamera Usage Description\u201d in the Info.plist file under Information Property List.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5346\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/privacy.png.webp\" alt=\"\" width=\"809\" height=\"389\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/privacy.png.webp 809w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/privacy.png-300x144.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/privacy.png-768x369.webp 768w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">8. Add a build\u00a0team<\/h4>\n\n\n\n<p>In order to deploy the app on your iPhone, you will need to <a href=\"https:\/\/developer.apple.com\/programs\/enroll\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/developer.apple.com\/programs\/enroll\/\">register with the Apple developer program<\/a>. There is no need to pay any money to do so; <a href=\"https:\/\/9to5mac.com\/2016\/03\/27\/how-to-create-free-apple-developer-account-sideload-apps\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/9to5mac.com\/2016\/03\/27\/how-to-create-free-apple-developer-account-sideload-apps\/\">you can also register without any fees<\/a>. Once you are registered, you can select the team (this is what Apple calls it) that you signed up with in the project properties.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">9. Deploy on your\u00a0iPhone<\/h4>\n\n\n\n<p>Finally, it\u2019s time to deploy the model on your iPhone. You will need to connect it via USB and then unlock it. 
Once it\u2019s unlocked you need to select the destination under Product\u200a\u2014\u200aDestination\u200a\u2014\u200aYour iPhone. Then the only thing left is to run it on your mobile: select Product\u200a\u2014\u200aRun (or simply hit CMD + R) in the menu, and Xcode will build and deploy the project on your iPhone.<\/p>\n\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5347\" src=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/destination.png.webp\" alt=\"\" width=\"809\" height=\"652\" srcset=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/destination.png.webp 809w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/destination.png-300x242.webp 300w, https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/destination.png-768x619.webp 768w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/figure>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">10. Try it\u00a0out<\/h4>\n\n\n\n<p>After having jumped through so many hoops, it is finally time to try out our app. If you are starting it for the first time, it will ask you to allow it to use your camera (after all, we placed that privacy message there). Then make sure to hold your iPhone sideways, since orientation matters because of how we trained the network. We did not use any augmentation techniques, so our model is unable to recognize numbers that are \u201clying on the side\u201d. We could make our model better by applying these techniques as I have shown in <a href=\"https:\/\/www.liip.ch\/en\/blog\/zoo-pokedex-part-2-hands-on-with-keras-and-resnet50\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/zoo-pokedex-part-2-hands-on-with-keras-and-resnet50\">this blog article<\/a>.<\/p>\n\n\n\n<p>A second thing you might notice is that the app always recognizes some number, as there is no \u201cbackground\u201d class. 
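<\/p>\n\n\n\n<p>To make this background-class idea concrete, here is a minimal sketch of how the training data could be extended on the Keras\/numpy side of this series. The placeholder arrays and the random-noise images below are assumptions for illustration only, not the exact data or approach used in this post; in practice you would mix in real background photos and give the final Dense layer eleven outputs.<\/p>

```python
import numpy as np

# placeholders standing in for the real MNIST arrays loaded earlier
# in this series (28x28 grayscale images, labels 0-9)
x_train = np.zeros((100, 28, 28), dtype=np.uint8)
y_train = np.random.randint(0, 10, size=100)

# random-noise images acting as an 11th 'background' class (label 10)
n_background = 20
x_background = np.random.randint(0, 256, size=(n_background, 28, 28), dtype=np.uint8)
y_background = np.full(n_background, 10)

# append the background samples to the training data
x_train = np.concatenate([x_train, x_background])
y_train = np.concatenate([y_train, y_background])

# one-hot encode with 11 classes (equivalent to keras.utils.to_categorical)
y_onehot = np.eye(11)[y_train]

print(x_train.shape, y_onehot.shape)  # (120, 28, 28) (120, 11)
```

<p>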
In order to fix this, we could additionally train the model on some random images, which we classify as the background class. This way, our model would be better equipped to tell whether it is seeing a number or just some random background.<\/p>\n\n\n\n<p>Obviously, this is a very long blog post. Yet I wanted to get all the necessary info into one place in order to show other mobile devs how easy it is to create your own deep learning computer vision applications. In our case at Liip, it will most certainly boil down to a collaboration between our <a href=\"https:\/\/www.liip.ch\/en\/work\/data\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/work\/data\">data services team<\/a> and our mobile developers in order to get the best of both worlds.<\/p>\n\n\n\n<p>In fact, we are currently innovating together by creating an app that <a href=\"https:\/\/www.liip.ch\/en\/blog\/zoo-pokedex-part-2-hands-on-with-keras-and-resnet50\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/zoo-pokedex-part-2-hands-on-with-keras-and-resnet50\">will be able to recognize<\/a> <a href=\"https:\/\/www.liip.ch\/en\/blog\/poke-zoo-or-making-deep-learning-tell-oryxes-apart-from-lamas-in-a-zoo-part-1-the-idea-and-concepts\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.liip.ch\/en\/blog\/poke-zoo-or-making-deep-learning-tell-oryxes-apart-from-lamas-in-a-zoo-part-1-the-idea-and-concepts\">animals in a zoo<\/a>, and we are working on another small fun game that lets two people doodle against each other: you will be given a task, such as \u201cdraw an apple\u201d, and the first person to draw an apple that the deep learning model recognizes wins.<\/p>\n\n\n\n<p>Beyond such fun innovation projects, the possibilities are endless but always depend on the context of the business and the users. 
Obviously the saying \u201cif you have a hammer, every problem looks like a nail\u201d applies here too: not every app will benefit from having computer vision on board, and not all apps using computer vision are <a href=\"https:\/\/www.theverge.com\/2017\/6\/26\/15876006\/hot-dog-app-android-silicon-valley\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.theverge.com\/2017\/6\/26\/15876006\/hot-dog-app-android-silicon-valley\">useful ones<\/a>, as some of you might know from the famous Silicon Valley episode.<\/p>\n\n\n\n<p>Yet there are quite a few nice examples of apps that use computer vision successfully:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"http:\/\/leafsnap.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/leafsnap.com\/\">Leafsnap<\/a> lets you distinguish different types of leaves.<\/li>\n<li><a href=\"https:\/\/www.aipoly.com\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.aipoly.com\/\">Aipoly<\/a> helps visually impaired people explore the world.<\/li>\n<li><a href=\"http:\/\/www.snooth.com\/iphone-app\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.snooth.com\/iphone-app\/\">Snooth<\/a> gets you more info on your wine by taking a picture of the label.<\/li>\n<li><a href=\"https:\/\/www.theverge.com\/2017\/2\/8\/14549798\/pinterest-lens-visual-discovery-shazam\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.theverge.com\/2017\/2\/8\/14549798\/pinterest-lens-visual-discovery-shazam\">Pinterest<\/a> has launched a visual search that lets you find pins matching a product you captured with your phone.<\/li>\n<li><a href=\"http:\/\/www.caloriemama.ai\/\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"http:\/\/www.caloriemama.ai\/\">Caloriemama<\/a> lets you snap a picture of your food and tells you how many calories it has.<\/li>\n<\/ul>\n\n\n\n<p>As usual, the code that 
you have seen in this blog post is <a href=\"https:\/\/github.com\/plotti\/mnist-to-coreml\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/github.com\/plotti\/mnist-to-coreml\">available online<\/a>. Feel free to experiment with it. I am looking forward to your comments and I hope you enjoyed the journey. P.S. I would like to thank Stefanie Taepke for proofreading and for her helpful comments which made this post more readable.<\/p>\n\n\n\n<p><strong>Found this article useful? Here are some other articles you might enjoy:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.notion.so\/cometml\/Comet-ml-Release-Notes-93d864bcac584360943a73ae9507bcaa\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/www.notion.so\/cometml\/Comet-ml-Release-Notes-93d864bcac584360943a73ae9507bcaa\">comet.ml Release Notes\u200a<\/a>\u2014\u200aupdated daily with new features and fixes!<\/li>\n<li><a href=\"https:\/\/medium.com\/comet-ml\/monitoring-machine-learning-model-results-live-from-jupyter-notebooks-765a142069bb\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/comet-ml\/monitoring-machine-learning-model-results-live-from-jupyter-notebooks-765a142069bb\">Monitoring machine learning model results live from Jupyter Notebooks<\/a><\/li>\n<li><a href=\"https:\/\/medium.com\/comet-ml\/building-reliable-machine-learning-models-with-cross-validation-20b2c3e32f3e\" target=\"_blank\" rel=\"noopener noreferrer\" data-href=\"https:\/\/medium.com\/comet-ml\/building-reliable-machine-learning-models-with-cross-validation-20b2c3e32f3e\">Building Reliable Machine Learning Models with Cross Validation<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to build and train a deep learning network to recognize numbers (MNIST), how to convert it in the CoreML format to then deploy it on your iPhoneX and make it recognize numbers in real-time! 
This post originally appeared on the Liip blog here and was reposted with the author\u2019s permission. We also recommend [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"customer_name":"","customer_description":"","customer_industry":"","customer_technologies":"","customer_logo":"","footnotes":""},"categories":[7],"tags":[],"coauthors":[107],"class_list":["post-1647","post","type-post","status-publish","format-standard","hentry","category-tutorials"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z - Comet<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z\" \/>\n<meta property=\"og:description\" content=\"Learn how to build and train a deep learning network to recognize numbers (MNIST), how to convert it in the CoreML format to then deploy it on your iPhoneX and make it recognize numbers in real-time! This post originally appeared on the Liip blog here and was reposted with the author\u2019s permission. 
We also recommend [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/\" \/>\n<meta property=\"og:site_name\" content=\"Comet\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cometdotml\" \/>\n<meta property=\"article:published_time\" content=\"2018-11-15T00:23:34+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp\" \/>\n<meta name=\"author\" content=\"Gideon Mendels\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Cometml\" \/>\n<meta name=\"twitter:site\" content=\"@Cometml\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Gideon Mendels\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"18 minutes\" \/>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z - Comet","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/","og_locale":"en_US","og_type":"article","og_title":"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z","og_description":"Learn how to build and train a deep learning network to recognize numbers (MNIST), how to convert it in the CoreML format to then deploy it on your iPhoneX and make it recognize numbers in real-time! This post originally appeared on the Liip blog here and was reposted with the author\u2019s permission. 
We also recommend [&hellip;]","og_url":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/","og_site_name":"Comet","article_publisher":"https:\/\/www.facebook.com\/cometdotml","article_published_time":"2018-11-15T00:23:34+00:00","og_image":[{"url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp","type":"","width":"","height":""}],"author":"Gideon Mendels","twitter_card":"summary_large_image","twitter_creator":"@Cometml","twitter_site":"@Cometml","twitter_misc":{"Written by":"Gideon Mendels","Est. reading time":"18 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#article","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/"},"author":{"name":"engineering@atre.net","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/550ac35e8e821db8064c5bd1f0a04e6b"},"headline":"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to 
Z","datePublished":"2018-11-15T00:23:34+00:00","mainEntityOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/"},"wordCount":2651,"publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp","articleSection":["Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/","url":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/","name":"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z - Comet","isPartOf":{"@id":"https:\/\/www.comet.com\/site\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#primaryimage"},"image":{"@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#primaryimage"},"thumbnailUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp","datePublished":"2018-11-15T00:23:34+00:00","breadcrumb":{"@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#primaryimage","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/num
bers.png.webp","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2018\/11\/numbers.png.webp"},{"@type":"BreadcrumbList","@id":"https:\/\/www.comet.com\/site\/blog\/real-time-numbers-recognition-mnist-on-an-iphone-with-coreml-from-a-to-z\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.comet.com\/site\/"},{"@type":"ListItem","position":2,"name":"Real-time numbers recognition (MNIST) on an iPhone with CoreML from A to Z"}]},{"@type":"WebSite","@id":"https:\/\/www.comet.com\/site\/#website","url":"https:\/\/www.comet.com\/site\/","name":"Comet","description":"Build Better Models Faster","publisher":{"@id":"https:\/\/www.comet.com\/site\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.comet.com\/site\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.comet.com\/site\/#organization","name":"Comet ML, Inc.","alternateName":"Comet","url":"https:\/\/www.comet.com\/site\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/","url":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","contentUrl":"https:\/\/www.comet.com\/site\/wp-content\/uploads\/2025\/01\/logo_comet_square.png","width":310,"height":310,"caption":"Comet ML, 
Inc."},"image":{"@id":"https:\/\/www.comet.com\/site\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/cometdotml","https:\/\/x.com\/Cometml","https:\/\/www.youtube.com\/channel\/UCmN63HKvfXSCS-UwVwmK8Hw"]},{"@type":"Person","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/550ac35e8e821db8064c5bd1f0a04e6b","name":"engineering@atre.net","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.comet.com\/site\/#\/schema\/person\/image\/027c18177377edf459980f0cfb83706c","url":"https:\/\/secure.gravatar.com\/avatar\/d002a459a297e0d1779329318029aee19868c312b3e1f3c9ec9b3e3add2740de?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d002a459a297e0d1779329318029aee19868c312b3e1f3c9ec9b3e3add2740de?s=96&d=mm&r=g","caption":"engineering@atre.net"},"sameAs":["https:\/\/live-cometml.pantheonsite.io"],"url":"https:\/\/www.comet.com\/site\/blog\/author\/engineeringatre-net\/"}]}},"_links":{"self":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/1647","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/comments?post=1647"}],"version-history":[{"count":0,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/posts\/1647\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/media?parent=1647"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/categories?post=1647"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/tags?post=1647"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.comet.com\/site\/wp-json\/wp\/v2\/coauthors?post=1647"}],"curi
es":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}