Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks.
This deep learning model follows the 2014 paper by Goodfellow et al. and performs end-to-end classification of multi-digit numbers of up to 5 digits. A CNN extracts a 4096-unit feature vector, to which 6 linear classifiers are applied: one for the length of the number and 5 for the individual digits. The model can be used for OCR of numbers of up to 5 digits in blurry, rotated and messy images.
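To make the role of the six classifiers concrete, here is a minimal sketch of how their outputs could be combined into a final number. It is written in plain Python for illustration only and is not taken from eval.lua. The function name decode_number, the helper peaked_log_probs, the 6-way length output (lengths 0 to 5) and the 10-way digit outputs are all assumptions. The scoring rule (log-probability of the length plus the log-probabilities of the best digit at each used position) follows the joint inference described in the paper; whether the supplied scripts do exactly this is not confirmed here.

```python
import math

MAX_DIGITS = 5  # from the description above: numbers of up to 5 digits


def decode_number(length_log_probs, digit_log_probs):
    """Combine the length head and the 5 digit heads into one string.

    length_log_probs: 6 log-probabilities for lengths 0..5 (assumed layout).
    digit_log_probs: 5 lists, each with 10 log-probabilities for digits 0..9
                     (assumed layout).
    """
    # Best digit and its log-probability at every position, independent of length.
    best = []
    for scores in digit_log_probs:
        digit = max(range(len(scores)), key=lambda d: scores[d])
        best.append((digit, scores[digit]))

    # Score each candidate length: log P(length) plus the best digit
    # log-probabilities for the positions that length would use.
    best_length, best_score = 0, -math.inf
    running = 0.0
    for length in range(MAX_DIGITS + 1):
        if length > 0:
            running += best[length - 1][1]
        score = length_log_probs[length] + running
        if score > best_score:
            best_length, best_score = length, score

    # Keep only the first best_length digits; later heads are ignored.
    return "".join(str(best[i][0]) for i in range(best_length))


def peaked_log_probs(n, hot, p=0.9):
    """Hypothetical helper: a distribution over n classes peaked at index hot."""
    rest = (1.0 - p) / (n - 1)
    return [math.log(p if i == hot else rest) for i in range(n)]


# Fake classifier outputs: length 3, digit heads favouring 4, 7, 2, 0, 0.
length_out = peaked_log_probs(6, 3)
digit_out = [peaked_log_probs(10, d) for d in (4, 7, 2, 0, 0)]
print(decode_number(length_out, digit_out))  # prints 472
```

Because the digit heads are scored independently, the best digit at each position can be chosen first and only the six candidate lengths need to be compared.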
Model Structure
The model consists of:
- A preprocessing script - create_dataset.lua
- A training script - model.lua
- An evaluation script - eval.lua
You can either train the model yourself or use the supplied trained model, which achieves 95.4% accuracy.
Trained Model
The trained model can be downloaded from http://www.terminet.xyz/datasets/model.zip
Requirements
The model requires Torch7 with the following packages:
- nn (for the model structure)
- optim (for the model optimization)
- dp (for checkpoint loading)
- cutorch, cunn (for using CUDA)
Training Example
th model.lua -save 'model.net' -epochs 50
Evaluation Example
th eval.lua -load 'model.net' -image 'data/1.png'