Deeplearning4j is an open-source deep learning library written in Java and designed for business environments. The tool integrates easily with GPUs and scales on Hadoop or Spark. As the documentation says, Deeplearning4j supports the majority of deep architectures:
- Convolutional Neural Networks
- Restricted Boltzmann Machines
- Recurrent Nets
- Deep Belief Networks
- Deep Autoencoders
Lasagne is a lightweight library written in Python to build and train neural networks in Theano. Lasagne supports feed-forward neural networks such as:
- Convolutional Neural Networks
- Recurrent Neural Networks (including LSTM, GRU)
The library is designed to be easy to understand, customize and extend – as an open-source project by researchers for researchers.
Another useful package is nolearn, which makes using Lasagne simpler and faster for prototyping.
Using convolutional neural networks to classify CIFAR-10 images
For the following examples we will be using the CIFAR-10 dataset. The CIFAR-10 dataset consists of 60000 32×32 colour images in 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck), with 6000 images per class. There are 50000 training and 10000 test images.
For the Java example I will use the CIFAR-10 dataset available on the Kaggle website: https://www.kaggle.com/c/cifar-10/, and for the Python example I will use the Python version of the CIFAR-10 dataset available here: http://www.cs.toronto.edu/~kriz/cifar.html.
Convolutional neural networks are widely used for image and video recognition. What is interesting about these networks is that they use relatively little pre-processing – they learn filters that in traditional algorithms were hand-engineered. In this article I will not focus on the theoretical background and details of these neural networks. If you are interested in how these networks work, I encourage you to take a look at https://cs231n.github.io/.
In the next sections we will create a simple convolutional neural network for this problem using both of the libraries described earlier. The architecture of the network we will build looks like this:
The code is divided into three main sections: data preparation, model definition, and training and evaluation. First, all images are read by ImageRecordReader. Then, using the CSVRecordReader, I load all image labels – I slightly modified the original labels.csv file so that it contains only one column with labels. To combine the image and label data I used ComposableRecordReader – it’s really useful, as it lets you combine data from multiple sources. The only drawback is that I had to take care of replacing verbose labels with their integer indexes. For this simple example the same can be achieved using only ImageRecordReader – you can initialize it with a list of string labels.
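The pipeline described above can be summarised with the following pseudo-code (class names follow Deeplearning4j's record-reader API; argument lists are abbreviated):

```
imageReader  = ImageRecordReader(height=32, width=32, channels=3)  # reads the raw images
labelReader  = CSVRecordReader()                                   # reads the one-column labels file
recordReader = ComposableRecordReader(imageReader, labelReader)    # merges both sources into one record stream
```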
The next step is to configure the neural network. Deeplearning4j provides a builder for defining a deep neural network layer by layer and setting other network parameters. The dataset used in this example is not split into train and test sets. I used RecordReaderDataSetIterator to split the whole dataset into 5000 batches, each of 100 samples. In the training phase each batch is randomly divided into train and test datasets – 80 samples go to the train set and the remaining 20 to the test set. For each batch, we save the test data in order to validate the model at the end, when the model is fully trained. By defining only one Evaluation object we are able to get statistics for the whole dataset. The training and evaluation phases implemented in Deeplearning4j can be summed up with the following pseudo-code:
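(Reconstructed from the description above; the method names are illustrative, not Deeplearning4j's exact API.)

```
savedTestSets = []
for batch in dataset:                          # 5000 batches of 100 samples
    trainSet, testSet = batch.randomSplit(80, 20)
    model.fit(trainSet)
    savedTestSets.append(testSet)

# once training is finished, validate on all saved test data
evaluation = Evaluation()
for testSet in savedTestSets:
    evaluation.eval(testSet.labels, model.output(testSet.features))
print(evaluation.stats())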
The data provided by http://www.cs.toronto.edu/~kriz/cifar.html comes in six pickled batches: five with training data and one with test data. Before using this dataset with Lasagne and nolearn, the data needs to be reshaped, as each batch initially comes as a 10000×3072 NumPy array.
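That preprocessing can be sketched as follows (the helper names here are mine; `load_batch` assumes the pickled-dict format documented on the dataset page):

```python
import pickle

import numpy as np


def reshape_batch(data):
    # Each row is a flat 3072-vector: 1024 red, then 1024 green, then
    # 1024 blue values, so it unpacks to (channels, height, width).
    # Scaling to [0, 1] floats is a common extra step before training.
    return data.reshape(-1, 3, 32, 32).astype(np.float32) / 255.0


def load_batch(path):
    # A CIFAR-10 batch file is a pickled dict with b'data' (10000x3072
    # uint8 array) and b'labels' (list of 10000 ints) keys.
    with open(path, 'rb') as f:
        batch = pickle.load(f, encoding='bytes')
    return reshape_batch(batch[b'data']), np.array(batch[b'labels'], dtype=np.int32)
```

Calling `load_batch` on each of the five training files and stacking the results yields the 50000×3×32×32 training array.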
Defining a neural net with nolearn is intuitive and easy. nolearn provides several wrappers around Lasagne. To create a neural network you just provide the layers’ names and types and specify their additional parameters. The default batch size used by nolearn is 128, and each train batch is randomly split into train and validation sets with a ratio of 80:20. All these parameters are fully adjustable. When it comes to model training and evaluation, Lasagne/nolearn take quite a different approach than Deeplearning4j. The following pseudocode shows the simplified algorithm:
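(Again reconstructed from the description; the names are illustrative, not nolearn's actual API.)

```
trainSet, valSet = dataset.randomSplit(80, 20)
for epoch in range(maxEpochs):
    for miniBatch in shuffledMiniBatches(trainSet, batchSize=128):
        model.trainOn(miniBatch)
    report(epoch, trainLoss, model.evaluate(valSet))
```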
The main difference from Deeplearning4j is that with Lasagne/nolearn you define the learning procedure in terms of epochs – an epoch being a complete iteration over the whole train/validation dataset. During each epoch the whole train set is randomly sampled into several mini-batches of fixed size, which are then used to train the model.
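That per-epoch shuffling can be sketched in plain Python (this illustrates the idea only, it is not nolearn's actual implementation):

```python
import random


def iterate_minibatches(samples, batch_size, rng=None):
    """One call = one epoch: reshuffle the training set, then yield
    fixed-size mini-batches (a trailing partial batch is dropped)."""
    rng = rng or random.Random()
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    for start in range(0, len(samples) - batch_size + 1, batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]
```

Each epoch then simply iterates this generator and feeds every mini-batch to the model's training step.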
Here is the structure of the network created with Lasagne/nolearn:
The networks created in this article are not optimal for the CIFAR-10 problem, but my intention was to give an overall insight into Deeplearning4j and Lasagne/nolearn. I published the complete code for this article on GitHub. Feel free to modify and experiment with the examples.