TY - GEN
T1 - Deep learning-based fine-grained car make/model classification for visual surveillance
AU - Gundogdu, Erhan
AU - Parıldı, Enes Sinan
AU - Solmaz, Berkan
AU - Yücesoy, Veysel
AU - Koç, Aykut
N1 - Publisher Copyright:
© COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.
PY - 2017
Y1 - 2017
AB - Fine-grained object recognition is a computer vision problem that has recently been addressed with deep Convolutional Neural Networks (CNNs). The main disadvantage of classification methods that rely on deep CNN models, however, is their need for a considerably large amount of data. In addition, relatively little annotated data exists for real-world applications such as the recognition of car models in a traffic surveillance system. To this end, we concentrate on the classification of fine-grained car makes and/or models for visual surveillance scenarios with the help of two different domains. First, a large-scale dataset of approximately 900K images is constructed from a website that lists fine-grained car models, and a state-of-the-art CNN model is trained on this dataset using its labels. The second domain is the set of images collected from a camera integrated into a traffic surveillance system. These images, numbering over 260K, are gathered by a special license plate detection method running on top of a motion detection algorithm. A region of appropriately selected size is cropped from the image around the region of interest given by the detected license plate location. These images and their labels, covering more than 30 classes, are used to fine-tune the CNN model already trained on the large-scale dataset described above. To fine-tune the network, the last two fully-connected layers are randomly initialized and the remaining layers are fine-tuned on the second dataset. In this work, a model learned on a large dataset has been successfully transferred to a smaller one by using both the limited annotated data from the traffic domain and a large-scale dataset with available annotations. Our experimental results, both on the validation dataset and in the field, show that the proposed methodology performs favorably against training the CNN model from scratch.
KW - Deep convolutional neural networks
KW - Fine-tuning
KW - Fine-grained object recognition
KW - Traffic surveillance
UR - http://www.scopus.com/inward/record.url?scp=85038426240&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85038426240&partnerID=8YFLogxK
U2 - 10.1117/12.2278862
DO - 10.1117/12.2278862
M3 - Conference contribution
AN - SCOPUS:85038426240
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies
A2 - Yitzhaky, Yitzhak
A2 - Stokes, Robert James
A2 - Bouma, Henri
A2 - Carlysle-Davies, Felicity
PB - SPIE
T2 - Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies 2017
Y2 - 11 September 2017 through 12 September 2017
ER -