Automated recognition of cricket batting techniques in videos using deep learning
Summary: Researchers from the University of Johannesburg recently studied how complex batting techniques in cricket can be analyzed more efficiently using machine learning.
Challenge: Few studies have demonstrated the validation of batting techniques in cricket using machine learning. Cricket batting technique is intricate because performing a stroke involves a series of complex gestures; one of these gestures, performed by the batsman, is referred to as the batting backlift technique (BBT).
Previous research has indicated that the BBT is a contributing factor to successful batsmanship. The study investigates two backlifts: the lateral batting backlift technique (LBBT) and the straight batting backlift technique (SBBT). In the LBBT, the toe and face of the bat are lifted laterally in the direction of second slip; in the SBBT, the toe and face of the bat point toward the stumps and the ground.
The study demonstrates how the batting backlift technique in cricket can be automatically recognized in video footage and compares the performance of popular deep learning architectures, namely AlexNet, Inception V3, Inception ResNet V2, and Xception.
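The paper does not include its training code, but a transfer-learning setup along the following lines is a common way to adapt one of these architectures (Xception is used here) to a two-class problem. The classification head, optimizer, and hyperparameters below are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' exact code): fine-tuning a pretrained
# Xception backbone for the two-class backlift problem with Keras.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_backlift_classifier(input_shape=(128, 128, 3)):
    # Pretrained Xception feature extractor without its ImageNet classifier head.
    base = tf.keras.applications.Xception(
        include_top=False, weights="imagenet", input_shape=input_shape
    )
    base.trainable = False  # start by training only the new head

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.5),                       # assumed regularisation, not from the paper
        layers.Dense(1, activation="sigmoid"),     # lateral (1) vs straight (0)
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_backlift_classifier()
model.summary()
```

The same pattern applies to the other three architectures by swapping the backbone (e.g. `tf.keras.applications.InceptionV3` or `InceptionResNetV2`), which is what makes a like-for-like comparison straightforward.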
Findings: After building a unique dataset of lateral and straight backlift classes and assessing the models against standard machine learning metrics, the researchers found that the architectures performed comparably, each producing one false positive in the lateral class with a precision score of 100%, a recall score of 95%, and an F1-score of 98%.
The AlexNet architecture performed the worst of the four, incorrectly classifying four images that belonged in the straight class. The architecture best suited to the problem domain was Xception, with a loss of 0.03 and an accuracy of 98.25%, demonstrating its capability in differentiating between lateral and straight backlifts.
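The per-class precision, recall, and F1 scores reported above follow directly from a confusion matrix over the held-out test images. A minimal sketch using scikit-learn is shown below; the label arrays are hypothetical placeholders standing in for the real test-set predictions.

```python
# Sketch of the evaluation step: compute a confusion matrix and per-class
# precision/recall/F1 for the 40 held-out images (0 = straight, 1 = lateral).
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical labels and predictions; replace with real model outputs.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=40)
y_pred = y_true.copy()
y_pred[0] = 1 - y_pred[0]  # simulate a single misclassification

print(confusion_matrix(y_true, y_pred))  # off-diagonal cells are misclassifications
print(classification_report(
    y_true, y_pred, target_names=["straight", "lateral"], digits=3
))
```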
The study provides a way forward in the automatic recognition of player patterns and motion capture, making it less challenging for sports scientists, biomechanists and video analysts working in the field.
How Labelbox was used: To construct the dataset, the researchers conducted a comprehensive YouTube search of First-Class International Cricket Test Match highlights, where the match environment has fewer variations to consider.
Using the Labelbox platform, each object within a cricket scene was labeled, allowing for easier isolation and extraction of the batsman in each frame. The frame used for constructing the dataset was the moment when the bowler was about to release the ball towards the batsman.
This frame was identified as the ideal moment to capture the position of the batsman at the instant of delivery. Using an 80:20 data split, the training set contained 160 images and the testing set 40 images, for a total of 200 images, which served as a baseline for comparing the proposed architectures. An image size of 128×128 pixels was chosen through testing and validation to avoid distorting the original images.
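As one illustration of how the 80:20 split and 128×128 sizing could be realised in practice, the sketch below uses a standard Keras utility. The directory layout, seed, and batch size are assumptions rather than details taken from the study.

```python
# Sketch: load cropped batsman frames from class subfolders
# (backlift_frames/lateral, backlift_frames/straight), resize to 128x128,
# and split 80:20 into training and testing sets (160 / 40 images).
import tensorflow as tf

IMG_SIZE = (128, 128)          # chosen in the study to avoid distorting the frames
DATA_DIR = "backlift_frames"   # assumed folder name

train_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR,
    validation_split=0.2,      # 80:20 split
    subset="training",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=16,
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR,
    validation_split=0.2,
    subset="validation",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=16,
)
```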
You can read the full PDF here.