Summary: Researchers from the Medical University of Graz recently studied the impact of an artificial intelligence (AI)-based computer system on the accuracy and agreement rate of board-certified orthopedic surgeons in detecting X-ray features indicative of knee osteoarthritis (OA). The researchers focused on building a framework to compare AI-aided assessments with unaided ones, and to compare the surgeons' results with those of senior residents.
Challenge: Early-stage OA signs may be invisible on plain X-rays, because cartilage degeneration cannot be assessed directly and OA is a three-dimensional problem. This is reflected in the fair to moderate interobserver reliability of knee OA assessment using X-rays alone. To overcome these issues, different solutions have been proposed, including novel quantitative grading methods and automatic knee X-ray assessment tools. AI and deep learning have already been used for medical image classification related to the musculoskeletal system. The researchers analyzed the intra- and interobserver reliability of board-certified orthopedic surgeons (termed senior readers) for knee OA grade assessment using either AI-annotated or plain X-rays. Afterwards, they compared the outcome of the senior readers with that of senior residents (termed junior readers) using aided analysis, in terms of agreement rate and overall performance.
Findings: The use of AI-based software improved the radiological judgement of senior orthopedic surgeons with regard to X-ray features indicative of knee OA and Kellgren-Lawrence (KL) grade, as measured by the agreement rate and by overall accuracy in comparison to the ground truth. The agreement and accuracy rates of senior readers were comparable to those of junior readers with aided analysis. Consequently, the standard of care may be improved by additionally applying AI-based software in the radiological evaluation of knee OA.
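As a rough illustration of the two measures named above, the following Python sketch computes a mean pairwise agreement rate between readers and each reader's accuracy against a ground truth. It is not the study's statistical method, which this summary does not describe. The function names, the reader names, and the KL grades in the toy data are all hypothetical and chosen only to make the example runnable.

```python
from itertools import combinations


def agreement_rate(labels_a, labels_b):
    """Fraction of images on which two label lists assign the same grade."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same images")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def mean_pairwise_agreement(readings):
    """Average agreement rate over every pair of readers."""
    pairs = list(combinations(readings.values(), 2))
    return sum(agreement_rate(a, b) for a, b in pairs) / len(pairs)


def accuracy(labels, ground_truth):
    """Fraction of images on which a reader matches the ground truth."""
    return agreement_rate(labels, ground_truth)


# Hypothetical toy data: KL grades (0-4) for six images, NOT study results.
ground_truth = [0, 1, 2, 3, 4, 2]
unaided = {
    "reader_1": [0, 1, 1, 3, 4, 2],
    "reader_2": [0, 2, 2, 3, 3, 2],
    "reader_3": [1, 1, 2, 2, 4, 2],
}
aided = {
    "reader_1": [0, 1, 2, 3, 4, 2],
    "reader_2": [0, 1, 2, 3, 3, 2],
    "reader_3": [0, 1, 2, 3, 4, 2],
}

for name, readings in (("unaided", unaided), ("aided", aided)):
    print(name, "mean pairwise agreement:",
          round(mean_pairwise_agreement(readings), 3))
    for reader, labels in readings.items():
        print("  ", reader, "accuracy:", round(accuracy(labels, ground_truth), 3))
```

Raw percentage agreement is the simplest possible measure and does not correct for agreement expected by chance. A chance-corrected statistic such as Cohen's kappa would be a common alternative for a reliability analysis.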
How Labelbox was used: The labelling process was divided into three steps. First, three senior readers were trained on the structure of the AI software report, the OARSI (Osteoarthritis Research Society International) grading system, the labelling process, and the Labelbox platform. Second, the readers assessed 124 plain knee X-rays unaided (i.e. without AI annotations), defining KL grade, osteophytes, sclerosis, and joint space narrowing (JSN) by completing a list. The readers could work remotely at their preferred time and could interrupt and resume labelling at any time, with no time restrictions for individual images or for the dataset as a whole. However, the time readers took to label each image, as well as the duration of the entire labelling process, was recorded. Third, at least 2 weeks after the second step had been completed, the readers relabelled the same 124 knee X-rays, presented in random order (to avoid creating observer bias). This time, each image was supplemented with the AI software’s report together with a binary score indicating whether OA was present on the X-ray.
You can read the full PDF here.