In beta: Surface high-impact model failures with auto-generated metrics
We'll soon release auto-generated metrics that help your team surface high-impact model failures.
With this feature:
- You'll no longer have to manually compute and upload model metrics — simply upload model predictions and ground truths
- Some model metrics will be auto-generated by Labelbox based on your predictions and ground truths: precision, recall, F1-score, TP/TN/FP/FN, confusion matrix, etc.
- Model metrics and confidence scores are now attached to specific predictions rather than averaged across data rows
- Your team can analyze your models using a NxN confusion matrix, which is interactive: click on any cell to surface a specific type of misprediction (e.g. “truck” mispredicted as a “car”)
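For intuition, the per-class metrics and confusion matrix listed above can be derived from matched prediction/ground-truth pairs roughly as follows. This is a minimal classification sketch, not Labelbox's implementation; all function and variable names are illustrative:

```python
from collections import Counter

def confusion_matrix(ground_truths, predictions, labels):
    """Rows are ground-truth labels, columns are predicted labels."""
    counts = Counter(zip(ground_truths, predictions))
    return [[counts[(gt, pred)] for pred in labels] for gt in labels]

def per_class_metrics(ground_truths, predictions, label):
    """Precision, recall, and F1 for a single class, one-vs-rest."""
    pairs = list(zip(ground_truths, predictions))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Toy example: one "truck" mispredicted as "car" and vice versa.
gts = ["car", "car", "truck", "car", "truck"]
preds = ["car", "truck", "truck", "car", "car"]
print(confusion_matrix(gts, preds, ["car", "truck"]))  # [[2, 1], [1, 1]]
print(per_class_metrics(gts, preds, "car"))
```

In the interactive NxN matrix, clicking the off-diagonal cell at row "truck", column "car" would surface exactly the data rows counted there — trucks mispredicted as cars.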
You'll still be able to upload your own custom metrics to complement our auto-generated metrics. Auto-generated metrics will be available for a variety of ML tasks: classification on all data types, image object detection, image segmentation, and text NER.
If you're interested in participating in the beta, please sign up here.