Leading AI lab taps financial experts to train its frontier model on industry-specific reasoning

A frontier AI lab needed to harden its model on financial reasoning. Labelbox's platform produced expert-graded preference signal from CFA- and PhD-level finance experts.

The challenge

A leading AI lab wanted to improve its models' industry-specific reasoning in finance: performance, trustworthiness, and accuracy on financial queries. The target capability was to give meaningful insights on any public company from a ticker symbol and the latest financial reports, and to answer the questions a financial analyst would ask. Producing the expert feedback needed to train for that capability was hard. The tasks were complex and domain-specific, the deadline was tight, and the lab lacked finance expertise at the scale required.

The approach

Labelbox produced the signal. Through its Alignerr network, which spans industry domains and languages, the platform captured judgment from finance experts with CFA, MBA, Master's, and PhD-in-Finance qualifications. The experts were screened from a pool of more than 50 candidates and went through a 24-hour calibration period. Labelbox and the lab developed the task instructions together and built a custom ontology in the platform's text editor, made up of classifications, sub-classifications, and free-text inputs. Against complex, hypothetical prompts, experts rated aspects of the model's outputs on a 1-to-5 scale: hypotheses for probability, importance, and feasibility, and argument quality for conclusiveness and causality. The lab monitored performance and quality metrics throughout, and workflows were adjusted as feedback came in.
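To make the rating scheme concrete, here is a minimal Python sketch of how one expert's 1-to-5 scores could be recorded and averaged. It is an illustration only and does not use the Labelbox SDK or reflect the lab's actual ontology: the five criteria come from the description above, while `ExpertRating`, `mean_scores`, and all field names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

# Criteria named in the case study. Splitting them into two groups
# (hypotheses vs. argument quality) mirrors the text; the names of these
# constants are assumptions made for this sketch.
HYPOTHESIS_CRITERIA = ("probability", "importance", "feasibility")
ARGUMENT_CRITERIA = ("conclusiveness", "causality")
ALL_CRITERIA = HYPOTHESIS_CRITERIA + ARGUMENT_CRITERIA
SCALE_MIN, SCALE_MAX = 1, 5


@dataclass(frozen=True)
class ExpertRating:
    """One expert's scores for one model response (hypothetical schema)."""
    expert_id: str
    response_id: str
    scores: dict       # criterion name -> integer score on the 1-to-5 scale
    comment: str = ""  # stands in for a free-text input

    def __post_init__(self):
        # Require exactly the expected criteria, no more and no fewer.
        missing = set(ALL_CRITERIA) - set(self.scores)
        extra = set(self.scores) - set(ALL_CRITERIA)
        if missing or extra:
            raise ValueError(
                f"missing={sorted(missing)} unexpected={sorted(extra)}"
            )
        # Require whole-number scores inside the scale.
        for name, value in self.scores.items():
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name}: score must be an integer")
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{name}: {value} is outside 1-5")


def mean_scores(ratings):
    """Average each criterion across several experts' ratings."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("need at least one rating")
    return {c: mean(r.scores[c] for r in ratings) for c in ALL_CRITERIA}
```

Validating at construction time means an out-of-range or incomplete rating fails immediately instead of skewing an average later.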

As someone with a PhD in finance, I was intrigued by the opportunity to apply my financial expertise to help train AI models. I've found the work both flexible and intellectually stimulating. While the financial tasks are technically challenging, they have been incredibly rewarding and have provided a welcome mental challenge.
— Shaun C, PhD Finance

The outcome

The lab got high-quality financial signal within its tight timeframe and used it to boost its LLM's performance, accuracy, and reliability. With a repeatable process for expert-graded financial signal, the lab can keep advancing its model on industry-specific reasoning like financial argumentation.

Where this goes

Preference ranking from domain experts is reward signal. This is how a general model becomes a specialist that financial analysts can trust.
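One common way to turn expert scores into a reward signal is to convert them into (chosen, rejected) pairs that a reward model can learn from. The sketch below shows that idea under stated assumptions: the case study does not describe how the lab consumed the ratings, and `preference_pairs`, the `min_margin` threshold, and the response IDs are all hypothetical.

```python
from itertools import combinations


def preference_pairs(scores_by_response, min_margin=0.5):
    """Build (chosen, rejected) pairs from per-response expert scores.

    scores_by_response maps a response ID to its aggregate expert score.
    Pairs whose scores differ by less than min_margin are skipped, since
    a near-tie carries little preference information. The threshold
    value is an illustrative assumption.
    """
    pairs = []
    # Sort by response ID so the output order is deterministic.
    items = sorted(scores_by_response.items())
    for (id_a, score_a), (id_b, score_b) in combinations(items, 2):
        if abs(score_a - score_b) < min_margin:
            continue
        if score_a > score_b:
            pairs.append((id_a, id_b))
        else:
            pairs.append((id_b, id_a))
    return pairs


# Three candidate answers to the same prompt, with made-up scores.
example = {"resp_a": 4.5, "resp_b": 3.0, "resp_c": 3.2}
print(preference_pairs(example))
```

In this example `resp_a` is preferred over both other responses, while `resp_b` and `resp_c` are too close to yield a pair.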


