Stanford Alpaca 7B Training Dataset

Published on: 2023-03-13

Contributors: Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto

Datarows: 52,000

Large Language Models

Generative AI


Alpaca 7B is a model fine-tuned from Meta's LLaMA 7B model on 52K instruction-following demonstrations. In a preliminary evaluation of single-turn instruction following, Alpaca behaves qualitatively similarly to OpenAI’s text-davinci-003 while being smaller and easier to reproduce. The model was created and published by a group of Stanford PhD students.

This dataset contains the 52K instruction-following samples used to train the Alpaca 7B model. The samples were generated with text-davinci-003 in the style of self-instruct.
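
To make the shape of an instruction-following sample concrete, here is a minimal Python sketch of how one record might be turned into a training prompt. It assumes each record has "instruction", "input", and "output" fields, as in the publicly released Alpaca data file, and that the prompt wording follows the template used in the Stanford Alpaca repository. The sample record and the build_prompt helper are hypothetical and exist only for illustration; check the field names and template against the data you actually download.

```python
import json

# Prompt templates modeled on the Stanford Alpaca repository (assumed wording).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Return the prompt text for one record (hypothetical helper)."""
    # Records with an empty "input" field use the shorter template.
    if record.get("input", "").strip():
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


# Hypothetical record, not taken from the dataset.
sample = json.loads(
    '{"instruction": "Translate the sentence to French.",'
    ' "input": "Good morning.", "output": "Bonjour."}'
)

prompt = build_prompt(sample)
# During fine-tuning, the model learns to continue the prompt with the output.
print(prompt + sample["output"])
```

Records whose "input" field is empty fall back to the shorter template, so the same helper covers both kinds of sample.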

License
Apache License 2.0
