Text classification

Use the built-in Text classification tool to apply global classifications, which label an entire unstructured text asset rather than a portion of it. To annotate specific strings of characters within a text file instead, see Named entity recognition.

Natural Language Processing (NLP) is an area of research and application that explores how to use computers to “understand” and manipulate natural language, such as text or speech. Most NLP techniques rely on machine learning to derive meaning from human languages. One common NLP task is text classification, which typically uses machine learning, often deep learning, to assign categories to sequences of unstructured text.

Invest enough time in pre-processing your data and configuring your ontology before you start labeling. Doing so helps you avoid flaws and irregularities in your labeled dataset.
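As an illustration of the kind of pre-processing meant here, the following is a minimal sketch that normalizes Unicode, strips control characters, collapses whitespace, and drops empty or duplicate texts. The function names `clean_text` and `deduplicate` are hypothetical helpers written for this example, not part of the product, and the right cleaning steps depend on your data.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize Unicode, drop control characters, and collapse whitespace."""
    text = unicodedata.normalize("NFC", raw)
    # Remove control characters (category "Cc") except newline and tab.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse three or more newlines into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def deduplicate(texts):
    """Clean each text and keep the first occurrence of every non-empty result."""
    seen = set()
    unique = []
    for item in texts:
        cleaned = clean_text(item)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique
```

Running a step like this before import keeps near-identical assets from being labeled twice and avoids stray characters showing up in the editor.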

Import text data

To learn how to import your text files directly, see Direct upload. To learn how to import your text file URLs via JSON, see Import via JSON. To learn how to attach metadata to your imported text files, see Asset metadata via JSON.
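For the JSON route, a short script can generate the import file from a list of text file URLs. The sketch below only shows the general shape. The field names `externalId` and `data` and the URLs are placeholders assumed for this example, so check Import via JSON for the fields your import actually requires.

```python
import json


def build_import_rows(urls):
    """Build one row per text file URL.

    The keys "externalId" and "data" are assumed placeholders for
    illustration; use the field names given in Import via JSON.
    """
    rows = []
    for index, url in enumerate(urls):
        rows.append({"externalId": f"text-{index:04d}", "data": url})
    return rows


# Hypothetical URLs used only to demonstrate the structure.
urls = [
    "https://example.com/texts/review-1.txt",
    "https://example.com/texts/review-2.txt",
]

# Serialize the rows; write this string to a .json file before importing.
payload = json.dumps(build_import_rows(urls), indent=2)
```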

Set up Text classification

  1. Create a project.
  2. Select “Editor” as your label editor.
  3. Click “Add classification” and name it.
  4. Select the classification type.
  5. Click “Confirm”.
  6. Click “Complete setup”.

To see a sample script for setting up your project’s ontology programmatically, see Project setup script.
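Before reaching for the setup script, it can help to write the ontology down as plain data and check it for obvious mistakes. The following is a rough sketch under stated assumptions: the dictionary layout, the `build_classification` helper, and the type names `radio` and `checklist` are all hypothetical and do not reflect the real ontology schema, which is described in Project setup script.

```python
# Assumed classification types for this sketch only.
ALLOWED_TYPES = {"radio", "checklist"}


def build_classification(name, kind, options):
    """Return a classification definition after basic sanity checks."""
    if not name.strip():
        raise ValueError("classification name must not be empty")
    if kind not in ALLOWED_TYPES:
        raise ValueError(f"unknown classification type: {kind!r}")
    if len(set(options)) != len(options):
        raise ValueError("options must be unique")
    if len(options) < 2:
        raise ValueError("provide at least two options")
    return {"name": name, "type": kind, "options": list(options)}


# A hypothetical ontology with one global classification per text asset.
ontology = {
    "classifications": [
        build_classification(
            "sentiment", "radio", ["positive", "neutral", "negative"]
        ),
    ]
}
```

Catching duplicate or missing options at this stage is cheaper than correcting labels after annotation has started.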

Note: You can also reuse ontologies from other projects. To learn more, see Ontology overview.

Label format

To learn how to export your annotations, see How to export labels. To see a sample export for Text classification, see Label export formats.

