Text analytics
Extracting insights from unstructured text using TrueState’s natural language tools
Text is one of the richest but most challenging types of business data. It lives in emails, support tickets, surveys, reviews, contracts, and more. To extract meaning from it at scale, you need the right mix of automation, structure, and interpretability.
TrueState supports a powerful set of text analytics tools, all accessible directly within the Pipeline canvas. These tools let you enrich, classify, tag, and summarise unstructured text—without writing code or managing external models.
This guide explains each of the supported methods, when to use them, and how to configure them inside a pipeline.
Supported methods:
- Automations from Pipelines – Flexible LLM chains, scraping, and external enrichment
- High-volume LLM inference – Fast summarisation and enrichment
- Text Classification – Direct label assignment using fine-tuned classifiers
- Universal Classification – Single-label classification using logic-based statements
- Tagging – Multi-label classification using pipe-separated tags
- Hierarchy Classification – Tree-based, MECE-aligned label selection
1. Calling automations from pipelines
For advanced or multi-step enrichment, you can call a full automation from within a pipeline.
Using the Automation step, you can run any automation once per record. Each input is passed through the automation, and the result is appended to the dataset.
This allows you to combine:
- Large Language Model (LLM) chains
- Web scraping
- Conditional logic
- External APIs
For more on how to build these, see the Automations guide.
Use cases:
- Summarising documents using GPT-4 or Claude
- Enriching lead records with web-sourced company metadata
- Running multi-hop reasoning on product reviews
Use Automations when you need control, orchestration, or hybrid workflows. If you only need high-throughput enrichment, prefer LLM inference.
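Since the Automation step is configured in the Pipeline canvas rather than in code, the following is only an illustrative sketch of its per-record semantics: every row is passed through the automation once, and the result is appended as a new column. The names `run_automation` and `records` are hypothetical stand-ins, not TrueState APIs.

```python
# Hypothetical sketch of the Automation step's per-record behaviour.
# run_automation() stands in for a full automation chain
# (LLM calls, scraping, external APIs); here it is a trivial placeholder.

def run_automation(record: dict) -> str:
    """Stand-in for an automation workflow applied to one record."""
    return record["text"].upper()  # placeholder enrichment

records = [{"text": "summarise this review"}, {"text": "enrich this lead"}]

# The step runs the automation once per record and appends the result.
for record in records:
    record["automation_result"] = run_automation(record)
```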
2. High-volume LLM inference
The LLM Inference step uses small, efficient models to enrich text quickly and cost-effectively. It’s ideal when you need structured output across thousands of records.
Capabilities include:
- Summarising product descriptions
- Extracting entities or topics
- Rewriting or simplifying text
- Light classification or tone detection
Use cases:
- Creating short-form summaries for a UI
- Extracting country and company mentions from survey responses
- Rephrasing raw notes into business-ready summaries
This step is optimised for performance and throughput—not deep reasoning or chain-of-thought logic.
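The pattern the LLM Inference step implements is simple: apply one prompt to every row and collect structured output. In this sketch, `tiny_llm` is a hypothetical stand-in for the small, efficient model the step actually uses; here it is a trivial keyword extractor so the row-wise flow is visible.

```python
# Hedged sketch: tiny_llm() stands in for a small, high-throughput model.
# The real step is a no-code pipeline node; this only shows the shape of
# "one prompt, many rows, structured output".

def tiny_llm(prompt: str, text: str) -> str:
    """Stand-in for a fast model; here a toy country-mention extractor."""
    countries = {"France", "Germany", "Japan"}
    found = [w.strip(".,") for w in text.split() if w.strip(".,") in countries]
    return ", ".join(found)

rows = ["Our office in France is growing.", "No mentions here."]
enriched = [tiny_llm("Extract country mentions", row) for row in rows]
```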
3. Text classification
Text Classification allows you to assign a label to each record by choosing from a predefined set of categories. It uses pre-trained or fine-tuned models behind the scenes to select the most appropriate label from your list.
Unlike Universal Classification or Tagging, this approach doesn’t require logic statements—it simply learns the mapping from text to labels based on examples or embeddings.
Use cases:
- Categorising feedback into predefined themes (e.g., UI, Pricing, Support)
- Assigning sentiment categories: Positive, Neutral, Negative
- Labeling intent in form submissions or queries
Configuration:
- You provide a list of possible labels (e.g., “Bug”, “Feature Request”, “General Inquiry”)
- The model selects the best match for each row
Use Text Classification when you already have a clear list of categories and don’t need explanation logic or flexible tagging.
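Conceptually, the step scores each label in your list against the input and returns the best match. The sketch below uses a toy keyword score as a stand-in for the model's learned similarity; the label list mirrors the example above, and everything else is an assumption for illustration.

```python
# Illustrative only: a keyword count stands in for the pre-trained or
# fine-tuned model's scoring. The real step needs no code, just a label list.

LABELS = {
    "Bug": ["crash", "error", "broken"],
    "Feature Request": ["add", "wish", "could you"],
    "General Inquiry": ["how", "what", "when"],
}

def classify_text(text: str) -> str:
    text_l = text.lower()
    # Score every candidate label, then return the best match for the row.
    scores = {label: sum(kw in text_l for kw in kws)
              for label, kws in LABELS.items()}
    return max(scores, key=scores.get)

label = classify_text("The app crashed with an error on startup")
```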
4. Universal classification
Universal classification uses a Natural Language Inference (NLI) model to classify a text input based on a statement you provide.
You define a set of labels, each paired with a statement. The model determines whether that statement is entailed by the input. If so, the corresponding label is applied.
Example:
- Input: “The customer asked for a refund after receiving a broken item.”
- Statement: “This message is a complaint.”
- Result: entailed → assign label "Complaint"
Use cases:
- Intent detection in messages or tickets
- Filtering for eligibility criteria in open-ended responses
- Auto-labeling short texts for downstream filtering
Write statements as plain-English factual assertions. Avoid ambiguous or compound phrasing.
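The label-to-statement flow can be sketched as below. Note that `entails()` is a hypothetical stand-in: the real step runs an NLI model that scores entailment, whereas this stub just checks cue words so the control flow is concrete.

```python
# Hedged sketch of universal classification. entails() is NOT a real NLI
# model, only a cue-word stub illustrating statement-based labelling.

def entails(text, statement):
    """Stand-in NLI check; a real model scores entailment probability."""
    cues = {"This message is a complaint.": ("refund", "broken", "unhappy")}
    return any(cue in text.lower() for cue in cues.get(statement, ()))

labels = {"Complaint": "This message is a complaint."}

def classify_nli(text):
    # Apply the label whose statement is entailed by the input.
    for label, statement in labels.items():
        if entails(text, statement):
            return label
    return None

result = classify_nli(
    "The customer asked for a refund after receiving a broken item.")
```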
5. Tagging (criteria-based)
Tagging is a multi-label extension of universal classification. Instead of assigning just one label, you define a set of tags, each with one or more criteria statements. If any statement is entailed by the input, the tag is applied.
Multiple tags can be assigned per row. They are returned as a pipe-separated ("|") string.
Example output: "Complaint|Urgent|Refund"
Use cases:
- Flagging multiple concerns in a support transcript
- Annotating user feedback with multiple themes
- Extracting overlapping topics from interviews
How to define:
Each tag is defined as:
TagName | Statement
Multiple criteria can be attached to the same tag.
Use tagging when you want broad annotation across multiple dimensions. For single-label classification, use universal classification instead.
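The any-criterion-matches rule and the pipe-separated output can be sketched as follows. As in the previous section, `entailed()` is a hypothetical cue-word stub standing in for the real NLI model.

```python
# Illustrative tagging sketch: each tag carries one or more criteria
# statements; if any is entailed, the tag applies. entailed() is a stub.

TAGS = {
    "Complaint": ["This message is a complaint."],
    "Urgent": ["This message is time-sensitive."],
    "Refund": ["This message mentions a refund."],
}

def entailed(text, statement):
    """Stand-in entailment check keyed on cue words, not a real model."""
    cue_map = {
        "This message is a complaint.": "broken",
        "This message is time-sensitive.": "immediately",
        "This message mentions a refund.": "refund",
    }
    return cue_map[statement] in text.lower()

def assign_tags(text):
    hits = [tag for tag, stmts in TAGS.items()
            if any(entailed(text, s) for s in stmts)]
    return "|".join(hits)  # multiple tags, pipe-separated

tags = assign_tags("Item arrived broken; please process a refund immediately.")
```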
6. Hierarchy classification
Hierarchy classification is for structured, multi-level label selection. You define a hierarchy of labels grouped by level; the model scores the entailment statement for each label at a level and selects the highest-scoring peer before descending to the next level.
This approach ensures mutually exclusive decisions at each level of a hierarchy.
Key rule: Labels at each level must be MECE (Mutually Exclusive, Collectively Exhaustive).
Example structure:
Level 1
- Product Feedback: “This message is about the product.”
- Support Request: “This message is asking for help.”
- General Comment: “This message is general commentary.”
Level 2 (under Product Feedback)
- Pricing Concern: “The message discusses the product’s pricing.”
- Feature Request: “The message asks for a new product feature.”
If a record is classified as “Product Feedback” at Level 1, it will be evaluated against the Level 2 options. Among any peer group, only the highest-scoring label is selected.
Use cases:
- Classifying tickets into department → topic → subtopic
- Routing forms through a business process hierarchy
- Multi-level content categorisation
Use hierarchy classification when your label set is nested or tree-structured. Ensure each group at a level has no overlaps in definition.
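The descent logic can be sketched as a recursive walk over the tree: score the peers at the current level, keep only the winner, and recurse into its children. `score()` is a hypothetical stand-in for the NLI model's entailment scores; the hierarchy mirrors the example structure above.

```python
# Hedged sketch of hierarchy classification: highest-scoring peer wins at
# each level, then descend. score() is a cue-word stub, not a real model.

HIERARCHY = {
    "Product Feedback": {"Pricing Concern": {}, "Feature Request": {}},
    "Support Request": {},
    "General Comment": {},
}

def score(text, label):
    """Stand-in scores; a real NLI model rates each label's statement."""
    cues = {
        "Product Feedback": "product",
        "Support Request": "help",
        "General Comment": "",
        "Pricing Concern": "price",
        "Feature Request": "feature",
    }
    cue = cues[label]
    return 1.0 if cue and cue in text.lower() else 0.0

def classify_hierarchy(text, level=HIERARCHY, path=()):
    if not level:  # leaf reached: return the selected path
        return list(path)
    best = max(level, key=lambda label: score(text, label))
    return classify_hierarchy(text, level[best], path + (best,))

path = classify_hierarchy("The product's price is too high for our team.")
```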
Choosing the right text analytics method
| Goal | Recommended Step | Notes |
|---|---|---|
| Flexible, high-quality enrichment | Automation step | Use for chains, scraping, or external APIs |
| Fast enrichment at scale | LLM Inference step | Best for summarisation and extraction |
| Simple multi-class prediction | Text Classification | Use when categories are known and unambiguous |
| Logic-based single-label classification | Universal Classification | Uses NLI to match statements |
| Multi-label annotation | Tagging step | Flexible tagging using pipe-separated output |
| Tree-based classification | Hierarchy Classification | Supports nested taxonomies with MECE label logic |
Glossary
- Automation step – Executes a full automation workflow for each row in a dataset.
- LLM Inference step – Applies fast, high-throughput language models to text.
- Text Classification – Assigns a best-match label from a list without needing logic statements.
- Universal Classification – Uses NLI to assign a single label based on entailed logic.
- Tagging – Assigns multiple tags using matching statements and pipe-separated outputs.
- Hierarchy Classification – Selects labels across multiple levels with MECE structure.
- MECE – A classification principle: Mutually Exclusive, Collectively Exhaustive.
Best practices
- Write clear, concise, and specific statements for classification
- Use Text Classification when labels are stable and fixed
- Don’t overload tagging steps—group by theme when possible
- Validate performance on a small batch before scaling
- Use Automations for advanced logic, but monitor cost and latency
Next steps
- Go to the Pipeline section in TrueState
- Upload a dataset with one or more text columns
- Add the appropriate text analytics node to your pipeline
- Configure using natural language, label lists, or tag templates
- Combine with downstream classification, enrichment, or reporting steps