Classification, the systematic grouping of observations into categories, is one of the core functions of artificial intelligence. For example, an AI system can analyze customer reviews of a restaurant based on the language and tone of the text and sort them into positive or negative. This capability forms the bedrock of automated analysis and processing, consolidating data so an organization can refine its business strategies.
Traditional classification methods necessitate extensive training and large labeled data sets to produce an algorithm that classifies accurately. This poses a challenge for companies with tight budgets and time constraints. Zero-shot classification (ZSC) has emerged as a ground-breaking technique that uses pre-trained large language models (LLMs) to categorize text data without a dedicated training phase.
Positives of zero-shot classification
Time-efficient results
ZSC draws upon the existing knowledge within a pre-trained large language model (LLM) to classify text without additional samples. LLMs have an inherent grasp of semantics and context, enabling them to skip the initial data analysis, data cleansing, class identification, and model training that the traditional classification process requires. As a result, the selected model is both accurate and efficient, delivering results quickly.
Higher scalability
The versatility of ZSC techniques allows for agility in adapting to changing classification needs, significantly increasing efficiency and scalability. Unlike traditional methods that require the model to be retrained when additional classes are needed, ZSC allows for modifying prompts fed into the LLM, enabling immediate use of new categories. For example, a transition from two-class classification (positive/negative) to three-class classification (positive/negative/neutral) can be smoothly accommodated with ZSC, increasing process productivity.
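Because the candidate labels live in the prompt rather than in trained model weights, moving from two classes to three is a one-argument change. A minimal sketch of this idea, where `build_prompt` is an illustrative helper rather than a specific product API:

```python
def build_prompt(text: str, labels: list[str]) -> str:
    """Construct a zero-shot classification prompt for an LLM.

    The candidate labels are embedded in the prompt itself, so
    changing the label set requires no retraining, only a new call.
    """
    label_str = ", ".join(labels)
    return (
        f"Classify the following review into one of these categories: "
        f"{label_str}.\n\nReview: {text}\nCategory:"
    )

# Two-class setup ...
prompt_v1 = build_prompt("The pasta was superb!", ["positive", "negative"])
# ... extended to three classes with no retraining step.
prompt_v2 = build_prompt(
    "The pasta was superb!", ["positive", "negative", "neutral"]
)
```

The same text can be re-classified under the new scheme immediately, which is exactly the agility a retrain-based workflow lacks.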
Mitigated data privacy risks
ZSC doesn’t generate free-form text out of the box. Instead, the LLM is provided with the text to be classified along with a list of labels, and it returns each label with an associated probability. Provided appropriate data and suitable classes are supplied, this approach is deemed low risk regarding data privacy compared to traditional methods, as it does not require the utilization of specific, potentially sensitive data for model training.
The advantages of ZSC have unlocked an exciting spectrum of capabilities across various industries. ZSC has the potential to alter how organizations handle large volumes of data and efficiently categorize information. As it continues to develop, it’s likely to be adopted across various industries to enhance efficiency and productivity.
Proliferation of applications
Businesses often deal with a vast array of documents from different departments. ZSC enables efficient sorting and categorization without the need for custom-trained models. In the medical field, it can help identify health conditions in patient records, improving the accuracy and speed of diagnoses.
LLMs can comprehend context and keep pace with rapidly changing trends. By incorporating ZSC models, it becomes possible to detect offensive language and content with remarkable precision. This innovative solution suits social media platforms where content moderation is paramount. Often labor-intensive and costly, data labeling can also be transformed with ZSC: large volumes of unlabeled data can be automatically labeled with high confidence based on probability scores, saving time and resources.
Additionally, ZSC can improve efficiency in search engines by automatically classifying a user’s text into specific categories, thereby reducing the number of results the user needs to review and making the process more streamlined and user-friendly.
Compounding the classification potential of ZSC using APIs
Models such as ChatGPT, GPT-3, and Facebook’s BART make text classification tasks straightforward and efficient, and their capabilities can be leveraged for Zero-Shot Classification (ZSC) through their APIs. We have developed two varieties of APIs for an enhanced user experience: single-sentence classification and bulk classification.
The single-sentence classification API enables users to classify individual or small sets of sentences into specified classes by providing the text and class names. In response, the API returns classification results along with associated probabilities.
The bulk classification API for larger data sets can simultaneously process thousands of data points. This API is useful for categorizing large volumes of unlabeled data stored in Excel or CSV formats. Users upload their data file, specify the possible classes, and submit the request. The API then generates an output file with assigned labels and additional information like confusion matrices and accuracy metrics to evaluate the model’s performance.
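The bulk flow can be pictured as: read the uploaded file, score each row against the supplied classes, and write an output file with the assigned labels. A simplified sketch, where `classify` is a keyword-based stand-in for the real LLM call:

```python
import csv
import io

def classify(text: str, labels: list[str]) -> str:
    """Stand-in for the LLM call; a real deployment queries the model."""
    return labels[0] if "great" in text.lower() else labels[1]

def bulk_classify(csv_text: str, labels: list[str]) -> str:
    """Read a CSV with a 'text' column and append a 'label' column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["text", "label"])
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {"text": row["text"], "label": classify(row["text"], labels)}
        )
    return out.getvalue()

result = bulk_classify(
    "text\nGreat food\nSlow service\n", ["positive", "negative"]
)
```

The real service additionally attaches evaluation artifacts such as confusion matrices and accuracy metrics to the output, which the sketch omits.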
A layered API architecture makes these abilities possible, performing multiple functions that include submitting data to and receiving responses from the chosen LLM.
Decoding the API architecture
The ZSC API takes four main inputs: the text data or file, the possible classes, a multi-level flag, and a model name. The input layer sits at the forefront as the entry point for user data. The multi-level flag, which can be toggled between true and false, allows a sentence to be classified into multiple categories. The model name parameter designates the specific model used for classification, for example, Facebook’s BART or GPT-3.
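These four inputs map naturally onto a small request schema. The field names and default below are illustrative assumptions, not the service’s actual contract:

```python
from dataclasses import dataclass

@dataclass
class ZSCRequest:
    """The four main inputs to a ZSC API (field names are illustrative)."""
    text: str                 # text data (or a file reference for bulk jobs)
    classes: list[str]        # possible classes to assign
    multi_level: bool = False # allow assignment to multiple categories
    model_name: str = "facebook/bart-large-mnli"  # e.g., BART or GPT-3

req = ZSCRequest(text="The soup was cold.", classes=["positive", "negative"])
```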
At the front end, the model comprises four layers. The input layer serves as the initial stage where users input their data. Next, the data is preprocessed, which converts unstructured data into a structured format when necessary and prepares it for the LLMs. Data is passed to LLMs in the third layer, generating output responses. Finally, the fourth layer involves post-processing the results, analyzing them, and returning the information to the user.
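The four layers above can be sketched as a simple function pipeline. Here `call_llm` is a hard-coded stand-in for the third layer, since the real model call depends on the chosen LLM:

```python
def input_layer(raw: str) -> str:
    # Layer 1: entry point for user data.
    return raw

def preprocess(text: str) -> str:
    # Layer 2: convert unstructured input into a clean, structured form.
    return " ".join(text.split()).strip()

def call_llm(text: str, labels: list[str]) -> dict[str, float]:
    # Layer 3: stand-in for the LLM call that scores each label.
    return {label: (0.9 if label == "positive" else 0.1) for label in labels}

def postprocess(scores: dict[str, float]) -> str:
    # Layer 4: analyze the scores and return the top label to the user.
    return max(scores, key=scores.get)

clean_text = preprocess(input_layer("  Loved   it!  "))
label = postprocess(call_llm(clean_text, ["positive", "negative"]))
```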
Metrics and accuracy
For performance analysis of ZSC, predicted classes are compared with labeled data. Accuracy metrics, such as F1 scores, confusion matrices, and classification reports, comprehensively assess the model’s effectiveness.
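These metrics need no heavy tooling; for a single positive class they reduce to counts of true/false positives and negatives. A self-contained sketch on a toy label set (the example labels are illustrative):

```python
from collections import Counter

def binary_metrics(y_true: list[str], y_pred: list[str], positive: str) -> dict:
    """Accuracy, F1, and a 2x2 confusion matrix for one positive class."""
    counts = Counter(zip(y_true, y_pred))
    tp = counts[(positive, positive)]
    fp = sum(v for (t, p), v in counts.items() if p == positive and t != positive)
    fn = sum(v for (t, p), v in counts.items() if t == positive and p != positive)
    tn = sum(counts.values()) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "f1": f1,
        "confusion": [[tp, fn], [fp, tn]],
    }

metrics = binary_metrics(
    ["pos", "neg", "pos", "pos"], ["pos", "neg", "neg", "pos"], positive="pos"
)
```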
LLMs can often classify text effectively because they are exposed to vast amounts of data and patterns. However, accuracy can drop when dealing with specialized domain data. For example, when tested on medical data, the BART model achieved around 85% accuracy, while GPT-3 performed slightly better with an accuracy of approximately 92–93%.
However, hallucination in ZSC is minimal, as text data is classified into distinct categories, each with an associated probability. Post-processing and threshold values for classification probabilities can also be helpful in this regard. Most importantly, having unambiguous, clean, and distinct classes reduces the likelihood of hallucination. It is expected that as generative AI models improve, their ability to handle complex and unstructured data will increase, leading to better accuracy for challenging classification tasks.
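A threshold on the classification probability is straightforward to apply in post-processing: accept the top label only when the model is sufficiently confident, and route everything else for review. A sketch, with an assumed cutoff of 0.7 and a hypothetical "needs_review" bucket:

```python
def apply_threshold(probs: dict[str, float], threshold: float = 0.7) -> str:
    """Accept the top label only when its probability clears the threshold;
    otherwise flag the example for review instead of guessing."""
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else "needs_review"

confident = apply_threshold({"positive": 0.92, "negative": 0.08})
ambiguous = apply_threshold({"positive": 0.55, "negative": 0.45})
```

The right threshold is a business decision: a stricter cutoff trades coverage for precision.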
Current challenges
One of the biggest challenges with using LLMs for ZSC is the limit of 4,000 tokens per API request in models like GPT. Facebook’s BART also has capacity constraints, as it can only handle up to 10 classes for classification.
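A common workaround for the token ceiling is to split long inputs into chunks before submission. A crude sketch, where whitespace-separated words stand in for real tokens; production code would count tokens with the model’s own tokenizer:

```python
def chunk_by_tokens(text: str, max_tokens: int = 4000) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace tokens.

    Whitespace splitting is a rough stand-in: real token counts come
    from the model's tokenizer and differ from word counts.
    """
    words = text.split()
    return [
        " ".join(words[i : i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]

chunks = chunk_by_tokens("word " * 9000, max_tokens=4000)
```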
It therefore becomes crucial to pre-process data and engineer prompts while maintaining context and semantics to ensure the effective processing and classification of the input. As generative AI models develop, they are expected to integrate more parameters, leading to heightened accuracy. In particular, the anticipated arrival of models such as GPT-4 is expected to yield unprecedented levels of power and explainability, revolutionizing the AI industry.
Widening the horizons of text classification to multimedia
At Fractal, we are narrowing our focus on natural language processing for our current text-based input classification. However, the potential for AI models to develop multimodal capabilities means that we anticipate a future where different data types, such as images or other media formats, may be classified using similar techniques.
Image classification is an established field that can be enhanced with the potential of Zero-shot Classification (ZSC). Extending ZSC to media forms beyond text could unlock new possibilities for AI-driven classification across multiple domains. We are excited to explore these prospects as we venture into the broader AI and machine learning landscape.