SciPy India 2013

Topic

∂ͻrypt: A novel text categorization and classification system

Session Summary

We introduce dCrypt, a novel machine learning algorithm for categorizing and classifying unstructured text, together with its Python implementation. dCrypt refines the classic n-gram-based text categorization algorithm of Cavnar and Trenkle (Proc. SDAIR ’94) by constructing label-dependent n-gram profiles and applying label-dependent, multi-criteria feature selection. The implementation is fully featured, and a carefully designed application interface and modular Python components abstract the algorithmic details away from the user. A tuple-based dictionary design allows sparse storage of large vocabulary sets and efficient querying. Where scale prevents in-memory operation, we demonstrate the use of external key-value stores such as Redis and discuss their pros and cons vis-a-vis SQLite. We discuss several applications of the algorithm, including automated mapping of product attributes from unstructured product descriptions and prediction of insurance and credit card fraud from case descriptions. We also discuss extensions of the algorithm to unsupervised and semi-supervised learning, and its adaptation to the MapReduce framework.
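For readers unfamiliar with the baseline that the summary builds on, the following is a minimal sketch of Cavnar and Trenkle style n-gram profile classification with one rank-ordered profile per label. It is not the dCrypt implementation: the summary does not specify dCrypt's refinements or feature selection criteria, so none are shown. The function names, the underscore padding, the profile size of 300, and the toy training data are all assumptions chosen for illustration.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=5):
    """Yield character n-grams of each whitespace-separated token.

    Tokens are lowercased and padded with one '_' on each side
    (a simplification of the padding used in the original paper).
    """
    for token in text.lower().split():
        padded = "_" + token + "_"
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                yield padded[i:i + n]


def build_profile(texts, top_k=300):
    """Return {ngram: rank} for the top_k most frequent n-grams in texts."""
    counts = Counter()
    for text in texts:
        counts.update(char_ngrams(text))
    # Most frequent first; ties broken alphabetically so ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {gram: rank for rank, (gram, _count) in enumerate(ranked)}


def out_of_place(doc_profile, label_profile):
    """Sum of rank differences between a document and a label profile.

    An n-gram missing from the label profile costs a fixed maximum
    penalty (here, the size of the label profile; an assumed choice).
    """
    max_penalty = len(label_profile)
    total = 0
    for gram, rank in doc_profile.items():
        if gram in label_profile:
            total += abs(rank - label_profile[gram])
        else:
            total += max_penalty
    return total


def classify(text, label_profiles, top_k=300):
    """Return the label whose profile is closest to the text's profile."""
    doc_profile = build_profile([text], top_k)
    return min(
        label_profiles,
        key=lambda label: out_of_place(doc_profile, label_profiles[label]),
    )


# Hypothetical toy data: product descriptions grouped by label.
training = {
    "electronics": [
        "wireless bluetooth headphones with noise cancelling",
        "usb charging cable for smartphone",
    ],
    "apparel": [
        "cotton shirt with round neck",
        "slim fit denim jeans",
    ],
}
label_profiles = {label: build_profile(texts) for label, texts in training.items()}
print(classify("bluetooth headphones with charging cable", label_profiles))
```

Storing each profile as a plain dictionary from n-gram to rank keeps lookups constant time and holds only the n-grams that actually occur, which is in the same spirit as the sparse dictionary storage mentioned above.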

About the speaker

Ruchir is a Data Scientist and holds a B.Tech. from IIT Kanpur. He develops solutions for unstructured and semi-structured text.

Event Date

December 15, 2013

Speaker

Ruchir Gupta

Data Scientist