Goal
Derive meaningful insights and knowledge from large volumes of text data.
Role
I developed Python and Spark based tools to recognise, process and summarise unstructured text data.
Capability
Highly customisable toolkit for rapid, high-quality exploration of text patterns:
- Profile unstructured text data by recognising and quantifying either all text strings or user-defined ones.
- Capture context variations around text patterns of interest.
- Group semantically related patterns at scale, so that patterns in the text are clearer and conclusions more accurate.
- Produce analytics-ready summarised data.
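The capabilities above can be illustrated with a minimal sketch. This is not the toolkit itself: it uses plain Python rather than Spark so that it runs anywhere, and the sample records, the pattern-to-group mapping and the `profile` function are all hypothetical names invented for illustration. It matches user-defined regex patterns against each word, records the words around each match, rolls related patterns up into one semantic group, and reports each group's relative share.

```python
import re
from collections import Counter

# Hypothetical sample records; real input would be a large text column.
RECORDS = [
    "Payment failed due to card declined at checkout",
    "payment was declined by the bank",
    "Refund issued after payment failed",
]

# Hypothetical user-defined patterns, each mapped to a semantic group label.
PATTERN_GROUPS = {
    r"declined": "payment_rejected",
    r"failed": "payment_rejected",
    r"refund\w*": "refund",
}


def profile(records, pattern_groups, window=2):
    """Count matches per group and collect the words around each match."""
    group_counts = Counter()
    contexts = Counter()
    for text in records:
        tokens = text.lower().split()
        for i, token in enumerate(tokens):
            for pattern, group in pattern_groups.items():
                if re.fullmatch(pattern, token):
                    group_counts[group] += 1
                    # Keep up to `window` words on each side as context.
                    left = " ".join(tokens[max(0, i - window):i])
                    right = " ".join(tokens[i + 1:i + 1 + window])
                    contexts[(group, left, token, right)] += 1
    return group_counts, contexts


group_counts, contexts = profile(RECORDS, PATTERN_GROUPS)

# Relative scale: each group's share of all matches.
total = sum(group_counts.values())
shares = {group: count / total for group, count in group_counts.items()}

for group, count in group_counts.most_common():
    print(f"{group}: {count} matches ({shares[group]:.0%})")
```

In a Spark setting the same idea would typically be expressed as column operations over a DataFrame rather than a Python loop; the loop is used here only to keep the sketch self-contained.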
Tech Stack
Regex, Spark, Python, Big Data
FROM:
Polluted + unnecessarily diluted view of text patterns

TO:
Contextual focus + meaningful relative scale

Year
2022