Check out my work on GitHub: @bhairavi-m
Lexical Normalization System 💬
Social media is one of the most common sources of data for NLP. Its main challenge is that the text rarely follows standard spelling and grammar, as it is filled with short forms and colloquial substitutes. The project’s goal was to develop a Lexical Normalization system that converts non-standard text into a ready-to-use standard register, enabling efficient information extraction.
The process involved experimenting with data augmentation methods and implementing baselines such as Maximum Frequency Replacement.
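To make the baseline concrete, here is a minimal sketch of how a most-frequent-replacement lookup could work. It is not the project's actual code: the function names (`train_mfr`, `normalize`) and the toy training pairs are assumptions made for illustration. The idea is to count, for each raw token, which normalized form it was most often mapped to in annotated data, and to leave unseen tokens unchanged.

```python
from collections import Counter, defaultdict


def train_mfr(pairs):
    """Build a lookup table: raw token -> its most frequent normalized form.

    `pairs` is an iterable of (raw_token, normalized_token) tuples,
    e.g. taken from token-aligned annotated data (hypothetical format).
    """
    counts = defaultdict(Counter)
    for raw, norm in pairs:
        counts[raw.lower()][norm] += 1
    # Keep only the single most frequent replacement per raw token.
    return {raw: c.most_common(1)[0][0] for raw, c in counts.items()}


def normalize(tokens, table):
    """Replace each token with its learned form; unseen tokens pass through."""
    return [table.get(tok.lower(), tok) for tok in tokens]


# Toy training pairs (invented for this example).
pairs = [
    ("u", "you"), ("u", "you"), ("u", "u"),
    ("gr8", "great"),
    ("2", "to"), ("2", "to"), ("2", "2"),
]
table = train_mfr(pairs)
print(normalize(["c", "u", "2moro", "gr8"], table))
```

Tokens such as "c" and "2moro" stay as they are because they never appear in the toy training pairs, which is the main limitation of this kind of baseline.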
Session-based Skip Prediction: Spotify 🎹
Spotify is a leading music service fuelled by personalization and music knowledge, both driven by algorithms that model how users sequentially interact with music. This project focussed on the task of session-based sequential skip prediction, i.e., predicting whether a user will skip a track, given their immediately preceding interactions in the listening session.
This project was an exciting supervised machine learning classification problem. It helped us understand intricate behavioral patterns of how users engage with tracks and which track features play an important role in prediction.
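As a rough illustration of how a listening session can be framed as a supervised classification problem, the sketch below turns one session into (history, label) examples and scores a trivial persistence baseline. This is not the model used in the project: the session format, the `window` parameter, and the function names are assumptions made for the example.

```python
def make_examples(session, window=2):
    """Turn one session into (history, label) pairs.

    `session` is a list of (track_id, skipped) tuples in listening order
    (hypothetical format). The history holds the skip flags of the
    `window` preceding tracks; the label is the current track's skip flag.
    """
    examples = []
    for i in range(window, len(session)):
        history = [int(skipped) for _, skipped in session[i - window:i]]
        label = int(session[i][1])
        examples.append((history, label))
    return examples


def persistence_baseline(history):
    """Predict that the user repeats their most recent action."""
    return history[-1]


# Toy session (invented for this example).
session = [("t1", False), ("t2", True), ("t3", True), ("t4", False), ("t5", False)]
examples = make_examples(session, window=2)

correct = sum(persistence_baseline(h) == y for h, y in examples)
print(f"{correct} of {len(examples)} predictions correct")
```

A real classifier would replace `persistence_baseline` and would also use track features, but the example construction stays the same: inputs come only from interactions that precede the track being predicted.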