EMBEDDED TUTORIAL ON CODE-MIXING IN SOCIAL MEDIA

 

Abstract: Code-mixing, or the mixing of more than one language in a single conversation or utterance, is a common phenomenon in any multilingual society. India's extreme multilinguality makes code-mixing especially common in social media content posted in Indian languages (IL) and by Indian users. In this tutorial, we will discuss why code-mixing is, on the one hand, a computational challenge that must be solved to process IL content effectively and, on the other hand, a valuable linguistic resource for studying several allied phenomena. The tutorial will also introduce some basic NLP techniques for code-mixed data.


TUTORIAL COORDINATORS

Monojit Choudhury

Microsoft Research Lab India

Kalika Bali

Microsoft Research Lab India