We address the problem of computational sarcasm detection in natural language texts.
Understanding sarcasm requires contextual information, such as commonsense facts and cultural elements. Treating sarcasm as a linguistic phenomenon, we distinguish between two types. Encoded sarcasm occurs when the emitter of a statement specifically intends to be sarcastic. Decoded sarcasm occurs when the receptor perceives the statement as sarcastic, irrespective of the emitter's intention.
We believe that previous attempts at building annotated datasets for sarcasm detection are suboptimal because they do not account for the contextual nature of sarcasm. This gives us reason to doubt the effectiveness of existing models built upon those datasets. We aim to build a gold standard dataset by collecting examples of encoded sarcasm and examining how receptors of varying backgrounds perceive them. We also aim to train context-aware machine learning models on this dataset and to test our hypothesis, namely that accounting for context improves sarcasm detection, by evaluating the relative benefit of using these models in sentiment analysis systems.