The nature of humour is elusive. It is at once universal and yet highly subjective. In terms of natural language processing (NLP), humour makes many common problems in the field, such as word sense disambiguation, much more difficult by requiring multiple senses for certain key words. In spite of this complexity, Computational Humour aims to enable computers to understand humour in the same way that humans do. Stock set out its goals as follows:
“A computational humour system should be able to: recognize situations appropriate for humour; choose a suitable kind of humour for the situation; generate an appropriately humorous output; and, if there is some form of interaction or control, evaluate the feedback.”
This definition highlights a key distinction in the field: between humour detection and generation. The former, which is the focus of this proposal, concerns itself with determining which utterances are jokes and evaluating how good these jokes are. It is widely deemed to be the foundation on which robust humour generation will be built.
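Humour detection is typically framed as binary text classification: given an utterance, decide whether it is a joke. As a minimal illustrative sketch only (not the method proposed here), a bag-of-words baseline might look like the following; the example texts and labels are invented for demonstration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy examples: 1 = joke, 0 = non-joke.
texts = [
    "Why did the chicken cross the road? To get to the other side.",
    "I told my computer a joke, but it didn't get the byte.",
    "What do you call a fish with no eyes? A fsh.",
    "The meeting is scheduled for 3 pm on Thursday.",
    "Please submit the quarterly report by Friday.",
    "The train departs from platform four at noon.",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag-of-words features feeding a linear classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Predict on an unseen utterance; output is a 0/1 label.
prediction = int(model.predict(["Why did the train cross the road?"])[0])
```

Real systems use far richer features and models, but the task framing, mapping utterances to a joke/non-joke label, is the same.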
Computational humour detection has several applications. It may be instrumental to automatic content moderation on social media, or to facilitating human-computer interaction in a warmer, more human-like way. However, one facet of humour detection which has not yet been addressed is the line between funny and offensive.
Intergenerational or intercultural joke-telling is often hampered by the fact that what one party deems humorous, the other may find offensive. This project addresses that gap by reannotating an existing dataset and creating a new dataset to explore the factors which affect where a listener places a joke on the humour-offensiveness continuum.
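Placing a joke on such a continuum implies aggregating per-annotator judgements. As a minimal sketch, assuming each annotator rates a joke for humour and offensiveness separately on a 1-5 scale (the scale, the rating tuples, and the signed-difference summary are assumptions for illustration, not the proposal's annotation scheme):

```python
from statistics import mean

# Hypothetical ratings for one joke from four annotators,
# as (humour, offensiveness) pairs on an assumed 1-5 scale.
ratings = [(4, 2), (5, 1), (3, 4), (4, 2)]

humour_score = mean(r[0] for r in ratings)   # mean humour rating
offence_score = mean(r[1] for r in ratings)  # mean offensiveness rating

# One simple summary of position on the continuum:
# positive = judged more funny than offensive, negative = the reverse.
position = humour_score - offence_score
```

Disagreement between annotator groups (e.g. by generation or culture) on exactly these scores is what the project's reannotation would make visible.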