Teneo Offensive Language Detector NL Analyzer

Introduction

The Teneo Offensive Language Detector is a proprietary Teneo AI AB resource containing off-the-shelf building blocks for detecting abusive and controversial language in user inputs.

Abusive and controversial inputs are a necessary evil that NLI systems need to deal with. By abusive language we mean severe insults, profanity, violent threats, etc., while other subject areas, such as suicide, abortion or a death in the family, may be deemed too controversial or sensitive to go into. It is a good strategy for a conversational AI system to show that such unprofessional language and subjects are not acceptable: a response that fails to address the language or subject in an appropriate way reflects poorly on the system and can alienate the user.

The Teneo Platform provides components for the detection of abusive inputs as well as controversial subject matter, for use in Listeners, Triggers or Transitions, enabling the system to respond as the situation requires.

Note that offensive language and negative sentiment are not identical concepts; abusive language or certain controversial inputs may well be flagged as negative sentiment, but there is no one-to-one correspondence. The Teneo Offensive Language Detector may therefore be seen as supplementing the Teneo Sentiment and Intensity NL Analyzers.

The Teneo Offensive Language Detector is distributed as a set of Language Objects available in the English Teneo NLU Ontology and Semantic Network.

Available languages

The Offensive Language Detector is currently available in the following language:

Language
English

Teneo Abusive Language Detector

Our vast experience in building conversational AI systems has unfortunately exposed us to a creative variety of abusive language. Many users attempt to test a system by seeing how it will respond to profanity, and some will even persist in misusing the system by bombarding it with abusive inputs only. On the other hand, if a system is not well-prepared in its topic area, users may become frustrated and switch to abuse. Abusive language occurs in many different situations and contexts. Our detection of abusive language uses the following classification:

  • Hate speech: racist or homophobic inputs, sexist language
  • Profanity: general profanities, the "f" word, etc.
  • Sexual harassment: sexual, pornographic inputs
  • Threats of violence: murder, physical abuse, etc.

By offering a detailed diagnosis of the type of abuse given to a system, we enable the developer to take appropriate type-specific actions.

One approach to abusive language in CAI systems is, for example, to give the user two warnings; if the user does not drop the abuse, they are given a "time out" in which no questions are answered until the user apologizes. If, after the apology, the user reverts to abusive language, the conversation is terminated and the session slot is made available for a serious user.
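
As an illustration, the following sketch (in Groovy, the scripting language used in Teneo solutions) models such an escalation policy. The variable names, the counter and the returned action labels are assumptions made for the example, not part of the resource:

    // Minimal sketch of the two-warnings escalation policy described above.
    // abuseCount and apologized are assumed to be session-scoped variables
    // maintained by the solution; both names are illustrative.
    String nextAction(int abuseCount, boolean apologized) {
        if (abuseCount <= 2) {
            return "WARN"        // first and second abusive input: warn the user
        }
        if (!apologized) {
            return "TIME_OUT"    // further abuse: stop answering until the user apologizes
        }
        return "END_SESSION"     // abuse after the apology: terminate the conversation
    }

    // The third abusive input before any apology triggers the time-out;
    // renewed abuse after an apology ends the session.
    assert nextAction(3, false) == "TIME_OUT"
    assert nextAction(4, true)  == "END_SESSION"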

Language Objects

Abusive language can be detected using the single Language Object ABUSE.INDICATOR. The object returns two NLU Variables: list objects containing the types of abuse detected together with a confidence level for each detection. In general, low confidence refers to a single questionable word, whereas high-confidence items tend to be phrases or specific unambiguous terms. The following table illustrates the values returned by the ABUSE.INDICATOR object.

NLU variable      Type        Possible values
lob.sAbuse        String[]    { "hatespeech" | "profanity" | "sexual" | "violence" }
lob.sConfidence   String[]    { "high" | "low" }
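
As an illustration, the Groovy sketch below shows how the two lists might be combined once they have been propagated into script variables. The names sAbuse and sConfidence, and the assumption that the two lists are index-aligned, are made for the example only:

    // Sketch: acting on ABUSE.INDICATOR results. sAbuse and sConfidence are
    // assumed to hold the values propagated from lob.sAbuse and lob.sConfidence,
    // with sConfidence[i] referring to the detection in sAbuse[i].
    def sAbuse      = ["profanity", "violence"]   // example values
    def sConfidence = ["high", "low"]             // example values

    // Keep only the detections that are reliable enough to act on.
    def reliable = []
    sAbuse.eachWithIndex { type, i ->
        if (sConfidence[i] == "high") {
            reliable << type
        }
    }
    assert reliable == ["profanity"]

    // Take a type-specific action, e.g. a firmer answer for threats of violence.
    if ("violence" in reliable) {
        // hand over to a Flow that warns about violent threats
    } else if (!reliable.isEmpty()) {
        // give a generic "please keep the conversation polite" answer
    }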

The following table gives an overview of the main Language Objects available, including the type-specific ones. Usually it is enough to use ABUSE.INDICATOR, but if only a specific type of abuse is of interest, the individual objects can be applied as well.

Abusive language object        Sample inputs
ABUSE.INDICATOR                -
ABUSE_HATESPEECH.INDICATOR     You son of a bitch / I hate n******
ABUSE_PROFANITY.INDICATOR      f*** you / kiss my ass
ABUSE_SEXUAL.INDICATOR         Suck my d*** / I want to sleep with you
ABUSE_VIOLENCE.INDICATOR       I’ll smash your face / I could kill you

Please note that a number of Language Objects are used internally by the resource to achieve a better structured approach to recognizing abuse, but these objects are not relevant for external use and will therefore not be documented here.

Since language is sometimes ambiguous and the areas of abuse are not always clearly delineated, a single abusive input may produce multiple determinations with mixed or equal confidence levels. This is not a bug but a feature.

Customization

If project-specific customizations are required, a local ABUSE.INDICATOR Language Object can be created; additional variations or categories can then be added as needed, following the same NLU Variable structure.

Teneo Controversial Language Detector

When building a conversational AI system, there are often concerns that the system may respond inappropriately when users broach sensitive off-topic areas. The input might be a test, but if a user says I want to kill myself or My mother just died, a flippant response can make the system look very bad. Other inputs might imply criminal activities and should never be responded to in ways that encourage the activity, e.g. Should I kill my husband? With the Controversial Language Detector, the solution developer can redirect such inputs to Flows that deal with them, as well as suppress certain Flows that would be inappropriate in sensitive contexts. As a result, the Controversial Language Detector shows end users that their inputs are taken seriously.

Language Objects

The detection of controversial language focuses on specific areas frequently encountered in past systems; the developer may use the Language Object CONTROVERSY.INDICATOR to detect all of the areas in question. Via an NLU Variable, the Language Object returns a list of all types of controversy detected in the input. Unlike ABUSE.INDICATOR, it does not distinguish between high and low confidence.

NLU variable       Type        Possible values
lob.sControversy   String[]    { "abortion" | "crime" | "death" | "fascism" | "sex" | "suicide" | "terrorism" }

Controversy language object      Sample inputs
CONTROVERSY.INDICATOR            -
CONTROVERSY_ABORTION.INDICATOR   Where can I get an abortion? / I want to end my pregnancy.
CONTROVERSY_CRIME.INDICATOR      Where can I hide a body? / I want to buy some LSD.
CONTROVERSY_DEATH.INDICATOR      My dog just died. / My grandmother just passed away.
CONTROVERSY_FASCISM.INDICATOR    Heil Hitler! / The Holocaust never happened.
CONTROVERSY_SEX.INDICATOR        I am looking for a prostitute. / I am a pedophile.
CONTROVERSY_SUICIDE.INDICATOR    I want to slit my wrists. / How can I kill myself?
CONTROVERSY_TERRORISM.INDICATOR  How do I build a bomb? / What happened on nine eleven?
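
As an illustration of routing on these categories, the Groovy sketch below assumes that the list from lob.sControversy has been propagated into a script variable named sControversy; the variable name and the routing comments are assumptions for the example:

    // Sketch: routing the conversation on the controversy categories detected.
    def sControversy = ["suicide"]   // example value from lob.sControversy

    if ("suicide" in sControversy) {
        // Redirect to a dedicated safety Flow, e.g. one offering helpline information.
    } else if (!sControversy.isEmpty()) {
        // Suppress Flows that would be inappropriate here and answer in a
        // neutral, serious tone instead.
    }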

Customization

Since the definition of controversial subjects varies from one project to the next, the CONTROVERSY_CUSTOM.PROJ Language Object has been developed, which developers can adjust to include new categories or variants of the existing ones. By default, the Language Object returns the categorization "customized" in lob.sControversy, but this category can and should be expanded depending on the content added.

To add customizations, create a local Language Object with the name CONTROVERSY_CUSTOM.PROJ, making sure to add the corresponding NLU Variable; the content from the Language Object in the Teneo NLU Ontology and Semantic Network can also be copied and pasted as a starting point.
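
Once customized, downstream scripts can treat project-specific categories just like the built-in ones. In the hedged Groovy sketch below, "gambling" is a hypothetical category added by a project, and "customized" is the default value mentioned above:

    // Sketch: separating project-specific categories from the default
    // "customized" marker returned by CONTROVERSY_CUSTOM.PROJ.
    def sControversy = ["customized", "gambling"]   // "gambling" is hypothetical

    def custom = sControversy.findAll { it != "customized" }
    assert custom == ["gambling"]

    custom.each { category ->
        // Route each project-specific category to its own handling Flow.
    }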