Teneo Offensive Language Detector NL Analyzer
Introduction
The Teneo Offensive Language Detector is a proprietary Artificial Solutions resource containing off-the-shelf building blocks for detecting abusive and controversial language in user inputs.
Abusive and controversial inputs are a necessary evil that NLI systems need to deal with. By abusive language we mean severe insults, profanity, violent threats, and the like, while other subject areas, such as suicide, abortion or a death in the family, may be deemed too controversial or sensitive to go into. It is a good strategy for a conversational AI system to show that such unprofessional language and subjects are not acceptable: a response that fails to address the language or subject appropriately reflects poorly on the system and can alienate the user.
The Teneo Platform provides components for the detection of abusive inputs as well as controversial subject matter, for use in Listeners, Triggers or Transitions, enabling the system to respond as the situation requires.
Note that offensive language and negative sentiment should not be conflated: abusive language or certain controversial inputs may be flagged as negative sentiment, but there is no one-to-one correspondence between the two. The Teneo Offensive Language Detector may be seen as supplementing the Teneo Sentiment and Intensity NL Analyzers.
Available languages
The Offensive Language Detector is currently available in the following language:
| Language |
|---|
| English |
Teneo Abusive Language Detector
Our vast experience in building conversational AI systems has unfortunately exposed us to a creative variety of abusive language. Many users test a system by seeing how it responds to profanity, and some will even persist in misusing the system by bombarding it with abusive inputs only. Conversely, if a system is not well prepared in its topic area, users may become frustrated and switch to abuse. Abusive language occurs in many different situations and contexts. Our detection of abusive language uses the following classification:
- Hate speech: racist or homophobic inputs, sexist language
- Profanity: general profanities, the "f" word, etc.
- Sexual harassment: sexual, pornographic inputs
- Threats of violence: murder, physical abuse, etc.
By offering a detailed diagnosis of the type of abuse given to a system, we enable the developer to take appropriate type-specific actions.
One approach to abusive language in conversational AI systems is to give the user two warnings; if the user does not drop the abuse, they are given a "time out" in which no questions are answered until the user apologizes. If, after the apology, the user reverts to abusive language, the conversation is terminated and the session slot is made available for a serious user.
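As a rough illustration, such an escalation policy could be driven by a session-level counter. The sketch below is plain Groovy (the scripting language used in Teneo solutions); abuseWarnings, userApologized and responseMode are hypothetical solution-defined variables, not part of the Offensive Language Detector, and it assumes the NLU variable lob.sAbuse has been made available to the script (e.g. via a variable binding on the matching condition).

```groovy
// Hedged sketch of the two-warning policy; all variable names besides
// lob.sAbuse are hypothetical and would be defined in the solution itself.
if (lob.sAbuse) {                            // input was flagged as abusive
    abuseWarnings = (abuseWarnings ?: 0) + 1
    if (abuseWarnings <= 2) {
        responseMode = "warn"                // first and second offence: warn
    } else if (!userApologized) {
        responseMode = "timeout"             // answer nothing until an apology
    } else {
        responseMode = "endSession"          // abuse after apology: terminate
    }
}
```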
Language Objects
Abusive language can be detected using the single Language Object ABUSE.INDICATOR. The object returns two NLU Variables: list objects containing the types of abuse detected along with a confidence level for each detection. In general, low confidence refers to a single questionable word, whereas high-confidence items tend to be phrases or specific unambiguous terms. The following table lists the values returned by the ABUSE.INDICATOR object.
| NLU variable | Type | Possible values |
|---|---|---|
| lob.sAbuse | String [] | { "hatespeech" \| "profanity" \| "sexual" \| "violence" } |
| lob.sConfidence | String [] | { "high" \| "low" } |
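A minimal sketch of consuming these variables in a Listener or Flow script, in Groovy. It assumes that lob.sAbuse and lob.sConfidence are index-aligned (the n-th confidence value belongs to the n-th detected abuse type) and that both lists have been bound to the script from the matching condition; these assumptions should be verified against your solution.

```groovy
// Hedged sketch: pair each detected abuse type with its confidence level,
// assuming the two lists returned by ABUSE.INDICATOR are index-aligned.
def findings = []
lob.sAbuse.eachWithIndex { type, i ->
    findings << [type: type, confidence: lob.sConfidence[i]]
}
// Type-specific action: escalate only on high-confidence violence.
def escalate = findings.any { it.type == "violence" && it.confidence == "high" }
```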
An overview of the main Language Objects, including the type-specific ones, follows. Usually it is enough to use ABUSE.INDICATOR, but if only a specific type of abuse is of interest, the individual objects can be applied as well (see the sketch after the table).
| Abusive language object | Sample inputs |
|---|---|
| ABUSE.INDICATOR | - |
| ABUSE_HATESPEECH.INDICATOR | You son of a bitch / I hate n****** |
| ABUSE_PROFANITY.INDICATOR | f*** you / kiss my ass |
| ABUSE_SEXUAL.INDICATOR | Suck my d*** / I want to sleep with you |
| ABUSE_VIOLENCE.INDICATOR | I'll smash your face / I could kill you |
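If only one category matters, there are two routes: attach the type-specific object (e.g. ABUSE_VIOLENCE.INDICATOR) to the Trigger or Transition condition, or match the general ABUSE.INDICATOR and filter its output in script, as in this hedged Groovy one-liner:

```groovy
// Hedged sketch: reduce the general result to a single abuse type.
def violenceDetected = lob.sAbuse?.contains("violence")
```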
Please note that a number of Language Objects are used internally by the resource to achieve a better-structured approach to recognizing abuse; these objects are not relevant for external use and are therefore not documented here.
Since language is sometimes ambiguous and the areas of abuse are not always clearly delineated, a single abusive input may produce multiple determinations with mixed or equal confidence levels. This is not a bug but a feature.
Customization
If project-specific customizations are required, a local ABUSE.INDICATOR Language Object can be created; additional variations or categories can then be added as needed, following the same NLU Variable structure.
Teneo Controversial Language Detector
When building a conversational AI system, there are often concerns that the system may respond in inappropriate ways when users broach sensitive off-topic areas. It might be a test, but if a user says I want to kill myself or My mother just died, the system looks very bad if it responds in a flippant manner. Other inputs might imply criminal activities and should in no way be answered in ways that encourage the activity, e.g. Should I kill my husband? With the Controversial Language Detector, the solution developer can redirect such inputs to Flows that deal with them, as well as suppress certain Flows that would be inappropriate in sensitive contexts. As a result, the Controversial Language Detector shows end users that their inputs are taken seriously.
Language Objects
The detection of controversial language focuses on specific areas often seen in past systems; the developer may use the Language Object CONTROVERSY.INDICATOR to detect all of the areas in question. The Language Object returns, via an NLU Variable, a list of all types of controversy detected in the input. This Language Object does not distinguish between high and low confidence.
| NLU variable | Type | Possible values |
|---|---|---|
| lob.sControversy | String [] | { "abortion" \| "crime" \| "death" \| "fascism" \| "sex" \| "suicide" \| "terrorism" } |
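A hedged Groovy sketch of the redirect-and-suppress strategy described above; safetyFlow and sensitiveContext are hypothetical solution-defined variables that Flows or Listeners could act on, and lob.sControversy is again assumed to be available to the script.

```groovy
// Hedged sketch: branch on the controversy categories returned by
// CONTROVERSY.INDICATOR ('safetyFlow' and 'sensitiveContext' are
// hypothetical names, not part of the resource).
if (lob.sControversy.contains("suicide")) {
    safetyFlow = "suicide"        // redirect to a Flow with a serious answer
} else if (["crime", "terrorism"].any { lob.sControversy.contains(it) }) {
    sensitiveContext = true       // suppress Flows that would be flippant here
}
```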
| Controversy language object | Sample inputs |
|---|---|
| CONTROVERSY.INDICATOR | - |
| CONTROVERSY_ABORTION.INDICATOR | Where can I get an abortion? / I want to end my pregnancy. |
| CONTROVERSY_CRIME.INDICATOR | Where can I hide a body? / I want to buy some LSD. |
| CONTROVERSY_DEATH.INDICATOR | My dog just died. / My grandmother just passed away. |
| CONTROVERSY_FASCISM.INDICATOR | Heil Hitler! / The Holocaust never happened. |
| CONTROVERSY_SEX.INDICATOR | I am looking for a prostitute. / I am a pedophile. |
| CONTROVERSY_SUICIDE.INDICATOR | I want to slit my wrists. / How can I kill myself? |
| CONTROVERSY_TERRORISM.INDICATOR | How do I build a bomb? / What happened on nine eleven? |
Customization
Since the definition of controversial subjects varies from one project to the next, the CONTROVERSY_CUSTOM.PROJ Language Object has been developed; developers can adjust it to include new categories or variants of the existing ones. By default, the Language Object returns the categorization "customized" in lob.sControversy, but this category can and should be expanded depending on the content added.
To add customizations, create a local Language Object with the name CONTROVERSY_CUSTOM.PROJ, making sure to add the corresponding NLU Variable; the developer can also copy and paste the content from the Language Object in the Teneo NLU Ontology and Semantic Network.
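Once the local CONTROVERSY_CUSTOM.PROJ object has content, its matches surface in lob.sControversy alongside the built-in categories. A hedged Groovy sketch, where "gambling" stands in for a hypothetical project-specific category added in place of the default "customized" value:

```groovy
// Hedged sketch: react to a project-specific controversy category.
// "gambling" is a hypothetical value; out of the box the customized
// object returns "customized" in lob.sControversy.
if (lob.sControversy.contains("gambling")) {
    // route the input to a project-specific Flow for this topic
}
```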