Malicious Interlocutor Detection Using Forensic Analysis of Historic Data

Seedall, Michael (2022) Malicious Interlocutor Detection Using Forensic Analysis of Historic Data. Masters thesis, University of Huddersfield.

PDF - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

The problem of online child grooming grows year on year, and although government legislation seeks to combat it by levying heavier penalties on perpetrators, crime figures continue to rise. Government guidance directed towards digital platforms and social media providers places emphasis on child safety online. As this research shows, government initiatives have proved somewhat ineffective. The aim of this research is therefore to investigate the scale of the problem and to test a variety of machine learning and deep learning techniques that could be used in a novel intelligent solution to protect children from online predation.

The heterogeneity of online platforms means that a one-size-fits-all solution is a complex problem to solve. The maturity of intelligent approaches to Natural Language Processing makes it possible to analyse and process text data in a wide variety of ways. Pre-processing prepares text data in a format that machines can understand and reason about without the need for human interaction.

The ongoing development of Machine Learning and Deep Learning architectures enables the construction of intelligent solutions that can classify text data in ways never imagined. This thesis presents research that tests the application of potential intelligent solutions, such as Artificial Neural Networks and Machine Learning algorithms, to Natural Language Processing. The research also tests the performance of pre-processing workflows and the impact of pre-processing on both online grooming and more general chat corpora. The storage and processing of data via a traditional relational database management system has also been tested for suitability when looking to detect grooming conversation in historical data.

Document similarity measures such as Cosine Similarity, and classifiers such as Support Vector Machines, have displayed positive results in identifying grooming conversation. However, a more intelligent solution may prove to have better currency in developing a smart autonomous solution, given the ever-evolving lexicon used by participants in online chat conversations.
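
As an illustration of the Cosine Similarity measure named above, the following is a minimal sketch rather than the method used in the thesis. It assumes a simple bag-of-words representation with raw term counts; the abstract does not state which features or weighting were actually used, and the function names `tokenize` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))

    # Dot product over the terms the two texts share.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[term] * counts_b[term] for term in shared_terms)

    # Euclidean length of each term-count vector.
    norm_a = math.sqrt(sum(count * count for count in counts_a.values()))
    norm_b = math.sqrt(sum(count * count for count in counts_b.values()))

    # An empty text has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    reference = "what time does the game start tonight"
    candidate = "does the game start at eight tonight"
    print(round(cosine_similarity(reference, candidate), 3))
```

A score near 1 indicates that two texts use much the same vocabulary in similar proportions, and a score of 0 indicates no shared terms. Because the measure depends entirely on shared vocabulary, it is sensitive to the evolving lexicon noted above.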

Item Type: Thesis (Masters)
Subjects: H Social Sciences > H Social Sciences (General)
T Technology > T Technology (General)
Schools: School of Computing and Engineering
Depositing User: Rebecca Hill
Date Deposited: 14 Feb 2022 11:46
Last Modified: 14 Feb 2022 11:46
URI: http://eprints.hud.ac.uk/id/eprint/35673

University of Huddersfield, Queensgate, Huddersfield, HD1 3DH Copyright and Disclaimer All rights reserved ©