Ph.D. started in: 2017
Expected year of graduation: 2021
COINS consortium member: Norwegian University of Science and Technology
Supervised by: Slobodan Petrovic, Katrin Franke
Research area: Digital Forensics
Project title: Detecting Dynamic Attack Patterns in Large and Diverse Data Sources
Project description: This project is part of the Ars Forensica project on “Computational Forensics for Large-scale Fraud Detection, Crime Investigation and Prevention”. My task in the project is to produce better algorithms to this end.
In particular, digital forensic analysts face more data than they can process, and ever-evolving dynamic attacks can bypass modern network intrusion detection systems. My research is about improving the accuracy and efficiency of approximate search and data reduction methods. Data reduction methods, such as fuzzy hash functions (similarity-preserving hash functions), may be used to identify relevant data for a search without having to parse through entire volumes of data. Improved approximate pattern matching methods would return fewer frivolous hits from a search query. An additional benefit of improved approximate search is that it would allow dynamic attack patterns to be detected in intrusion detection with fewer false positives than is typical. These techniques may be used in concert so that digital forensic analysts waste less time finding the evidence or information they are looking for.
In general, the methods to address our research questions are simple: read state-of-the-art literature related to our topics, create hypotheses for our research questions, gather theoretical background which supports our hypotheses, implement our solutions, analyze our solutions, and compare our results to the rest of the state of the art. Theoretical topics that will aid our work on improved approximate pattern matching include non-deterministic finite automata, dynamic programming matrices, bit parallelism, constrained edit distance, bit-splitting architectures, and practical heuristics.
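To make the idea of approximate pattern matching with a dynamic programming matrix concrete, here is a minimal sketch in Python. It is a textbook-style column-wise edit-distance search with unit costs, not the project's own algorithm; the function name `approx_search` and its parameters are assumptions chosen for illustration.

```python
def approx_search(pattern: str, text: str, k: int) -> list[tuple[int, int]]:
    """Report (end_index, distance) for each text position where some
    substring ending there is within edit distance k of the pattern.

    Illustrative sketch: unit-cost insertions, deletions, substitutions.
    """
    m = len(pattern)
    # col[i] holds the smallest edit distance between pattern[:i] and any
    # substring of the text ending at the current position. Row 0 stays 0
    # because a match may start anywhere in the text.
    col = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # value of the cell up and to the left
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new_value = min(
                prev_diag + cost,  # match or substitution
                col[i] + 1,        # extra character in the text
                col[i - 1] + 1,    # missing character in the text
            )
            prev_diag = col[i]
            col[i] = new_value
        if col[m] <= k:
            hits.append((j, col[m]))
    return hits


if __name__ == "__main__":
    # "atack" is one deletion away from "attack".
    print(approx_search("attack", "an atack here", 1))
```

Raising `k` tolerates more variation in the pattern but also admits more spurious hits, which is the trade-off between detecting dynamic patterns and limiting false positives described above. This sketch takes time proportional to the pattern length times the text length; bit-parallel methods aim to process many cells of a column in a single machine-word operation.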
Data reduction methods such as fuzzy matching have a less rigorous mathematical grounding, but we investigate methods that have proven to work and are open to experimenting with novel methods. The state of the art uses techniques such as rolling hashes, non-cryptographic hash functions, and fast algorithms for distinguishing unique data (such as functions for calculating the Shannon entropy of chunks of data).
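The two building blocks named above can be sketched briefly in Python. This is a simplified illustration, not any particular tool's implementation: a plain rolling sum stands in for the stronger rolling hashes used in practice, and the names `shannon_entropy` and `rolling_chunks`, along with the `window` and `modulus` defaults, are assumptions made for the example.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    n = len(chunk)
    return -sum((c / n) * math.log2(c / n) for c in Counter(chunk).values())


def rolling_chunks(data: bytes, window: int = 7, modulus: int = 64) -> list[bytes]:
    """Split data at content-defined boundaries.

    A rolling sum over the last `window` bytes is updated in constant time
    per byte; a boundary is placed wherever the sum hits a trigger value.
    (Hypothetical parameters, chosen only for illustration.)
    """
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        if rolling % modulus == modulus - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


if __name__ == "__main__":
    sample = bytes(range(256)) * 4
    for piece in rolling_chunks(sample):
        print(len(piece), round(shannon_entropy(piece), 3))
```

Because boundaries depend only on the bytes inside the window, inserting data early in a file tends to leave later chunk boundaries unchanged, which is what lets similar inputs share chunks. The entropy score can then be used to skip chunks that are too uniform to be distinctive.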