On the Simulation of Financial Transactions for Fraud Detection Research

Abstract: This thesis introduces a financial simulation model covering two related financial domains: mobile payments and retail store systems. The problem we address in these domains is fraud in its different forms. We limit ourselves to isolated cases of relatively straightforward fraud. However, the ultimate aim of this work is to cover more complex types of fraud, such as money laundering, that span multiple organisations and domains. Fraud is an important problem that impacts the whole economy. Currently, there is a general lack of public research into the detection of fraud. One important reason is the lack of transaction data, which is often sensitive. To address this problem we present a Mobile Money Simulator (PaySim) and a Retail Store Simulator (RetSim), which allow us to generate synthetic transactional data. Both are multi-agent-based simulations built on real transaction data. We therefore developed agents that represent the clients in PaySim, and the customers and salesmen in RetSim. Normal behaviour was based on behaviour observed in data from the field, and is codified in the agents as rules governing transactions and interactions between clients, or between customers and salesmen. Some of these agents were intentionally designed to act fraudulently, based on observed patterns of real fraud. We introduced known signatures of fraud into our model and simulations to test and evaluate our fraud detection results. The behaviour of the agents generates a synthetic log of all transactions in the simulation. This synthetic data can be used to further advance fraud detection research without leaking sensitive information about the underlying data. Using statistics and social network analysis (SNA) on real data, we could calibrate the relations between staff and customers and generate realistic synthetic data sets that were validated statistically against the original.
We then used RetSim to model two common retail fraud scenarios to ascertain how effective the simplest form of statistical threshold detection in common use could be. The preliminary results show that threshold detection is effective enough at keeping fraud losses at a set level that there seems to be little economic room for improved fraud detection techniques.