Auditing Section Research Summaries Space

A Database of Auditing Research - Building Bridges with Practice

    research summary posted June 26, 2017 by Jennifer M Mueller-Phillips, tagged 06.02 Fraud Risk Models, 08.09 Impact of Technology on Audit Procedures 
    Title:
    Finding Needles in a Haystack: Using Data Analytics to Improve Fraud Prediction
    Practical Implications:

    Data analytics can be used to build fraud prediction models that help auditors make better audit planning decisions. The same models can help regulators identify firms for potential fraud investigation. In particular, the SEC is investing resources to develop better fraud risk models, and the results of this study could inform that work.

    Citation:

    Perols, Johan L., R. M. Bowen, C. Zimmermann, and B. Samba. 2017. “Finding Needles in a Haystack: Using Data Analytics to Improve Fraud Prediction.” The Accounting Review 92 (2): 221.

    Keywords:
    fraud; financial statement fraud; data analytics; predictive analytics; data rarity; data imbalance
    Purpose of the Study:

    Financial statement fraud causes organizations to lose an estimated 1.6% of annual revenue. This study examines three methods that use data analytics to predict fraud:

    • The first method, Multi-Subset Observation Undersampling (OU), addresses the small number of fraud observations relative to non-fraud observations. It creates multiple subsets of the original dataset, each containing all fraud observations and a different random subsample of non-fraud observations.
    • The second method, Multi-Subset Variable Undersampling (VU), addresses the small number of fraud observations relative to the number of explanatory variables identified in the fraud prediction literature. It creates multiple subsets of randomly selected explanatory variables.
    • The third method, VU partitioned by type of fraud (PVU), is a variation of the second method that addresses the problems that arise from treating all fraud cases as homogeneous events.
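
    The subset construction behind OU and VU can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, parameters, and toy data are hypothetical, and drawing each subsample independently is an assumption, since the summary does not say how subsamples are drawn. How the models fitted on each subset are trained and combined is not described here and is left out.

```python
import random


def ou_subsets(fraud_rows, nonfraud_rows, n_subsets, nonfraud_per_subset, seed=0):
    """OU-style subsets: each keeps ALL fraud rows plus a random
    subsample of non-fraud rows (hypothetical helper)."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        # Assumption: each subsample is drawn independently of the others.
        sampled = rng.sample(nonfraud_rows, nonfraud_per_subset)
        subsets.append(list(fraud_rows) + sampled)
    return subsets


def vu_subsets(variable_names, n_subsets, vars_per_subset, seed=0):
    """VU-style subsets: each is a random selection of explanatory
    variables (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.sample(variable_names, vars_per_subset) for _ in range(n_subsets)]


# Toy data, invented for illustration only.
fraud = [("fraud", i) for i in range(5)]
nonfraud = [("nonfraud", i) for i in range(100)]
variables = ["var_a", "var_b", "var_c", "var_d", "var_e", "var_f"]

observation_subsets = ou_subsets(fraud, nonfraud, n_subsets=12, nonfraud_per_subset=20)
variable_subsets = vu_subsets(variables, n_subsets=4, vars_per_subset=3)
```

    Every observation subset here has 25 rows (5 fraud, 20 non-fraud), so the fraud share in each subset is far higher than in the full toy dataset.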
    Design/Method/Approach:

    The sample contains data from 51 fraud firms. The authors identified fraud firms from SEC investigations reported in AAERs from 1998 to 2005. The experiments were designed to determine how best to implement OU and VU and then to evaluate their performance against benchmarks.

    Findings:

    The authors find the following:

    • Multi-Subset Observation Undersampling (OU), when used with 12 subsamples, improves fraud prediction by lowering the expected cost of misclassification by more than 10% relative to the best performing benchmark.
    • Multi-Subset Variable Undersampling (VU) improves fraud prediction in select situations but does not do so reliably.
    • VU partitioned by type of fraud (PVU) improves fraud prediction and reduces the expected cost of misclassification by 9.6% relative to the best performing VU benchmark.
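
    The summary does not define the expected cost of misclassification. The sketch below assumes a form commonly used in the fraud prediction literature, in which each error rate is weighted by the prior probability of its class and by the cost of that error. The function names and every number in it are hypothetical and are not taken from the study.

```python
def expected_cost(fn_rate, fp_rate, prior_fraud, cost_fn, cost_fp):
    """Assumed form: P(fraud) * C_FN * FN rate + P(non-fraud) * C_FP * FP rate."""
    return prior_fraud * cost_fn * fn_rate + (1 - prior_fraud) * cost_fp * fp_rate


def relative_reduction(model_cost, benchmark_cost):
    """Fractional reduction in cost relative to a benchmark."""
    return (benchmark_cost - model_cost) / benchmark_cost


# Hypothetical inputs: 1% of firms commit fraud, and a missed fraud
# is assumed to cost 30 times as much as a false alarm.
benchmark = expected_cost(fn_rate=0.50, fp_rate=0.10, prior_fraud=0.01, cost_fn=30, cost_fp=1)
candidate = expected_cost(fn_rate=0.40, fp_rate=0.10, prior_fraud=0.01, cost_fn=30, cost_fp=1)
improvement = relative_reduction(candidate, benchmark)
```

    A statement such as "lowers the expected cost of misclassification by more than 10% relative to the benchmark" corresponds to `relative_reduction` returning a value above 0.10 under the cost assumptions in use.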
    Category:
    Auditing Procedures - Nature - Timing and Extent, Risk & Risk Management - Including Fraud Risk
    Sub-category:
    Fraud Risk Models, Impact of Technology on Audit Procedures, Confirmation – Process and Evaluation of Responses
    Home:

    http://commons.aaahq.org/groups/e5075f0eec/summary