Mining Suspicious Tax Evasion Groups in Big Data

F. Tian, T. Lan, Kuo-Ming Chao, N. Godwin, Q. Zheng, Nazaraf Shah, F. Zhang

    Research output: Contribution to journal › Article › peer-review

    44 Citations (Scopus)


    There is evidence that an increasing number of enterprises collude to evade tax in ways that are hard to detect. At the same time, taxation-related data is a classic kind of big data. Together, these issues challenge the effectiveness of traditional data mining-based tax evasion detection methods. To address this problem, we first investigate classic tax evasion cases and employ a graph-based method to characterize their common property: two suspicious relationship trails that share the same antecedent node behind an Interest-Affiliated Transaction (IAT). Next, we propose a Colored Network-Based Model (CNBM) for characterizing economic behaviors, social relationships, and the IATs between taxpayers, and for generating a Taxpayer Interest Interacted Network (TPIIN). To accomplish the tax evasion detection task by discovering suspicious groups in a TPIIN, we introduce methods for building a patterns tree and matching component patterns, and present the completeness of these methods based on graph theory. Then, we describe an experiment based on real data and a simulated network. The experimental results show that our proposed method greatly improves the efficiency of tax evasion detection and provides a clear explanation of the tax evasion behaviors of taxpayer groups.
    Original language: English
    Pages (from-to): 2651-2664
    Journal: IEEE Transactions on Knowledge and Data Engineering
    Issue number: 10
    Early online date: 8 Jun 2016
    Publication status: Published - 1 Oct 2016

    Bibliographical note

    This paper is not yet available on the repository


    • big data
    • graph mining
    • tax evasion
    • interest-affiliated transaction
    • heterogeneous information network

