Mining Suspicious Tax Evasion Groups in Big Data

F. Tian, T. Lan, Kuo-Ming Chao, N. Godwin, Q. Zheng, Nazaraf Shah, F. Zhang

Research output: Contribution to journal › Article › peer-review

25 Citations (Scopus)


There is evidence that an increasing number of enterprises collude to evade tax in ways that are hard to perceive. At the same time, taxation-related data is a classic kind of big data. These issues challenge the effectiveness of traditional data mining-based tax evasion detection methods. To address this problem, we first investigate classic tax evasion cases and employ a graph-based method to characterize their common property: two suspicious relationship trails with the same antecedent node behind an Interest-Affiliated Transaction (IAT). Next, we propose a Colored Network-Based Model (CNBM) for characterizing economic behaviors, social relationships, and the IATs between taxpayers, and for generating a Taxpayer Interest Interacted Network (TPIIN). To accomplish the tax evasion detection task by discovering suspicious groups in a TPIIN, methods for building a patterns tree and matching component patterns are introduced, and the completeness of the methods, based on graph theory, is presented. Then, we describe an experiment based on real data and a simulated network. The experimental results show that our proposed method greatly improves the efficiency of tax evasion detection and provides a clear explanation of the tax evasion behaviors of taxpayer groups.
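The structural property named in the abstract, two relationship trails that share an antecedent node behind a transaction, can be illustrated with a small sketch. This is not the paper's patterns-tree or component-matching method, which the abstract does not specify. It is a plain breadth-first search over a toy directed graph, and every name in it (`influence`, `trades`, `ancestors_with_trails`, `suspicious_groups`, the node labels) is a hypothetical chosen for illustration.

```python
from collections import deque


def ancestors_with_trails(influence, target):
    """Map every node that can reach `target` along influence edges to one
    shortest trail (list of nodes) ending at `target`.

    `influence` is an assumed adjacency dict: node -> nodes it influences
    (for example, an investor pointing at a company it controls).
    """
    # Reverse adjacency: node -> nodes that directly influence it.
    reverse = {}
    for src, dsts in influence.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    trails = {target: [target]}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in reverse.get(node, []):
            if parent not in trails:
                trails[parent] = [parent] + trails[node]
                queue.append(parent)
    return trails


def suspicious_groups(influence, trades):
    """For each trade (seller, buyer), report every third node that reaches
    both parties, together with the two trails leading from it."""
    groups = []
    for seller, buyer in trades:
        to_seller = ancestors_with_trails(influence, seller)
        to_buyer = ancestors_with_trails(influence, buyer)
        common = set(to_seller) & set(to_buyer)
        # Assumption for this sketch: the trading parties themselves are
        # not counted as the shared antecedent.
        common -= {seller, buyer}
        for antecedent in sorted(common):
            groups.append(
                (antecedent, to_seller[antecedent], to_buyer[antecedent],
                 (seller, buyer))
            )
    return groups


# Toy data: person P1 influences companies C1 and C2; C1 influences C4.
influence = {"P1": ["C1", "C2"], "P2": ["C3"], "C1": ["C4"]}
trades = [("C4", "C2"), ("C2", "C3")]
result = suspicious_groups(influence, trades)
```

In this toy network only the trade between C4 and C2 is flagged, because P1 sits behind both parties; C2 and C3 share no antecedent. A real TPIIN would also distinguish edge types (the "colors" of the model), which this sketch ignores.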
Original language: English
Pages (from-to): 2651-2664
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 10
Early online date: 8 Jun 2016
Publication status: Published - 1 Oct 2016

Bibliographical note

This paper is not yet available on the repository


Keywords

  • big data
  • graph mining
  • tax evasion
  • interest-affiliated transaction
  • heterogeneous information network
