A Big Data Analytics Approach for Construction Firms Failure Prediction Models

Hafiz Alaka, Lukumon O. Oyedele, Hakeem O Owolabi, Olugbenga O. Akinade, Muhammad Bilal, Saheed O. Ajayi

Research output: Contribution to journal › Article

Abstract

Failure prediction models were developed using artificial neural network (ANN), support vector machine (SVM), multiple discriminant analysis (MDA) and logistic regression (LR) on 693,000 data cells from a sample of 33,000 construction firms that operated or failed between 2008 and 2017. Surprisingly, on test data the ANN was only slightly more accurate than LR and MDA. Two ANN hyperparameters, the number of hidden-layer units and the weight decay, were therefore tuned by grid search. The tuning was computationally tedious and was aborted after many hours without completing. State-of-the-art Big Data Analytics (BDA) technology was then employed, for the first time in failure prediction, and the tuning completed in seconds. Mean cross-validation accuracy was used to select the best hyperparameter values, which were used to develop a new ANN model that outperformed all previously developed models on test data. Developing new models on a selected subset of variables reduced the computational cost of tuning but did not improve performance. Since the real-life cost of a misclassification is greater than the cost of tedious computation, it was concluded that BDA is the best compromise.
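
The tuning procedure summarised in the abstract (a grid over hidden-layer size and weight decay, with the winner chosen by mean cross-validation accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the synthetic data, the tiny pure-Python network, the grid values, and the hypothetical helper names `make_data`, `train_ann`, `cv_mean_accuracy` and `grid_search`.

```python
import itertools
import math
import random

def make_data(n, rng):
    """Synthetic stand-in data (assumption): label 1 when x0 * x1 > 0."""
    rows = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        rows.append((x, 1 if x[0] * x[1] > 0 else 0))
    return rows

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def forward(w1, w2, x):
    """Return hidden activations (with bias unit) and the output pre-activation."""
    xb = x + [1.0]
    h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w1] + [1.0]
    return xb, h, sum(w * v for w, v in zip(w2, h))

def train_ann(train, hidden_units, weight_decay, rng, epochs=20, lr=0.1):
    """One hidden layer (tanh), sigmoid output, SGD with L2 weight decay."""
    n_in = len(train[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden_units)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_units + 1)]
    for _ in range(epochs):
        for x, y in train:
            xb, h, z = forward(w1, w2, x)
            err = sigmoid(z) - y  # cross-entropy gradient at the output
            # Update hidden weights first so they use the pre-update w2.
            for j in range(hidden_units):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                for i in range(n_in + 1):
                    w1[j][i] -= lr * (grad_h * xb[i] + weight_decay * w1[j][i])
            for j in range(hidden_units + 1):
                w2[j] -= lr * (err * h[j] + weight_decay * w2[j])
    return w1, w2

def predict(model, x):
    w1, w2 = model
    return 1 if forward(w1, w2, x)[2] > 0 else 0

def cv_mean_accuracy(data, hidden_units, weight_decay, k=5, seed=0):
    """Mean accuracy over k folds; fold f holds out every k-th row starting at f."""
    fold_acc = []
    for f in range(k):
        test = data[f::k]
        train = [row for i, row in enumerate(data) if i % k != f]
        model = train_ann(train, hidden_units, weight_decay, random.Random(seed + f))
        correct = sum(predict(model, x) == y for x, y in test)
        fold_acc.append(correct / len(test))
    return sum(fold_acc) / k

def grid_search(data, hidden_grid, decay_grid):
    """Score every (hidden_units, weight_decay) pair and return the best one."""
    results = {}
    for h, d in itertools.product(hidden_grid, decay_grid):
        results[(h, d)] = cv_mean_accuracy(data, h, d)
    best = max(results, key=results.get)
    return best, results

data = make_data(120, random.Random(42))
best_params, results = grid_search(data, hidden_grid=[2, 4, 8], decay_grid=[0.0, 0.01])
print("best (hidden_units, weight_decay):", best_params)
print("mean CV accuracy:", round(results[best_params], 3))
```

Each grid point is trained and scored independently of the others, so the total cost grows with the product of the grid sizes, the number of folds and the data size. That independence is also what makes this kind of search a natural candidate for distributed execution, although the sketch itself runs sequentially.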
Language: English
Pages: (In-Press)
Number of pages: 10
Journal: IEEE Transactions on Engineering Management
Volume: (In-Press)
DOI: 10.1109/TEM.2018.2856376
State: E-pub ahead of print - 17 Aug 2018

Fingerprint

Neural networks
Tuning
Discriminant analysis
Logistics
Costs
Support vector machines
Big data
Prediction model
Artificial neural network
Failure prediction
Logistic regression
Network model
Compromise
Grid
Decay
Support vector machine
Misclassification
Cross-validation

Keywords

  • Big Data
  • Data models
  • Computational modeling
  • Biological system modeling
  • Analytical models
  • Tuning
  • Computers

Cite this

Alaka, H., Oyedele, L. O., Owolabi, H. O., Akinade, O. O., Bilal, M., & Ajayi, S. O. (2018). A Big Data Analytics Approach for Construction Firms Failure Prediction Models. IEEE Transactions on Engineering Management, (In-Press), (In-Press). DOI: 10.1109/TEM.2018.2856376

A Big Data Analytics Approach for Construction Firms Failure Prediction Models. / Alaka, Hafiz; Oyedele, Lukumon O.; Owolabi, Hakeem O; Akinade, Olugbenga O.; Bilal, Muhammad; Ajayi, Saheed O.

In: IEEE Transactions on Engineering Management, Vol. (In-Press), 17.08.2018, p. (In-Press).

Alaka, H, Oyedele, LO, Owolabi, HO, Akinade, OO, Bilal, M & Ajayi, SO 2018, 'A Big Data Analytics Approach for Construction Firms Failure Prediction Models', IEEE Transactions on Engineering Management, vol. (In-Press), pp. (In-Press). DOI: 10.1109/TEM.2018.2856376
Alaka H, Oyedele LO, Owolabi HO, Akinade OO, Bilal M, Ajayi SO. A Big Data Analytics Approach for Construction Firms Failure Prediction Models. IEEE Transactions on Engineering Management. 2018 Aug 17;(In-Press):(In-Press). DOI: 10.1109/TEM.2018.2856376
Alaka, Hafiz ; Oyedele, Lukumon O. ; Owolabi, Hakeem O ; Akinade, Olugbenga O. ; Bilal, Muhammad ; Ajayi, Saheed O. / A Big Data Analytics Approach for Construction Firms Failure Prediction Models. In: IEEE Transactions on Engineering Management. 2018 ; Vol. (In-Press). pp. (In-Press)
@article{fb4d87b0f7b448feaf5a553ff6870d7f,
title = "A Big Data Analytics Approach for Construction Firms Failure Prediction Models",
abstract = "Failure prediction models were developed using artificial neural network (ANN), support vector machine (SVM), multiple discriminant analysis (MDA) and logistic regression (LR) on 693,000 data cells from a sample of 33,000 construction firms that operated or failed between 2008 and 2017. Surprisingly, on test data the ANN was only slightly more accurate than LR and MDA. Two ANN hyperparameters, the number of hidden-layer units and the weight decay, were therefore tuned by grid search. The tuning was computationally tedious and was aborted after many hours without completing. State-of-the-art Big Data Analytics (BDA) technology was then employed, for the first time in failure prediction, and the tuning completed in seconds. Mean cross-validation accuracy was used to select the best hyperparameter values, which were used to develop a new ANN model that outperformed all previously developed models on test data. Developing new models on a selected subset of variables reduced the computational cost of tuning but did not improve performance. Since the real-life cost of a misclassification is greater than the cost of tedious computation, it was concluded that BDA is the best compromise.",
keywords = "Big Data, Data models, Computational modeling, Biological system modeling, Analytical models, Tuning, Computers",
author = "Hafiz Alaka and Oyedele, {Lukumon O.} and Owolabi, {Hakeem O} and Akinade, {Olugbenga O.} and Muhammad Bilal and Ajayi, {Saheed O.}",
year = "2018",
month = "8",
day = "17",
doi = "10.1109/TEM.2018.2856376",
language = "English",
volume = "(In-Press)",
pages = "(In-Press)",
journal = "IEEE Transactions on Engineering Management",
issn = "0018-9391",
publisher = "Institute of Electrical and Electronics Engineers",

}

TY  - JOUR
T1  - A Big Data Analytics Approach for Construction Firms Failure Prediction Models
AU  - Alaka, Hafiz
AU  - Oyedele, Lukumon O.
AU  - Owolabi, Hakeem O
AU  - Akinade, Olugbenga O.
AU  - Bilal, Muhammad
AU  - Ajayi, Saheed O.
PY  - 2018/8/17
Y1  - 2018/8/17
N2  - Failure prediction models were developed using artificial neural network (ANN), support vector machine (SVM), multiple discriminant analysis (MDA) and logistic regression (LR) on 693,000 data cells from a sample of 33,000 construction firms that operated or failed between 2008 and 2017. Surprisingly, on test data the ANN was only slightly more accurate than LR and MDA. Two ANN hyperparameters, the number of hidden-layer units and the weight decay, were therefore tuned by grid search. The tuning was computationally tedious and was aborted after many hours without completing. State-of-the-art Big Data Analytics (BDA) technology was then employed, for the first time in failure prediction, and the tuning completed in seconds. Mean cross-validation accuracy was used to select the best hyperparameter values, which were used to develop a new ANN model that outperformed all previously developed models on test data. Developing new models on a selected subset of variables reduced the computational cost of tuning but did not improve performance. Since the real-life cost of a misclassification is greater than the cost of tedious computation, it was concluded that BDA is the best compromise.
KW  - Big Data
KW  - Data models
KW  - Computational modeling
KW  - Biological system modeling
KW  - Analytical models
KW  - Tuning
KW  - Computers
U2  - 10.1109/TEM.2018.2856376
DO  - 10.1109/TEM.2018.2856376
M3  - Article
VL  - (In-Press)
SP  - (In-Press)
JO  - IEEE Transactions on Engineering Management
T2  - IEEE Transactions on Engineering Management
JF  - IEEE Transactions on Engineering Management
SN  - 0018-9391
ER  -