Article Information

Shi-Wei Shen1
Tri-Dung Nguyen1
Udechukwu Ojiako2

1The Management School, University of Southampton, United Kingdom

2Faculty of Management, University of Johannesburg, South Africa

Correspondence to:
Udechukwu Ojiako

Postal address:
Faculty of Management, University of Johannesburg, Auckland Park Kingsway Campus, Johannesburg 2006

Received: 14 Sept. 2012
Accepted: 28 Mar. 2013
Published: 16 July 2013

How to cite this article:
Shen, S-W., Nguyen, T-D. & Ojiako, U., 2013, Modelling the predictive performance of credit scoring, Acta Commercii 13(1), Art. #189, 12 pages.

Copyright Notice:
© 2013. The Authors. Licensee: AOSIS OpenJournals.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Modelling the predictive performance of credit scoring

Orientation: The article discussed the importance of rigour in credit risk assessment.

Research purpose: The purpose of this empirical paper was to examine the predictive performance of credit scoring systems in Taiwan.

Motivation for the study: Corporate lending remains a major business line for financial institutions. However, in light of the recent global financial crises, it has become extremely important for financial institutions to implement rigorous means of assessing clients seeking access to credit facilities.

Research design, approach and method: Using a data sample of 10 349 observations drawn between 1992 and 2010, logistic regression models were utilised to examine the predictive performance of credit scoring systems.

Main findings: A test of Goodness of fit demonstrated that credit scoring models that incorporated the Taiwan Corporate Credit Risk Index (TCRI), micro- and also macroeconomic variables possessed greater predictive power. This suggests that macroeconomic variables do have explanatory power for default credit risk.

Practical/managerial implications: The originality in the study was that three models were developed to predict corporate firms’ defaults based on different microeconomic and macroeconomic factors such as the TCRI, asset growth rates, stock index and gross domestic product.

Contribution/value-add: The study utilises several goodness-of-fit measures and receiver operator characteristics to examine the robustness of the predictive power of these factors.


Literature (Berger et al. 2003; Bikker & Haaf 2002; Jokipii & Monnin 2013) suggests that the economy of both developed and developing countries is highly dependent on its banking industry. In general, the banking industry is highly competitive and therefore, in order to survive, many banks and financial institutions tend to view business lending as core to their business operations (Berlin & Mester 1998). The reality, however, is that such lending operations are fraught with a number of risks which, if unmanaged, may increase the likelihood that a customer will default on a loan agreement. For this reason, banks generally focus their decision-making processes on optimising trade-offs in terms of risk-return.

In order to optimise such trade-offs in terms of risk-return, banks tend to depend on judgement tools and decision support systems. These tools and systems which are designed around credit scoring models focus on assessing the risk of potential credit customers defaulting on loan agreements. In recognition of the importance of managing credit risk, policy statements have been put forward by banking authorities and regulatory bodies such as the Federal Reserve System Task Force on Internal Credit Risk Models (Federal Reserve Bank 1998), and the Basel Committee on Banking Supervision (1999:49).

In Taiwan, following the enactment of the 1991 Commercial Bank Establishment Promotion Decree, the Taiwanese government deregulated the banking industry as a means to facilitate its expansion (Chiu, Chen & Bai 2011). Although the potential benefit of banking deregulation in Taiwan is generally accepted (Chung 2006; Kao & Liu 2004; Liu & Hung 2006), one negative consequence of the expansion of banking services is that financial institutions in the country have increasingly taken on more risk in their quest to gain customers, resulting in an increase in the reported rate of bad loans (Chen & Shih 2006; Li 2005; Wang et al. 2008). One approach financial institutions have employed to manage risk associated with bad loans is the credit risk assessment of potential borrowers using credit scoring models (De Andrade & Thomas 2007; Maggi & Guida 2011; Thomas, Oliver & Hand 2005).

Sound credit scoring facilitates the minimisation, on one hand, of any likelihood that credit facilities are made available to customers with a high default probability whilst, on the other hand, it optimises the probability that credit facilities will be offered to customers with a higher chance of repayment. In the case of the minimisation of customers with a high chance of defaulting on loans, credit scoring is expected to encompass differentiation. This implies being able to analyse fully the borrower’s risk. This enables financial institutions to reject credit applications from potential defaulting clients. In other words, an effective credit-scoring model will ensure that the number of non-repaying customers is significantly reduced.

The second crucial function of the credit-scoring system is to optimise the selection of potentially ‘good’, in other words, repaying, customers. This implies that the selection of customers being eligible to receive credit facilities depends on the extent of the loss individual financial institutions can tolerate. To facilitate this, a proficient model is required that can discriminate effectively between good and bad so that the error rates can be minimised (Eisenbeis 1978). The general idea should be that clients with high scores should present a low probability of default risk, whereas borrowers with low scores may possess a significantly higher probability (or vice versa, if the interpretation of the numbers follows the opposite reasoning).

In this paper, our specific objective is to examine the predictive performance of such credit scoring systems. Our overriding hypothesis is that the effective modelling of credit scoring will have to incorporate a combination of variables. Our contribution to scholarship is that we utilise a broad selection of macroeconomic variables (annual interest rate, real gross domestic product [GDP] and the stock index) that have been identified from extant literature.

In order to achieve the objective of this study, the remainder of the manuscript is organised as follows. In the next section, we provide a literature overview of credit scoring and the rationale for adopting logistic regression during modelling. The next section presents a description of our data sample and the research methodology, as well as three models for predicting defaults, and thereafter we undertake empirical analysis of the results. The paper concludes with a discussion of our findings and an articulation of limitations of the current study and recommendations for possible future studies.

Review of literature

Credit scoring
Corporate lending remains a major business line for financial institutions. However, in view of the increases in credit default, the importance of a rigorous credit-risk assessment by financial institutions cannot be overestimated. To undertake credit-risk assessment, banks usually employ credit scoring, which involves the use of historical data to isolate the characteristics or risk of potential customers to default (Finlay 2010).

Credit scoring is therefore a statistical approach employed by financial institutions to appraise the financial credibility of potential borrowers (Dinh & Kleimeier 2007; Finlay 2010). Credit scoring works on the principle that financial history can be used to predict solvency probabilities (Avery, Brevoort & Canner 2009). According to Frame, Srinivasan and Woosley (2001) and Retzer, Soofi and Soyer (2009), this capability enables credit default predictions and thus reduces information cost to lenders. Scholars such as Yap, Ong and Husain (2011) point out that the history of credit scoring as a risk-management approach can be traced back to the 1940s; however, its main application in financial services came in the 1960s following the emergence of bank and credit cards. By the 1980s, credit scoring was being utilised extensively to aid in decisions regarding loan applications.

Once a customer applies for credit, the support system will generate a ‘score’ from historical data that the bank then utilises to rank the customers in terms of default risk. This data may include, for example, the customer’s outstanding debt and financial assets and information on previous defaults. It can therefore be inferred that credit scoring is primarily a way of segmenting potential creditors (Abdou & Pointon 2011), based on a probability risk of default (PD).

There is a range of predictive models that have been used in credit scoring. Studies (Hand & Henley 1997) have shown that, historically, credit scoring has mainly been undertaken utilising discriminant analysis and linear regression. In addition to these approaches, other techniques that have proved popular over the years include Probit analysis, non-parametric smoothing methods and logistic regression. Other credit scoring models include the multiple discriminant analysis (MDA) technique. This technique was utilised by Altman (1968) to develop the Z-score model for credit default prediction. Using the Z-score model, Altman (1968) demonstrated that firms with a Z-score lower than 2.675 were significantly more likely to default on a loan (the accuracy rate of this model was 95% based on given historic data). Subsequently, the Z-model was improved (Altman, Haldeman & Narayanan 1977), with the addition of three more variables representing stability of earnings, liquidity and firm size. A further development of the Z-score model was undertaken by Dambolena and Khoury (1980), with the incorporation of core financial ratios. Other approaches to credit scoring involve the use of either the Probit or Logistic models.

The decision to adopt Logistic models as against Probit models was made based on earlier studies. Notwithstanding the fact that Tam and Kiang (1992) indicate that both models contribute similarly in terms of distinguishing default and non-default events, Westgaard and Wijst (2001:345) and Altman and Sabato (2007) posited that the suitability of the Logistic model is that it can produce real probabilities. Furthermore, Logistic models are less restrictive in distribution assumptions as compared with Probit models (Charitou, Neophytou & Charalambous 2004). Other studies (Greene 1993; Hahn & Soyer 2005) have shown that Probit models follow normal distributions, which makes them less flexible than Logistic models.

Description of sample data

Corporate firms
The data sample for our study is taken from publicly-quoted firms in Taiwan. The data period covers 1992 to 2010 and includes both credit-defaulted and non-defaulted firms. The collected information takes into account the Taiwan Corporate Credit Risk Index (TCRI), public ratings of firms and some specific micro- and macroeconomic factors. Each defaulted firm was selected at the time of its first default within the period 1992 to 2010, and every firm is regarded as an individual observation. Figure 1 presents the total number of defaulted firms at the end of every year.

FIGURE 1: Number of default firms for every year.

Credit ratings and scorings
The credit rating provided by the Taiwanese Economic Journal (TEJ1) is assigned to public companies on a scale from 1 (for very good companies) to 10 (for the doubtful ones). The criteria for assigning these ranks fall into three categories: (1) financial statements, which include balance sheet information, profit tables and cash flow statements, (2) the previous rating, which is adjusted with the corporate scales and the individual financial thresholds and (3) the final rating, which is affected further by the subjective estimations of the experts.

Companies with rank ‘1’ have lower credit risk than those which have a rank of ‘10’. The credit ratings were estimated biannually from 1992 until 2010. Table 1 shows the number of defaulted and non-defaulted firms and their credit rating for the period of 1992 to 2010. According to the TEJ, companies with a credit rating of ‘1’ to ‘4’ have the lowest probability of default, whilst those with ratings of ‘5’ and ‘6’ have a credit history which makes it difficult to establish their true financial situation. Firms with a rating of ‘7’ or above (≥ 7) are deemed to represent a great credit risk and are seen to have a high probability of defaulting. Within the sample period (1992 to 2010), we observe that all the defaulted firms had a credit rating ranging from ‘5’ to ‘10’ (see Table 1, which is drawn from the TEJ). There were 23 defaulted companies with a credit rating of ‘5’ or ‘6’ and 216 with a rating of ‘7’ or above.

TABLE 1: Taiwan Corporate Credit Risk Index and number of corporations.

As shown in the table, firms with a credit rating of between ‘1’ and ‘4’ did not default. These firms were considered to present no credit risk and were therefore eliminated from the study2. Figure 2 illustrates the proportion of firms which defaulted per rating level. We observe that only 0.29% of the firms which had a rating of ‘5’ defaulted, whereas the default rate approaches 41.5% for firms with a credit rating of ‘10’. On average, 2.31% of the firms that had a credit rating of between ‘5’ and ‘10’ defaulted. This is also the principal prediction.

FIGURE 2: Proportion of default and normal (non-defaulting) firms.

The accounting data and the macroeconomic variables for the current firms were selected with a time lag. We now explain the term ‘time lag’. If an observation was made at year (t), the accounting data was available in the annual report at the end of year (t – 1). All missing values were replaced with the average of the accounting data from the years (t – 2) and (t). If the data from year (t – 2) or year (t) was seen to be missing, then the observation was eliminated.
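The time-lag and replacement rule described above can be illustrated with a minimal sketch. The function name `lagged_value` and the leverage figures are hypothetical and serve only to show the rule; they are not taken from the study's data.

```python
from typing import Dict, Optional


def lagged_value(series: Dict[int, Optional[float]], t: int) -> Optional[float]:
    """Return the accounting value used for an observation made in year t.

    The value is taken from year t - 1. If it is missing, it is replaced by
    the mean of years t - 2 and t. If either of those is also missing,
    None is returned, meaning the observation is eliminated.
    """
    value = series.get(t - 1)
    if value is not None:
        return value
    before, after = series.get(t - 2), series.get(t)
    if before is None or after is None:
        return None
    return (before + after) / 2


# Hypothetical liability/total asset ratios for one firm (None = missing).
leverage = {2001: 0.50, 2002: None, 2003: 0.60, 2004: None, 2005: None}

print(lagged_value(leverage, 2002))  # year 2001 is available and used directly
print(lagged_value(leverage, 2003))  # 2002 missing: mean of 2001 and 2003
print(lagged_value(leverage, 2005))  # 2004 and 2005 missing: observation dropped
```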

In order to assess the macroeconomic effects on the firms, we observe that if the non-defaulting firms (hereinafter referred to as ‘normal’ firms) are only formulated for a specific year, the macroeconomic factors will be of limited value. To overcome this restriction, normal firms are regarded as being single observations for different years. The implication is that the total number of observations we made for the study sample was 10 349, of which 10 110 were normal firms and 239 were defaulting firms. The sample was separated into two groups with labels ex_ante and ex_post. We chose a training period ranging from 1992 to 2005. During this period, the number of normal firms was 5834, whilst the number of defaulting firms was 178. Observations from 2006 onwards are treated as the hold-out (test) sample; there were 4276 normal firms and 61 defaulted firms. All the information is summarised in Table 2 (below). The observations from the first period are used to build the model, whereas the hold-out set is used for model testing. Simply put, the model built with the values from 1992–2005 should be able to capture the defaults that took place in the following 5 years.

TABLE 2: Description of sample.

Choosing microeconomic and macroeconomic variables for modelling
Based on a review of literature on microeconomic factors affecting credit risk (Altman & Sabato 2007; Minetti & Zhu 2011; Tsai & Huang 2010), seven financial key variables are seen as having a significant influence on whether firms default on credit arrangements (Table 3). These variables will be incorporated into our model.

TABLE 3: Microeconomic factors for modelling.

In terms of macroeconomic factors, based on a review of literature (Bellotti & Crook 2007; Bonfim 2009; Carling et al. 2007; Duffie, Saita & Wang 2007; Figlewski, Frydman & Liang 2012; Hamerle, Liebig & Rosch 2003; Pesaran et al. 2006), six macroeconomic variables have been selected in this research as impacting credit risk. Again, these variables have a time lag with the year of the observations, since the latter are collected at year (t), whilst the macroeconomic variables are taken from the end of year (t – 1). These factors are presented in Table 4.

TABLE 4: Macroeconomic factors for modelling.

Research method and design

Logit models I, II and III
To predict the performance of Taiwanese credit scoring systems, three models were developed: the first takes into account TCRI only, the second TCRI and microeconomic variables, and the third TCRI, microeconomic and macroeconomic variables. For modelling, the selected observations were divided into normal and defaulting firms, which are coded as binary variables (‘0’ and ‘1’ respectively). The probability of default is a cumulative probability function based on the logistic distribution. The probability of default for firm i is π. The Logistic regression function is shown as Equation 1:

$$\pi_i = \frac{1}{1 + e^{-Z_i}}$$     [Eqn 1]

where $Z_i = b_0 + \sum_{j=1}^{m} b_j x_{ij}$, $i = 1, \ldots, N$, is a linear function of the independent variables $x_{ij}$, $(b_0, b_j)$ are the estimated coefficients and $e$ is the base of the natural logarithm. $\pi_i$ is bounded between ‘0’ and ‘1’ and represents the probability of default. However, the function is nonlinear in the predictors, which makes the probabilities difficult to compare directly. Thus, the function is re-expressed in terms of the odds, as shown in Equation 2, by using a logit transformation:

$$\frac{\pi_i}{1 - \pi_i} = e^{Z_i}$$     [Eqn 2]

The odds are the probability of an event occurring divided by the probability of that event not occurring. The odds ratio, in turn, explains the influence that a change of one unit of a particular variable has on the odds of the dependent variable, whilst the other variables are held constant. Taking the natural logarithm of both sides results in Equation 3:

$$\ln\left(\frac{\pi_i}{1 - \pi_i}\right) = Z_i = b_0 + \sum_{j=1}^{m} b_j x_{ij}$$     [Eqn 3]

with the predicted probabilities being retained after the transformation. The Logit model I is now represented as

Logit factor = b0 + b1 (TCRI)     [Eqn 4]

where TCRI is the credit rating.

We represent Logit model II as

Logit factor = b0 + b1 (TCRI) + b2 (Lta) + b3 (Netta) + b4 (IT) + b5 (AG) + b6 (Ebitta) + b7 (LogA)     [Eqn 5]

where Lta is liability/Total asset, Netta is retained earnings/Total asset, IT is EBIT/Interest expenses, AG is asset growth rate, Ebitta is EBIT (earnings before interest and taxes)/Total asset and LogA is log(size).

Logit model III is shown as

Logit factor = b0 + b1 (TCRI) + b2 (Lta) + b3 (Netta) + b4 (IT) + b5 (AG) + b6 (Ebitta) + b7 (LogA) + b8 (Runemp) + b9 (Yrate) + b10 (RGDP) + b11 (Rpro) + b12 (Mstock) + b13 (MGDP)     [Eqn 6]

where Runemp is unemployment rate, Yrate is the annual interest rate of Taiwan Bank, RGDP is the growth rate of real GDP, Rpro is the growth rate of the product index, Mstock is the stock index/mean of stock index and MGDP is the real GDP/mean of real GDP.
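The relationship between the linear score, the probability of default, the odds and the logit can be checked numerically. The following is a minimal sketch of Logit model I only; the intercept and the TCRI coefficient are hypothetical values chosen for illustration and are not the estimates reported in this study.

```python
import math


def linear_score(intercept, coefs, features):
    """Z_i = b0 + sum over j of b_j * x_ij."""
    return intercept + sum(coefs[name] * features[name] for name in coefs)


def default_probability(z):
    """Logistic function: pi = 1 / (1 + e^(-Z))."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds of default: pi / (1 - pi)."""
    return p / (1.0 - p)


# Hypothetical coefficients for a model with TCRI only (illustration only).
b0 = -8.0
b = {"TCRI": 0.9}
firm = {"TCRI": 7}

z = linear_score(b0, b, firm)
p = default_probability(z)

print(f"Z = {z:.3f}, probability of default = {p:.4f}")
print(f"odds = {odds(p):.4f}, exp(Z) = {math.exp(z):.4f}")
print(f"log odds = {math.log(odds(p)):.3f}")
```

The last two lines print matching pairs: the odds equal exp(Z) and the log odds recover Z, which is the content of the logit transformation.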

Log-Likelihood and Wald Ratio
To determine the explanatory power of the given variables and judge model fitness for the Logistic regression, the Likelihood ratio and the Wald statistic are employed.

The Log-Likelihood (Llog) indicator represents the amount of unexplained information in the classification mode based on summation of the probability of predicted observations and actual observations (Bewick, Cheek & Ball 2005; Field & Miles 2010; Mood 2010), with larger log-likelihood statistics showing poorly-fitted models.

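Llog is represented as Equation 7:

$$L_{\log} = \sum_{i=1}^{N} \left[ y_i \ln P(y_i) + (1 - y_i) \ln\left(1 - P(y_i)\right) \right]$$     [Eqn 7]

where $y_i$ is the observed outcome for the ith firm, which can be either ‘1’ (default) or ‘0’ (normal), and $P(y_i)$ is the predicted probability, which lies between ‘0’ and ‘1’.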
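As a rough illustration of how the log-likelihood separates a well-fitted model from a poorly fitted one, the sketch below evaluates the standard binary log-likelihood for two sets of predicted probabilities. The outcomes and probabilities are invented for illustration and do not come from the study's sample.

```python
import math


def log_likelihood(observed, predicted):
    """Sum of y*ln(P) + (1 - y)*ln(1 - P) over all firms."""
    return sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(observed, predicted)
    )


# Hypothetical outcomes (1 = default, 0 = normal) and predicted probabilities.
observed = [1, 0, 0, 1]
well_fitted = [0.9, 0.1, 0.2, 0.8]
poorly_fitted = [0.6, 0.5, 0.4, 0.5]

ll_good = log_likelihood(observed, well_fitted)
ll_poor = log_likelihood(observed, poorly_fitted)

# -2LL is the quantity usually reported: larger values mean more unexplained information.
print(f"-2LL (well fitted)   = {-2 * ll_good:.3f}")
print(f"-2LL (poorly fitted) = {-2 * ll_poor:.3f}")
```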

The fit of the model will also be determined by considering the Chi-square which shows whether the model has significant explanatory power and goodness-of-fit. In addition, to evaluate the contribution of individual variables, the Wald statistic (Equation 8), is employed as, similarly to the t-test in linear regression, the Wald statistic expresses whether the b coefficient of a predictor is significantly different from ‘0’ (Field & Miles 2010:237).

In addition, a forward selection approach is used to determine the order in which the variables are entered into the model, with a cut-off limit of 0.05 for both entering and removing predictors. At every stage, this method compares the explanatory power gained by including each candidate variable, judged by the likelihood, so as to arrive at the final set of variables (Bewick et al. 2005; Mood 2010).

The predictive power of models
By employing logit regression, the probability of default was estimated for all observations, where P(yi) ϵ (0, 1), and is determined by the estimated parameters (coefficients) bj. A cut-off value, P, was assigned to serve as the criterion for classifying the case into a specific group (Equation 9).

If P(yi) ≥ P, the ith firm is classified as ‘Default’
If P(yi) < P, the ith firm is classified as ‘Normal’     [Eqn 9]

The observations were scored by the models and assigned to the ‘Default’ and ‘Normal’ groups, given a suitable cut-off value. When the predicted class matched the observed class, the case was classed as either True or Alarm (see Table 5). Conversely, if the predicted state of yi differed from the observed state, an error was recorded: either a prediction of ‘Default’ when the observation had not defaulted (False, Type II error) or a prediction of ‘Normal’ when the observation had defaulted (Miss, Type I error). The schematic in Table 5 summarises the concepts of the prediction outcomes.
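The classification rule of Equation 9 and the two error types can be sketched as follows. The probabilities are hypothetical, the dictionary keys are illustrative names rather than the labels of Table 5, and the cut-off of 0.23 is the value used later in the Goodness of fit section.

```python
def classify(probabilities, cutoff):
    """Equation 9: predict Default (1) when P(y_i) >= cutoff, else Normal (0)."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def error_counts(observed, predicted):
    """Count correct predictions and the two error types described in the text."""
    counts = {
        "correct_default": 0,  # defaulted and predicted Default
        "correct_normal": 0,   # normal and predicted Normal
        "type_I_miss": 0,      # defaulted but predicted Normal
        "type_II_false": 0,    # normal but predicted Default
    }
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            counts["correct_default"] += 1
        elif obs == 0 and pred == 0:
            counts["correct_normal"] += 1
        elif obs == 1 and pred == 0:
            counts["type_I_miss"] += 1
        else:
            counts["type_II_false"] += 1
    return counts


# Hypothetical observed states (1 = default) and predicted probabilities.
observed = [1, 1, 0, 0, 0, 0]
probabilities = [0.60, 0.10, 0.30, 0.05, 0.02, 0.20]

predicted = classify(probabilities, cutoff=0.23)
counts = error_counts(observed, predicted)
print(predicted)
print(counts)
```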

TABLE 5: Type I and Type II errors.

Receiver Operator Characteristics curve and Area under curve
A more appropriate way to evaluate the overall predictive power of the model, rather than its performance at one specific cut-off point, may be the Receiver Operator Characteristics (ROC) curve. The ROC curve describes the trade-off between sensitivity and 1-specificity across all cut-off values. In the ROC curve, sensitivity refers to the proportion of normal firms predicted as normal and 1-specificity refers to the proportion of defaulted firms predicted as normal. It is a practical measure for appraising the grading system of a binary classification and it depicts the accuracy of the classifier. Table 6 shows the possible classifications. In comparison with the type I and type II approach, the ROC curve illustrates the accuracy of the model in general, not only at a single cut-off point (at which, for example, sensitivity may be 80% and 1-specificity 60%). It furthermore shows, for example, what level of specificity can be maintained for a given level of sensitivity and vice versa.

TABLE 6: Classification of prediction.


Here, ‘TN’ refers to the true normal, ‘FD’ refers to the false default, ‘TD’ is the true default and ‘FP’ is the false positive.
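A minimal sketch of how a ROC curve and its area under the curve (AUC) can be computed by sweeping the cut-off is given below. Two assumptions should be noted: the scores are invented for illustration, and the sketch treats default as the class being detected, which is the more common convention and differs from the labelling used above. The AUC itself is the same under either labelling.

```python
def roc_points(observed, scores):
    """Return (false-alarm rate, hit rate) pairs, one per distinct cut-off.

    Default (1) is treated as the class being detected; a firm is flagged
    when its score is at or above the cut-off.
    """
    n_default = sum(observed)
    n_normal = len(observed) - n_default
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        hits = sum(1 for y, s in zip(observed, scores) if y == 1 and s >= cutoff)
        false_alarms = sum(1 for y, s in zip(observed, scores) if y == 0 and s >= cutoff)
        points.append((false_alarms / n_normal, hits / n_default))
    return points


def area_under_curve(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical observed states (1 = default) and predicted probabilities.
observed = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(observed, scores)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to a model with no discriminating power and 1.0 to perfect separation of the two groups.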

Empirical analysis of results

The data sample is drawn from 10 349 observations between 1992 and 2010. Table 7 reports the summary statistics for all variables.

TABLE 7: Description of statistics.

Data analysis
The samples were examined for equality of means against different categories (normal and default) using the t-test. Since the sample size was above 30, the distribution was assumed to be normal, according to the central limit theorem. Table 8 illustrates the mean for different variables and groups, as well as the standard deviation. We observe that except for LogA, all other variables have different means across groups.

TABLE 8: Mean and standard deviation of variables.

In Table 9 we report the result of the t-test of every variable. Variances are examined to establish equality across groups using Levene’s test (Field & Miles 2010:273).

TABLE 9: Independent samples test.

The hypothesis is that the variance of the Normal firm is equal to the variance of the Default firm, as is shown in Equation 10:

H0: VN = VD     [Eqn 10]

If this is so, we assume that the means of the default case and the normal case are the same as shown in Equation 11:

H0: MN = MD     [Eqn 11]
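The test of Equation 11 can be illustrated with a short sketch of the independent-samples t statistic. The sketch assumes equal variances in the two groups (that is, that the hypothesis in Equation 10 is not rejected) and does not implement Levene's test; the two samples are invented for illustration.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's t for two independent samples, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical values of one macroeconomic variable for each group.
normal_firms = [1.0, 2.0, 3.0, 4.0, 5.0]
default_firms = [3.0, 4.0, 5.0, 6.0, 7.0]

t_stat = pooled_t_statistic(normal_firms, default_firms)
degrees_of_freedom = len(normal_firms) + len(default_firms) - 2
print(f"t = {t_stat:.3f} with {degrees_of_freedom} degrees of freedom")
```

The statistic is then compared with the t distribution on the stated degrees of freedom to decide whether the hypothesis of equal means is rejected.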

Yrate, RGDP and Rpro have different variances for default and normal cases since the hypothesis is rejected by the F statistic at the 1% significance level (p < 0.01). MGDP is also significant at 5%. Therefore, except for these four variables, all other variables have similar variances. From the t-test, we find that Runemp, Yrate, RGDP, Rpro and MGDP have different means when comparing default with normal firms since the hypothesis is rejected for p < 0.01. The differences in their means are –0.478%, 0.844%, 1.201%, 1.831% and –0.059% respectively. Specifically, firms with higher and lower Runemp and GDP increase the default risk. Additionally, Yrate, RGDP, Rpro and MGDP show a positive correlation with the number of defaults. We also observe that all five significant variables are macroeconomic factors, whilst the microeconomic variables are found to be insignificant. Consequently, we interpret this to mean that the macroeconomic variables may have a significant influence in predicting the defaulted firms, since the means of default and normal firms are significantly different.

Variables checking
In the logit regression, the defaulted firms were labelled ‘1’ and the normal firms were labelled ‘0’. The sample up to 2005 is designated as the training sample used to create the model, whereas cases from 2006 onwards are employed for model testing. In the classification without a predictive model (Table 10), all observations from the sample are assumed to be normal firms, which is also the principal prediction.

TABLE 10: Classification Table without predictive model.

The variables were selected using SPSS Version 19.0.1 in a logistic regression format and the results are presented in Table 11. In line with earlier recommendations (Dong, Lai & Yen 2010), all variables are given a significance indicator. For the microeconomic variables, only TCRI and AG are observed as being significant in step ‘0’. All macroeconomic variables are also selected as important factors. However, in the forward Logistic regression, only five variables are adopted in the equation: TCRI, AG, Yrate, Mstock and MGDP; all other variables are insignificant since p > 0.05.

TABLE 11: Variables not in the equation.

The reason for SPSS selecting only five variables in the last iteration can be explained by Bewick et al. (2005:117), who suggested that different sets of variables in a forward regression may provide an equally good statistical fit. However, selecting variables should be an unbiased procedure in order to avoid subjectivity and thus to be applicable in different case scenarios. For the sake of the stability and objectivity of the model, all the selected variables are required to be significant at p < 0.05. To avoid the detrimental influence of collinearity amongst predictors, the adopted variables are examined using Pearson’s correlation test (Field 2009). Table 12 presents the correlations between the variables, which are all found to be below 0.37, apart from Yrate and MGDP which appear to have a strong relationship, assuming a 95% confidence level (value = 1). Nevertheless, collinearity would be a concern only if the variance inflation factor (VIF) values were above 10 or the Tolerance values were below 0.2; as shown in Table 13, these criteria for the absence of collinearity are met, so collinearity is probably not an issue.
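To make the link between correlation, Tolerance and VIF concrete, the sketch below uses the special case of two predictors, where Tolerance equals 1 minus the squared Pearson correlation and VIF is its reciprocal. With more than two predictors, Tolerance is instead based on regressing each predictor on all the others, which this sketch does not do. The sample data are hypothetical; the value 0.37 is the upper bound on the correlations quoted above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(spread_x * spread_y)


def tolerance_and_vif(r):
    """Two-predictor case only: Tolerance = 1 - r^2 and VIF = 1 / Tolerance."""
    tolerance = 1 - r ** 2
    return tolerance, 1 / tolerance


# Hypothetical values of two predictors.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson(x, y)
tolerance, vif = tolerance_and_vif(r)
print(f"r = {r:.2f}, Tolerance = {tolerance:.2f}, VIF = {vif:.2f}")

# A correlation of 0.37 stays far from the thresholds (VIF > 10, Tolerance < 0.2).
tolerance_037, vif_037 = tolerance_and_vif(0.37)
print(f"r = 0.37: Tolerance = {tolerance_037:.4f}, VIF = {vif_037:.4f}")
```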

TABLE 12: Correlation matrix.

TABLE 13: Tolerance and Variance Inflation Factor (VIF).

The Logistic model
Table 14 below shows the coefficients and the significance level of the variables for the three models. First, –2 log likelihood (–2LL) represents how much information the model cannot explain regarding the variation of the dependent variable. Model I had a –2LL of 1175.187, whilst model II had a –2LL of 1171.177 (a lower value compared with model I). Model III has the smallest –2LL (1138.697). This can be interpreted as meaning that the model with TCRI, microeconomic and macroeconomic variables demonstrates the greatest explanatory power.

TABLE 14: Variables in the equation.

Whilst –2LL represents the amount of unexplained information within the model, the chi-square statistic shows how much each model has actually explained based on the initial –2LL. Model I has a chi-square statistic of 428.514, model II has 432.523 and model III has 465.004. All the models reject the hypothesis with a 99% confidence level. However, model I has the lowest chi-square value. In other words, the model with TCRI and microeconomic variables only and the model with all factors (TCRI, microeconomic and macroeconomic variables), assess more of the information that they have at their disposal and explain more of the variation that lies within the model. It can thus safely be inferred from inspection that their results are more robust and more trustworthy than the first model.
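The relationship between the –2LL and chi-square figures quoted above can be verified with simple arithmetic: adding each model's chi-square back to its –2LL should recover the same initial –2LL. The sketch uses only the values reported in the text; the variable names are illustrative.

```python
# (-2LL, chi-square) pairs as reported in the text for Models I, II and III.
reported = {
    "I": (1175.187, 428.514),
    "II": (1171.177, 432.523),
    "III": (1138.697, 465.004),
}

# Chi-square = initial -2LL minus model -2LL, so the sum of the two reported
# figures should give the same initial -2LL for every model.
baselines = {name: neg2ll + chi2 for name, (neg2ll, chi2) in reported.items()}

# Gain in chi-square when moving from Model I to Model III.
gain_i_to_iii = reported["III"][1] - reported["I"][1]

for name, value in baselines.items():
    print(f"Model {name}: implied initial -2LL = {value:.3f}")
print(f"Chi-square gain, Model I to Model III: {gain_i_to_iii:.3f}")
```

All three implied initial values agree to within rounding, which confirms that the reported statistics are internally consistent.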

In terms of individual predictors, model I will not be discussed since it only has one significant variable. For model II, the Wald statistic test indicates that the hypothesis is rejected for TCRI and AG; their coefficients are different from ‘0’ with confidence levels of 99% and 90% respectively. Also, in model III, the coefficients of TCRI, AG, Yrate, Mstock and MGDP are different from ‘0’ with a confidence level of 95%.

It is, however, difficult to interpret the coefficients in the logit model because the relationship between the predictor variables and the probability is non-linear. Consequently, the odds ratios have to be calculated. It should be recalled that the odds ratio represents the influence that an increase of 1 unit in a predictor has on the dependent variable. The odds ratio can overcome this barrier since it compares the odds of the event after a predictor increases by 1 unit with the odds at the starting point, where all predictors are 0 (Westgaard & Wijst 2001). Hence, in model II, the odds based on the estimated coefficients can be calculated as:

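$$\text{odds} = \frac{\pi}{1 - \pi} = e^{Z_i} = e^{-11.234 + 1.055\,\text{TCRI} - 0.004\,\text{AG}}$$     [Eqn 12]

At the starting point, with all predictors at zero,

$$\text{odds} = e^{-11.234} = 0.0000132171$$

The odds ratio for the TCRI is estimated by increasing the latter by one unit, whilst all other variables are fixed at zero (‘0’):

$$\text{odds} = e^{-11.234 + 1.055(1)} = 0.0000379591$$

so that the odds ratio is 0.0000379591 / 0.0000132171 ≈ 2.872.

The odds ratio for AG is estimated by increasing the latter by one unit, whilst all other variables are fixed at zero (‘0’):

$$\text{odds} = e^{-11.234 - 0.004(1)} = 0.0000131643$$

so that the odds ratio is 0.0000131643 / 0.0000132171 ≈ 0.996.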


The percentage of change of the odds ratios is represented as Exp(B) in Table 14. For model III, the odds ratios for the variables can be computed in the same way.

At the starting point,

$$\text{odds} = e^{-21.666} = 3.895613 \times 10^{-10}$$

With one unit of increase in TCRI and the rest of the variables fixed at ‘0’:

$$\text{odds} = e^{-21.666 + 1.099(1)} = 1.169137 \times 10^{-9}$$

Then, the change is:

$$\frac{1.169137 \times 10^{-9}}{3.895613 \times 10^{-10}} \approx 3.001$$

The rest of the variables have values of 0.996, 1.46, 2.745 and 6342.31 for AG, Yrate, Mstock and MGDP respectively. If the model is loaded with the exact number of real GDP, the Exp(B) becomes 100% (see Table 15), which seems more reasonable.
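The odds calculations for models II and III can be reproduced directly from the coefficients quoted above. This is a minimal sketch; only the intercepts and the TCRI and AG coefficients stated in the text are used, and the variable names are illustrative.

```python
import math

# Model II coefficients as quoted in Equation 12.
b0_ii, b_tcri_ii, b_ag_ii = -11.234, 1.055, -0.004

odds_start_ii = math.exp(b0_ii)               # all predictors at zero
odds_tcri_ii = math.exp(b0_ii + b_tcri_ii)    # TCRI increased by one unit
odds_ag_ii = math.exp(b0_ii + b_ag_ii)        # AG increased by one unit

# Model III intercept and TCRI coefficient as quoted in the text.
b0_iii, b_tcri_iii = -21.666, 1.099
odds_start_iii = math.exp(b0_iii)
odds_tcri_iii = math.exp(b0_iii + b_tcri_iii)

print(f"Model II:  start = {odds_start_ii:.10f}")
print(f"Model II:  odds ratio TCRI = {odds_tcri_ii / odds_start_ii:.3f}")
print(f"Model II:  odds ratio AG   = {odds_ag_ii / odds_start_ii:.3f}")
print(f"Model III: start = {odds_start_iii:.6e}")
print(f"Model III: odds ratio TCRI = {odds_tcri_iii / odds_start_iii:.3f}")
```

In each case the ratio of the two odds equals the exponential of the coefficient, which is why Exp(B) can be read as the multiplicative change in the odds for a one-unit increase in the predictor.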

TABLE 15: Robust test for Real Gross Domestic Product (RGDP).

Goodness of fit
A better presentation of model performance can be achieved by examining the error rates because they show the risk of a financial institution incorrectly assessing a case. In this section, we examined the models built (Models I, II and III) with the training sample (ex_ante). The models were tested with the hold-out sample in order to demonstrate the accuracy and goodness of fit. Two different approaches were used.

The first approach employed was based on setting the same cut-off value of 0.23 for all three models. In Table 16, we show the sensitivity, specificity and classification accuracy for all three models in the training and test sets.

TABLE 16: Percentage correct of accuracy test.

Models I and II have similar classification accuracy in discriminating between normal and defaulted firms, with 96.3% for the ex_ante set and 98.8% and 98.7% respectively for the ex_post set. Model III has lower classification accuracy than models I and II, with 95.4% at ex_ante and 85.5% at ex_post. It should be highlighted, however, that classification accuracy is not the most important measure in this study, because defaults make up only 2.3% of the total. The important measure is the number of defaults that are predicted as being normal. On this measure, model III improves significantly: it correctly predicts 50% of the defaulted firms in the training set and 90.2% in the test set, whilst the other two models remain between 37% and 45%.
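The point about classification accuracy can be illustrated with a minimal sketch. It assumes that a firm is classified as a default when its predicted probability is at or above the cut-off (the article does not state the direction of the rule), and it uses an invented sample of 1000 firms in which only the 2.3% default share is taken from the text. The function name `classification_rates` is our own.

```python
def classification_rates(probs, labels, cutoff=0.23):
    """Sensitivity, specificity and accuracy for a given cut-off.

    labels: 1 = defaulted firm, 0 = normal firm.
    Assumption: a predicted probability >= cutoff is classified as a default.
    """
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for p, y in pairs if y == 1 and p < cutoff)
    tn = sum(1 for p, y in pairs if y == 0 and p < cutoff)
    fp = sum(1 for p, y in pairs if y == 0 and p >= cutoff)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Invented sample: 23 defaults among 1000 firms (2.3%), scored by a useless
# model that assigns every firm a default probability of zero.
labels = [1] * 23 + [0] * 977
probs = [0.0] * 1000
rates = classification_rates(probs, labels)
print(rates)
```

A model that never predicts a default still reaches an accuracy of 97.7% on such a sample whilst detecting none of the defaults, which is why sensitivity is the more informative measure here.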

Setting the same cut-off value may not be the optimal way to determine which model is the best at predicting the defaults, since different cut-off values provide different results for each model. The second approach therefore compares the three models on the assumption that the type II error is approximately 2% for all three models in the ex_ante set. The interpretation of classification accuracy (Table 17) is not critical because of the large number of normal firms.
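One way to fix the error rate across models is to search for the cut-off that brings it closest to the target. The sketch below assumes that the type II error is the share of normal firms classified as defaults (1 minus specificity); the article does not define the term in this section, so this reading, the function name `cutoff_for_type2_error` and the toy scores are all our own.

```python
def cutoff_for_type2_error(probs, labels, target=0.02):
    """Return the cut-off whose type II error is closest to `target`.

    Assumption: type II error = share of normal firms (label 0) whose
    predicted probability is >= cutoff, i.e. 1 - specificity.
    """
    normal = [p for p, y in zip(probs, labels) if y == 0]
    best_cutoff, best_gap = None, float("inf")
    for c in sorted(set(probs)):  # every observed score is a candidate cut-off
        rate = sum(1 for p in normal if p >= c) / len(normal)
        gap = abs(rate - target)
        if gap < best_gap:
            best_cutoff, best_gap = c, gap
    return best_cutoff


# Invented scores: 100 normal firms spread evenly over [0, 0.99] and 2 defaults.
probs = [i / 100 for i in range(100)] + [0.5, 0.995]
labels = [0] * 100 + [1, 1]
cutoff = cutoff_for_type2_error(probs, labels)
print(cutoff)
```

Each model would receive its own cut-off in this way, after which the sensitivities can be compared at a common type II error.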

TABLE 17: Percentage accuracy against specificity.

In order to examine the overall discriminatory power of the models, regardless of the cut-off value, the ROC curve of each training and test set is examined. This is relevant because, in practice, financial institutions are more concerned with the cases classified as normal firms that are in reality likely to default on credit arrangements (Banasik, Crook & Thomas 1999). In general, the most efficient model in terms of the ROC curve is the one that loses only a small amount of sensitivity when attempting to increase the specificity. Sensitivity is plotted against 1-specificity to create the ROC curve. Figure 3 shows the training set (ex_ante) ROC curves of model I, model II and model III, whilst Figure 4 presents the ROC curves of the test set (ex_post).
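The construction of the ROC curve described above can be sketched as follows. The scores and labels are invented, the function name `roc_points` is our own, and a firm is assumed to be classified as a default when its predicted probability is at or above the cut-off.

```python
def roc_points(probs, labels):
    """Return (1 - specificity, sensitivity) pairs, one per candidate cut-off.

    labels: 1 = defaulted firm, 0 = normal firm.
    """
    n_default = sum(labels)
    n_normal = len(labels) - n_default
    points = [(0.0, 0.0)]  # cut-off above every score: nothing flagged as default
    for c in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= c)
        fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= c)
        points.append((fp / n_normal, tp / n_default))
    return points


# Invented example with two defaulted and two normal firms.
for x, y in roc_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]):
    print(f"1 - specificity = {x:.2f}, sensitivity = {y:.2f}")
```

Lowering the cut-off moves along the curve from the origin towards the point (1, 1); a curve that rises steeply near the origin corresponds to a model that gains sensitivity at little cost in specificity.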

FIGURE 3: Receiver Operator Characteristics (ROC) curve of ex_ante.

FIGURE 4: Receiver Operator Characteristics (ROC) curve of ex_post.

In the ex_ante and the ex_post set, the ROC curve of model III is slightly wider than the curve for model II and model I. This can be interpreted as indicating that the (classification) accuracy of model III may be higher on average for different cut-off intervals.

To further examine the discriminatory power of the models, the area under the ROC curve (AUC) is calculated. The AUCs of models I, II and III in the ex_ante set are 0.866, 0.863 and 0.876 respectively, with a confidence level of 95%, as shown in Table 18. This indicates that model III possesses greater predictive power than models I and II.
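The AUC can be read as the probability that a randomly chosen defaulted firm receives a higher predicted probability of default than a randomly chosen normal firm. The sketch below computes it from that definition by comparing every default-normal pair, with ties counted as one half. It is not necessarily the procedure used by the software in this study, and the data are invented.

```python
def auc(probs, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    labels: 1 = defaulted firm, 0 = normal firm.
    """
    defaults = [p for p, y in zip(probs, labels) if y == 1]
    normals = [p for p, y in zip(probs, labels) if y == 0]
    wins = 0.0
    for d in defaults:
        for n in normals:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(defaults) * len(normals))


# Invented example: three of the four default-normal pairs are ranked correctly.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # prints 0.75
```

An AUC of 0.5 corresponds to a model with no discriminatory power and 1.0 to perfect separation of defaulted and normal firms.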

TABLE 18: Area under Receiver Operator Characteristics (ROC) curve (AUC).

A comparison of the models (Table 19) shows that model III is more robust as the logistic regression indicates that the systematic risk has a significant relationship with the default probability.

TABLE 19: Comparison of the results.

Ethical considerations

This study was undertaken to adhere to the framework and ethical policies of both the University of Southampton and the University of Johannesburg.


The authors confirm that to the best of their ability, the measuring instruments and procedures adopted in this study are accurate.


In this paper, our specific objective was to examine the predictive performance of credit scoring systems. A test of goodness of fit demonstrated that credit-scoring models that incorporate the Taiwan Corporate Credit Risk Index (TCRI), micro-economic and macro-economic variables possessed greater predictive power, thus suggesting that macro-economic variables do have explanatory power for default credit risk. Specifically, we found that, in addition to the robustness of predictive power provided by holistic credit-scoring models that incorporate TCRI as well as micro- and macro-factors, the most predictive inputs are the TCRI and the asset growth rate for the micro-factors and the one-year interest rate, stock index and real GDP for the macro-factors. Only asset growth has a negative influence on the probability of default, whilst the rest of the variables present positive relationships. To an extent, the findings from this study should not be surprising as they confirm the need to incorporate macro-economic factors during credit assessment. As indicated earlier, studies do suggest a continued interest among banking managers in understanding the ever-increasing interdependencies in the global economy and how these impact on banking operations. For example, in the case of interest rates (a well-known macro-economic factor), rises in interest rates can increase default risk because of their impact on the cost of borrowing.

Limitations of the study
Notwithstanding the findings, this study was not without limitations. One limitation relates to the selection of predictive variables which would have been enhanced by an inductive determination of impact prior to selection. In our study, however, the selection of predictive variables was deductive. Although these limitations do exist, we feel that the value of the study is in its contribution as it provides a validated methodology as to what the fundamental issues are that financial institutions in Taiwan need to take into account when building internal predictive default credit-risk models. We therefore suggest that future studies may need to focus first on assessing the inductive determination of the impact of such macro-variables in order to better capture systematic or country risk and further explore their explanatory-discriminatory power on credit-risk scoring models.


The authors would like to thank the editor and two blind reviewers for their valuable comments and suggestions to this article.

Competing interests
The authors declare that they have no financial or personal relationship(s) which may have inappropriately influenced them in writing this article.

Authors’ contributions
S.W.S. (University of Southampton), T.D.N. (University of Southampton) and U.O. (University of Johannesburg) all made equal conceptual contributions that led to the development of this article.


Abdou, H. & Pointon, J., 2011, ‘Credit scoring, statistical techniques and evaluation criteria: A review of the literature’, Intelligent Systems in Accounting, Finance and Management 18(2–3), 59–88.

Altman, E., 1968, ‘Financial ratios, discriminant analysis and the prediction of corporate bankruptcy’, Journal of Finance 23(4), 589–609.

Altman, E., Haldeman, R. & Narayanan, P., 1977, ‘ZETA analysis: A new model to identify bankruptcy risk of corporations’, Journal of Banking & Finance 1(1), 29–54.

Altman, E. & Sabato, E., 2007, ‘Modelling credit risk for SMEs: Evidence from the US market’, Accounting Foundation 43(3), 332–357.

Avery, R., Brevoort, K. & Canner, G., 2009, ‘Credit scoring and its effects on the availability and affordability of credit’, Journal of Consumer Affairs 43(3), 516–537.

Banasik, J., Crook, J. & Thomas, L., 1999, ‘Not if but when will borrowers default’, Journal of the Operational Research Society 50(12), 1185–1190.

Basel Committee on Banking Supervision, 1999, Credit risk modelling. Current Practices and Applications, Basel Committee Publications.

Bellotti, T. & Crook, J., 2007, ‘Credit scoring with macroeconomic variables using survival analysis’, Journal of the Operational Research Society 60(12), 1699–1707.

Berger, A., Dai, Q., Ongena, S. & Smith, D., 2003, ‘To what extent will the banking industry be globalized? A study of bank nationality and reach in 20 European nations’, Journal of Banking & Finance 27(3), 383–415.

Berlin, M. & Mester, L., 1998, ‘On the profitability and cost of relationship lending’, Journal of Banking & Finance 22(6−8), 873–897.

Bewick, V., Cheek, L. & Ball, J., 2005, ‘Statistics review 14: Logistic regression’, Critical Care 9(1), 112–118. PMid:15693993, PMCid:1065119

Bikker, J. & Haaf, K., 2002, ‘Competition, concentration and their relationship: An empirical analysis of the banking industry’, Journal of Banking & Finance 26(11), 2191–2214.

Bonfim, D., 2009, ‘Credit risk drivers: Evaluating the contribution of firm level information and of macroeconomic dynamics’, Journal of Banking & Finance 33(2), 281–299.

Carling, K., Jacobson, T., Linde, J. & Roszbach, K., 2007, ‘Corporate credit risk modeling and the macroeconomy’, Journal of Banking & Finance 31(3), 845–868.

Charitou, A., Neophytou, E. & Charalambous, C., 2004, ‘Predicting corporate failure: empirical evidence for the UK’, European Accounting Review 13(3), 465–497.

Chen, W. & Shih, J., 2006, ‘A study of Taiwan’s issuer credit rating systems using support vector machines’, Expert Systems with Applications 30(3), 427–435.

Chiu, Y., Chen, Y. & Bai, X., 2011, ‘Efficiency and risk in Taiwan banking: SBM super-DEA estimation’, Applied Economics 43(5), 587–602.

Chung, H., 2006, ‘Managerial ties, control and deregulation: An investigation of business groups entering the deregulated banking industry in Taiwan’, Asia Pacific Journal of Management 23(4), 505–520.

Dambolena, I. & Khoury, S., 1980, ‘Ratio stability and corporate failure’, Journal of Finance 35(4), 1017–1026.

De Andrade, F. & Thomas, L., 2007, ‘Structural models in consumer credit’, European Journal of Operational Research 183(3), 1569–1581.

Dinh, T. & Kleimeier, S., 2007, ‘A credit scoring model for Vietnam’s retail banking market’, International Review of Financial Analysis 16(5), 471–495.

Dong, G., Lai, K. & Yen, J., 2010, ‘Credit scorecard based on logistic regression with random coefficients’, Procedia Computer Science 1(1), 2463–2468.

Duffie, D., Saita, L. & Wang, K., 2007, ‘Multi-period corporate failure prediction with stochastic covariates’, Journal of Financial Economics 83(3), 635–665.

Eisenbeis, R., 1978, ‘Problems in applying discriminant analysis in credit scoring models’, Journal of Banking & Finance 2(3), 205–219.

Federal Reserve Bank, 1998, ‘Credit risk models at major US banking institutions: current state of the art and implications for assessments of capital adequacy’, Federal Reserve Bank Board of Governors, Supervisory Staff Reports, Washington, viewed 03 October 2011, from

Field, A., 2009, Discovering statistics using SPSS, SAGE, London.

Field, A. & Miles, J., 2010, Discovering statistics using SAS, SAGE, London.

Figlewski, S., Frydman, H. & Liang, W., 2012, ‘Modeling the effect of macroeconomic factors on corporate default and credit rating transitions’, International Review of Economics & Finance 21(1), 87–105.

Finlay, S., 2010, ‘Credit scoring for profitability objectives’, European Journal of Operational Research 202(2), 528–537.

Frame, W., Srinivasan, A. & Woosley, L., 2001, ‘The effect of credit scoring on small-business lending’, Journal of Money, Credit and Banking 33(3), 813–825.

Greene, H., 1993, Econometric analysis, Prentice-Hall, Englewood Cliffs, N.J.

Hahn, E. & Soyer, R., 2005, ‘Probit and logit models: Differences in the multivariate realm’, unpublished Working Paper, viewed 14 September 2011, from

Hamerle, A., Liebig, T. & Rosch, D., 2003, ‘Credit risk factor modelling and the Basel II IRB approach’, Series 2: Banking and Financial Supervision, No 02/2003, viewed 08 October 2011, from

Hand, D. & Henley, W., 1997, ‘Statistical classification methods in consumer credit scoring: A review’, Journal of the Royal Statistical Society: Series A. Statistics in Society 160(3), 523–541.

Jokipii, T. & Monnin, P., 2013, ‘The impact of banking sector stability on the real economy’, Journal of International Money and Finance 32, 1–16.

Kao, C. & Liu, S., 2004, ‘Predicting bank performance with financial forecasts: A case of Taiwan commercial banks’, Journal of Banking & Finance 28(10), 2353–2368.

Li, Y., 2005, ‘DEA efficiency measurement with undesirable outputs: an application to Taiwan’s commercial banks’, International Journal of Services Technology and Management 6(6), 544–555.

Liu, Y. & Hung, J., 2006, ‘Services and the long-term profitability in Taiwan’s banks’, Global Finance Journal 17(2), 177–191.

Maggi, B. & Guida, M., 2011, ‘Modelling non-performing loans probability in the commercial banking system: efficiency and effectiveness related to credit risk in Italy’, Empirical Economics 41(2), 269–291.

Minetti, R. & Zhu, S., 2011, ‘Credit constraints and firm export: Microeconomic evidence from Italy’, Journal of International Economics 83(2), 109–125.

Mood, C., 2010, ‘Logistic regression: Why we cannot do what we think we can do, and what we can do about it’, European Sociological Review 26(1), 67–82.

Pesaran, M., Schuermann, T., Treutler, B. & Weiner, S., 2006, ‘Macroeconomic dynamics and credit risk: A global perspective’, Journal of Money, Credit and Banking 38(5), 1211–1261.

Retzer, J., Soofi, E. & Soyer, R., 2009, ‘Information importance of predictors: Concept, measures, Bayesian inference, and applications’, Computational Statistics & Data Analysis 53(6), 2363–2377.

Taiwan Economic Journal, n.d., ‘Taiwan credit risk index’, viewed 10 June 2011, from

Tam, K. & Kiang, M., 1992, ‘Managerial applications of neural networks: The case of bank failure predictions’, Management Science 38(7), 926–947.

Thomas, L., Oliver, R. & Hand, D., 2005, ‘A survey of the issues in consumer credit modelling research’, Journal of the Operational Research Society 56(9), 1006–1015.

Tsai, B. & Huang, Y., 2010, ‘Alternative financial distress prediction models’, Journal of Contemporary Accounting 11(1), 51–78.

Wang, L., Kuo, H., Yu, S. & Wu, C., 2008, ‘Loan policy and bank performance: Evidence from Taiwan’, Banks and Bank Systems: International Research Journal 5(2), 108–120.

Westgaard, S. & Wijst, N., 2001, ‘Default probabilities in a corporate bank portfolio: A logistic model approach’, European Journal of Operational Research 135, 338–349.

Yap, B., Ong, S. & Husain, N., 2011, ‘Using data mining to improve assessment of credit worthiness via credit scoring models’, Expert Systems with Applications 38(10), 13274–13283.


1. The TEJ (Taiwanese Economic Journal) is a major source for Taiwanese financial institutions to obtain historic data on corporate statistics and macroeconomic factors.

2. According to Altman (1968), it is unwise to use a sample that includes cases with a very rare probability of default. Please note that based on this exclusion, the TCRI rating will be accepted as being the base for the future credit-risk assessment of the models.

3. Altman and Sabato (2007) set the same cut-off value of 0.3 for different models so as to compare the accuracy ratios.
