The paper compares the divergence of existing credit risk models and builds a synergic model with superior forecasting power from a rating model and a probability of default model of Russian banks. It demonstrates that rating models, when applied alone, tend to overestimate the instability of a bank, whereas probability of default models underestimate it. By assigning optimal weights and monotonic transformations to these models, a new synergic model of banks’ credit risk with higher forecasting power (44% of rating grades predicted exactly) was obtained.
Economic growth and stability of any country depend on the financial environment of its banking system. Given the critical role of banks as financial intermediaries, estimating their financial stability is one of the main goals of regulators and governments. The most common ways of assessing the financial performance of a bank and controlling its level of credit risk are an evaluation of its probability of default and a rating grade. The probability of default (PD) is the likelihood of a bank failure over a fixed assessment horizon, while a rating determines the class to which a company belongs based on the PD. Although both methods have been studied intensely, the forecasting power of these models still leaves wide room for improvement. There are possible biases that may lead to misleading results. PD estimates produced by a model tend to be underestimated because datasets containing defaults are imbalanced. Default events are rare, so a PD model becomes overfitted towards non-default events. Even the classical balancing data methods provided by
In the presence of this divergence, this paper is aimed at adjusting the previously used models of credit risks to a single scale and creating a
This research is based on the “Banks and Finance” database provided by the informational agency “Mobile”. The panel dataset of Russian banks was used in the analysis. The total number of banks after filtration was 395 (86 of them experienced the default). The financial performance of these banks was considered on a quarterly basis from the year 2007 to 2016, so the overall number of observations was 11,627 which should be sufficient to make consistent conclusions.
The rest of the paper is structured as follows. The second section provides results of a literature review. Then the analysis of empirical data and the formation of a representative sample are illustrated. The third section deals with the econometric models for forecasting a bank’s rating and PD on the same dataset, followed by a check of their goodness of fit. In the fourth section, the PD and rating estimates are calibrated to a common scale, the distributions of their forecast errors are compared, and the divergence in these estimates is analyzed. As a result, the synergic model with a higher forecasting power is constructed by assigning optimal weights and monotonic transformations to the PD and rating models. Finally, the synergic model is checked for its out-of-sample fit and conclusions are formulated.
The paper unifies two seemingly separate areas of economic literature. The first area addresses the issue of underestimation of credit risk by default models, while the second area concerns the overcautious assignments of credit ratings.
All recent studies advise to pay great attention to the presence of the class imbalance problem in data on defaults and its impact on the estimation procedure and on some standard forecasting power indicators (
The second literature stream has long examined the differences between the rating assessments of different RAs. Although many rating agencies use similar letter designations, their approaches to financial analysis differ. It was observed that Standard & Poor’s is more cautious and conservative when evaluating the financial stability of banks than its two largest competitors, Fitch and Moody’s. It was also revealed that Moody’s approach to the assessment of banking risks is the most liberal (
Therefore, having observed the divergence between ratings and PD modeling, several researchers have proposed combining these two forecasts in order to increase the power of predicting the financial instability of a bank. Note that the two approaches skew their predictions in exactly opposite directions, which makes their combination even more reliable. For example,
The algorithm of this paper includes several steps. The first step is to construct PD model and credit ratings’ models separately on the same dataset using the basic rating scale adjustment provided by
This research is based on the “Banks and Finance” database provided by the informational agency “Mobile”. It is a verified source of diverse information about international financial companies that is used extensively in the academic literature. The database provides monthly financial data that allowed us to obtain a panel dataset of Russian banks. Financial data includes balance sheet, income statement, calculated ratios and other information. There are 2071 banks in the “Mobile” database and the data was initially extracted from 2007 to 2016.
In order to generate a representative sample from the database, some data filtration methods were applied. First, the distribution of banks by ownership type was considered. The focus of this paper is on individual profit-maximizing banks, so all state-owned banks were omitted. The definition of a state-owned bank was taken from the paper by
The main reduction of the sample size was due to the fact that only a small share of banks (395 banks) had been assigned a rating grade. The history of rating changes was taken from Cbonds.ru and Bankodrom.ru, which are the main online aggregators of banking statistics. The extracted data contained assessments by national RAs (RAEX, RusRating, AK&M, NRA, RiaRating) and international agencies (Moody’s, Standard & Poor’s or Fitch). The data on banks’ defaults were then collected from Cbr.ru and Banki.ru. 86 Russian banks (which received a rating assessment at least once) defaulted in the period concerned. Both ratings and defaults were added to the financial data with a two-quarter lag between them. This lag was chosen because assigning a rating takes the RA some time to complete all the necessary procedures.
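The two-quarter alignment can be illustrated with a minimal sketch. It assumes that quarters are encoded as (year, quarter) pairs and that the financial data of quarter t are paired with the rating observed at t + 2; the direction of the lag, the helper names and the toy values are illustrative assumptions rather than the procedure actually used on the database.

```python
def shift_quarter(year, quarter, lag):
    """Return the (year, quarter) that lies `lag` quarters after the input."""
    index = year * 4 + (quarter - 1) + lag
    return index // 4, index % 4 + 1


def attach_lagged(financials, ratings, lag=2):
    """Pair each (bank, year, quarter) financial record with the rating
    observed `lag` quarters later; records without a match are dropped."""
    merged = {}
    for (bank, year, quarter), features in financials.items():
        key = (bank, *shift_quarter(year, quarter, lag))
        if key in ratings:
            merged[(bank, year, quarter)] = (features, ratings[key])
    return merged


# Hypothetical toy data: one bank, two quarters of financials, one rating.
financials = {
    ("bank_a", 2015, 3): {"equity_to_assets": 0.12},
    ("bank_a", 2015, 4): {"equity_to_assets": 0.11},
}
ratings = {("bank_a", 2016, 1): 17}

aligned = attach_lagged(financials, ratings)
print(aligned)
```

Only the 2015 Q3 record survives in this toy example, because no rating is available two quarters after 2015 Q4.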
The historical distribution of all Russian banks (before any filtrations) for the period from 2007 to 2017 is demonstrated on the Fig.
Historical annual distribution of defaults of Russian banks from 2007 to 2017.
The initial database was imbalanced (223 defaults compared to 11,404 non-defaults). The imbalance is intrinsic, that is, it stems from the nature of the data set. Furthermore, no data is available for the “default” class after a bank has defaulted. This leads to embedded rarity and within-class imbalances, as well as to the failure of learning algorithms to generalize inductive rules (
RAs assign their grades in a symbolic form. However, in order to obtain coefficient estimates in an econometric model, these symbols should be transformed into numerical values. Moreover, symbolic ratings of different rating agencies should be unified to the base scale. The process of comparison of rating scales was taken from the paper of
As a result of comparisons of multiple mapping, it was found by
where
In this research, one more Russian rating agency was added to the comparison list: this agency is RiaRating with the estimated regression coefficients
In order to avoid a loss of consistency in the model, ratings were assumed to remain unchanged until a new rating was assigned. The final version of the dependent variable was obtained by averaging the single-scale numeric grades of a bank in a particular quarter over all rating agencies. However, the averaging procedure produced numerous non-integer rating groups (e.g. rating = 17.43), and the differences between these groups were too small to be modeled properly. For this reason, the numeric rating was rounded to the closest integer. As a result, 30 different rating groups were considered in this paper.
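The construction of the dependent variable can be sketched as follows. The sketch assumes that quarters are indexed by consecutive integers and that ties are rounded half up (the text only says "closest integer"); the agency names and grades are hypothetical.

```python
def carry_forward(assignments, quarters):
    """assignments: {quarter: grade} for one agency. Returns the grade in
    force in each quarter, keeping the last grade until a new assignment."""
    in_force, current = {}, None
    for q in quarters:
        current = assignments.get(q, current)
        if current is not None:
            in_force[q] = current
    return in_force


def consensus_rating(per_agency, quarters):
    """Average the grades in force across agencies, then round to an integer."""
    filled = [carry_forward(a, quarters) for a in per_agency.values()]
    result = {}
    for q in quarters:
        grades = [f[q] for f in filled if q in f]
        if grades:
            mean = sum(grades) / len(grades)
            result[q] = int(mean + 0.5)  # round half up (assumption)
    return result


# Hypothetical single-scale grades of one bank from two agencies.
per_agency = {
    "agency_x": {0: 17.0, 2: 18.5},
    "agency_y": {1: 17.86},
}
consensus = consensus_rating(per_agency, quarters=range(4))
print(consensus)
```

In quarter 1 the two grades in force average to 17.43, which is rounded to the integer group 17.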
The models introduced in this paper allow interested agents to determine the probability of default and credit ratings for Russian banks, having at their disposal only public information. As for the modeling methods of this research, binary logit/probit regressions were chosen for PD estimation and multinomial ordered logit/probit for credit ratings modeling. It was shown (
The optimal set of indicators was selected on the basis of the most significant parameters that were chosen by a stepwise procedure (
The variable specification of the models was continuously challenged by the choice of financial variables, their cross terms and macroeconomic variables used as principal components (PCs) (explained in the section 4.3). The final models were checked for multicollinearity and all explanatory variables had correlations less than 35% and reasonable descriptive statistics. The results obtained by the panel probit regressions of PD modeling are shown in Table
The results of PD models for different samples and different groups of variables.
Dependent variable / Independent variables  1.1  1.2  2.1  2.2
Equity / Assets  –10.023**  –6.769**  –22.567***  –11.759***
Operational expenses / Operating income  1.72***  0.009***  2.169***  1.982***
Net interest margin  –2.261*  –10.28***  –5.859**  –12.567***
Interbank ratio  –0.0004**  –0.002***  –0.003***  –0.009***
Bank equity / Equity of all banks  ⎫ Share PC  –12.25***  –125.739***
Log total assets  –1.909***  –0.903***  –23.813***  –5.239***
(Log total assets)^{2}  –  –
Current ratio (CR)  ⎫ Asset Liq PC  –0.004***  –3.985***
Loan loss reserves / Gross loans  0.159***  –0.203***  17.748***  –8.815***
CR × RGDP growth rate  –  –
CR × Loan loss reserves / Gr. loans  –  –
Real GDP growth rate  ⎫ Macro PC_{1} PC_{2}  –  –
CPI growth rate  –  –
Exchange rate USD/RUB  –  6.045***  –  10.725***
RGDP per capita  –  –
Trade balance  –  –3.018***  –  –4.978**
Number of observations  11 627  11 627  3289  3289
Log-likelihood  –538.21  –307.55  –467.97  –309.03
Log-likelihood (null model)  –1281.45  –1281.45  –882.97  –882.97
Pseudo R^{2}  0.58  0.76  0.47  0.65
% of correct predictions  96  98  72.3  79.5
AIC  24 779.704  19 347.724  1276.8  845.2
BIC  24 909.996  19 722.892  1387.2  912.4
*
The results of ratings models for different samples and different groups of variables.
Dependent variable / Independent variables  1.1  1.2  2.1  2.2
Equity / Assets  –0.249***  –0.345***  –0.236****  –0.632***
Operational expenses / Operating income  –  –  –  –
Net interest margin  –0.029***  –0.037***  –0.007**  –0.013**
Interbank ratio  –0.569***  –0.359**  –0.487***  –0.678**
Bank equity / Equity of all banks  ⎫ Share PC  –  –
Log total assets  –  –0.612*  –  –0.0018**
(Log total assets)^{2}  –0.009**  –0.005**
Current ratio (CR)  ⎫ Asset Liq PC  –0.051*  –0.001
Loan loss reserves / Gross loans  –  –0.673***  4.254*  –0.387***
CR × RGDP growth rate  –  –
CR × Loan loss reserves / Gr. loans  –0.002***  –0.003***
Real GDP growth rate  ⎫ Macro PC_{1} PC_{2}  –  –
CPI growth rate  –  –
Exchange rate USD/RUB  –  –0.592***  –  –1.036***
RGDP per capita  –  –
Trade balance  –  8.265***  –  7.024***
Number of observations  11 627  11 627  3289  3289
Log-likelihood  –3254.21  –2964.40  –2965.91  –2283.72
Log-likelihood (null model)  –5827.95  –5827.95  –4361.92  –4361.92
Pseudo R^{2}  0.53  0.68  0.38  0.47
% of correct predictions  12  18  5  7
AIC  24 779.704  22 347.724  12 779.735  11 649.745
BIC  24 909.996  22 722.892  12 937.468  11 732.491
*
In order to interpret the signs of the estimated coefficients correctly, one should remember that a higher value of the dependent variable means a higher probability of default in the PD models and a higher numeric rating in the ratings models, and that a higher numeric rating corresponds to a bank with lower financial stability. Keeping this in mind, we can conclude that all signs of coefficients coincide with their expected impact on PD and credit ratings for all regressions.
The first model specification (models 1.1 and 2.1) included only financial variables that were based on the BFSR methodology described in previous studies of authors (
Macroeconomic variables and cross terms of financial variables are heavily correlated with each other, which inevitably leads to multicollinearity problems if no measures are taken. Since the model is constructed primarily for forecasting, principal component analysis (PCA) is used to eliminate these potential problems. PCA (
Asset–Liquidity group (includes 5 variables: Current ratio; Current ratio × GDP growth rate; Current ratio × Loan loss reserves / Gross loans; Loan loss reserves / Gross loans; GDP growth rate);
Market share group (includes 3 variables: Log total assets; (Log total assets)^{2}; Bank equity share in total equity of all banks);
Macroeconomic group (includes 4 variables: CPI growth rate; Exchange rate USD/RUB; GDP per capita; Trade balance).
Let us assume that initial variables from liquidity group are called
Secondly, the normalized data are given a new orthogonal basis via constructing linear combinations of
Once the PCA transformation is completed, the coefficients are no longer interpretable in an economic sense. In order to calculate the marginal effects of the variables inside the PCs, we need to perform the reverse procedure, from the coefficients of the principal components back to the coefficients of the initial variables.
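The reverse procedure can be sketched under a standard assumption: each principal component is a linear combination of standardized variables, so the coefficient implied for an initial variable is the loading-weighted sum of the component coefficients divided by that variable's standard deviation. The function name and all numbers below are hypothetical, and the sketch covers only the linear index, not the marginal effects themselves.

```python
def implied_coefficients(gammas, loadings, stds):
    """Map coefficients on principal components back to the initial variables.

    gammas[k]      : estimated coefficient on principal component k
    loadings[k][j] : weight of standardized variable j in component k
    stds[j]        : standard deviation used to standardize variable j
    Returns the coefficient each initial variable carries in the linear index.
    """
    n_vars = len(stds)
    return [
        sum(g * w[j] for g, w in zip(gammas, loadings)) / stds[j]
        for j in range(n_vars)
    ]


# Hypothetical example: two components built from two variables.
gammas = [2.0, -1.0]
loadings = [[0.6, 0.8], [0.8, -0.6]]
stds = [2.0, 0.5]

betas = implied_coefficients(gammas, loadings, stds)
print(betas)
```

The signs of the implied coefficients are what the sign interpretation relies on; their magnitudes depend on the scale of each initial variable.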
The process will be shown on the example of macroeconomic variables. As each of
Marginal effect at any point is calculated as
If
The reverse procedure for marginal effects gave the expected signs. In the second specification (models 1.2 and 2.2), various cross products and macro variables were tested. The asset-liquidity principal component captures the interdependence of banks’ asset quality and liquidity: banks with a better loan portfolio tend to have more stable liquidity. Moreover, it shows a correlation between the GDP growth rate and the liquidity of a bank. An increase in the GDP growth rate leads to an increase in the investments and savings of firms and households; they, in turn, pay off their debts to banks more easily, and banks’ liquid funds increase. The market share principal component was also highly significant in all models, which shows the importance of including different measures of market share.
Comparing the predictive power of the different variable specifications gives the expected result: the model with principal components (PC), which includes financial variables, their cross terms and macro variables, gives the highest share of correct forecasts in every sample. As the result of this step of the research, the predicted values for the best model specification of both the PD model (2.2) and the credit ratings model (1.2) were computed.
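One of the goodness-of-fit figures can be checked directly. The sketch below assumes that the two log rows of the PD results table are the fitted and the null log-likelihoods and that the reported Pseudo R² is McFadden's measure; under these assumptions the reported PD values are reproduced to two decimals.

```python
def mcfadden_r2(loglik_model, loglik_null):
    """McFadden pseudo R-squared: 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null


# (fitted log-likelihood, null log-likelihood) as reported for the PD models.
pd_models = {
    "1.1": (-538.21, -1281.45),
    "1.2": (-307.55, -1281.45),
    "2.1": (-467.97, -882.97),
    "2.2": (-309.03, -882.97),
}

pseudo_r2 = {name: round(mcfadden_r2(ll, ll0), 2)
             for name, (ll, ll0) in pd_models.items()}
print(pseudo_r2)
```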
In order to compare the forecasting power of PD and credit ratings models, they should be presented in the same scale. There are various papers that study calibration of ratings and defaults (
This scale clearly shows the nonlinear pattern of PD and rating grade. In order to correspond the scale provided in Table
Calibration of the rating scale of S&P and probability of default.
Base rating scale  S&P Rating Scale  PD, % 
9  ruAAA  0.3626 
11  ruAA+  0.4885 
12  ruAA  0.6579 
13  ruAA–  0.8855 
13.5  ruA+  1.1909 
14  ruA  1.5999 
14.5  ruA–  2.1464 
15  ruBBB+  2.8741 
15.25  ruBBB  3.8388 
15.5  ruBBB–  5.1103 
15.75  ruBB+  6.7732 
16  ruBB  8.9263 
16.5  ruBB–  5.1103 
17  ruB+  15.1375 
17.5  ruB  19.3964 
18  ruB–  24.5074 
18.5  ruCCC+  30.4565 
18.75  ruCCC  37.1391 
19  ruCCC–  44.3529 
19.5  ruCC  51.813 
20  ruC  59.1931 
21  ruD  66.1806 
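Converting a PD forecast into the base rating scale can be sketched with the integer-grade rows of the table above. Piecewise-linear interpolation between those rows is an assumption made here for brevity; the table itself is clearly non-linear, so this is only an approximation of whatever calibration function was actually applied.

```python
from bisect import bisect_left

# (PD in %, base rating grade): integer-grade rows of the calibration table.
CALIBRATION = [
    (0.3626, 9.0), (0.4885, 11.0), (0.6579, 12.0), (0.8855, 13.0),
    (1.5999, 14.0), (2.8741, 15.0), (8.9263, 16.0), (15.1375, 17.0),
    (24.5074, 18.0), (44.3529, 19.0), (59.1931, 20.0), (66.1806, 21.0),
]


def pd_to_base_grade(pd_percent):
    """Interpolate a PD (in %) onto the base rating scale, clamping at the ends."""
    pds = [p for p, _ in CALIBRATION]
    if pd_percent <= pds[0]:
        return CALIBRATION[0][1]
    if pd_percent >= pds[-1]:
        return CALIBRATION[-1][1]
    i = bisect_left(pds, pd_percent)  # first tabulated PD >= pd_percent
    (p0, g0), (p1, g1) = CALIBRATION[i - 1], CALIBRATION[i]
    return g0 + (g1 - g0) * (pd_percent - p0) / (p1 - p0)


print(pd_to_base_grade(8.9263))
print(pd_to_base_grade(20.0))
```

A PD forecast converted this way can be rounded to the nearest grade before it is compared with the actual rating.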
Calibration of probability default (%) and the base rating scale.
Fig.
Calibrated models of PD and credit ratings can now be compared by their in-sample predictive power. For precise visualization, the PD model’s forecasts were converted into the base rating scale and the difference between the actual rating grade and the rating grade predicted by the PD model was calculated. Fig.
Distribution of deviations of ratings model and PD model forecasts (%).
From Fig.
To sum up the comparison, we should conclude that the first hypothesis of this paper was not rejected after empirical modelling. Indeed, ratings models tend to overestimate the financial instability of a bank, whereas PD models underestimate it.
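The direction of each bias can be read off the mode of the signed deviations. The sketch below assumes the deviation is defined as actual grade minus predicted grade on the numeric scale, where a higher grade means lower stability; the toy forecasts are hypothetical and only mimic the pattern described above.

```python
from collections import Counter


def deviation_distribution(actual, predicted):
    """Share (in %) of each signed deviation, actual grade minus predicted grade."""
    counts = Counter(a - p for a, p in zip(actual, predicted))
    total = sum(counts.values())
    return {d: 100.0 * c / total for d, c in sorted(counts.items())}


def mode(distribution):
    """Deviation with the largest share."""
    return max(distribution, key=distribution.get)


# Hypothetical grades on the numeric scale.
actual = [15, 16, 17, 18, 19, 20]
rating_forecast = [16, 17, 18, 18, 20, 19]  # mostly too pessimistic
pd_forecast = [14, 15, 17, 17, 18, 21]      # mostly too optimistic

rating_dist = deviation_distribution(actual, rating_forecast)
pd_dist = deviation_distribution(actual, pd_forecast)
print(mode(rating_dist), mode(pd_dist))
```

A negative mode means the model usually predicts a worse grade than the actual one, and a positive mode means the opposite.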
In order to construct a reliable synergic model, the ratings’ grade forecasts by PD and rating model should be computed for the same observations. Note that the rating model was estimated for 11627 observations, while PD model has 3489 estimates. Each observation has its own ID and time correspondence, so we find all id_time estimates that are present in both of these data sets. The overlapping of these datasets included 3011 estimates as the PD model had some artificially generated defaults. Then the regressions in which the dependent variable was the actual rating and explanatory variables were the fitted values of rating and PD models, were run on the 3011 observations.
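The matching step can be sketched as an intersection of keys. The sketch assumes each fitted value is stored under a (bank id, period) key; the identifiers, including the one standing in for an artificially generated default, are hypothetical.

```python
def overlap(rating_fits, pd_fits):
    """Keep only the (bank_id, period) keys present in both sets of fitted values.

    Returns a list of (key, rating-model grade, PD-model grade) tuples.
    """
    common = sorted(rating_fits.keys() & pd_fits.keys())
    return [(key, rating_fits[key], pd_fits[key]) for key in common]


# Hypothetical fitted grades from the two models.
rating_fits = {
    ("bank_a", "2015Q1"): 17,
    ("bank_a", "2015Q2"): 17,
    ("bank_b", "2015Q1"): 19,
}
pd_fits = {
    ("bank_a", "2015Q2"): 16,
    ("bank_b", "2015Q1"): 18,
    ("synthetic_1", "2015Q1"): 21,  # artificially generated default, no rating fit
}

matched = overlap(rating_fits, pd_fits)
print(matched)
```

Observations that exist in only one of the two data sets, such as generated defaults, drop out of the overlap.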
The first synergic model was obtained as a linear combination of PD default and rating model:
where
Estimated coefficients for the linear synergic model.

8.265*** 

0.182*** 

0.344*** 
Pseudo 
0.184 
*
Distribution of forecast errors for the linear synergic model (%).
The linear synergic model shows much higher predictive power than the PD or rating model on its own. It predicts 32% of rating grades exactly and up to 58% of ratings with an error within one grade. However, its error distribution still has heavy tails. In order to solve this problem, we use the logarithmic specification of the synergic model:
The regressions output is provided below in Table
Estimated coefficients for the logarithmic synergic model.

–7.268*** 

10.981*** 
Pseudo 
0.21 
*
Distribution of forecast errors for the logarithmic synergic model (%).
The synergic model based on the logarithm of the difference between the rating and PD forecasts was found to have the highest predictive power with the smallest deviations. Note that this distribution has very small tails, so the model does not produce any prediction errors larger than three rating grades. Therefore, this optimal combination yields the most consistent estimates: it predicts more than 44% of ratings exactly and 83% with a deviation within one rating grade.
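The two accuracy figures quoted for each model can be computed as below. The sketch assumes that "within one grade" means an absolute error of at most one grade on the integer scale; the sample grades are hypothetical.

```python
def hit_rates(actual, predicted):
    """Return (% of exact hits, % of forecasts within one grade)."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(errors)
    exact = 100.0 * sum(e == 0 for e in errors) / n
    within_one = 100.0 * sum(e <= 1 for e in errors) / n
    return exact, within_one


# Hypothetical actual and predicted grades.
actual = [15, 16, 17, 18, 19]
predicted = [15, 17, 17, 20, 18]

exact, within_one = hit_rates(actual, predicted)
print(exact, within_one)
```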
The second part of this section is devoted to the analysis of the out-of-sample predictive power of the logarithmic synergic model. For this purpose, the data were limited to the observations from 2007 to 2015. Based on the new coefficients of the PD and ratings models, the forecast for the year 2016 was made. In order to calculate the predicted ratings, the predicted probabilities of each rating grade were calculated as the difference between the values of the standard normal distribution (
The rating grade with the highest predicted probability was selected as the rating model’s forecast. Concerning the PD model, the probability forecasts were estimated and then calibrated to the rating scale. Then the predicted rating grades of the PD model and the rating model were taken with the functional form and estimated coefficients on 2007–2015 data for synergic models. The following coefficients and forecast error’s distributions were obtained under the outofsample fit check of the logarithmic synergic model (Table
Estimated coefficients for the logarithmic synergic model (out-of-sample).

–8.375*** 

10.729*** 
Pseudo 
0.19 
*
Then the actual financial data for the year 2016 was used as explanatory variables in both models and two separate forecasts of PD model (calibrated into ratings) and rating model were obtained. The following forecasts were placed into two different specifications of synergic models and the final synergic forecasts for the year 2016 were obtained. These forecasts were compared with the actual one assigned to a bank in the year 2016 and the distributions of forecast errors, illustrated on Fig.
Distribution of forecast errors for the logarithmic synergic model (out-of-sample, %).
The results show a slight, expected deterioration in the predictive power of the synergic models under the out-of-sample fit check. Nevertheless, the logarithmic model predicts the exact grade of the expected rating with a probability of 31.4%. In addition, the analysis of the out-of-sample power of the model shows that, in 72.3% of the cases, the prediction error of the expected rating of a bank does not exceed one rating grade. Based on this analysis, we can conclude that the logarithmic synergic model can be of practical use for predicting credit risks. Therefore, the aim of this research was achieved by constructing a synergic model with higher forecasting power from a set of alternative models: the model of probability of default and the model of credit ratings for Russian banks (hypothesis 2 was not rejected). Moreover, it should be noted that the horizontal scale in all forecast error distributions shows the deviation of the actual rating from the forecast on the 30-grade rating scale. This scale is much more detailed than the usual 22-grade scale of any international rating agency, which means that all results in this research become even more precise if the forecasts are transformed to the 22-grade scale.
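The rating forecast used in the out-of-sample check, in which grade probabilities are differences of standard normal distribution values and the most probable grade is selected, can be sketched with the textbook ordered probit formula. The linear index, the cutpoints and the grade labels below are hypothetical and are not the estimated ones.

```python
from math import erf, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def grade_probabilities(index, cutpoints):
    """Ordered probit probabilities for len(cutpoints) + 1 ordered grades.

    P(grade k) = Phi(c_k - index) - Phi(c_{k-1} - index),
    with c_0 = -infinity and c_K = +infinity.
    """
    cdf = [0.0] + [norm_cdf(c - index) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def predicted_grade(index, cutpoints, grades):
    """Grade with the highest predicted probability."""
    probs = grade_probabilities(index, cutpoints)
    best = max(range(len(probs)), key=probs.__getitem__)
    return grades[best]


# Hypothetical values: four grades separated by three cutpoints.
cutpoints = [-1.0, 0.0, 1.5]
grades = [15, 16, 17, 18]
probs = grade_probabilities(0.4, cutpoints)
print([round(p, 4) for p in probs])
print(predicted_grade(0.4, cutpoints, grades))
```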
The paper is aimed at comparing the divergence of existing models of credit risk and at creating a reliable synergic model. For this purpose, credit ratings and PD models were applied to the same dataset and their estimates were normalized to a common scale. After a thorough analysis of the distributions of their forecast errors, optimal weights and monotonic transformations were assigned to each model. As a result, the logarithmic synergic model with higher forecasting power (44% of grades predicted exactly on a 30-grade scale) was obtained.
It was found that there is a significant divergence in the predictions of credit ratings and PD models: credit ratings models tend to overestimate the probability of financial distress of a bank, whereas PD models give underestimated results, so the first hypothesis was not rejected. Indeed, the distribution of the ratings models’ forecast errors has a negative mode, while that of the PD models has a positive mode. Therefore, both models have a forecasting bias that decreases the number of correct forecasts.
The second hypothesis was not rejected either. Using the set of alternative models (ratings and PD) improved the forecasting power for banks’ credit risks. In sample, the logarithmic synergic model predicted 44% of grades exactly and 83% with a deviation within one grade. That is even higher than the biased modes of the separate distributions of the PD and rating models (33% and 36%). Moreover, out of sample it predicted 31% of grades exactly and more than 70% of forecasts with a deviation within one rating grade on a 30-grade rating scale.
The novelty of the paper is the process of derivation of a single scale rating and PD econometric models on a new comprehensive database. Moreover, optimal weights and monotonic transformation of ratings and PD models were derived for a logarithmic synergic model that increases forecasting power of banks’ credit risks. In further research, we are going to apply such techniques of derivation for a synergic model to all other credit risk measurements. Moreover, more sophisticated methods of balanced dataset formation (