Estimation of Transition Probabilities

Introduction

Credit ratings rank borrowers according to their credit worthiness. Though this ranking is, in itself, useful, institutions are also interested in knowing how likely it is that borrowers in a particular rating category will be upgraded or downgraded to a different rating, and especially, how likely it is that they will default.

Transition probabilities offer one way to characterize past changes in the credit quality of obligors (typically firms), and are key inputs to many risk management applications. Financial Toolbox™ software supports the estimation of transition probabilities with both the cohort and the duration (also known as hazard rate or intensity) approaches through transprob and related functions.

Note

The sample dataset used throughout this section is simulated using a single transition matrix. No attempt is made to match historical trends in transition rates.

Estimate Transition Probabilities

The Data_TransProb.mat file contains sample credit ratings data.

load Data_TransProb
data(1:10,:)

ans = 

        ID            Date         Rating
    __________    _____________    ______

    '00010283'    '10-Nov-1984'    'CCC' 
    '00010283'    '12-May-1986'    'B'   
    '00010283'    '29-Jun-1988'    'CCC' 
    '00010283'    '12-Dec-1991'    'D'   
    '00013326'    '09-Feb-1985'    'A'   
    '00013326'    '24-Feb-1994'    'AA'  
    '00013326'    '10-Nov-2000'    'BBB' 
    '00014413'    '23-Dec-1982'    'B'   
    '00014413'    '20-Apr-1988'    'BB'  
    '00014413'    '16-Jan-1998'    'B'

The sample data is formatted as a cell array with three columns. Each row contains an ID (column 1), a date (column 2), and a credit rating (column 3). The assigned credit rating corresponds to the associated ID on the associated date. All information corresponding to the same ID must be stored in contiguous rows. In this example, IDs, dates, and ratings are stored in character vector format, but you also can enter them in numeric format.

In this example, the simplest calling syntax for transprob passes the nRecords-by-3 cell array as the only input argument. The default startDate and endDate are the earliest and latest dates in the data. The default estimation algorithm is the duration method and one-year transition probabilities are estimated:

transMat0 = transprob(data)

transMat0 =

93.1170    5.8428    0.8232    0.1763    0.0376    0.0012    0.0001    0.0017
 1.6166   93.1518    4.3632    0.6602    0.1626    0.0055    0.0004    0.0396
 0.1237    2.9003   92.2197    4.0756    0.5365    0.0661    0.0028    0.0753
 0.0236    0.2312    5.0059   90.1846    3.7979    0.4733    0.0642    0.2193
 0.0216    0.1134    0.6357    5.7960   88.9866    3.4497    0.2919    0.7050
 0.0010    0.0062    0.1081    0.8697    7.3366   86.7215    2.5169    2.4399
 0.0002    0.0011    0.0120    0.2582    1.4294    4.2898   81.2927   12.7167
      0         0         0         0         0         0         0  100.0000

Always provide explicit start and end dates. Otherwise, the estimation windows for two different datasets can differ, and the estimates might not be comparable. From this point on, assume that the time window of interest is the five-year period from the end of 1995 to the end of 2000. For comparison, compute the estimates for this time window, first with the duration algorithm (the default) and then with the cohort algorithm set explicitly.

startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
transMat1 = transprob(data,'startDate',startDate,'endDate',endDate)
transMat2 = transprob(data,'startDate',startDate,'endDate',endDate,...
'algorithm','cohort')

transMat1 =

90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
 4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
 0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
 0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
 0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
 0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
 0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
      0         0         0         0         0         0         0  100.0000

transMat2 =

90.1554    8.5492    0.9067    0.3886         0         0         0         0
 4.9512   88.5221    5.1763    1.0503    0.2251         0         0    0.0750
 0.2770    6.6482   86.2188    6.0942    0.6233    0.0693         0    0.0693
 0.0794    0.8737   11.6759   81.6521    4.3685    0.7943    0.1589    0.3971
 0.1002    0.4008    1.9038   15.4309   77.8557    3.4068         0    0.9018
      0         0    0.2262    2.4887   17.4208   74.2081    2.2624    3.3937
      0         0    0.7576    1.5152    6.0606   10.6061   75.0000    6.0606
      0         0         0         0         0         0         0  100.0000
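The zeros in transMat2 that are small positive numbers in transMat1 reflect how the two estimators use the data. The following is a minimal sketch on a made-up three-state example. The states, counts, and exposure times are hypothetical, and the formulas are the textbook cohort and duration estimators (row-normalized counts, and the matrix exponential of an intensity matrix), assumed here for illustration rather than taken from the transprob implementation.

```matlab
% Hypothetical 3-state example: ratings A and B plus an absorbing default state D.
% All numbers below are made up for illustration.

% Cohort view: counts of (rating at start of year -> rating at end of year).
cohortCounts = [90 10   0;    % A->A, A->B, A->D (never observed directly)
                 5 80  15;    % B->A, B->B, B->D
                 0  0 100];   % D stays in D
cohortP = 100 * cohortCounts ./ sum(cohortCounts, 2);   % percent, row-normalized

% Duration view: years spent in each rating and counts of rating changes.
yearsInRating = [100; 100; 50];
changeCounts  = [0 10  0;
                 5  0 15;
                 0  0  0];
generator = changeCounts ./ yearsInRating;              % changes per year in rating
generator = generator - diag(sum(generator, 2));        % each row sums to zero
durationP = 100 * expm(generator);                      % one-year matrix, percent

fprintf('Cohort   P(A->D) = %.4f%%\n', cohortP(1,3));
fprintf('Duration P(A->D) = %.4f%%\n', durationP(1,3));
```

In this sketch the cohort estimate of A to D is exactly zero because no such pair was counted, while the duration estimate is positive because the path A to B to D can occur within a year.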

By default, the cohort algorithm internally gets yearly snapshots of the credit ratings, but the number of snapshots per year is definable using the parameter/value pair snapsPerYear. To get the estimates using quarterly snapshots:

transMat3 = transprob(data,'startDate',startDate,'endDate',endDate,...
'algorithm','cohort','snapsPerYear',4)

transMat3 =

90.4765    8.0881    1.0072    0.4069    0.0164    0.0015    0.0002    0.0032
 4.5949   89.3216    4.6489    1.1239    0.2276    0.0074    0.0007    0.0751
 0.3747    6.3158   86.7380    5.6344    0.7675    0.0856    0.0040    0.0800
 0.0958    0.7967   11.0441   82.6138    4.1906    0.7230    0.1372    0.3987
 0.1028    0.3571    2.3312   14.4954   78.4276    3.1489    0.0383    1.0987
 0.0084    0.0399    0.6465    3.0962   16.0789   75.1300    1.9044    3.0956
 0.0031    0.0125    0.1445    1.8759    6.2613   10.7022   75.6300    5.3705
      0         0         0         0         0         0         0  100.0000

Both duration and cohort compute one-year transition probabilities by default, but the time interval for the transitions is definable using the parameter/value pair transInterval. For example, to get the two-year transition probabilities using the cohort algorithm with the same snapshot periodicity and estimation window:

transMat4 = transprob(data,'startDate',startDate,'endDate',endDate,...
'algorithm','cohort','snapsPerYear',4,'transInterval',2)

transMat4 =

82.2358   14.6092    2.2062    0.8543    0.0711    0.0074    0.0011    0.0149
 8.2803   80.4584    8.3606    2.2462    0.4665    0.0316    0.0030    0.1533
 0.9604   11.1975   76.1729    9.7284    1.5322    0.2044    0.0162    0.1879
 0.2483    2.0903   18.8440   69.5145    6.9601    1.2966    0.2329    0.8133
 0.2129    0.8713    5.4893   23.5776   62.6438    4.9464    0.1390    2.1198
 0.0378    0.1895    1.7679    7.2875   24.9444   57.1783    2.8816    5.7132
 0.0154    0.0716    0.6576    4.2157   11.4465   16.3455   57.4078    9.8399
      0         0         0         0         0         0         0  100.0000
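The two-year matrix can be cross-checked against the one-year matrix estimated with the same settings. The sketch below assumes that, for the cohort algorithm, a two-year transition interval corresponds to squaring the one-year matrix. This is an assumption about the method, checked here only against the printed values. The matrix oneYear is transMat3 re-entered by hand from the output above, so agreement is limited by the four printed decimals.

```matlab
% One-year cohort matrix with quarterly snapshots (transMat3), in percent,
% copied from the printed output.
oneYear = [
90.4765  8.0881  1.0072  0.4069  0.0164  0.0015  0.0002   0.0032
 4.5949 89.3216  4.6489  1.1239  0.2276  0.0074  0.0007   0.0751
 0.3747  6.3158 86.7380  5.6344  0.7675  0.0856  0.0040   0.0800
 0.0958  0.7967 11.0441 82.6138  4.1906  0.7230  0.1372   0.3987
 0.1028  0.3571  2.3312 14.4954 78.4276  3.1489  0.0383   1.0987
 0.0084  0.0399  0.6465  3.0962 16.0789 75.1300  1.9044   3.0956
 0.0031  0.0125  0.1445  1.8759  6.2613 10.7022 75.6300   5.3705
 0       0       0       0       0       0       0      100.0000];

% Square the matrix on the probability scale, then convert back to percent.
twoYear = 100 * (oneYear / 100)^2;

% Entries (1,1), (1,2), and (7,8) as printed for transMat4.
printed = [82.2358 14.6092 9.8399];
rebuilt = [twoYear(1,1) twoYear(1,2) twoYear(7,8)];
disp([printed; rebuilt])
```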

Estimate Transition Probabilities for Different Rating Scales

The dataset data from Data_TransProb.mat contains sample credit ratings using the default rating scale {'AAA', 'AA','A', 'BBB', 'BB', 'B', 'CCC', 'D'}. It also contains the dataset dataIGSG with ratings investment grade ('IG'), speculative grade ('SG'), and default ('D'). To estimate the transition matrix for this dataset, use the labels argument.

load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
dataIGSG(1:10,:)
transMatIGSG = transprob(dataIGSG,'labels',{'IG','SG','D'},...
'startDate',startDate,'endDate',endDate)

ans = 

    '00011253'    '04-Apr-1983'    'IG'
    '00012751'    '17-Feb-1985'    'SG'
    '00012751'    '19-May-1986'    'D' 
    '00014690'    '17-Jan-1983'    'IG'
    '00012144'    '21-Nov-1984'    'IG'
    '00012144'    '25-Mar-1992'    'SG'
    '00012144'    '07-May-1994'    'IG'
    '00012144'    '23-Jan-2000'    'SG'
    '00012144'    '20-Aug-2001'    'IG'
    '00012937'    '07-Feb-1984'    'IG'

transMatIGSG =

   98.1986    1.5179    0.2835
    8.5396   89.4891    1.9713
         0         0  100.0000

There is another dataset, dataIGSGnum, with the same information as dataIGSG, except the ratings are mapped to a numeric scale where 'IG'=1, 'SG'=2, and 'D'=3. To estimate the transition matrix, use the labels optional argument specifying the numeric scale as a cell array.

dataIGSGnum(1:10,:)
% Note {1,2,3} and num2cell(1:3) are equivalent; num2cell is convenient
% when the number of ratings is larger
transMatIGSGnum = transprob(dataIGSGnum,'labels',{1,2,3},...
'startDate',startDate,'endDate',endDate)

ans = 

    '00011253'    '04-Apr-1983'    [1]
    '00012751'    '17-Feb-1985'    [2]
    '00012751'    '19-May-1986'    [3]
    '00014690'    '17-Jan-1983'    [1]
    '00012144'    '21-Nov-1984'    [1]
    '00012144'    '25-Mar-1992'    [2]
    '00012144'    '07-May-1994'    [1]
    '00012144'    '23-Jan-2000'    [2]
    '00012144'    '20-Aug-2001'    [1]
    '00012937'    '07-Feb-1984'    [1]

transMatIGSGnum =

   98.1986    1.5179    0.2835
    8.5396   89.4891    1.9713
         0         0  100.0000

Any time the input dataset contains ratings not included in the default rating scale {'AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D'}, the full rating scale must be specified using the labels optional argument. For example, if the dataset contains ratings 'AAA', ..., 'CCC', 'D', and 'NR' (not rated), use labels with this cell array {'AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D', 'NR'}.

Working with a Transition Matrix Containing 'NR' Rating

This example demonstrates how transprob handles 'NR' (not rated) ratings, and how to get a transition matrix that uses the 'NR' rating information for the estimation but does not show the 'NR' rating in the final transition probabilities.

The dataset data from Data_TransProb.mat contains sample credit ratings using the default rating scale {'AAA', 'AA','A', 'BBB', 'BB', 'B', 'CCC', 'D'}.

load Data_TransProb
head(data,12)

ans =

  12×3 table

        ID            Date         Rating
    __________    _____________    ______

    '00010283'    '10-Nov-1984'    'CCC' 
    '00010283'    '12-May-1986'    'B'   
    '00010283'    '29-Jun-1988'    'CCC' 
    '00010283'    '12-Dec-1991'    'D'   
    '00013326'    '09-Feb-1985'    'A'   
    '00013326'    '24-Feb-1994'    'AA'  
    '00013326'    '10-Nov-2000'    'BBB' 
    '00014413'    '23-Dec-1982'    'B'   
    '00014413'    '20-Apr-1988'    'BB'  
    '00014413'    '16-Jan-1998'    'B'   
    '00014413'    '25-Nov-1999'    'BB'  
    '00012126'    '17-Feb-1985'    'CCC'

Replace the transition to 'B' for the first company and the transition to 'BBB' for the second company with transitions to 'NR'. Note that, for the first company, there is a subsequent transition from 'NR' to 'CCC'.

dataNR = data;
dataNR.Rating{2} = 'NR';
dataNR.Rating{7} = 'NR';

head(dataNR,12)

ans =

  12×3 table

        ID            Date         Rating
    __________    _____________    ______

    '00010283'    '10-Nov-1984'    'CCC' 
    '00010283'    '12-May-1986'    'NR'  
    '00010283'    '29-Jun-1988'    'CCC' 
    '00010283'    '12-Dec-1991'    'D'   
    '00013326'    '09-Feb-1985'    'A'   
    '00013326'    '24-Feb-1994'    'AA'  
    '00013326'    '10-Nov-2000'    'NR'  
    '00014413'    '23-Dec-1982'    'B'   
    '00014413'    '20-Apr-1988'    'BB'  
    '00014413'    '16-Jan-1998'    'B'   
    '00014413'    '25-Nov-1999'    'BB'  
    '00012126'    '17-Feb-1985'    'CCC'

When 'NR' is included in the labels, transprob treats it as another rating, and the transition matrix shows the estimated probabilities of transitioning into and out of 'NR'. In this example, the transprob function uses the 'cohort' algorithm. The same behavior applies when using the transprob function with the 'duration' algorithm.

RatingsLabelsNR = {'AAA','AA','A','BBB','BB','B','CCC','D','NR'};
[MatrixNRCohort,TotalsNRCohort] = transprob(dataNR,...
   'Labels',RatingsLabelsNR,...
   'Algorithm','cohort');

fprintf('Transition probability, cohort, including NR:\n')
disp(array2table(MatrixNRCohort,'VariableNames',RatingsLabelsNR,...
   'RowNames',RatingsLabelsNR))

fprintf('Total transitions out of given rating, including 6 out of NR (5 NR->NR, 1 NR->CCC):\n')
disp(array2table(TotalsNRCohort.totalsVec,'VariableNames',RatingsLabelsNR))

Transition probability, cohort, including NR:
             AAA         AA          A          BBB         BB          B          CCC          D           NR   
           ________    _______    ________    _______    ________    ________    ________    ________    ________

    AAA      93.135     5.9335     0.74557    0.15533    0.031066           0           0           0           0
    AA       1.7359      92.92      4.5446    0.58514     0.15604           0           0    0.039009    0.019505
    A       0.12683     2.9716      91.991     4.3124      0.4711    0.054358           0    0.072477           0
    BBB    0.021048    0.37887      5.0726     89.771      4.0413     0.46306    0.042096     0.21048           0
    BB     0.022099     0.1105     0.68508      6.232      88.376      3.6464     0.28729     0.64088           0
    B             0          0    0.076161    0.72353       7.997      86.215      2.7037      2.2848           0
    CCC           0          0           0    0.30936      1.8561      4.4857      80.897      12.374     0.07734
    D             0          0           0          0           0           0           0         100           0
    NR            0          0           0          0           0           0      16.667           0      83.333

Total transitions out of given rating, including 6 out of NR (5 NR->NR, 1 NR->CCC):
    AAA      AA      A      BBB      BB      B      CCC      D      NR
    ____    ____    ____    ____    ____    ____    ____    ____    __

    3219    5127    5519    4751    4525    2626    1293    4050    6
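The 'NR' entries of the matrix can be reproduced from the totals with simple arithmetic. The sketch below assumes the usual cohort formula (observed transitions divided by the total for the starting rating) and uses one 'AA' to 'NR' and one 'CCC' to 'NR' observation, which is what replacing two rows of the data implies. The totals are copied from the output above.

```matlab
% Totals out of each rating, copied from the printed output.
totalAA  = 5127;
totalCCC = 1293;
totalNR  = 6;

% Cohort probability in percent: observed transitions / total for the start rating.
pAAtoNR  = 100 * 1 / totalAA;    % one AA  -> NR observation
pCCCtoNR = 100 * 1 / totalCCC;   % one CCC -> NR observation
pNRtoNR  = 100 * 5 / totalNR;    % five NR -> NR observations
pNRtoCCC = 100 * 1 / totalNR;    % one  NR -> CCC observation

fprintf('AA  -> NR : %.6f\n', pAAtoNR);
fprintf('CCC -> NR : %.6f\n', pCCCtoNR);
fprintf('NR  -> NR : %.3f\n', pNRtoNR);
fprintf('NR  -> CCC: %.3f\n', pNRtoCCC);
```

These values agree with the 'NR' column and the 'NR' row of the printed matrix (0.019505, 0.07734, 83.333, and 16.667) to the displayed precision.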

To remove transitions involving 'NR' from the transition matrix, use the 'excludeLabels' optional name-value input argument to transprob.

The 'labels' input to transprob may or may not include the label that needs to be excluded. In the following example, the NR rating is removed from the labels for display purposes, but passing RatingsLabelsNR to transprob would also work.

RatingsLabels = {'AAA','AA','A','BBB','BB','B','CCC','D'};

[MatrixCohort,TotalsCohort] = transprob(dataNR,'Labels',RatingsLabels,'ExcludeLabels','NR','Algorithm','cohort');

fprintf('Transition probability, cohort, after postprocessing to remove NR:\n')

Transition probability, cohort, after postprocessing to remove NR:

disp(array2table(MatrixCohort,'VariableNames',RatingsLabels,...
   'RowNames',RatingsLabels))

             AAA         AA          A          BBB         BB          B          CCC          D    
           ________    _______    ________    _______    ________    ________    ________    ________

    AAA      93.135     5.9335     0.74557    0.15533    0.031066           0           0           0
    AA       1.7362     92.938      4.5455    0.58525     0.15607           0           0    0.039017
    A       0.12683     2.9716      91.991     4.3124      0.4711    0.054358           0    0.072477
    BBB    0.021048    0.37887      5.0726     89.771      4.0413     0.46306    0.042096     0.21048
    BB     0.022099     0.1105     0.68508      6.232      88.376      3.6464     0.28729     0.64088
    B             0          0    0.076161    0.72353       7.997      86.215      2.7037      2.2848
    CCC           0          0           0     0.3096      1.8576      4.4892       80.96      12.384
    D             0          0           0          0           0           0           0         100

fprintf('Total transitions out of given rating, AA and CCC have one less than before:\n')

Total transitions out of given rating, AA and CCC have one less than before

disp(array2table(TotalsCohort.totalsVec,'VariableNames',RatingsLabels))

    AAA      AA      A      BBB      BB      B      CCC      D  
    ____    ____    ____    ____    ____    ____    ____    ____

    3219    5126    5519    4751    4525    2626    1292    4050

All transitions involving 'NR' are removed from the sample, but all other transitions are still used to estimate the transition probabilities. In this example, the transition from 'NR' to 'CCC' has been removed, as well as the transitions from 'CCC' to 'NR' and from 'AA' to 'NR' (and five more transitions from 'NR' to 'NR'). The first company therefore still contributes transitions from 'CCC' to 'CCC' to the estimation; only the periods overlapping with the time this company spent in 'NR' have been removed from the sample. The same holds for the second company.

This procedure is different from removing the 'NR' rows from the data itself.

For example, if you remove the 'NR' rows in this example, the first company seems to stay in its initial rating of 'CCC' all the way from the initial date in 1984 to the default event in 1991. With the previous approach, the estimation knows that the company transitioned out of 'CCC' at some point and did not stay at 'CCC' the whole time.

If the 'NR' row is removed for the second company, this company seems to have stayed in the sample as an 'AA' company until the end of the sample. With the previous approach, the estimation knows that this company stopped being an 'AA' earlier.

dataNR2 = dataNR;
dataNR2([2 7],:) = [];

head(dataNR2,12)

ans =

  12×3 table

        ID            Date         Rating
    __________    _____________    ______

    '00010283'    '10-Nov-1984'    'CCC' 
    '00010283'    '29-Jun-1988'    'CCC' 
    '00010283'    '12-Dec-1991'    'D'   
    '00013326'    '09-Feb-1985'    'A'   
    '00013326'    '24-Feb-1994'    'AA'  
    '00014413'    '23-Dec-1982'    'B'   
    '00014413'    '20-Apr-1988'    'BB'  
    '00014413'    '16-Jan-1998'    'B'   
    '00014413'    '25-Nov-1999'    'BB'  
    '00012126'    '17-Feb-1985'    'CCC' 
    '00012126'    '08-Mar-1989'    'D'   
    '00011692'    '11-May-1984'    'BB'

If the 'NR' rows are removed, the transition matrices will be different. The probability of staying at 'CCC' goes slightly up, and so does the probability of staying at 'AA'.

[MatrixCohort2,TotalsCohort2] = transprob(dataNR2,...
   'Labels',RatingsLabels,...
   'Algorithm','cohort');

fprintf('Transition probability, cohort, if NR rows are removed from data:\n')
disp(array2table(MatrixCohort2,'VariableNames',RatingsLabels,...
   'RowNames',RatingsLabels))

fprintf('Total transitions out of given rating, many more out of CCC and AA:\n')
disp(array2table(TotalsCohort2.totalsVec,'VariableNames',RatingsLabels))

Transition probability, cohort, if NR rows are removed from data:

             AAA         AA          A          BBB         BB          B          CCC          D    
           ________    _______    ________    _______    ________    ________    ________    ________

    AAA      93.135     5.9335     0.74557    0.15533    0.031066           0           0           0
    AA       1.7346     92.945       4.541    0.58468     0.15592           0           0    0.038979
    A       0.12683     2.9716      91.991     4.3124      0.4711    0.054358           0    0.072477
    BBB    0.021048    0.37887      5.0726     89.771      4.0413     0.46306    0.042096     0.21048
    BB     0.022099     0.1105     0.68508      6.232      88.376      3.6464     0.28729     0.64088
    B             0          0    0.076161    0.72353       7.997      86.215      2.7037      2.2848
    CCC           0          0           0    0.30888      1.8533      4.4788      81.004      12.355
    D             0          0           0          0           0           0           0         100

Total transitions out of given rating, many more out of CCC and AA:

    AAA      AA      A      BBB      BB      B      CCC      D  
    ____    ____    ____    ____    ____    ____    ____    ____

    3219    5131    5519    4751    4525    2626    1295    4050
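The effect of the three treatments of 'NR' can be made concrete by converting the printed probabilities of staying in 'AA' and 'CCC' back into counts. This is a rough sketch: the counts are inferred by rounding the displayed percentages times the displayed totals, so they are reconstructions rather than values returned by transprob.

```matlab
% Columns: NR kept as a rating, NR excluded with 'ExcludeLabels', NR rows deleted.
% Totals and stay probabilities (percent) are copied from the printed outputs.
totalsAA  = [5127   5126   5131];
stayAA    = [92.92  92.938 92.945];
totalsCCC = [1293   1292   1295];
stayCCC   = [80.897 80.96  81.004];

% Implied number of observations that stayed in the same rating.
nStayAA  = round(stayAA  .* totalsAA  / 100);
nStayCCC = round(stayCCC .* totalsCCC / 100);

disp([nStayAA; nStayCCC])
```

Under this reconstruction, excluding 'NR' leaves the stay counts unchanged (4764 for 'AA' and 1046 for 'CCC') and only shrinks the totals, whereas deleting the 'NR' rows adds stay observations (4769 and 1049) that were never actually recorded.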

Estimate Point-in-Time and Through-the-Cycle Probabilities

Transition probability estimates are sensitive to the length of the estimation window. When the estimation window is small, the estimates only capture recent credit events, and these can change significantly from one year to the next. These are called point-in-time (PIT) estimates. In contrast, a large time window yields fairly stable estimates that average transition rates over a longer period of time. These are called through-the-cycle (TTC) estimates.

The estimation of PIT probabilities requires repeated calls to transprob with a rolling estimation window. Use transprobprep every time repeated calls to transprob are required. transprobprep performs a preprocessing step on the raw dataset that is independent of the estimation window. The benefits of transprobprep are greater as the number of repeated calls to transprob increases. Also, the performance gains from transprobprep are more significant for the cohort algorithm.

load Data_TransProb
prepData = transprobprep(data);

Years = 1991:2000;
nYears = length(Years);
nRatings = length(prepData.ratingsLabels);
transMatPIT = zeros(nRatings,nRatings,nYears);
algorithm = 'duration';
sampleTotals(nYears,1) = struct('totalsVec',[],'totalsMat',[],...
'algorithm',algorithm);
for t = 1:nYears
   startDate = ['31-Dec-' num2str(Years(t)-1)];
   endDate = ['31-Dec-' num2str(Years(t))];
   [transMatPIT(:,:,t),sampleTotals(t)] = transprob(prepData,...
    'startDate',startDate,'endDate',endDate,'algorithm',algorithm);
end

Here is the PIT transition matrix for 1993. Recall that the sample dataset contains simulated credit migrations so the PIT estimates in this example do not match actual historical transition rates.

transMatPIT(:,:,Years==1993)

ans =

   95.3193    4.5999    0.0802    0.0004    0.0002    0.0000    0.0000    0.0000
    2.0631   94.5931    3.3057    0.0254    0.0126    0.0002    0.0000    0.0000
    0.0237    2.1748   95.5901    1.4700    0.7284    0.0131    0.0000    0.0000
    0.0003    0.0372    3.2585   95.2914    1.3876    0.0250    0.0001    0.0000
    0.0000    0.0005    0.0657    3.8292   92.7474    3.3459    0.0111    0.0001
    0.0000    0.0001    0.0128    0.7977    8.0926   90.4897    0.5958    0.0113
    0.0000    0.0000    0.0005    0.0459    0.5026   11.1621   84.9315    3.3574
         0         0         0         0         0         0         0  100.0000

A structure array stores the sampleTotals optional output from transprob. The sampleTotals structure contains summary information on the total time spent on each rating, and the number of transitions out of each rating, for each year under consideration. For more information on the sampleTotals structure, see transprob.

As an example, the sampleTotals structure for 1993 is used here. The total time spent on each rating is stored in the totalsVec field of the structure. The total transitions out of each rating are stored in the totalsMat field. A third field, algorithm, indicates the algorithm used to generate the structure.

sampleTotals(Years==1993).totalsVec
sampleTotals(Years==1993).totalsMat
sampleTotals(Years==1993).algorithm

ans =

  144.4411  230.0356  262.2438  204.9671  246.1315  147.0767   54.9562  215.1479


ans =

     0     7     0     0     0     0     0     0
     5     0     8     0     0     0     0     0
     0     6     0     4     2     0     0     0
     0     0     7     0     3     0     0     0
     0     0     0    10     0     9     0     0
     0     0     0     1    13     0     1     0
     0     0     0     0     0     7     0     2
     0     0     0     0     0     0     0     0


ans =

duration
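The 1993 totals above are enough to rebuild the 1993 PIT matrix by hand. The sketch below assumes the standard duration estimator: each off-diagonal intensity is the number of transitions divided by the time spent in the starting rating, and the one-year matrix is the matrix exponential of the resulting generator. That this matches the transprob computation is an assumption, checked here only against the printed values.

```matlab
% Totals for 1993, copied from the printed output.
totalsVec = [144.4411 230.0356 262.2438 204.9671 246.1315 147.0767 54.9562 215.1479];
totalsMat = [
 0  7  0  0  0  0  0  0
 5  0  8  0  0  0  0  0
 0  6  0  4  2  0  0  0
 0  0  7  0  3  0  0  0
 0  0  0 10  0  9  0  0
 0  0  0  1 13  0  1  0
 0  0  0  0  0  7  0  2
 0  0  0  0  0  0  0  0];

% Intensities: transitions per year spent in the starting rating.
generator = totalsMat ./ totalsVec(:);
% Diagonal entries make every row of the generator sum to zero.
generator = generator - diag(sum(generator, 2));

% One-year transition probabilities in percent.
pit1993 = 100 * expm(generator);
disp(pit1993)
```

The first entry evaluates to about 95.3193, in line with the printed matrix for 1993.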

To get the TTC transition matrix, pass the sampleTotals structure array to transprobbytotals. Internally, transprobbytotals aggregates the information in the sampleTotals structures to get the total time spent on each rating over the 10 years considered in this example, and the total number of transitions out of each rating during the same period. transprobbytotals uses the aggregated information to get the TTC matrix, or average one-year transition matrix.

transMatTTC = transprobbytotals(sampleTotals)

transMatTTC =

   92.8544    6.1068    0.7463    0.2761    0.0123    0.0009    0.0001    0.0032
    2.9399   92.2329    3.8394    0.7349    0.1676    0.0050    0.0004    0.0799
    0.2410    4.5963   90.3468    3.9572    0.6909    0.0521    0.0025    0.1133
    0.0530    0.4729    7.9221   87.2751    3.5075    0.4650    0.0791    0.2254
    0.0460    0.1636    1.1873    9.3442   85.4305    2.9520    0.1150    0.7615
    0.0031    0.0152    0.2608    1.5563   10.4468   83.8525    1.9771    1.8882
    0.0009    0.0041    0.0542    0.8378    2.9996    7.3614   82.4758    6.2662
         0         0         0         0         0         0         0  100.0000
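The aggregation step can be illustrated on a small hypothetical example. The sketch below pools made-up yearly totals for two ratings and a default state, then applies the same duration formula (intensity matrix followed by a matrix exponential) to the pooled totals. The struct array yearTotals and the helper toGenerator are illustrative names, not toolbox objects, and the estimator is assumed to be the textbook one.

```matlab
% Hypothetical yearly totals for ratings A, B and default D (made-up numbers).
yearTotals(1).totalsVec = [100 50 10];
yearTotals(1).totalsMat = [0 4 0; 2 0 1; 0 0 0];
yearTotals(2).totalsVec = [ 90 60 11];
yearTotals(2).totalsMat = [0 9 1; 1 0 3; 0 0 0];

% Pool time spent and transition counts across the years.
pooledVec = zeros(1, 3);
pooledMat = zeros(3);
for k = 1:numel(yearTotals)
    pooledVec = pooledVec + yearTotals(k).totalsVec;
    pooledMat = pooledMat + yearTotals(k).totalsMat;
end

% Helper: build a generator (rows sum to zero) from totals.
toGenerator = @(vec, mat) (mat ./ vec(:)) - diag(sum(mat ./ vec(:), 2));

% TTC-style matrix from pooled totals, and PIT-style matrices per year.
ttc  = 100 * expm(toGenerator(pooledVec, pooledMat));
pit1 = 100 * expm(toGenerator(yearTotals(1).totalsVec, yearTotals(1).totalsMat));
pit2 = 100 * expm(toGenerator(yearTotals(2).totalsVec, yearTotals(2).totalsMat));

disp(ttc)
disp((pit1 + pit2) / 2)   % simple average of yearly matrices, for comparison
```

Pooling weights each year by its exposure, so the pooled matrix generally differs from a plain average of the yearly matrices.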

The same TTC matrix could be obtained with a direct call to transprob, setting the estimation window to the 10 years under consideration. But it is much more efficient to use the sampleTotals structures, whenever they are available. (Note, for the duration algorithm, these alternative workflows can result in small numerical differences in the estimates whenever leap years are part of the sample.)

In Estimate Transition Probabilities, a 1-year transition matrix is estimated using the 5-year time window from 1996 through 2000. This is another example of a TTC matrix and this can also be computed using the sampleTotals structure array.

transprobbytotals(sampleTotals(Years>=1996&Years<=2000))

ans =

   90.6239    7.9048    1.0313    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4776   89.5565    4.5294    1.1224    0.2283    0.0094    0.0009    0.0754
    0.3982    6.1159   87.0651    5.4797    0.7636    0.0892    0.0050    0.0832
    0.1029    0.8571   10.7909   83.0218    3.9968    0.7001    0.1313    0.3991
    0.1043    0.3744    2.2960   14.0947   78.9851    3.0012    0.0463    1.0980
    0.0113    0.0544    0.7054    3.2922   15.4341   75.6004    1.8165    3.0858
    0.0044    0.0189    0.1903    1.9742    6.2318   10.2332   75.9990    5.3482
         0         0         0         0         0         0         0  100.0000

Estimate t-Year Default Probabilities

By varying the start and end dates, the amount of data considered for the estimation is changed, but the output still contains, by default, one-year transition probabilities. You can change the default behavior by specifying the transInterval argument, as illustrated in Estimate Transition Probabilities.

However, when t-year transition probabilities are required for a whole range of values of t, for example, 1-year, 2-year, 3-year, 4-year, and 5-year transition probabilities, it is more efficient to call transprob once to get the optional output sampleTotals. You can use the same sampleTotals structure to get the t-year transition matrix for any transition interval t. Given a sampleTotals structure and a transition interval, you can get the corresponding transition matrix by using transprobbytotals.

load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';

[~,sampleTotals] = transprob(data,'startDate', ...
startDate, 'endDate',endDate);

DefProb = zeros(7,5);
for t = 1:5
   transMatTemp = transprobbytotals(sampleTotals,'transInterval',t);
   DefProb(:,t) = transMatTemp(1:7,8);
end
DefProb

DefProb =

    0.0043    0.0169    0.0377    0.0666    0.1033
    0.0754    0.1542    0.2377    0.3265    0.4213
    0.0832    0.1936    0.3276    0.4819    0.6536
    0.3992    0.8127    1.2336    1.6566    2.0779
    1.0980    2.1189    3.0668    3.9468    4.7644
    3.0860    5.6994    7.9281    9.8418   11.4963
    5.3484    9.8053   13.5320   16.6599   19.2964
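As a rough cross-check of the table above, the t-year default probabilities can be approximated from the printed one-year duration matrix alone. The sketch assumes that, for the duration algorithm, the t-year matrix equals the t-th power of the one-year matrix (which follows if both come from the same generator). The matrix is transMat1 re-entered by hand from Estimate Transition Probabilities, so the result is only as accurate as the four printed decimals.

```matlab
% One-year duration matrix for 1996-2000 (transMat1), in percent, copied by hand.
oneYear = [
90.6236  7.9051  1.0314  0.4123  0.0210  0.0020  0.0003   0.0043
 4.4780 89.5558  4.5298  1.1225  0.2284  0.0094  0.0009   0.0754
 0.3983  6.1164 87.0641  5.4801  0.7637  0.0892  0.0050   0.0832
 0.1029  0.8572 10.7918 83.0204  3.9971  0.7001  0.1313   0.3992
 0.1043  0.3745  2.2962 14.0954 78.9840  3.0013  0.0463   1.0980
 0.0113  0.0544  0.7055  3.2925 15.4350 75.5988  1.8166   3.0860
 0.0044  0.0189  0.1903  1.9743  6.2320 10.2334 75.9983   5.3484
 0       0       0       0       0       0       0      100.0000];

P = oneYear / 100;          % probability scale
Pt = eye(8);                % running power P^t
defProbApprox = zeros(7, 5);
for t = 1:5
    Pt = Pt * P;
    defProbApprox(:, t) = 100 * Pt(1:7, 8);   % default column, non-default rows
end
disp(defProbApprox)
```

Because default is absorbing, each row of the result is non-decreasing in t, which is the pattern seen in DefProb.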

Estimate Bootstrap Confidence Intervals

transprob also returns the idTotals structure array, which contains, for each ID (or company), the total time spent on each rating and the total transitions out of each rating. For more information on the idTotals structure, see transprob. The idTotals structure is similar to the sampleTotals structures (see Estimate Point-in-Time and Through-the-Cycle Probabilities), but idTotals has the information at an ID level. Because most companies migrate among only a few ratings, the numeric arrays in idTotals are stored as sparse arrays to reduce memory requirements.

You can use the idTotals structure array to estimate confidence intervals for the transition probabilities using a bootstrapping procedure, as the following example demonstrates. To do this, call transprob and keep the third output argument, idTotals. The idTotals fields are displayed for the last company in the sample. Within the estimation window, this company spends almost a year as 'AA' and it is then upgraded to 'AAA'.

load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';

[transMat,~,idTotals] = transprob(data,...
   'startDate',startDate,'endDate',endDate);

% Total time spent on each rating
full(idTotals(end).totalsVec)
% Total transitions out of each rating
full(idTotals(end).totalsMat)
% Algorithm
idTotals(end).algorithm

ans =

    4.0820    0.9180         0         0         0         0         0         0


ans =

     0     0     0     0     0     0     0     0
     1     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0


ans =

duration

Next, use bootstrp from Statistics and Machine Learning Toolbox™ with transprobbytotals as the bootstrap function and idTotals as the data to sample from. Each bootstrap sample corresponds to a dataset made of companies sampled with replacement from the original data. However, you do not have to draw companies from the original data, because a bootstrap idTotals sample contains all the information required to compute the transition probabilities. transprobbytotals aggregates all structures in each bootstrap idTotals sample and finds the corresponding transition matrix.

To estimate 95% confidence intervals for the transition matrix and display the probabilities of default together with their upper and lower confidence bounds:

PD = transMat(1:7,8);

bootstat = bootstrp(100,@(totals)transprobbytotals(totals),idTotals);
ci = prctile(bootstat,[2.5 97.5]); % 95% confidence
CIlower = reshape(ci(1,:),8,8);
CIupper = reshape(ci(2,:),8,8);
PD_LB = CIlower(1:7,8);
PD_UB = CIupper(1:7,8);

[PD_LB PD PD_UB]

ans =

    0.0004    0.0043    0.0106
    0.0028    0.0754    0.2192
    0.0126    0.0832    0.2180
    0.1659    0.3992    0.6617
    0.5703    1.0980    1.7260
    1.7264    3.0860    4.7602
    1.7678    5.3484    9.5055
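The resampling idea can be sketched in base MATLAB on a simplified, hypothetical dataset with a single non-default state. The per-company data, the sample sizes, and the estimatePD helper are all made up for illustration, and the interval uses plain order statistics rather than the interpolation performed by prctile, so this is not a substitute for the workflow above.

```matlab
rng(0);                                        % fixed seed for repeatability
nCompanies = 200;

% Hypothetical per-company totals: years observed and whether a default occurred.
yearsObserved = 1 + 4 * rand(nCompanies, 1);
defaulted     = rand(nCompanies, 1) < 0.05;

% One-year PD (percent) from pooled totals: intensity = defaults / years,
% and PD = 1 - exp(-intensity) for a single state with absorbing default.
estimatePD = @(idx) 100 * (1 - exp(-sum(defaulted(idx)) / sum(yearsObserved(idx))));

pointPD = estimatePD((1:nCompanies)');

nBoot  = 1000;
bootPD = zeros(nBoot, 1);
for b = 1:nBoot
    idx = randi(nCompanies, nCompanies, 1);    % sample companies with replacement
    bootPD(b) = estimatePD(idx);               % re-estimate from the resampled totals
end

% Simple percentile interval from the sorted bootstrap estimates.
sortedPD = sort(bootPD);
ciLower  = sortedPD(round(0.025 * nBoot));
ciUpper  = sortedPD(round(0.975 * nBoot));
fprintf('PD = %.4f%%, 95%% interval [%.4f%%, %.4f%%]\n', pointPD, ciLower, ciUpper);
```

As in the toolbox workflow, each bootstrap draw resamples whole companies, so a company's time and transitions always stay together.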

Group Credit Ratings

Credit rating scales can be more or less granular. For example, there are ratings with qualifiers (such as, 'AA+', 'BB-', and so on), whole ratings ('AA', 'BB', and so on), and investment or speculative grade ('IG', 'SG') categories. Given a dataset with credit ratings at a more granular level, transition probabilities for less granular categories can be of interest. For example, you might be interested in a transition matrix for investment and speculative grades given a dataset with whole ratings. Use transprobgrouptotals for this evaluation, as illustrated in the following examples. The sample dataset data has whole credit ratings:

load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
data(1:5,:)

ans = 

    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '12-May-1986'    'B'  
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'  
    '00013326'    '09-Feb-1985'    'A'

A call to transprob returns the transition matrix and totals structures for the eight ('AAA' to 'D') whole credit ratings. The array with number of transitions out of each credit rating is displayed after the call to transprob:

[transMat,sampleTotals,idTotals] = transprob(data,'startDate',startDate,...
'endDate',endDate);
sampleTotals.totalsMat

ans =

     0    67     7     3     0     0     0     0
    67     0    68    15     3     0     0     1
     4   101     0    93    11     1     0     1
     1     7   163     0    62    10     2     5
     1     3    16   168     0    37     0    11
     0     0     2    10    83     0    10    14
     0     0     0     2     8    16     0     7
     0     0     0     0     0     0     0     0

Next, use transprobgrouptotals to group whole ratings into investment and speculative grades. This function takes a totals structure as the first argument. The second argument indicates the edges between rating categories. In this case, ratings 1 through 4 ('AAA' through 'BBB') correspond to the first category ('IG'), ratings 5 through 7 ('BB' through 'CCC') to the second category ('SG'), and rating 8 ('D') is a category of its own. transprobgrouptotals adds up the total time spent on ratings that belong to the same category. For example, total times spent on 'AAA' through 'BBB' are added up as the total time spent on 'IG'. transprobgrouptotals also adds up the total number of transitions between any 'IG' rating and any 'SG' rating, for example, a credit migration from 'BBB' to 'BB'.

The grouped totals can then be passed to transprobbytotals to obtain the transition matrix for investment and speculative grades. The grouped totalsMat and the new transition matrix are both 3-by-3, corresponding to the grouped categories 'IG', 'SG', and 'D'.

sampleTotalsIGSG = transprobgrouptotals(sampleTotals,[4 7 8])
transMatIGSG = transprobbytotals(sampleTotalsIGSG)

sampleTotalsIGSG = 

    totalsVec: [4.8591e+003 1.5034e+003 1.1621e+003]
    totalsMat: [3x3 double]
    algorithm: 'duration'

transMatIGSG =

   98.1591    1.6798    0.1611
   12.3228   85.6961    1.9811
         0         0  100.0000
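
The grouping arithmetic can be sketched by hand in base MATLAB. This is an illustration under stated assumptions, not the toolbox code: it assumes the standard time-homogeneous duration estimator (transition intensity = number of transitions divided by time spent, and the transition matrix = matrix exponential of the intensity matrix over a one-year interval), it ignores moves within a group, and it types in totalsVec from the rounded display above. The names N, T, NG, and G are introduced here for illustration only.

```matlab
% Transition counts displayed above (rows and columns 'AAA' through 'D').
N = [ 0  67   7   3   0   0   0   0;
     67   0  68  15   3   0   0   1;
      4 101   0  93  11   1   0   1;
      1   7 163   0  62  10   2   5;
      1   3  16 168   0  37   0  11;
      0   0   2  10  83   0  10  14;
      0   0   0   2   8  16   0   7;
      0   0   0   0   0   0   0   0];

% Group edges: ratings 1-4 are 'IG', 5-7 are 'SG', 8 is 'D'.
edges = [4 7 8];
lowerEdges = [1, edges(1:end-1) + 1];
nG = numel(edges);

% Add up the transitions between groups.
NG = zeros(nG);
for i = 1:nG
    for j = 1:nG
        block = N(lowerEdges(i):edges(i), lowerEdges(j):edges(j));
        NG(i,j) = sum(block(:));
    end
end

% Grouped time spent in each category (rounded values from the display).
T = [4859.1 1503.4 1162.1];

% Intensity matrix: between-group counts per unit of time, with the
% diagonal set so that each row sums to zero (assumed estimator).
G = bsxfun(@rdivide, NG, T(:));
G = G - diag(diag(G));        % drop within-group moves
G = G - diag(sum(G,2));       % diagonal = minus the row sum

% One-year transition matrix in percent.
transMatSketch = 100 * expm(G)
```

Under these assumptions, the result should land close to the transMatIGSG values shown above; small differences are expected because totalsVec is entered from the rounded display.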

When a totals structure array is passed to transprobgrouptotals, this function groups each structure in the array individually and preserves sparsity, if the fields in the input structures are sparse. One way to exploit this feature is to compute confidence intervals for the investment grade default rate and the speculative grade default rate (see also Estimate Bootstrap Confidence Intervals).

PDIGSG = transMatIGSG(1:2,3);

idTotalsIGSG = transprobgrouptotals(idTotals,[4 7 8]);
bootstat = bootstrp(100,@(totals)transprobbytotals(totals),idTotalsIGSG);
ci = prctile(bootstat,[2.5 97.5]); % 95% confidence
CIlower = reshape(ci(1,:),3,3);
CIupper = reshape(ci(2,:),3,3);
PDIGSG_LB = CIlower(1:2,3);
PDIGSG_UB = CIupper(1:2,3);

[PDIGSG_LB PDIGSG PDIGSG_UB]

ans =

    0.0603    0.1611    0.2538
    1.3470    1.9811    2.6195

Work with Nonsquare Matrices

Transition probabilities and the number of transitions between ratings are usually reported without the 'D' ('Default') row. For example, a credit report can contain the following table, indicating the number of issuers starting in each rating (first column), and the number of transitions between ratings (remaining columns):

     Initial  AAA   AA    A  BBB   BB    B  CCC    D
  AAA     98   88    9    1    0    0    0    0    0
   AA    389    0  368   19    2    0    0    0    0
    A   1165    1   21 1087   56    0    0    0    0
  BBB   1435    0    2   89 1289   45    8    0    2
   BB    915    0    0    1   60  776   73    2    3
    B    867    0    0    1    7   88  715   39   17
  CCC    112    0    0    0    1    3   34   61   13

You can store the information in this table in a totals structure compatible with the cohort algorithm. For more information on the cohort algorithm and the totals structure, see transprob. The totalsMat field is a nonsquare array in this case.

% Define totals structure
totals.totalsVec = [98 389 1165 1435 915 867 112];
totals.totalsMat = [
   88    9    1    0    0    0    0    0;
    0  368   19    2    0    0    0    0;
    1   21 1087   56    0    0    0    0;
    0    2   89 1289   45    8    0    2;
    0    0    1   60  776   73    2    3;
    0    0    1    7   88  715   39   17;
    0    0    0    1    3   34   61   13];
totals.algorithm = 'cohort';

transprobbytotals and transprobgrouptotals accept totals inputs with nonsquare totalsMat fields. To get the transition matrix corresponding to the previous table, and to group ratings into investment and speculative grade with the corresponding matrix:

transMat = transprobbytotals(totals)

% Group into IG/SG and get IG/SG transition matrix
totalsIGSG = transprobgrouptotals(totals,[4 7]);
transMatIGSG = transprobbytotals(totalsIGSG)

transMat =

   89.7959    9.1837    1.0204         0         0         0         0         0
         0   94.6015    4.8843    0.5141         0         0         0         0
    0.0858    1.8026   93.3047    4.8069         0         0         0         0
         0    0.1394    6.2021   89.8258    3.1359    0.5575         0    0.1394
         0         0    0.1093    6.5574   84.8087    7.9781    0.2186    0.3279
         0         0    0.1153    0.8074   10.1499   82.4683    4.4983    1.9608
         0         0         0    0.8929    2.6786   30.3571   54.4643   11.6071


transMatIGSG =

   98.2183    1.7169    0.0648
    3.6959   94.5618    1.7423
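
For cohort totals like these, both results can be reproduced with plain array arithmetic. The following is a minimal sketch rather than the toolbox implementation: it assumes the cohort estimate is the number of transitions divided by the number of issuers starting in each rating, and that grouping adds up the counts over the rows and columns of each category. The variable names are chosen here for illustration.

```matlab
% Totals from the credit report table above.
startCounts = [98 389 1165 1435 915 867 112];
counts = [
   88    9    1    0    0    0    0    0;
    0  368   19    2    0    0    0    0;
    1   21 1087   56    0    0    0    0;
    0    2   89 1289   45    8    0    2;
    0    0    1   60  776   73    2    3;
    0    0    1    7   88  715   39   17;
    0    0    0    1    3   34   61   13];

% Cohort estimate: divide each row by its starting count (in percent).
cohortMat = 100 * bsxfun(@rdivide, counts, startCounts(:))

% Grouping: rows 1-4 are 'IG' and rows 5-7 are 'SG'; the columns use
% the same groups plus 'D' as its own category.
rowEdges = [4 7];
colEdges = [4 7 8];
rowLower = [1, rowEdges(1:end-1) + 1];
colLower = [1, colEdges(1:end-1) + 1];

groupedCounts = zeros(numel(rowEdges), numel(colEdges));
groupedStart = zeros(1, numel(rowEdges));
for i = 1:numel(rowEdges)
    rows = rowLower(i):rowEdges(i);
    groupedStart(i) = sum(startCounts(rows));
    for j = 1:numel(colEdges)
        block = counts(rows, colLower(j):colEdges(j));
        groupedCounts(i,j) = sum(block(:));
    end
end

cohortMatIGSG = 100 * bsxfun(@rdivide, groupedCounts, groupedStart(:))
```

The grouped matrix is computed from the summed counts, not by averaging rows of the 7-by-8 matrix, so ratings with more issuers carry more weight.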

Remove Outliers

The idTotals output from transprob can also be used to update the transition probability estimates after removing some outlier information. For more information on idTotals, see transprob. For example, if you know that the credit rating migration information for the 4th and 27th companies in the data has problems, you can remove those companies and efficiently update the transition probabilities as follows:

load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
[transMat,~,idTotals] = transprob(data,'startDate', ...
startDate, 'endDate',endDate);
transMat

transMat =

90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
 4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
 0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
 0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
 0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
 0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
 0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
      0         0         0         0         0         0         0  100.0000

nIDs = length(idTotals);
keepInd = setdiff(1:nIDs,[4 27]);
transMatNoOutlier = transprobbytotals(idTotals(keepInd))

transMatNoOutlier =

90.6241    7.9067    1.0290    0.4124    0.0211    0.0020    0.0003    0.0043
 4.4917   89.5918    4.4779    1.1240    0.2288    0.0094    0.0009    0.0756
 0.3990    6.1220   87.0530    5.4841    0.7643    0.0893    0.0050    0.0833
 0.1030    0.8576   10.7909   83.0207    3.9971    0.7001    0.1313    0.3992
 0.1043    0.3746    2.2960   14.0955   78.9840    3.0013    0.0463    1.0980
 0.0113    0.0544    0.7054    3.2925   15.4350   75.5988    1.8166    3.0860
 0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
      0         0         0         0         0         0         0  100.0000

Deciding which companies to remove is a case-by-case situation. Reasons to remove a company can include a typo in one of the ratings histories, or an unusual migration between ratings whose impact on the transition probability estimates must be measured. transprob does not reorder the companies in any way. The ordering of companies in the input data is the same as the ordering in the idTotals array.

Estimate Probabilities for Different Segments

You can use idTotals efficiently to get estimates over different segments of the sample. For more information on idTotals, see transprob. For example, assume that the companies in the example are grouped into three geographic regions and that the companies were grouped by geographic regions previously, so that the first 340 companies correspond to the first region, the next 572 companies to the second region, and the rest to the third region. You can efficiently get transition probabilities for each region as follows:

load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
[~,~,idTotals] = transprob(data,'startDate', ...
startDate, 'endDate',endDate);

n1 = 340;
n2 = 572;
transMatG1 = transprobbytotals(idTotals(1:n1))
transMatG2 = transprobbytotals(idTotals(n1+1:n1+n2))
transMatG3 = transprobbytotals(idTotals(n1+n2+1:end))

transMatG1 =

90.8299    7.6501    0.3178    1.1700    0.0255    0.0044    0.0021    0.0002
 4.3572   89.0262    5.7838    0.8039    0.0245    0.0029    0.0013    0.0001
 0.7066    6.7567   86.6320    5.4950    0.3721    0.0252    0.0101    0.0023
 0.0626    1.3688   10.3895   83.5022    3.6823    0.6466    0.3084    0.0396
 0.0256    0.7884    2.6970   13.7857   78.8321    2.8310    0.0561    0.9842
 0.0026    0.1095    0.4280    3.5204   21.1437   72.9230    1.6456    0.2273
 0.0005    0.0216    0.0730    0.4574    4.9586    4.2821   80.3062    9.9006
      0         0         0         0         0         0         0  100.0000

transMatG2 =

90.5798    8.4877    0.8202    0.0884    0.0132    0.0011    0.0000    0.0096
 4.1999   90.0371    3.8657    1.4744    0.2144    0.0128    0.0001    0.1956
 0.3022    5.9869   86.7128    5.5526    1.0411    0.1902    0.0015    0.2127
 0.0204    0.5606   10.9342   82.9195    4.0123    0.7398    0.0059    0.8073
 0.0089    0.3338    2.1185   16.6496   76.2395    3.1241    0.0261    1.4995
 0.0013    0.0465    0.6710    2.4731   14.7281   76.7378    1.2993    4.0428
 0.0002    0.0080    0.0681    0.4598    4.1324    8.4380   80.9092    5.9843
      0         0         0         0         0         0         0  100.0000

transMatG3 =

90.5655    7.5408    1.5288    0.3369    0.0258    0.0015    0.0003    0.0004
 4.8073   89.3842    4.4865    0.9582    0.3509    0.0095    0.0009    0.0025
 0.3153    5.8771   87.6353    5.4101    0.7160    0.0322    0.0052    0.0088
 0.1995    0.8625   10.8682   82.8717    4.1423    0.6903    0.1565    0.2090
 0.2465    0.1091    2.1558   12.0289   81.5803    3.0057    0.0616    0.8122
 0.0227    0.0400    0.9380    4.3175   12.3632   75.9429    2.5766    3.7991
 0.0149    0.0180    0.3414    3.6918    8.1414   13.6010   70.7254    3.4661
      0         0         0         0         0         0         0  100.0000

Work with Large Datasets

This example shows how to aggregate estimates from two (or more) datasets. It is possible that two datasets, coming from two different databases, must be considered for the estimation of the transition probabilities. Also, if a dataset is too large and cannot be loaded into memory, the dataset can be split into two (or more) datasets. In these cases, it is simple to apply transprob to each individual dataset, and then get the final estimates corresponding to the aggregated data with a call to transprobbytotals at the end.

For example, the dataset data is artificially split into two sections in this example. In practice, the two datasets would come from different files or databases. When aggregating multiple datasets, the history of a company cannot be split across datasets. You can verify that this condition is satisfied for the arbitrarily chosen cut-off point.

load Data_TransProb

cutoff = 2099;
data(cutoff-5:cutoff,:)
data(cutoff+1:cutoff+6,:)

ans = 

    '00011166'    '24-Aug-1995'    'BBB'
    '00011166'    '25-Jan-1997'    'A'  
    '00011166'    '01-Feb-1998'    'AA' 
    '00014878'    '15-Mar-1983'    'B'  
    '00014878'    '21-Sep-1986'    'BB' 
    '00014878'    '17-Jan-1998'    'BBB'


ans = 

    '00012043'    '09-Feb-1985'    'BBB'
    '00012043'    '03-Jan-1988'    'A'  
    '00012043'    '15-Jan-1994'    'AAA'
    '00011157'    '24-Jun-1984'    'A'  
    '00011157'    '09-Dec-1999'    'BBB'
    '00011157'    '28-Mar-2001'    'A'

When working with multiple datasets, it is important to set the start and end dates explicitly. Otherwise, the estimation window differs for each dataset because the default start and end dates used by transprob are the earliest and latest dates found in the input data.

startDate = '31-Dec-1995';
endDate = '31-Dec-2000';

In practice, this is the point where you would read in the first dataset. In this example, the data is already loaded. Call transprob with the first dataset and the explicit start and end dates. Keep only the sampleTotals output. For details on sampleTotals, see transprob.

[~,sampleTotals(1)] = transprob(data(1:cutoff,:),...
   'startDate',startDate,'endDate',endDate);

Repeat for the remaining datasets. Note that the different sampleTotals structures are stored in a structure array.

[~,sampleTotals(2)] = transprob(data(cutoff+1:end,:),...
   'startDate',startDate,'endDate',endDate);

To get the transition matrix corresponding to the aggregated dataset, use transprobbytotals. When the totals input is a structure array, transprobbytotals aggregates the information over all structures, and returns a single transition matrix.

transMatAggr = transprobbytotals(sampleTotals)

transMatAggr =

   90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
    0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
    0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000

As a sanity check, for this example you can verify that the aggregation procedure yields the same estimates (up to numerical differences) as estimating the probabilities directly over the entire sample:

transMatWhole = transprob(data,'startDate',startDate,'endDate',endDate)
aggError = max(max(abs(transMatAggr - transMatWhole)))

transMatWhole =

   90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
    0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
    0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000

aggError =

  2.8422e-014
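
The reason aggregation works can be sketched with two small, hypothetical totals structures. This is an illustration under stated assumptions, not the toolbox implementation: it assumes that aggregating a structure array amounts to adding up the totalsVec and totalsMat fields, and it uses cohort-style counts (two ratings plus default) that are invented for this example.

```matlab
% Hypothetical cohort-style totals for two datasets.
t = struct('totalsVec', {[100 50], [300 150]}, ...
           'totalsMat', {[90 8 2; 5 40 5], [276 21 3; 12 126 12]});

% Aggregate by adding up the fields over all structures.
aggVec = zeros(1,2);
aggMat = zeros(2,3);
for k = 1:numel(t)
    aggVec = aggVec + t(k).totalsVec;
    aggMat = aggMat + t(k).totalsMat;
end

% Estimate once from the pooled totals (in percent).
pooledMat = 100 * bsxfun(@rdivide, aggMat, aggVec(:))

% For comparison: the unweighted average of the per-dataset estimates.
mat1 = 100 * bsxfun(@rdivide, t(1).totalsMat, t(1).totalsVec(:));
mat2 = 100 * bsxfun(@rdivide, t(2).totalsMat, t(2).totalsVec(:));
averagedMat = (mat1 + mat2) / 2
```

The pooled estimate weights each dataset by its size, so it generally differs from the plain average of the per-dataset matrices. This is why the totals, and not the individual transition matrices, are the quantities to carry forward.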