Operational Risk Management: A Complete Guide for Banking and Fintech

Reading outline

Definition and drivers of operational risk
- The definition of operational risk
  - In the past anything not captured by market risk or credit risk was considered operational risk
  - Now the Basel 2 definition has been adopted as a common one.
    - Operational risk is the risk of loss resulting from inadequate or failed processes, people, and systems or from external events. This definition includes legal risk, but excludes strategic and reputational risk.
      - Operational risk is a risk of incurring a financial loss.
      - This definition captures the four main drivers of operational risk, which are:
        
        Processes
        
        People
        
        Systems
        
        External events
- 2012 London Olympics: a case study
- Operational risk management and operational risk measurement
- Drivers of operational risk management
- Key points
The regulatory push
- History of the Basel accords
- Rules of the accords
- Adoption of Basel 2 in Europe
- Adoption of Basel 2 in the United States
- Impact of the Global financial crisis
- Basel 3
- Key points
The operational risk framework
- Overview of the operational risk framework
  - A paper by BCBS titled ‘Revisions to the Principles for the Sound Management of Operational Risk’ provides helpful guidelines for the best practices for operational risk departments.
  - The following are the ‘moving parts’ of a decent operational risk management framework (ORMF):
  - Governance
  - Risk appetite
  - Policies and procedures
  - Culture and awareness
  - Measurement and modeling (founded on the ‘main data building blocks’ mentioned below)
  - Reporting
  - Main data building blocks of an operational risk framework are: loss data collection, risk and control self-assessment (RCSA), scenario analysis and key risk indicators
- The foundations of the framework
  - Governance determines the roles and responsibilities of:
  - the head of the operational risk function,
  - the team that manages the framework,
  - the committees that oversee and make key decisions about risk management,
  - the operational risk managers in lines of business and
  - every employee who may encounter operational risk.
  - For an ORMF to be effective the governance structure should be appropriate and reviewed regularly (at least once a year).
- Without culture and awareness in the organization an ORMF would just be reduced to a paper tiger. Winning the hearts and minds of your colleagues for the topic of operational risk, as well as promoting the skill of spotting operational risks, is the difference between you fighting windmills all alone and everybody doing their part.
- Policies and procedures are paramount in effective operational risk management, as they clearly define the ‘rules of engagement’ and establish a foundation for process transparency. Additionally, often
- The four data building blocks
  - Internal loss data is needed to be able to accurately model operational risk as well as have just plain managerial transparency into what is going on in the organization, what types of risks are more common, what types of risks are more costly, etc.
  - External loss data is an important supplement to the internal data in two regards. Firstly, it allows you to estimate the likelihood and severity of risks which might not yet have occurred in your own organization. Secondly, it allows you to benchmark against the industry and identify any outliers.
  - Risk and control self-assessment (RCSA) is designed to help us understand what additional potential risks we are facing today (in addition to the loss data, which explains what has happened). During RCSA, risks and controls are identified and assessed. RCSA often becomes the most important part of the framework because it proactively addresses the basic requirements of operational risk management:
    - identify,
    - assess,
    - control, and
    - mitigate risk.
  - Scenario analysis is focused on identifying major, plausible, catastrophic risks. As opposed to RCSA, which identifies ‘expected’ risks, scenario analysis deals with tail events. Scenario analysis used to be a key element in the Basel 2 AMA capital calculation, but a lot of organizations struggled to meet those requirements. Hence, in Basel 3 it is now a risk management tool, rather than a key input for capital calculations.
  - Key risk indicators (KRIs) are often confused with risk measures, which they are not. KRIs are indicators that predict a change in a certain risk. A consumer credit risk equivalent would be, for example, the number of missed or delayed payments by a debtor in the first few months of their loan’s term. A more gruesome example could be taken from recent events: the number of field medics and field blood banks the Russian army concentrates on your border indicates the likelihood that they are going to try to invade.
- By measurement and modeling, the book mostly means the Basel 3 capital calculation via the business indicator component (BIC) and the internal loss multiplier (ILM).
- Reporting
- Risk appetite is the foundational binding element of an ORMF. Although it is difficult to express a risk appetite for operational risk, it is not impossible. #draft
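
The BIC and ILM calculation mentioned under measurement and modeling can be sketched roughly as follows. This is a minimal sketch, assuming the marginal coefficients (12%, 15%, 18%), the bucket boundaries (EUR 1bn and 30bn) and the loss component of 15 times average annual losses from the December 2017 Basel 3 text; the function names and input figures are hypothetical, and national implementations may differ (for example by fixing the ILM at 1).

```python
import math

# Assumed marginal coefficients per business indicator (BI) bucket, in EUR.
BUCKETS = [
    (1e9, 0.12),       # BI up to 1bn
    (30e9, 0.15),      # BI between 1bn and 30bn
    (math.inf, 0.18),  # BI above 30bn
]


def business_indicator_component(bi: float) -> float:
    """Apply the marginal coefficients to each slice of the BI."""
    bic = 0.0
    lower = 0.0
    for upper, coefficient in BUCKETS:
        if bi > lower:
            bic += (min(bi, upper) - lower) * coefficient
        lower = upper
    return bic


def internal_loss_multiplier(avg_annual_losses: float, bic: float) -> float:
    """ILM = ln(e - 1 + (LC / BIC)^0.8), with LC = 15 x average annual losses."""
    loss_component = 15 * avg_annual_losses
    return math.log(math.e - 1 + (loss_component / bic) ** 0.8)


def operational_risk_capital(bi: float, avg_annual_losses: float) -> float:
    """Capital = BIC x ILM; for bucket 1 banks the ILM is taken as 1."""
    bic = business_indicator_component(bi)
    if bi <= 1e9:
        return bic
    return bic * internal_loss_multiplier(avg_annual_losses, bic)


# Hypothetical bank: BI of EUR 2bn, EUR 18m average annual losses.
capital = operational_risk_capital(2e9, 18e6)
```

When the loss component equals the BIC, the ILM is exactly 1, so a bank with average loss experience holds the BIC as capital; higher historical losses push the multiplier above 1.
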
Operational risk governance
- Roles of governance
- First line of defense
- Second line of defense
- Third line of defense
- Risk committees
- Key points
Culture and awareness
Policies and procedures
Internal operational risk event loss data
External loss data
Key risk indicators

Risk and control self-assessments

Three main methods for conducting RCSA:
- Questionnaires
  - Advantages
    - Convenient to set up, and respondents can complete the questionnaire at their own pace and time.
    - Easy to set up something online and have automatic data collection.
  - Disadvantages
    - Easy to miss emerging risks, since respondents might be reluctant to raise issues.
    - Easy to promote a ‘tick all boxes’ culture.
- Workshops
  - Advantages
    - Opportunity to have a really in-depth discussion about the risk situation of the organization.
    - Effective in embedding operational risk management in the firm.
  - Disadvantages
    - Full RCSA workshops can go on for multiple hours, require several sessions and participation of senior stakeholders.
    - Requires a lot of work to prepare: workshop program, topics, materials, coordination and scheduling.
    - Can produce inconsistent results.
- A hybrid approach is also possible in order to mitigate the tradeoffs of each individual approach and find a better combination method valid for the firm in question.
RCSA requires some form of scoring of the probability and impact severity of a risk.
This is also necessary to evaluate the impact of controls, since those can reduce the severity or probability of a risk.

Firms subject to the Sarbanes-Oxley Act might need to have a control effectiveness scoring in place, which can be integrated with a running RCSA. A scoring approach to control design and performance can be set up as follows:

| Dimension | Low | Medium | High |
| --- | --- | --- | --- |
| Design | The design provides only limited protection when used correctly | The design provides some protection when used correctly | The design provides excellent protection when used correctly |
| Performance | The control is rarely performed | The control is sometimes performed | The control is always performed |

The low-medium-high (or red-amber-green) ranking can then be combined into final score calculations as needed.
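
One way such a combination could look is sketched below. The ‘weakest link’ rule (a control is only as effective as the lower of its design and performance ratings) is an assumption made for illustration, not a rule from the book, and the function name is hypothetical.

```python
# Ordinal encoding of the low-medium-high scale.
LEVELS = {"low": 1, "medium": 2, "high": 3}
NAMES = {value: name for name, value in LEVELS.items()}


def control_effectiveness(design: str, performance: str) -> str:
    """Combine design and performance ratings using a weakest-link rule.

    Assumption: a well-designed control that is rarely performed
    (or vice versa) is rated no higher than its weaker dimension.
    """
    return NAMES[min(LEVELS[design], LEVELS[performance])]


print(control_effectiveness("high", "low"))     # low
print(control_effectiveness("medium", "high"))  # medium
```

Other combination rules (averaging, weighting design over performance) are equally possible; what matters is that the rule is documented and applied consistently.
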

In addition to control effectiveness assessments, you also need the risk assessment, which can also be done with the ‘LMH/RAG-algebra’. It is common to score a risk according to at least its probability of occurring and its impact severity (i.e. how much money would be lost). Additional dimensions, including non-financial ones, can be included in risk scoring, for example:
- Reputational
- Client
- Regulatory or legal
- Loss of life
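
The risk scoring described above can be sketched as a small calculation. This is an illustrative sketch only: taking the worst impact across dimensions, multiplying it by the probability, and the RAG thresholds are all assumptions, and every name is hypothetical.

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(probability: str, impacts: dict) -> int:
    """Score = probability x worst impact across all assessed dimensions.

    `impacts` maps a dimension name (e.g. "financial", "reputational")
    to a low/medium/high rating.
    """
    worst_impact = max(LEVELS[level] for level in impacts.values())
    return LEVELS[probability] * worst_impact


def rag(score: int) -> str:
    """Map a 1-9 score to red/amber/green using assumed thresholds."""
    if score >= 6:
        return "red"
    if score >= 3:
        return "amber"
    return "green"


score = risk_score("medium", {"financial": "low", "reputational": "high"})
print(score, rag(score))  # 6 red
```

Using the worst dimension means a risk with a small financial impact but a severe reputational or regulatory impact is not averaged away.
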
RCSA best practices
- Interview the participants firsthand. This will help you to design an RCSA that is relevant to the participants and is taken seriously.
- Review available background data from other functions. When preparing an RCSA, review all available background information; it might help you design a better RCSA or give you insights into past or remediated operational risks.
- Review past RCSAs and related RCSAs. Not sure why this is a separate point, since it’s part of appropriate background preparation: review past RCSAs or any related documents that might have assessed the effectiveness of controls (like audit reports) or severity of potential risks.
- Review internal and external loss data and events. Build up your own data intelligence about the risk situation of the firm.
- Carefully select and train participants. Ensure that you have the right people conducting the self-assessment: both from the seniority and specialist knowledge perspectives.
- Document results. Surprised this is a separate ‘best practice’, but document your results in a clear, structured and searchable manner.
- Score appropriately. Make sure that your scoring methodology is firm and business appropriate.
- Identify mitigating actions. Any mitigating actions that have been agreed upon or already undertaken must be included in an RCSA for an accurate overall picture.
- Implement appropriate technology. Don’t do things manually or with inferior tools.
- Ensure completeness using taxonomies. Using taxonomies (for risks, processes, organization, etc.) can guide you towards having complete coverage of the firm and its risks.
- Identify themes. Specifically, firm-wide or even industry-wide themes that are affecting your organization. This will help you identify additional risks and build up a narrative to promote buy-in with your stakeholders.
- Leverage existing assessments. Duh.
- Schedule appropriately. Most firms do an annual RCSA, but a different schedule might be relevant for you.
- Backtest or validate results. Use data and models to ensure your RCSA makes sense.

Scenario analysis
- Challenging element in an ORMF.
- Scenarios are an instrument to assess a ‘rare but plausible’ loss, or in other words — tail risks.
- Is one of the required inputs to the Basel 2 Advanced Measurement Approach (AMA).
- Firms that do not follow AMA might still pursue scenario analysis programs, as they provide valuable insight into major risks faced by the organization.
- Scenario analysis is a way to explore ‘what-if’ scenarios beyond the experience of the firm. Here the use of external data grows in importance, since it offers a solid foundation for modeling these scenarios.
- Approaches
  - Workshops
  - Interviews
  - Data analysis
- In practice people need to answer difficult questions, like ‘How big could event X be?’ or ‘Could it happen in the next 20 years?’
- Basel guidance on AMA scenario analysis
  - a. A clearly defined and repeatable process
    - The contents of a scenario analysis might vary considerably year-to-year or scenario-to-scenario, which is why it is important to have a consistent and repeatable process for this exercise.
    - Ensure you have consistent procedures and standards for all scenarios over time.
    - A robust scenario process does not need to be overly complex.
  - b. Good quality background preparation of the participants in the scenario generation process
    - Preparation for scenario analysis is similar to preparations for RCSA workshops:
      - Interviews — the facilitator or the preparation team interviews the key business managers and support manager for the considered domain.
      - Internal loss data provides a good ‘floor’ for losses, but does not show what could go wrong. History of losses should not be shared directly with the participants, but facilitators of scenario analysis workshops should be aware of them.
      - External loss data is one of the most important inputs for the scenario analysis process. For example, in the category of ‘Internal Fraud’ the firm might not have enough examples of its own to develop a credible scenario and might need to rely on public examples from other companies.
      - RCSA results are also a valuable source of input, as they might guide scenario authors to the scenario that is most pertinent to the firm. Importantly, though, even low risks might become dominant in a stress scenario.
      - Compliance and audit findings are helpful in challenging claims that controls are effective and working well.
      - Key metrics and analytics provide a good, quantitative foundation for scenario generation. For example, there are quantitative and statistical models for terrorist attacks in large cities. Use of such metrics is sometimes referred to as factor analysis.
      - A straw man scenario list might be initially suggested to the participants to steer discussions towards the most relevant topics.
  - c. Qualified and experienced facilitators with consistency in the facilitation process
    - There needs to be a neutral facilitator to the workshop who not only knows the process completely, but also is proficient in managing the conversations to ensure no one person dominates the discussion.
  - d. The appropriate representatives of the business, subject matter experts and the corporate operational risk management function as participants involved in the process
    - It might be necessary that scenario exercises have a set quorum of participants to ensure that a broad range of topics is covered and there is a wide buy-in in the organization for the generated scenarios.
  - e. A structured process for the selection of data used in the developing scenario estimates
    - Preparation for scenario analysis is all about data collection, which is often facilitated through free-flowing discussions either in workshops or interviews. Hence there need to be checkpoints and structure to these discussions for the output to be meaningful and usable.
  - f. High quality documentation which provides clear reasoning and evidence supporting the scenario output
    - There might be a reluctance to document these discussions as sensitive issues are raised and discussed.
    - There’s a push for firms under the purview of certain regulators to maintain detailed documentation.
    - Although you don’t need to record all the conversations, you do need to document the thought process thoroughly, so that you can later review the design and scenario generation process for both audit trail and design iteration purposes.
  - g. A robust independent challenge process and oversight by the corporate operational risk management function to ensure the appropriateness of scenario estimates
    - Scenario estimates must be challenged.
    - Usually it is done by the operational risk function.
    - It should be done both:
      - as part of the scenario generation and estimation process
      - as part of a formal challenge and review process (which can take the form of an email thread or a post-scenario-generation review and documentation workshop).
  - h. A process that is responsive to change in both the internal and external environment
    - A scenario analysis activity should capture the current state of the business and control environments and any change in those environments must trigger a new activity.
    - Scenario generation methods should be updated when there are significant changes to the business or the business environment. For example, after the 2011 UBS rogue trading scandal many firms would adjust their treatment of rogue trading risks.
  - i. Mechanisms of mitigating biases inherent in scenario process. Such biases include anchoring, availability and motivational biases.
    - All methods have biases.
    - An expert might be knowledgeable in their particular field, yet might not have the statistical background in estimating the probabilities or impact of a risk, or understanding those estimates.
    - Scenario analysis exercises should therefore be facilitated by people knowledgeable about these biases and about the methods necessary to compensate for them.
    - Of course, avoid introducing new biases throughout the process.
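
Questions such as ‘Could it happen in the next 20 years?’ can be made more tangible with a little arithmetic. The sketch below assumes that scenario events arrive as a Poisson process with a constant annual frequency; this is a common simplification, not something prescribed by the Basel guidance, and the function names and figures are hypothetical.

```python
import math


def prob_at_least_one(annual_frequency: float, horizon_years: float) -> float:
    """P(at least one event within the horizon) under a Poisson assumption."""
    return 1.0 - math.exp(-annual_frequency * horizon_years)


def expected_annual_loss(annual_frequency: float, severity: float) -> float:
    """Expected annual loss = frequency x average severity per event."""
    return annual_frequency * severity


# A '1-in-20-year' scenario with an estimated severity of 100m.
p = prob_at_least_one(1 / 20, 20)
eal = expected_annual_loss(1 / 20, 100e6)
print(f"{p:.1%}")  # 63.2%
```

A ‘1-in-20-year’ event is therefore far from certain to occur within 20 years, which is a useful point to raise when workshop participants anchor on round-number frequencies.
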
Capital modeling
- Basel 2 offers three approaches to calculating regulatory operational risk capital:
  - Basic Indicator Approach, where you need to hold 15% of your 3-year average gross income as a capital buffer for operational risk.
  - Standardized approach, where you risk-weight your 3-year average gross income by business line.
  - Advanced measurement approach where you need to build a regulator-approved internal risk model with the following inputs:
    - Internal loss data
    - External loss data
    - Scenario analysis
    - Business environment and internal control factors
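
The first two approaches are simple enough to sketch in code. This is a minimal sketch, assuming the Basel 2 parameters as I recall them (alpha of 15%, business line betas of 12%, 15% or 18%, years with non-positive gross income excluded from the basic indicator average, and negative yearly totals floored at zero in the standardized approach); the function names and input figures are hypothetical.

```python
ALPHA = 0.15

# Assumed Basel 2 beta factors for the eight business lines.
BETAS = {
    "corporate_finance": 0.18,
    "trading_and_sales": 0.18,
    "retail_banking": 0.12,
    "commercial_banking": 0.15,
    "payment_and_settlement": 0.18,
    "agency_services": 0.15,
    "asset_management": 0.12,
    "retail_brokerage": 0.12,
}


def basic_indicator_capital(gross_income_by_year: list) -> float:
    """Alpha times the average of positive annual gross income."""
    positive = [gi for gi in gross_income_by_year if gi > 0]
    if not positive:
        return 0.0
    return ALPHA * sum(positive) / len(positive)


def standardized_capital(income_by_year_and_line: list) -> float:
    """Average over years of the beta-weighted gross income per line.

    Each list element is a dict {business_line: gross_income} for one year;
    a negative total for a year is floored at zero.
    """
    yearly_charges = []
    for year in income_by_year_and_line:
        charge = sum(BETAS[line] * gi for line, gi in year.items())
        yearly_charges.append(max(charge, 0.0))
    return sum(yearly_charges) / len(yearly_charges)


bia = basic_indicator_capital([100.0, 200.0, 300.0])
tsa = standardized_capital(
    [{"retail_banking": 100.0, "trading_and_sales": 100.0}] * 3
)
```
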
Reporting
Risk appetite
Reputational risk and operational risk
Operational risk and convergence
Best practices in related risk management activities
Case studies
- JPMorgan Chase ‘London whale’ scandal, which serves here as an introductory case
  - In May 2012, JPMorgan announced that it had lost 2 billion USD on a hedging strategy that was being driven by Bruno Michel Iksil, also known as ‘The London Whale’. In its Chief Investment Office (CIO), no less.
  - Question is: was it due to poor governance? Were these losses predictable according to JPMorgan’s risk management setup? Was this a result of acceptable risk practices or unacceptable risk taking?
  - 2 billion USD is a lot of money to lose, but not unheard of in bulge bracket trading business. So why the fuss?
  - Financial journalists had already warned in April that these positions were unmanageable:
    - Wall Street Journal
    - ~~Bloomberg ‘London’s Biggest Whale’, April 2012~~ (no longer available)
  - Later, Jamie Dimon admitted that the losses stemmed from a flawed process:
    - In hindsight, the new strategy was flawed, complex, poorly reviewed, poorly executed, and poorly monitored. The portfolio has proven to be riskier, more volatile, and less effective as an economic hedge than we thought. — Source
    - We are also amending a disclosure in the first quarter press release about CIO’s VaR, value at risk. We’d shown average VaR at 67. It will now be 129. — Source
      - If VaR was wrong, then most of the firm was pretty much blind to the risks.
  - The trading strategy was shut down four days after it hit the press in April, which might have further exacerbated the losses.
  - What followed was a series of lawsuits by disgruntled shareholders, a criminal inquiry by the DoJ, and a review opened by the SEC.
  - To sum up:
    - Positions were not being accurately captured by the firm’s risk management tools
    - The trading was going on with little or no understanding at the senior management level
    - Regulators suspected foul play
  - JPMorgan conducted internal reviews and released special reports on the matter. To quote Bloomberg reporting at the time:
    - In a 129-page report issued yesterday, the bank described an ‘error prone’ risk-modeling system that required employees to cut and paste electronic data to a spreadsheet. Workers inadvertently used the sum of two numbers instead of the average in calculating volatility. The firm also reiterated an assertion that London traders initially tried to hide losses that ballooned beyond 6.2 billion USD in last year’s first nine months.
  - JPMorgan’s internal review task force had five key observations:
    - CIO’s judgement, execution and escalation of issues in the first quarter of 2012 were poor, in at least six critical areas:
      1. CIO management established competing and inconsistent priorities for the Synthetic Credit Portfolio without adequately exploring or understanding how the priorities would be simultaneously addressed;
      2. The trading strategies that were designed in an effort to achieve the various priorities were poorly conceived and not fully understood by CIO management and other CIO personnel who might have been in a position to manage the risks of the Synthetic Credit Portfolio effectively;
      3. CIO management (including CIO’s finance function) failed to obtain robust, detailed reporting on the activity in the Synthetic Credit Portfolio, and/or to otherwise appropriately monitor the traders’ activity as closely as they should have;
      4. CIO personnel at all levels failed to adequately respond to and escalate (including to senior firm management and the board of directors) concerns that were raised at various points during the trading;
      5. certain of the traders did not show the full extent of the Synthetic Credit Portfolio’s losses; and
      6. CIO provided to senior firm management excessively optimistic and inadequately analyzed estimates of the Synthetic Credit Portfolio’s future performance in the days leading up to the April 13 earnings call. …
    - The firm did not ensure that the controls and oversight of CIO evolved commensurately with the increased complexity and risks of CIO’s activities
    - CIO risk management lacked the personnel and structure necessary to manage the risks of the Synthetic Credit Portfolio.
    - The risk limits applicable to CIO were not sufficiently granular.
    - Approval and implementation of the new CIO VaR model for the Synthetic Credit Portfolio in late January 2012 were flawed, and the model as implemented understated the risks presented by the trades in the first quarter of 2012.
  - Note ORX and IBM FIRST classify this event differently, so that might have an impact on models relying on risk classification.
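
The spreadsheet error quoted above (a sum used where an average was intended) is easy to reproduce. The sketch below assumes the error sat in the denominator of a relative-change calculation; the quote does not spell this out, so treat the exact formula, the function name and the numbers as hypothetical.

```python
def relative_change(old: float, new: float, use_sum: bool = False) -> float:
    """Relative change between two observations.

    Intended: divide the difference by the average of the two values.
    Erroneous variant (use_sum=True): divide by their sum instead.
    """
    denominator = (old + new) if use_sum else (old + new) / 2
    return (new - old) / denominator


intended = relative_change(100.0, 110.0)
erroneous = relative_change(100.0, 110.0, use_sum=True)
print(intended / erroneous)
```

Because the sum is always twice the average, the erroneous variant reports half the intended change for any pair of inputs. Under this assumption, every volatility estimate built on it, and any VaR figure derived from that volatility, would be understated.
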
- Credit Suisse’s Archegos debacle
  - See also my review of the Credit Suisse special report on Archegos
  - The supplied case study is from IBM FIRST.
  - Between 24 March and 26 March 2021 an obscure New York hedge fund was hit with rounds of margin calls.
  - This also seems to have inflicted significant losses on several banks that acted as prime brokers to Archegos Capital Management (a family office of Bill Hwang).
  - Banks that had entered into large swaps positions with Archegos as their counterparty found themselves trying to liquidate blocks of shares in certain US media and Chinese tech stocks that Mr. Hwang favored.
  - On 06 April 2021, after selling blocks of 60 million shares linked to Archegos, Credit Suisse announced in a trading update that it estimated its Archegos-linked loss at 4.4 billion CHF.
  - Several executives and senior managers —including head of investment banking and chief risk officer— had to leave the firm.
  - On 29 July 2021, Credit Suisse released a summary of an independent report into its recent Archegos losses:
    - ‘Conspicuous’ risks posed by Archegos’ positions were ignored by managers
    - Dynamic hedging was not used, causing the bank to pay 2.4 billion USD in ‘variable’ margin collateral to Archegos
  - This event was described as Event #19063.
  - Archegos was set up to manage Bill Hwang’s own money, after SEC’s complaint for insider trading against him.
  - The SEC’s complaint against Mr. Hwang and Tiger Asia (his former hedge fund) was eventually settled for payments totaling 44 million USD (Event #9260)
  - Archegos has made very large and bullish bets on a small number of equities, including ViacomCBS, Discovery, IQIYI, Baidu, Tencent, GSX Techedu.
  - These exposures were set up by entering into derivatives contracts —including total return swaps— with various investment banks’ prime brokerage units.
  - The exposure was heavily leveraged.
  - The triggering event was ViacomCBS’s issue of 30 million new shares on 22 March 2021 in an equity market that was hitting new highs. This diluted the stock, yet the issue was under-subscribed. This caused the share price to fall sharply.
  - This led to a margin call being issued to the heavily leveraged Archegos by one or more of Mr. Hwang’s prime brokers. Archegos’ forced sales in turn triggered further margin calls.
  - Six banks were acting as prime brokers to Archegos or engaged in derivatives contracts with the firm. Apparently, they were unaware of the fact until the first margin call.
  - …
  - Control failings and contributory factors
    - Inadequate due diligence efforts; failure to disclose
      - To some extent, all banks that lent to Archegos or entered into swaps contracts with him were unaware of the extent of his liabilities or the extent of leverage Mr. Hwang had taken on. This led to a ‘rush for the exits’ as Archegos’ holdings came under pressure to sell.
    - Undertook excessive risks; poor judgement
      - The decision to keep extending loans to Archegos raised questions about risk-management practices by prime brokers towards ‘family offices’ and other wealthy investors
    - Lack of internal controls
      - Internal reviews by banks (and possible future action by regulators) may lead to a reassessment of how large banks manage counterparty risk and collateral, and may lead to more disclosures related to swaps contracts.
    - Inadequate organizational structures
      - It appears that a July 2020 merger of risk-management and compliance functions into a single unit to save money may have backfired and contributed to even larger losses.
    - Corporate and market conditions
      - A low-interest rate environment has pushed investors and lenders to take more risk in search of returns, and may have weakened risk-management. Margin calls were triggered by a share offering that disfavored Archegos’ positions. Some of the banks appear to have had inadequate hedges in place.
  - Corrective actions and management response
  - Forensic report
- DNB Bank ASA money laundering scandal
- UBS rogue trading scandal
- Knight Capital technology glitch