Scorecarding Security

Sep 12, 2024

I’ve previously discussed my view on security programs steeped in Security Engineering. I’ve highlighted the challenges (Challenges in Security Engineering Programs), and put out a call to action (Don’t Security Engineer Asymmetric Workloads).

Now, I want to synthesize a useful tactic I’ve seen in these programs: scorecarding.1

This is most prominent in security organizations that:

  1. Try to avoid antagonism, adversariality, and “saying no”
  2. Take a partnership-based approach to security. This could involve Security Champions, explicitly titled Security Partners (as seen at Netflix or Facebook), or just a culture of consultation.

Scaling these programs is challenging.

As Jacob Salassi put it at LocoMocoSec:

  1. Security Champions are human crutches that prop up cumbersome processes that don’t scale
  2. Every security consultation is a failure

Partnership-oriented programs need to navigate the facts that:

  1. The security team is not omniscient
  2. The security team must avoid generating excess or asymmetric work (and imposing cost) on engineering
  3. The security team relies on engineering teams to help resolve risk, so it needs to build a relationship with engineers and especially avoid escalations (“when practiced routinely, they are a telltale sign of a dysfunctional team”)

In addition to these challenges, tracking progress, establishing metrics, and reporting upwards are all generally seen as challenging for security programs.

To start, let’s extract some lessons from prominent public reports of Scorecarding in security programs.

Case Studies

Chime

Chime’s posts about Monocle, now a couple of years old, were where I first started thinking about Scorecarding.

They discuss their approach, including process, program, and tooling. In short, they use an extensible security/risk score presented on a centralized dashboard and leaderboard. New controls can be rolled up into the security score. They also provide remediation guidance for each control, and both monitor and reactively communicate in response to score dips. Tied into GitOps, this is all presented as non-blocking, with explicit support for risk acceptance.
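Chime doesn’t publish Monocle’s internals, but the rollup mechanic is easy to picture in a few lines of code. The sketch below is my own illustration under assumed names and weights; it is not Chime’s implementation:

```python
# A minimal sketch of an extensible control-rollup score, in the spirit of
# the Monocle posts. All names, weights, and structure are illustrative
# assumptions, not Chime's actual implementation.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Control:
    name: str
    weight: float                     # relative contribution to the score
    check: Callable[[str], bool]      # evaluates one service; True = passing
    remediation: str                  # guidance surfaced alongside failures

@dataclass
class Scorecard:
    controls: list[Control] = field(default_factory=list)

    def register(self, control: Control) -> None:
        # New controls roll up into the score without changing consumers.
        self.controls.append(control)

    def score(self, service: str, accepted: frozenset[str] = frozenset()) -> float:
        # Risk-accepted controls are excluded rather than counted as failures,
        # which keeps the whole process non-blocking.
        active = [c for c in self.controls if c.name not in accepted]
        if not active:
            return 100.0
        total = sum(c.weight for c in active)
        passing = sum(c.weight for c in active if c.check(service))
        return 100.0 * passing / total
```

A leaderboard is then just this score computed per service and sorted, and a drop between two consecutive runs is the trigger for the reactive communication the posts describe.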

Overall, it is an approach that includes elements of gamification, while educating service and code owners on security standards and posture. Reporting is also consumable by leadership. They share a haiku:

Atomic fixes
doable by engineers
observed by leaders

Chime pitches the benefits as:

  1. Choosing where to prioritize investments in security
  2. Empowering engineers and teams to independently improve the security posture of their code
  3. Achieving these two goals while preserving a philosophy centered on cross-functional collaboration

Article 1: Monocle: How Chime creates a proactive security & engineering culture (Part 1)
Article 2: Mitigating Risky Pull Requests with Monocle Risk Advisor (Part 2)

Netflix

Netflix’s attempt predates Chime’s. Development started as early as 2015, and is discussed in the linked 2019 talk (summarized by Clint).

Two distinct tools are outlined. The first, “Penguin Shortbread,” handles application inventory and automated risk rating. It “can see which apps are using security controls, like SSO, app-to-app mTLS and secrets storage, and if they’ve filled out the security questionnaire.” The second is “Security Brain,” a customer-facing dashboard for this risk and security data. At the time, Security Brain encompassed application risk, open vulnerabilities, and outstanding betterments.
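The talk stays at the architecture level, but the inventory-plus-signals join can be pictured roughly as follows; the record fields and thresholds are my assumptions, not Netflix’s:

```python
# A rough sketch of deriving a coarse risk rating from control adoption,
# loosely modeled on the "Penguin Shortbread" description. Fields and
# thresholds are invented for illustration.
from dataclasses import dataclass

@dataclass
class AppRecord:
    name: str
    uses_sso: bool
    uses_mtls: bool
    uses_secrets_storage: bool
    questionnaire_complete: bool      # fills gaps automation can't observe

def risk_rating(app: AppRecord) -> str:
    adopted = sum([app.uses_sso, app.uses_mtls, app.uses_secrets_storage])
    if adopted == 3 and app.questionnaire_complete:
        return "low"
    if adopted >= 2:
        return "medium"
    return "high"
```

A dashboard like “Security Brain” then only has to join these ratings with open vulnerabilities and outstanding betterments per application.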

Article: A Pragmatic Approach for Internal Security Partnerships

GitHub

GitHub is the most recent company to highlight a scorecarding program, with theirs branded as “Engineering Fundamentals.” One notable distinction: GitHub’s program brings availability, security, and accessibility all under a single banner. Engineering Fundamentals is positioned as a Governance program, with Fundamentals rolling up to strategic planning. Data is surfaced to leaders.

A couple more meaningful extracts:

  1. “We expect that some scorecards will eventually become concrete technical controls” (see the sketch below)
  2. “we celebrate the wins publicly, however small they may seem”

Article: GitHub’s Engineering Fundamentals program: How we deliver on availability, security, and accessibility
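The first extract deserves emphasis. Promoting a scorecard measurement into a concrete technical control can be as simple as reusing the same predicate in a blocking context. A hypothetical sketch, with invented names rather than GitHub’s tooling:

```python
# A hypothetical sketch of a scorecard check graduating into a technical
# control: the same predicate that fed the dashboard becomes a CI gate.
import sys

# Stand-in for a real measurement, e.g., a platform API lookup.
MEASURED_PASSING = {"org/service-a"}

def branch_protection_enabled(repo: str) -> bool:
    return repo in MEASURED_PASSING

def report(repo: str) -> None:
    # Scorecard mode: measure and surface, never block.
    status = "pass" if branch_protection_enabled(repo) else "fail"
    print(f"{repo}: branch protection {status}")

def enforce(repo: str) -> None:
    # Control mode: the same check now fails the pipeline.
    if not branch_protection_enabled(repo):
        sys.exit(f"{repo}: branch protection required")
```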

Atlassian

h/t Kane Narraway

Atlassian briefly alludes to their version of this approach in their Trust center:

an automated process Atlassian has created whereby we use a broad range of security-focused criteria (e.g. current vulnerabilities, training coverage, and recent security incidents) to provide an overall security score for each of our products.

Article: Security scorecards

Vulnerability Management Case Studies

All of the prior Case Studies share an important characteristic: they go beyond managing vulnerability data to also cover risk, posture, or even non-security considerations. However, Vulnerability Management has long been home to similar practices, and the rise of ASPM is only the latest manifestation of attempts to turn it into a product category.

A few discussions of Vulnerability Scorecarding include generalizable lessons for us.

Segment (Twilio)

Eric Ellett outlines the process of bringing a vulnerability management program to Segment after 16k vulnerabilities were dropped in his lap on day one. Version 1 was a basic dashboard for metrics derived from Jira data, tracking vulnerability counts, vulnerability categories, and breached SLAs. In Version 2, they changed Risk Acceptance to due date extension and improved notifications. Most importantly, Version 2 began the process of supporting metric rollup and handling re-orgs.

This allowed them to observe risk acceptance watering down “breached SLA” as a useful metric, so they introduced “security debt,” which also allowed for org-wide rollups and apples-to-apples comparisons. From a process perspective, this was paired with a monthly “Top Debt” review meeting. Forward-looking plans involve tying security debt to the cost of manifested incidents.

Article: Embracing Risk Responsibly: Moving beyond inflexible SLAs and exception hell by treating security vulnerabilities and risk like actual debt
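The posts don’t publish a formula, but a debt metric with the properties described (weighted accrual, clean org-wide rollup, resilience to re-orgs) might be sketched like this; the severity weights and days-overdue accrual are my assumptions:

```python
# A sketch of a "security debt" metric that rolls up across an org tree,
# inspired by the Segment posts. The weights and the days-overdue accrual
# are illustrative assumptions, not their published formula.
from dataclasses import dataclass, field

SEVERITY_WEIGHT = {"critical": 10, "high": 5, "medium": 2, "low": 1}

@dataclass
class Vulnerability:
    severity: str
    days_overdue: int                 # days past the (possibly extended) due date

@dataclass
class OrgNode:
    name: str
    vulns: list[Vulnerability] = field(default_factory=list)
    children: list["OrgNode"] = field(default_factory=list)

    def debt(self) -> int:
        # Overdue findings accrue weighted "interest"; rollup is a plain sum,
        # which keeps comparisons apples-to-apples and survives re-orgs
        # (just re-parent the nodes).
        own = sum(SEVERITY_WEIGHT[v.severity] * max(v.days_overdue, 0)
                  for v in self.vulns)
        return own + sum(child.debt() for child in self.children)
```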

Takeaways

Have I convinced you to pursue scorecarding in your security program? If so, here are a few practical tips:

  1. Understand and solve for prerequisites. Most of these blog posts elide the substantial work that had already been done to build service/application discovery, ownership, business criticality, and service severity.

  2. Start small. Pick a team you already have a good relationship with, and get feedback continuously. Use a single, high-signal data feed initially, and always work post-triage. Good options would be triaged bug bounty findings, vulnerabilities from SAST, or a small subset of posture findings.

  3. Make explicit space for risk acceptance. Figure out how risks can be accepted, under which circumstances, and at which altitude. Determine your approach to managing accepted risks as part of reporting. Consider terminology like “due date extension” to clarify the process (and periodic review) for partner teams; a minimal sketch of such a record follows this list.

  4. Establish and market the benefits of maturity. Find a way to tie risk improvement to company goals, financial impact, or other top-level business metrics.

  5. Celebrate wins, big and small. Find opportunities to talk positively about teams’ ownership of security risk. Chat with some UX folks, sprinkle in some delight. Get improvements formally recognized as part of performance management. You’ll have no better friend than one you helped get promoted!

  6. Look for every opportunity to turn a measurement into an invariant. Kill bug classes, and build secure defaults!
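On point 3, a “due date extension” can be made concrete as a small record with an expiry and a review cadence. The shape below is hypothetical, not any of these programs’ actual schemas:

```python
# A hypothetical record for "due date extension" style risk acceptance.
# Field names and the 90-day review cadence are assumptions.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class DueDateExtension:
    finding_id: str
    approver: str                     # the "altitude" at which risk was accepted
    rationale: str
    granted_on: date
    new_due_date: date
    review_every: timedelta = timedelta(days=90)

    def needs_review(self, today: date) -> bool:
        # Extensions are revisited periodically and expire at the new due
        # date, so accepted risk stays visible in reporting rather than
        # becoming a permanent, invisible exception.
        overdue = today >= self.new_due_date
        stale = (today - self.granted_on) >= self.review_every
        return overdue or stale
```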

  1. While I’m focused on recent discussions, this is obviously not a new approach. There are ties to studies of gamification in security, such as those done with phishing simulations. Security has always sought a Single Pane of Glass. Kenna Security had a very successful outcome with a related approach.