Previously on this blog, we’ve talked about how MSRC automates the root cause analysis of vulnerabilities that are reported and found. After doing this, our next step is variant analysis: finding and investigating any variants of the vulnerability. It’s important that we find all such variants and patch them simultaneously; otherwise, we risk the unpatched ones being exploited in the wild. In this post, I’d like to explain the automation we use in variant finding.
For the past year or so, we’ve been augmenting our manual code review processes with Semmle, a third-party static analysis environment. It compiles code into a relational database (the snapshot database, which combines the database with the source code). That database is then queried using Semmle QL, a declarative, object-oriented query language designed for program analysis.
The basic workflow is that, after root cause analysis, we write queries to find code patterns that are semantically similar to the original vulnerability. Any results are triaged as usual and provided to our engineering teams for a fix to be developed. Also, the queries are placed in a central repository to be re-run periodically by MSRC and other security teams. This way, we can scale our variant finding over time and across multiple codebases.
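To make the idea of "code as data" a little more concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is only an analogy: it is not QL, and it does not reflect Semmle's actual snapshot schema. The `calls` table, its columns, the codebase names, and the function names are all hypothetical, invented for illustration. The sketch shows one saved query (unchecked `memcpy` calls, standing in for the pattern behind some original vulnerability) being re-run across two toy codebases.

```python
import sqlite3

# Hypothetical toy schema: one row per call site. A real snapshot database
# is far richer; this only illustrates querying code as relational data.
SCHEMA = """
CREATE TABLE calls (
    caller TEXT,
    callee TEXT,
    size_checked INTEGER
);
"""

# A saved "variant" query: memcpy call sites with no bounds check on the
# size argument (an assumed stand-in for the original vulnerability pattern).
VARIANT_QUERY = """
SELECT caller FROM calls
WHERE callee = 'memcpy' AND size_checked = 0
ORDER BY caller;
"""


def build_snapshot(rows):
    """Create an in-memory database holding the call-site facts for one codebase."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO calls VALUES (?, ?, ?)", rows)
    return db


def run_variant_query(snapshots):
    """Run the same saved query against every codebase and collect the hits."""
    return {
        name: [row[0] for row in db.execute(VARIANT_QUERY)]
        for name, db in snapshots.items()
    }


# Two hypothetical codebases, each reduced to a handful of call-site facts.
snapshots = {
    "codebase_a": build_snapshot([
        ("parse_header", "memcpy", 0),
        ("copy_name", "memcpy", 1),
    ]),
    "codebase_b": build_snapshot([
        ("read_chunk", "memcpy", 0),
        ("log_msg", "strlen", 0),
    ]),
}

results = run_variant_query(snapshots)
for name, callers in results.items():
    print(name, callers)
```

Because the query is kept separate from the data, it can be stored once and re-run whenever a codebase changes or a new one is added, which is the property that lets this kind of variant finding scale.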
In addition to variant analysis, we’ve been using QL proactively in our security reviews of source code. This will be the topic of a future blog post. For now, let’s look at some real-world examples inspired by MSRC cases.