Property Simulation – a new model for CDT’s Static Analysis

Several months ago, I discovered the need for improvements with Eclipse CDT’s static analysis. Namely, we need an analysis model that correctly accounts for control flow paths of a program. To test the abilities of the framework, I created a checker to detect instances where close was being called on a file descriptor that did not have a corresponding open call earlier in the control flow. Consider the following snippet of code:

int fd;
if (x) {
        fd = open("file", O_RDONLY);
}
close(fd);

If we simply traversed a function containing this code with an ASTVisitor, all we could do is detect that the open call is visited before the close call and assume there is no error condition. In reality, we know that close is called improperly if the branch is not executed. Thus we have a conditional error that will occur if x evaluates to false.

In my search for a solution, I came across a paper entitled “ESP: path-sensitive program verification in polynomial time” from Microsoft Research and the University of Washington. The paper explains and compares different static analysis techniques. A temporal property to check is expressed as a finite state machine. A finite state machine for our file open/close example could look like1:

The finite state machine’s states are known to the algorithm as Property States. We also keep track of the Execution State of the program at a given instant. This includes necessary conditions to execute a given block in the control flow graph. In our code snippet above, the open call would have an Execution State of x, meaning x must be true in order to execute that block. Each Execution State takes the form of a Boolean formula and subsequent if statements append another clause to it. The basic structure that contains all of this information in the algorithm is called a Symbolic State. Each Symbolic State contains one or more Property States and an Execution State. Each edge in the control flow graph references a set of Symbolic States that describe the state of the program at that point. In our snippet, the out-edge of the open statement block would have the following Symbolic State associated with it: [opened, x].

The paper covers three static analysis methods, which differ in the way Symbolic States are grouped and simplified. Here is how they differ at merge points:

  1. Fully Path-sensitive Analysis (PSA) – No grouping is performed. All Symbolic States from either side of the branch are kept.
  2. Dataflow Analysis – All Symbolic States from both sides of the branch are consolidated into a union of the Property States and join of Execution States.
  3. Property Simulation – Symbolic States are grouped by their Property State, corresponding Execution States are joined.

This diagram shows how each method analyzes a control flow graph1.

Click for the complete diagram

You can see that the Fully Path-sensitive Analysis (PSA) retains all information and performs no state grouping at merge points. Note that this means that each branch point doubles the number of states. Fully Path-sensitive Analysis is seen as intractable due to this exponential blowup in the number of states. The Dataflow Analysis does the very opposite. Whenever the Execution State is made more specific by a branch point, it is discarded at the merge point. We then mistakenly transition to an error state because we lost the Execution State context that caused the file to be opened.

Property Simulation meets the two approaches in the middle. It only discards Execution State information when the branch did not cause a Property State transition, thus it is not meaningful to our analysis. Note the Fully Path-sensitive Analysis, but not Property Simulation, stored Execution State information for p even though the branch that depends on p does not have any file operations. In doing this, we avoid the inefficiencies of the Fully Path-sensitive Analysis, while avoiding the false error reporting of the Dataflow Analysis.

Details on my work implementing this algorithm for the CDT will follow shortly.


1. Image Credit: Das, M., Lerner, S., and Seigle, M. 2002. ESP: path-sensitive program verification in polynomial time. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (Berlin, Germany, June 17 – 19, 2002). PLDI ’02. ACM, New York, NY, 57-68.

Advertisements

7 Responses to “Property Simulation – a new model for CDT’s Static Analysis”

  1. Ben Says:

    Interesting! Does this work with dependant function calls etc. as well? They would have to be unrolled into your model or something i guess?

    • Elliott Says:

      It can deal with function calls by generating a summary for the function call. The summary attempts to express the called function as a single control flow block. So it maps input Symbolic States to output Symbolic States. The implementation I wrote doesn’t yet deal with control flow between function calls though, but it can be extended to do so using this method.

  2. Doto Says:

    Have you looked at the new GCC plug-in architecture http://gcc.gnu.org/wiki/plugins, http://gcc.gnu.org/onlinedocs/gccint/Plugins.html?
    Having a GCC plug-in to do static analysis will allow long runs (even a whole night) for advanced dataflow analysis. The GCC static analysis plug-in could also be integrated into Eclipse just like GCC, GDB, etc. are integrated into Eclipse

  3. Andrew Overholt Says:

    Very cool stuff, Elliott! I look forward to seeing the results with the CDT.

  4. Property Simulation – a new model for CDT’s Static Analysis (Part 2) « Elliott Baron's Blog Says:

    […] Elliott Baron's Blog Just another WordPress.com weblog « Property Simulation – a new model for CDT’s Static Analysis […]

  5. Mike Norman Says:

    Hi Elliott,

    Cool stuff. I’m proposing a BOF (Birds of a Feather session) at Eclipsecon on Static Analysis. I’d like to meet you there, although I’d also like you to put a comment on my BOF propsoal at https://www.eclipsecon.org/submissions/2010/view_talk.php?id=1636 to help make sure the BOF happens!

    See you at EclipseCon.

    MIke

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: