cozy.analysis

Module Contents

Classes

FieldDiff

EqFieldDiff

For a field to be equal, all subcomponents of the body must be equal.

NotEqLeaf

A not equal leaf is a field that cannot be further unpacked/traversed.

NotEqFieldDiff

For a field to be not equal, there must be at least one subcomponent of the body that was not equal.

DiffResult

StateDiff

StateDiff encapsulates the memoized state used by the difference method. This class is used internally by Comparison and is typically not for external use.

CompatiblePair

Stores information about comparing two compatible states.

Comparison

This class stores all compatible pairs and orphaned states. An orphaned state is one that has no compatible state in the other execution tree. In most scenarios there will be no orphaned states.

Functions

_invalid_stack_addrs(→ range)

_invalid_stack_overlap(invalid_stack_left, ...)

_stack_addrs(→ range)

nice_name(→ str | None)

This function attempts to create a human understandable name for an address, or returns None if it can't figure one out.

compare_side_effect(→ FieldDiff)

hexify(val0)

Recursively transforms all integers in a Python datastructure (that is mappable with functools_ext.fmap) to hex strings.

cozy.analysis._invalid_stack_addrs(st: angr.SimState) → range
cozy.analysis._invalid_stack_overlap(invalid_stack_left: range, invalid_stack_right: range, stack_change: int)
cozy.analysis._stack_addrs(st: angr.SimState) → range
cozy.analysis.nice_name(state: angr.SimState, malloced_names: portion.IntervalDict[tuple[str, portion.Interval]], addr: int) → str | None

This function attempts to create a human understandable name for an address, or returns None if it can’t figure one out.

class cozy.analysis.FieldDiff
class cozy.analysis.EqFieldDiff(left_body, right_body)

Bases: FieldDiff

For a field to be equal, all subcomponents of the body must be equal. In this case, left_body and right_body should not hold any further FieldDiffs within themselves. Rather left_body and right_body should be the entire fields for which differencing was checked (and it was determined that all subfields are equal).

class cozy.analysis.NotEqLeaf(left_leaf, right_leaf)

Bases: FieldDiff

A not equal leaf is a field that cannot be further unpacked/traversed.

class cozy.analysis.NotEqFieldDiff(body_diff)

Bases: FieldDiff

For a field to be not equal, there must be at least one subcomponent of the body that was not equal. In this case, body_diff will hold further FieldDiffs within itself. Equal subfields of the bodies will be represented by EqFieldDiff, whereas unequal subfields will be represented by further nested NotEqFieldDiff.
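
The nesting described above can be illustrated with a small sketch. The classes below are stand-ins that only mirror the names and constructor arguments listed on this page, and `toy_diff` is a hypothetical helper that compares concrete Python values. cozy's own comparison works on side effects under a joint solver, so this is an illustration of the result shape and not the library's implementation.

```python
from dataclasses import dataclass
from typing import Any


class FieldDiff:
    """Stand-in base class; mirrors the name used in cozy.analysis."""


@dataclass
class EqFieldDiff(FieldDiff):
    left_body: Any
    right_body: Any


@dataclass
class NotEqLeaf(FieldDiff):
    left_leaf: Any
    right_leaf: Any


@dataclass
class NotEqFieldDiff(FieldDiff):
    body_diff: Any


def toy_diff(left: Any, right: Any) -> FieldDiff:
    """Hypothetical differ over concrete values (assumption: no solver involved)."""
    if left == right:
        # The whole field is equal, so it is stored without nested diffs.
        return EqFieldDiff(left, right)
    both_dicts = isinstance(left, dict) and isinstance(right, dict)
    if both_dicts and left.keys() == right.keys():
        # At least one subfield differs, so recurse into every subfield.
        return NotEqFieldDiff({k: toy_diff(left[k], right[k]) for k in left})
    # Nothing left to unpack: record the unequal leaf.
    return NotEqLeaf(left, right)


diff = toy_diff(
    {"fd": 1, "data": {"len": 4, "buf": b"abcd"}},
    {"fd": 1, "data": {"len": 4, "buf": b"abce"}},
)
```

Here `diff` is a NotEqFieldDiff whose "fd" entry is an EqFieldDiff and whose "data" entry is a nested NotEqFieldDiff ending in a NotEqLeaf for "buf".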

cozy.analysis.compare_side_effect(joint_solver, left_se, right_se) → FieldDiff
class cozy.analysis.DiffResult(mem_diff: dict[range, tuple[claripy.ast.bits, claripy.ast.bits]], reg_diff: dict[str, tuple[claripy.ast.bits, claripy.ast.bits]], side_effect_diff: dict[str, list[tuple[cozy.side_effect.PerformedSideEffect | None, cozy.side_effect.PerformedSideEffect | None, FieldDiff]]])
class cozy.analysis.StateDiff

StateDiff encapsulates the memoized state used by the difference method. This class is used internally by Comparison and is typically not for external use.

difference(sl: angr.SimState, sr: angr.SimState, ignore_addrs: collections.abc.Iterable[range] | None = None, compute_mem_diff=True, compute_reg_diff=True, compute_side_effect_diff=True, use_unsat_core=True, simplify=False) → DiffResult | None

Compares two states to find differences in memory, registers, and side effects. This function returns None if the two states have non-intersecting inputs. Otherwise, it returns a DiffResult holding the memory addresses, registers, and side effects that differ between the two. This function is based on angr.analyses.congruency_check.CongruencyCheck().compare_states, but has been customized for our purposes. Note that this function may use memoization to enhance performance.

Parameters:
  • sl (SimState) – The first state to compare

  • sr (SimState) – The second state to compare

  • ignore_addrs (collections.abc.Iterable[range] | None) – Memory addresses to ignore when doing the memory diffing. This representation is more efficient than a set of integers since the ranges involved can be quite large.

  • compute_mem_diff (bool) – If this flag is True, then we will diff the memory. If this is False, then the memory diff in the returned result will be None.

  • compute_reg_diff (bool) – If this flag is True, then we will diff the registers. If this is False, then the register diff in the returned result will be None.

  • use_unsat_core (bool) – If this flag is True, then we will use unsat core optimization to speed up comparison of pairs of states. This option may cause errors in Z3, so disable if this occurs.

Returns:

None if the two states are not compatible, otherwise a DiffResult containing the memory, register, and side effect differences.

Return type:

DiffResult | None
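
The note on ignore_addrs above says ranges are more efficient than a set of integers. A minimal sketch of why, using only the standard library: a Python range stores just its endpoints and answers integer membership tests without enumerating its elements. The helper `is_ignored` and the example addresses are hypothetical and are not part of cozy.

```python
from collections.abc import Iterable
from typing import Optional


def is_ignored(addr: int, ignore_addrs: Optional[Iterable[range]]) -> bool:
    """Hypothetical helper: True if addr falls inside any ignored range."""
    if ignore_addrs is None:
        return False
    # `addr in r` is an arithmetic check on the range bounds, not a scan.
    return any(addr in r for r in ignore_addrs)


# One range object describes a 1 GiB region; an equivalent set of integers
# would need about a billion entries.
ignore = [range(0x7FFF_0000_0000, 0x7FFF_4000_0000), range(0x1000, 0x1010)]
```

With these values, `is_ignored(0x100F, ignore)` is True and `is_ignored(0x1010, ignore)` is False, because ranges exclude their upper bound.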

cozy.analysis.hexify(val0)

Recursively transforms all integers in a Python datastructure (that is mappable with functools_ext.fmap) to hex strings.

Parameters:

val0 – The datastructure to traverse.

Returns:

A deep copy of the datastructure, with all integers converted to hex strings.
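
To make the behavior concrete, here is a standalone approximation written against plain Python containers. It does not use functools_ext.fmap, and several details are assumptions made for this sketch: only dict values (not keys) are transformed, booleans are left untouched, and the name `hexify_sketch` is hypothetical.

```python
from typing import Any


def hexify_sketch(val: Any) -> Any:
    """Return a copy of val with every int replaced by its hex string."""
    # bool is a subclass of int, so check it first and leave it unchanged.
    if isinstance(val, bool):
        return val
    if isinstance(val, int):
        return hex(val)
    if isinstance(val, dict):
        # Assumption: keys are kept as-is and only values are mapped.
        return {k: hexify_sketch(v) for k, v in val.items()}
    if isinstance(val, (list, tuple)):
        # Rebuild the same container type from the transformed elements.
        return type(val)(hexify_sketch(v) for v in val)
    return val


example = hexify_sketch({"addr": 4096, "regs": [255, (16, "rax")], "ok": True})
```

Here `example` equals `{"addr": "0x1000", "regs": ["0xff", ("0x10", "rax")], "ok": True}`.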

class cozy.analysis.CompatiblePair(state_left: cozy.terminal_state.TerminalState, state_right: cozy.terminal_state.TerminalState, mem_diff: dict[range, tuple[claripy.ast.Base, claripy.ast.Base]], reg_diff: dict[str, tuple[claripy.ast.Base, claripy.ast.Base]], side_effect_diff: dict[str, list[tuple[cozy.side_effect.PerformedSideEffect | None, cozy.side_effect.PerformedSideEffect | None, FieldDiff]]], mem_diff_ip: dict[int, tuple[frozenset[claripy.ast.Base]], frozenset[claripy.ast.Base]], compare_std_out: bool, compare_std_err: bool)

Stores information about comparing two compatible states.

Variables:
  • state_left (TerminalState) – Information pertaining specifically to the pre-patched state being compared.

  • state_right (TerminalState) – Information pertaining specifically to the post-patched state being compared.

  • mem_diff (dict[range, tuple[claripy.ast.Base, claripy.ast.Base]]) – Maps memory addresses to pairs of claripy ASTs, where the left element of the tuple is the data in memory for state_left, and the right element of the tuple is what was found in memory for state_right. Only memory locations that are different are saved in this dict.

  • reg_diff (dict[str, tuple[claripy.ast.Base, claripy.ast.Base]]) – Similar to mem_diff, except that the dict is keyed by register names. Note that some registers may be subparts of another. For example in x64, EAX is a subregister of RAX.

  • side_effect_diff (dict[str, list[tuple[PerformedSideEffect | None, PerformedSideEffect | None, FieldDiff]]]) – Maps side effect channels to a list of 3 element tuples, where the first element is the performed side effect from the left binary, the second element is the performed side effect from the right binary, and the third element is the diff between the body of the side effects.

  • mem_diff_ip (dict[int, tuple[frozenset[claripy.ast.Base]], frozenset[claripy.ast.Base]]) – Maps memory addresses to a set of instruction pointers that the program was at when it wrote that byte in memory. In most cases the frozensets will have a single element, but this may not be the case in the scenario where a symbolic value determined the write address.

  • compare_std_out (bool) – If True then we should consider stdout when checking if the two input states are equal.

  • compare_std_err (bool) – If True then we should consider stderr when checking if the two input states are equal.

equal_side_effects() → bool
equal() → bool

Determines if the two compatible states are observationally equal. That is, they contain the same memory contents, registers, stdout, and stderr after execution.

Returns:

True if the two compatible states are observationally equal, and False otherwise.

Return type:

bool

concrete_examples(args: any, num_examples=3) → list[cozy.concrete.CompatiblePairInput]

Concretizes the arguments used to put the program in these states by jointly using the constraints attached to the compatible states.

Parameters:
  • args (any) – The input arguments to concretize. This argument may be a Python datastructure; the concretizer will make a deep copy with claripy symbolic variables replaced with concrete values.

  • num_examples (int) – The maximum number of concrete examples to generate for this particular pair.

Returns:

A list of concrete inputs that satisfy both constraints attached to the states.

Return type:

list[CompatiblePairInput]

class cozy.analysis.Comparison(pre_patched: cozy.session.RunResult, post_patched: cozy.session.RunResult, ignore_addrs: list[range] | None = None, ignore_invalid_stack=True, compare_memory=True, compare_registers=True, compare_side_effects=True, compare_std_out=False, compare_std_err=False, use_unsat_core=True, simplify=False)

This class stores all compatible pairs and orphaned states. An orphaned state is one that has no compatible state in the other execution tree. In most scenarios there will be no orphaned states.

Variables:
  • pairs (dict[tuple[SimState, SimState], CompatiblePair]) – pairs stores a dictionary that maps a pair of (pre_patch_state, post_patch_state) compatible states to their comparison information

  • orphans_left (set[TerminalState]) – Pre-patched states for which there are 0 corresponding compatible states in the post-patch

  • orphans_right (set[TerminalState]) – Post-patched states for which there are 0 corresponding compatible states in the pre-patch

Compares a bundle of pre-patched states with a bundle of post-patched states.

Parameters:
  • pre_patched (session.RunResult) – The pre-patched state bundle

  • post_patched (session.RunResult) – The post-patched state bundle

  • ignore_addrs (list[range] | None) – A list of address ranges to ignore when comparing memory.

  • ignore_invalid_stack (bool) – If this flag is True, then memory differences in locations previously occupied by the stack are ignored.

  • compare_memory (bool) – If True, then the analysis will compare locations in the program memory.

  • compare_registers (bool) – If True, then the analysis will compare registers used by the program.

  • compare_side_effects (bool) – If True, then the analysis will compare side effects produced by the program.

  • compare_std_out (bool) – If True, then the analysis will save stdout written by the program in the results. Note that angr currently concretizes values written to stdout, so these values will be binary strings.

  • compare_std_err (bool) – If True, then the analysis will save stderr written by the program in the results.

  • use_unsat_core (bool) – If this flag is True, then we will use unsat core optimization to speed up comparison of pairs of states. This option may cause errors in Z3, so disable if this occurs.

  • simplify (bool) – If this flag is True, then symbolic memory and register differences will be simplified as much as possible. This flag is typically only necessary if you want to do some deep inspection of symbolic contents. Simplification can slow things down a lot, and symbolic expressions are usually so complex that they are not easily understandable. This is why in most scenarios the flag should be left as False.
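
The relationship between pairs, orphans_left, and orphans_right can be sketched with a toy model. In this sketch each terminal state is represented by the set of concrete inputs that reach it, and two states are treated as compatible when those sets intersect, echoing the "non-intersecting inputs" wording used for StateDiff.difference. The function name, state labels, and input sets are all hypothetical; cozy itself decides compatibility from symbolic constraints, not by enumerating inputs.

```python
from itertools import product


def pair_and_orphans(left: dict, right: dict):
    """Toy model: each state label maps to the set of inputs that reach it."""
    # A pair is compatible when at least one input reaches both states.
    pairs = {(l, r) for l, r in product(left, right) if left[l] & right[r]}
    # An orphan is a state that appears in no compatible pair.
    orphans_left = set(left) - {l for l, _ in pairs}
    orphans_right = set(right) - {r for _, r in pairs}
    return pairs, orphans_left, orphans_right


pre = {"L0": frozenset(range(0, 10)), "L1": frozenset(range(10, 20))}
post = {
    "R0": frozenset(range(0, 15)),
    "R1": frozenset(range(15, 20)),
    "R2": frozenset(range(20, 25)),  # inputs that no pre-patched state covers
}
pairs, orphans_left, orphans_right = pair_and_orphans(pre, post)
```

In this example every pre-patched state has a partner, so `orphans_left` is empty, while "R2" ends up in `orphans_right`.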

get_pair(state_left: angr.SimState, state_right: angr.SimState) → CompatiblePair

Retrieves a CompatiblePair given two compatible input states.

Parameters:
  • state_left (SimState) – The pre-patched state

  • state_right (SimState) – The post-patched state

Returns:

The CompatiblePair object corresponding to this compatible state pair.

Return type:

CompatiblePair

is_compatible(state_left: angr.SimState, state_right: angr.SimState) → bool

Returns True when the two input states are compatible based on the pairs stored in this object, and False otherwise.

Parameters:
  • state_left (SimState) – The pre-patched state

  • state_right (SimState) – The post-patched state

Returns:

True if the input states are compatible, and False otherwise

Return type:

bool

__iter__() → collections.abc.Iterator[CompatiblePair]

Iterates over compatible pairs stored in the comparison.

Returns:

An iterator over compatible pairs.

Return type:

Iterator[CompatiblePair]

verify(verification_assertion: Callable[[CompatiblePair], claripy.ast.Base | bool]) → list[CompatiblePair]

Determines which compatible state pairs are valid with respect to a verification assertion. The comparison results are verified with respect to verification_assertion when the returned list is empty (has length 0).

Parameters:

verification_assertion (Callable[[CompatiblePair], claripy.ast.Base | bool]) – A function which takes in a compatible pair and returns a claripy expression which must hold for all inputs under the joint constraints of the state pair. Alternatively the function can return a bool. If the return value is False, this will be considered a verification failure. If the return value is True, this will be considered a verification success.

Returns:

A list of all compatible pairs for which there was a concrete input that caused the verification assertion to fail.

Return type:

list[CompatiblePair]
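
The contract of verify (an empty list means success, otherwise the list holds the failing pairs) can be illustrated with a toy model. Everything here is a hypothetical stand-in: `ToyPair` replaces CompatiblePair, a Python predicate over concrete inputs replaces the claripy expression, and brute-force enumeration replaces the solver query. It sketches the described semantics and is not cozy's implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToyPair:
    """Stand-in for CompatiblePair: joint inputs plus each side's output."""
    joint_inputs: frozenset
    left_out: Callable[[int], int]
    right_out: Callable[[int], int]


def toy_verify(pairs: list, assertion: Callable) -> list:
    """Return the pairs for which some joint input violates the assertion."""
    failures = []
    for pair in pairs:
        result = assertion(pair)
        if isinstance(result, bool):
            # A plain bool is taken at face value: False means failure.
            ok = result
        else:
            # Assumption: enumeration stands in for asking a solver whether
            # any input satisfying the joint constraints breaks the property.
            ok = all(result(x) for x in pair.joint_inputs)
        if not ok:
            failures.append(pair)
    return failures


same = ToyPair(frozenset(range(4)), lambda x: x + 1, lambda x: 1 + x)
differs = ToyPair(frozenset(range(4, 8)), lambda x: x * 2, lambda x: x * 2 + (x == 7))


def outputs_match(pair):
    # The "expression" to check: both sides produce the same output.
    return lambda x: pair.left_out(x) == pair.right_out(x)


failed = toy_verify([same, differs], outputs_match)
```

Only `differs` is reported, since input 7 makes the two outputs disagree.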

report(args: any, concrete_post_processor: Callable[[any], any] | None = None, num_examples: int = 3) → str

Generates a human-readable report of the result object, saved as a string. This string is suitable for printing.

Parameters:
  • args (any) – The symbolic/concolic arguments used during execution. These args are concretized here so that we can give examples of concrete input.

  • concrete_post_processor (Callable[[any], any] | None) – This function is used to post-process concretized versions of args before they are added to the return string. Some examples of this function include converting an integer to a negative number due to use of two’s complement, or slicing off parts of the argument based on another part of the input arguments.

  • num_examples (int) – The number of concrete examples to show the user.

Returns:

A human-readable summary of the comparison.

Return type:

str
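
As an example of the two's complement case mentioned for concrete_post_processor, the sketch below reinterprets unsigned bit patterns as signed integers. The function names, the 32-bit default width, and the assumption that the concretized args arrive as a flat sequence of integers are all hypothetical; the actual shape depends on what was passed as args.

```python
def to_signed(value: int, bits: int = 32) -> int:
    """Reinterpret an unsigned bit pattern as a two's complement integer."""
    # Keep only the low `bits` bits of the value.
    value &= (1 << bits) - 1
    # If the sign bit is set, subtract 2**bits to obtain the negative value.
    return value - (1 << bits) if value >> (bits - 1) else value


def signed_args(concrete_args):
    """Hypothetical post-processor: assumes a flat sequence of 32-bit ints."""
    return [to_signed(a) for a in concrete_args]


shown = signed_args([0xFFFFFFFE, 5])
```

With this post-processor the pattern 0xFFFFFFFE would be shown as -2, and 5 stays 5.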