[文档]classSAILRStructurer(PhoenixStructurer):""" The SAILR structuring algorithm is the phoenix-based algorithm from the USENIX 2024 paper SAILR. The entirety of the algorithm is implemented across this class and various optimization passes in the decompiler. To find each optimization class, simply search for optimizations which reference this class.NAME. At a high-level, SAILR does three things different from the traditional Phoenix schema-based algorithm: 1. It recursively structures the graph, rather than doing it in a single pass. This allows decisions to be made based on the current state of what the decompilation would look like. 2. It performs deoptimizations targeting specific optimizations that introduces gotos and mis-structured code. It can only do this because of the recursive nature of the algorithm. 3. It uses a more advanced heuristic for virtualizing edges, which is implemented in this class. Additionally, some changes in Phoenix are only activated when SAILR is used. """NAME="sailr"
def_order_virtualizable_edges(self,graph:networkx.DiGraph,edges:list,node_seq:dict[Any,int])->list:""" The criteria for "best" is defined by a variety of heuristics described below. """iflen(edges)<=1:returnedges# TODO: the graph we have here is not an accurate graph and can have no "entry node". We need a better graph.try:entry_node=next(nodefornodeingraph.nodesifgraph.in_degree(node)==0)exceptStopIteration:entry_node=Nonebest_edges=edgesifentry_nodeisnotNone:# the first few heuristics are based on the post-dominator count of the edge# so we collect them for each candidate edgeedge_postdom_count={}edge_sibling_count={}foredgeinedges:_,dst=edgegraph_copy=networkx.DiGraph(graph)graph_copy.remove_edge(*edge)sibling_cnt=graph_copy.in_degree(dst)ifsibling_cnt==0:continueedge_sibling_count[edge]=sibling_cntpost_dom_graph=PostDominators(graph_copy,entry_node).post_dompost_doms=set()forpostdom_node,dominateeinpost_dom_graph.edges():ifnotisinstance(postdom_node,TemporaryNode)andnotisinstance(dominatee,TemporaryNode):post_doms.add((postdom_node,dominatee))edge_postdom_count[edge]=len(post_doms)# H1: the edge that has the least amount of sibling edges should be virtualized first# this is believed to reduce the amount of virtualization needed in future rounds and increase# the edges that enter a single outer-scope if-stmtifedge_sibling_count:min_sibling_count=min(edge_sibling_count.values())best_edges=[edgeforedge,cntinedge_sibling_count.items()ifcnt==min_sibling_count]iflen(best_edges)==1:returnbest_edges# create the next heuristic based on the best edges from the previous heuristicfiltered_edge_postdom_count=edge_postdom_count.copy()foredgeinlist(edge_postdom_count.keys()):ifedgenotinbest_edges:delfiltered_edge_postdom_count[edge]iffiltered_edge_postdom_count:edge_postdom_count=filtered_edge_postdom_count# H2: the edge, when removed, that causes the most post-dominators of the graph should be virtualized# first. this is believed to make the code more linear looking be reducing the amount of scopes.# informally, we believe post-dominators to be an inverse indicator of the number of scopes presentifedge_postdom_count:max_postdom_count=max(edge_postdom_count.values())best_edges=[edgeforedge,cntinedge_postdom_count.items()ifcnt==max_postdom_count]iflen(best_edges)==1:returnbest_edges# H3: the edge that goes directly to a return statement should be virtualized first# this is believed to be good because it can be corrected in later optimization by duplicating# the returncandidate_edges=best_edgesbest_edges=[]forsrc,dstincandidate_edges:ifgraph.has_node(dst)andstructured_node_is_simple_return(dst,graph):best_edges.append((src,dst))iflen(best_edges)==1:returnbest_edgesifnotbest_edges:best_edges=candidate_edges# if we have another tie, or we never used improved heuristics, then we do the default ordering.returnsuper()._order_virtualizable_edges(graph,best_edges,node_seq)