Testing Graph Rewritting with Hypothesis

in programming •  6 years ago 

I've continued work on Soffit, but more slowly in the past couple weeks. My focus had been getting better testing covering using Hypothesis.

My first attempt was to generate graphs by their edges, like this:

nodeIds = st.text( alphabet="ABCDEFGHIJKLMNOPQRSTUVWZYZ",
                   min_size=1 )
edgeStrategy = st.lists( st.tuples( nodeIds, nodeIds ) )

and we can use this strategy to, for example, generate a graph to match and a larger graph in which it's embedded:

    @given( edgeStrategy, edgeStrategy )
    def test_delete_subgraph( self, edges, moreEdges ):
        assume( len( edges ) > 0 )
        assume( len( moreEdges ) > 0 )
        
        l = self.undirectedGraphFromEdgeList( edges )
        r = nx.Graph()
        self._buildRename( r )
        g = self.undirectedGraphFromEdgeList( edges + moreEdges )        
        g = sg.graphIdentifiersToNumbers( g )
...

This is too open-ended. We get lots of possible graphs, but if we add lots of assume() clauses to filter specific cases, Hypothesis starts complaining that we aren't finding very many examples. It's also hard to either predict what will happen on a random example (without just rewriting the code) or get the filter right to "do what I mean" for a particular outcome.

hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering 
out a lot of data. Health check found 50 filtered examples but only 1 good ones.
This will make your tests much slower, and also will probably distort the data
generation quite a lot. You should adapt your strategy to filter less. This can 
also be caused by a low max_leaves parameter in recursive() calls

Another problem is that Hypothesis will test a lot of different node IDs, which don't really matter for structural tests. That may also lead to problems finding cases where two edge sets do intersect, if we want that to happen.

So, what does a strategy that's more specific look like?

Say we want to test the ability to delete a set of edges from the graph. We need the pattern to exist in the graph, without any edges connecting it to the rest of the graph. (Those edges violate the "dangling" condition.)

I change the node IDs to all be single characters from a fixed alphabet. I also changed the edge strategy to specifically bound the number of edges, rather than doing this less efficiently with assume().

import hypothesis.strategies as st

nodeIds_list = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
nodeIds = st.sampled_from( nodeIds_list )
edgeStrategy = st.lists( st.tuples( nodeIds, nodeIds ) )
edgeStrategy1_to_10 = st.lists( st.tuples( nodeIds, nodeIds ), min_size=1, max_size=10 )

Then we can define a "composite" strategy which is Hypothesis' way of turning a function that produces one example into a factory for producing many examples. We can write arbitrary Python code that uses draw to select from other strategies and process the result. It's still possible for us to use assume() but this should be less necessary.

To generate disjoint edge sets, we can select the second edge set using only vertex IDs that weren't used in the first edge set. In Python that looks like:

@st.composite
def disjoint_subgraphs( draw ):
    """Return graphs that contain no nodes in common."""
    edges = draw( edgeStrategy1_to_10 )
    nodes = [ s for (s,t) in edges ] + [ t for (s,t) in edges ]

    remainingIds_list = [ x for x in nodeIds_list if x not in nodes ]
    remainingIds = st.sampled_from( remainingIds_list )
    moreEdges = draw( st.lists( st.tuples( remainingIds, remainingIds ), min_size=0, max_size=10 ) )
    return (edges, moreEdges)

With that in place, the test case is a lot simpler, and without assume() every example generated is valid! The given decorator takes the function we defined as its argument, and assigns the return value of that function to the first argument of the test method, sgs:

    @given( disjoint_subgraphs() )
    @settings( deadline=200 )
    def test_delete_subgraph( self, sgs ):
        (edges, moreEdges ) = sgs

        nodesG = set( [ s for (s,t) in moreEdges ] + [ t for (s,t) in moreEdges ] )
        
        l = self.undirectedGraphFromEdgeList( edges )
        r = nx.Graph()
        self._buildRename( r )
        g_orig = self.undirectedGraphFromEdgeList( edges + moreEdges )        
        g = sg.graphIdentifiersToNumbers( g_orig )

        finder = sg.MatchFinder( g, verbose=False )
        finder.maxMatches = 2
        finder.maxMatchTime = 0.2
        finder.leftSide( l )
        finder.rightSide( r )
        
        m = finder.matches()
        self.assertGreater( len( m ), 0 )
        
        app = sg.RuleApplication( finder, m[0] )
        g2 = app.result()

        self.assertEqual( len( g2.nodes ), len( nodesG ) )

Hypothesis identified a bug in my handling of self-loops, and my next step is to repeat the testing for directed graphs, which probably have a similar bug. Hypothesis also points out that my constraint solver can be very slow for even medium-sized examples of deleting many nodes at once.

Authors get paid when people like you upvote their post.
If you enjoyed what you read here, create your account today and start earning FREE STEEM!
Sort Order:  




This post has been voted on by the SteemSTEM curation team and voting trail in collaboration with @curie.

If you appreciate the work we are doing then consider voting both projects for witness by selecting stem.witness and curie!

For additional information please join us on the SteemSTEM discord and to get to know the rest of the community!