For Soffit I implemented a few Constraint
objects with the python-constraint library. These objects represent a particular restriction on the solution, and they can prune the search space either ahead of time or while the search is running.
One of them just checks variables to see if they're in a set of allowed tuples, and I make heavy use of this for edge-based constraints. Originally the constraint checking was very simple:
def __call__(self, variables, domains, assignments, forwardcheck=False):
current = tuple( assignments.get( v, Unassigned ) for v in variables )
if Unassigned in current:
return True
else:
return current in self.allowedSet
That is, if all the variables are assigned, then check for a matching tuple. Otherwise say the constraint is still satisfied. There was a lot more code in the preprocessing stage to eliminate individual values that were impossible.
I noticed that certain rules took a long time to evaluate (even if there were no matches.) A little digging with cProfile
showed that both my check above as well as the built-in AllDifferentConstraint
were handling a lot of different queries, accounting for the bulk of the time. So it would be advantageous to give the constraint solver more guidance.
In general, the problem is quite hard: given a partially-filled tuple, how do you find all compatible tuples from a set? Fortunately, the way I was using this class, tuples of size two (graph edges) dominated, and I was able to verify that by instrumenting my code. And with just two values, you can build two indexes at a reasonable cost. The first says what "B" values exist for a given value of "A", and the second index the opposite.
def insertTrees( self, a, b ):
if a in self.forward:
self.forward[a].add( b )
else:
self.forward[a] = set( [b] )
if b in self.backward:
self.backward[b].add( a )
else:
self.backward[b] = set( [a] )
Given that index, we can look at which values in the domain of the variable are permissible, and temporarily suppress those that aren't. The hideValue
function temporarily removes a value; when the solver backtracks the value will be restored.
def pairCheck( self, current, domain, whichMap ):
domainValues = set( domain )
restrictedValues = whichMap[ current ]
#print( "domainValues", domainValues )
#print( "restrictedValues", restrictedValues )
for val in domainValues.difference( restrictedValues ):
#print( "Hiding value", val )
domain.hideValue( val )
if not domain:
return False
return True
So, in the current version, the code that checks the constraints special-cases when there are just two variables. When the solver passes in forwardcheck=true
, that means the Constraint
is allowed to modify the domain of variables in the way shown above. We can also signal infeasibility for values that don't appear in the index at all, but this is redundant if the preprocessing step got to run. (In some cases, it doesn't because the check is conditional.)
def __call__(self, variables, domains, assignments, forwardcheck=False):
current = tuple( assignments.get( v, Unassigned ) for v in variables )
if Unassigned not in current:
return current in self.allowedSet
if len( variables ) != 2:
return True
if current[0] is not Unassigned:
if current[0] not in self.forward:
return False
if forwardcheck:
return self.pairCheck( current[0],
domains[variables[1]],
self.forward )
if current[1] is not Unassigned:
if current[1] not in self.backward:
return False
if forwardcheck:
return self.pairCheck( current[1],
domains[variables[0]],
self.backward )
return True
This optimization is pretty effective; the larger graph below took about 95 seconds, which is as long as my previous run took for about 1/4 as many nodes.
The current profile shows that hideValue
is actually where a lot of the time is spent, but some of the graph functions appear as well, so there maybe some opportunity to do less copying or renaming and get a further increase in speed.
Two list comprehensions appear on the list, this one for nodes and the corresponding one for tags:
matchingTag = [ (i,) for i in self.graph.nodes
if self.graph.nodes[i].get( 'tag', None ) == tag ]
It would be pretty easy to keep a tag->node index in the graph object so that this search could be done more efficiently.
ncalls tottime percall cumtime percall filename:lineno(function)
8949895 3.698 0.000 4.232 0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/constraint/__init__.py:798(hideValue)
7553 2.777 0.000 6.728 0.001 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/digraph.py:628(add_edges_from)
143429 2.212 0.000 6.345 0.000 /home/mark/soffit/soffit/constraint.py:63(pairCheck)
7553 1.357 0.000 1.989 0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/digraph.py:415(add_nodes_from)
2836653 1.352 0.000 1.741 0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/graph.py:646(nodes)
9611 1.331 0.000 2.822 0.000 /home/mark/soffit/soffit/graph.py:118(<listcomp>)
1446731 1.297 0.000 1.505 0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/classes/reportviews.py:666(<genexpr>)
3129 1.117 0.000 3.519 0.001 /home/mark/soffit/soffit/graph.py:139(<listcomp>)
1446103 1.042 0.000 3.049 0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/networkx/relabel.py:161(<genexpr>)
10226094 0.958 0.000 0.963 0.000 {method 'get' of 'dict' objects}
451272 0.821 0.000 0.821 0.000 /home/mark/soffit/soffit/constraint.py:26(insertTrees)
3532989 0.783 0.000 0.783 0.000 {method 'copy' of 'dict' objects}
9616 0.698 0.000 1.805 0.000 /home/mark/soffit/soffit/constraint.py:37(__init__)
10279728 0.613 0.000 0.613 0.000 {method 'append' of 'list' objects}
3418673 0.604 0.000 1.714 0.000 {method 'update' of 'dict' objects}
103074 0.587 0.000 0.850 0.000 /home/mark/.local/share/virtualenvs/soffit-71IOh0Y6/lib/python3.5/site-packages/constraint/__init__.py:979(__call__)
Hello! Your post has been resteemed and upvoted by @ilovecoding because we love coding! Keep up good work! Consider upvoting this comment to support the @ilovecoding and increase your future rewards! ^_^ Steem On!
Reply !stop to disable the comment. Thanks!
Downvoting a post can decrease pending rewards and make it less visible. Common reasons:
Submit