Worst-case matching in Soffit

in soffit •  6 years ago  (edited)

Testing with Hypothesis turned up some matches in Soffit that took a long time to run. Some optimization of constraint checking got the runtime down from 600ms to about 400ms, but now I'm kind of stuck.

The case it has a left-hand graph that looks like this:

and the right-hand rule is "delete everything." The following graph has no matches for the rule, but it takes a lot of work for the constraint solver to figure it out:

That self-edge from 0 to 0 doesn't match anything in the pattern, so we can't delete 0 and leave it dangling. But, the constraint solver isn't smart enough to reason that some node in the pattern must map to 0, so no solution exists. Instead, as far as I can tell, it does its best along these lines:

  1. Where can I match A? How about 0? No, the self-loop prohibits this.
  2. How about A=8? Seems OK, that means B=9.
  3. Now where can I match C? How about 0? No.
  4. How about C=2? That means D = 3.
    ...
  5. The only place left I can match I is 1, but then J has to be zero but that's impossible. Better back up and try some other choices.
    ...
  6. I tried everything, but A=8 didn't work for some reason. What if I set A=7 instead?
    ...

The "global" picture is that we have 10 nodes but only of 9 them can be matched. Locally, though, it looks like we have lots of possible options and, who knows, maybe the next one would work? So we're exploring about 2*5!=240 possibilities (because A=8 and A=9 are both OK) just to discover that we always end up one match short.

I could add a rule specifically for self-loops but that wouldn't help because adding an edge from 0 to a different node that couldn't be deleted (say it had the wrong tag) would cause the same problem.

A standard way of solving this is to impose some ordering on the solution. So, A has to be lower than B has to be lower than C, etc. It's hard to see how I would do this in a robust way, even as a "pre-check" for feasibility. I'd have to determine that A--B was isomorphic to C--D and so anything that worked for one pair also worked for the other.

Probably I will leave things as they are-- there are other performance problems that are more realistic and more easily addressed. But I'll try to think if there's some way to hint or abort the constraint solver that's general-purpose enough. Checks like "number of nodes in pattern == number of nodes in graph" would handle this case but not other, similar cases.

Anyway, here's the new "conditional" Constraint class I added replacing the more general-purpose one that couldn't do forward matching: https://github.com/mgritter/soffit/commit/2b014a71bde0bb38e54222f1ccf6bac20ebf2659

The directions I'm thinking about heading now:

  1. Rewrite of the rule format. I kind of need something that allows rule names and comments.
  2. Allow wildcarding; many grammars I want to write would benefit from it. (Might also rethinking tagging and make it general attributes.)

Also on my development backlog:

  • Better error messages from parser
Authors get paid when people like you upvote their post.
If you enjoyed what you read here, create your account today and start earning FREE STEEM!
Sort Order:  

One idea is replace all the individual constraints:

A has to be in [0, 1, 2, 3, 4, ... ]
B has to be in [0, 1, 2, 3, 4, ... ]
(A,B) has to be in [(0,1), (2,3), (4,5) ... ]
C has to be in [0, 1, 2, 3, 4, ...]
etc.

with a group contraint:

[(A,B), (C,D), (E,F), ...] has to be a subset of [(0,1), (2,3), (4,5), ...]

But, constraints can't inform other constraints in py-constraint, only the domain. We'd be able to tell if there were globally not enough edges, but not that (0,1) wouldn't work due to other constraints.

It might work to have variables that represented edge assignments rather than node assignments, though.

AB has domain [(0,1), (2,3), (4,5), ...]
new constraints: if AB = (0,1) then A = 0, B=1, etc.

Then we could remove AB=(0,1) explicitly as a possibility instead of removing the assignment A=0 from the list.





This post has been voted on by the SteemSTEM curation team and voting trail in collaboration with @curie.

If you appreciate the work we are doing then consider voting both projects for witness by selecting stem.witness and curie!

For additional information please join us on the SteemSTEM discord and to get to know the rest of the community!