Thanks for this post! I used ccminer 2.2 skunk instead of ccminer 2.1 Tribus and squeezed 5-6 MH/s more out of my GTX 1070s: from 32 MH/s to 38 MH/s on the PNY models and from 37 MH/s to 42 MH/s on the EVGA models. Hopefully that helps you as well.
https://github.com/signatumd/releases/blob/master/ccminer-2.2-skunk.zip