This post follows my previous post: https://steemit.com/computervision/@maicro/floorplan-grand-introduction
I have just gone through a style transfer model: https://maicro879342585.wordpress.com/2021/09/01/repo-bethge-lab-basic-style-transfer/
I was looking into style transfer algorithms because I was hoping something like that could impose a desired style of interior design onto, say, the floor plan of an existing house or apartment. So it's worth taking a look at how style transfer actually works. Let's start with what the algorithm does:
The core design of a style transfer model is to make the loss function a weighted sum of a content loss and a style loss.
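As a minimal sketch of that objective (the weight names `alpha` and `beta` and their default values are illustrative, not taken from the paper; in practice the style weight is usually orders of magnitude larger than the content weight):

```python
def total_loss(content_loss, style_loss, alpha=1.0, beta=1e3):
    """Weighted sum of the two losses; the optimizer updates the
    synthesized picture's pixels to drive this value down."""
    return alpha * content_loss + beta * style_loss

# e.g. total_loss(2.0, 3.0, alpha=1.0, beta=2.0) -> 8.0
```

Tuning the ratio between the two weights is what trades off "looks like the content picture" against "looks painted in the style".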
As demonstrated by the following illustration from one of the authors' papers, the style picture generates a pattern out of randomness, whereas the content picture gets blurred in the process.
The style loss makes use of the Gram matrix, each entry of which is a picture's response to one feature/channel dotted with its response to another feature. So each matrix element shows whether a picture responds to two features similarly or dissimilarly. It is this Gram matrix that is compared between the synthesized picture and the style picture. Since all the pixels of a picture are dotted down into one number, the Gram matrix doesn't record WHERE one picture responds strongly to a certain feature versus WHERE another picture does. Instead, it only captures whether the places where a picture responds well to one feature are also the places where it responds well to another. If we think of each channel as a kind of brush stroke, a similar Gram matrix entry means the same kind of "drawing technique" is used in both pictures.
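In NumPy, the Gram matrix and a per-layer style loss can be sketched roughly like this (the normalization constant varies between implementations, so don't take this as the exact form from the paper):

```python
import numpy as np

def gram_matrix(feature_maps):
    """feature_maps: array of shape (channels, height, width) -- one
    CNN layer's response to a picture. Entry (i, j) of the result is
    the dot product of the flattened responses of channels i and j;
    all spatial positions are summed away, which is why WHERE a
    feature fires is lost."""
    c, h, w = feature_maps.shape
    flat = feature_maps.reshape(c, h * w)   # one row per channel
    return flat @ flat.T                    # (channels, channels)

def style_loss(synth_features, style_features):
    """Mean squared difference between the two Gram matrices at one
    layer; zero when the two pictures use the same 'brush strokes'."""
    g_synth = gram_matrix(synth_features)
    g_style = gram_matrix(style_features)
    n = synth_features.size
    return np.sum((g_synth - g_style) ** 2) / (n ** 2)
```

Identical feature maps give a style loss of exactly zero, regardless of where in each picture the features fired.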
So two pictures that have similar response patterns "somewhere" in each picture will share some similar Gram matrix entries. That is what helps ensure that representative patterns in the style picture get reflected in the synthesized picture. At the same time, though, the system will try to fit every pattern in the style picture somewhere into the synthesized picture. From a painting point of view, the machine doesn't really appreciate which pattern makes up the artistic essence of the style picture. In terms of interior design, however, this may help ensure that a certain proportion of the area is used for certain functions. If we want only small local alterations to create a certain atmosphere, we may need to modify the model further.
It seems that if we just use the original floor plan as the content "picture" and find a floor plan with the desired additional styles as the style "picture", the style transfer model could already be doing a lot of the "right things". I may try to simplify the style loss, possibly by involving fewer layers, since I will be using a simplified representation of a "picture". This representation should also let me greatly simplify the convolution step, or perhaps skip it altogether. If possible, I may try assigning "weights" to the "pixels". I hope that would allow me to handle hard constraints on the floor plan, as well as to emphasize certain parts of the style picture. I also want to be able to input multiple floor plans of a certain architectural style, and make the "style picture" a collection of the patterns most representative of that style.
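As a very rough sketch of the pixel-weighting idea (everything here is hypothetical, my own planned modification, not part of the original style transfer model):

```python
import numpy as np

def weighted_content_loss(synth, content, weights):
    """Per-pixel weighted content loss over a floor-plan grid.
    'weights' has the same shape as the grid: a very large weight acts
    like a soft version of a hard constraint (that region of the plan
    must not change), while a weight of zero leaves the region free to
    be reshaped by the style loss."""
    return np.sum(weights * (synth - content) ** 2)
```

Regions with zero weight contribute nothing to the loss no matter how much they change, which is exactly the behavior I would want for the areas where the style is allowed to take over.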
Please watch for my upcoming "floorPlan-GrANd:" posts on my coding and other updates. You can also find a summary of this model here: https://maicro879342585.wordpress.com/2021/08/18/grand-floorplan/
Reference links:
Gatys, Ecker & Bethge, "Image Style Transfer Using Convolutional Neural Networks" (CVPR 2016): https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf
Gatys et al., "Controlling Perceptual Factors in Neural Style Transfer": https://arxiv.org/pdf/1611.07865.pdf