Understanding the scaling of L² regularization in the context of neural networks

in dlike •  6 years ago 


Have you ever looked at the L² regularization term in a neural network’s cost function and wondered why it is scaled by both 2 and m? You may have encountered it in one of the numerous papers that use it…
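As a minimal sketch of the idea (assuming the common convention of a cost J = data loss + (λ/(2m)) Σ‖W‖², where m is the number of training examples and λ is a hypothetical regularization strength): dividing by 2 cancels the factor of 2 produced when differentiating w², and dividing by m keeps the penalty on the same per-example scale as the averaged data loss.

```python
import numpy as np

def l2_penalty(weights, lam, m):
    """L2 term: (lam / (2*m)) * sum of squared weights over all layers.

    The 1/2 cancels the 2 from differentiating w**2; the 1/m matches
    the per-example averaging of the data loss.
    """
    return (lam / (2 * m)) * sum(np.sum(W ** 2) for W in weights)

def l2_gradient(W, lam, m):
    """Gradient of the penalty w.r.t. one weight matrix: (lam / m) * W."""
    return (lam / m) * W

# Example with one weight matrix, lam and m chosen arbitrarily:
W = np.array([[3.0, 4.0]])
print(l2_penalty([W], lam=2.0, m=10))   # (2 / 20) * (9 + 16) = 2.5
print(l2_gradient(W, lam=2.0, m=10))    # (2 / 10) * W = [[0.6, 0.8]]
```

Because of the 1/(2m) scaling, the gradient update for each weight is simply (λ/m)·w, with no stray constants.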




Hmm highly technical subject...