MLE and Cross Entropy


Because maximizing the log likelihood is equivalent to minimizing the negative log likelihood, the negative log likelihood can be viewed as a loss function.
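Concretely, for i.i.d. data $x_1, \dots, x_n$ and a model $p_\theta$, the two objectives select the same parameters (a standard identity, stated here for reference):

$$
\hat{\theta}_{\mathrm{MLE}} = \arg\max_\theta \sum_{i=1}^{n} \log p_\theta(x_i) = \arg\min_\theta \left( -\sum_{i=1}^{n} \log p_\theta(x_i) \right)
$$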

An important distinction between PyTorch's Negative Log Likelihood Loss (NLLLoss) and Cross Entropy Loss (CrossEntropyLoss) is that CrossEntropyLoss implicitly applies a softmax followed by a log transformation (i.e., a log-softmax) to the raw logits, whereas NLLLoss does not and instead expects log-probabilities as input.
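A minimal sketch illustrating this equivalence, assuming a hypothetical batch of random logits over 3 classes (the tensor shapes and values are illustrative, not from the original):

```python
import torch
import torch.nn as nn

# Hypothetical batch: 4 samples, 3 classes.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 0])

# CrossEntropyLoss takes raw logits: it applies log-softmax internally.
ce = nn.CrossEntropyLoss()(logits, targets)

# NLLLoss expects log-probabilities, so we apply log-softmax ourselves.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, targets)

print(torch.allclose(ce, nll))  # True: the two losses coincide
```

In other words, `CrossEntropyLoss` is `LogSoftmax` composed with `NLLLoss`, which is why passing already-log-softmaxed values to `CrossEntropyLoss` would apply the transformation twice.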