Explanation of Logistic Regression Cost Function (Optional)
In an earlier video, I wrote down a form for the cost function for logistic regression. In this optional video, I want to give you a quick justification for why we like to use that cost function for logistic regression.

To quickly recap: in logistic regression, the prediction y hat is sigmoid of (w transpose x + b), where sigmoid is the familiar function sigmoid(z) = 1 / (1 + e^(-z)). And we said that we want to interpret y hat as p(y = 1 | x). So we want our algorithm to output y hat as the chance that y = 1 for a given set of input features x. Another way to say this is: if y = 1, then the chance of y given x is equal to y hat. And conversely, if y = 0, then the chance that y = 0 is 1 - y hat, right? Because if y hat is the chance that y = 1, then 1 - y hat is the chance that y = 0.

So let me take these last two equations and just copy them to the next slide. What I'm going to do is take these two equations, which basically define p(y|x) for the two cases y = 0 and y = 1, and summarize them into a single equation. And just to point out, y has to be either 0 or 1, because in binary classification problems y = 0 and y = 1 are the only two possible cases. So we can take these two equations and summarize them as follows. Let me just write out what it looks like, then we'll explain why it looks like that: p(y|x) = y hat to the power of y, times (1 - y hat) to the power of (1 - y). It turns out this one line summarizes the two equations above.
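To make that two-case check concrete, here is a minimal sketch in NumPy; the weights w, bias b, and input x below are made-up placeholder values, not from the lecture. It computes y hat = sigmoid(w transpose x + b) and verifies numerically that the single formula y hat^y * (1 - y hat)^(1 - y) gives back y hat when y = 1 and 1 - y hat when y = 0.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), so the output
    # can be interpreted as a probability.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and one input example (placeholder values).
w = np.array([0.5, -0.25])
b = 0.1
x = np.array([1.0, 2.0])

y_hat = sigmoid(np.dot(w, x) + b)  # y_hat = p(y = 1 | x)

# The one-line formula p(y|x) = y_hat^y * (1 - y_hat)^(1 - y)
# reproduces both cases:
for y in (0, 1):
    p = y_hat**y * (1 - y_hat)**(1 - y)
    expected = y_hat if y == 1 else 1 - y_hat
    assert np.isclose(p, expected)
    print(f"y = {y}: p(y|x) = {p:.4f}")
```

When y = 1 the second factor is (1 - y hat)^0 = 1, leaving y hat; when y = 0 the first factor is y hat^0 = 1, leaving 1 - y hat, which is exactly the pair of equations we started with.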