Python scikit-learn (metrics): difference between r2_score and explained_variance_score?

I noticed that ‘r2_score’ and ‘explained_variance_score’ are both built-in sklearn.metrics methods for regression problems.

I was always under the impression that r2_score is the percent variance explained by the model. How is it different from ‘explained_variance_score’?

When would you choose one over the other?

Thanks!

 

OK, look at this example:

In [123]:
# setup
import numpy as np
from sklearn import metrics

# data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.957173447537
0.948608137045
In [124]:
# what explained_variance_score really is: 1 - Var(residuals) / Var(y_true)
1 - np.cov(np.array(y_true) - np.array(y_pred)) / np.cov(y_true)
Out[124]:
0.95717344753747324
In [125]:
# what R^2 really is: 1 - (sum of squared residuals) / (n * Var(y_true))
1 - ((np.array(y_true) - np.array(y_pred))**2).sum() / (4 * np.array(y_true).std()**2)
Out[125]:
0.94860813704496794
In [126]:
# notice that the mean residual is not 0
(np.array(y_true) - np.array(y_pred)).mean()
Out[126]:
-0.25
In [127]:
# if the predicted values are different, such that the mean residual IS 0:
y_pred = [2.5, 0.0, 2, 7]
(np.array(y_true) - np.array(y_pred)).mean()
Out[127]:
0.0
In [128]:
# they become the same thing
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.982869379015
0.982869379015

 

So, when the mean residual is 0, they are the same. Which one to choose depends on your needs, that is: is the mean residual supposed to be 0?
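To make that choice concrete, here is a minimal sketch (same toy data as above, with an arbitrary offset of +1.0 added purely for illustration) showing that explained_variance_score ignores a constant bias in the predictions, while r2_score penalizes it:

import numpy as np
from sklearn import metrics

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# shifting every prediction by the same constant leaves the residual variance
# unchanged, so explained_variance_score stays the same, but the squared errors
# grow, so r2_score drops
y_shifted = y_pred + 1.0

print(metrics.explained_variance_score(y_true, y_pred),
      metrics.explained_variance_score(y_true, y_shifted))
print(metrics.r2_score(y_true, y_pred),
      metrics.r2_score(y_true, y_shifted))

If a systematic offset like this should count against the model, r2_score is the stricter choice; if you only care about how well the spread of y is captured, explained_variance_score will not see it.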

 

Most of the answers I found (including here) emphasize the difference between R² and the Explained Variance Score, namely: the mean residual (i.e. the mean of the error).

However, there is an important question left behind: why on earth do we need to consider the mean of the error at all?


Refresher:

R²: the Coefficient of Determination, which measures the amount of variation explained by the (least-squares) linear regression.

You can look at it from a different angle to evaluate the predicted values of y, like this:

Variance(actual_y) × R² = Variance(predicted_y)

 

So intuitively, the closer R² is to 1, the closer actual_y and predicted_y are to having the same variance (i.e. the same spread).
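As a rough illustration of that relationship, here is a small sketch on synthetic data (the features, coefficients, and noise level below are made up purely for the example): for an in-sample least-squares fit with an intercept, the variance of the fitted values equals R² times the variance of the actual y.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=1.0, size=200)

# fit on the same data we evaluate on (in-sample), with an intercept
y_hat = LinearRegression().fit(X, y).predict(X)
r2 = r2_score(y, y_hat)

print(np.var(y) * r2)   # Variance(actual_y) * R^2
print(np.var(y_hat))    # Variance(predicted_y) -- the two agree for in-sample OLS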


As previously mentioned, the main difference is the mean of the error; and if we look at the formulas, we can see why:

R² = 1 - [(Sum of Squared Residuals / n) / Variance(y_actual)]

Explained Variance Score = 1 - [Variance(y_predicted - y_actual) / Variance(y_actual)]

 

in which:

Variance(y_predicted - y_actual) = (Sum of Squared Residuals / n) - (Mean Error)²

 

So the only difference is that the Explained Variance Score additionally subtracts the squared mean error from the numerator! … But why?
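Plugging the toy data from the first answer into these formulas confirms the relationship (this is a sketch of the algebra, not scikit-learn's internal code); the gap between the two scores is exactly the squared mean error divided by Variance(y_actual):

import numpy as np
from sklearn import metrics

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
resid = y_true - y_pred

r2  = 1 - np.mean(resid**2) / np.var(y_true)   # R^2
evs = 1 - np.var(resid) / np.var(y_true)       # Explained Variance Score

print(r2,  metrics.r2_score(y_true, y_pred))                   # ~0.9486
print(evs, metrics.explained_variance_score(y_true, y_pred))   # ~0.9572
# the gap is exactly (mean error)^2 / Variance(y_true)
print(evs - r2, resid.mean()**2 / np.var(y_true))              # both ~0.0086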


When we compare the R² score with the Explained Variance Score, we are basically checking the mean error; so if R² = Explained Variance Score, that means the mean error is zero!

The mean error reflects the tendency of our estimator, that is: biased vs. unbiased estimation.


In Summary:

If you want an unbiased estimator, so that the model is not systematically underestimating or overestimating, you may want to take the mean of the error into account.
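For instance, one crude way to take it into account (a sketch on the toy data above, not a general recipe) is to re-center the predictions by their mean residual: the Explained Variance Score is unchanged, while R² rises to meet it once the bias is removed.

import numpy as np
from sklearn import metrics

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# subtract the mean residual so the corrected predictions are unbiased on this data
bias = np.mean(y_pred - y_true)   # +0.25 here
y_corrected = y_pred - bias

print(metrics.explained_variance_score(y_true, y_pred),
      metrics.explained_variance_score(y_true, y_corrected))   # unchanged
print(metrics.r2_score(y_true, y_pred),
      metrics.r2_score(y_true, y_corrected))                   # rises to match the EVS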

 

Reference: https://stackoverflow.com/questions/24378176/python-sci-kit-learn-metrics-difference-between-r2-score-and-explained-varian
