Contents
1. Introduction
2. What is Open Data?
3. Open vs. Free vs. Online Data
4. Where to find Open Data?
- International organizations
- United States
- Europe
- Latin America
- Asia
- Other Open Data sources (Google Public Data Explorer, Kaggle, FiveThirtyEight, UCI Machine Learning Repository etc.)
5. Conclusions
Introduction
Data Science has the power to contribute greatly to building the world we want to live in, and numerous use cases already demonstrate how it can be leveraged to solve real-world problems.
Some examples of such cases can also be found in my previous article on this subject:
To do so, however, we need data that is freely available for reuse and structured in a useful format. In this article, I go through some of the best-known and most important portals that can be used in this regard.
What is Open Data?
‘Open data’ refers to data that are freely available without restrictions from copyright, patents or other mechanisms of control. (UNICEF Data)
In this context, it is not enough to share data publicly in hard-copy reports. For data to be considered fully open, it must follow certain principles that maximize its utility:
- to be structured using internationally accepted classifications (such as ISO 3166 for countries);
- to use non-proprietary file formats (such as JSON or CSV);
- to be available via standards-compliant communication interfaces (such as SDMX-JSON);
- and to have appropriate metadata describing it.
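To make the principles above more concrete, here is a minimal Python sketch of what publishing a tiny dataset along these lines could look like. The records, the indicator values and every metadata field are invented for illustration; only the ISO 3166-1 alpha-3 country codes (KOR, ROU) and the CSV and JSON formats come from the principles themselves. SDMX-JSON is not covered here.

```python
import csv
import io
import json

# Hypothetical records keyed by ISO 3166-1 alpha-3 country codes.
# The values are made up for illustration only.
records = [
    {"country": "KOR", "year": 2019, "value": 1.5},
    {"country": "ROU", "year": 2019, "value": 2.5},
]

# Metadata that travels with the data (all fields are assumptions).
metadata = {
    "title": "Example indicator",
    "country_classification": "ISO 3166-1 alpha-3",
    "license": "CC BY 4.0",
    "unit": "percent",
}


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["country", "year", "value"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows, meta):
    """Bundle the data and its metadata in one JSON document."""
    return json.dumps({"metadata": meta, "data": rows}, indent=2)


csv_text = to_csv(records)
json_text = to_json(records, metadata)
```

Both outputs are plain text that any tool can read, and the JSON variant keeps the description of the data next to the data itself.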
Open data is part of a larger set of movements that also includes open-source software, open educational resources, open access, open science, open government and others.
Increasingly, certain types of data are considered a ‘public good’ which, when made available for use, reuse and free distribution, can lead to better policy-making, better-informed decisions, value creation and citizen-centric services. This is how the Open Government Data philosophy and its set of policies appeared.
Open government is a doctrine according to which citizens should have access to governmental documents and data for effective public oversight. By making governmental data open, public institutions show transparency and accountability to the citizens they serve.
One remarkable example I have encountered comes from Seoul, South Korea, where open data has become the norm and is used to tackle real challenges the city and its citizens face. In Seoul, not only do public institutions use the data they collect, but any business, non-profit organization or regular citizen can also access the data, whether to build upon it or to check it for accountability reasons.
One of the goals of the City Hall is to provide open data to citizens so that they can use it and build upon it. By doing so, it has contributed to the creation of a new industry, in which many startups use the data provided to develop innovative solutions to some of the challenges faced within the city.
For more information on the example from South Korea and similar ones, see the video below from The Economist:
Open vs. Free vs. Online Data
Open data is data without restrictions. Free data is data that is available without cost. Usually, open data is also free of charge. But when it comes to online data, not all of it can be used for free or without restrictions. In many cases it is copyrighted, being the property of its creators, and using it requires permission or payment of a fee.
Even when the data is not copyrighted, things are not perfectly clear. Consider web scraping of data from LinkedIn. In 2019, a US Court of Appeals denied LinkedIn’s request to prevent the analytics company hiQ from scraping its data. Even so, LinkedIn does not appreciate anyone trying to scrape data from its platform, and warns against it in some articles.
Where to find Open Data?
Now, let's get to the meat of this article: where one can find open data, be it governmental or of other types. Below, I cover sources of data provided by international organizations, sources specific to certain regions (US, Europe, Latin America, Asia), and other types of sources of global relevance.
International organizations
World Bank Open Data
Through this portal, the World Bank provides free and open access to a wide range of data on development in countries around the globe. This follows from its belief that broader access to its data increases transparency and accountability and helps policy makers make better-informed decisions.
Users can navigate the 4593 datasets either by country and region or by indicator, organized around different sectors (agriculture, education, gender, infrastructure, environment, urban development etc.).
What makes the search portal even more valuable is that it provides access to several types of data: time series, microdata (obtained from sample surveys, censuses and administrative systems), and geospatial data.
Moreover, to get a better impression of the type of information that can be extracted from these datasets, take a look at the 191 visualizations that cover topics such as the number of people without access to electricity, the rise of global CO2 emissions, resource depletion, access to improved water sources etc.
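For readers who prefer to pull a time series programmatically rather than through the search portal, a rough Python sketch follows. It assumes the World Bank's public Indicators API (version 2) with a JSON response made of two elements, paging information followed by a list of rows; the URL layout, the `date` range parameter and the indicator code `SP.POP.TOTL` (total population) are assumptions to verify against the official API documentation. The sample payload is hand-written, not real data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed root of the World Bank Indicators API.
API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, start_year, end_year):
    """Build the request URL for one country and one indicator."""
    query = urlencode({"format": "json", "date": f"{start_year}:{end_year}"})
    return f"{API_ROOT}/country/{country}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Turn the assumed [page_info, rows] payload into {year: value}."""
    _page_info, rows = payload
    return {
        int(row["date"]): row["value"]
        for row in rows
        if row["value"] is not None  # skip years without an observation
    }


def fetch_series(country, indicator, start_year, end_year):
    """Download and parse a series (needs network access)."""
    url = indicator_url(country, indicator, start_year, end_year)
    with urlopen(url) as response:
        return parse_series(json.load(response))


# Offline check with a hand-written payload in the assumed shape.
sample = [
    {"page": 1, "pages": 1},
    [
        {"date": "2019", "value": 100},
        {"date": "2018", "value": None},
        {"date": "2017", "value": 90},
    ],
]
series = parse_series(sample)
```

With network access, a call such as `fetch_series("KOR", "SP.POP.TOTL", 2015, 2019)` would be the intended usage. Error responses are not handled in this sketch.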
OECD Data
The OECD Data portal provides access to 875 databases that can be searched by country of interest or by topic (agriculture, development, economy, education, energy, environment, finance, government, health, innovation and technology, jobs, society).
One of the portal’s benefits is that it also provides data recorded over time, sometimes from as early as 1959. One downside is that it mostly covers data on countries that are members of the OECD. For example, Romania is not a member.
And if you do not wish to download datasets yet and just want to explore what is in store, you can run your own queries on large databases in the data warehouse, OECD.Stat.
United Nations Data
The United Nations’ data portal was created out of the belief that statistics should be considered a public good that can serve evidence-based policy and better-informed decision-making.
The portal aims to provide free access, through a single entry point, to over 60 million data points organized in 32 large databases compiled by the UN as well as by other international agencies. Examples of source organizations are: Food and Agriculture Organization, World Health Organization, The World Bank, the OECD, International Monetary Fund etc.
The search engine allows users to look for information based on the larger datasets, sources of data, or topics. Each such element has a drop-down menu which, in my opinion, makes navigation easy.
Moreover, UN Data provides access to three specialized UNSD databases through separate individual portals: UN Comtrade, the Monthly Bulletin of Statistics Online (MBS Online) and the well-known Sustainable Development Goals indicators. UN Comtrade is a repository of official international trade statistics, relevant analytical tables and publications. MBS Online provides access to economic and social statistics on more than 200 countries and territories in the world. It contains 55 tables with over 100 indicators on a variety of subjects, recorded for 80 years.
The United Nations Global SDG Database offers access to 460 data series that illustrate the progress made towards achieving the Sustainable Development Goals. Searches on the portal can be filtered by goals and their specific targets and indicators, as well as by geographic area (it also includes country profiles) and by year (2000 to 2019).
Some other features provided by the UN Data portal include access to popular statistical tables produced as part of the UN Statistical Yearbook and statistical profiles of countries (areas) and regions.
UNICEF DATA
The UNICEF DATA portal is for those wishing to work with data specifically about children and women. Its Data Warehouse includes datasets on topics such as child mortality, child poverty, child protection and development, education, gender, maternal, child and newborn health, migration, nutrition, transition to work and others. Again, data can also be filtered by country.
GHO data repository (World Health Organization)
When it comes to data, WHO has broad coverage, as it works with 194 Member States across six regions. Through the Global Health Observatory, WHO provides access to more than 1000 indicators that it monitors, which can be navigated by theme under the SDG health and health-related targets, by category, or by country. Some examples of the types of data it provides are: road traffic injuries, noncommunicable diseases and mental health, mortality from environmental pollution, tobacco control, clean cities, Health Equity Monitor etc.

United States
DATA.GOV
The US Government’s open data portal helps users navigate over 225 079 datasets from different government agencies, which can be used together with the tools and other resources provided to conduct research, develop web and mobile applications, design data visualizations and more.
One advantage of the portal is that it allows filtering data by location (on a map), topic, format, type of data (geospatial or non-geospatial), organization, organization type, bureau and publisher.
One drawback of the portal is that, even though most datasets have valid metadata, some still lack working URLs that permit download.
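One way to cope with the missing-URL drawback is to screen search results before relying on them. The Python sketch below assumes that the Data.gov catalog exposes the CKAN action API at the address shown and that each dataset record carries a `title` and a list of `resources` with a `url` field; both are assumptions to check against the portal's documentation. The sample records are hand-written. Note that the check only tests whether a URL is present and looks like a web address, not whether the link actually works.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the catalog is understood to expose the CKAN action API.
SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def downloadable_resources(dataset):
    """Return the resources of one dataset that carry an http(s) URL."""
    return [
        res
        for res in dataset.get("resources", [])
        if str(res.get("url") or "").startswith(("http://", "https://"))
    ]


def split_by_downloadability(datasets):
    """Separate dataset titles with at least one usable URL from the rest."""
    usable, unusable = [], []
    for dataset in datasets:
        target = usable if downloadable_resources(dataset) else unusable
        target.append(dataset["title"])
    return usable, unusable


def search(query, rows=10):
    """Query the catalog (needs network access)."""
    url = f"{SEARCH_URL}?{urlencode({'q': query, 'rows': rows})}"
    with urlopen(url) as response:
        return json.load(response)["result"]["results"]


# Offline check with hand-written records in the assumed CKAN shape.
sample = [
    {"title": "Air quality",
     "resources": [{"url": "https://example.org/air.csv", "format": "CSV"}]},
    {"title": "Broken entry", "resources": [{"url": "", "format": "CSV"}]},
    {"title": "No resources", "resources": []},
]
usable, unusable = split_by_downloadability(sample)
```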
US Census Bureau
The United States Census Bureau is in charge of producing data about the American people and economy, and its primary mission is to conduct the US Census every ten years. The data it collects is then used by policy makers at all levels: federal, state and local.
Some examples of the tools it provides are American Fact Finder, Census Data Explorer and Quick Facts, which allow users to search and visualize data according to their interests.
Europe
EU Open Data Portal
The EU Open Data Portal provides free access to data on a broad range of subjects, such as: education, environment, economy and finance, agriculture, forestry, food, health, government and public sector, justice, energy, science and technology, transport etc. The 15 561 datasets (to date) come from all EU institutions, bodies and agencies (e.g. Eurostat, the EU’s statistical office; the Joint Research Center; the European Investment Bank; the European Commission Directorates-General; the Environment Agency etc.).
Most data provided on the portal can be reused free of charge, for both non-commercial and commercial purposes, on condition that the source is acknowledged. Only a small number of datasets have special conditions of reuse, owing to the need to protect third-party intellectual property rights.
As a bonus, the portal also provides access to a visualization catalogue that includes a collection of visual tools, training materials (data visualization workshops and webinars that involve working with tools such as D3.js, Qlik Sense, Webtools Maps, PowerBI) and re-usable visualizations.
European Data Portal
This portal is managed by the Publications Office of the European Union and harvests the metadata of Public Sector Information available on public data portals across European countries. To date, it covers 36 countries, 81 catalogues and 1 089 978 datasets, which can be searched by categories similar to those used by the EU Open Data Portal.
Moreover, it also includes information regarding the provision of data and the benefits of re-using data.
Open Government Data websites from all EU Member States
Plus the United Kingdom, which is no longer part of the EU:
Asia
ADB Data Library
The Asian Development Bank (ADB) was established in 1966 and has 68 members, of which 49 are from Asia and the Pacific region. Its Data Library has a fairly intuitive search system, through which one can browse by topic or by country. The repository contains (to date) 234 datasets, 45 dashboards and 10 data stories. Among the topics covered are: financial sector, poverty, people, public sector governance, economics, and others.
One other interesting ADB product, which I learned about during the Bank’s recent conference on evaluation, is EVA, an AI engine that scans evaluation and other types of documents in order to identify lessons from ADB’s operations in its member countries.
South Korea Open Government Data portal
South Korea is a very good example of best practice when it comes to open data. However, its website is designed only for Korean speakers.
Latin America
Numbers for Development
Numbers for Development is the Inter-American Development Bank’s Open Data portal, and it showcases socio-economic indicators for the Latin America and Caribbean region. It is built upon seven data sources: Agrimonitor (tracks agricultural policies), INTrade (trade in the region), Latin Macro Watch (macroeconomics, social issues, trade, capital flows, markets and governance), Public Management, Social Pulse (living conditions), SIMS (labor markets), and Sociometro (socio-economic conditions). Searches can be filtered either by country or by indicator.
Below, I have added an interesting article on how big and open data were previously used for social good in Latin American countries:
Open Data portals from Latin American countries
Other Open Data sources
Google Public Data Explorer
The Google Public Data Explorer is in part a search engine that facilitates access to datasets provided by international organizations (such as those covered earlier in this article), national statistical offices, NGOs and research institutions. Beyond that, the team behind it aims to make large datasets of public interest easier to explore, visualize and communicate, even for non-technical audiences.

Besides the Google Public Data Explorer, there is also the Google Dataset Search engine, which enables users to find datasets stored across the Web through simple keyword searches. When using it, one can apply filters for download format, usage rights, topic, or last update. One criterion used for ranking datasets in search results is the number of scholarly articles that have cited a dataset.
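The filter-then-rank idea described for Dataset Search can be illustrated on a local list of records. The Python sketch below is only a toy model: the `DatasetRecord` fields, the filter names and the sample catalog are all hypothetical, and it does not reproduce Google's actual schema or ranking, which uses more signals than citation counts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetRecord:
    # Hypothetical fields for illustration; not Google's schema.
    name: str
    file_format: str
    usage_rights: str
    last_updated: date
    citing_articles: int


def search(records, file_format=None, usage_rights=None, updated_since=None):
    """Filter records, then rank by number of citing articles (descending)."""
    matches = [
        r for r in records
        if (file_format is None or r.file_format == file_format)
        and (usage_rights is None or r.usage_rights == usage_rights)
        and (updated_since is None or r.last_updated >= updated_since)
    ]
    return sorted(matches, key=lambda r: r.citing_articles, reverse=True)


# Made-up catalog entries.
catalog = [
    DatasetRecord("A", "csv", "commercial", date(2020, 5, 1), 12),
    DatasetRecord("B", "csv", "commercial", date(2019, 1, 1), 40),
    DatasetRecord("C", "json", "non-commercial", date(2020, 6, 1), 99),
]
results = search(catalog, file_format="csv", usage_rights="commercial")
recent = search(catalog, updated_since=date(2020, 1, 1))
```

Filters left as `None` are simply ignored, so the same function covers searches by format, by usage rights, by recency, or by any combination of them.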

FiveThirtyEight
FiveThirtyEight is a very comprehensive source of high-quality data from the field of journalism. The topics covered include: politics, sports, science & health, economics and culture.
Kaggle
Among open data sources, Kaggle might be the best known among data scientists, owing to the community it has built around itself.
Kaggle supports a variety of publication formats for datasets, but it also encourages dataset publishers to share their data in an accessible and non-proprietary format where possible. Among the supported file types are CSV, JSON, and SQLite.
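A practical benefit of these non-proprietary formats is that all three can be read with Python's standard library alone. The sketch below writes three tiny files to a temporary folder and reads them back; the file names, the table name `iris` and the sample row are made up for illustration and do not refer to any particular Kaggle dataset.

```python
import csv
import json
import sqlite3
import tempfile
from pathlib import Path


def load_csv(path):
    """Read a CSV file into a list of dicts (all values stay strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_json(path):
    """Read a JSON file into Python objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_sqlite(path, table):
    """Read all rows of one table. `table` must be a trusted name."""
    connection = sqlite3.connect(path)
    try:
        connection.row_factory = sqlite3.Row
        rows = connection.execute(f'SELECT * FROM "{table}"').fetchall()
        return [dict(row) for row in rows]
    finally:
        connection.close()


# Self-contained demo: write tiny files, then read them back.
with tempfile.TemporaryDirectory() as folder:
    folder = Path(folder)

    (folder / "iris.csv").write_text(
        "species,petal_length\nsetosa,1.4\n", encoding="utf-8"
    )
    (folder / "iris.json").write_text(
        '[{"species": "setosa", "petal_length": 1.4}]', encoding="utf-8"
    )

    db_path = folder / "iris.sqlite"
    connection = sqlite3.connect(db_path)
    connection.execute("CREATE TABLE iris (species TEXT, petal_length REAL)")
    connection.execute("INSERT INTO iris VALUES ('setosa', 1.4)")
    connection.commit()
    connection.close()

    csv_rows = load_csv(folder / "iris.csv")
    json_rows = load_json(folder / "iris.json")
    sqlite_rows = load_sqlite(db_path, "iris")
```

One difference worth noticing: CSV carries no type information, so numbers come back as strings, while JSON and SQLite preserve numeric types.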
One big advantage of Kaggle for those who are new to Data Science is that it supports learning by creating communities around each of its datasets, in which every interested user can contribute by solving tasks related to that dataset, submitting results, participating in discussions, and giving and receiving feedback.
DBpedia
DBpedia was built from the most commonly used infoboxes within Wikipedia, and its ontology currently contains 4 233 000 instances, of which, for example, 1 450 000 are persons and 241 000 are organizations. Its data has previously benefited companies such as Apple, Google and IBM in some of their most important artificial intelligence projects.
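Instance counts like the ones quoted above can, in principle, be checked with a SPARQL query. The Python sketch below assumes DBpedia's public SPARQL endpoint at the address shown, the `dbo:` ontology prefix, and a `format` request parameter that selects the standard SPARQL JSON results format; treat these as assumptions to verify. The sample response is hand-written and its number is a placeholder, not a real count.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://dbpedia.org/sparql"  # assumed public endpoint


def count_query(ontology_class):
    """SPARQL query counting the instances of one DBpedia ontology class."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        f"SELECT (COUNT(?s) AS ?count) WHERE {{ ?s a dbo:{ontology_class} }}"
    )


def parse_count(results):
    """Read the single count out of a SPARQL JSON results document."""
    return int(results["results"]["bindings"][0]["count"]["value"])


def fetch_count(ontology_class):
    """Run the query against the endpoint (needs network access)."""
    params = urlencode({
        "query": count_query(ontology_class),
        "format": "application/sparql-results+json",
    })
    with urlopen(f"{ENDPOINT}?{params}") as response:
        return parse_count(json.load(response))


# Offline check with a hand-written response; the number is a placeholder.
sample = {
    "head": {"vars": ["count"]},
    "results": {"bindings": [{"count": {"type": "literal", "value": "42"}}]},
}
sample_count = parse_count(sample)
```

With network access, `fetch_count("Person")` and `fetch_count("Organisation")` would be the intended calls; the exact class names should be checked in the DBpedia ontology.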
UCI Machine Learning Repository
The UC Irvine Machine Learning Repository contains 557 datasets that can be used for empirical analysis of machine learning algorithms. It was created in 1987 and has been used by students, educators and researchers as a primary source of machine learning datasets. Among the topics covered by its newest uploaded datasets are: Facebook Large Page-Page Network, amphibians, early stage diabetes risk prediction, bitcoin, and others. The top 5 most popular datasets since 2007 cover: classes of iris plant, predicting whether income exceeds $50K/year based on census data, using chemical analysis to determine the origin of wines, diagnosing breast cancer, and the presence of heart disease in patients.
Conclusions
While going through the above-mentioned portals, I was amazed by the wealth of information available as well as by the additional tools some of them offer for public use. Data truly can be beautiful.
As the amount of data available in the world grows ever larger, I believe we have increasing chances of using it for higher purposes and of helping shape a better world.
Thank you for reading. I hope the content was useful. And if you believe there are other sources of open data worth adding that were not included, please mention them in a comment.