Global Data Barometer: Do the data capabilities of governments correlate with the availability of data?
Key Findings
- At the global level and between regions, governments with robust data capabilities, including open data initiatives, data institutions, digital government policies, and civil servants trained in data skills, tend to have higher availability of data.
- Countries whose data availability closely tracks their data capabilities should focus on enhancing specific areas of capability, such as data skills, digital government efforts, or internet access.
- Some countries exhibit low data availability despite possessing high data capabilities, e.g. UAE, Malaysia and Ireland. These countries should evaluate their existing data initiatives and ensure that datasets are made accessible in accordance with GDB standards.
- Future GDB surveys should consider tailoring their approach to countries that demonstrate high scores in data availability despite having lower capabilities, e.g. US, Brazil and New Zealand. The most likely explanation is that their data is available but spread across decentralized platforms.
Introduction
About Global Data Barometer
The Global Data Barometer (GDB) is the result of the efforts of over 100 researchers and a network of regional research hubs around the world. The design of the GDB builds on the previous editions of the Open Data Barometer, but takes a broader look at data sharing and use for the public good, including giving additional attention to issues of privacy and inclusion.
The Barometer is a multi-dimensional and multi-layered study that assessed the state of data for public good in 109 countries. An expert survey was conducted from May 2019 to May 2021 to create a new global benchmark that looks at data governance, capability, availability, and the use and impact of data for public good. The full GDB report and datasets are available on the GDB website.
Researchers in Asia were engaged in collaboration with the Data for Development AsiaHub to provide a new benchmark and the essential data needed to drive a fuller understanding of the state of data for development, open data implementation, and data justice in Asia.
What is measured on capabilities and availabilities in the GDB
In the report, the Capabilities pillar measures four primary indicators, namely training of civil servants in data literacy and skills, availability of open government data initiatives in the country, government support for data re-use and capacity of sub-national governments to manage data.
The GDB assessed the availability of 16 datasets based on their potential to address key issues such as climate change, public health, political integrity and land rights. The Availability pillar measures whether these datasets are available as structured open data.
State of Capabilities
Chart: Capabilities scores across countries according to the Global Data Barometer
Global
Pillar score: 42
According to the GDB report, government capabilities vary significantly: of all the pillars, Capabilities has the greatest range of scores, from the highest (Estonia, 91.2) to the lowest (Haiti, 11.8). This variation seems to be consistent with the global digital divide, including gaps in digital literacy and access to digital technologies.
Estonia scores more than 90 in most of the Capabilities modules, which is consistent with its reputation as one of the most digitally advanced countries.
Between regions
Between regions, Capabilities scores also vary widely: the group comprising the European Union, United Kingdom, North America, Israel, Australia and New Zealand achieved an average module score of 70, while Africa scored 32.
| Region | Average Score |
|---|---|
| Africa | 32 |
| Eastern Europe and Central Asia | 45 |
| European Union, United Kingdom, North America, Israel, Australia and New Zealand | 70 |
| Latin America and the Caribbean | 44 |
| Middle East and North Africa | 46 |
| South and East Asia | 53 |
State of Availability
Chart: Availability scores across countries according to the Global Data Barometer
Global
Pillar score: 42
Based on the GDB assessment, the pandemic has shown that most countries have the capacity to make health datasets available: 98.2% of countries have data on COVID-19 infection and mortality, and 84.4% have data on COVID-19 testing. Many countries also have budget and spending information (96.3%) and public procurement data (91.7%).
However, far fewer countries have data on lobbying (16.5%) and right to information (RTI) performance (38.5%).
Comparison between regions
No region scored above 50 on Availability, and the variation in scores between regions does not seem to be as high as for Capabilities.
| Region | Average Score |
|---|---|
| Africa | 11 |
| Eastern Europe and Central Asia | 29 |
| European Union, United Kingdom, North America, Israel, Australia and New Zealand | 50 |
| Latin America and the Caribbean | 29 |
| Middle East and North Africa | 14 |
| South and East Asia | 32 |
Comparing Capabilities with Availability
Chart: Capabilities vs Availability scores for the 109 countries assessed in the GDB.
Global
Pearson correlation: 0.80
Using the Capabilities and Availability pillar scores in the GDB, the two pillars appear to be highly correlated, with a Pearson correlation coefficient of 0.8. This means that a government's capability in the use of data is strongly associated with its ability to make core datasets available.
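As an illustration of how such a coefficient is obtained, the sketch below computes a Pearson correlation in plain Python. The `pearson` function and variable names are assumptions made for this example, and the input is only a small subset of country scores taken from the tables in this report, so it is not expected to reproduce the 0.8 figure calculated over all 109 countries.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Illustrative subset only (Bulgaria, Mozambique, Sweden, Germany, USA, UAE),
# using the Capabilities and Availability scores listed in this report.
capabilities = [54, 13, 59, 69, 64, 58]
availability = [41, 5, 44, 54, 80, 18]

r = pearson(capabilities, availability)
print(f"r = {r:.2f}")
```

A value close to 1 indicates a strong positive linear relationship; with the full GDB dataset, the two lists would each hold 109 pillar scores.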
Between regions
By region, the relationship also broadly holds, although the range of scores varies slightly. In African countries, although the Capabilities scores are below average, the Availability scores achieved are about twice the Capabilities scores.
| Region | Capabilities | Availability |
|---|---|---|
| Africa | 11 | 27 |
| Eastern Europe and Central Asia | 29 | 37 |
| European Union, United Kingdom, North America, Israel, Australia and New Zealand | 50 | 61 |
| Latin America and the Caribbean | 29 | 37 |
| Middle East and North Africa | 14 | 38 |
| South and East Asia | 32 | 47 |
| World | 30 | 42 |
Countries with expected results: strong correlation between the two pillars
The following table shows the 10 countries that lie closest to the regression line between the two pillars, that is, those with the lowest absolute residuals.
| Country | Availability | Capabilities | Details |
|---|---|---|---|
| Bulgaria | 41 | 54 | Bulgaria has had an open data initiative since 2014 and has high use of standards in statistics, although sub-national capabilities are limited. This translates into their corresponding Availability score, with available health, public finance, political integrity and public procurement datasets. |
| Mozambique | 5 | 13 | Mozambique scores 0 on open data initiative but has a relatively high score for digitalisation of government services (52). However, it has low availability except in the health module. |
| Trinidad and Tobago | 18 | 28 | Has a low open data initiative score but above-average scores for digital government services (61.2) and political freedoms and civil liberties (82.0). Despite this, they have low availability of datasets except for the health, public finance and procurement modules. |
| Togo | 11 | 21 | Has low internet access and an average score for government digital services, with low availability scores except for procurement data. |
| Sweden | 44 | 59 | Has high capabilities in various areas, particularly open data initiatives (80) and data institutions (100). In turn, they have good data availability scores in all modules except land, procurement and political integrity. |
| Bolivia (Plurinational State of) | 20 | 31 | Has below-average scores in most Capabilities indicators, except for political freedoms and civil liberties. However, they have above-average scores on data availability for land and procurement. |
| Germany | 54 | 69 | Has high scores in Capabilities indicators, except civil service and political integrity interoperability. In turn, they have above-average scores on data availability except for company information. |
| Bahamas | 17 | 28 | Very high scores on certain indicators like internet access and political freedoms and civil liberties, but low scores on open data initiative, civil service and digital skills. However, they have some availability in all modules, albeit with low scores except for health, procurement and public finance. |
| Finland | 52 | 69 | Has generally high scores on all Capabilities indicators, except sub-national. In turn, they also have high Availability scores in all modules. |
| Bahrain | 18 | 28 | Has high scores in internet access, government online services and digital skills despite a below-average overall score. In general, they have above-average availability scores for company information, health and procurement. |
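The selection of countries "nearest to the regression line" can be sketched as follows. This is a minimal illustration, assuming an ordinary least squares fit of Availability on Capabilities; the helper names `fit_line` and `residuals` are hypothetical, and the fit here uses only a handful of countries from this report, so the resulting line and ranking will differ from the one computed over all 109 countries.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(scores):
    """Map country -> observed Availability minus Availability predicted
    from Capabilities by the fitted line."""
    names = list(scores)
    xs = [scores[name][0] for name in names]
    ys = [scores[name][1] for name in names]
    slope, intercept = fit_line(xs, ys)
    return {name: y - (slope * x + intercept)
            for name, x, y in zip(names, xs, ys)}

# (Capabilities, Availability) pairs taken from the tables in this report.
# Illustrative subset only, not the full GDB dataset.
scores = {
    "Bulgaria": (54, 41),
    "Sweden": (59, 44),
    "Germany": (69, 54),
    "United States of America": (64, 80),
    "United Arab Emirates": (58, 18),
    "Malaysia": (69, 24),
    "Armenia": (28, 49),
    "Togo": (21, 11),
}

res = residuals(scores)
# Countries ordered from nearest to furthest from the fitted line
closest = sorted(res, key=lambda name: abs(res[name]))
for name in closest:
    print(f"{name}: {res[name]:+.1f}")
```

Sorting by the absolute value of the residual treats over- and under-performance alike, which matches the idea of "expected results" as closeness to the line in either direction.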
In general, it was interesting to see that many of these countries had high scores on political freedoms and civil liberties, including lower-scoring countries such as Bahamas, Bolivia, and Trinidad and Tobago.
Since these countries follow the linear model most closely, we calculated the residual for each Capabilities indicator with respect to the Availability pillar score. It was found that only certain indicators were contributing to the scores.
| Low residual score, thus contributing factor | Mixed or high residual score, thus likely not contributing factor |
|---|---|
| | |
This held regardless of indicator weighting, as the highest-weighted indicators were Civil service, Government support for re-use, Open data initiative and Sub-national capabilities.
Countries with anomalous results: weak correlation between the two pillars
High availability, low capabilities
The following table shows 10 countries that run counter to the hypothesis: they have high Availability scores despite their low Capabilities scores. They were selected by their highest residual value, i.e. the furthest positive distance from the regression line.
In general, these countries seem to have a decentralized statistical or information system, and the datasets assessed were identified on different platforms. This may have contributed to their general below-average scores for open data initiatives, except for Chile, Italy and New Zealand, as well as their below-average scores for sub-national capabilities, showing that the data was released by various government agencies at the national level. Hence it may be fairer to assess their sub-national capabilities by their participation in preparing national-level data.
Additionally, they have low scores in Government support for re-use, except for the US. This may mean that even though their datasets are highly available, there is little evidence that they are reused. At the same time, they also have low scores for civil service training, which could be a contributing factor to the lack of government support for data reuse.
Moreover, although most of these countries have political integrity, public finance and procurement datasets available, the datasets have low interoperability, meaning that they do not have common identifiers that facilitate mapping across the system. However, this may not be specific to this group of countries, as only 27 of the 109 countries assessed scored more than 0.
| Country | Availability | Capabilities | Details |
|---|---|---|---|
| Armenia | 49 | 28 | Armenia has no explicit open data initiative in the country. However, the datasets shown to be available seem to be on decentralized platforms. |
| United States of America | 80 | 64 | The US has generally high scores in Capabilities indicators except civil service, data institutions and sub-national capabilities. With these, they have high availability of the datasets assessed, except for company information. These datasets are on decentralized platforms. |
| Brazil | 62 | 49 | Brazil has above-average scores in Capabilities indicators, except civil service, digital skills and government support for re-use. For availability, they generally scored above average in all modules. |
| Chile | 59 | 50 | Chile has high scores in digital government, government online services, open data initiative and use of standards in statistics offices. However, they have low scores in civil service and government support for re-use. For availability, they generally scored above average, except for land data. |
| New Zealand | 70 | 62 | New Zealand has generally high scores in all Capabilities indicators, except civil service and government support for re-use. In turn, they have above-average scores in Availability modules. |
| Georgia | 46 | 40 | Georgia has average scores across Capabilities indicators, yet scored above average in most Availability modules. |
| Croatia | 47 | 43 | Croatia has average scores across Capabilities indicators, although they scored below average for civil service and government support for re-use. Despite this, they scored above average in Availability modules except in climate action (6). |
| Latvia | 54 | 50 | Latvia achieved average scores across Capabilities indicators, but scored below average for civil service and government support for re-use. However, they generally scored well in Availability modules except for land data (12). |
| Mexico | 51 | 47 | Mexico has high scores in a few Capabilities indicators such as data institutions (100) and digital government (83), but low scores in civil service and government support for re-use. However, they scored above average in most Availability modules. |
| Italy | 56 | 54 | Italy scored well in Capabilities indicators, particularly open data initiative (60), data institutions (100) and use of standards in statistics (100). Consequently, they have above-average Availability scores, except for land data. |
High capabilities, low availability
The following table shows 10 countries that run counter to the hypothesis: they have low Availability scores despite their high Capabilities scores. They were selected by their lowest residual value, i.e. the furthest negative distance from the regression line.
In general, these countries have high scores in digital government, digital skills, government online services and open data initiatives. However, to raise their below-average Availability scores, they may need to improve civil servant training, political freedoms and civil liberties (except Tunisia and Ghana), use of standards and methods in statistics offices, and sub-national capabilities (except Malaysia and UAE).
In particular, they may need to thoroughly review their current open data priorities to make datasets available according to GDB standards.
| Country | Availability | Capabilities | Details |
|---|---|---|---|
| United Arab Emirates | 18 | 58 | UAE has high scores in digital government, digital skills, sub-national capabilities and open data initiative. Despite this, they have below-average scores across Availability modules. |
| Tunisia | 7 | 38 | Tunisia has average scores in most Capabilities indicators, but has above-average scores in political freedom and civil liberties. Nevertheless, they scored below average across Availability modules, including the political datasets. |
| Sri Lanka | 8 | 35 | Sri Lanka has high scores in digital skills (54), political freedom and civil liberties (56) and digital government (77), but average scores in other Capabilities indicators. However, they did poorly in data availability assessments, with no data for climate action, company information, land and procurement. |
| Saudi Arabia | 20 | 49 | Saudi Arabia has high scores in a few Capabilities indicators, including open data initiative (80), digital government (93) and digital skills (72). But they have poor scores in political freedoms and civil liberties (7) and data institutions (0). On the other hand, they have some availability in public finance data. |
| Rwanda | 11 | 39 | Rwanda has below-average scores in Capabilities, with low internet access but an above-average score in digital government. |
| Qatar | 14 | 45 | Qatar has average scores in general for Capabilities indicators, although they achieve some high scores in data institutions and digital government. Despite this, they scored below average in Availability, with no datasets found for climate action and procurement. |
| Malaysia | 24 | 69 | Malaysia has a high Capabilities score as a result of their high scores in digital government, data institutions and sub-national capabilities. However, this does not translate into high availability of datasets. |
| Ghana | 15 | 43 | Like Tunisia, Ghana has average scores in Capabilities indicators but scored high in political freedom and civil liberties. However, they did poorly in Availability scores, except public finance (61). |
| Jordan | 8 | 34 | Jordan has above-average scores in open data initiative (80) and digital skills (65). However, for Availability modules, they do not have any data for land and political integrity. |
| Côte d'Ivoire | 5 | 41 | Côte d'Ivoire has average scores across the Capabilities indicators, although they scored excellently in government support for re-use (100). But they scored poorly for digital government. |
Conclusions and recommendations
At the global and regional level, the hypothesis that a government's data capabilities are correlated with its ability to make core datasets available seems to hold. In general, the countries that already follow this relationship closely may want to improve their data skills, digital government or internet access, depending on their current state of capabilities.
The countries that showed anomalous results may need to be looked at individually to identify the reasons for the gaps. For the countries that showed high Availability scores despite their lower-than-expected Capabilities scores, the reason may be a decentralized statistical system. Future GDB studies may be refined to cater to these countries, particularly regarding the participation of sub-national institutions in the preparation of national-level data. However, these countries may also want to make better use of the datasets available by promoting re-use.
On the other hand, the countries which showed the opposite, that is low Availability despite high Capabilities, will need to review their current data initiatives or priorities so as to better make datasets available according to GDB standards.
About Sinar Project
Sinar Project is a civic tech initiative using open technology, open data and policy analysis to systematically make important information public and more accessible. It aims to improve governance and encourage greater citizen involvement in the public affairs of the nation by making Parliament and Government more open, transparent and accountable.