✴️ Glossary

A glossary of common terms used throughout the Development Data Partnership.

AWS (Amazon Web Services)

AWS is a cloud services platform. Its services are widely used to store, organize, process, and analyze many types of data.

AWS SageMaker

SageMaker is an AWS service for analyzing datasets. It was developed specifically to build, train, and deploy machine learning models.

Application / Application Program / Application Software

An application is a program that runs on your computer, such as Notepad, PowerPoint, Excel, and many others. Applications also include web browsers (Internet Explorer, Google Chrome, Mozilla Firefox), email clients, and web applications (Facebook, Twitter, Google Docs).

API (Application Programming Interface)

An API is a channel, governed by a set of rules, through which two applications can talk to each other. One application uses this "channel" to send queries and requests to another application, and the same API then returns the information or data that was asked for.
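To make the idea of a rule-governed channel concrete, here is a minimal sketch in Python. It is a toy, in-process stand-in rather than a real web API: the names `handle_request`, `count_records`, and `_CATALOG`, the JSON request format, and the dataset names and counts are all hypothetical and chosen only for illustration.

```python
import json

# Application A (the provider). Its internal data is hidden from callers.
# The dataset names and record counts are made-up illustrative values.
_CATALOG = {"mobility": 1200, "nightlights": 450}


def handle_request(request_json: str) -> str:
    """The API: the only entry point application B may use.

    Rule 1: a request is a JSON object with "action" and "dataset" fields.
    Rule 2: a response is a JSON object with a "status" field.
    """
    request = json.loads(request_json)
    if request.get("action") != "count_records":
        return json.dumps({"status": "error", "message": "unknown action"})
    dataset = request.get("dataset")
    if dataset not in _CATALOG:
        return json.dumps({"status": "error", "message": "unknown dataset"})
    return json.dumps(
        {"status": "ok", "dataset": dataset, "records": _CATALOG[dataset]}
    )


def count_records(dataset: str) -> int:
    """Application B (the client): it knows the rules, not A's internals."""
    request_json = json.dumps({"action": "count_records", "dataset": dataset})
    response = json.loads(handle_request(request_json))
    if response["status"] != "ok":
        raise LookupError(response["message"])
    return response["records"]


print(count_records("mobility"))
```

The client never touches `_CATALOG` directly. It only sends a request that follows the agreed rules and reads the response, which is the essence of how two applications communicate through an API.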

Data Partner

A Data Partner, Data Provider, or Dataset Provider is an organization that provides data and/or Metadata under the Master Data License Agreement.

Development Data Partnership

Development Data Partnership or "Data Partnership" is a partnership between international organizations, created to promote the use of third-party data in research and international development.

Development Data

Development Data is data about countries that can be used for reference or analysis in the process of development, typically in sectors such as the economy and finance, poverty, education, health, public administration, private sector development, agriculture, land use, gender, climate change, the environment, infrastructure, trade, and others. Development Data does not contain Personally Identifiable Information (PII).

Derivative Works

Derivative Works are works based on or derived from one or more existing works. For the purposes of this document, derivative works include derived data and analytical products, including but not limited to: research papers, analytical studies, data visualizations, derived indicators, aggregated and/or derived databases, and other outputs (e.g. publications, CDs, mobile device applications, blogs, online data products, etc.) created using the Dataset(s) and Metadata in question.


Geographic Information System (GIS)

Geographic Information System (GIS) software is designed to capture, manage, analyze, and display all forms of geographically referenced information. GIS can show many kinds of data on one map. This enables researchers and other data users to more easily see, analyze, and understand patterns and relationships.


Geospatial Data

Geospatial data, also called GIS data or geo-data, has explicit geographic positioning information included within it, such as a road network from a GIS or a geo-referenced satellite image. Geospatial data may also include attribute data that describes the features found in the data.
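As a rough illustration of positioning information combined with attribute data, the sketch below describes a single road segment as a GeoJSON-style feature held in a plain Python dictionary. The coordinates, the road name, and the attribute values are made up, and `bounding_box` is a hypothetical helper written for this example.

```python
# A GeoJSON-style feature: "geometry" carries the positioning information,
# "properties" carries the attribute data. All values are made up.
# Coordinates are [longitude, latitude] pairs, as in GeoJSON.
road = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[36.80, -1.30], [36.82, -1.28], [36.85, -1.29]],
    },
    "properties": {"name": "Example Road", "surface": "paved", "lanes": 2},
}


def bounding_box(feature):
    """Return (min_lon, min_lat, max_lon, max_lat) for a LineString feature."""
    coordinates = feature["geometry"]["coordinates"]
    lons = [lon for lon, lat in coordinates]
    lats = [lat for lon, lat in coordinates]
    return (min(lons), min(lats), max(lons), max(lats))


print(bounding_box(road))          # where the feature is
print(road["properties"]["name"])  # what the feature is
```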


Git

Git is an application: a distributed version control system that tracks changes to any set of files. Git is mainly used by programmers and developers who work together on a project and need to review the changes their teammates have made to a shared code base.


GitHub

GitHub is a code hosting platform for version control with Git. It also provides a sharing and publishing service, and a social networking environment for data scientists and programmers.


Metadata

Metadata are defined as 'data about data'. They help users understand the meaning of data, or provide useful information about its provenance or licensing status.


Microdata

Microdata are unit-level data obtained from sample surveys, censuses, and administrative systems. They provide information about characteristics of individual people or entities such as households, business enterprises, facilities, farms or even geographical areas such as villages or towns.

Personally Identifiable Information (PII)

PII is any information that permits the identity of an individual to be directly or indirectly inferred, or any information that is linked or linkable, or may be attributed, to that individual.

Relational Database (RD)

A relational database stores data in tables built according to a specific design, called a schema. It is tabular because it manages data through columns and rows. However, it is not the same as a familiar spreadsheet. A relational database only allows organized, structured tables defined by a schema, whereas a spreadsheet lacks a definite, standard structure for organizing its data.
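The contrast with a spreadsheet can be sketched with Python's built-in `sqlite3` module, used here only as a convenient example of a relational database. The table name `datasets`, its columns, and the rows are hypothetical.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table is declared with a fixed set of columns.
conn.execute(
    "CREATE TABLE datasets (name TEXT NOT NULL, provider TEXT NOT NULL, year INTEGER)"
)

# Rows that match the declared columns are accepted.
conn.executemany(
    "INSERT INTO datasets VALUES (?, ?, ?)",
    [("mobility", "Provider A", 2020), ("nightlights", "Provider B", 2021)],
)

# Unlike a spreadsheet, a row with an extra, undeclared column is rejected.
try:
    conn.execute(
        "INSERT INTO datasets VALUES (?, ?, ?, ?)",
        ("roads", "Provider C", 2022, "extra cell"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

rows = conn.execute(
    "SELECT name, provider, year FROM datasets ORDER BY name"
).fetchall()
print(rows)
print("extra column rejected:", rejected)
```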

Relational Database Management System (RDBMS)

An RDBMS is an application (a program) used to create, manage, or alter a relational database.

Schema

A schema is how the data is organized and structured. It is the logical layout that determines how the computer reads, accesses, and returns information from a dataset. How a schema is defined can make a big difference in how the computer processes and responds to the information you request.
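One way to see how schema design affects processing is to add an index, which SQLite stores as part of the database schema, and compare how the same query is planned before and after. This is a sketch using Python's built-in `sqlite3` module. The `indicators` table, the index name, and the `query_plan` helper are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE indicators (country TEXT, year INTEGER, value REAL)")


def query_plan(connection):
    """Return SQLite's plan for a lookup by country as a single string."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT value FROM indicators WHERE country = ?",
        ("AAA",),
    ).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Without an index, SQLite has to scan every row of the table.
plan_without_index = query_plan(conn)

# Changing the schema by adding an index lets SQLite jump to matching rows.
conn.execute("CREATE INDEX idx_indicators_country ON indicators (country)")
plan_with_index = query_plan(conn)

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both cases. Only the schema changed, and with it the way the database answers the request.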

SQL (Structured Query Language)

Like Python, R, Stata, or C#, SQL is a programming language. It was designed specifically for managing and communicating with relational databases through a relational database management system (RDBMS).
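A short sketch of what SQL looks like in practice, sent to SQLite through Python's built-in `sqlite3` module. The `downloads` table and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQL statements that define a table and add rows to it.
conn.executescript(
    """
    CREATE TABLE downloads (
        dataset     TEXT    NOT NULL,
        year        INTEGER NOT NULL,
        n_downloads INTEGER NOT NULL
    );
    INSERT INTO downloads VALUES
        ('mobility', 2020, 10),
        ('mobility', 2021, 15),
        ('nightlights', 2021, 7);
    """
)

# An SQL query: filter rows, group them by dataset, and total the downloads.
totals = conn.execute(
    """
    SELECT dataset, SUM(n_downloads) AS total
    FROM downloads
    WHERE year >= ?
    GROUP BY dataset
    ORDER BY total DESC
    """,
    (2020,),
).fetchall()

print(totals)
```

Python is only the host language here. The statements inside the quoted strings are SQL, and it is the RDBMS that interprets them.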