How are datasets selected for this project?

At this point, selection is mainly based on our team finding data of interest and on our ability to ingest it rapidly into the catalog. Our current focus is on the United States and Canada, and we are also looking into Europe and other highly affected countries. Africa is also a high priority. There are many data sources out there, and we welcome suggestions.

How are RDS COVID data & metadata curated?

We typically perform the following tasks when assessing and converting data for publication in the RDS COVID-19 Catalog:

  • Restructure the data as needed to facilitate analysis with RDS or statistical and

  • Convert names into standard codes. For example, we typically convert countries, subdivisions, and other geospatial entities into ISO, ANSI, or FIPS codes.

  • Convert dates to standard ISO formats, and extend the dataset by adding additional time variables (particularly for time series)

  • Capture core metadata (data dictionary), such as variable names, labels, descriptions, and classifications

  • Load the data into a data warehouse
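
As an illustration of the code and date conversion steps above, here is a minimal Python sketch. It is not the actual RDS pipeline: the `COUNTRY_TO_ISO` lookup table, the `curate_record` function, the input field names, and the `MM/DD/YYYY` source date format are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical lookup table (assumption); a real workflow would rely on a
# complete ISO 3166-1 reference list rather than a hand-written dictionary.
COUNTRY_TO_ISO = {
    "united states": "US",
    "canada": "CA",
    "france": "FR",
}


def curate_record(record, date_format="%m/%d/%Y"):
    """Return a copy of a raw record with a standard country code,
    an ISO 8601 date, and a few derived time variables."""
    parsed = datetime.strptime(record["date"], date_format).date()
    iso_year, iso_week, iso_weekday = parsed.isocalendar()

    curated = dict(record)  # leave the source record untouched
    # Unknown country names map to None instead of raising an error.
    curated["country_code"] = COUNTRY_TO_ISO.get(record["country"].strip().lower())
    curated["date"] = parsed.isoformat()                 # YYYY-MM-DD
    curated["year"] = parsed.year
    curated["month"] = parsed.month
    curated["iso_week"] = f"{iso_year}-W{iso_week:02d}"  # ISO 8601 week
    curated["day_of_week"] = iso_weekday                 # 1 = Monday, 7 = Sunday
    return curated


raw = {"country": "Canada", "date": "03/15/2020", "cases": 5}
print(curate_record(raw))
```

Keeping the original name next to the added code, and adding time variables rather than replacing the date, means the curated record can still be traced back to its source.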

Once a workflow is well defined, we automate the process to ensure the data refreshes with the source.

What standards is this project using?

On the metadata side, our management practices and platform are informed by internationally accepted standards for the management of official statistics and scientific data, specifically the Generic Statistical Information Model (GSIM) and the Data Documentation Initiative (DDI). These have been endorsed by the High Level Group for the Modernization of Official Statistics, the Research Data Alliance, and numerous data archives, research groups, and organizations around the world. We aim to abide by the FAIR principles as much as possible.

  • ISO 3166 for countries and their subdivisions

  • ISO 8601 for dates and other temporal variables

  • For the United States, FIPS 5-2 and FIPS 6-4 (while technically obsolete, these remain widely used) as well as other coding schemes used by the U.S. Census Bureau.

  • Statistics Canada classifications

  • Other international and national classifications maintained or endorsed by the United Nations Classification Division.
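
To make the FIPS coding concrete: a U.S. county is commonly identified by a five-digit string made of the two-digit FIPS 5-2 state code followed by the three-digit FIPS 6-4 county code. Sources often store these as integers and lose the leading zeros. The sketch below shows one way to rebuild the padded code; the `county_fips` helper is hypothetical and not part of RDS.

```python
def county_fips(state_code, county_code):
    """Combine a FIPS 5-2 state code and a FIPS 6-4 county code into
    the conventional zero-padded five-digit county identifier."""
    state = str(state_code).strip().zfill(2)
    county = str(county_code).strip().zfill(3)
    # Reject anything that is not exactly 2 + 3 ASCII digits after padding.
    if not (state.isascii() and state.isdigit() and len(state) == 2):
        raise ValueError(f"invalid state code: {state_code!r}")
    if not (county.isascii() and county.isdigit() and len(county) == 3):
        raise ValueError(f"invalid county code: {county_code!r}")
    return state + county


# Los Angeles County, California: state 06, county 037
print(county_fips(6, 37))
```

Storing the result as a string rather than a number keeps the leading zero that identifies states such as California (06).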

How can I get my dataset included in RDS?

If you are a data producer and have relevant COVID-19 data, we would like to hear from you and may be able to help get your dataset into the RDS system. File a ticket here, or email us at mtna@mtna.us, and we will get back to you.