RDS Manager: Editing a Data Product

After creating a Data Product, you can edit and refine its metadata through the RDS Manager. The Data Product editor is similar to the Catalog editor: it presents the editable fields, with actions in the top right. Unlike the Catalog editor, it also has tabs where managers can search for and edit the variable and classification metadata that apply to the data product.

Metadata Curation

Metadata can be added to a Data Product and its related resources by hand through the user interface. However, if the client has metadata available in a supported format, they can use it to bulk load the metadata into RDS. Currently DDI Codebook is the only standard supported for import. Note that all imported metadata will override what is currently defined. For this reason, it is advised that metadata be imported before any metadata is entered by hand, after which managers can refine it as needed.

Data Product Actions

Data Product actions are found in the top right corner of the page.

Discard / Save

After managers have updated the fields of the data product, they can choose to discard or save the changes. Discard reverts the data product fields to their previous values. Save persists the changes. Note that auto-save functionality is on the roadmap, so these actions will eventually be phased out.

Public Toggle

By default all new resources are created as private (not public) so this toggle is turned off. Making a data product public will make it accessible to the consumer if the catalog it is a part of is also public. This should be done once the data product has been curated and is at a place where managers are comfortable with it being available to their end users.
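
The visibility rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not RDS code: the `Catalog` and `DataProduct` classes, the `is_public` field, and the `visible_to_consumers` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Catalog:
    # Hypothetical stand-in for an RDS catalog.
    name: str
    is_public: bool = False  # new resources start out private


@dataclass
class DataProduct:
    # Hypothetical stand-in for an RDS data product.
    name: str
    catalog: Catalog
    is_public: bool = False  # new resources start out private


def visible_to_consumers(product: DataProduct) -> bool:
    """A product is reachable only if it and its catalog are both public."""
    return product.is_public and product.catalog.is_public
```

Under this sketch, turning on the toggle for a data product has no visible effect for consumers until its catalog is also made public.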

Additional Actions

The additional actions icon (…) can be used to access other actions that can be performed on the data product.

Import Metadata

Importing metadata allows managers to bulk load metadata for a data product, so that they can bring metadata from another source into RDS without migrating it manually. This involves uploading a supported file and filling out a simple configuration that specifies which metadata should be imported. Currently DDI Codebook is the only metadata standard supported for import.

After the file has been uploaded, there are a number of configuration options that can be adjusted depending on what the manager wants to accomplish. If more than one dataset is defined in the uploaded file, the configuration allows one of them to be selected. The language to import the metadata into can also be selected. RDS will try to determine the appropriate language based on what it finds in the file, but the manager should confirm the choice. Managers can use the checkboxes provided to select which metadata should be imported and how to match the variables in the metadata with the variables in the data product (by variable ID or by name).
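
To make the matching options concrete, the following Python sketch shows one way imported variable metadata could be lined up with existing variables by ID or by name, with imported values overriding existing ones for the selected fields. It is an illustration under stated assumptions, not the RDS implementation: the function name `merge_imported_metadata`, the dictionary layout, and the field names `label` and `description` are all hypothetical.

```python
def merge_imported_metadata(existing, imported, match_on="id",
                            fields=("label", "description")):
    """Return a copy of `existing` with selected fields overridden.

    existing, imported: lists of dicts that each carry 'id' and 'name' keys
    (a hypothetical layout). match_on: 'id' or 'name', the key used to pair
    an imported variable with an existing one. fields: the metadata fields
    the manager chose to import.
    """
    if match_on not in ("id", "name"):
        raise ValueError("match_on must be 'id' or 'name'")

    # Index the imported variables by the chosen matching key.
    imported_by_key = {var[match_on]: var for var in imported}

    merged = []
    for var in existing:
        updated = dict(var)  # leave the caller's data untouched
        source = imported_by_key.get(var[match_on])
        if source is not None:
            for field in fields:
                if field in source:
                    # Imported metadata overrides what is currently defined.
                    updated[field] = source[field]
        merged.append(updated)
    return merged
```

In this sketch, a variable whose ID differs between the file and the data product is left unchanged when matching by ID, but is updated when matching by name, which is why the matching option should be chosen to suit the source file.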

 

View JSON Metadata

This allows managers to retrieve all of the data product's metadata in a single JSON object. This is intended as a way to pull all the metadata from the product and re-import it into another data product. At this point, the re-import must be done through the RDS Manager API.

Profile

The profile action will run data profiling over all the variables in the data product. It will compute summary statistics for all variables and frequencies for categorical variables (those with a classification associated with them). It is recommended that this is run after the desired metadata has been entered.
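
As a rough picture of what profiling produces, the Python sketch below computes simple summary statistics for every variable and frequencies only for variables that have a classification. The exact statistics RDS computes are not listed here, so the ones shown (count, missing, min, max, mean) are assumptions, as are the function name `profile` and the row-of-dicts input format.

```python
from collections import Counter
from statistics import mean


def profile(rows, classified):
    """Profile a small in-memory table.

    rows: list of dicts mapping variable name to value (None means missing).
    classified: set of variable names that have a classification attached.
    """
    result = {}
    names = list(rows[0].keys()) if rows else []
    for name in names:
        values = [row[name] for row in rows if row[name] is not None]
        stats = {"count": len(values), "missing": len(rows) - len(values)}

        # Numeric summaries only make sense when every value is a number.
        if values and all(isinstance(v, (int, float)) for v in values):
            stats["min"] = min(values)
            stats["max"] = max(values)
            stats["mean"] = mean(values)

        # Frequencies are computed only for categorical variables.
        if name in classified:
            stats["frequencies"] = dict(Counter(values))

        result[name] = stats
    return result
```

The sketch also suggests why profiling is best run after metadata entry: a variable is treated as categorical only once a classification has been associated with it.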

Caching and Clearing the Cache

By default the data product will be cached. This means that when a query is run against the data product, the returned data set or subset will be cached. If subsequent calls are made to get the same data set it will be returned from cache rather than queried from the data source. This is appropriate for cases where the data is not live (constantly changing). For data sets that change but have long intervals between updates (such as once per quarter) managers can keep the data product cached and use the “Clear Cache” option after new data has been brought in. This will ensure that the cached values are removed and the updated values are returned. For data that changes more often, such as every day, hour, or all the time, managers should turn off caching altogether. This will ensure that the users are always receiving the appropriate data.
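
The caching behavior described above can be modeled with a small Python sketch. It is a simplified illustration, not how RDS is implemented: the class `CachedDataProduct`, its `run_query` callback, and the use of the query text as the cache key are assumptions made for this example.

```python
class CachedDataProduct:
    """Toy model of a data product with an optional query cache."""

    def __init__(self, run_query, caching_enabled=True):
        self._run_query = run_query          # callable that hits the data source
        self.caching_enabled = caching_enabled
        self._cache = {}                     # query text -> cached result

    def query(self, query_text):
        if not self.caching_enabled:
            # Caching off: always go to the data source.
            return self._run_query(query_text)
        if query_text not in self._cache:
            # First request for this query: fetch and remember the result.
            self._cache[query_text] = self._run_query(query_text)
        return self._cache[query_text]

    def clear_cache(self):
        # Equivalent in spirit to the "Clear Cache" action after a data refresh.
        self._cache.clear()
```

In this model, a cached product keeps returning the old result after the source changes until `clear_cache` is called, while a product with caching turned off reflects every change immediately at the cost of querying the source each time.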

Delete Data Product

This will start a process that will delete the data product, its variables and classifications. This should be done with care.