Dataset Publishing Cycle

Datasets are managed in the MDX first at the descriptive level: simple properties of a dataset, such as the owner, the type of data contained, and the geographical area covered, are entered into the Data Catalogue part of the MDX. Each dataset is given a unique identifier and made “LIVE”. At this point it cannot be changed except by creating a new version, which then has an identifier linked to the original dataset.

There can only be one live data model at a time. This means that a data model can be defined as a standard: for instance, we can say that within our organisation this is the standard for a medical disease laboratory test, and that it has the unique ID https://mylab/diseasemodelX. Everyone can then reference a single standard model, so if a new clinical trial gets underway, it can be guaranteed to conform to a particular data standard simply by referring to that URL.
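The "one live model per standard" rule can be sketched as a lookup that resolves a standard's URL to exactly one LIVE entry. This is only an illustration: the `DataModel` record, the `resolve_live` function, the `standard_url` and `version` fields, and the "HISTORICAL" label are assumptions made for the example and are not part of the MDX itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataModel:
    # Hypothetical record; field names are assumptions for this sketch.
    standard_url: str  # the URL that people reference as "the standard"
    version: int
    status: str        # "DRAFT", "LIVE" or "HISTORICAL"


def resolve_live(models, standard_url):
    """Return the single LIVE model for a standard URL.

    Raises LookupError if there is no LIVE model, or more than one,
    since either case breaks the one-live-model rule.
    """
    live = [
        m for m in models
        if m.standard_url == standard_url and m.status == "LIVE"
    ]
    if len(live) != 1:
        raise LookupError(
            f"expected exactly one LIVE model for {standard_url}, found {len(live)}"
        )
    return live[0]


catalogue = [
    DataModel("https://mylab/diseasemodelX", 1, "HISTORICAL"),
    DataModel("https://mylab/diseasemodelX", 2, "LIVE"),
]

# A new clinical trial only needs the URL to find the current standard.
standard = resolve_live(catalogue, "https://mylab/diseasemodelX")
```

Because the lookup fails loudly when zero or several live models exist, anything that refers to the URL can rely on getting one unambiguous definition.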

Normally, the process for defining a dataset starts with a schema definition for that dataset, which defines all the properties that will be used to describe and manage it. Once this is done, a new template for the MDX onboarding component is generated, together with the input screens, and the metadata (data about the dataset) can be entered into the “DRAFT” data model (or dataset). Once this is complete, the dataset can be made “LIVE” - see below for details.

 

 

Live datasets can be retired at any time if they are not needed, and later restored. If a dataset needs to be updated because the dataset owner has introduced a change or a new definition, then rather than changing the dataset itself, which would introduce confusion into the process, a new draft dataset is created. This draft can then become the “LIVE” instance of the dataset, replacing the “historical” version. The historical version is still available for reference, but it is no longer the default go-to version of that dataset or data standard.
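The lifecycle described above can be read as a small state machine. The following is a minimal sketch under stated assumptions: the `Status` names, the `Dataset` class, the `replaces` link and the four functions are all hypothetical and chosen for illustration; they do not reflect the actual MDX implementation or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "DRAFT"
    LIVE = "LIVE"
    HISTORICAL = "HISTORICAL"  # superseded by a newer live version
    RETIRED = "RETIRED"        # withdrawn, but can be restored


@dataclass
class Dataset:
    identifier: str
    status: Status = Status.DRAFT
    replaces: Optional[str] = None  # identifier of the version this one supersedes


def new_draft_from(live: Dataset, new_identifier: str) -> Dataset:
    """Start an update by creating a draft linked to the live dataset."""
    if live.status is not Status.LIVE:
        raise ValueError("a new version can only be drafted from a LIVE dataset")
    return Dataset(new_identifier, Status.DRAFT, replaces=live.identifier)


def go_live(draft: Dataset, current_live: Optional[Dataset] = None) -> None:
    """Promote a draft; the version it replaces becomes HISTORICAL."""
    if draft.status is not Status.DRAFT:
        raise ValueError("only a DRAFT dataset can be made LIVE")
    if current_live is not None:
        if draft.replaces != current_live.identifier:
            raise ValueError("draft is not linked to the current live dataset")
        current_live.status = Status.HISTORICAL
    draft.status = Status.LIVE


def retire(dataset: Dataset) -> None:
    if dataset.status is not Status.LIVE:
        raise ValueError("only a LIVE dataset can be retired")
    dataset.status = Status.RETIRED


def restore(dataset: Dataset) -> None:
    if dataset.status is not Status.RETIRED:
        raise ValueError("only a RETIRED dataset can be restored")
    dataset.status = Status.LIVE


v1 = Dataset("dataset-v1")
go_live(v1)                           # first publication
v2 = new_draft_from(v1, "dataset-v2")
go_live(v2, current_live=v1)          # v1 is kept, but as HISTORICAL
```

Note that the live dataset is never edited in place: the only way to change it is to draft a linked successor and promote that. The sketch does not check whether another version has gone live while a dataset was retired; a real system would need to decide how to handle that case.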

 

Since it is not good practice to keep updating datasets, it is critical to bear in mind a few key points when making a dataset “LIVE”. These are listed below:

 

 

These points are reflected in the MDX go-live process, illustrated below: