Open Data Commons

Data Quality Checks for DOI

In ODC, the dataset and the data dictionary undergo quality checks for proper formatting. These checks ensure that the data is Interoperable and Reusable with other datasets. Some quality checks are performed during upload, ensuring a minimum level of quality for all private and public datasets in the ODC. The upload checks are automatic, without human oversight, because data upload is handled privately within the data owner's account. When data is released for publication, further checks are conducted to ensure that the released dataset meets FAIR standards:

Source checks

(Checked at upload): ODC cannot read the data file. Possible reasons include:

  • The data file is not a *.csv. The ODC only accepts the upload of *.csv data files.

  • Reserved special characters were used in the column headers (first row with the variable names). Check our recommendations for How to upload data.
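
As a rough illustration of the two source checks above, the following Python sketch tests a file name and its text content before upload. It is not the ODC implementation: the function name `source_check`, the problem labels, and the `ASSUMED_RESERVED` character set are all hypothetical, and the actual reserved characters are the ones listed in the ODC upload recommendations.

```python
import csv
import io

# Hypothetical set of disallowed header characters (an assumption for this
# sketch); consult the ODC upload recommendations for the real list.
ASSUMED_RESERVED = set("#$%&*?/\\")


def source_check(filename, text):
    """Return a list of source-level problems for a file name and its content."""
    problems = []

    # The ODC only accepts *.csv data files.
    if not filename.lower().endswith(".csv"):
        problems.append("not-csv")
        return problems

    # The header is the first row, which holds the variable names.
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    for name in header:
        bad = sorted(set(name) & ASSUMED_RESERVED)
        if bad:
            problems.append(f"reserved-character in {name!r}: {''.join(bad)}")
    return problems


if __name__ == "__main__":
    print(source_check("data.csv", "subject_id,weight%\n1,2\n"))
```

Running a check like this locally before uploading can catch an unreadable file early, although passing it does not guarantee that the ODC upload will succeed.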

Structure checks

  • Blank-header (Checked at upload): There is a blank variable name. All cells in the header row (first row) must have a value.

  • Duplicate-header (Checked at upload): Multiple columns with the same name. All column names must be unique.

  • Blank-row (Checked at upload): Rows must have at least one non-blank cell.

  • Duplicate-row: Rows cannot be duplicated.
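
The four structure checks listed above can be approximated locally with a short Python sketch. The function name `structure_check` is hypothetical, and the treatment of surrounding whitespace (cells are stripped before comparison) is an assumption of this sketch rather than documented ODC behavior.

```python
import csv
import io


def structure_check(text):
    """Return the names of the structure checks that a CSV text fails."""
    rows = list(csv.reader(io.StringIO(text)))
    # An empty file has no header; that case is left to the source checks.
    header = rows[0] if rows else []
    body = rows[1:]
    errors = []

    # Blank-header: every cell in the first row must have a value.
    names = [name.strip() for name in header]
    if any(name == "" for name in names):
        errors.append("blank-header")

    # Duplicate-header: all non-blank column names must be unique.
    non_blank = [name for name in names if name]
    if len(set(non_blank)) != len(non_blank):
        errors.append("duplicate-header")

    # Blank-row and duplicate-row: scan the data rows once.
    seen = set()
    has_blank = has_duplicate = False
    for row in body:
        cells = tuple(cell.strip() for cell in row)
        if not any(cells):
            has_blank = True
            continue
        if cells in seen:
            has_duplicate = True
        seen.add(cells)
    if has_blank:
        errors.append("blank-row")
    if has_duplicate:
        errors.append("duplicate-row")
    return errors


if __name__ == "__main__":
    print(structure_check("id,,id\n1,2,3\n,,\n1,2,3\n"))
```

The sketch reports each failed check once, regardless of how many rows or columns trigger it.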

Schema checks

In ODC, the schema is defined by the data dictionary. These errors reflect conflicts between the data dictionary and the dataset.

  • Extra-header: The dataset contains at least one variable name not defined in the data dictionary.

  • Missing-header: The dataset is missing at least one variable name defined in the data dictionary.

  • Missing-definition: The definition of a variable in the data dictionary is missing.

  • Required-constraint (Checked at upload): A required field for the dataset contains no values or is not assigned to the dataset. Currently, the only required variable in a dataset is the subject identifier. As ODC develops additional data standards, more variables may be required in all datasets.

  • Value-constraint: The values of a variable should be equal to one of the permitted values enumerated in the data dictionary or within the limits of the permitted values.
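
To make the schema checks above concrete, here is a minimal Python sketch that compares a dataset against a data dictionary. Everything in it is an assumption for illustration: the dictionary is modeled as a plain Python mapping with hypothetical keys (`definition`, `permitted`, `min`, `max`) rather than the ODC data dictionary file format, `subject_id` is a placeholder name for the subject identifier column, "contains no values" is read as "every cell in the column is blank", and a non-numeric value in a column with limits is counted as a violation.

```python
def value_allowed(value, spec):
    """Check one non-blank value against the permitted values or limits."""
    if "permitted" in spec and value not in spec["permitted"]:
        return False
    if "min" in spec or "max" in spec:
        try:
            number = float(value)
        except ValueError:
            # Assumption: a non-numeric value cannot satisfy numeric limits.
            return False
        if "min" in spec and number < spec["min"]:
            return False
        if "max" in spec and number > spec["max"]:
            return False
    return True


def schema_check(header, rows, dictionary, required=("subject_id",)):
    """Return the sorted names of the schema checks that fail.

    header: list of column names; rows: list of dicts keyed by column name;
    dictionary: mapping of variable name to its (hypothetical) specification.
    """
    errors = set()
    if any(name not in dictionary for name in header):
        errors.add("extra-header")
    if any(name not in header for name in dictionary):
        errors.add("missing-header")
    if any(not str(spec.get("definition") or "").strip()
           for spec in dictionary.values()):
        errors.add("missing-definition")

    # Required-constraint: the column is absent or holds no values at all.
    for name in required:
        values = [str(row.get(name) or "").strip() for row in rows]
        if name not in header or not any(values):
            errors.add("required-constraint")

    # Value-constraint: blank cells are skipped, other cells are validated.
    for row in rows:
        for name in header:
            spec = dictionary.get(name)
            value = str(row.get(name) or "").strip()
            if spec is None or value == "":
                continue
            if not value_allowed(value, spec):
                errors.add("value-constraint")
    return sorted(errors)


if __name__ == "__main__":
    dictionary = {
        "subject_id": {"definition": "Unique subject identifier"},
        "sex": {"definition": "Sex of the subject", "permitted": {"M", "F"}},
    }
    rows = [{"subject_id": "r1", "sex": "M"}, {"subject_id": "r2", "sex": "X"}]
    print(schema_check(["subject_id", "sex"], rows, dictionary))
```

The rows can be produced with `csv.DictReader` from the standard library; the sketch keeps them as plain dictionaries so that the comparison logic is easy to follow.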
