Open Data Commons

Common errors for Dataset and Data Dictionary

Last updated 1 year ago

In ODC, the dataset and the data dictionary undergo quality checks for proper formatting (based on the goodTables framework). These checks help ensure that the data is Interoperable and Reusable with other datasets. Some of the quality checks run while a dataset is being uploaded, which guarantees a minimum level of quality for all private and public datasets in the ODC-SCI. The upload check is automatic, with no human oversight, because data upload is handled privately within the data owner's account. When data is released to the Community data space or submitted for publication, further checks are conducted to ensure that the released or published dataset meets FAIR standards. The checks can report the following errors:

  • Source errors (Checked at upload): ODC-SCI cannot read the data file. Possible reasons include:

    • The data file is not a *.csv file. The ODC only accepts *.csv data files for upload.

    • Reserved special characters were used in the column headers (first row with the variable names). Check our recommendations for How to upload data.

  • Structure errors:

    • Blank-header (Checked at upload): There is a blank variable name. All cells in the header row (first row) must have a value.

    • Duplicate-header (Checked at upload): There are multiple columns with the same name. All column names must be unique.

    • Blank-row (Checked at upload): Rows must have at least one non-blank cell.

    • Duplicate-row: Rows cannot be duplicated.

  • Schema errors: In ODC-SCI the schema is defined by the data dictionary. These errors reflect conflicts between the data dictionary and the dataset.

    • Extra-header: The dataset contains at least one variable name not defined in the data dictionary.

    • Missing-header: The dataset is missing at least one variable name defined in the data dictionary.

    • Missing-definition: The definition of a variable in the data dictionary is missing.

    • Required-constraint (Checked at upload): A required field for the dataset contains no values or is missing from the dataset. Currently, the only required variable in datasets is the subject identifier. As ODC-SCI develops additional data standards, more variables may become required in all datasets.

    • Value-constraint: Each value of a variable must equal one of the permitted values enumerated in the data dictionary, or fall within the limits of the permitted values.
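
To catch these errors before uploading, the structure and schema checks listed above can be approximated locally. The following is a minimal sketch that uses only the Python standard library. It is not the ODC implementation and does not use goodTables. The function names, the `Subject_ID` column name, and the in-memory dictionary layout (`definition`, `permitted`, `min`, `max` keys) are assumptions made for illustration. The required-constraint check follows the literal wording above: it fires only when the column is absent or has no values at all.

```python
import csv
import io


def check_structure(csv_text):
    """Return (error, detail) pairs for the structure checks."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0] if rows else []
    errors, seen_names, seen_rows = [], set(), set()
    for position, name in enumerate(header, start=1):
        if not name.strip():
            errors.append(("blank-header", f"column {position}"))
        elif name in seen_names:
            errors.append(("duplicate-header", name))
        seen_names.add(name)
    for line_no, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            errors.append(("blank-row", f"row {line_no}"))
            continue
        if tuple(row) in seen_rows:
            errors.append(("duplicate-row", f"row {line_no}"))
        seen_rows.add(tuple(row))
    return errors


def check_schema(csv_text, dictionary, required=("Subject_ID",)):
    """Return (error, detail) pairs for the schema checks.

    `dictionary` maps variable name -> spec dict (hypothetical layout).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    errors, has_value = [], set()
    for name in header:
        if name not in dictionary:
            errors.append(("extra-header", name))
    for name, spec in dictionary.items():
        if name not in header:
            errors.append(("missing-header", name))
        if not spec.get("definition", "").strip():
            errors.append(("missing-definition", name))
    for row in reader:
        line_no = reader.line_num
        for name, spec in dictionary.items():
            value = (row.get(name) or "").strip()
            if not value:
                continue
            has_value.add(name)
            permitted = spec.get("permitted")
            low, high = spec.get("min"), spec.get("max")
            ok = permitted is None or value in permitted
            if ok and (low is not None or high is not None):
                try:
                    number = float(value)
                    ok = (low is None or number >= low) and (
                        high is None or number <= high)
                except ValueError:
                    ok = False  # assumption: non-numeric fails a numeric limit
            if not ok:
                errors.append(("value-constraint", f"{name} row {line_no}"))
    for name in required:
        # Only meaningful if the required variable is in the dictionary.
        if name not in header or name not in has_value:
            errors.append(("required-constraint", name))
    return errors


# Made-up sample data, for illustration only.
SAMPLE_CSV = "\n".join([
    "Subject_ID,Sex,Weight_g,Notes",
    "r01,F,250,ok",
    "r02,X,900,",
    ",,,",
    "r01,F,250,ok",
]) + "\n"

SAMPLE_DICTIONARY = {
    "Subject_ID": {"definition": "Unique subject identifier"},
    "Sex": {"definition": "Sex of the animal", "permitted": {"F", "M"}},
    "Weight_g": {"definition": "", "min": 100, "max": 600},
    "Age_wk": {"definition": "Age in weeks"},
}

for error in check_structure(SAMPLE_CSV) + check_schema(SAMPLE_CSV, SAMPLE_DICTIONARY):
    print(error)
```

Keeping each check as a plain `(error, detail)` pair makes it easy to group the results under the same error names used in the list above when reviewing a file.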
