

CRISPR Screen Metadata

Many of the manually curated fields in ORCS use controlled vocabularies developed by curators after a survey of the literature. These vocabularies are provided below.

^ Field Name ^ Description ^
| Screen Name | Each screen is assigned a number followed by the PubMed ID (e.g. 1-PMID26627737), so that every screen in a publication with multiple screens has its own unique identifier. |
| Organism | Controlled vocabulary. |
| Screen Rationale | A free-text field in which curators succinctly describe the purpose of the screen, providing context for the significance of hits. Common rationales include “Cell essential genes” or “Increased drug resistance”, but curators are free to add specificity when needed, such as “Tumor response to drug treatment”. |
| Experimental Setup | Controlled vocabulary based on a survey of the data. Can easily be modified to accommodate new screen types as they are developed. |
| Duration | Controlled vocabulary of hours, days or doublings. |
| Condition Name/Condition Dosage | If the condition is a drug, add an ontology ID from ChEBI; if a protein ligand, add a UniProt ID; if a virus or bacterium, add a taxon ID; if a mutation, add the relevant gene ID. If the condition is a toxin, a relevant culture condition or another term with no ID, the ID field can be left blank. Currently, if multiple conditions need to be specified for a screen (usually a mutation in addition to another condition), they can be separated using a pipe character. For the dosage there is a numeric field with an associated controlled vocabulary for units. A list of ontology terms currently in use in ORCS can be found here. |
| MOI | Optional. The multiplicity of infection (MOI) should be entered if provided in the publication. |
| CRISPR Library Name/Type | Controlled vocabulary terms generated by a survey of the literature. Can easily be modified as new libraries are developed. Accession IDs are associated when possible. |
| Screen Format | Controlled vocabulary terms generated by a survey of the literature. Can easily be modified as new screen formats are developed. |
| Enzyme | Controlled vocabulary terms generated by a survey of the literature. Can easily be modified as new enzymes, such as base editors, are developed. |
| Cell Line/Type | Ontology terms from BTO (first choice), EFO (second choice) or CLO (third choice). Additional guidelines for this curation can be found here. |
| Phenotype | Ontology terms from APO or CMPO, or a controlled vocabulary as needed. Terms currently in use in ORCS can be found here. |
| Analysis Method | Controlled vocabulary terms generated by a survey of the literature. Can easily be modified as new analysis methods are developed. |
| Significance Threshold | If provided by the authors, a numeric significance threshold can be applied to any score column entered for a screen, and to multiple score columns if the analysis method requires it. Alternatively, if only a list of hits was published, the curator can mark the screen data as “all significant”, or, if a yes/no hit value was used, a boolean column can be specified. |
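
The Screen Name format and the pipe-separated condition list described above are easy to handle programmatically. The following Python sketch is illustrative only and is not part of ORCS. The function names, the regular expression and the example condition labels are assumptions made for this example. In particular, it assumes the screen name is always digits, a hyphen, the literal "PMID" and then digits, as in the example 1-PMID26627737.

```python
import re
from typing import List, NamedTuple

# Assumed pattern: <screen number>-PMID<PubMed ID>, e.g. 1-PMID26627737.
SCREEN_NAME_RE = re.compile(r"^(\d+)-PMID(\d+)$")


class ScreenName(NamedTuple):
    """Hypothetical container for the two parts of a screen name."""
    screen_number: int
    pubmed_id: str


def parse_screen_name(name: str) -> ScreenName:
    """Split a screen name into its screen number and PubMed ID."""
    match = SCREEN_NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"Not a valid screen name: {name!r}")
    return ScreenName(int(match.group(1)), match.group(2))


def split_conditions(value: str) -> List[str]:
    """Split a pipe-separated condition field into individual conditions.

    Surrounding whitespace is trimmed and empty entries are dropped.
    """
    return [part.strip() for part in value.split("|") if part.strip()]


if __name__ == "__main__":
    print(parse_screen_name("1-PMID26627737"))
    # The condition labels below are made-up placeholders, not ORCS terms.
    print(split_conditions("drug X|gene Y mutation"))
```

Keeping the PubMed ID as a string avoids treating an identifier as a quantity. The screen number is converted to an integer so that screens from the same publication sort numerically.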


 
orcs/curation_guide/metadata_terms.1657638836.txt.gz · Last modified: 2022/07/12 11:13 by jenn