BioGRID ORCS - Screen Index (*.index.tab.txt)
The BioGRID ORCS Screen Index format provides all details and annotation curated for a specific screen or set of screens. The format is easy to load into Microsoft Excel or Google Sheets and easy to process with scripts for bioinformatic applications. All columns are separated by tabs and are described in detail below.
How to detect a BioGRID ORCS Screen Index File
BioGRID ORCS Screen Index files are denoted by the extension .index.tab.txt
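As a minimal sketch, a script could recognize these files by their extension alone. The function name and the example file names are hypothetical, and matching the extension case-insensitively is an assumption rather than something the format specifies.

```python
from pathlib import Path

# Extension that denotes a BioGRID ORCS Screen Index file.
INDEX_SUFFIX = ".index.tab.txt"


def is_screen_index_file(path):
    """Return True if the file name ends with the Screen Index extension.

    Assumption: the comparison is case-insensitive.
    """
    return Path(path).name.lower().endswith(INDEX_SUFFIX)
```

For example, `is_screen_index_file("my-screens.index.tab.txt")` would be true, while a name ending in plain `.tab.txt` would not match.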
Header Definitions
The first line of a BioGRID ORCS Screen Index file is the heading line and starts with a hash (#). This line is purely for informational purposes and gives a brief description of the content contained in each column. If you are scripting the use of this file, you can simply ignore it.
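One way to ignore the heading line when scripting is sketched below. The helper name is hypothetical, and skipping blank lines is an added convenience, not part of the format description.

```python
def iter_data_lines(handle):
    """Yield data lines from an open Screen Index file.

    The first line is skipped when it starts with '#', since it is only
    an informational heading. Blank lines are skipped as well (assumption).
    """
    for line_number, line in enumerate(handle):
        if line_number == 0 and line.startswith("#"):
            continue  # heading line, informational only
        line = line.rstrip("\r\n")
        if line:
            yield line
```

The function accepts any iterable of lines, so it can be used with a file opened via `open(path, encoding="utf-8")` or with an in-memory `io.StringIO` for testing.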
Screen Index Column Definitions
The column contents of BioGRID ORCS Screen Index Files should be as follows:
- Screen ID - A unique identifier in the BioGRID ORCS database representing this screen. You will need this value in order to match this screen with score values in a screen or screen matrix file.
- Source ID - An identifier representing the source of the original data. This is usually a PubMed ID, but it can also come from other sources. The Source Type column below tells you what kind of identifier it is.
- Source Type - A short name representing the type of ID used for Source ID above. In most cases, this will simply be a PubMed ID.
- Author - A short string representation of the author and year of publication of the dataset.
- Screen Name - A short string name for the screen. Usually this follows the convention ###-PUBMEDID
- Scores Size - Total number of scores made available for this screen. In some cases, the full dataset is not published. In these cases, Scores Size and Full Size will differ.
- Full Size - Full number of scores for this screen. In cases in which we have not been able to load all data, this number will be greater than Scores Size listed above.
- Full Size Available - Will be yes or no depending on whether or not the full dataset was made available.
- Number of Hits - The number of results considered 'hits' when using the criteria defined by the authors.
- Analysis - The analysis method used on the dataset (example: MaGeCK or CERES)
- Significance Indicator - How significant results (hits) are determined.
- Significance Criteria - The actual criteria used to determine significance. This can involve one column or a combination of columns, and in most cases reflects the criteria outlined by the authors in the original publication.
- Throughput - Either high or low throughput.
- Screen Type - The type of screen being performed (example: Positive, Negative etc.)
- Screen Format - The format of the screen (example: Pool, Array)
- Experimental Setup - The type of experiment being performed (example: Timecourse, Toxin Exposure)
- Duration - When applicable, the duration of the experiment (example: 3 days, 20 doublings)
- Condition Name - Conditions for the experiment such as drugs used for toxin exposure.
- Condition Dosage - The dosage of the condition used above.
- MOI - Multiplicity of Infection
- Library - The name of the library used for this screen.
- Library Type - The type of library used (example: CRISPRn, CRISPRa)
- Library Methodology - The methodology behind the library (example: Knockout)
- Enzyme - Enzyme used (example: CAS9)
- Cell Line - Cell line used
- Cell Type - Cell type used
- Phenotype - Phenotype presented
- Score Column Count - Number of columns containing score values used in this screen. Can be 1 through 5. This tells you which of the following 5 fields (Score 1 Type through Score 5 Type) are relevant.
- Score 1 Type - The type of score stored in the first score column. (example: p-value, ceres score, log 2 fold change)
- Score 2 Type - The type of score stored in the second score column. (example: p-value, ceres score, log 2 fold change)
- Score 3 Type - The type of score stored in the third score column. (example: p-value, ceres score, log 2 fold change)
- Score 4 Type - The type of score stored in the fourth score column. (example: p-value, ceres score, log 2 fold change)
- Score 5 Type - The type of score stored in the fifth score column. (example: p-value, ceres score, log 2 fold change)
- Organism ID - The NCBI Taxonomy ID of the organism used in this screen (example: 9606 for Human)
- Organism Official - The NCBI official name for the organism used in this screen (example: Homo sapiens for Human)
- Notes - Notes on this dataset written by our curation team. These can include additional insight into the dataset and how it was curated.
- Source - The name of the database this screen was curated from. When it was curated by our team, this field will contain “BioGRID ORCS”.
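The column order listed above can be turned into named records with a few lines of Python. This is only a sketch: the `COLUMNS` labels are taken from this page and are not necessarily the literal text of the file's heading line, and both function names are hypothetical.

```python
# Column labels in the order documented above (37 columns).
COLUMNS = [
    "Screen ID", "Source ID", "Source Type", "Author", "Screen Name",
    "Scores Size", "Full Size", "Full Size Available", "Number of Hits",
    "Analysis", "Significance Indicator", "Significance Criteria",
    "Throughput", "Screen Type", "Screen Format", "Experimental Setup",
    "Duration", "Condition Name", "Condition Dosage", "MOI", "Library",
    "Library Type", "Library Methodology", "Enzyme", "Cell Line",
    "Cell Type", "Phenotype", "Score Column Count",
    "Score 1 Type", "Score 2 Type", "Score 3 Type", "Score 4 Type",
    "Score 5 Type", "Organism ID", "Organism Official", "Notes", "Source",
]


def parse_row(line):
    """Split one tab-separated data line into a dict keyed by column label."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(
            f"expected {len(COLUMNS)} columns, found {len(fields)}"
        )
    return dict(zip(COLUMNS, fields))


def relevant_score_types(row):
    """Return the Score N Type values that Score Column Count marks as used."""
    count = int(row["Score Column Count"])
    return [row[f"Score {i} Type"] for i in range(1, count + 1)]
```

Raising an error on an unexpected column count is a design choice for this sketch; a more lenient script might log the line and continue instead.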
All columns are mandatory, so columns with no values are filled with “-”.
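Because empty values are written as a hyphen, a script will usually want to map that placeholder to a proper missing value before doing comparisons or numeric conversions. A minimal sketch, with hypothetical helper names and `None` chosen as the missing-value marker:

```python
# Placeholder used in the file for columns with no value.
MISSING = "-"


def normalize(value):
    """Map the '-' placeholder to None; return other values unchanged."""
    return None if value.strip() == MISSING else value


def normalize_fields(fields):
    """Apply normalize() to every field of a split data line."""
    return [normalize(field) for field in fields]
```

With this in place, a check such as `normalize(duration) is None` distinguishes "no duration recorded" from a real value like `3 days`.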