Consumer-focused section
- Date (published): [Month D, YYYY]
- Version (of this document): [vX.0]
- Authors: [TMC contact and (specific) HuBMAP Assay team]
- What is being measured? a. Tags*: [e.g., Proteins; single-cell resolution; imaging] b. Descriptive (optional): [Suitable descriptive detail (1 paragraph?)…]
- What analytical activities will the assay be used for within HuBMAP? a. Tags*: [e.g., Azimuth-type references; Map-creation-for-Organs; Data integration; General Characterization Assay] b. Descriptive (optional): […]
- What type of human samples are needed or used? a. Tags*: [e.g., Fresh frozen; FFPE] b. Descriptive (optional): […]
- Commercial Product: https://www.akoyabio.com/phenocycler/assays/ (Note: Not all assay types will have this.)
*Include tags that can be normalized across assays, allowing for assay filtering. When possible, use structured terminology (ontologies).
Assay Description
Primary Reference PMCIDs
Technology Overview
- Approximately 1 paragraph (half-page max).
Key Definitions
- Define and illustrate, as relevant, important terms used in the assay description. Use tables and figures, as relevant. See CODEX for a good example of these types of tables and figures.
Antibodies
- As relevant, include any details about antibody usage that are assay-specific.* Please see the HuBMAP standard report for antibody validation.
Provider-focused section
Directories and Files
Directory structure
- Structure the information as a table, exemplified below.
- When possible, an agreed-upon single assay-specific directory structure should be used rather than allowing for variable directory names with regular expressions (more conducive to downstream Data Consumer use).
- The directory structure should not include files. File definitions should happen in the “Files included” section where files can be more appropriately documented.
- THE FOLLOWING TABLE IS AN EXAMPLE — EDIT AS APPROPRIATE.
Directory Name | Level | Required? | Description |
---|---|---|---|
raw | 0 | yes | The raw, unmodified files coming from the instrument (e.g., Akoya system). [Populated by the data provider] |
lab | 1 | yes | Processed files produced by the lab that generated the data. [Populated by the data provider] |
lab/demultiplexed | 1.1 | yes | The demultiplexed files. |
lab/processed | 1.2 | yes | The output from the primary analysis pipeline. |
hive | 2 | yes | Processed files produced by HIVE using the common pipeline. [Populated by the HIVE] |
hive/processed | 2.1 | yes | Image data that has been stitched and aligned and has undergone background subtraction and deconvolution. |
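A fixed directory structure like the example above is easy to check automatically before upload. The following is a minimal sketch, assuming the directory names from the example table; the function name `missing_directories` and the `REQUIRED_DIRS` list are illustrative and not part of any HuBMAP tooling.

```python
from pathlib import Path

# Directory names copied from the example table above. Edit this list to
# match the agreed-upon structure for your assay.
REQUIRED_DIRS = [
    "raw",
    "lab",
    "lab/demultiplexed",
    "lab/processed",
    "hive",
    "hive/processed",
]


def missing_directories(upload_root, required=REQUIRED_DIRS):
    """Return the required directories that are absent under upload_root."""
    root = Path(upload_root)
    return [name for name in required if not (root / name).is_dir()]
```

Because the "hive" directories are populated by the HIVE, a data provider would likely pass a shorter `required` list (for example, only the "raw" and "lab" entries) when checking an upload.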
Files Included
- Structure the information as a table, exemplified below.
- Files included (outside of the “lab” directory) should be agreed upon by the Assay Team and HIVE.
- When possible, “file types” should include a link to an external definition, as exemplified below.
- When relevant, include a link to the program or pipeline used to generate each file. The program or pipeline used should be detailed in the “pipeline or data processing” metadata section below.
- If the program/pipeline will perform any QA/QC filtering of the data when generating the file, note this in the file description with additional details provided in the “Data processing pipeline” section below.
- Avoid regular expressions in file names unless absolutely necessary (e.g., to denote a batch of files as in a set of fastq files).
- Files containing the metadata should also be included when relevant, for example, the TSV with assay-level metadata, the antibodies TSV, a file with the pipeline parameters, etc.
- *THE FOLLOWING TABLE IS AN EXAMPLE — EDIT AS APPROPRIATE.
File | File type | Directory | Input file or precursor data | Generator program or pipeline with URL | Description |
---|---|---|---|---|---|
*.fastq.gz | fastq | lab/demultiplexed | BCL files from Illumina sequencer | Pipeline level 2: Space Ranger [mkfastq] | Sample-specific fastq files generated by the demultiplexer. These files are compressed with gzip. |
raw_feature_bc_matrix.h5 | HDF5 | lab/processed | fastq | Pipeline level 3: Space Ranger [count] | Unfiltered feature-barcode matrix, including every barcode with at least one read, from list of known-good barcodes. See software documentation. |
filtered_feature_bc_matrix.h5 | HDF5 | lab/processed | fastq | Pipeline level 3: Space Ranger [count] | A filtered feature-barcode matrix, including only tissue-associated barcodes. See software documentation. |
lab-processing.tsv | TSV | lab | n/a | manual | Comprehensive table containing the details of the lab-processing pipeline including all relevant parameters |
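The file table can also serve as a manifest for a simple completeness check. The sketch below assumes the file names and directories from the example table; `EXPECTED_FILES` and `missing_files` are hypothetical names chosen for illustration. The only wildcard is the fastq batch pattern, in line with the guidance to avoid regular expressions where possible.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# (directory, file name or pattern) pairs copied from the example table.
EXPECTED_FILES = [
    ("lab/demultiplexed", "*.fastq.gz"),
    ("lab/processed", "raw_feature_bc_matrix.h5"),
    ("lab/processed", "filtered_feature_bc_matrix.h5"),
    ("lab", "lab-processing.tsv"),
]


def missing_files(upload_root, expected=EXPECTED_FILES):
    """Return the (directory, pattern) pairs that no file in the upload matches."""
    root = Path(upload_root)
    missing = []
    for directory, pattern in expected:
        folder = root / directory
        # A missing directory simply yields no candidate file names.
        names = [p.name for p in folder.iterdir() if p.is_file()] if folder.is_dir() else []
        if not any(fnmatchcase(name, pattern) for name in names):
            missing.append((directory, pattern))
    return missing
```

An empty return value means every entry in the manifest has at least one matching file; it does not validate file contents.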
Metadata
Sample-level
- Any assay-specific considerations for the sample-level metadata should be detailed here. This is a required documentation element. To avoid any confusion, you should explicitly state if there are no assay-specific considerations. Example fields that may warrant assay-level definitions are included below.
Sample field name | Sample type [block; section; suspension] | Definition |
---|---|---|
Processing time | | |
Source storage time | | |
Assay-level
- This is the assay-specific metadata that’s included in the assay metadata TSV files.
- Please include full descriptions.
- Structure the information as a table, exemplified below.
- *THE FOLLOWING TABLE IS AN EXAMPLE — EDIT AS APPROPRIATE.
Field | Required? | Data type | Description |
---|---|---|---|
Visium permeabilization time value | yes | int | Time used for tissue permeabilization during RNA extraction from tissue in Step 1.1 of the Visium protocol. |
Visium permeabilization time unit | yes | categorical | Unit of measure for “Visium permeabilization time value”. |
Visium Staining Type | yes | categorical | Staining used on the accompanying Visium slide for acquisition of tissue morphology information. Standard is hematoxylin and eosin stain (H&E) for basic morphology information imaged in bright-field, but fluorescent probes might be used for mRNA or protein level measurements. |
Amplification PCR Cycles | yes | int | The number of PCR cycles in the amplification step; this varies according to the target number of cells captured. |
Assay-level categorical field values
- Categorical field options should be listed in the following table. *As the list will change over time, please coordinate the categorical lists with the HIVE.
Field [from above] | Values [semicolon separated] |
---|---|
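Field definitions and categorical value lists of this kind can be turned into a small validator for the assay metadata TSV. The sketch below assumes the four example fields above. The allowed values shown are hypothetical placeholders, since the template leaves the categorical table empty and the real lists are coordinated with the HIVE; the function names are likewise illustrative.

```python
import csv
import io


def parse_allowed_values(cell):
    """Split a semicolon-separated 'Values' cell into a set of options."""
    return {item.strip() for item in cell.split(";") if item.strip()}


# Field definitions copied from the example assay-level table.
FIELDS = {
    "Visium permeabilization time value": {"required": True, "type": "int"},
    "Visium permeabilization time unit": {"required": True, "type": "categorical"},
    "Visium Staining Type": {"required": True, "type": "categorical"},
    "Amplification PCR Cycles": {"required": True, "type": "int"},
}

# HYPOTHETICAL value lists, for illustration only.
ALLOWED_VALUES = {
    "Visium permeabilization time unit": parse_allowed_values("minute"),
    "Visium Staining Type": parse_allowed_values("H&E; fluorescent"),
}


def validate_row(row):
    """Return a list of error messages for one metadata row (a dict)."""
    errors = []
    for name, spec in FIELDS.items():
        value = (row.get(name) or "").strip()
        if not value:
            if spec["required"]:
                errors.append(f"{name}: missing required value")
            continue
        if spec["type"] == "int":
            try:
                int(value)
            except ValueError:
                errors.append(f"{name}: expected an integer, got {value!r}")
        elif value not in ALLOWED_VALUES.get(name, set()):
            errors.append(f"{name}: {value!r} is not an allowed value")
    return errors


def validate_tsv(text):
    """Map TSV line numbers to error lists; an empty dict means no errors."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    report = {}
    for row in reader:
        errors = validate_row(row)
        if errors:
            report[reader.line_num] = errors
    return report
```

Keeping the value lists in one place (here, `ALLOWED_VALUES`) makes it easier to update them as the categorical options change over time.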
Antibody
- Link to the Antibody TSV file.
HIVE data processing pipeline
- This section is to be completed by the HIVE.
- All pipeline processing steps should be detailed in the table below, including all parameter values used, as exemplified below.
- A figure should be included, as relevant, to better elucidate the pipeline levels, with each level being fully described in the table.
- Yes/No — The pipeline will produce the processed files from raw without human intervention. If “No”, then the required steps (human interventions) need to be detailed here.
- The processing description should detail any expected pre-processing of the input file(s).
- The description should include what the processing step achieves.
- Any manual interventions should be documented with links to publications, as relevant.
- *THE FOLLOWING TABLE IS AN EXAMPLE — EDIT AS APPROPRIATE.
Level | Program | Version | Source URL | Input file | Command line (including all non-default parameters) | Description |
---|---|---|---|---|---|---|
2 | spaceranger | 6.1.2 | Space Ranger | raw/* | spaceranger mkfastq | 10x Genomics program to demultiplex the sequencing data, generating the fastq files. |
3 | spaceranger | 6.1.2 | Space Ranger | lab/demultiplexed/.fastq | spaceranger count | 10x Genomics program to process Visium data, combining the spatial and genomic |

![[PipelineArrows1-trans1.png | 474x101]]
Lab data processing pipeline
- Each lab that uploads data processed independently of the HIVE should provide the same details as in the above section (“HIVE data processing pipeline”).
- Do not include the lab-processing details here; instead, include the lab-specific processing table with your data upload.
- In the Files section, describe any files that include the details of the lab-processing pipeline.
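One way a lab might record its processing steps is as a TSV that mirrors the columns of the pipeline table in the “HIVE data processing pipeline” section. This is a sketch under assumptions: the template does not prescribe a column layout for lab-processing.tsv, so the `COLUMNS` list and the function `write_processing_table` are illustrative only.

```python
import csv

# Column names assumed to mirror the pipeline table; this layout is NOT
# prescribed by the template.
COLUMNS = [
    "Level",
    "Program",
    "Version",
    "Source URL",
    "Input file",
    "Command line",
    "Description",
]


def write_processing_table(steps, stream):
    """Write pipeline steps (a list of dicts keyed by COLUMNS) as TSV."""
    writer = csv.DictWriter(
        stream, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for step in steps:
        # Require every column so that no parameter is silently omitted.
        missing = [column for column in COLUMNS if column not in step]
        if missing:
            raise ValueError(f"step is missing columns: {missing}")
        writer.writerow(step)
```

A typical call would open the target file with `open("lab/lab-processing.tsv", "w", newline="")` and pass the handle as `stream`, with one dict per processing step.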