Consumer-focused section

  1. Date (published): [Month D, YYYY]
  2. Version (of this document): [vX.0]
  3. Authors: [TMC contact and (specific) HuBMAP Assay team]
  4. What is being measured?
     a. Tags*: [e.g., Proteins; single-cell resolution; imaging]
     b. Descriptive (optional): [Suitable descriptive detail (1 paragraph?)…]
  5. What analytical activities will the assay be used for within HuBMAP?
     a. Tags*: [e.g., Azimuth-type references; Map-creation-for-Organs; Data integration; General Characterization Assay]
     b. Descriptive (optional): […]
  6. What type of human samples are needed or used?
     a. Tags*: [e.g., Fresh frozen; FFPE]
     b. Descriptive (optional): […]
  7. Commercial Product (note: not all assay types will have this).

  *Include tags that can be normalized across assays, allowing for assay filtering. When possible, use structured terminology (ontologies).

Assay Description

Primary Reference PMCIDs

Technology Overview

  • Approximately 1 paragraph (half-page max).

Key Definitions

  • Define and illustrate, as relevant, important terms used in the assay description. Use tables and figures, as relevant. See CODEX for a good example of these types of tables and figures.


Provider-focused section

Directories and Files

Directory structure

  • Structure the information as a table, exemplified below.
  • When possible, an agreed-upon single assay-specific directory structure should be used rather than allowing for variable directory names with regular expressions (more conducive to downstream Data Consumer use).
  • The directory structure should not include files. File definitions should happen in the “Files included” section where files can be more appropriately documented.
Directory Name | Level | Required? | Description
raw | 0 | yes | The raw, unmodified files coming from the instrument (e.g., Akoya system). [Populated by the data provider]
lab | 1 | yes | Processed files produced by the lab that generated the data. [Populated by the data provider]
lab/demultiplexed | 1.1 | yes | The demultiplexed files.
lab/processed | 1.2 | yes | The output from the primary analysis pipeline.
hive | 2 | yes | Processed files produced by HIVE using the common pipeline. [Populated by the HIVE]
hive/processed | 2.1 | yes | Image data that has been stitched and aligned and has undergone background subtraction and deconvolution.
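
A data provider could check an upload against the directory table before submission. The following is a minimal sketch in Python, assuming the example layout above; the function name `missing_directories` and the `REQUIRED_DIRECTORIES` list are hypothetical and are not part of any HuBMAP tooling.

```python
from pathlib import Path

# Required directories copied from the example table above.
# An actual assay would substitute its own agreed-upon structure.
REQUIRED_DIRECTORIES = [
    "raw",
    "lab",
    "lab/demultiplexed",
    "lab/processed",
    "hive",
    "hive/processed",
]


def missing_directories(upload_root, required=REQUIRED_DIRECTORIES):
    """Return the required directories that are absent under upload_root."""
    root = Path(upload_root)
    return [name for name in required if not (root / name).is_dir()]
```

Calling `missing_directories` on an upload root returns an empty list when every required directory is present. Because the names are fixed strings rather than regular expressions, the check stays a simple existence test.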

Files Included

  • Structure the information as a table, exemplified below.
  • Files included (outside of the “lab” directory) should be agreed upon by the Assay Team and HIVE.
  • When possible, “file types” should include a link to an external definition, as exemplified below.
  • When relevant, include a link to the program or pipeline used to generate each file. The program or pipeline used should be detailed in the “pipeline or data processing” metadata section below.
  • If the program/pipeline will perform any QA/QC filtering of the data when generating the file, note this in the file description with additional details provided in the “Data processing pipeline” section below.
  • Avoid regular expressions in file names unless absolutely necessary (e.g., to denote a batch of files as in a set of fastq files).
  • Files containing the metadata should also be included when relevant, for example, the TSV with assay-level metadata, the antibodies TSV, a file with the pipeline parameters, etc.
File | File type | Directory | Input file or precursor data | Generator program or pipeline with URL | Description
*.fastq.gz | fastq | lab/demultiplexed | BCL files from Illumina sequencer | Pipeline level 2: Space Ranger [mkfastq] | Sample-specific fastq files generated by the demultiplexer. These files are compressed with gzip.
raw_feature_bc_matrix.h5 | HDF5 | lab/processed | fastq | Pipeline level 3: Space Ranger [count] | Unfiltered feature-barcode matrix, including every barcode with at least one read from the list of known-good barcodes. See software documentation.
filtered_feature_bc_matrix.h5 | HDF5 | lab/processed | fastq | Pipeline level 3: Space Ranger [count] | A filtered feature-barcode matrix, including only tissue-associated barcodes. See software documentation.
lab-processing.tsv | TSV | lab | n/a | manual | Comprehensive table containing the details of the lab-processing pipeline, including all relevant parameters.
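
The file table can also be turned into a simple presence check. This is a sketch only, assuming the example rows above; `EXPECTED_FILES` and `missing_files` are hypothetical names, and the only wildcard used is the one the table itself uses for the batch of fastq files.

```python
from pathlib import Path

# (directory, file name pattern) pairs copied from the example table above.
# The "*" wildcard denotes a batch of files, as in the fastq row.
EXPECTED_FILES = [
    ("lab/demultiplexed", "*.fastq.gz"),
    ("lab/processed", "raw_feature_bc_matrix.h5"),
    ("lab/processed", "filtered_feature_bc_matrix.h5"),
    ("lab", "lab-processing.tsv"),
]


def missing_files(upload_root, expected=EXPECTED_FILES):
    """Return the (directory, pattern) pairs that have no matching file."""
    root = Path(upload_root)
    missing = []
    for directory, pattern in expected:
        folder = root / directory
        matches = []
        if folder.is_dir():
            # Keep regular files only; a directory with a matching name does not count.
            matches = [path for path in folder.glob(pattern) if path.is_file()]
        if not matches:
            missing.append((directory, pattern))
    return missing
```

An empty result means every expected file, or at least one file of each batch, was found in its documented directory.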



  • Any assay-specific considerations for the sample-level metadata should be detailed here. This is a required documentation element. To avoid any confusion, you should explicitly state if there are no assay-specific considerations. Example fields that may warrant assay-level definitions are included below.
Sample field name | Sample type [block; section; suspension]
Processing time |
Source storage time |


  • This is the assay-specific metadata that’s included in the assay metadata TSV files.
  • Please include full descriptions.
  • Structure the information as a table, exemplified below.
Field | Required? | Data type | Description
Visium permeabilization time value | yes | int | Time used for tissue permeabilization during RNA extraction from tissue in Step 1.1 of the Visium protocol.
Visium permeabilization time unit | yes | categorical | Unit of measure for “Visium permeabilization time value”.
Visium Staining Type | yes | categorical | Staining used on the accompanying Visium slide for acquisition of tissue morphology information. Standard is hematoxylin and eosin stain (H&E) for basic morphology information imaged in bright-field, but fluorescent probes might be used for mRNA or protein level measurements.
Amplification PCR Cycles | yes | int | The number of PCR cycles in the amplification step; this varies according to the target number of cells captured.
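
A field table like the one above lends itself to automated checking of the assay metadata TSV. The sketch below assumes, purely for illustration, that the TSV column headers match the field names in the table exactly; `FIELD_SPECS`, `validate_row`, and `validate_tsv` are hypothetical names. Categorical values are not checked here, since those lists are coordinated separately with the HIVE.

```python
import csv
import io

# Field specifications copied from the example table above:
# field name -> (required, data type).
FIELD_SPECS = {
    "Visium permeabilization time value": (True, "int"),
    "Visium permeabilization time unit": (True, "categorical"),
    "Visium Staining Type": (True, "categorical"),
    "Amplification PCR Cycles": (True, "int"),
}


def validate_row(row, specs=FIELD_SPECS):
    """Return a list of problems found in one metadata row (a dict)."""
    problems = []
    for field, (required, data_type) in specs.items():
        value = (row.get(field) or "").strip()
        if not value:
            if required:
                problems.append(f"{field}: required value is missing")
            continue
        if data_type == "int":
            try:
                int(value)
            except ValueError:
                problems.append(f"{field}: expected an integer, got {value!r}")
    return problems


def validate_tsv(text, specs=FIELD_SPECS):
    """Validate every data row of a TSV string; return {row number: problems}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    report = {}
    for number, row in enumerate(reader, start=1):
        problems = validate_row(row, specs)
        if problems:
            report[number] = problems
    return report
```

Rows are numbered from 1, not counting the header line, and only rows with at least one problem appear in the report.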

Assay-level categorical field values

  • Categorical field options should be listed in the following table. *As the list will change over time, please coordinate the categorical lists with the HIVE.
Field [from above] | Values [semicolon separated]
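
Since the values are listed as a semicolon-separated string, a consumer needs to split them before checking a metadata entry. A minimal sketch, with hypothetical function names, using the sample-type options that appear earlier in this template as the example input:

```python
def parse_categorical_values(cell):
    """Split a semicolon-separated cell into a list of trimmed values."""
    return [value.strip() for value in cell.split(";") if value.strip()]


def is_allowed(value, cell):
    """Check a metadata value against the semicolon-separated options."""
    return value.strip() in parse_categorical_values(cell)


# Example input taken from the sample-type options in this template.
SAMPLE_TYPES = "block; section; suspension"
print(parse_categorical_values(SAMPLE_TYPES))
print(is_allowed("section", SAMPLE_TYPES))
```

The comparison is case-sensitive, which is a design choice of this sketch and not a stated HuBMAP requirement.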


HIVE data processing pipeline

  • This section is to be completed by the HIVE.
  • All pipeline processing steps should be detailed in the table below, including all parameter values used, as exemplified below.
  • A figure should be included, as relevant, to better elucidate the pipeline levels, with each level being fully described in the table.
  • Yes/No: The pipeline will produce the processed files from raw without human intervention. If “No”, the required steps (human interventions) need to be detailed here.
  • The description should detail any expected pre-processing of the input file(s).
  • The description should include what the processing step achieves.
  • Any manual interventions should be documented with links to publications, as relevant.
Level | Program | Version | Source URL | Input file | Command line (including all non-default parameters) | Description
2 | spaceranger | 6.1.2 | Space Ranger | raw/* | spaceranger mkfastq | 10x Genomics program to demultiplex the sequencing data, generating the fastq files.
3 | spaceranger | 6.1.2 | Space Ranger | lab/demultiplexed/*.fastq | spaceranger count | 10x Genomics program to process Visium data, combining the spatial and genomic

[Figure: PipelineArrows1-trans1.png]


Lab data processing pipeline

  • The same details as provided in the above section (“HIVE data processing pipeline”) should be detailed for each lab that uploads data that is processed independently from the HIVE.
  • Do not include the lab-processing details here; instead, include the lab-specific processing table with your data upload.
  • In the Files section, describe any files that include the details of the lab-processing pipeline.
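
One way a lab might produce such a file (for example, the lab-processing.tsv listed in the Files section) is to serialize its pipeline steps with the same columns as the HIVE pipeline table. This is a sketch under that assumption; the column layout of lab-processing.tsv is not specified in this template, and `COLUMNS` and `pipeline_steps_to_tsv` are hypothetical names.

```python
import csv
import io

# Column names copied from the table in the
# "HIVE data processing pipeline" section (an assumed layout for the lab file).
COLUMNS = [
    "Level",
    "Program",
    "Version",
    "Source URL",
    "Input file",
    "Command line (including all non-default parameters)",
    "Description",
]


def pipeline_steps_to_tsv(steps):
    """Serialize a list of step dicts to TSV text, one row per step.

    Columns missing from a step are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for step in steps:
        writer.writerow(step)
    return buffer.getvalue()
```

Each step is a dictionary keyed by column name, for instance `{"Level": "2", "Program": "spaceranger", "Input file": "raw/*"}`; the returned text can then be written to the lab directory.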