Projects metadata file:

The file defines the projects within the analysis directory and associates metadata with each project.

It is initially created when the make_fastqs command completes, and acts as a control file for subsequent operations including:

  • setup_analysis_dirs (used to populate the project directories)

  • publish_qc (only publishes data for projects that are listed in the file)

The file is tab-delimited, with one line for each project and the following fields:


  • Name of the project

  • List of sample names associated with the project

  • Name(s) of the researcher(s) directly associated with the project (User)

  • Library or application type, e.g. RNA-seq, ChIP-seq (Library)

  • Single cell platform used to prepare the samples (SC_platform)

  • Organism(s) that the samples in the project came from, e.g. Human, Mouse, D. Melanogaster (Organism)

  • Name(s) of the principal investigator(s) (PIs) associated with the project (PI)

  • Free text field for any additional comments

The project name and sample names are normally added automatically; the remaining fields must be populated manually by editing the file. The following fields are compulsory for setup_analysis_dirs:

  • User

  • Library

  • Organism

  • PI
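
The check on compulsory fields can be sketched as below. This is only an illustration and not the tool's own code: the function name and the dict-per-project layout are hypothetical, and it assumes that a field counts as unset when it is absent, empty, or holds the null marker (a full stop).

```python
# Compulsory fields for setup_analysis_dirs, as listed above
COMPULSORY_FIELDS = ("User", "Library", "Organism", "PI")


def missing_compulsory_fields(project):
    """Return the compulsory fields that are unset for one project.

    `project` is a hypothetical dict mapping field names to the raw
    string values read from the file. Assumption: a field is unset
    if it is absent, empty, or holds the null marker ".".
    """
    missing = []
    for field in COMPULSORY_FIELDS:
        value = project.get(field, ".").strip()
        if value in ("", "."):
            missing.append(field)
    return missing
```

Running a check like this over every project before calling setup_analysis_dirs would show which lines of the file still need manual editing.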

The following conventions are used for data item syntax:

  • “Null” values are represented by a full stop (i.e. .)

  • Unknown values should be represented with a question mark (i.e. ?)

  • Where there are multiple users, PIs or organisms, they should be separated by comma characters (e.g. Human,mouse)

  • Projects preceded by a comment character (i.e. #) are ignored
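
The conventions above can be illustrated with a minimal parsing sketch. It rests on several assumptions: the columns are taken to appear in the order in which the fields are described earlier; the keys "Project", "Samples" and "Comments" are illustrative names only; empty fields are treated like null values; and the sample list is left as a raw string because its separator is not specified here. The function names are hypothetical and are not part of the tool.

```python
# Column keys, ASSUMED to follow the order in which the fields are described.
# "Project", "Samples" and "Comments" are illustrative names only.
COLUMNS = ("Project", "Samples", "User", "Library",
           "SC_platform", "Organism", "PI", "Comments")

# Fields that may hold several comma-separated values
MULTI_VALUED = {"User", "Organism", "PI"}


def parse_value(raw, multi=False):
    """Interpret one raw field using the syntax conventions."""
    raw = raw.strip()
    if raw in ("", "."):
        return None      # null value (empty treated as null: assumption)
    if raw == "?":
        return "?"       # unknown value, kept as-is
    if multi:
        return [item.strip() for item in raw.split(",")]
    return raw


def parse_projects(text):
    """Return one dict per project line, skipping commented-out lines."""
    projects = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue     # blank line, or project disabled with '#'
        raw_values = line.split("\t")
        # Pad short lines with null markers so every column has a value
        raw_values += ["."] * (len(COLUMNS) - len(raw_values))
        projects.append({
            name: parse_value(raw, multi=name in MULTI_VALUED)
            for name, raw in zip(COLUMNS, raw_values)
        })
    return projects
```

Keeping "?" distinct from a null value preserves the difference between a field that does not apply and one whose value is simply not yet known.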

Currently there are no canonical lists of allowed values for libraries or organism names, however:

  • Organism name(s) are used to look up appropriate reference data files for the QC pipeline in the configuration file, so names that are used should be consistent with the configuration;

  • Some combinations of single cell platform and library type are used to determine the appropriate QC protocol for the data, in which case the single cell platform and library values must be a valid combination (see Single cell and spatial data below).


When looking up names in the configuration, organism names are converted to lower case and non-alphabetic characters are converted to underscores (_). For example, Mouse is converted to mouse and D. Melanogaster is converted to d_melanogaster.
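
The name conversion can be sketched as follows. The function name is hypothetical, and the sketch assumes that a run of consecutive non-alphabetic characters collapses to a single underscore, which is what the D. Melanogaster example implies; the tool's actual behaviour may differ in edge cases.

```python
import re


def organism_lookup_key(name):
    """Convert an organism name to the form used for configuration lookup.

    Assumption: each run of non-alphabetic characters becomes ONE
    underscore, consistent with the D. Melanogaster example.
    """
    return re.sub(r"[^a-z]+", "_", name.strip().lower())
```

Under these assumptions, organism_lookup_key("D. Melanogaster") gives d_melanogaster, so the configuration entries should use the converted form of each name.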

Single cell and spatial data

For single cell and spatial data, the values in the library type (Library) and single cell platform (SC_platform) fields have special significance.

For more information on the valid options for specific types of data see:


Use . for the single cell platform for projects that don’t have single cell or spatial data.