auto_process_ngs.commands.archive_cmd

auto_process_ngs.commands.archive_cmd.archive(ap, archive_dir=None, platform=None, year=None, perms=None, group=None, include_bcl2fastq=False, read_only_fastqs=True, runner=None, final=False, force=False, dry_run=False)

Copy an analysis directory and contents to an archive area

Copies the contents of the analysis directory to an archive area, which can be on a local or remote system.

The archive directory is constructed in the form

<TOP_DIR>/<YEAR>/<PLATFORM>/<DIR>/…

The YEAR and PLATFORM levels can be overridden using the ‘year’ and ‘platform’ arguments.
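The layout above can be illustrated with a minimal sketch. The helper ‘build_archive_path’ is hypothetical (it is not part of auto_process_ngs) and only shows how the levels combine, assuming a missing year falls back to the current year as described for the ‘year’ argument:

```python
import os
import time

def build_archive_path(top_dir, platform, dirname, year=None):
    """Hypothetical helper: join the archive levels into one path."""
    if year is None:
        # Assumed fallback: use the current 4-digit year
        year = time.strftime("%Y")
    return os.path.join(top_dir, str(year), platform, dirname)

# Example values are made up for illustration
print(build_archive_path("/archive", "novaseq", "240101_RUN_analysis",
                         year="2024"))
```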

By default the data is copied to a ‘staging’ directory called ‘__ANALYSIS_DIR.pending’ in the archive directory. The archiving can be finalised by setting the ‘final’ argument to ‘True’, which performs a last update of the staging area before moving the data to its final location.

Once the archive has been finalised any further archiving attempts will be refused.
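The staging and finalisation behaviour can be sketched on a local filesystem as follows. This is an illustration only: ‘staging_path’ and ‘finalise’ are hypothetical names, the sketch assumes that ANALYSIS_DIR stands for the name of the analysis directory, and the real function may signal a refused attempt differently (for example through its return code rather than an exception).

```python
import os
import tempfile

def staging_path(final_path):
    """Return the assumed staging location next to the final one."""
    parent, name = os.path.split(final_path)
    return os.path.join(parent, "__%s.pending" % name)

def finalise(final_path):
    """Move the staging directory to its final location.

    Refuses if the final directory already exists.
    """
    if os.path.exists(final_path):
        raise RuntimeError("archive already finalised: %s" % final_path)
    os.rename(staging_path(final_path), final_path)

# Demonstration in a throwaway directory
top = tempfile.mkdtemp()
final = os.path.join(top, "240101_RUN_analysis")
os.mkdir(staging_path(final))   # stands in for the staged copy
finalise(final)                 # first call succeeds
try:
    finalise(final)             # second call is refused
except RuntimeError as err:
    print("refused:", err)
```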

The data is copied using ‘rsync’. Repeated archive operations mirror the contents of the analysis directory, so any data removed from the source is also removed from the archive.
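Mirroring of this kind is what rsync's ‘--delete’ option provides. The sketch below builds such a command line; the exact options that auto_process_ngs passes to rsync are not stated here, so the option set and the ‘mirror_command’ helper are assumptions for illustration.

```python
def mirror_command(src, dest, dry_run=False):
    """Build an rsync command line that mirrors src into dest."""
    # --archive preserves the tree; --delete removes files from dest
    # that no longer exist in src
    cmd = ["rsync", "--archive", "--delete"]
    if dry_run:
        # Report what would change without transferring anything
        cmd.append("--dry-run")
    # Trailing slash: copy the contents of src, not src itself
    cmd.extend([src.rstrip("/") + "/", dest])
    return cmd

# Example paths are made up for illustration
print(" ".join(mirror_command(
    "/data/240101_RUN_analysis",
    "user@host:/archive/__240101_RUN_analysis.pending")))
```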

By default the ‘bcl2fastq’ directory is omitted from the archive, unless the fastq files in any project are symlinks to the data. Inclusion of this directory can be forced by setting ‘include_bcl2fastq’ to ‘True’.
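The inclusion rule can be expressed as a short check. This is a sketch rather than the library's own code: both helper names are hypothetical, and it assumes fastq files can be recognised by a ‘.fastq’ or ‘.fastq.gz’ suffix.

```python
import os

def has_symlinked_fastqs(project_dir):
    """Return True if any fastq under project_dir is a symlink."""
    for dirpath, _dirnames, filenames in os.walk(project_dir):
        for name in filenames:
            if name.endswith((".fastq", ".fastq.gz")):
                if os.path.islink(os.path.join(dirpath, name)):
                    return True
    return False

def should_include_bcl2fastq(project_dirs, include_bcl2fastq=False):
    """Decide whether the 'bcl2fastq' directory goes into the archive."""
    # Forced inclusion wins; otherwise include only when some project
    # holds symlinked fastqs
    return include_bcl2fastq or any(
        has_symlinked_fastqs(d) for d in project_dirs)
```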

By default the fastqs are made read-only in the archive (see the ‘read_only_fastqs’ argument).
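For a local file, making it read-only amounts to clearing the write bits. The helper below is a hypothetical sketch of that step; how auto_process_ngs applies it, particularly on a remote archive, is not described here.

```python
import os
import stat

def make_read_only(path):
    """Remove write permission for user, group and others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    os.chmod(path, mode & ~write_bits)
```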

Parameters:
  • ap (AutoProcessor) – autoprocessor pointing to the analysis directory to be archived

  • archive_dir (str) – top level archive directory, of the form ‘[[user@]host:]dir’ (if not set then use the value from the auto_process.ini file).

  • platform (str) – set the value of the <PLATFORM> level in the archive (if not set then taken from the supplied autoprocessor instance).

  • year (str) – set the value of the <YEAR> level in the archive, as a 4-digit year (if not set then defaults to the current year).

  • perms (str) – change the permissions of the destination files and directories according to the supplied argument (e.g. ‘g+w’) (if not set then use the value from the auto_process.ini file).

  • group (str) – set the group of the destination files to the supplied argument (if not set then use the value from the auto_process.ini file).

  • include_bcl2fastq (bool) – if True then force inclusion of the ‘bcl2fastq’ subdirectory; otherwise only include it if fastq files in project subdirectories are symlinks.

  • read_only_fastqs (bool) – if True then make the fastqs read-only in the destination directory; otherwise keep the original permissions.

  • runner – (optional) specify a non-default job runner to use for primary data rsync

  • final (bool) – if True then finalize the archive by moving the ‘.pending’ temporary archive to the final location

  • force (bool) – if True then do the archiving even if there are errors (e.g. key metadata items not set, permission errors when setting the group, etc.); otherwise abort the archiving operation.

  • dry_run (bool) – report what would be done but don’t perform any operations.
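The ‘[[user@]host:]dir’ form accepted by ‘archive_dir’ can be broken into its parts as in the sketch below. The function ‘split_archive_dir’ is hypothetical and written only for illustration; it assumes that any colon in the value separates the host from the directory, so a local path containing a colon would be misread.

```python
def split_archive_dir(archive_dir):
    """Split '[[user@]host:]dir' into (user, host, dir)."""
    user = host = None
    location = archive_dir
    if ":" in archive_dir:
        # Everything before the first colon names the remote system
        remote, location = archive_dir.split(":", 1)
        if "@" in remote:
            user, host = remote.split("@", 1)
        else:
            host = remote
    return user, host, location

for spec in ("/archive", "host:/archive", "user@host:/archive"):
    print(spec, "->", split_archive_dir(spec))
```

A value with no host part yields ‘None’ for both user and host, which is how the sketch marks a local archive.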

Returns:

0 on successful termination; non-zero if an error occurred.

Return type:

UNIX-style integer returncode
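Because the return value follows the UNIX convention, a caller can chain a staging pass and a finalising pass and stop at the first failure. The wrapper below is a sketch: ‘stage_then_finalise’ is a hypothetical name, and the archive function is passed in as ‘archive_func’ so the example does not depend on auto_process_ngs being installed. In real use ‘archive_func’ would be the ‘archive’ function documented here and ‘ap’ an AutoProcessor instance.

```python
def stage_then_finalise(archive_func, ap, **kwargs):
    """Run a staging pass, then finalise only if staging returned 0.

    kwargs are passed through unchanged and must not include 'final'.
    """
    status = archive_func(ap, final=False, **kwargs)
    if status != 0:
        # Non-zero means an error occurred: do not finalise
        return status
    return archive_func(ap, final=True, **kwargs)
```

The result can be handed straight to ‘sys.exit’ in a command-line script so that the exit status reflects the outcome.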