
FAQ

This section collects troubleshooting advice for problems I ran into when I started with bioinformatics and with testing WtP.

Input fasta-file size

  • We encountered very long WtP runs when the input file exceeds a certain size (see this issue). WtP then seems to be stuck, probably because some tools have very long run times on large inputs.
  • Our fasta.gz test files are below 40 MB in size.
  • If your fasta files exceed 80-100 MB, please split them into smaller chunks.
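One way to split a large multi-fasta file is sketched below. It is a minimal example, not part of WtP: the function name split_fasta and the output naming scheme are made up for illustration, and it assumes an uncompressed, well-formed fasta file (decompress a fasta.gz first, e.g. with gzip -dc). Records are never cut in half, so a chunk can exceed the limit by up to one record.

```shell
# split_fasta FILE MAX_BYTES PREFIX   (hypothetical helper, not part of WtP)
# Writes PREFIX_001.fa, PREFIX_002.fa, ... and starts a new chunk at the first
# record header seen after the current chunk has reached MAX_BYTES.
split_fasta() {
  awk -v max="$2" -v prefix="$3" '
    /^>/ && (n == 0 || size >= max) {
      if (out != "") close(out)
      n++
      out = sprintf("%s_%03d.fa", prefix, n)
      size = 0
    }
    out != "" {
      print > out
      size += length($0) + 1   # +1 for the newline
    }
  ' "$1"
}

# Example (commented out): chunks of roughly 80 MB
#   split_fasta contigs.fa 80000000 contigs_chunk
```

Each resulting chunk can then be passed to WtP as its own --fasta input.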

Problems with storage while running WtP

WtP produces temporary data. Depending on your input file, this temporary data can take up many GB of storage space after several WtP runs. By default, all temporary data files are stored in /tmp/nextflow-phage-$USER, i.e. when you restart your system, the temporary data files are removed.

For users who run WtP on a cluster and can't restart it, we have the --work-dir flag.
This flag changes the storage location for the temporary workflow files.
With the flag --work-dir work, a folder named work is created in your current working directory.
All of WtP's temporary workflow files are stored in this directory.
These files can become large; remove them with sudo rm -r current_working_dir/work* once they are no longer needed.

--work-dir /path/to/dir    # defines the path where nextflow writes temporary files, default: '/tmp/nextflow-phage-$USER'
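The check-and-clean routine described above could look like the following sketch. The helper names workdir_usage and clean_workdir are hypothetical and not part of WtP; the example invocation in the comments only combines flags already shown in this FAQ. If the files in the work directory are owned by root, which may happen when running with docker, the removal may need sudo as in the command above.

```shell
# Hypothetical helpers, not part of WtP.

# Print the disk usage of a work directory (human readable), or 0 if it is missing.
workdir_usage() {
  if [ -d "$1" ]; then du -sh "$1" | cut -f1; else echo 0; fi
}

# Remove a work directory; refuses empty paths and "/" to avoid accidents.
clean_workdir() {
  case "$1" in
    ""|"/") echo "refusing to remove '$1'" >&2; return 1 ;;
  esac
  [ -d "$1" ] && rm -r -- "$1"
  return 0
}

# Example (commented out so the block has no side effects):
#   ./phage.nf --fasta /path/to/file.fa --cores 8 -profile local,docker --work-dir work
#   workdir_usage work
#   clean_workdir work
```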

Docker images needed and their size

# check the size of the docker images that are needed
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"
REPOSITORY                    TAG              SIZE
multifractal/template_pandas  v3.8.p           817MB
nanozoo/rmarkdown             2.10--a3f4088    3.72GB
papanikos/marvel              0.2-29b3c73      6.42GB
papanikos/virsorter-2         2.2.1--fa935f8   1.19GB
multifractal/seeker           0.1              1.66GB
multifractal/phigaro          0.5.2            2.6GB
nanozoo/seqkit                0.13.2--cd66104  469MB
multifractal/virnet-hack      0.1              1.62GB
nanozoo/emboss                6.6.0--418c521   1.07GB
multifractal/ppr-meta         0.3.1            5.3GB
multifractal/virfinder        0.2              3.91GB
multifractal/vibrant          0.5              1.42GB
nanozoo/sourmash              3.4.1--16a8db7   788MB
nanozoo/hmmer                 3.3--3db9dd1     484MB
nanozoo/checkv                0.6.0--e97f45e   1.72GB
nanozoo/altair                4.1.0--086b80e   1.02GB
nanozoo/samtools              1.9--76b9270     487MB
multifractal/virsorter        0.1.2            2.84GB
nanozoo/r_fungi               0.1--097b1bb     3.11GB
nanozoo/template              3.8--ccd0653     681MB
nanozoo/basics                1.0--962b907     79.1MB
nanozoo/upsetr                1.4.0--0ea25b3   3.21GB
nanozoo/r_ggplot2             0.1--6405f6d     3.26GB
multifractal/metaphinder      0.1              767MB
multifractal/deepvirfinder    0.1              2.37GB
nanozoo/prodigal              2.6.3--2769024   531MB
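To get a rough idea of the total, the sizes in such a listing can be added up. The sketch below is a hypothetical helper (total_image_size is not a docker or WtP command) that reads the listing from standard input and assumes the size is the last column, given in kB, MB or GB. Docker images can share layers, so the real disk usage may be lower than this sum.

```shell
# total_image_size: sum the SIZE column (last field) of a docker image listing.
# Hypothetical helper; lines whose last field is not a size (e.g. the header) are skipped.
total_image_size() {
  awk '
    $NF ~ /GB$/ { sub(/GB$/, "", $NF); mb += $NF * 1000; next }
    $NF ~ /MB$/ { sub(/MB$/, "", $NF); mb += $NF; next }
    $NF ~ /kB$/ { sub(/kB$/, "", $NF); mb += $NF / 1000; next }
    END { printf "%.2f GB\n", mb / 1000 }
  '
}

# Example (commented out):
#   docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}" | total_image_size
```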

Installing WtP centrally for multiple users

Run WtP on a cluster environment:

  • Let the users run WtP from their individual accounts (e.g., via an ssh connection).
  • Create shared locations in which to store the singularity images, the docker images, the databases and the cache.
  • Each WtP run (from any user) should use the shared locations, but the default usage forces users to specify the params --workdir, --databases and --cache themselves.
  • The shared locations should be transparent to the users.

Quick Solution:

  • "install" WtP via git clone --branch v1.2.0 https://github.com/replikation/What_the_Phage.git
  • then change the nextflow config and let the users run this git clone (the version is then fixed to the cloned release)
  • e.g.:
# --fasta:   fasta file to use as input
# --cores:   number of cores you want to use
# -profile:  environment, here local and docker
./phage.nf \
  --fasta /path/to/file.fa \
  --cores 8 \
  -profile local,docker
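As an alternative to editing the config, an admin could hide the shared locations behind a small wrapper. The following is only a sketch under assumptions: the /shared/wtp/... paths and the function name wtp_shared_cmd are hypothetical, and the flag names --databases and --cache are copied from the list above and should be verified against your WtP version. The function prints the command instead of running it, so the sketch has no side effects.

```shell
# Hypothetical shared locations chosen by the admin; adjust to your cluster.
WTP_DIR="${WTP_DIR:-/shared/wtp/What_the_Phage}"   # the git clone of WtP
WTP_DB="${WTP_DB:-/shared/wtp/databases}"
WTP_CACHE="${WTP_CACHE:-/shared/wtp/cache}"

# Print the WtP command for one input file, with the shared locations filled in.
# Usage: wtp_shared_cmd FASTA [CORES]   (CORES defaults to 8)
# Flag names are taken from this FAQ; verify them against your WtP version.
wtp_shared_cmd() {
  printf '%s --fasta %s --cores %s -profile local,docker --databases %s --cache %s\n' \
    "$WTP_DIR/phage.nf" "$1" "${2:-8}" "$WTP_DB" "$WTP_CACHE"
}
```

Users then only supply their input file and core count, while the databases and the cache always point to the shared locations.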

Error ignored:

  • Some tools can fail during the identification process, e.g. if a tool can't process a fasta input file (sometimes MARVEL with really large multi-fasta files). In that case you will get an error message like this:
[34/31d18f] NOTE: Process `identify_fasta_MSF:upsetr_plot (1)` terminated with an error exit status (1) -- Error is ignored
  • Even if an error occurs, WtP continues its work, but won't include the results of the failed process.
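To see which processes were skipped in a run, the messages can be filtered from the console output. This is a minimal sketch that assumes you saved the console output to a file (for example by piping the run through tee); the function name list_ignored_processes and the file name wtp_run.log are hypothetical.

```shell
# list_ignored_processes LOGFILE   (hypothetical helper, not part of WtP)
# Prints the name of every process whose error was ignored, one per line.
list_ignored_processes() {
  grep -F 'Error is ignored' "$1" |
    sed -E 's/.*Process .(.*). terminated with.*/\1/' |
    sort -u
}

# Example (commented out):
#   ./phage.nf --fasta /path/to/file.fa --cores 8 -profile local,docker | tee wtp_run.log
#   list_ignored_processes wtp_run.log
```

The results of the listed processes are the ones missing from the final output.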

Chromomap issues: terminated with an error exit status (1)

In the annotation process of WtP, a gene map of the annotated/identified genes is created. While this gene map is being created, the following error can sometimes occur:

[4e/57d82d] NOTE: Process 'phage_annotation_MSF:chromomap (1)' terminated with an error exit status (1) -- Execution is retried (1)

The workflow works as intended. We included a few "failsafe" processes for chromomap: if one plotting approach fails, WtP retries with another approach to render the map.
You will therefore see the failure message, but the retry will work, and in the end you will get the result.


Singularity image problems

Sometimes singularity runs fail because some singularity images fail during their build process.

Download Databases manually

  • Usually there is no problem with the database download.
  • In case there is a problem, you can download the databases manually here