raw download copy
cutadapt version 3.5

Copyright (C) 2010-2021 Marcel Martin <[email protected]>

cutadapt removes adapter sequences from high-throughput sequencing reads.

    cutadapt -a ADAPTER [options] [-o output.fastq] input.fastq

For paired-end reads:
    cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

Replace "ADAPTER" with the actual sequence of your 3' adapter. IUPAC wildcard
characters are supported. All reads from input.fastq will be written to
output.fastq with the adapter sequence removed. Adapter matching is
error-tolerant. Multiple adapter sequences can be given (use further -a
options), but only the best-matching adapter will be removed.

Input may also be in FASTA format. Compressed input and output is supported and
auto-detected from the file name (.gz, .xz, .bz2). Use the file name '-' for
standard input/output. Without the -o option, output is sent to standard output.


Marcel Martin. Cutadapt removes adapter sequences from high-throughput
sequencing reads. EMBnet.Journal, 17(1):10-12, May 2011.

Run "cutadapt --help" to see all command-line options.
See https://cutadapt.readthedocs.io/ for full documentation.

  -h, --help            Show this help message and exit
  --version             Show version number and exit
  --debug               Print debug log. Use twice to also print DP matrices
  -j CORES, --cores CORES
                        Number of CPU cores to use. Use 0 to auto-detect.
                        Default: 1

Finding adapters:
  Parameters -a, -g, -b specify adapters to be removed from each read (or from
  R1 if data is paired-end. If specified multiple times, only the best
  matching adapter is trimmed (but see the --times option). Use notation
  'file:FILE' to read adapter sequences from a FASTA file.

  -a ADAPTER, --adapter ADAPTER
                        Sequence of an adapter ligated to the 3' end (paired
                        data: of the first read). The adapter and subsequent
                        bases are trimmed. If a '$' character is appended
                        ('anchoring'), the adapter is only found if it is a
                        suffix of the read.
  -g ADAPTER, --front ADAPTER
                        Sequence of an adapter ligated to the 5' end (paired
                        data: of the first read). The adapter and any preceding
                        bases are trimmed. Partial matches at the 5' end are
                        allowed. If a '^' character is prepended ('anchoring'),
                        the adapter is only found if it is a prefix of the read.
  -b ADAPTER, --anywhere ADAPTER
                        Sequence of an adapter that may be ligated to the 5' or
                        3' end (paired data: of the first read). Both types of
                        matches as described under -a and -g are allowed. If the
                        first base of the read is part of the match, the
                        behavior is as with -g, otherwise as with -a. This
                        option is mostly for rescuing failed library
                        preparations - do not use if you know which end your
                        adapter was ligated to!
  -e E, --error-rate E, --errors E
                        Maximum allowed error rate (if 0 <= E < 1), or absolute
                        number of errors for full-length adapter match (if E is
                        an integer >= 1). Error rate = no. of errors divided by
                        length of matching region. Default: 0.1 (10%)
  --no-indels           Allow only mismatches in alignments. Default: allow both
                        mismatches and indels
  -n COUNT, --times COUNT
                        Remove up to COUNT adapters from each read. Default: 1
                        Require MINLENGTH overlap between read and adapter for
                        an adapter to be found. Default: 3
                        Interpret IUPAC wildcards in reads. Default: False
  -N, --no-match-adapter-wildcards
                        Do not interpret IUPAC wildcards in adapters.
  --action {trim,retain,mask,lowercase,none}
                        What to do if a match was found. trim: trim adapter and
                        up- or downstream sequence; retain: trim, but retain
                        adapter; mask: replace with 'N' characters; lowercase:
                        convert to lowercase; none: leave unchanged. Default:
  --rc, --revcomp       Check both the read and its reverse complement for
                        adapter matches. If match is on reverse-complemented
                        version, output that one. Default: check only read

Additional read modifications:
  -u LENGTH, --cut LENGTH
                        Remove bases from each read (first read only if paired).
                        If LENGTH is positive, remove bases from the beginning.
                        If LENGTH is negative, remove bases from the end. Can be
                        used twice if LENGTHs have different signs. This is
                        applied *before* adapter trimming.
  --nextseq-trim 3'CUTOFF
                        NextSeq-specific quality trimming (each read). Trims
                        also dark cycles appearing as high-quality G bases.
  -q [5'CUTOFF,]3'CUTOFF, --quality-cutoff [5'CUTOFF,]3'CUTOFF
                        Trim low-quality bases from 5' and/or 3' ends of each
                        read before adapter removal. Applied to both reads if
                        data is paired. If one value is given, only the 3' end
                        is trimmed. If two comma-separated cutoffs are given,
                        the 5' end is trimmed with the first cutoff, the 3' end
                        with the second.
  --quality-base N      Assume that quality values in FASTQ are encoded as
                        ascii(quality + N). This needs to be set to 64 for some
                        old Illumina FASTQ files. Default: 33
  --length LENGTH, -l LENGTH
                        Shorten reads to LENGTH. Positive values remove bases at
                        the end while negative ones remove bases at the
                        beginning. This and the following modifications are
                        applied after adapter trimming.
  --trim-n              Trim N's on ends of reads.
  --length-tag TAG      Search for TAG followed by a decimal number in the
                        description field of the read. Replace the decimal
                        number with the correct length of the trimmed read. For
                        example, use --length-tag 'length=' to correct fields
                        like 'length=123'.
  --strip-suffix STRIP_SUFFIX
                        Remove this suffix from read names if present. Can be
                        given multiple times.
  -x PREFIX, --prefix PREFIX
                        Add this prefix to read names. Use {name} to insert the
                        name of the matching adapter.
  -y SUFFIX, --suffix SUFFIX
                        Add this suffix to read names; can also include {name}
  --rename TEMPLATE     Rename reads using TEMPLATE containing variables such as
                        {id}, {adapter_name} etc. (see documentation)
  --zero-cap, -z        Change negative quality values to zero.

Filtering of processed reads:
  Filters are applied after above read modifications. Paired-end reads are
  always discarded pairwise (see also --pair-filter).

  -m LEN[:LEN2], --minimum-length LEN[:LEN2]
                        Discard reads shorter than LEN. Default: 0
  -M LEN[:LEN2], --maximum-length LEN[:LEN2]
                        Discard reads longer than LEN. Default: no limit
  --max-n COUNT         Discard reads with more than COUNT 'N' bases. If COUNT
                        is a number between 0 and 1, it is interpreted as a
                        fraction of the read length.
  --max-expected-errors ERRORS, --max-ee ERRORS
                        Discard reads whose expected number of errors (computed
                        from quality values) exceeds ERRORS.
  --discard-trimmed, --discard
                        Discard reads that contain an adapter. Use also -O to
                        avoid discarding too many randomly matching reads.
  --discard-untrimmed, --trimmed-only
                        Discard reads that do not contain an adapter.
  --discard-casava      Discard reads that did not pass CASAVA filtering (header
                        has :Y:).

  --quiet               Print only error messages.
  --report {full,minimal}
                        Which type of report to print: 'full' or 'minimal'.
                        Default: full
  --json FILE           Dump report in JSON format to FILE
  -o FILE, --output FILE
                        Write trimmed reads to FILE. FASTQ or FASTA format is
                        chosen depending on input. Summary report is sent to
                        standard output. Use '{name}' for demultiplexing (see
                        docs). Default: write to standard output
  --fasta               Output FASTA to standard output even on FASTQ input.
  -Z                    Use compression level 1 for gzipped output files
                        (faster, but uses more space)
  --info-file FILE      Write information about each read and its adapter
                        matches into FILE. See the documentation for the file
  -r FILE, --rest-file FILE
                        When the adapter matches in the middle of a read, write
                        the rest (after the adapter) to FILE.
  --wildcard-file FILE  When the adapter has N wildcard bases, write adapter
                        bases matching wildcard positions to FILE. (Inaccurate
                        with indels.)
  --too-short-output FILE
                        Write reads that are too short (according to length
                        specified by -m) to FILE. Default: discard reads
  --too-long-output FILE
                        Write reads that are too long (according to length
                        specified by -M) to FILE. Default: discard reads
  --untrimmed-output FILE
                        Write reads that do not contain any adapter to FILE.
                        Default: output to same file as trimmed reads

Paired-end options:
  The -A/-G/-B/-U/-Q options work like their lowercase counterparts, but are
  applied to R2 (second read in pair)

  -A ADAPTER            3' adapter to be removed from R2
  -G ADAPTER            5' adapter to be removed from R2
  -B ADAPTER            5'/3 adapter to be removed from R2
  -U LENGTH             Remove LENGTH bases from R2
                        Quality-trimming cutoff for R2. Default: same as for R1
  -p FILE, --paired-output FILE
                        Write R2 to FILE.
  --pair-adapters       Treat adapters given with -a/-A etc. as pairs. Either
                        both or none are removed from each read pair.
  --pair-filter {any,both,first}
                        Which of the reads in a paired-end read have to match
                        the filtering criterion in order for the pair to be
                        filtered. Default: any
  --interleaved         Read and/or write interleaved paired-end reads.
  --untrimmed-paired-output FILE
                        Write second read in a pair to this FILE when no adapter
                        was found. Use with --untrimmed-output. Default: output
                        to same file as trimmed reads
  --too-short-paired-output FILE
                        Write second read in a pair to this file if pair is too
  --too-long-paired-output FILE
                        Write second read in a pair to this file if pair is too