cutadapt version 3.5

Copyright (C) 2010-2021 Marcel Martin <[email protected]>

cutadapt removes adapter sequences from high-throughput sequencing reads.

Usage:
    cutadapt -a ADAPTER [options] [-o output.fastq] input.fastq

For paired-end reads:
    cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

Replace "ADAPTER" with the actual sequence of your 3' adapter. IUPAC wildcard
characters are supported. All reads from input.fastq will be written to
output.fastq with the adapter sequence removed. Adapter matching is
error-tolerant. Multiple adapter sequences can be given (use further -a
options), but only the best-matching adapter will be removed.

Input may also be in FASTA format. Compressed input and output are supported
and auto-detected from the file name (.gz, .xz, .bz2). Use the file name '-'
for standard input/output. Without the -o option, output is sent to standard
output.
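
When cutadapt is driven from a script, the paired-end invocation above can be assembled as an argument list. This is a minimal sketch, assuming cutadapt is installed and on the PATH. The helper name build_paired_command and the file names are hypothetical, and only options listed in this help text are used.

```python
import subprocess  # only needed if the command is actually executed


def build_paired_command(adapter_r1, adapter_r2, in1, in2, out1, out2, cores=1):
    """Return an argv list for a paired-end cutadapt run (hypothetical helper)."""
    return [
        "cutadapt",
        "-a", adapter_r1,   # 3' adapter to remove from R1
        "-A", adapter_r2,   # 3' adapter to remove from R2
        "-j", str(cores),   # number of CPU cores (0 = auto-detect)
        "-o", out1,         # trimmed R1 goes here
        "-p", out2,         # trimmed R2 goes here
        in1, in2,
    ]


# File names ending in .gz rely on the auto-detected compression described above.
cmd = build_paired_command("ADAPT1", "ADAPT2",
                           "in1.fastq.gz", "in2.fastq.gz",
                           "out1.fastq.gz", "out2.fastq.gz")
print(" ".join(cmd))
# To run it for real: subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems with adapter sequences and file names.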

Citation:

Marcel Martin. Cutadapt removes adapter sequences from high-throughput
sequencing reads. EMBnet.Journal, 17(1):10-12, May 2011.

Run "cutadapt --help" to see all command-line options.
See https://cutadapt.readthedocs.io/ for full documentation.

Options:
  -h, --help            Show this help message and exit
  --version             Show version number and exit
  --debug               Print debug log. Use twice to also print DP matrices
  -j CORES, --cores CORES
                        Number of CPU cores to use. Use 0 to auto-detect.
                        Default: 1

Finding adapters:
  Parameters -a, -g, -b specify adapters to be removed from each read (or from
  R1 if data is paired-end). If specified multiple times, only the best
  matching adapter is trimmed (but see the --times option). Use notation
  'file:FILE' to read adapter sequences from a FASTA file.
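
The 'file:FILE' notation can be prepared from a script by writing the adapters to a FASTA file first. This is a minimal sketch. The helper name write_adapter_fasta, the adapter names, and the sequences are made up for illustration.

```python
import tempfile
from pathlib import Path


def write_adapter_fasta(path, adapters):
    """Write {name: sequence} pairs as FASTA records and return a 'file:' argument."""
    lines = []
    for name, sequence in adapters.items():
        lines.append(f">{name}")
        lines.append(sequence)
    Path(path).write_text("\n".join(lines) + "\n")
    return f"file:{path}"


# Hypothetical adapters; replace with the sequences used in your library prep.
example_adapters = {"adapter_one": "ACGTACGTACGT", "adapter_two": "TTGCAATTGCAA"}

with tempfile.TemporaryDirectory() as tmp:
    argument = write_adapter_fasta(Path(tmp) / "adapters.fasta", example_adapters)
    # The returned string is what would follow -a, -g or -b on the command line.
    print(argument)
```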

  -a ADAPTER, --adapter ADAPTER
                        Sequence of an adapter ligated to the 3' end (paired
                        data: of the first read). The adapter and subsequent
                        bases are trimmed. If a '$' character is appended
                        ('anchoring'), the adapter is only found if it is a
                        suffix of the read.
  -g ADAPTER, --front ADAPTER
                        Sequence of an adapter ligated to the 5' end (paired
                        data: of the first read). The adapter and any preceding
                        bases are trimmed. Partial matches at the 5' end are
                        allowed. If a '^' character is prepended ('anchoring'),
                        the adapter is only found if it is a prefix of the read.
  -b ADAPTER, --anywhere ADAPTER
                        Sequence of an adapter that may be ligated to the 5' or
                        3' end (paired data: of the first read). Both types of
                        matches as described under -a and -g are allowed. If the
                        first base of the read is part of the match, the
                        behavior is as with -g, otherwise as with -a. This
                        option is mostly for rescuing failed library
                        preparations - do not use if you know which end your
                        adapter was ligated to!
  -e E, --error-rate E, --errors E
                        Maximum allowed error rate (if 0 <= E < 1), or absolute
                        number of errors for full-length adapter match (if E is
                        an integer >= 1). Error rate = no. of errors divided by
                        length of matching region. Default: 0.1 (10%)
  --no-indels           Allow only mismatches in alignments. Default: allow both
                        mismatches and indels
  -n COUNT, --times COUNT
                        Remove up to COUNT adapters from each read. Default: 1
  -O MINLENGTH, --overlap MINLENGTH
                        Require MINLENGTH overlap between read and adapter for
                        an adapter to be found. Default: 3
  --match-read-wildcards
                        Interpret IUPAC wildcards in reads. Default: False
  -N, --no-match-adapter-wildcards
                        Do not interpret IUPAC wildcards in adapters.
  --action {trim,retain,mask,lowercase,none}
                        What to do if a match was found. trim: trim adapter and
                        up- or downstream sequence; retain: trim, but retain
                        adapter; mask: replace with 'N' characters; lowercase:
                        convert to lowercase; none: leave unchanged. Default:
                        trim
  --rc, --revcomp       Check both the read and its reverse complement for
                        adapter matches. If match is on reverse-complemented
                        version, output that one. Default: check only read
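
The -e and --action descriptions above can be made concrete with a small model of a 3' adapter match. This is a sketch, not cutadapt's implementation. Two things are assumptions: that an absolute error count (E >= 1) is converted to a rate relative to the full adapter length, and that 'mask' and 'lowercase' cover the adapter together with the bases that follow it. The function names are hypothetical.

```python
def within_error_limit(errors, match_length, e, adapter_length):
    """Decide whether a candidate match is acceptable under -e.

    errors        number of mismatches/indels in the match
    match_length  length of the matching region
    e             value given to -e (rate below 1, or absolute count)
    """
    # Assumption: an absolute count is turned into a rate for the full adapter.
    rate = e / adapter_length if e >= 1 else e
    # "Error rate = no. of errors divided by length of matching region"
    return errors <= rate * match_length


def apply_action(read, start, end, action="trim"):
    """Apply --action to a 3' adapter match spanning read[start:end]."""
    if action == "trim":
        return read[:start]                      # drop adapter and what follows
    if action == "retain":
        return read[:end]                        # keep adapter, drop what follows
    if action == "mask":
        return read[:start] + "N" * (len(read) - start)
    if action == "lowercase":
        return read[:start] + read[start:].lower()
    if action == "none":
        return read
    raise ValueError(f"unknown action: {action}")


# Made-up read: 8 insert bases, a 10-base adapter, 3 trailing bases.
read = "ACGTACGT" + "AGATCGGAAG" + "TTT"
start, end = 8, 18
if within_error_limit(errors=1, match_length=end - start, e=0.1, adapter_length=10):
    for action in ("trim", "retain", "mask", "lowercase", "none"):
        print(action, apply_action(read, start, end, action))
```

With the default rate of 0.1, a 10-base match tolerates one error and a 9-base match tolerates none, which is why short partial matches at the read end must be exact.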

Additional read modifications:
  -u LENGTH, --cut LENGTH
                        Remove bases from each read (first read only if paired).
                        If LENGTH is positive, remove bases from the beginning.
                        If LENGTH is negative, remove bases from the end. Can be
                        used twice if LENGTHs have different signs. This is
                        applied *before* adapter trimming.
  --nextseq-trim 3'CUTOFF
                        NextSeq-specific quality trimming (each read). Trims
                        also dark cycles appearing as high-quality G bases.
  -q [5'CUTOFF,]3'CUTOFF, --quality-cutoff [5'CUTOFF,]3'CUTOFF
                        Trim low-quality bases from 5' and/or 3' ends of each
                        read before adapter removal. Applied to both reads if
                        data is paired. If one value is given, only the 3' end
                        is trimmed. If two comma-separated cutoffs are given,
                        the 5' end is trimmed with the first cutoff, the 3' end
                        with the second.
  --quality-base N      Assume that quality values in FASTQ are encoded as
                        ascii(quality + N). This needs to be set to 64 for some
                        old Illumina FASTQ files. Default: 33
  --length LENGTH, -l LENGTH
                        Shorten reads to LENGTH. Positive values remove bases at
                        the end while negative ones remove bases at the
                        beginning. This and the following modifications are
                        applied after adapter trimming.
  --trim-n              Trim N's on ends of reads.
  --length-tag TAG      Search for TAG followed by a decimal number in the
                        description field of the read. Replace the decimal
                        number with the correct length of the trimmed read. For
                        example, use --length-tag 'length=' to correct fields
                        like 'length=123'.
  --strip-suffix STRIP_SUFFIX
                        Remove this suffix from read names if present. Can be
                        given multiple times.
  -x PREFIX, --prefix PREFIX
                        Add this prefix to read names. Use {name} to insert the
                        name of the matching adapter.
  -y SUFFIX, --suffix SUFFIX
                        Add this suffix to read names; can also include {name}
  --rename TEMPLATE     Rename reads using TEMPLATE containing variables such as
                        {id}, {adapter_name} etc. (see documentation)
  --zero-cap, -z        Change negative quality values to zero.
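
Several of the modifications above reduce to simple slicing rules, sketched here in Python. This is an illustration under stated assumptions, not cutadapt's code: the 3' quality trimming uses the running-sum method that cutadapt's documentation describes (subtract the cutoff from each quality, sum from the 3' end, cut where the sum is lowest), and all function names are hypothetical.

```python
def decode_qualities(quality_string, base=33):
    """--quality-base: each character encodes ascii(quality + base)."""
    return [ord(ch) - base for ch in quality_string]


def cut(seq, length):
    """-u: positive LENGTH removes from the beginning, negative from the end."""
    if length > 0:
        return seq[length:]
    if length < 0:
        return seq[:length]
    return seq


def quality_trim_3prime(qualities, cutoff):
    """-q with one value: return the index at which to cut the 3' end."""
    running, lowest, cut_at = 0, 0, len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running += qualities[i] - cutoff
        if running < lowest:          # new minimum of the partial sums
            lowest, cut_at = running, i
    return cut_at


def shorten(seq, length):
    """--length: positive keeps the first LENGTH bases, negative the last."""
    return seq[:length] if length >= 0 else seq[length:]


qualities = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
sequence = "ACGTACGTAC"
keep = quality_trim_3prime(qualities, cutoff=10)
print(sequence[:keep])                 # bases kept after 3' quality trimming
print(cut(cut(sequence, 2), -3))       # -u 2 -u -3
print(shorten(sequence, -4))           # --length -4
```

Note that the quality at index 6 is above the cutoff but is still removed, because the surrounding low qualities dominate the running sum.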

Filtering of processed reads:
  Filters are applied after above read modifications. Paired-end reads are
  always discarded pairwise (see also --pair-filter).

  -m LEN[:LEN2], --minimum-length LEN[:LEN2]
                        Discard reads shorter than LEN. Default: 0
  -M LEN[:LEN2], --maximum-length LEN[:LEN2]
                        Discard reads longer than LEN. Default: no limit
  --max-n COUNT         Discard reads with more than COUNT 'N' bases. If COUNT
                        is a number between 0 and 1, it is interpreted as a
                        fraction of the read length.
  --max-expected-errors ERRORS, --max-ee ERRORS
                        Discard reads whose expected number of errors (computed
                        from quality values) exceeds ERRORS.
  --discard-trimmed, --discard
                        Discard reads that contain an adapter. Use also -O to
                        avoid discarding too many randomly matching reads.
  --discard-untrimmed, --trimmed-only
                        Discard reads that do not contain an adapter.
  --discard-casava      Discard reads that did not pass CASAVA filtering (header
                        has :Y:).
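
Three of the filters above can be written as small predicates. This is a sketch under stated assumptions, not cutadapt's code. Expected errors are taken as the sum of the per-base error probabilities 10^(-Q/10) implied by Phred qualities. A --max-n value strictly between 0 and 1 is treated as a fraction, and lowercase 'n' is counted as well. The CASAVA check assumes the CASAVA 1.8 header comment layout read:filtered:control:index. All function names are hypothetical.

```python
def expected_errors(qualities):
    """--max-ee: sum of error probabilities implied by Phred qualities."""
    return sum(10 ** (-q / 10) for q in qualities)


def too_many_n(seq, count):
    """--max-n: True if the read has more than COUNT 'N' bases."""
    n_bases = seq.upper().count("N")      # assumption: lowercase n counts too
    limit = count * len(seq) if 0 < count < 1 else count
    return n_bases > limit


def failed_casava_filter(header):
    """--discard-casava: True if the header comment carries ':Y:'."""
    _, _, comment = header.partition(" ")
    fields = comment.split(":")
    return len(fields) >= 2 and fields[1] == "Y"


print(expected_errors([10, 20, 30]))      # roughly 0.1 + 0.01 + 0.001
print(too_many_n("ACGTNNNNAC", 0.3))      # 4 N bases against a limit of 3
print(failed_casava_filter("@M1:1:FC:1:1101:1000:2000 1:Y:0:ATCACG"))
```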

Output:
  --quiet               Print only error messages.
  --report {full,minimal}
                        Which type of report to print: 'full' or 'minimal'.
                        Default: full
  --json FILE           Dump report in JSON format to FILE
  -o FILE, --output FILE
                        Write trimmed reads to FILE. FASTQ or FASTA format is
                        chosen depending on input. Summary report is sent to
                        standard output. Use '{name}' for demultiplexing (see
                        docs). Default: write to standard output
  --fasta               Output FASTA to standard output even on FASTQ input.
  -Z                    Use compression level 1 for gzipped output files
                        (faster, but uses more space)
  --info-file FILE      Write information about each read and its adapter
                        matches into FILE. See the documentation for the file
                        format.
  -r FILE, --rest-file FILE
                        When the adapter matches in the middle of a read, write
                        the rest (after the adapter) to FILE.
  --wildcard-file FILE  When the adapter has N wildcard bases, write adapter
                        bases matching wildcard positions to FILE. (Inaccurate
                        with indels.)
  --too-short-output FILE
                        Write reads that are too short (according to length
                        specified by -m) to FILE. Default: discard reads
  --too-long-output FILE
                        Write reads that are too long (according to length
                        specified by -M) to FILE. Default: discard reads
  --untrimmed-output FILE
                        Write reads that do not contain any adapter to FILE.
                        Default: output to same file as trimmed reads

Paired-end options:
  The -A/-G/-B/-U/-Q options work like their lowercase counterparts, but are
  applied to R2 (second read in pair)

  -A ADAPTER            3' adapter to be removed from R2
  -G ADAPTER            5' adapter to be removed from R2
  -B ADAPTER            5'/3' adapter to be removed from R2
  -U LENGTH             Remove LENGTH bases from R2
  -Q [5'CUTOFF,]3'CUTOFF
                        Quality-trimming cutoff for R2. Default: same as for R1
  -p FILE, --paired-output FILE
                        Write R2 to FILE.
  --pair-adapters       Treat adapters given with -a/-A etc. as pairs. Either
                        both or none are removed from each read pair.
  --pair-filter {any,both,first}
                        Which of the reads in a paired-end read have to match
                        the filtering criterion in order for the pair to be
                        filtered. Default: any
  --interleaved         Read and/or write interleaved paired-end reads.
  --untrimmed-paired-output FILE
                        Write second read in a pair to this FILE when no adapter
                        was found. Use with --untrimmed-output. Default: output
                        to same file as trimmed reads
  --too-short-paired-output FILE
                        Write second read in a pair to this file if pair is too
                        short.
  --too-long-paired-output FILE
                        Write second read in a pair to this file if pair is too
                        long.
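
The three --pair-filter modes amount to a small truth table. The sketch below shows it for a minimum-length filter. The function name and the example read lengths are hypothetical, and "matches" means the individual read meets the filtering criterion (here: shorter than the -m value).

```python
def pair_is_filtered(r1_matches, r2_matches, pair_filter="any"):
    """Return True if the whole pair is filtered under the given mode."""
    if pair_filter == "any":
        return r1_matches or r2_matches     # one offending read is enough
    if pair_filter == "both":
        return r1_matches and r2_matches    # both reads must offend
    if pair_filter == "first":
        return r1_matches                   # only R1 is consulted
    raise ValueError(f"unknown pair filter: {pair_filter}")


minimum_length = 20                          # as if -m 20 had been given
r1_length, r2_length = 15, 50                # made-up lengths after trimming
r1_too_short = r1_length < minimum_length
r2_too_short = r2_length < minimum_length
for mode in ("any", "both", "first"):
    print(mode, pair_is_filtered(r1_too_short, r2_too_short, mode))
```

With the default 'any', a pair is dropped as soon as one read is too short. 'both' keeps the pair unless both reads fail, which retains more data at the cost of very short reads in the output.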