Customization and search options
Crux lets the user change many of the search and analysis parameters. Attributes such as the output format and which peptides are selected from the protein database are controlled through numerous options. This page starts with some general information about options and then describes the use of some key Crux options.
Introduction to options
A crux command is made up of four parts: executable name, command, options, and required arguments. Let's use a crux tide-search command as an example. Here is the general form.
$ crux tide-search [options] <mass spectra> <peptide index>
In this case, the command is tide-search. This is followed by zero or more optional arguments. Finally, the required arguments, listed above inside angle brackets, include the name of the file containing the spectra to be identified and the name of a peptide index produced previously by the tide-index command.
All of the available options are described for each command on the documentation pages. You can also get a list of available options by running a command with no arguments. For example, the command
$ crux tide-search
will produce output that looks like this:
bash-3.2$ ~/proj/crux/trunk/src/c/crux tide-search
FATAL: Error in command line. Error # 5
The required argument <tide spectra file> is missing.

USAGE:

  crux tide-search [options] <tide spectra file> <tide database index>

REQUIRED ARGUMENTS:

  <tide spectra file> The name of the file from which to parse the fragmentation spectra, in any of the file formats supported by ProteoWizard. Alternatively, the argument may be a binary spectrum file produced by a previous run of crux tide-search using the store-spectra parameter.
  <tide database index> A directory containing a database index created by a previous run of crux tide-index.

OPTIONAL ARGUMENTS:

  [--precursor-window <double>] Search peptides within +/- 'precursor-window' of the spectrum mass. Definition of precursor window depends upon precursor-window-type. Default=3.0.
  [--precursor-window-type <string>] Window type to use for selecting candidate peptides. <string>=mass|mz|ppm. Default=mass.
  [--spectrum-min-mz <double>] The lowest spectrum m/z to search. Default=0.0.
  [--spectrum-max-mz <double>] The highest spectrum m/z to search. Default=no maximum.
  [--min-peaks <int>] The minimum number of peaks a spectrum must have for it to be searched. Default=20.
  ...
The first three lines are telling you that you forgot the required arguments and are reminding you what they are. The following lines list all the options (only five of which are shown above). Crux options all begin with two dashes followed by the option name. The name is followed by a space and the appropriate argument. This example increases the verbosity to 40:
$ crux tide-search --verbosity 40 sample.ms2 yeast.fasta
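The four-part command structure described above can be sketched in a few lines of Python. This is only an illustration of how the pieces are ordered on the command line; the helper name build_crux_command is hypothetical and not part of Crux, and the sketch assumes every option takes exactly one value.

```python
from typing import Mapping, Sequence


def build_crux_command(
    command: str,
    options: Mapping[str, object],
    required: Sequence[str],
    executable: str = "crux",
) -> list[str]:
    """Assemble an argument list: executable, command, options, required arguments.

    Hypothetical helper for illustration only; it is not part of Crux.
    """
    argv = [executable, command]
    for name, value in options.items():
        # Each option is two dashes plus the option name, then a space, then its value.
        argv += [f"--{name}", str(value)]
    # Required arguments come last, in the order the command expects them.
    argv += list(required)
    return argv


example = build_crux_command(
    "tide-search", {"verbosity": 40}, ["sample.ms2", "yeast.fasta"]
)
print(" ".join(example))
```

A list built this way can be handed to subprocess.run if the command is to be launched from a script.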
Specifying options via a parameter file
The --parameter-file option is available for all Crux commands. The parameter file allows multiple options to be specified in a file. All of the command line options can be put in a parameter file, but the format is slightly different. In the parameter file, the two leading dashes are removed from the option name, and the option name and value must be separated by an equal sign instead of a space:
<option name>=<value>
The above example, in which we changed the verbosity, would look like this in a parameter file:
verbosity=40
The parameter file allows only one option per line. Lines beginning with "#" are considered comments and are ignored. A sample parameter file can be found here. Command line and parameter file options may be used separately or together. If an option is specified in both places, then the value on the command line will be used.
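The parameter file rules and the precedence of command line values can be illustrated with a minimal Python sketch. It is not Crux's own parser: the function names are hypothetical, and skipping blank lines and trimming whitespace around the equal sign are assumptions made for the example.

```python
def parse_parameter_file(text: str) -> dict[str, str]:
    """Read name=value pairs, one per line, ignoring lines that begin with '#'."""
    options: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines are skipped just like comment lines.
        if not line or line.startswith("#"):
            continue
        name, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"expected name=value, got: {raw_line!r}")
        options[name.strip()] = value.strip()
    return options


def effective_options(
    file_options: dict[str, str], command_line_options: dict[str, str]
) -> dict[str, str]:
    """Combine both sources; a value given on the command line wins."""
    merged = dict(file_options)
    merged.update(command_line_options)
    return merged


sample = """\
# search settings
verbosity=40
precursor-window=3.0
"""

from_file = parse_parameter_file(sample)
merged = effective_options(from_file, {"verbosity": "30"})
print(merged)
```

Here verbosity appears in both places, so the command line value is kept, while precursor-window comes from the file.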
During execution of any Crux command, a parameter file containing the
name and value of all the options for the current operation
will automatically be saved in the output directory. Note that not
all parameters in the file may have been used in the operation. The
parameter file will be named
<tag> is the name of the command that was
In addition to --parameter-file, Crux includes several other options that are shared across all, or nearly all, Crux commands.
--output-dir <filename> – The name of the directory where output files will be created. Default = crux-output.
--fileroot <string> – The fileroot string will be added as a prefix to all output file names. Default = none.
--overwrite <T|F> – By default, if Crux detects that the output file it is about to produce already exists, then Crux will exit with an error. This option allows Crux to overwrite existing files.
--verbosity <0-100> – Specify the verbosity of the current command. Each level prints the following messages, including all those at lower verbosity levels: 0-fatal errors, 10-non-fatal errors, 20-warnings, 30-information on the progress of execution, 40-more progress information, 50-debug info, 60-detailed debug info. Default = 30.
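The cumulative behavior of the verbosity levels can be pictured with a short Python sketch. The table below copies the levels listed for --verbosity; the function name is hypothetical, and treating a value between two listed levels (for example 35) like the next lower level is an assumption, not documented Crux behavior.

```python
# Levels and message categories as listed for the --verbosity option.
VERBOSITY_LEVELS = [
    (0, "fatal errors"),
    (10, "non-fatal errors"),
    (20, "warnings"),
    (30, "information on the progress of execution"),
    (40, "more progress information"),
    (50, "debug info"),
    (60, "detailed debug info"),
]


def messages_shown(verbosity: int) -> list[str]:
    """Return the message categories printed at a given verbosity setting."""
    if not 0 <= verbosity <= 100:
        raise ValueError("verbosity must be between 0 and 100")
    # Each setting includes every category at or below it.
    return [label for level, label in VERBOSITY_LEVELS if level <= verbosity]


print(messages_shown(30))
```

At the default of 30, the sketch reports the four categories from fatal errors through progress information.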
In addition, many Crux commands include various options of the form --<format>-output. These options take Boolean arguments (specified as "T" or "F") and indicate whether output files in the specified format should be produced. For example, in addition to tab-delimited text format, tide-search can produce output in PepXML, MZid, SQT and PINxml formats.
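As a rough illustration of the --<format>-output pattern, the following Python sketch turns a set of on/off choices into option strings. The helper name is hypothetical, and the format tokens "pepxml" and "sqt" are assumed spellings used only for the example; check the documentation of the specific command for the exact option names it accepts.

```python
def format_output_options(formats: dict[str, bool]) -> list[str]:
    """Build --<format>-output arguments with Boolean values written as T or F."""
    argv: list[str] = []
    for fmt, enabled in formats.items():
        argv += [f"--{fmt}-output", "T" if enabled else "F"]
    return argv


# Assumed format tokens, for illustration only.
flags = format_output_options({"pepxml": True, "sqt": False})
print(" ".join(flags))
```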
Changing the indexing and searching parameters
Various options to tide-index control how
the proteins in the database are converted to peptides. These options
fall into several categories, allowing specification of peptide
properties such as minimum and maximum length, enzymatic digestion
rules, decoy database generation and specification of various static
and variable modifications. These options are fully
documented here. Similarly, the tide-search
documentation describes options for selecting which spectra to score,
the rules for selecting candidate peptides for a given spectrum, and
for deciding what kinds of scores to report.