crux diameter [options] <tide spectra file> <tide database>
DIAmeter detects peptides from data-independent acquisition (DIA) mass spectrometry data without requiring a spectral library. The input consists of centroided DIA data and a proteome FASTA database. DIAmeter searches the DIA data using Tide, allowing multiple peptide-spectrum matches (PSMs) per DIA spectrum. A subset of these PSMs is then selected for further analysis using a greedy bipartite graph matching algorithm. Finally, the PSMs are augmented with auxiliary features describing various types of evidence supporting the detection of the associated peptide, and filtered. The resulting PSM feature vectors, which are the output of DIAmeter, should subsequently be processed by Percolator to induce a ranking on peptides. Percolator assigns each peptide a statistical confidence estimate, where highly ranked peptides are detected in the DIA data with stronger confidence. Further details are provided here:
YY Lu, J Bilmes, RA Rodriguez-Mias, J Villen, and WS Noble. "DIAmeter: Matching peptides to data-independent acquisition mass spectrometry data". Bioinformatics. 37(Supplement_1):i434–i442, 2021.
DIAmeter performs several intermediate steps, as follows:
- If a FASTA file was provided, convert it to an index using tide-index. Otherwise, use the given Tide index.
- Convert the given fragmentation spectra to a binary format.
- Search the spectra against the database and extract the auxiliary features.
- Store the results in Percolator input (PIN) format.
- Run the PIN file through Percolator.
tide spectra file– The name of one or more files from which to parse the fragmentation spectra, in any of the file formats supported by ProteoWizard. Alternatively, the argument may be one or more binary spectrum files produced by a previous run of crux tide-search using the store-spectra parameter. Multiple files can be included on the command line (space delimited), prior to the name of the database.
tide database– Either a FASTA file or a directory containing a database index created by a previous run of crux tide-index.
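As an illustration of the argument order described above, the following sketch assembles a command line from Python. The helper name build_diameter_command and the file names are hypothetical; only the crux diameter synopsis and the --output-dir and --overwrite options are taken from this page.

```python
from typing import List, Optional, Sequence


def build_diameter_command(
    spectra_files: Sequence[str],
    database: str,
    output_dir: Optional[str] = None,
    overwrite: Optional[bool] = None,
) -> List[str]:
    """Assemble an argument list: options, then spectra files, then the database."""
    if not spectra_files:
        raise ValueError("at least one spectra file is required")
    cmd = ["crux", "diameter"]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if overwrite is not None:
        # The page documents this option as taking T or F.
        cmd += ["--overwrite", "T" if overwrite else "F"]
    # Spectra files come first (space delimited), the database comes last.
    cmd += list(spectra_files)
    cmd.append(database)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    command = build_diameter_command(
        ["run1.mzML", "run2.mzML"], "proteome.fasta", overwrite=True
    )
    print(" ".join(command))
```

On a machine with Crux installed, the returned list could be passed to subprocess.run(command, check=True).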
The program writes files to the folder crux-output by default. The name of the output folder can be set by the user using the --output-dir option. The following files will be created:
diameter.psm-features.txt– a tab-delimited text file containing the features of the searched PSMs.
diameter.psm-features.filtered.txt– a tab-delimited text file containing the features of the PSMs after filtering.
diameter.features.pin– the searched PSM results in Percolator input (PIN) format.
diameter.params.txt– a file containing the name and value of all parameters/options for the current operation. Not all parameters in the file may have been used in the operation. The resulting file can be used with the --parameter-file option for other Crux programs.
diameter.log.txt– a log file containing a copy of all messages that were printed to the screen during execution.
percolator-output– the output of Percolator run on the PIN file.
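A small sketch of how one might check, after a run, which of the outputs listed above are present. The function name check_outputs is hypothetical; the output names and the default folder crux-output are the ones given in this section.

```python
from pathlib import Path
from typing import Dict

# Output names as listed in this section.
OUTPUT_NAMES = [
    "diameter.psm-features.txt",
    "diameter.psm-features.filtered.txt",
    "diameter.features.pin",
    "diameter.params.txt",
    "diameter.log.txt",
    "percolator-output",
]


def check_outputs(output_dir: str = "crux-output") -> Dict[str, bool]:
    """Map each expected output name to whether it exists under output_dir."""
    root = Path(output_dir)
    return {name: (root / name).exists() for name in OUTPUT_NAMES}


if __name__ == "__main__":
    for name, present in check_outputs().items():
        status = "found  " if present else "missing"
        print(f"{status}  {name}")
```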
--predrt-files <string>– The name of the file from which to parse the predicted retention time (RT) of each peptide in the database. The file is tab-delimited, with the peptide in the first column and its predicted RT in the second column. The RT predictions do not need to be normalized beforehand. If a peptide in the database is missing from the predictions, its predicted value is imputed as the median of all predicted values. Default =
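The file layout and the median imputation described for --predrt-files can be sketched as follows. This is an illustration of the described behavior, not DIAmeter's own code: the function names and peptide strings are hypothetical, and skipping a non-numeric header row is an assumption.

```python
import csv
import io
import statistics
from typing import Dict, Iterable, TextIO


def read_predicted_rt(handle: TextIO) -> Dict[str, float]:
    """Parse tab-delimited rows: peptide in column 1, predicted RT in column 2."""
    predictions: Dict[str, float] = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        try:
            predictions[row[0]] = float(row[1])
        except ValueError:
            continue  # assumption: a non-numeric second column is a header row
    return predictions


def impute_missing(predictions: Dict[str, float], peptides: Iterable[str]) -> Dict[str, float]:
    """Give each peptide its predicted RT, or the median of all predictions if absent."""
    if not predictions:
        raise ValueError("no predicted retention times were read")
    median_rt = statistics.median(predictions.values())
    return {pep: predictions.get(pep, median_rt) for pep in peptides}


if __name__ == "__main__":
    # Hypothetical peptides and RT values, for illustration only.
    text = "PEPTIDEK\t10.0\nSAMPLER\t30.0\nANOTHERK\t20.0\n"
    rt = read_predicted_rt(io.StringIO(text))
    print(impute_missing(rt, ["PEPTIDEK", "MISSINGK"]))
    # prints {'PEPTIDEK': 10.0, 'MISSINGK': 20.0}
```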
--prec-ppm <integer>– Tolerance used for matching precursors to spectra. Peptides must be within +/- 'precursor-ppm' parts-per-million (ppm) of the spectrum precursor m/z. Default =
--frag-ppm <integer>– Tolerance used for matching fragment ions to spectrum peaks. Fragment ions must be within +/- 'fragment-ppm' of the spectrum peak value. Default =
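A ppm tolerance is relative to the m/z value being matched, so the absolute window widens with m/z. The sketch below shows the standard conversion; it is not taken from the DIAmeter source. Which value serves as the reference (theoretical or observed) is not stated on this page, so using the theoretical m/z is an assumption, and the numbers are illustrative rather than defaults.

```python
from typing import Tuple


def ppm_window(mz: float, ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds that lie within +/- ppm of mz."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed: float, theoretical: float, ppm: float) -> bool:
    """True if observed lies within +/- ppm of theoretical (assumed reference)."""
    return abs(observed - theoretical) <= theoretical * ppm / 1e6


if __name__ == "__main__":
    # Illustrative values only: 10 ppm at m/z 500 is +/- 0.005.
    low, high = ppm_window(500.0, 10)
    print(f"{low:.4f} .. {high:.4f}")
    print(within_ppm(500.004, 500.0, 10), within_ppm(500.006, 500.0, 10))
```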
--diameter-instrument orbitrap|tof5600|tof6600– Specify the instrument platform used to acquire the input spectra. This option selects among different sets of coefficient values for the scores computed by diameter. Default =
--max-precursor-charge <integer>– The maximum charge state of a spectrum to consider in the search. Default =
--mz-bin-offset <float>– In the discretization of the m/z axes of the observed and theoretical spectra, this parameter specifies the location of the left edge of the first bin, relative to mass = 0 (i.e., mz-bin-offset = 0.xx means the left edge of the first bin will be located at +0.xx Da). Default =
--mz-bin-width <float>– Before calculation of the XCorr score, the m/z axes of the observed and theoretical spectra are discretized. This parameter specifies the size of each bin. The exact formula for computing the discretized m/z value is floor((x/mz-bin-width) + 1.0 - mz-bin-offset), where x is the observed m/z value. A value of 1.0005079 is recommended for low-resolution ion trap MS/MS data, and 0.02 for high-resolution MS/MS data. Default =
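The discretization formula given for --mz-bin-width can be written directly in Python. The function name mz_to_bin is hypothetical, and the offset of 0.4 used in the example is an illustrative value, not a documented default; the bin widths are the two recommended on this page.

```python
import math


def mz_to_bin(mz: float, bin_width: float, bin_offset: float) -> int:
    """Discretize an m/z value: floor((x / mz-bin-width) + 1.0 - mz-bin-offset)."""
    return math.floor(mz / bin_width + 1.0 - bin_offset)


if __name__ == "__main__":
    # Low-resolution bin width from this page; the offset is illustrative.
    print(mz_to_bin(500.3, bin_width=1.0005079, bin_offset=0.4))
    # High-resolution bin width from this page; the offset is illustrative.
    print(mz_to_bin(500.01, bin_width=0.02, bin_offset=0.0))
```

With the narrower bin width, nearby peaks fall into different bins much sooner, which is why the smaller value suits high-resolution data.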
Input and output
--output-dir <string>– The name of the directory where output files will be created. Default =
--overwrite T|F– Replace existing files if true or fail when trying to overwrite a file if false. Default =
--top-match <integer>– Specify the number of matches to report for each spectrum. Default =
--verbosity <integer>– Specify the verbosity of the current processes. Each level prints the following messages, including all those at lower verbosity levels: 0-fatal errors, 10-non-fatal errors, 20-warnings, 30-information on the progress of execution, 40-more progress information, 50-debug info, 60-detailed debug info. Default =