R/51_annotate_peaks_mz_rt_ms2.R
annotate_peaks_mz_rt_ms2.Rd
Annotates peaks by matching MS1 m/z, retention time (RT), and MS2 spectra against a provided database. Mass tolerances, the retention time tolerance, MS2 fragment matching criteria, and score weights can all be customized.
annotate_peaks_mz_rt_ms2(
  ms1.info = NULL,
  ms2.info = NULL,
  database = NULL,
  based_on = c("ms1", "rt", "ms2"),
  polarity = c("positive", "negative"),
  ce = "all",
  column = c("hilic", "rp"),
  adduct.table = NULL,
  ms1.match.ppm = 25,
  mz.ppm.thr = 400,
  rt.match.tol = 30,
  ms2.match.ppm = 30,
  ms2.match.tol = 0.5,
  fraction.weight = 0.3,
  dp.forward.weight = 0.6,
  dp.reverse.weight = 0.1,
  remove_fragment_intensity_cutoff = 0,
  ms1.match.weight = 0.25,
  rt.match.weight = 0.25,
  ms2.match.weight = 0.5,
  total.score.tol = 0.5,
  candidate.num = 3,
  threads = 3
)
`ms1.info`: A data frame containing MS1 peak information (m/z and RT). Required if `based_on` includes `"ms1"` or `"rt"`.
`ms2.info`: A list containing the MS2 spectrum for each corresponding `ms2_spectrum_id`. Required if `based_on` includes `"ms2"`.
`database`: A `databaseClass` object containing the reference database for metabolite annotation.
`based_on`: Character vector. The criteria to base the matching on. Can include `"ms1"`, `"rt"`, and `"ms2"`. Default is `c("ms1", "rt", "ms2")`.
`polarity`: Character. The ionization mode, either `"positive"` or `"negative"`. Default is `"positive"`.
`ce`: Character. Collision energy of the MS2 spectra to use for matching. Default is `"all"`.
`column`: Character. The chromatographic column type, either `"hilic"` or `"rp"` (reversed-phase). Default is `"hilic"`.
`adduct.table`: A data frame containing the adducts to use in the matching process. If `NULL`, a default table is loaded based on `polarity` and `column`.
`ms1.match.ppm`: Numeric. The mass tolerance in parts per million (ppm) for MS1 peak matching. Default is 25.
`mz.ppm.thr`: Numeric. The m/z threshold used when calculating the ppm error. Default is 400.
`rt.match.tol`: Numeric. Retention time matching tolerance in seconds. Default is 30.
`ms2.match.ppm`: Numeric. The mass tolerance in ppm for MS2 fragment matching. Default is 30.
`ms2.match.tol`: Numeric. The MS2 match (spectral similarity) score tolerance, not a retention time tolerance: the minimum MS2 similarity score required for a spectral match. Default is 0.5.
`fraction.weight`: Numeric. Weight for the fraction of matched fragments in the MS2 score. Default is 0.3.
`dp.forward.weight`: Numeric. Weight for the forward dot product in the MS2 score. Default is 0.6.
`dp.reverse.weight`: Numeric. Weight for the reverse dot product in the MS2 score. Default is 0.1.
`remove_fragment_intensity_cutoff`: Numeric. Intensity cutoff for removing low-intensity MS2 fragments. Default is 0.
`ms1.match.weight`: Numeric. Weight for the MS1 matching score in the total score calculation. Default is 0.25.
`rt.match.weight`: Numeric. Weight for the RT matching score in the total score calculation. Default is 0.25.
`ms2.match.weight`: Numeric. Weight for the MS2 matching score in the total score calculation. Default is 0.5.
`total.score.tol`: Numeric. Threshold for the total score. Only results with a score above this value are retained. Default is 0.5.
`candidate.num`: Numeric. Maximum number of top-scoring candidate annotations to retain per peak. Default is 3.
`threads`: Numeric. Number of threads to use for parallel processing. Default is 3.
A data frame with annotated metabolites, including columns for matched m/z, retention time, MS2 spectra, and the calculated scores for each match.
The function uses a combination of MS1 peak information (m/z and retention time), MS2 spectra, and a reference database to annotate metabolites. The matching process can be customized by adjusting the mass tolerance, retention time tolerance, and MS2 fragment matching parameters.
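To make the two tolerance windows concrete, the sketch below flags toy reference entries that fall within both a ppm mass tolerance and a seconds-based RT tolerance. It is only an illustration: the helper `ppm_error()`, the toy reference table, and the rule that reference m/z values below `mz.ppm.thr` use the threshold itself as the denominator are assumptions, not behaviour confirmed by this page.

```r
# Hypothetical helper (not part of the package): ppm error between an
# observed and a reference m/z.
# Assumption: for reference m/z values below `mz_ppm_thr`, the threshold is
# used as the denominator, so small masses are not penalised too strictly.
ppm_error <- function(observed_mz, reference_mz, mz_ppm_thr = 400) {
  denominator <- pmax(reference_mz, mz_ppm_thr)
  abs(observed_mz - reference_mz) * 1e6 / denominator
}

# Toy reference table (made-up values, not taken from any real database)
reference <- data.frame(
  name = c("cand_A", "cand_B", "cand_C"),
  mz   = c(150.080, 150.085, 180.120),
  rt   = c(15, 100, 14)
)

# One observed peak: m/z and RT (seconds)
observed_mz <- 150.08
observed_rt <- 12.5

mz_err <- ppm_error(observed_mz, reference$mz)  # ppm
rt_err <- abs(observed_rt - reference$rt)       # seconds

# Keep candidates inside both windows (documented defaults: 25 ppm, 30 s)
hits <- reference[mz_err <= 25 & rt_err <= 30, ]
print(hits)
```

In this toy case `cand_B` is inside the mass window but outside the RT window, which shows why matching on `"ms1"` and `"rt"` together is stricter than matching on `"ms1"` alone.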
If `based_on` includes `"ms1"` or `"rt"`, the MS1 information is taken from the `ms1.info` data frame. If `based_on` includes `"ms2"`, the function uses the MS2 spectra provided in `ms2.info` to perform fragment matching. The function calculates individual scores for the m/z, retention time, and MS2 fragment matches, then combines them into a total score using `ms1.match.weight`, `rt.match.weight`, and `ms2.match.weight`. Annotations with a total score above `total.score.tol` are retained, and only the top `candidate.num` annotations are kept for each peak.
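The scoring steps described above can be sketched as a weighted sum followed by a threshold and a top-N cut. This is a minimal illustration under stated assumptions: the column names (`mz_score`, `rt_score`, `matched_fraction`, `dp_forward`, `dp_reverse`) and the score values are hypothetical, and both the MS2 score and the total score are assumed to be simple linear combinations. Only the weights and cutoffs mirror the documented defaults; the package's internal implementation may differ.

```r
# Toy per-candidate component scores in [0, 1] (made-up values)
candidates <- data.frame(
  variable_id      = c("id1", "id1", "id1", "id2"),
  compound         = c("A", "B", "C", "D"),
  mz_score         = c(0.90, 0.80, 0.40, 0.95),
  rt_score         = c(0.80, 0.60, 0.20, 0.90),
  matched_fraction = c(1.00, 0.50, 0.20, 0.80),
  dp_forward       = c(0.90, 0.60, 0.30, 0.70),
  dp_reverse       = c(0.80, 0.70, 0.10, 0.90)
)

# MS2 score (assumed linear combination) using the default
# fraction.weight, dp.forward.weight, and dp.reverse.weight
candidates$ms2_score <- 0.3 * candidates$matched_fraction +
  0.6 * candidates$dp_forward +
  0.1 * candidates$dp_reverse

# Total score (assumed linear combination) using the default
# ms1.match.weight, rt.match.weight, and ms2.match.weight
candidates$total_score <- 0.25 * candidates$mz_score +
  0.25 * candidates$rt_score +
  0.5 * candidates$ms2_score

# Retain annotations scoring above total.score.tol (default 0.5)
kept <- candidates[candidates$total_score > 0.5, ]

# Sort by peak, then by descending total score
kept <- kept[order(kept$variable_id, -kept$total_score), ]

# Keep at most candidate.num (default 3) annotations per peak
rank_in_peak <- ave(kept$total_score, kept$variable_id,
                    FUN = function(s) seq_along(s))
kept <- kept[rank_in_peak <= 3, ]

print(kept[, c("variable_id", "compound", "ms2_score", "total_score")])
```

With both sets of default weights summing to 1, every score stays on the same 0 to 1 scale, so `total.score.tol = 0.5` can be read as "at least half of the maximum achievable score".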
if (FALSE) { # \dontrun{
# Example MS1 and MS2 data
ms1_info <- data.frame(
  variable_id = c("id1", "id2"),
  mz = c(150.08, 180.12),
  rt = c(12.5, 14.7)
)
ms2_info <- list(
  id1 = matrix(c(75, 1000, 80, 2000),
               ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("mz", "intensity"))),
  id2 = matrix(c(85, 3000, 90, 1500),
               ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("mz", "intensity")))
)
# Example database
database <- load_database("path/to/database")
# Annotate metabolites using MS1, RT, and MS2 data
annotations <- annotate_peaks_mz_rt_ms2(
  ms1.info = ms1_info,
  ms2.info = ms2_info,
  database = database,
  based_on = c("ms1", "rt", "ms2"),
  polarity = "positive",
  column = "rp",
  ms1.match.ppm = 20,
  rt.match.tol = 20,
  candidate.num = 5,
  threads = 4
)
print(annotations)
} # }