Annotates peaks by matching MS1 m/z, retention time (RT), and MS2 spectra against a reference database. The mass tolerance, retention time tolerance, MS2 fragment matching criteria, and score weights are all configurable.

annotate_peaks_mz_rt_ms2(
  ms1.info = NULL,
  ms2.info = NULL,
  database = NULL,
  based_on = c("ms1", "rt", "ms2"),
  polarity = c("positive", "negative"),
  ce = "all",
  column = c("hilic", "rp"),
  adduct.table = NULL,
  ms1.match.ppm = 25,
  mz.ppm.thr = 400,
  rt.match.tol = 30,
  ms2.match.ppm = 30,
  ms2.match.tol = 0.5,
  fraction.weight = 0.3,
  dp.forward.weight = 0.6,
  dp.reverse.weight = 0.1,
  remove_fragment_intensity_cutoff = 0,
  ms1.match.weight = 0.25,
  rt.match.weight = 0.25,
  ms2.match.weight = 0.5,
  total.score.tol = 0.5,
  candidate.num = 3,
  threads = 3
)

Arguments

ms1.info

A data frame containing MS1 peak information (m/z and RT). If `based_on` includes `"ms1"` or `"rt"`, this argument is required.

ms2.info

A list containing MS2 spectra for each corresponding `ms2_spectrum_id`. If `based_on` includes `"ms2"`, this argument is required.

database

A `databaseClass` object containing the reference database for metabolite annotation.

based_on

Character vector. Specifies which criteria to base the matching on. Can include `"ms1"`, `"rt"`, and `"ms2"`. Default is `c("ms1", "rt", "ms2")`.

polarity

Character. The ionization mode, either `"positive"` or `"negative"`. Default is `"positive"`.

ce

Character. Collision energy used in MS2 spectra. Default is `"all"`.

column

Character. The chromatographic column type, either `"hilic"` or `"rp"` (reversed-phase). Default is `"hilic"`.

adduct.table

A data frame containing the adducts to use in the matching process. If `NULL`, a default table is loaded based on `polarity` and `column`.

ms1.match.ppm

Numeric. The mass tolerance in parts per million (ppm) for MS1 peak matching. Default is 25.

mz.ppm.thr

Numeric. m/z threshold for ppm calculation. Default is 400.

rt.match.tol

Numeric. Retention time matching tolerance in seconds. Default is 30.

ms2.match.ppm

Numeric. The mass tolerance in ppm for MS2 peak matching. Default is 30.

ms2.match.tol

Numeric. The minimum MS2 similarity score required for a spectrum match to be accepted. Default is 0.5.

fraction.weight

Numeric. Weight for the fraction of matched fragments in MS2 spectra. Default is 0.3.

dp.forward.weight

Numeric. Weight for the forward dot product score in MS2 matching. Default is 0.6.

dp.reverse.weight

Numeric. Weight for the reverse dot product score in MS2 matching. Default is 0.1.

remove_fragment_intensity_cutoff

Numeric. Intensity cutoff for removing low-intensity MS2 fragments. Default is 0.

ms1.match.weight

Numeric. Weight for MS1 matching score in the total score calculation. Default is 0.25.

rt.match.weight

Numeric. Weight for RT matching score in the total score calculation. Default is 0.25.

ms2.match.weight

Numeric. Weight for MS2 matching score in the total score calculation. Default is 0.5.

total.score.tol

Numeric. Threshold for the total score. Only results with a score above this value are retained. Default is 0.5.

candidate.num

Numeric. Maximum number of top-scoring candidate annotations to retain per peak. Default is 3.

threads

Numeric. Number of threads to use for parallel processing. Default is 3.
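
The role of `mz.ppm.thr` is not spelled out above. A minimal sketch of one common convention follows, assuming the threshold acts as a floor on the denominator of the ppm calculation so that low-mass peaks are not penalized by inflated ppm errors. The helper `ppm_error()` is hypothetical and not part of this package; the function's actual calculation may differ.

```r
# Hypothetical helper, not part of the package.
# Assumption: mz.ppm.thr is a lower bound on the denominator of the ppm error.
ppm_error <- function(mz_observed, mz_reference, mz.ppm.thr = 400) {
  denominator <- pmax(mz_reference, mz.ppm.thr)
  abs(mz_observed - mz_reference) / denominator * 1e6
}

# Reference m/z below the threshold: the error is divided by 400, not 150.08
low_mass_error <- ppm_error(150.081, 150.080)

# Reference m/z above the threshold: the error is divided by 600
high_mass_error <- ppm_error(600.012, 600.000)

# A match would then be accepted when the error is within ms1.match.ppm
ms1.match.ppm <- 25
c(low = low_mass_error <= ms1.match.ppm, high = high_mass_error <= ms1.match.ppm)
```

Under this assumption, a 0.001 Da deviation at m/z 150 counts as 2.5 ppm rather than roughly 6.7 ppm.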

Value

A data frame with annotated metabolites, including columns for matched m/z, retention time, MS2 spectra, and the calculated scores for each match.

Details

The function uses a combination of MS1 peak information (m/z and retention time), MS2 spectra, and a reference database to annotate metabolites. The matching process can be customized by adjusting the mass tolerance, retention time tolerance, and MS2 fragment matching parameters.
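
The three MS2 weights (`fraction.weight`, `dp.forward.weight`, `dp.reverse.weight`) default to values that sum to 1, which suggests a weighted sum. The sketch below assumes exactly that; the component scores are made-up numbers for one experimental/reference spectrum pair, and the combination actually used by the function is not confirmed by this page.

```r
# Made-up component scores for a single spectrum pair, each in [0, 1]
matched_fraction <- 0.75  # share of fragments that found a partner
dp_forward       <- 0.90  # forward dot product
dp_reverse       <- 0.80  # reverse dot product

# Default weights from the usage section
fraction.weight   <- 0.3
dp.forward.weight <- 0.6
dp.reverse.weight <- 0.1

# Assumption: the MS2 score is a weighted sum of the three components
ms2_score <- fraction.weight * matched_fraction +
  dp.forward.weight * dp_forward +
  dp.reverse.weight * dp_reverse

ms2_score
```

With these inputs the forward dot product dominates the result, which matches its default weight of 0.6.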

If `based_on` includes `"ms1"` or `"rt"`, the MS1 information is taken from the `ms1.info` data frame. If `based_on` includes `"ms2"`, the function uses the MS2 spectra in `ms2.info` to perform fragment matching. The function calculates individual scores for the m/z, retention time, and MS2 fragment matches, then combines them into a total score. Annotations with a total score above `total.score.tol` are retained, and only the top `candidate.num` annotations are kept for each peak.
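
The scoring and filtering steps described here can be sketched in base R. This is an illustration under stated assumptions, not the function's implementation: it assumes the total score is a weighted sum of the three component scores, and the candidate table, its column names, and its values are invented for the example.

```r
# Invented candidate table: one row per (peak, candidate compound) pair,
# with component scores already scaled to [0, 1]
candidates <- data.frame(
  variable_id = c("id1", "id1", "id1", "id2"),
  compound    = c("A", "B", "C", "D"),
  mz_score    = c(0.95, 0.80, 0.40, 0.90),
  rt_score    = c(0.90, 0.60, 0.30, 0.20),
  ms2_score   = c(0.85, 0.70, 0.20, 0.30),
  stringsAsFactors = FALSE
)

# Default weights and thresholds from the usage section
ms1.match.weight <- 0.25
rt.match.weight  <- 0.25
ms2.match.weight <- 0.5
total.score.tol  <- 0.5
candidate.num    <- 3

# Assumption: total score is a weighted sum of the component scores
candidates$total_score <-
  ms1.match.weight * candidates$mz_score +
  rt.match.weight  * candidates$rt_score +
  ms2.match.weight * candidates$ms2_score

# Keep candidates above the threshold, then the best candidate.num per peak
kept <- candidates[candidates$total_score > total.score.tol, ]
kept <- kept[order(kept$variable_id, -kept$total_score), ]
kept <- do.call(
  rbind,
  lapply(split(kept, kept$variable_id), head, n = candidate.num)
)
rownames(kept) <- NULL
kept
```

In this toy table only candidates A and B for `id1` pass the threshold. Peak `id2` ends up with no annotation because its single candidate scores below `total.score.tol`.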

Examples

if (FALSE) { # \dontrun{
# Example MS1 and MS2 data
ms1_info <- data.frame(
  variable_id = c("id1", "id2"),
  mz = c(150.08, 180.12),
  rt = c(12.5, 14.7)
)
ms2_info <- list(
  id1 = matrix(
    c(75, 1000, 80, 2000),
    ncol = 2, byrow = TRUE,
    dimnames = list(NULL, c("mz", "intensity"))
  ),
  id2 = matrix(
    c(85, 3000, 90, 1500),
    ncol = 2, byrow = TRUE,
    dimnames = list(NULL, c("mz", "intensity"))
  )
)

# Example database
database <- load_database("path/to/database")

# Annotate metabolites using MS1, RT, and MS2 data
annotations <- annotate_peaks_mz_rt_ms2(
  ms1.info = ms1_info,
  ms2.info = ms2_info,
  database = database,
  based_on = c("ms1", "rt", "ms2"),
  polarity = "positive",
  column = "rp",
  ms1.match.ppm = 20,
  rt.match.tol = 20,
  candidate.num = 5,
  threads = 4
)

print(annotations)
} # }