This function annotates metabolites in the provided object based on MS1, retention time (RT), and/or MS2 spectra data using a specified database. It allows for customization of matching parameters such as m/z match tolerance, retention time tolerance, and MS2 matching criteria.

annotate_metabolites(
  object,
  database,
  based_on = c("ms1", "rt", "ms2"),
  polarity = c("positive", "negative"),
  column = c("rp", "hilic"),
  adduct.table = NULL,
  ce = "all",
  ms1.match.ppm = 25,
  ms2.match.ppm = 30,
  mz.ppm.thr = 400,
  ms2.match.tol = 0.5,
  fraction.weight = 0.3,
  dp.forward.weight = 0.6,
  dp.reverse.weight = 0.1,
  rt.match.tol = 30,
  ms1.match.weight = 0.25,
  rt.match.weight = 0.25,
  ms2.match.weight = 0.5,
  total.score.tol = 0.5,
  candidate.num = 3,
  remove_fragment_intensity_cutoff = 0,
  return_format = c("mass_dataset", "data.frame"),
  threads = 3
)

Arguments

object

A `mass_dataset` object containing MS1, RT, and/or MS2 data.

database

A `databaseClass` object used for metabolite annotation.

based_on

Character vector. Specifies the matching criteria to be used for annotation. Can include `"ms1"`, `"rt"`, and/or `"ms2"`. Default is `c("ms1", "rt", "ms2")`.

polarity

Character. Ionization mode, either `"positive"` or `"negative"`. Default is `"positive"`.

column

Character. The chromatographic column type, either `"hilic"` or `"rp"` (reversed-phase). Default is `"hilic"`.

adduct.table

A data frame specifying the adduct table for metabolite annotation. If `NULL`, a default adduct table is loaded based on polarity and column type.

ce

Character. Collision energy used in MS2. Default is `"all"`.

ms1.match.ppm

Numeric. Mass tolerance in parts per million (ppm) for MS1 peak matching. Default is 25.

ms2.match.ppm

Numeric. Mass tolerance in ppm for MS2 peak matching. Default is 30.

mz.ppm.thr

Numeric. m/z threshold for ppm calculation. Default is 400.

ms2.match.tol

Numeric. Retention time tolerance for MS2 fragment matching. Default is 0.5.

fraction.weight

Numeric. Weight for the fraction of matched fragments in MS2 spectra. Default is 0.3.

dp.forward.weight

Numeric. Weight for the forward dot product score in MS2 matching. Default is 0.6.

dp.reverse.weight

Numeric. Weight for the reverse dot product score in MS2 matching. Default is 0.1.

rt.match.tol

Numeric. Retention time matching tolerance in seconds. Default is 30.

ms1.match.weight

Numeric. Weight for MS1 matching score in the overall annotation score. Default is 0.25.

rt.match.weight

Numeric. Weight for retention time matching score in the overall annotation score. Default is 0.25.

ms2.match.weight

Numeric. Weight for MS2 matching score in the overall annotation score. Default is 0.5.

total.score.tol

Numeric. Tolerance for the total matching score. Default is 0.5.

candidate.num

Numeric. Maximum number of candidate annotations to retain per metabolite. Default is 3.

remove_fragment_intensity_cutoff

Numeric. Cutoff to remove low-intensity MS2 fragments. Default is 0.

return_format

Character. Specifies the format of the output. Can be `"mass_dataset"` or `"data.frame"`. Default is `"mass_dataset"`.

threads

Numeric. Number of threads to use for parallel processing. Default is 3.

Value

A modified `mass_dataset` object with annotated metabolites added to the `annotation_table` slot.

Details

This function performs metabolite annotation using a combination of MS1, retention time, and MS2 data (if available) from the provided object. The function allows users to customize the matching process, including setting tolerances for MS1 and MS2 matching, adjusting the weights of different scoring components, and selecting a specific chromatographic column and adduct table.

If `ms2` is included in the `based_on` argument, the function extracts both MS1 and MS2 information for annotation. The final annotations are filtered based on the specified score thresholds and only the top `candidate.num` annotations are retained for each metabolite.

Examples

if (FALSE) { # \dontrun{
# Load a sample dataset and database
my_data <- load_mass_dataset("path/to/data")
my_database <- load_database("path/to/database")

# Annotate metabolites using MS1 and MS2 data
annotated_data <- annotate_metabolites(
  object = my_data,
  database = my_database,
  based_on = c("ms1", "ms2"),
  polarity = "positive",
  column = "rp",
  ms1.match.ppm = 20,
  ms2.match.ppm = 25,
  candidate.num = 5,
  threads = 4
)
} # }