Trace common and unique identifications between different software outputs

Identifications from two input data frames are compared and categorized into unique and common entries.

Usage

trace_level(
  input_df1,
  input_df2,
  analysis_name1 = "input_df1",
  analysis_name2 = "input_df2",
  level = c("precursor", "modified_peptides", "proteinGroups"),
  filter_unknown_mods = TRUE
)

Arguments

input_df1: A tibble with flowTraceR's standardized precursor, modified peptide, or proteinGroup level information - required column depends on chosen level.
input_df2: A tibble with flowTraceR's standardized precursor, modified peptide, or proteinGroup level information - required column depends on chosen level.
analysis_name1: output tibble name for input_df1 - default is "input_df1".
analysis_name2: output tibble name for input_df2 - default is "input_df2".
level: "precursor", "modified_peptides", "proteinGroups" - respective level for tracing common vs. unique entries. Default is "precursor".
filter_unknown_mods: Logical value, default is TRUE. If TRUE, unknown modifications are filtered out - requires "traceR_precursor_unknownMods" or "traceR_mod.peptides_unknownMods" column; depends on chosen level.
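
To illustrate what filter_unknown_mods = TRUE is described as doing at precursor level, the following base R sketch drops rows flagged in the traceR_precursor_unknownMods column. It is an assumption about the effect of the argument, not flowTraceR's actual implementation; the data frame example_df is hypothetical.

```r
# Hypothetical input in flowTraceR's standardized format (precursor level only)
example_df <- data.frame(
  traceR_precursor = c("AAC(UniMod:4)LLPK1", "ALTDM(DummyModification)PQMK3"),
  traceR_precursor_unknownMods = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Assumed effect of filter_unknown_mods = TRUE:
# keep only rows that are not flagged as carrying an unknown modification
filtered_df <- example_df[!example_df$traceR_precursor_unknownMods, , drop = FALSE]

filtered_df$traceR_precursor
```

For the modified peptide level, the same idea would apply to the traceR_mod.peptides_unknownMods column.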

Value

This function returns a list containing both submitted tibbles - input_df1 and input_df2 - each extended by one of the following new columns, depending on the chosen level:

traceR_traced_precursor - categorization on precursor level into common and unique entries.
traceR_traced_mod.peptides - categorization on modified peptide level into common and unique entries.
traceR_traced_proteinGroups - categorization on proteinGroups level into common and unique entries.

Details

Based on flowTraceR's standardized output format, two software outputs can be compared and categorized into common and unique identifications for a chosen level: precursor, modified peptide, or proteinGroup level.
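
The categorization described above can be sketched in base R as a membership test between the two inputs. This is a minimal illustration only: the helper categorize_entries and the labels "common" and "unique" are assumptions made for this sketch, and the code is not flowTraceR's implementation.

```r
# Hypothetical helper (not part of flowTraceR): label each entry as "common"
# if it also occurs in the other software output, otherwise as "unique"
categorize_entries <- function(ids, other_ids) {
  ifelse(ids %in% other_ids, "common", "unique")
}

# Hypothetical standardized proteinGroup identifications from two outputs
df1 <- data.frame(traceR_proteinGroups = c("P02768", "P02671", "Q92496"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(traceR_proteinGroups = c("P02768", "Q02985", "P02671"),
                  stringsAsFactors = FALSE)

# Add the categorization column to each input, mirroring the documented output column name
df1$traceR_traced_proteinGroups <- categorize_entries(df1$traceR_proteinGroups,
                                                      df2$traceR_proteinGroups)
df2$traceR_traced_proteinGroups <- categorize_entries(df2$traceR_proteinGroups,
                                                      df1$traceR_proteinGroups)

# Collect both tibble-like results in a named list, one element per analysis
traced <- list(analysis1 = df1, analysis2 = df2)
traced
```

In this sketch, "Q92496" is unique to the first input and "Q02985" is unique to the second, while "P02768" and "P02671" are common to both.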

Author

Oliver Kardell

Examples

# Load libraries
library(dplyr)
library(stringr)
library(tibble)

# DIA-NN example data
diann <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "P02671", "Q92496", "DummyProt"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "RLEVDIDIK",
   "EGIVEYPR", "ALTDM(DummyModification)PQMK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE, TRUE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "RLEVDIDIK2",
   "EGIVEYPR2", "ALTDM(DummyModification)PQMK3" ),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE, TRUE)
)

# Spectronaut example data
spectronaut <- tibble::tibble(
  "traceR_proteinGroups" = c("P02768", "Q02985", "P02671"),
  "traceR_mod.peptides" = c("AAC(UniMod:4)LLPK", "EGIVEYPR", "M(UniMod:35)KPVPDLVPGNFK"),
  "traceR_mod.peptides_unknownMods" = c(FALSE, FALSE, FALSE),
  "traceR_precursor" = c("AAC(UniMod:4)LLPK1", "EGIVEYPR2", "M(UniMod:35)KPVPDLVPGNFK2"),
  "traceR_precursor_unknownMods" = c(FALSE, FALSE, FALSE)
)

# trace proteinGroup level
traced_proteinGroups <- trace_level(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  level = "proteinGroups",
  filter_unknown_mods = TRUE
)

# trace precursor level
traced_precursor <- trace_level(
  input_df1 = diann,
  input_df2 = spectronaut,
  analysis_name1 = "DIA-NN",
  analysis_name2 = "Spectronaut",
  level = "precursor",
  filter_unknown_mods = TRUE
)