Converts precursor, modified peptide and proteinGroup entries to a standardized, software-independent format.
Usage
convert_all_levels(
  input_df,
  input_MQ_pg,
  software = c("MaxQuant", "DIA-NN", "Spectronaut", "PD")
)
Arguments
- input_df
A tibble with precursor, modified peptide and proteinGroup level information. For MaxQuant: evidence.txt (with proteinGroups.txt supplied via input_MQ_pg); for PD: PSMs.txt exported with R-friendly headers enabled; for DIA-NN and Spectronaut: the default output reports.
- input_MQ_pg
For MaxQuant: a tibble with proteinGroup level information (proteinGroups.txt).
- software
The analysis software used: MaxQuant, PD, DIA-NN or Spectronaut. Default is MaxQuant.
Value
This function returns the originally submitted tibble (input_df) with the following new columns:
traceR_precursor - software-independent standardized text for precursor entries.
traceR_precursor_unknownMods - logical value; TRUE if a modification was detected that was not converted to the standardized format.
traceR_mod.peptides - software-independent standardized text for modified peptide entries.
traceR_mod.peptides_unknownMods - logical value; TRUE if a modification was detected that was not converted to the standardized format.
traceR_proteinGroups - software-independent standardized text for proteinGroups.
Details
The input entries are converted to a software-independent format. The generated entries are appended to the submitted tibble as new columns.
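To illustrate the idea of a software-independent format, the following base R sketch standardizes MaxQuant-style modified sequences. It is not the package's implementation: the helper name standardize_mq_sequence is hypothetical, only a single modification (oxidation, UniMod:35) is mapped, and the precursor format (peptide followed by charge) is an assumption based on the Precursor.Id column shown in the example output below.

```r
# Hypothetical helper for illustration only; not part of the package.
standardize_mq_sequence <- function(modified_sequence, charge) {
  # Remove all underscores. MaxQuant wraps sequences in them; note that this
  # also strips underscores inside modification labels.
  peptide <- gsub("_", "", modified_sequence, fixed = TRUE)

  # Map one known MaxQuant label to its UniMod accession (oxidation = 35).
  peptide <- gsub("(Oxidation (M))", "(UniMod:35)", peptide, fixed = TRUE)

  # Collect every remaining parenthesised label per sequence and flag
  # sequences that still carry a label not written in UniMod notation.
  labels <- regmatches(peptide, gregexpr("\\([^()]*\\)", peptide))
  unknown <- vapply(
    labels,
    function(x) any(!grepl("^\\(UniMod:[0-9]+\\)$", x)),
    logical(1)
  )

  data.frame(
    mod_peptide = peptide,
    precursor = paste0(peptide, charge),  # assumed format: peptide + charge
    unknown_mods = unknown,
    stringsAsFactors = FALSE
  )
}

res <- standardize_mq_sequence(
  c("_AACLLPK_",
    "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
    "ALTDM(Dummy_Modification)PQMK"),
  c(2, 2, 3)
)
print(res)
```

In this sketch only the third entry is flagged, because its modification label has no mapping. This mirrors the purpose of the *_unknownMods columns: flagged rows are kept in the output, but their standardized text should be treated with caution.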
Examples
# Load libraries
library(dplyr)
library(stringr)
library(tidyr)
library(comprehenr)
library(tibble)
# MaxQuant example data
evidence <- tibble::tibble(
  "Modified sequence" = c("_AACLLPK_",
                          "_ALTDM(Oxidation (M))PQM(Oxidation (M))R_",
                          "ALTDM(Dummy_Modification)PQMK"),
  Charge = c(2, 2, 3),
  "Protein group IDs" = c("26", "86;17", "86;17")
)
proteingroups <- tibble::tibble(
  "Protein IDs" = c("A0A075B6P5;P01615;A0A087WW87;P01614;A0A075B6S6", "P02671", "P02672"),
  id = c(26, 86, 17)
)
# Conversion
convert_all_levels(
  input_df = evidence,
  input_MQ_pg = proteingroups,
  software = "MaxQuant"
)
#> # A tibble: 5 x 10
#>   `Protein IDs`      id `Modified sequenc~ Precursor.Id  Charge traceR_mod.pept~
#>   <chr>           <dbl> <chr>              <chr>          <dbl> <chr>
#> 1 A0A075B6P5;P01~    26 _AACLLPK_          AACLLPK2           2 AACLLPK
#> 2 P02671             86 _ALTDM(Oxidation ~ ALTDM(Oxidat~      2 ALTDM(UniMod:35~
#> 3 P02671             86 ALTDM(Dummy_Modif~ ALTDM(DummyM~      3 ALTDM(DummyModi~
#> 4 P02672             17 _ALTDM(Oxidation ~ ALTDM(Oxidat~      2 ALTDM(UniMod:35~
#> 5 P02672             17 ALTDM(Dummy_Modif~ ALTDM(DummyM~      3 ALTDM(DummyModi~
#> # ... with 4 more variables: traceR_mod.peptides_unknownMods <lgl>,
#> #   traceR_precursor <chr>, traceR_precursor_unknownMods <lgl>,
#> #   traceR_proteinGroups <chr>