3.1 Real data

Inspecting and summarizing the proteomics data.

library(readr)
library(plotly)
library(tidyverse)

📌 Remember: Load the libraries before starting the analysis.

Load and prepare the data

How to load an R object

In R programming, objects are the fundamental data structures used to store and manipulate data. Objects in R can hold different types of data, such as numbers, characters, lists, or even more complex structures like data frames and matrices.

An object in R is an instance of a class and can be assigned to a variable. Unlike many other programming languages, R does not require variables to be explicitly declared with a data type. Instead, variables in R are associated with objects, and R automatically determines the object type based on the assigned value.
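
As a small illustration of this dynamic typing, the sketch below assigns values of different kinds to variables and asks R for their class. All variable names and values are made up for the example and are not part of the proteomics data set.

```r
# Hypothetical example objects; none of these come from the course data.
n_proteins <- 42L                      # integer
mean_intensity <- 17.5                 # double (reported as "numeric")
gene_name <- "TP53"                    # character
samples <- list(id = 1, label = "WT")  # list
toy_table <- data.frame(x = 1:3, y = c("a", "b", "c"))  # data frame

# class() reports the type R inferred from the assigned value
class(n_proteins)      # "integer"
class(mean_intensity)  # "numeric"
class(gene_name)       # "character"
class(samples)         # "list"
class(toy_table)       # "data.frame"

# Reassigning changes the class; no declaration is needed
n_proteins <- "forty-two"
class(n_proteins)      # "character"
```

The same `class()` call can be applied to any object loaded later in this section to check what kind of structure it is.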

protein_data_parsed <- readRDS("data-01/protein_data_parsed_mut.rds")
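
To see what `readRDS()` does, here is a minimal round-trip sketch: a toy data frame is written with `saveRDS()` and read back. The data frame, its column names, and the temporary file are assumptions made for illustration; only base R is used.

```r
# Hypothetical toy data frame standing in for a parsed protein table
toy <- data.frame(
  Protein = c("P1", "P2", "P3"),
  Intensity = c(20.1, 18.7, 22.4)
)

# Write to a temporary .rds file so nothing in the project is overwritten
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy, rds_path)

# readRDS() returns the stored object; we choose its name when assigning it
toy_restored <- readRDS(rds_path)

# The restored object matches the original
identical(toy, toy_restored)  # TRUE
```

Because an `.rds` file stores a single object without its name, the name on the left of `<-` is up to you.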

📌 Remember: You can use the DT package to visualize the data.

DT::datatable(
  data = head(protein_data_parsed, 1000),  # show only the first 1000 rows
  rownames = FALSE,
  extensions = c('Buttons', 'Scroller'),
  options = list(
    dom = 'Bfrtip',
    buttons = c('copy', 'csv'),
    deferRender = TRUE,
    scrollX = TRUE,
    scrollY = 200,
    scroller = TRUE
  ),
  caption = 'proteomics metadata'
)
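
Alongside the interactive table, a few base R functions give a quick text summary of a data frame. The sketch below uses a small made-up table whose three column names mirror those used in this section's plotting code; the values are invented and do not come from the real data.

```r
# Hypothetical toy table; values are invented for illustration
toy <- data.frame(
  Intensity = c(20.1, 18.7, 22.4, 19.9),
  Label = c("A", "A", "B", "B"),
  Reference = c("run1", "run2", "run1", "run2")
)

dim(toy)                # number of rows and columns
str(toy)                # column types and first values
summary(toy$Intensity)  # min, quartiles, mean, max
table(toy$Label)        # number of rows per label
head(toy, 2)            # first two rows
```

The same calls should work on `protein_data_parsed`, assuming it is a data frame or tibble.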

We can also load the data from a CSV file. We won't do that for now, but this is the command:

read_csv() reads a CSV file into a tibble

data <- read_csv("../data-01/PXD040621_peptides.csv", show_col_types = FALSE)
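
As a self-contained sketch of the same call, the example below writes a tiny CSV to a temporary file and reads it back with `read_csv()`. The file contents and column names are hypothetical; only the `readr` package loaded above is assumed.

```r
library(readr)

# Write a tiny hypothetical CSV to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Peptide,Intensity", "AAK,20.1", "LLR,18.7"), csv_path)

# show_col_types = FALSE suppresses the column-type message
toy_peptides <- read_csv(csv_path, show_col_types = FALSE)

toy_peptides                   # a tibble with 2 rows and 2 columns
class(toy_peptides$Intensity)  # "numeric"
```

`read_csv()` guesses each column type from the file contents, so it is worth checking the result before analysis.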

p <- ggplot(protein_data_parsed, aes(x = Intensity, fill = Label)) +
    geom_histogram(bins = 40, color = "white") +
    theme_minimal() +
    facet_wrap(~Reference, scales = "fixed", nrow = 2)

ggplotly(p)

plot_ly(
  data = protein_data_parsed,
  x = ~Intensity,
  color = ~Reference,        # equivalent to fill in ggplot2
  type = "histogram"
) %>%
  layout(
    barmode = "stack", # "overlay" or "stack"
    title = "Protein Intensity Distribution by Reference",
    xaxis = list(title = "Intensity"),
    yaxis = list(title = "Count")
  )