--- title: "Getting Started with SingleCellComplexHeatMap" author: "XueCheng" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Getting Started with SingleCellComplexHeatMap} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 12, fig.height = 8, warning = FALSE, message = FALSE ) ``` # Introduction The `SingleCellComplexHeatMap` package provides a powerful and flexible way to visualize single-cell RNA-seq data using complex heatmaps that simultaneously display both gene expression levels (as color intensity) and expression percentages (as circle sizes). This dual-information approach allows for comprehensive visualization of expression patterns across different cell types and conditions. ## Key Features - **Dual Information Display**: Shows both expression intensity and percentage in a single visualization - **Gene Grouping**: Organize genes by functional categories with color-coded annotations - **Time Course Support**: Handle time series data with proper ordering and visualization - **Cell Type Annotations**: Annotate and group different cell types - **Flexible Display Options**: Show/hide different annotation layers - **Highly Customizable**: Extensive parameters for colors, sizes, fonts, and layouts - **Seurat Integration**: Direct integration with Seurat objects # Installation ```{r eval=FALSE} # Install from GitHub devtools::install_github("FanXuRong/SingleCellComplexHeatMap") ``` # Quick Start ## Loading Libraries and Data ```{r setup} library(SingleCellComplexHeatMap) library(Seurat) library(dplyr) # Load optional color packages for testing if (requireNamespace("ggsci", quietly = TRUE)) { library(ggsci) } if (requireNamespace("viridis", quietly = TRUE)) { library(viridis) } # For this vignette, we'll use the built-in pbmc_small dataset data("pbmc_small", package = "SeuratObject") seurat_obj <- pbmc_small # Add example metadata for demonstration set.seed(123) seurat_obj$timepoint <- sample(c("Mock", "6hpi", "24hpi"), ncol(seurat_obj), replace = TRUE) seurat_obj$celltype <- sample(c("T_cell", "B_cell", "Monocyte"), ncol(seurat_obj), replace = TRUE) seurat_obj$group <- paste(seurat_obj$timepoint, seurat_obj$celltype, sep = "_") # Check the structure head(seurat_obj@meta.data) ``` ## Basic Usage The simplest way to create a complex heatmap is to provide a Seurat object and a list of features: ```{r basic_example, fig.width=12, fig.height=8} # Define genes to visualize features <- c("CD3D", "CD3E", "CD8A", "IL32", "CD79A") # Create basic heatmap heatmap_basic <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, group_by = "group" ) ``` ## Advanced Usage with Gene Grouping For more complex analyses, you can group genes by functional categories: ```{r advanced_example, fig.width=12, fig.height=8} # Define gene groups gene_groups <- list( "T_cell_markers" = c("CD3D", "CD3E", "CD8A", "IL32"), "B_cell_markers" = c("CD79A", "CD79B", "MS4A1"), "Activation_markers" = c("GZMK", "CCL5") ) # Get all genes from groups all_genes <- c("CD3D", "CD3E", "CD8A", "IL32","CD79A", "CD79B", "MS4A1","GZMK", "CCL5") # Create advanced heatmap with gene grouping heatmap_advanced <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = all_genes, gene_classification = gene_groups, group_by = "group", time_points_order = c("Mock", "6hpi", "24hpi"), cell_types_order = c("T_cell", "B_cell", "Monocyte"), color_range = c(-2, 0, 2), color_palette = c("navy", "white", "firebrick"), max_circle_size = 3, split_by = "time" ) ``` # Understanding the Visualization ## Dual Information Display The complex heatmap displays two types of information simultaneously: 1. **Color Intensity**: Represents the scaled expression level of each gene 2. **Circle Size**: Represents the percentage of cells expressing each gene This dual approach allows you to distinguish between genes that are: - Highly expressed in few cells (small intense circles) - Moderately expressed in many cells (large moderate circles) - Highly expressed in many cells (large intense circles) ## Data Preparation Requirements For optimal functionality, your group identifiers should follow the format `"timepoint_celltype"`: ```{r data_prep} # Example of proper data preparation seurat_obj@meta.data <- seurat_obj@meta.data %>% mutate( # Create combined group for time course + cell type analysis time_celltype = paste(timepoint, celltype, sep = "_"), # Or create other combinations as needed cluster_time = paste(RNA_snn_res.0.8, timepoint, sep = "_") ) head(seurat_obj@meta.data[, c("timepoint", "celltype", "time_celltype","cluster_time")]) ``` # Customization Options ## Custom Annotation Titles Here, we test the ability to change the default titles for the annotation tracks. ```{r test_custom_titles, fig.width=12, fig.height=8} heatmap <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", time_points_order = c("Mock", "6hpi", "24hpi"), # NEW: Custom annotation titles gene_group_title = "Gene Function", time_point_title = "Time Point", cell_type_title = "Cell Type", split_by = "time" ) ``` ## Color Schemes You can customize colors for different components: ```{r color_customization, fig.width=12, fig.height=8} # Custom color schemes heatmap_colors <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", color_range = c(-1.5, 0, 1.5,3), # 4-point gradient color_palette = c("darkblue", "blue", "white", "red"), # 4 colors gene_color_palette = "Spectral", time_color_palette = "Set2", celltype_color_palette = "Pastel1" ) ``` ### Using `ggsci` Palettes ```{r test_ggsci_colors, fig.width=12, fig.height=8} if (requireNamespace("ggsci", quietly = TRUE)) { heatmap_colors <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", time_points_order = c("Mock", "6hpi", "24hpi"), # NEW: Using ggsci color vectors gene_color_palette = pal_npg()(3), time_color_palette = pal_lancet()(3), celltype_color_palette = pal_jama()(4), # Custom expression heatmap colors color_range = c(-2, 0, 2), color_palette = c("#2166AC", "#F7F7F7", "#B2182B") ) } ``` ### 2.2 Using `viridis` and Custom Colors ```{r test_viridis_custom_colors, fig.width=12, fig.height=8} if (requireNamespace("ggsci", quietly = TRUE)) { heatmap_colors <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", time_points_order = c("Mock", "6hpi", "24hpi"), # NEW: Using ggsci color vectors gene_color_palette = pal_npg()(3), time_color_palette = pal_lancet()(3), celltype_color_palette = pal_jama()(4), # Custom expression heatmap colors color_range = c(-2, 0, 2), color_palette = c("#2166AC", "#F7F7F7", "#B2182B") ) } ``` ## Font and Size Adjustments ```{r font_customization, fig.width=12, fig.height=8} # Publication-ready styling heatmap_publication <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = all_genes, gene_classification = gene_groups, group_by = "group", max_circle_size = 2.5, row_fontsize = 12, col_fontsize = 12, row_title_fontsize = 14, col_title_fontsize = 12, percentage_legend_title = "Fraction of cells", percentage_legend_labels = c("0", "20", "40", "60", "80"), legend_side = "right" ) ``` ## Visual Element Control This test demonstrates how to remove cell borders and column annotations for a cleaner look. ```{r test_visual_control, fig.width=12, fig.height=8} heatmap_con <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", # NEW: Visual control parameters show_cell_borders = FALSE, show_column_annotation = FALSE, # Other parameters for a clean plot split_by = "none", cluster_cells = TRUE ) ``` ## Gene Name Mapping This test shows how to replace default gene names (e.g., symbols) with custom labels on the y-axis. ```{r test_gene_mapping, fig.width=12, fig.height=8} # Create a mapping for a subset of genes gene_mapping <- c( "CD3D" = "T-cell Receptor CD3d", "CD79A" = "B-cell Antigen Receptor CD79a", "GZMK" = "Granzyme K", "NKG7" = "Natural Killer Cell Granule Protein 7" ) heatmap_map <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", # NEW: Gene name mapping gene_name_mapping = gene_mapping, row_fontsize = 9 ) ``` ## Clustering Control You can control clustering behavior for both genes and cells: ```{r clustering_control, fig.width=12, fig.height=8} # Custom clustering heatmap_clustering <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, group_by = "group", cluster_cells = TRUE, cluster_features = TRUE, clustering_distance_rows = "pearson", clustering_method_rows = "ward.D2", clustering_distance_cols = "euclidean", clustering_method_cols = "complete" ) ``` # Different Use Cases ## Case 1: Time Course Analysis For time course experiments, focus on temporal patterns: ```{r time_course, fig.width=12, fig.height=8} # Time course focused analysis heatmap_time <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, group_by = "group", time_points_order = c("Mock", "6hpi", "24hpi"), cell_types_order = c("T_cell", "B_cell", "Monocyte"), split_by = "time", show_celltype_annotation = TRUE, show_time_annotation = TRUE ) ``` ## Case 2: Cell Type Comparison For cell type-focused analysis: ```{r cell_type, fig.width=12, fig.height=8} # Cell type focused analysis heatmap_celltype <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, group_by = "celltype", split_by = "celltype", show_time_annotation = FALSE, show_celltype_annotation = TRUE ) ``` ## Case 3: Simple Expression Analysis For basic expression visualization without grouping: ```{r simple, fig.width=8, fig.height=6} # Simple analysis heatmap_sample <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = NULL, # No gene grouping group_by = "group", show_time_annotation = FALSE, show_celltype_annotation = FALSE, split_by = "none" ) ``` # Comprehensive Example This final example combines several new and existing features to create a highly customized, publication-ready plot. ```{r test_comprehensive, fig.width=14, fig.height=12} create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", time_points_order = c("Mock", "6hpi", "24hpi"), # --- New Features --- gene_group_title = "Functional Category", time_point_title = "Time Post-Infection", cell_type_title = "Cell Identity", show_cell_borders = TRUE, cell_border_color = "white", gene_name_mapping = c("MS4A1" = "CD20"), # --- Color Customization --- color_range = c(-2, 0, 2), color_palette = c("#0072B2", "white", "#D55E00"), # Colorblind-friendly gene_color_palette = "Dark2", time_color_palette = "Set2", celltype_color_palette = "Paired", # --- Layout and Font --- row_fontsize = 10, col_fontsize = 9, row_title_fontsize = 12, col_title_fontsize = 12, col_name_rotation = 45, legend_side = "right", merge_legends = TRUE, # --- Clustering and Splitting --- cluster_features = FALSE, # Rows are already grouped cluster_cells = FALSE, # Columns are already grouped split_by = "time" ) ``` # Working with Helper Functions The package provides helper functions for step-by-step analysis: ## Step 1: Prepare Expression Matrices ```{r helper_1} # Prepare matrices matrices <- prepare_expression_matrices( seurat_object = seurat_obj, features = features, group_by = "group", idents = NULL # Use all groups ) # Check the structure dim(matrices$exp_mat) dim(matrices$percent_mat) head(matrices$dotplot_data) ``` ## Step 2: Create Gene Annotations ```{r helper_2} # Create gene annotations if (!is.null(gene_groups)) { gene_ann <- create_gene_annotations( exp_mat = matrices$exp_mat, percent_mat = matrices$percent_mat, gene_classification = gene_groups, color_palette = "Set1" ) # Check results dim(gene_ann$exp_mat_ordered) levels(gene_ann$annotation_df$GeneGroup) } ``` ## Step 3: Create Cell Annotations ```{r helper_3} # Create cell annotations cell_ann <- create_cell_annotations( exp_mat = matrices$exp_mat, percent_mat = matrices$percent_mat, time_points_order = c("Mock", "6hpi", "24hpi"), cell_types_order = c("T_cell", "B_cell", "Monocyte"), show_time_annotation = TRUE, show_celltype_annotation = TRUE ) # Check results dim(cell_ann$exp_mat_ordered) head(cell_ann$annotation_df) ``` # Saving and Exporting You can save plots directly from the function: ```{r save_plot, eval=FALSE} # Save plot to file heatmap_saved <- create_single_cell_complex_heatmap( seurat_object = seurat_obj, features = features, gene_classification = gene_groups, group_by = "group", save_plot = "my_heatmap.png", plot_width = 12, plot_height = 10, plot_dpi = 300 ) ``` # Tips and Best Practices ## 1. Data Preparation - Ensure your metadata columns are properly formatted - Use consistent naming conventions for time points and cell types - Consider the number of features - too many can make the plot cluttered ## 2. Color Selection - Use diverging color palettes for expression data - Ensure sufficient contrast for accessibility - Consider colorblind-friendly palettes ## 3. Performance Optimization - For large datasets, consider subsetting cells or using representative clusters - Reduce visual complexity with smaller circle sizes and fonts - Use `merge_legends = TRUE` for cleaner layouts ## 4. Publication Guidelines - Use high DPI (300+) for publication figures - Ensure font sizes are readable in the final format - Include scale bars and legends - Consider splitting complex plots into multiple panels # Troubleshooting ## Common Issues 1. **"Features not found"**: Check that gene names match exactly with your Seurat object 2. **"group_by column not found"**: Verify the metadata column name exists 3. **Empty heatmap**: Check that your `idents` parameter includes valid groups 4. **Clustering errors**: Ensure you have enough features for clustering algorithms ## Getting Help If you encounter issues: 1. Check the function documentation: `?create_single_cell_complex_heatmap` 2. Review the examples in this vignette 3. File an issue on GitHub: https://github.com/FanXuRong/SingleCellComplexHeatMap/issues # Session Information ```{r session_info} sessionInfo() ```