---
title: "Network Analysis and Visualization Guide"
author: "Avishek Bhandari"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Network Analysis and Visualization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 14,
  fig.height = 10,
  warning = FALSE,
  message = FALSE,
  dpi = 300,
  eval = FALSE
)
```

# Network Analysis and Visualization Guide

This vignette is a guide to the network analysis and visualization capabilities of ManyIVsNets. The package generates 7 publication-quality network visualizations at 600 DPI that examine Environmental Phillips Curve (EPC) relationships from a network perspective.

## Overview of Network Analysis in ManyIVsNets

ManyIVsNets implements seven types of network analysis:

1. **Transfer Entropy Networks**: Causal relationships between EPC variables
2. **Country Income Networks**: Economic similarity networks by income classification
3. **Cross-Income CO2 Growth Nexus**: Income-based environmental networks
4. **Migration Impact Networks**: Diaspora effects on environmental outcomes
5. **Instrument Causal Pathways**: Relationships between different instrument types
6. **Regional Networks**: Geographic and economic regional clustering
7. **Instrument Strength Comparison**: Comprehensive performance visualization

## Network Types and Applications

### 1. Transfer Entropy Networks (Variable-Level)

**Purpose**: Identify causal relationships between EPC variables

**Key Insight**: PCGDP → CO2 is the strongest causal flow (TE = 0.0375)

```r
# Create transfer entropy network visualization
library(ManyIVsNets)
library(dplyr)  # provides %>%, group_by(), summarise(), select(), arrange()

# Load sample data
data <- sample_epc_data

# Conduct transfer entropy analysis
te_results <- conduct_transfer_entropy_analysis(data)

# Create transfer entropy network plot
plot_transfer_entropy_network(te_results, output_dir = tempdir())
```

**Network Properties from Our Analysis:**

- **Density**: 0.095 (moderate causal connectivity)
- **Key relationships**: PCGDP → CO2 (0.0375), URF ↔ URM (bidirectional)
- **Node types**: Environmental, Employment, Economic, Energy variables

### 2. Country Income Classification Networks

**Purpose**: Analyze economic similarities between countries by income groups

**Key Insight**: High-income countries cluster with density 0.25

```r
# Create enhanced data with income classifications
enhanced_data <- create_enhanced_test_data()

# Create country network by income classification
country_network <- create_country_income_network(enhanced_data)

# Plot country income network
plot_country_income_network(country_network, output_dir = tempdir())
```

**Network Characteristics:**

- **High-Income Countries**: USA, Germany, Japan, UK, France
- **Network Density**: 0.25 (strong connectivity within income groups)
- **Clustering**: Countries group by economic similarity and geographic proximity

### 3. Cross-Income CO2 Growth Nexus

**Purpose**: Examine environmental-economic relationships across income levels

**Key Insight**: Different income groups show distinct CO2-growth patterns

```r
# Create cross-income CO2 growth nexus visualization
plot_cross_income_co2_nexus(enhanced_data, output_dir = tempdir())

# Example of income-based patterns
income_patterns <- enhanced_data %>%
  group_by(income_group) %>%
  summarise(
    avg_co2 = mean(lnCO2, na.rm = TRUE),
    avg_ur = mean(lnUR, na.rm = TRUE),
    avg_gdp = mean(lnPCGDP, na.rm = TRUE),
    .groups = 'drop'
  )

print(income_patterns)
```

**Key Findings:**

- **High-Income Countries**: Lower unemployment, higher CO2 per capita
- **Upper-Middle-Income**: Transitional patterns with moderate emissions
- **Network Effects**: Economic similarity drives environmental clustering

### 4. Migration Impact Networks

**Purpose**: Analyze how migration networks affect environmental outcomes

**Key Insight**: Diaspora strength correlates with CO2 growth patterns

```r
# Create migration impact visualization
plot_migration_impact(enhanced_data, output_dir = tempdir())

# Example migration patterns
migration_examples <- data.frame(
  country = c("Ireland", "USA", "Germany", "Poland"),
  diaspora_network_strength = c(0.9, 0.2, 0.4, 0.9),
  english_language_advantage = c(1.0, 1.0, 0.8, 0.4),
  interpretation = c("High emigration", "Immigration destination",
                     "Mixed patterns", "High emigration")
)

print(migration_examples)
```

**Migration Network Effects:**

- **High Emigration Countries**: Ireland, Italy, Poland (diaspora strength = 0.9)
- **Immigration Destinations**: USA, Canada, Australia (diaspora strength = 0.2)
- **Language Effects**: English advantage creates network spillovers

### 5. Instrument Causal Pathways

**Purpose**: Show relationships between different instrument types

**Key Insight**: Geographic and technology instruments cluster together

```r
# Create instrument causal pathways network
plot_instrument_causal_pathways(enhanced_data, output_dir = tempdir())

# Example instrument correlations
instrument_correlations <- enhanced_data %>%
  select(geo_isolation, tech_composite, migration_composite,
         financial_composite, te_isolation) %>%
  cor(use = "complete.obs") %>%
  round(3)

print(instrument_correlations)
```

**Instrument Clustering Patterns:**

- **Geographic-Technology Cluster**: Strong correlation (r = 0.65)
- **Migration-Financial Cluster**: Moderate correlation (r = 0.43)
- **Transfer Entropy**: Independent variation (unique identification)

### 6. Regional Networks

**Purpose**: Analyze regional clustering and geographic effects

**Key Insight**: Countries cluster by geographic proximity and economic similarity

```r
# Create regional network visualization
plot_regional_network(enhanced_data, output_dir = tempdir())

# Regional clustering examples
regional_examples <- data.frame(
  region = c("Europe", "North_America", "Asia", "Oceania"),
  countries = c("Germany, France, UK, Italy", "USA, Canada",
                "Japan, Korea, China", "Australia, New Zealand"),
  characteristics = c("Economic integration", "NAFTA effects",
                      "Development diversity", "Geographic isolation")
)

print(regional_examples)
```

**Regional Network Properties:**

- **European Integration**: High connectivity within EU countries
- **Geographic Effects**: Distance influences network formation
- **Economic Similarity**: GDP levels drive regional clustering

### 7. Instrument Strength Comparison

**Purpose**: Compare performance of all 24 instrument approaches

**Key Insight**: Judge Historical SOTA has the highest first-stage F-statistic (F = 7,155.39)

```r
# Calculate comprehensive instrument strength
strength_results <- calculate_instrument_strength(enhanced_data)

# Create instrument strength comparison plot
plot_instrument_strength_comparison(strength_results, output_dir = tempdir())

# Display top 10 strongest instruments
top_instruments <- strength_results %>%
  arrange(desc(F_Statistic)) %>%
  head(10)

print(top_instruments)
```

**Instrument Performance Hierarchy (selected instruments, in descending order of F-statistic):**

1. **Judge Historical SOTA**: F = 7,155.39 (Exceptionally Strong)
2. **Spatial Lag SOTA**: F = 569.90 (Very Strong)
3. **Geopolitical Composite**: F = 362.37 (Very Strong)
4. **Technology Composite**: F = 188.47 (Very Strong)
5. **Financial Composite**: F = 113.77 (Very Strong)

## Comprehensive Network Analysis Results

### Key Findings from Network Analysis

**1. Transfer Entropy Networks**

- Network density: 0.095 (moderate causal connectivity)
- Strongest causal relationship: PCGDP → CO2 (TE = 0.0375)
- Bidirectional employment causality: URF ↔ URM

**2. Country Networks**

- Income-based clustering with density 0.25
- High-income countries form tight clusters
- Regional effects complement income classification

**3. Migration Networks**

- Diaspora strength correlates with environmental outcomes
- High emigration countries (Ireland, Italy, Poland) show distinct patterns
- English language advantage creates network effects

**4. Instrument Networks**

- Geographic and technology instruments cluster together
- Transfer entropy instruments provide unique identification
- Alternative SOTA approaches complement traditional methods

## Network Visualization Best Practices

### 1. Layout Algorithms

```r
# Different layout options for network visualization
# fr = Fruchterman-Reingold, kk = Kamada-Kawai,
# dh = Davidson-Harel (simulated annealing)
layout_comparison <- data.frame(
  Layout = c("stress", "circle", "fr", "kk", "dh"),
  Best_For = c("General purpose", "Categorical data", "Force-directed",
               "Small to medium networks", "Small networks"),
  Pros = c("Balanced", "Clear grouping", "Natural clusters",
           "Even edge lengths", "Few edge crossings"),
  Cons = c("Can be cluttered", "Fixed positions", "Can overlap",
           "Slow on large graphs", "Slow (simulated annealing)")
)

print(layout_comparison)
```

### 2. Color Schemes

```r
# Color scheme recommendations
color_schemes <- data.frame(
  Purpose = c("Income Groups", "Regions", "Variable Types", "Instrument Types"),
  Scheme = c("Manual (income-based)", "Viridis", "Manual (semantic)",
             "Manual (method-based)"),
  Colors = c("Red/Orange/Yellow/Gray", "Perceptually uniform gradient",
             "Blue/Green/Red/Orange", "Distinct categorical"),
  Accessibility = c("Good", "Excellent", "Good", "Good")
)

print(color_schemes)
```

### 3. Node and Edge Sizing

```r
# Sizing guidelines
sizing_guidelines <- data.frame(
  Element = c("Nodes", "Edges", "Labels", "Arrows"),
  Size_Range = c("2-8", "0.5-3", "2-4", "3-5mm"),
  Based_On = c("Centrality/Importance", "Weight/Strength", "Readability",
               "Edge weight"),
  Considerations = c("Avoid overlap", "Show hierarchy", "Legible at 600 DPI",
                     "Clear direction")
)

print(sizing_guidelines)
```

## Advanced Network Analysis

### 1. Network Metrics

```r
# Calculate comprehensive network metrics
calculate_network_metrics <- function(network) {
  if (igraph::vcount(network) == 0) return(NULL)

  # cluster_louvain() only accepts undirected graphs, so collapse edge
  # directions before detecting communities (no-op if already undirected)
  undirected <- igraph::as.undirected(network, mode = "collapse")
  communities <- igraph::cluster_louvain(undirected)

  metrics <- data.frame(
    Metric = c("Density", "Diameter", "Average Path Length",
               "Clustering Coefficient", "Number of Components", "Modularity"),
    Value = c(
      round(igraph::edge_density(network), 3),
      igraph::diameter(network),
      round(igraph::mean_distance(network), 3),
      round(igraph::transitivity(network), 3),
      igraph::components(network)$no,
      round(igraph::modularity(undirected, igraph::membership(communities)), 3)
    ),
    Interpretation = c(
      "Network connectivity level",
      "Maximum shortest path",
      "Average distance between nodes",
      "Local clustering tendency",
      "Disconnected subgroups",
      "Community structure strength"
    )
  )

  return(metrics)
}

# Example usage
# network_metrics <- calculate_network_metrics(your_network)
# print(network_metrics)
```

### 2. Community Detection

```r
# Detect communities in networks
detect_communities <- function(network) {
  # Too few nodes or no edges: nothing meaningful to cluster
  if (igraph::vcount(network) < 3 || igraph::ecount(network) == 0) return(NULL)

  # cluster_louvain() requires an undirected graph; use the same undirected
  # graph for every algorithm so the modularity scores are comparable
  undirected <- igraph::as.undirected(network, mode = "collapse")

  # Multiple community detection algorithms
  communities <- list(
    louvain = igraph::cluster_louvain(undirected),
    walktrap = igraph::cluster_walktrap(undirected),
    infomap = igraph::cluster_infomap(undirected)
  )

  # Compare modularity scores
  modularity_scores <- vapply(
    communities,
    function(x) igraph::modularity(undirected, igraph::membership(x)),
    numeric(1)
  )

  # Return best performing algorithm
  best_algorithm <- names(which.max(modularity_scores))

  return(list(
    communities = communities[[best_algorithm]],
    algorithm = best_algorithm,
    modularity = max(modularity_scores)
  ))
}

# Example usage
# community_results <- detect_communities(your_network)
# print(community_results)
```

### 3. Network Evolution Analysis

```r
# Analyze how networks change over time
analyze_network_evolution <- function(data, time_windows = 5) {
  # Sort so that each window covers consecutive years
  years <- sort(unique(data$year))
  evolution_results <- list()

  # Fewer years than one full window: nothing to analyze
  if (length(years) < time_windows) return(evolution_results)

  for (i in seq(time_windows, length(years), by = time_windows)) {
    window_years <- years[(i - time_windows + 1):i]
    window_data <- dplyr::filter(data, year %in% window_years)

    # Create network for this time window
    # (Implementation would depend on specific network type)

    evolution_results[[paste0("Period_", i)]] <- list(
      years = window_years,
      network_density = NA_real_,        # placeholder: density of the window's network
      key_relationships = NA_character_  # placeholder: relationships found in the window
    )
  }

  return(evolution_results)
}

# Example usage
# evolution_results <- analyze_network_evolution(enhanced_data)
# print(evolution_results)
```

## Complete Network Analysis Workflow

### Step 1: Data Preparation

```r
# Load and prepare data
library(ManyIVsNets)
library(dplyr)  # provides %>% and the data verbs used in later steps

# Load sample data
epc_data <- sample_epc_data

# Create enhanced dataset with all instruments
enhanced_data <- create_enhanced_test_data()

# Create real instruments from data patterns
instruments <- create_real_instruments_from_data(epc_data)

# Merge data with instruments
final_data <- merge_epc_with_created_instruments(epc_data, instruments)
```

### Step 2: Transfer Entropy Analysis

```r
# Conduct comprehensive transfer entropy analysis
te_results <- conduct_transfer_entropy_analysis(final_data)

# Extract network properties
te_network_density <- igraph::edge_density(te_results$te_network)
te_causal_links <- sum(te_results$te_matrix > te_results$threshold)

cat("Transfer Entropy Network Density:", te_network_density, "\n")
cat("Number of Causal Links:", te_causal_links, "\n")
```

### Step 3: Create All Network Visualizations

```r
# Create output directory
output_dir <- tempdir()

# Generate all 7 network visualizations
network_plots <- create_comprehensive_network_plots(final_data, output_dir)

# Display network summary
cat("Generated", length(network_plots), "network visualizations\n")
```

### Step 4: Instrument Strength Analysis

```r
# Calculate comprehensive instrument strength
strength_results <- calculate_instrument_strength(final_data)

# Summarize performance
strength_summary <- strength_results %>%
  group_by(Strength) %>%
  summarise(
    Count = n(),
    Avg_F_Stat = mean(F_Statistic),
    .groups = 'drop'
  )

print(strength_summary)
```

## Empirical Results Summary

### Network Analysis Performance

```r
# Comprehensive results summary
# NA marks values that do not apply; using NA rather than "N/A" keeps the
# Density and Edges columns numeric
results_summary <- data.frame(
  Network_Type = c("Transfer Entropy", "Country Income", "Cross-Income CO2",
                   "Migration Impact", "Instrument Pathways", "Regional",
                   "Instrument Strength"),
  Density = c(0.095, 0.25, 0.18, 0.12, 0.33, 0.22, NA),
  Key_Finding = c("PCGDP → CO2 (TE=0.0375)", "Income clustering",
                  "Distinct CO2 patterns", "Diaspora effects",
                  "Geographic-tech cluster", "Regional integration",
                  "Judge Historical F=7,155"),
  Nodes = c(7, 49, 49, 49, 15, 49, 24),
  Edges = c(2, 294, 211, 142, 75, 258, NA)
)

print(results_summary)
```

### Top Performing Instruments

```r
# Top 10 strongest instruments with network context
top_instruments_detailed <- data.frame(
  Rank = 1:10,
  Instrument = c("Judge Historical SOTA", "Spatial Lag SOTA",
                 "Geopolitical Composite", "Geopolitical Real",
                 "Alternative SOTA Combined", "Tech Composite",
                 "Technology Real", "Real Geographic Tech",
                 "Financial Composite", "Financial Real"),
  F_Statistic = c(7155.39, 569.90, 362.37, 259.44, 202.93,
                  188.47, 139.42, 125.71, 113.77, 94.12),
  Strength = rep("Very Strong", 10),
  Network_Role = c("Historical events", "Spatial spillovers",
                   "Political transitions", "Institutional change",
                   "Combined approaches", "Technology diffusion",
                   "Innovation patterns", "Geographic tech",
                   "Financial development", "Market maturity")
)

print(top_instruments_detailed)
```

## Conclusion

The network analysis capabilities in ManyIVsNets provide:

1. **Comprehensive Visualization**: 7 publication-quality network plots at 600 DPI
2. **Multiple Network Types**: Variable-level, country-level, and instrument-level networks
3. **Causal Discovery**: Transfer entropy networks reveal directional relationships
4. **Economic Insights**: Income, regional, and migration effects on environmental outcomes
5. **Methodological Innovation**: First comprehensive network approach to EPC analysis

**Key Empirical Results:**

- Transfer entropy network density: 0.095
- Country network density: 0.25
- Strongest causal relationship: PCGDP → CO2 (TE = 0.0375)
- 21 of 24 instruments show strong performance (F > 10)
- Judge Historical SOTA: F = 7,155.39 (strongest instrument)

This network analysis framework offers both theoretical insights and practical tools for policy analysis in environmental economics.

### Future Extensions

The network analysis framework can be extended to:

1. **Dynamic Networks**: Time-varying network structures
2. **Multilayer Networks**: Multiple relationship types simultaneously
3. **Spatial Networks**: Geographic distance-based relationships
4. **Policy Networks**: Government intervention effects
5. **Sectoral Networks**: Industry-specific environmental relationships

These extensions would further extend the analytical reach of the ManyIVsNets package for environmental economics research.
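As a rough illustration of the first extension (dynamic networks), the sketch below tracks network density across consecutive time windows using base R only. It is not part of ManyIVsNets: `window_density()` is a hypothetical helper, the toy panel values are made up, and linking two countries when their within-window correlation exceeds a threshold is an assumption chosen for simplicity rather than the package's own network construction.

```r
# Hypothetical helper (not part of ManyIVsNets): density of a
# correlation-threshold network in consecutive, non-overlapping windows.
# Assumes a long-format panel with columns `country`, `year`, and `value_col`.
window_density <- function(panel, value_col, window = 5, threshold = 0.5) {
  years <- sort(unique(panel$year))
  if (length(years) < window) return(data.frame())

  starts <- seq(1, length(years) - window + 1, by = window)
  rows <- lapply(starts, function(s) {
    yrs <- years[s:(s + window - 1)]
    sub <- panel[panel$year %in% yrs, ]

    # Wide matrix: one row per year, one column per country
    wide <- tapply(sub[[value_col]], list(sub$year, sub$country), mean)

    # Link two countries when their series are strongly correlated
    adj <- abs(cor(wide, use = "pairwise.complete.obs")) >= threshold
    diag(adj) <- FALSE

    n <- ncol(adj)
    edges <- sum(adj[upper.tri(adj)], na.rm = TRUE)
    data.frame(start = min(yrs), end = max(yrs), nodes = n,
               edges = edges, density = edges / (n * (n - 1) / 2))
  })
  do.call(rbind, rows)
}

# Toy panel (made-up values): A and B move together in 2001-2005 only
toy_panel <- data.frame(
  country = rep(c("A", "B", "C"), each = 10),
  year = rep(2001:2010, times = 3),
  lnCO2 = c(1:10,
            2, 4, 6, 8, 10, 3, 1, 3, 1, 3,
            1, -1, 1, -1, 1, 1, 1, -1, -1, 1)
)

evolution <- window_density(toy_panel, "lnCO2", window = 5, threshold = 0.5)
print(evolution)
```

On the toy panel, the first window contains a single link (A and B) and the second contains none, so density falls from 1/3 to 0. With real data, the same loop could wrap any of the package's network constructors in place of the correlation threshold.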