---
title: "Introduction to kardl"
author: "Huseyin Karamelikli and Huseyin Utku Demir"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to kardl}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
```

## Introduction

The `kardl` package is an R tool for estimating symmetric and asymmetric Autoregressive Distributed Lag (ARDL) and Nonlinear ARDL (NARDL) models, designed for econometricians and researchers analyzing cointegration and dynamic relationships in time series data. It offers flexible model specifications, allowing users to include deterministic variables, asymmetric effects for short- and long-run dynamics, and trend components. The package supports customizable lag structures, model selection criteria (AIC, BIC, AICc, HQ), and parallel processing for computational efficiency.

Key features include:

- **Flexible Formula Specification**: Use `asym()`, `asymL()`, and `asymS()` to model asymmetric effects in short- and long-run dynamics, and `deterministic()` for dummy variables.
- **Lag Optimization**: Choose between automatic lag selection (`"quick"`, `"grid"`, `"grid_custom"`) or user-defined lags.
- **Dynamic Analysis**: Compute long-run coefficients, perform cointegration tests (PSS F, PSS t, Narayan, Banerjee, ECM), and test for heteroskedasticity (ARCH).

This vignette demonstrates how to use the `kardl()` function to estimate an asymmetric ARDL model, perform diagnostic tests, and visualize results, using economic data from Turkey.
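
To make the idea of asymmetric effects concrete: NARDL-type models conventionally split a regressor into cumulative sums of its positive and negative changes and estimate a separate coefficient for each part. The base R sketch below illustrates that decomposition on made-up numbers. The helper `partial_sums()` is hypothetical and is not a `kardl` function; how the package constructs its own asymmetric terms may differ in detail.

```r
# Hypothetical helper (not a kardl function): split a series into
# cumulative sums of its positive and negative first differences.
partial_sums <- function(x) {
  dx <- diff(x)                        # first differences
  pos <- c(0, cumsum(pmax(dx, 0)))     # running sum of increases
  neg <- c(0, cumsum(pmin(dx, 0)))     # running sum of decreases
  data.frame(x = x, x_pos = pos, x_neg = neg)
}

# Made-up exchange-rate values, for illustration only
er <- c(1.00, 1.10, 1.05, 1.20, 1.15)
ps <- partial_sums(er)
ps

# The two components add back up to the change since the first observation
all.equal(ps$x_pos + ps$x_neg, er - er[1])
```

Because the two partial sums enter the regression as separate variables, a rise and a fall of the same size are allowed to have different effects on the dependent variable.
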
## Installation

The `kardl` package can be installed from CRAN:

```{r install-cran, eval=FALSE}
install.packages("kardl")
library(kardl)
```

Alternatively, you can use the `devtools` package to install it directly from GitHub:

```{r install, eval=FALSE}
# Install required packages
# (stats and utils ship with base R and do not need to be installed)
install.packages(c("msm", "lmtest", "nlWaldTest", "car", "strucchange"))

# Install kardl from GitHub
install.packages("devtools")
devtools::install_github("karamelikli/kardl")
```

Load the package:

```{r load}
library(kardl)
```

## Estimating an Asymmetric ARDL Model

This example estimates an asymmetric ARDL model to analyze the dynamics of exchange rate pass-through to domestic prices in Turkey, using a sample dataset (`imf_example_data`) with variables for Consumer Price Index (CPI), Exchange Rate (ER), Producer Price Index (PPI), and a COVID-19 dummy variable.

### Step 1: Data Preparation

We assume `imf_example_data` contains monthly data for CPI, ER, PPI, and a COVID-19 dummy variable. The series are retrieved from the IMF’s International Financial Statistics (IFS) dataset and prepared for analysis by ensuring proper formatting and adding the dummy variable.

Note: `imf_example_data` is a placeholder for demonstration purposes. Replace it with your own dataset, which can be loaded with `readxl` or other data import functions.

### Step 2: Define the Model Formula

We define the model formula using R's formula syntax, incorporating asymmetric effects and deterministic variables. We use `asym()` for variables with both short- and long-run asymmetry, `deterministic()` for fixed dummy variables, and `trend` for a linear time trend.

```{r data-prepare}
# Define the model formula
MyFormula <- CPI ~ ER + PPI + asym(ER + PPI) + deterministic(covid) + trend
```

### Step 3: Model Estimation

We estimate the ARDL model using different `mode` settings to demonstrate flexibility in lag selection.
The `kardl()` function supports various modes: `"grid"`, `"grid_custom"`, `"quick"`, or a user-defined lag vector.

#### Using `mode = "grid"`

The `"grid"` mode evaluates all lag combinations up to `maxlag` and provides console feedback.

```{r model-grid}
# Set model options
kardl_set(criterion = "BIC", differentAsymLag = TRUE, data = imf_example_data)

# Estimate model with grid mode
kardl_model <- kardl(data = imf_example_data, model = MyFormula, maxlag = 4, mode = "grid")

# View results
kardl_model

# Display model summary
summary(kardl_model)
```

#### Using User-Defined Lags

Specify custom lags to bypass automatic lag selection:

```{r model-user-defined}
kardl_model2 <- kardl(data = imf_example_data, model = MyFormula, mode = c(2, 1, 1, 3, 0))

# View results
kardl_model2$properLag
```

#### Using All Variables

Use the `.` operator to include all variables except the dependent variable:

```{r model-all-vars}
kardl_set(data = imf_example_data)
kardl(model = CPI ~ . + deterministic(covid), mode = "grid")
```

#### Visualizing Lag Criteria

The `LagCriteria` component contains lag combinations and their criterion values. We visualize these to compare model selection criteria (AIC, BIC, HQ).
```{r lag-criteria}
library(dplyr)
library(tidyr)
library(ggplot2)

# Convert LagCriteria to a data frame
LagCriteria <- as.data.frame(kardl_model[["LagCriteria"]])
colnames(LagCriteria) <- c("lag", "AIC", "BIC", "AICc", "HQ")
LagCriteria <- LagCriteria %>%
  mutate(across(c(AIC, BIC, HQ), as.numeric))

# Pivot to long format
LagCriteria_long <- LagCriteria %>%
  select(-AICc) %>%
  pivot_longer(cols = c(AIC, BIC, HQ), names_to = "Criteria", values_to = "Value")

# Find minimum values
min_values <- LagCriteria_long %>%
  group_by(Criteria) %>%
  slice_min(order_by = Value) %>%
  ungroup()

# Plot
ggplot(LagCriteria_long, aes(x = lag, y = Value, color = Criteria, group = Criteria)) +
  geom_line() +
  geom_point(data = min_values, aes(x = lag, y = Value), color = "red", size = 3, shape = 8) +
  geom_text(data = min_values, aes(x = lag, y = Value, label = lag), vjust = 1.5, color = "black", size = 3.5) +
  labs(title = "Lag Criteria Comparison", x = "Lag Configuration", y = "Criteria Value") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

### Step 4: Long-Run Coefficients

We calculate long-run coefficients using `kardl_longrun()`, which standardizes coefficients by dividing them by the negative of the dependent variable’s long-run parameter.

```{r long-run}
# Long-run coefficients
mylong <- kardl_longrun(kardl_model)
mylong
```

### Step 5: Asymmetry Test

The `asymmetrytest()` function performs Wald tests to assess short- and long-run asymmetry in the model.

```{r asymmetry-test}
ast <- imf_example_data %>%
  kardl(CPI ~ ER + PPI + asym(ER + PPI) + deterministic(covid) + trend, mode = c(1, 2, 3, 0, 1)) %>%
  asymmetrytest()
ast

# View long-run hypotheses
ast$Lhypotheses

# Detailed results
summary(ast)
```

### Step 6: Cointegration Tests

We perform cointegration tests to assess long-term relationships using `pssf()`, `psst()`, `narayan()`, `banerjee()`, and `recmt()`.
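
Bounds-type tests such as the PSS F test are conventionally read against two critical values: a lower bound that assumes all regressors are I(0) and an upper bound that assumes they are all I(1). The base R sketch below spells out the resulting three-way decision rule. The function `bounds_decision()` and the numbers passed to it are hypothetical illustrations, not `kardl` output and not published critical values.

```r
# Hypothetical helper (not a kardl function): three-way decision rule
# for an F-type bounds test.
bounds_decision <- function(stat, lower, upper) {
  stopifnot(lower <= upper)
  if (stat > upper) {
    "cointegration"        # above the I(1) bound: reject the no-cointegration null
  } else if (stat < lower) {
    "no cointegration"     # below the I(0) bound: the null cannot be rejected
  } else {
    "inconclusive"         # between the two bounds
  }
}

# Made-up statistic and bounds, for illustration only
bounds_decision(stat = 6.2, lower = 3.8, upper = 4.9)
bounds_decision(stat = 4.1, lower = 3.8, upper = 4.9)
bounds_decision(stat = 2.0, lower = 3.8, upper = 4.9)
```

For the PSS t test the statistic is negative, so the same comparison is made on the opposite tail.
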
#### PSS F Bound Test

The `pssf()` function tests for cointegration using the Pesaran, Shin, and Smith F Bound test.

```{r pssf}
A <- kardl_model %>% pssf(case = 3, signif_level = "0.05")
cat(paste0("The F statistic = ", A$statistic, " where k = ", A$k, "."))
cat(paste0("\nWe found '", A$Cont, "' at ", A$siglvl, "."))
A$criticalValues
summary(A)
```

#### PSS t Bound Test

The `psst()` function tests the significance of the lagged dependent variable’s coefficient.

```{r psst}
A <- kardl_model %>% psst(case = 3, signif_level = "0.05")
cat(paste0("The t statistic = ", A$statistic, " where k = ", A$k, "."))
cat(paste0("\nWe found '", A$Cont, "' at ", A$siglvl, "."))
A$criticalValues
summary(A)
```

#### Narayan Test

The `narayan()` function is tailored for small sample sizes.

```{r narayan}
A <- kardl_model %>% narayan(case = 3, signif_level = "0.05")
cat(paste0("The F statistic = ", A$statistic, " where k = ", A$k, "."))
cat(paste0("\nWe found '", A$Cont, "' at ", A$siglvl, "."))
A$criticalValues
summary(A)
```

#### Banerjee Test

The `banerjee()` function is designed for small datasets (≤100 observations).

```{r banerjee}
A <- kardl_model %>% banerjee(signif_level = "0.05")
cat(paste0("The ECM parameter = ", A$coef, ", k = ", A$k, ", t statistic = ", A$statistic, "."))
cat(paste0("\nWe found '", A$Cont, "' at ", A$siglvl, "."))
A$criticalValues
summary(A)
```

#### Restricted ECM Test

The `recmt()` function tests for cointegration using an Error Correction Model.

```{r recmt}
recmt_model <- imf_example_data %>%
  recmt(MyFormula, mode = "grid_custom", case = 3)
recmt_model

# View results
summary(recmt_model)
```

### Step 7: ARCH Test

The `archtest()` function checks for autoregressive conditional heteroskedasticity in the model residuals.
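
For intuition, the textbook ARCH LM test regresses the squared residuals on their own `q` lags and uses the number of observations times the R-squared of that regression as a chi-squared statistic with `q` degrees of freedom. The base R sketch below follows that recipe on simulated residuals. The function `arch_lm_sketch()` is hypothetical; it is not the package's implementation, and `archtest()` may compute or report its results differently.

```r
# Hypothetical helper (not the kardl implementation): textbook ARCH LM test.
arch_lm_sketch <- function(resid, q = 2) {
  e2 <- resid^2
  lagged <- embed(e2, q + 1)             # column 1 = e2_t, columns 2..q+1 = lags 1..q
  y <- lagged[, 1]
  X <- lagged[, -1, drop = FALSE]
  fit <- lm(y ~ X)                       # auxiliary regression of e2_t on its lags
  stat <- length(y) * summary(fit)$r.squared    # LM statistic = n * R^2
  p_value <- pchisq(stat, df = q, lower.tail = FALSE)
  list(statistic = stat, parameter = q, p.value = p_value)
}

# Simulated residuals, for illustration only
set.seed(1)
res <- arch_lm_sketch(rnorm(200), q = 2)
str(res)
```

With `kardl`, the corresponding call passes the model residuals and the lag order `q` to `archtest()`:
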
```{r arch-test}
arch_result <- archtest(kardl_model$finalModel$model$residuals, q = 2)
summary(arch_result)
```

### Step 8: Customizing Asymmetric Variables

We demonstrate how to customize prefixes and suffixes for asymmetric variables using `kardl_set()`.

```{r asym-custom}
# Set custom prefixes and suffixes
kardl_reset()
kardl_set(AsymPrefix = c("asyP_", "asyN_"), AsymSuffix = c("_PP", "_NN"))
kardl_custom <- kardl(data = imf_example_data, model = MyFormula, mode = "grid_custom")
kardl_custom$properLag
```

## Key Functions and Parameters

- **`kardl(data, model, maxlag, mode, ...)`**:
  - `data`: A time series dataset (e.g., a data frame with CPI, ER, PPI).
  - `model`: A formula specifying the long-run equation, e.g., `y ~ x + z + asym(z) + asymL(x2 + x3) + asymS(x3 + x4) + deterministic(dummy1 + dummy2) + trend`. Supports:
    - `asym()`: Asymmetric effects for both short- and long-run dynamics.
    - `asymL()`: Long-run asymmetric variables.
    - `asymS()`: Short-run asymmetric variables.
    - `deterministic()`: Fixed dummy variables.
    - `trend`: Linear time trend.
  - `maxlag`: Maximum number of lags (default: 4). Use smaller values (e.g., 2) for small datasets, larger values (e.g., 8) for long-term dependencies.
  - `mode`: Estimation mode:
    - `"quick"`: Verbose output for interactive use.
    - `"grid"`: Verbose output with lag optimization.
    - `"grid_custom"`: Silent, efficient execution.
    - User-defined vector (e.g., `c(1, 2, 4, 5)` or `c(CPI = 2, ER_POS = 3, ER_NEG = 1, PPI = 3)`).
  - Returns a list with components: `inputs`, `finalModel`, `start_time`, `end_time`, `properLag`, `TimeSpan`, `OptLag`, `LagCriteria`, `type` ("kardlmodel").
- **`kardl_set(...)`**: Configures options like `criterion` (AIC, BIC, AICc, HQ), `differentAsymLag`, `AsymPrefix`, `AsymSuffix`, `ShortCoef`, and `LongCoef`. Use `kardl_get()` to retrieve settings and `kardl_reset()` to restore defaults.
- **`kardl_longrun(model)`**: Calculates standardized long-run coefficients, returning `type` ("kardl_longrun"), `coef`, `delta_se`, `results`, and `starsDesc`.
- **`asymmetrytest(model)`**: Performs Wald tests for short- and long-run asymmetry, returning `Lhypotheses`, `Lwald`, `Shypotheses`, `Swald`, and `type` ("asymmetrytest").
- **`pssf(model, case, signif_level)`**: Performs the Pesaran, Shin, and Smith F Bound test for cointegration, supporting cases 1–5 and significance levels ("auto", 0.01, 0.025, 0.05, 0.1, 0.10).
- **`psst(model, case, signif_level)`**: Performs the PSS t Bound test, focusing on the lagged dependent variable’s coefficient.
- **`narayan(model, case, signif_level)`**: Conducts the Narayan test for cointegration, optimized for small samples (cases 2–5).
- **`banerjee(model, signif_level)`**: Performs the Banerjee cointegration test for small datasets (≤100 observations).
- **`recmt(data, model, maxlag, mode, case, signif_level, ...)`**: Conducts the Restricted ECM test for cointegration, with similar parameters to `kardl()` and case/significance level options.
- **`archtest(resid, q)`**: Tests for ARCH effects in model residuals, returning `type`, `statistic`, `parameter`, `p.value`, and `Fval`.

For detailed documentation, use `?kardl`, `?kardl_set`, `?kardl_longrun`, `?asymmetrytest`, `?pssf`, `?psst`, `?narayan`, `?banerjee`, `?recmt`, or `?archtest`.

## Conclusion

The `kardl` package is a versatile tool for econometric analysis, offering robust support for symmetric and asymmetric ARDL/NARDL modeling, cointegration tests, stability diagnostics, and heteroskedasticity checks. Its flexible formula specification, lag optimization, and support for parallel processing make it ideal for studying complex economic relationships. For more information, visit [https://github.com/karamelikli/kardl](https://github.com/karamelikli/kardl) or contact the authors at [hakperest@gmail.com](mailto:hakperest@gmail.com).

---