---
title: "Introduction to fixes"
author: "Yosuke Abe"
date: "`r Sys.setlocale('LC_TIME', 'C'); format(Sys.Date(), '%B %d, %Y')`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to fixes}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE,
  fig.width = 7,
  fig.height = 5,
  dpi = 150
)

# Load fixes and the packages used in the examples
library(fixes)
library(dplyr)
library(ggplot2)
```

## Introduction

The **fixes** package is a toolkit for building, estimating, and visualizing event study models with fixed effects regression. It generates lead and lag dummy variables automatically, estimates the event study regression, and plots the results with `ggplot2`, all in a single pipeline.

This vignette introduces the core functions of the package through simple examples, including recent updates such as multiple confidence interval support and improved plotting options.

## Installation

Install the released version from CRAN:

```r
install.packages("fixes")
```

Or with **pak** (recommended for faster installation):

```r
pak::pak("fixes")
```

To install the latest development version from GitHub:

```r
pak::pak("yo5uke/fixes")
```

or with **devtools**:

```r
devtools::install_github("yo5uke/fixes")
```

## Minimal Example

The example below runs an event study on the `fixest::base_did` dataset, which ships with **fixest**, and prints the first rows of the result.

```{r minimal-example}
# Load example data
df <- fixest::base_did

# Run the event study (supports multiple confidence levels)
event_study <- run_es(
  data       = df,
  outcome    = y,
  treatment  = treat,
  time       = period,
  timing     = 5,  # Treatment occurs at period 5
  fe         = ~ id + period,
  cluster    = ~ id,
  baseline   = -1,
  interval   = 1,
  lead_range = 3,
  lag_range  = 3,
  conf.level = c(0.90, 0.95, 0.99)  # Multiple CIs supported!
)

# View results
head(event_study)
```
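
For intuition about what `run_es()` automates, the lead and lag dummies can be built by hand. The following is a minimal base-R sketch on a hypothetical toy panel; the column names (`rel_time`, `lead2`, `lag0`, and so on) are illustrative assumptions, not the names **fixes** uses internally.

```r
# Hypothetical toy panel: 2 units x 6 periods; unit 1 is treated at period 4
toy <- expand.grid(id = 1:2, period = 1:6)
toy$treat <- as.integer(toy$id == 1)
timing <- 4

# Relative time: 0 in the treatment period, negative for leads, positive for lags
toy$rel_time <- toy$period - timing

# One dummy per relative period in the window, omitting the baseline (-1)
window  <- setdiff(-2:2, -1)
dummies <- sapply(window, function(k) {
  as.integer(toy$treat == 1 & toy$rel_time == k)
})
colnames(dummies) <- ifelse(window < 0,
                            paste0("lead", abs(window)),
                            paste0("lag", window))
toy <- cbind(toy, dummies)
toy
```

Dropping the baseline dummy is what normalizes the estimates to zero at that period, which corresponds to the role of `baseline = -1` in the call above.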

## Visualizing Event Study Results

The **fixes** package provides `plot_es()` for flexible visualization. You can switch between ribbon-style and error-bar CIs, choose which CI level to display, and customize the appearance.

```{r plot-basic}
# Basic plot (default: ribbon, 95% CI)
plot_es(event_study)
```

```{r plot-errorbar}
# Plot with error bars and 99% CI
plot_es(event_study, type = "errorbar", ci_level = 0.99)
```

```{r plot-custom}
# Customize further with ggplot2
plot_es(event_study, type = "errorbar", ci_level = 0.9, theme_style = "classic") +
  scale_x_continuous(breaks = seq(-3, 3, by = 1)) +
  ggtitle("Event Study with 90% CI and Classic Theme")
```
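
The `conf.level` and `ci_level` arguments refer to the coverage of each interval. As a rough illustration, assuming a normal approximation (the package may compute its intervals differently, for example with a t distribution), the bounds at several levels follow from one estimate and its standard error. The numbers below are hypothetical.

```r
# Hypothetical point estimate and standard error for one relative period
estimate  <- 0.8
std_error <- 0.25

# Two-sided normal critical value for each confidence level
conf_levels <- c(0.90, 0.95, 0.99)
z <- qnorm(1 - (1 - conf_levels) / 2)

ci <- data.frame(
  level = conf_levels,
  lower = estimate - z * std_error,
  upper = estimate + z * std_error
)
ci
```

Higher levels give wider intervals, which is why the 99% error bars extend beyond the 90% ones for the same estimate.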

## Staggered Treatment with `sunab`

For staggered adoption designs where treatment timing varies across units, you can use `method = "sunab"` to implement the Sun & Abraham (2021) estimator, which is robust to heterogeneous treatment effects.

```{r sunab-example}
# Example with fixest::base_stagg data
df_stagg <- fixest::base_stagg

event_study_sunab <- run_es(
  data       = df_stagg,
  outcome    = y,
  treatment  = treated,
  time       = year,
  timing     = year_treated,
  fe         = ~ id + year,
  staggered  = TRUE,
  method     = "sunab",  # Use Sun & Abraham decomposition
  lead_range = 3,
  lag_range  = 3,
  cluster    = ~ id
)

head(event_study_sunab)
```

```{r plot-sunab}
# Visualize sunab results
plot_es(event_study_sunab) +
  ggtitle("Staggered Adoption Event Study (Sun & Abraham 2021)")
```

**New in v0.7.1:** The `baseline` parameter now applies to both `classic` and `sunab` methods, and the baseline period is included in results with zero estimates for consistent visualization.

## Package Highlights

- **`run_es()`**:
    - Fast, one-step event study for panel data.
    - Automatic creation of lead/lag dummies relative to treatment.
    - Supports both classic and staggered timing, covariates, clustering, weights, and flexible baseline normalization.
    - Choice of estimation methods: `classic` (factor expansion) or `sunab` (Sun & Abraham 2021 for staggered designs).
    - Multiple confidence interval levels supported (e.g., 90%, 95%, 99%).
    - Handles irregular time panels via `time_transform`.
    - Results are filtered to specified `lead_range` and `lag_range` (v0.7.1+).

- **`plot_es()`**:
    - Intuitive event study plot with ribbon or errorbar CI display.
    - CI level and visual style are fully customizable.
    - ggplot2-based for further modification.

- **`plot_es_interactive()`** (v0.7.0+):
    - Interactive plotly-based visualizations with hover tooltips.
    - Displays point estimates, confidence intervals, standard errors, and p-values on hover.

## Staggered adoption estimators (v0.8.0)

Standard TWFE event-study regressions can produce biased and sign-reversed
estimates under heterogeneous treatment effects when units adopt treatment at
different times (Callaway & Sant'Anna 2021; Sun & Abraham 2021; Borusyak,
Jaravel & Spiess 2024). **fixes** v0.8.0 adds three robust alternatives, all
accessible through the same `run_es()` interface via the `estimator` argument.

We demonstrate all three on `fixest::base_stagg`, a simulated staggered-adoption
panel. According to the **fixest** documentation, the true treatment effect
varies with time since treatment and is stored in `treatment_effect_true`.

```{r stagg-data}
df_stagg <- fixest::base_stagg
# base_stagg codes never-treated units as year_treated = 10000.
# Recode them to NA (convention for all three estimators).
df_stagg$timing <- df_stagg$year_treated
df_stagg$timing[df_stagg$year_treated == 10000] <- NA
```

### Callaway & Sant'Anna (2021) — `estimator = "cs"`

Estimates a separate ATT(g,t) for every cohort-by-period cell, then aggregates
to an event-study curve using cohort-size weights.
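
To make the ATT(g,t) building block concrete, here is a simplified base-R sketch on a hypothetical toy panel. It uses never-treated units as controls and period g - 1 as the base period for every t. The function `att_gt()` and all data are illustrative assumptions, not the package's implementation.

```r
set.seed(2)

# Hypothetical toy panel: 30 units, 8 years, true effect of 1 after adoption
panel <- expand.grid(id = 1:30, year = 1:8)
# Adoption year g: ids 1-10 never treated (NA), 11-20 in year 4, 21-30 in year 6
panel$g <- c(NA, 4, 6)[(panel$id - 1) %/% 10 + 1]
is_post <- !is.na(panel$g) & panel$year >= panel$g
panel$y <- panel$id / 10 + 0.5 * panel$year + 1 * is_post +
  rnorm(nrow(panel), sd = 0.05)

# ATT(g, t): change in y from the base period g - 1 to t for cohort g,
# minus the same change for never-treated units
att_gt <- function(g, t) {
  wide <- merge(
    panel[panel$year == t,     c("id", "g", "y")],
    panel[panel$year == g - 1, c("id", "y")],
    by = "id", suffixes = c("_t", "_base")
  )
  dy <- wide$y_t - wide$y_base
  in_cohort <- !is.na(wide$g) & wide$g == g
  mean(dy[in_cohort]) - mean(dy[is.na(wide$g)])
}

# Event time 0, aggregated over cohorts with cohort-size weights
att_e0 <- weighted.mean(
  c(att_gt(4, 4), att_gt(6, 6)),
  w = c(sum(panel$g == 4 & panel$year == 1, na.rm = TRUE),
        sum(panel$g == 6 & panel$year == 1, na.rm = TRUE))
)
att_e0
```

Each point on the event-study curve is such a weighted average over the cohorts observed at that horizon.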

```{r cs-est}
res_cs <- run_es(
  data          = df_stagg,
  outcome       = y,
  time          = year,
  timing        = timing,
  unit          = id,
  staggered     = TRUE,
  estimator     = "cs",
  control_group = "nevertreated"
)
plot_es(res_cs) + ggplot2::ggtitle("Callaway & Sant'Anna (2021)")
```

### Sun & Abraham (2021) — `estimator = "sa"`

Builds cohort × relative-time interactions and aggregates with cohort-share
weights. Numerically identical to `fixest::sunab()`.

```{r sa-est}
res_sa <- run_es(
  data      = df_stagg,
  outcome   = y,
  treatment = treated,
  time      = year,
  timing    = timing,
  unit      = id,
  fe        = ~ id + year,
  staggered = TRUE,
  estimator = "sa",
  cluster   = ~ id
)
plot_es(res_sa) + ggplot2::ggtitle("Sun & Abraham (2021)")
```
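
For comparison, the same kind of specification can be fitted directly with **fixest**. This sketch follows the usage shown in the **fixest** documentation and uses the original `year_treated` column of `fixest::base_stagg`, in which never-treated units are coded as 10000. The object name `fit_sunab` is illustrative.

```r
library(fixest)

# Cohort-by-relative-period interactions via sunab(cohort, period)
fit_sunab <- feols(
  y ~ sunab(year_treated, year) | id + year,
  data    = fixest::base_stagg,
  cluster = ~ id
)

# Event-study coefficients aggregated across cohorts
summary(fit_sunab)
iplot(fit_sunab)
```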

### Borusyak, Jaravel & Spiess (2024) — `estimator = "bjs"`

Fits TWFE on untreated observations, imputes counterfactuals for treated units,
then averages by horizon.

```{r bjs-est}
res_bjs <- run_es(
  data      = df_stagg,
  outcome   = y,
  time      = year,
  timing    = timing,
  unit      = id,
  staggered = TRUE,
  estimator = "bjs"
)
plot_es(res_bjs) + ggplot2::ggtitle("Borusyak, Jaravel & Spiess (2024)")
```
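
The three imputation steps can be traced by hand. The following base-R sketch runs them on a hypothetical toy panel with a constant true effect of 1; it uses `lm()` with factor dummies in place of a dedicated fixed effects routine and is not the package's implementation.

```r
set.seed(1)

# Hypothetical toy panel: 30 units, 8 years
panel <- expand.grid(id = 1:30, year = 1:8)
# Adoption year g: ids 1-10 never treated (NA), 11-20 in year 4, 21-30 in year 6
panel$g   <- c(NA, 4, 6)[(panel$id - 1) %/% 10 + 1]
panel$rel <- panel$year - panel$g
panel$treated_now <- !is.na(panel$rel) & panel$rel >= 0
panel$y <- panel$id / 10 + 0.5 * panel$year + 1 * panel$treated_now +
  rnorm(nrow(panel), sd = 0.05)

# Step 1: fit unit and year effects on untreated observations only
fit0 <- lm(y ~ factor(id) + factor(year),
           data = panel[!panel$treated_now, ])

# Step 2: impute the untreated counterfactual for every observation
panel$y0_hat <- predict(fit0, newdata = panel)

# Step 3: unit-level effects, averaged by horizon
panel$tau_hat <- panel$y - panel$y0_hat
att <- aggregate(tau_hat ~ rel, data = panel[panel$treated_now, ], FUN = mean)
att
```

Step 1 works only because every unit and every year appears at least once among the untreated observations, so all fixed effects are identified before imputation.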

All three estimators target the same horizon-specific ATTs, so their
post-treatment curves should be similar. Each curve can be compared with the
average of `treatment_effect_true` at the corresponding horizon.
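
The estimated curves can be checked against the simulated truth. A small sketch, assuming the `treatment_effect_true`, `time_to_treatment`, and `treated` columns described in the **fixest** documentation for `base_stagg`:

```r
# Average true effect by time since treatment, for ever-treated units only
truth <- aggregate(
  treatment_effect_true ~ time_to_treatment,
  data = subset(fixest::base_stagg, treated == 1),
  FUN  = mean
)
truth
```

Rows with `time_to_treatment >= 0` give the benchmark for the post-treatment part of each event-study plot.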

---

## Bootstrap simultaneous confidence bands (v0.8.0)

### Pointwise vs. simultaneous CIs

Standard confidence intervals are **pointwise**: the 95% CI at each horizon
covers the true value with 95% probability, but the probability that *all*
intervals simultaneously cover their true values can be much lower when many
periods are plotted.

**Simultaneous** confidence bands (Callaway & Sant'Anna 2021, Corollary 1)
control the joint coverage probability. With probability at least 1 − α, the
entire event-study curve is contained within the simultaneous band. This is
especially important for parallel-trends pre-testing: a pre-trend test based
on simultaneous bands controls the family-wise error rate.
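
A quick calculation shows how fast joint coverage can fall. It assumes, purely for illustration, that the intervals are independent; event-study estimates are usually correlated, so the actual joint coverage will differ.

```r
# Joint coverage of K independent 95% intervals is 0.95^K
level     <- 0.95
n_periods <- c(1, 3, 5, 7, 10)
coverage  <- data.frame(n_periods, joint_coverage = round(level^n_periods, 3))
coverage
```

Under this assumption, seven plotted periods already bring joint coverage down to about 0.70.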

### Usage

Pass `bootstrap = TRUE` to `run_es()` together with the CS estimator. The
multiplier bootstrap (Algorithm 1 of Callaway & Sant'Anna 2021) is used.

```{r boot-demo, eval=FALSE}
# NOTE: B = 199 shown here for brevity; use B = 999 in practice
res_cs_boot <- run_es(
  data          = df_stagg,
  outcome       = y,
  time          = year,
  timing        = timing,
  unit          = id,
  staggered     = TRUE,
  estimator     = "cs",
  control_group = "nevertreated",
  bootstrap     = TRUE,
  B             = 199,
  boot_seed     = 42
)
# The lighter outer band is the simultaneous CI; the darker inner band is
# the standard pointwise CI.
plot_es(res_cs_boot, show_simultaneous = TRUE)
```

The simultaneous critical value $\hat{c}$ and per-period simultaneous CI bounds
are stored in `attr(res_cs_boot, "bootstrap")`. The simultaneous band is always
at least as wide as the pointwise band, since the critical value satisfies
$\hat{c}_{1-\alpha} \ge z_{1-\alpha/2}$.
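
To see why the simultaneous critical value is larger, one can simulate the maximum absolute t-statistic across periods. This sketch assumes seven independent standard normal statistics and is only an illustration; it is not the multiplier bootstrap that `run_es()` uses.

```r
set.seed(42)
K <- 7       # number of plotted periods (hypothetical)
B <- 10000   # number of simulation draws

# Each row is one draw of K standardized estimates
z_draws <- matrix(rnorm(B * K), nrow = B, ncol = K)

# Simultaneous critical value: 95% quantile of the row-wise maximum |z|
max_abs <- apply(abs(z_draws), 1, max)
c_hat   <- unname(quantile(max_abs, 0.95))

# Pointwise critical value for a single 95% interval
z_pointwise <- qnorm(0.975)

c(simultaneous = c_hat, pointwise = z_pointwise)
```

Because the maximum over several periods is at least as large as any single statistic, its 95% quantile cannot fall below the pointwise one.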

---

## ATT(g,t) visualization (v0.8.0)

The CS estimator produces a separate ATT estimate for every (cohort g,
calendar period t) pair. `plot_att_gt()` visualises this full matrix.

### Heatmap

Tiles are filled by the ATT(g,t) estimate. Cells whose pointwise CI excludes
zero are marked with a filled dot (●); when bootstrap data are available,
simultaneously significant cells also receive an open diamond (◇).

```{r heatmap-demo}
plot_att_gt(res_cs, type = "heatmap")
```

The vertical dashed lines mark each cohort's treatment onset (t = g).

### Facet plot

One panel per cohort showing ATT over calendar time, with a pointwise CI
ribbon. Useful for inspecting heterogeneous dynamics across cohorts.

```{r facet-demo}
plot_att_gt(res_cs, type = "facet")
```

---

## Conclusion

The **fixes** package streamlines event study estimation and visualization for panel data. A single `run_es()` call covers classic and staggered designs with multiple confidence levels, and `plot_es()` and `plot_att_gt()` turn the results into ggplot2 figures.

For further details and full argument documentation, see:

```r
?run_es
?plot_es
?plot_att_gt
```

Happy analyzing!