Projecting actor locations and modeling dyadic interactions with PALS

Overview

Many actors in the social world (armed groups, firms, diplomats, migrating populations) have no fixed location. They move through space over time, and where two such actors interact is itself an outcome worth modeling. The Projected Actor Locations (PALS) method (Kim et al. 2023) addresses this by projecting where a mobile actor “is” at any moment from the spatiotemporal history of its past interactions, using exponential-smoothing weights that favour recent and nearby events.

The palsr package implements the full workflow:

1. build a validated table of dyadic events (pal_events());
2. estimate the smoothing parameters by minimizing great-circle prediction error (estimate_pals());
3. project actor locations at arbitrary times (project_pals());
4. predict dyadic interaction locations and build distance covariates (predict_event_locations(), pal_distance());
5. quantify uncertainty with a nonparametric bootstrap and pool with Rubin’s Rules (bootstrap_pals(), pool_rubin()).

library(palsr)

The method in brief

For a focal actor \(i\) at prediction time \(t\), PALS forms two recency-weighted means: one over the locations of \(i\)’s own past events (the focal component), and one over the locations of events involving \(i\)’s past interaction partners, or alters (the alter component). The projected location is a convex combination of the two,

\[ g_i(t) = (1-\pi)\sum_e W_i(e)\, g(e) \;+\; \pi \sum_e W_k(e)\, g(e), \qquad \pi = \mathrm{logistic}(\gamma + \eta\, v), \]

where \(g(e)\) is the location of event \(e\), \(k\) indexes the alters, and the weights decay with event age: \(W_i(e) \propto (\text{age}^{\alpha})^{-1}\) for the focal actor, and analogously with \(\beta\) for the alters. The mixing weight \(\pi\) depends on how active the focal actor is relative to its alters through the event-count ratio \(v\). The four parameters are therefore:

Parameter	Role
\(\alpha\)	decay of the focal actor’s own history
\(\beta\)	decay of the alters’ histories
\(\gamma\)	intercept of the focal-vs-alter mixing weight
\(\eta\)	dependence of the mixing weight on relative activity

A reduced one-parameter model fixes \(\pi = 0\) (focal history only) and estimates \(\alpha\) alone. It is fast and competitive with the full model: in the worked example below, both fits estimate \(\alpha \approx 0.874\).
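
To make the formula concrete, here is a minimal base-R sketch of the two weighted means and their convex combination. The helpers weighted_location() and mix_locations() are hypothetical and are not part of palsr. The sketch also assumes that locations are averaged as plain longitude/latitude pairs and that event ages are positive numbers in some fixed time unit; the package may handle both differently.

```r
# Hypothetical helper: recency-weighted mean location with power-law decay.
# `lon`, `lat` are event coordinates, `age` the positive event ages at the
# prediction time (the time unit is an assumption), `decay` is alpha or beta.
weighted_location <- function(lon, lat, age, decay) {
  w <- age^(-decay)   # W(e) proportional to 1 / age^decay
  w <- w / sum(w)     # normalise so the weights sum to one
  c(lon = sum(w * lon), lat = sum(w * lat))
}

# Hypothetical helper: convex combination of the focal and alter components.
# `v` is the event-count ratio that enters the mixing weight.
mix_locations <- function(focal, alter, gamma, eta, v) {
  pi_mix <- plogis(gamma + eta * v)   # logistic link keeps pi in (0, 1)
  (1 - pi_mix) * focal + pi_mix * alter
}

# Toy inputs (made up for illustration only).
focal <- weighted_location(lon = c(7.1, 8.0), lat = c(9.0, 9.4),
                           age = c(400, 30), decay = 0.9)
alter <- weighted_location(lon = c(7.5, 6.0), lat = c(10.1, 8.0),
                           age = c(200, 100), decay = 0.5)

# With gamma = 0 and eta = 0 the mixing weight is 0.5, so the projection
# is halfway between the focal and alter components.
mix_locations(focal, alter, gamma = 0, eta = 0, v = 1)
```

In this sketch a strongly negative gamma pushes the mixing weight towards zero, which recovers the focal-only model described above.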

A worked example

The package ships a deterministic simulated dataset, nigeria_sim, of 1,500 dyadic conflict events among 25 mobile actors between 2000 and 2016, so that examples run identically everywhere. The bundled nigeria_acled dataset provides the real events from the replication archive of Kim, Liu and Desmarais (2023).

data(nigeria_sim)
nigeria_sim
#> <pal_events>
#>   events: 1500
#>   actors: 25
#>   time:   2000-01-05 to 2016-12-24
#>   bbox:   lon [2.70, 14.70], lat [5.11, 13.90]
summary(nigeria_sim)
#> <pal_events> summary
#>   1500 events, 25 actors, 2000-01-05 to 2016-12-24
#>   events per actor: min 15, median 103, max 279

You can build your own pal_events object from any data frame by naming the actor, time, longitude and latitude columns:

raw <- data.frame(
  from = c("A", "A", "B"),
  to   = c("B", "C", "C"),
  when = as.Date(c("2001-01-01", "2001-06-01", "2002-01-01")),
  x    = c(7.1, 8.0, 7.5),
  y    = c(9.0, 9.4, 10.1)
)
pal_events(raw, actor1 = "from", actor2 = "to",
           time = "when", lon = "x", lat = "y")
#> <pal_events>
#>   events: 3
#>   actors: 3
#>   time:   2001-01-01 to 2002-01-01
#>   bbox:   lon [7.10, 8.00], lat [9.00, 10.10]

Estimating the parameters

Estimation marches forward through time: every event is predicted using only events strictly earlier than it, and the parameters minimize the mean great-circle (Haversine) distance between predicted and observed locations.
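
The objective can be sketched in a few lines of base R. The functions haversine_km() and mean_error_km() below are hypothetical illustrations, not palsr functions, and the Earth radius of 6371 km is an assumption of this sketch.

```r
# Great-circle (Haversine) distance in km between points in decimal degrees.
# The default Earth radius of 6371 km is an assumption of this sketch.
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))   # pmin guards against rounding above 1
}

# Objective: mean distance between predicted and observed event locations.
mean_error_km <- function(pred, obs) {
  mean(haversine_km(pred$lon, pred$lat, obs$lon, obs$lat))
}

# Toy check: one of two predictions is off by one degree of latitude.
obs  <- data.frame(lon = c(7.1, 8.0), lat = c(9.0, 9.4))
pred <- data.frame(lon = c(7.1, 8.0), lat = c(10.0, 9.4))
mean_error_km(pred, obs)
```

An optimizer would then search over the smoothing parameters for the values that make this mean error smallest, with each prediction built only from strictly earlier events.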

fit1 <- estimate_pals(nigeria_sim, model = "one")
fit1
#> <pals_fit> one-parameter PALS model
#> <pals_params> (one-parameter model)
#>   alpha = 0.874  (pi fixed at 0)
#>   objective (mean Haversine km): 103.3665 over 1494 events
coef(fit1)
#>     alpha 
#> 0.8739538

The full four-parameter model adds the alter component. We cap the optimizer iterations here purely to keep the vignette quick:

fit4 <- estimate_pals(nigeria_sim, model = "four",
                      control = list(maxit = 60))
coef(fit4)
#>        alpha         beta        gamma          eta 
#>   0.87427976   0.01427642 -11.45988661  -9.41597484
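
As a rough check on these coefficients, the mixing weight \(\pi = \mathrm{logistic}(\gamma + \eta\, v)\) can be evaluated directly. The sketch below copies \(\gamma\) and \(\eta\) from the coef(fit4) output; the grid of \(v\) values is illustrative, and the assumption that the event-count ratio satisfies \(v \ge 0\) is ours.

```r
# Coefficients copied from the coef(fit4) output.
gamma <- -11.45988661
eta   <- -9.41597484

# Illustrative grid for the event-count ratio, assuming v >= 0.
v <- c(0, 0.5, 1, 2)

# Mixing weight on the alter component for each value of v.
pi_mix <- plogis(gamma + eta * v)
signif(pi_mix, 3)
```

Under that assumption \(\pi\) stays at or below roughly 1e-5, so on these simulated data the four-parameter fit puts almost all of its weight on the focal component. That is consistent with the two models returning nearly identical estimates of \(\alpha\).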

Projecting actor locations

With a fitted model (or a hand-specified pals_params()), project where each actor is at a given time:

pal_2015 <- project_pals(nigeria_sim,
                         predict_time = as.Date("2015-01-01"),
                         params = fit1)
head(pal_2015)
#>   actor       time       lon       lat n_focal n_alter has_history
#> 1   G01 2015-01-01 10.606386  6.657116      14     954        TRUE
#> 2   G02 2015-01-01  4.823395  7.553735     130    1308        TRUE
#> 3   G03 2015-01-01  7.351427 11.771869     129    1324        TRUE
#> 4   G04 2015-01-01 13.541634  8.963808      41    1242        TRUE
#> 5   G05 2015-01-01  6.570594  7.804320      55    1306        TRUE
#> 6   G06 2015-01-01  4.857437  7.135634     185    1301        TRUE

library(ggplot2)
ggplot(pal_2015, aes(lon, lat)) +
  geom_point(colour = "#2b6cb0", size = 2) +
  geom_text(aes(label = actor), vjust = -0.8, size = 3) +
  labs(title = "Projected actor locations, 2015-01-01",
       x = "Longitude", y = "Latitude") +
  theme_minimal()

Figure: Projected actor locations on 2015-01-01

Because the projection is recomputed as time advances, each actor traces a trajectory through space. Projecting a few actors at yearly intervals and drawing their paths over the cloud of observed events shows how PALS captures mobile actors drifting through the theatre:

actors <- c("G03", "G08", "G14", "G21")
dates  <- as.Date(sprintf("%d-01-01", seq(2005, 2016)))
traj   <- project_pals(nigeria_sim, actors = actors,
                       predict_time = dates, params = fit1)
traj   <- traj[!is.na(traj$lon), ]
ends   <- do.call(rbind, lapply(split(traj, traj$actor),
                                function(d) d[which.max(d$time), ]))

ggplot() +
  geom_point(data = nigeria_sim, aes(lon, lat),
             colour = "grey80", size = 0.5, alpha = 0.5) +
  geom_path(data = traj, aes(lon, lat, colour = actor), linewidth = 0.8,
            arrow = grid::arrow(length = grid::unit(0.18, "cm"), type = "closed")) +
  geom_point(data = traj, aes(lon, lat, colour = actor), size = 1.6) +
  geom_text(data = ends, aes(lon, lat, colour = actor, label = actor),
            nudge_y = 0.35, size = 3, show.legend = FALSE) +
  scale_colour_brewer(palette = "Dark2", name = "Actor") +
  labs(title = "Projected actor trajectories, 2005-2016",
       x = "Longitude", y = "Latitude") +
  coord_quickmap() +
  theme_minimal()

Figure: Projected trajectories of four actors over 2005-2016

Predicting interaction locations and dyadic distances

The predicted location of an interaction between two actors is the mean of their two projected locations. Supplying observed coordinates scores the prediction in kilometres:

targets <- nigeria_sim[nigeria_sim$time > as.Date("2014-01-01"), ]
scored  <- predict_event_locations(nigeria_sim, targets, fit1)
summary(scored$error_km)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   4.458  71.025  99.239 102.931 134.497 307.286
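
The midpoint rule can be illustrated by hand with the projected locations of G01 and G02 on 2015-01-01, copied from the head(pal_2015) output. This sketch assumes that “mean” refers to the arithmetic mean of the longitude and latitude coordinates; predict_event_locations() may compute it differently.

```r
# Projected locations of G01 and G02 on 2015-01-01, from head(pal_2015).
pal <- data.frame(actor = c("G01", "G02"),
                  lon   = c(10.606386, 4.823395),
                  lat   = c(6.657116, 7.553735))

# Assumption: the predicted interaction location is the arithmetic mean
# of the two actors' projected coordinates.
predicted <- colMeans(pal[, c("lon", "lat")])
predicted
```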

The dyadic distance between two actors’ projected locations is the key covariate for modeling who interacts with whom:

dyads <- data.frame(actor1 = "G01", actor2 = "G02",
                    time = as.Date("2014-06-01"))
pal_distance(nigeria_sim, dyads, fit1, transform = "log")
#>   actor1 actor2       time pal_distance pal_log_distance
#> 1    G01    G02 2014-06-01     320.2117         5.769014

Uncertainty: bootstrap and Rubin’s Rules

bootstrap_pals() resamples events with replacement and re-estimates the model on each replicate, yielding bootstrap standard errors and percentile intervals. (We use R = 10 replicates here, which keeps the example fast; the paper also uses ten.)

bt <- bootstrap_pals(nigeria_sim, R = 10, model = "one", seed = 1)
summary(bt)
#> <pals_boot> summary (10 replicates, 95% percentile intervals)
#>    term estimate boot_mean boot_se  lower  upper
#> 1 alpha    0.874    0.8581 0.04665 0.7756 0.9083
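
For readers new to the nonparametric bootstrap, the resample-and-re-estimate loop looks roughly like the base-R sketch below. It is a generic illustration rather than the internals of bootstrap_pals(): the toy data and the estimator() function are stand-ins for the event table and the model fit.

```r
set.seed(1)
x <- rexp(200)                     # toy data standing in for the events
estimator <- function(d) mean(d)   # stand-in for re-estimating the model

# Resample with replacement and re-estimate on each of 10 replicates.
boot_est <- replicate(10, estimator(sample(x, replace = TRUE)))

# Point estimate, bootstrap standard error, 95% percentile interval.
c(estimate = estimator(x),
  boot_se  = sd(boot_est),
  lower    = unname(quantile(boot_est, 0.025)),
  upper    = unname(quantile(boot_est, 0.975)))
```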

When a downstream estimand (say, a regression coefficient using PAL distances) is computed on each replicate, treat the replicates as multiple imputations and combine them with Rubin’s Rules, which propagate both within- and between-replicate uncertainty:

q <- c(1.10, 0.95, 1.20, 1.05, 0.98)   # per-replicate estimates
u <- c(0.04, 0.05, 0.045, 0.038, 0.052) # per-replicate variances
pool_rubin(q, u, df = TRUE, dfcom = 100)
#>    qbar  ubar       b        t        se       fmi       df    p.value
#> 1 1.056 0.045 0.00993 0.056916 0.2385707 0.2093612 41.91733 6.7103e-05
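
The first five columns of this output follow directly from Rubin’s Rules and can be reproduced by hand. The sketch below uses only base R and the same q and u; the variable names mirror the column names, except that the total variance (column t) is called total_var here. The degrees of freedom, fmi and p-value are left to pool_rubin().

```r
q <- c(1.10, 0.95, 1.20, 1.05, 0.98)     # per-replicate estimates
u <- c(0.04, 0.05, 0.045, 0.038, 0.052)  # per-replicate variances
m <- length(q)                           # number of replicates

qbar      <- mean(q)                  # pooled point estimate
ubar      <- mean(u)                  # within-replicate variance
b         <- var(q)                   # between-replicate variance
total_var <- ubar + (1 + 1 / m) * b   # total variance
se        <- sqrt(total_var)          # pooled standard error

round(c(qbar = qbar, ubar = ubar, b = b, t = total_var, se = se), 6)
```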