Econometrics
Showing new listings for Friday, 30 January 2026
- [1] arXiv:2601.21272
Title: Finite-Sample Properties of Model Specification Tests for Multivariate Dynamic Regression Models
Comments: 50 pages; 11 tables
Subjects: Econometrics (econ.EM); Pricing of Securities (q-fin.PR); Statistical Finance (q-fin.ST)
This paper proposes a new multivariate model specification test that generalizes Durbin regression to a seemingly unrelated regression framework and reframes the Durbin approach as a GLS-class estimator. The proposed estimator explicitly models cross-equation dependence and the joint second-order dynamics of regressors and disturbances. It remains consistent under a comparatively weak dependence condition in which conventional OLS- and GLS-based estimators can be inconsistent, and it is asymptotically efficient under stronger conditions. Monte Carlo experiments indicate that the associated Wald test achieves improved size control and competitive power in finite samples, especially when combined with a bootstrap-based bias correction. An empirical application further illustrates that the proposed procedure delivers stable inference and is practically useful for multi-equation specification testing.
- [2] arXiv:2601.21749
Title: Fast and user-friendly econometrics estimations: The R package fixest
Comments: 56 pages, 12 tables, 5 figures
Subjects: Econometrics (econ.EM)
fixest is an R package for fast and flexible econometric estimation, providing a comprehensive toolkit for applied researchers. The package particularly excels at fixed-effects estimation, supported by a novel fixed-point acceleration algorithm implemented in C++. This algorithm achieves rapid convergence across a broad class of data contexts and further enables estimation of complex models, including those with varying slopes, in a highly efficient manner. Beyond computational speed, fixest provides a unified syntax for a wide variety of models: ordinary least squares, instrumental variables, generalized linear models, maximum likelihood, and difference-in-differences estimators. An expressive formula interface enables multiple estimations, stepwise regressions, and variable interpolation in a single call, while users can make on-the-fly inference adjustments using a variety of built-in robust standard errors. Finally, fixest provides methods for publication-ready regression tables and coefficient plots. Benchmarks against leading alternatives in R, Python, and Julia demonstrate best-in-class performance, and the paper includes many worked examples illustrating the core functionality.
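As a rough illustration of the fixed-effects demeaning problem this abstract refers to, the Python sketch below removes two sets of group means by plain alternating projections. It is not the accelerated fixed-point algorithm described in the abstract and it is not fixest code; the function name `demean_two_way` and the toy panel are hypothetical.

```python
def demean_two_way(values, group_a, group_b, tol=1e-10, max_iter=1000):
    """Remove two sets of group means by alternating projections.

    Illustrative only: a simple, unaccelerated iteration, not the
    algorithm implemented in fixest.
    """
    resid = list(values)
    for _ in range(max_iter):
        largest_shift = 0.0
        for groups in (group_a, group_b):
            # Compute the current mean of the residual within each group.
            totals, counts = {}, {}
            for r, g in zip(resid, groups):
                totals[g] = totals.get(g, 0.0) + r
                counts[g] = counts.get(g, 0) + 1
            means = {g: totals[g] / counts[g] for g in totals}
            # Subtract each observation's group mean.
            resid = [r - means[g] for r, g in zip(resid, groups)]
            largest_shift = max(largest_shift, max(abs(m) for m in means.values()))
        # Stop once neither pass moved the residuals by more than tol.
        if largest_shift < tol:
            break
    return resid


# Hypothetical balanced panel: outcome = unit effect + time effect, no noise.
units = ["u1", "u1", "u2", "u2"]
times = ["t1", "t2", "t1", "t2"]
outcome = [11.0, 12.0, 21.0, 22.0]
residuals = demean_two_way(outcome, units, times)
```

In this toy panel the outcome is purely additive in the unit and time effects, so the residuals after demeaning are zero. Regressing demeaned outcomes on demeaned regressors then gives the usual fixed-effects slope estimates.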
New submissions (showing 2 of 2 entries)
- [3] arXiv:2601.20875 (cross-list from stat.AP)
Title: Distributed Causality in the SDG Network: Evidence from Panel VAR and Conditional Independence Analysis
Authors: Md Muhtasim Munif Fahim, Md Jahid Hasan Imran, Luknath Debnath, Tonmoy Shill, Md. Naim Molla, Ehsanul Bashar Pranto, Md Shafin Sanyan Saad, Md Rezaul Karim
Comments: Comprehensive Manuscript with Code & Data
Subjects: Applications (stat.AP); Machine Learning (cs.LG); Econometrics (econ.EM); Methodology (stat.ME); Machine Learning (stat.ML)
Achieving the 2030 Sustainable Development Goals (SDGs) depends on strategic resource distribution. We propose a causal discovery framework that combines Panel Vector Autoregression with country-specific fixed effects and PCMCI+ conditional independence testing on 168 countries (2000-2025) to develop the first complete causal architecture of SDG dependencies. Using 8 strategically chosen SDGs, we identify a distributed causal network (i.e., no single 'hub' SDG), with 10 statistically significant Granger-causal relationships identified as 11 unique direct effects. Education to Inequality is the most statistically significant direct relationship (r = -0.599; p < 0.05), while the effect magnitude varies significantly with income level (e.g., high-income: r = -0.65; lower-middle-income: r = -0.06, non-significant). We also reject the idea that a single 'keystone' SDG exists. Additionally, we propose a tiered priority framework for the SDGs that identifies upstream drivers (Education, Growth), enabling goals (Institutions, Energy), and downstream outcomes (Poverty, Health). We conclude that effective SDG acceleration requires coordinated multi-dimensional interventions and that single-goal sequential strategies are insufficient.
- [4] arXiv:2601.21036 (cross-list from stat.ME)
Title: Experimental Design for Matching
Subjects: Methodology (stat.ME); Econometrics (econ.EM); Systems and Control (eess.SY)
Matching mechanisms play a central role in operations management across diverse fields including education, healthcare, and online platforms. However, experimentally comparing a new matching algorithm against a status quo presents fundamental challenges due to matching interference, where assigning a unit in one matching may preclude its assignment in the other. In this work, we take a design-based perspective to study the design of randomized experiments to compare two predetermined matching plans on a finite population, without imposing outcome or behavioral models. We introduce the notion of a disagreement set, which captures the difference between the two matching plans, and show that it admits a unique decomposition into disjoint alternating paths and cycles with useful structural properties. Based on these properties, we propose the Alternating Path Randomized Design, which sequentially randomizes along these paths and cycles to effectively manage interference. Within a minimax framework, we optimize the conditional randomization probability and show that, for long paths, the optimal choice converges to $\sqrt{2}-1$, minimizing worst-case variance. We establish the unbiasedness of the Horvitz-Thompson estimator and derive a finite-population Central Limit Theorem that accommodates complex and unstable path and cycle structures as the population grows. Furthermore, we extend the design to many-to-one matchings, where capacity constraints fundamentally alter the structure of the disagreement set. Using graph-theoretic tools, including finding augmenting paths and Euler-tour decomposition on an auxiliary unbalanced directed graph, we construct feasible alternating path and cycle decompositions that allow the design and inference results to carry over.
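To make the disagreement-set idea in this abstract concrete, here is a minimal Python sketch for the one-to-one case. It assumes the disagreement set is the symmetric difference of the two plans' edge sets, which is one natural reading of "the difference between the two matching plans". Every node then touches at most one edge from each plan, so each connected component is a path or a cycle. The function name `disagreement_components` and the worker/job labels are hypothetical, and the sketch does not implement the paper's randomization design.

```python
from collections import defaultdict


def disagreement_components(matching_a, matching_b):
    """Split the symmetric difference of two one-to-one matchings
    into connected components, each labeled "path" or "cycle"."""
    edges_a = {frozenset(edge) for edge in matching_a}
    edges_b = {frozenset(edge) for edge in matching_b}
    disagreement = edges_a ^ edges_b  # edges used by exactly one plan

    adjacency = defaultdict(set)
    for edge in disagreement:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first search to collect one connected component.
        stack, nodes = [start], set()
        while stack:
            node = stack.pop()
            if node in nodes:
                continue
            nodes.add(node)
            stack.extend(adjacency[node] - nodes)
        seen |= nodes
        # Degrees are at most 2, so a component with as many edges
        # as nodes is a cycle; otherwise it is a path.
        n_edges = sum(len(adjacency[n]) for n in nodes) // 2
        kind = "cycle" if n_edges == len(nodes) else "path"
        components.append((kind, sorted(nodes)))
    return components


# Hypothetical plans matching workers (w*) to jobs (j*).
plan_a = [("w1", "j1"), ("w2", "j2"), ("w3", "j3"), ("w4", "j4"), ("w5", "j5")]
plan_b = [("w1", "j2"), ("w2", "j1"), ("w3", "j3"), ("w4", "j5")]
components = disagreement_components(plan_a, plan_b)
```

Here the pair (w3, j3) appears in both plans and drops out, the swap between w1 and w2 forms a four-node cycle, and the remaining edges form the path j4, w4, j5, w5.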
- [5] arXiv:2601.21470 (cross-list from cs.LG)
Title: PPI-SVRG: Unifying Prediction-Powered Inference and Variance Reduction for Semi-Supervised Optimization
Comments: 27 pages, 4 figures
Subjects: Machine Learning (cs.LG); Econometrics (econ.EM); Optimization and Control (math.OC); Machine Learning (stat.ML)
We study semi-supervised stochastic optimization when labeled data is scarce but predictions from pre-trained models are available. PPI and SVRG both reduce variance through control variates: PPI uses predictions, while SVRG uses reference gradients. We show they are mathematically equivalent and develop PPI-SVRG, which combines both. Our convergence bound decomposes into the standard SVRG rate plus an error floor from prediction uncertainty. The rate depends only on loss geometry; predictions affect only the neighborhood size. When predictions are perfect, we recover SVRG exactly. When predictions degrade, convergence remains stable but reaches a larger neighborhood. Experiments confirm the theory: PPI-SVRG reduces MSE by 43-52% under label scarcity on mean estimation benchmarks and improves test accuracy by 2.7-2.9 percentage points on MNIST with only 10% labeled data.
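The control-variate idea behind PPI can be sketched for the simplest case, mean estimation. The Python snippet below assumes the standard prediction-powered mean estimator: the average prediction on unlabeled data plus the average labeled residual. It is not the PPI-SVRG algorithm from this paper, and the function name `ppi_mean` and the toy numbers are hypothetical.

```python
from statistics import fmean


def ppi_mean(labeled_y, labeled_pred, unlabeled_pred):
    """Prediction-powered estimate of a population mean.

    The average prediction on the unlabeled sample is corrected by the
    average prediction error measured on the small labeled sample.
    """
    rectifier = fmean([y - f for y, f in zip(labeled_y, labeled_pred)])
    return fmean(unlabeled_pred) + rectifier


# Hypothetical data: the model overpredicts by 0.5 on the labeled sample.
labeled_y = [1.0, 2.0, 3.0]
labeled_pred = [1.5, 2.5, 3.5]
unlabeled_pred = [2.0, 4.0]
estimate = ppi_mean(labeled_y, labeled_pred, unlabeled_pred)
```

In this example the labeled residuals average -0.5, so the estimate shifts the unlabeled prediction mean of 3.0 down to 2.5. When the predictions match the labels exactly, the correction term is zero and the estimate is just the mean prediction.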
- [6] arXiv:2601.21534 (cross-list from econ.GN)
Title: Electoral Polls and Economic Uncertainty: an Analysis of the Last Two U.S. Presidential Elections
Comments: 25 pages, 2 tables, 5 figures
Subjects: General Economics (econ.GN); Econometrics (econ.EM)
This paper examines the dynamic relationship between electoral polls and indicators of economic and financial uncertainty during the last two U.S. presidential elections (2020 and 2024). Using daily polling data on Donald Trump and measures such as the Aruoba-Diebold-Scotti Business Conditions Index, the 5-year Breakeven Inflation Rate, the Trade Policy Uncertainty index, and the VIX, we estimate conditional correlation models to capture time-varying interactions. The analysis reveals that in 2020, correlations between polls and uncertainty measures were highly dynamic and event-driven, reflecting the influence of exogenous shocks (COVID-19, oil price collapse) and political milestones (primaries, debates). In contrast, during the 2024 campaign, correlations remained close to zero, stable, and largely unresponsive to shocks, suggesting that entrenched polarization and non-economic events (e.g., assassination attempt, candidate changes) muted the economic channel. The study highlights how the interplay between voter sentiment, financial markets, and uncertainty varies across electoral contexts, offering a methodological contribution through the application of Dynamic Conditional Correlation models to political data and policy-relevant insights on the conditions under which economic fundamentals influence electoral dynamics.
Cross submissions (showing 4 of 4 entries)
- [7] arXiv:2310.07151 (replaced)
Title: Identification and Estimation of a Semiparametric Logit Model using Network Data
Subjects: Econometrics (econ.EM)
This paper studies identification and estimation in semiparametric binary choice models when social networks are endogenous. In many applications, unobserved individual traits shape both the outcome of interest and the formation of social ties, so standard logit specifications, including those augmented with common network controls, can be biased. I show how network data can be used to address this endogeneity without imposing parametric structure on the link formation process.
The key insight is that agents who are observationally equivalent in their network formation behavior share the same latent social influence, even if the underlying individual traits remain unobserved. Exploiting this equivalence, I establish point identification of the slope parameters in a binary response model by comparing matched pairs of agents with identical network types. I propose feasible estimators based on nonparametric matching using codegree information derived from the adjacency matrix and establish their consistency and asymptotic normality. Monte Carlo simulations demonstrate that the proposed estimator performs well in finite samples across a range of network designs. An empirical application to microfinance adoption in rural Indian villages illustrates how the method can be implemented in a canonical network dataset and shows that accounting for endogenous network formation affects estimated covariate effects, both with and without village fixed effects.
- [8] arXiv:2402.19425 (replaced)
Title: Testing Information Ordering for Strategic Agents
Subjects: Econometrics (econ.EM)
Specifying the information structure in strategic environments is difficult for empirical researchers. We develop a test of information ordering that examines whether the true information structure is at least as informative as a proposed baseline. Using Bayes Correlated Equilibrium (BCE), we translate the ordering of information structures into testable moment inequalities and establish uniform asymptotic validity for our testing procedure. In an application to U.S. airline markets, we test whether hub airlines have informational advantages beyond cost and demand benefits. We reject the privileged information hypothesis, with rejections concentrated in large, competitive markets.
- [9] arXiv:2404.11198 (replaced)
Title: Forecasting with panel data: Estimation uncertainty versus parameter heterogeneity
Subjects: Econometrics (econ.EM)
We provide a comprehensive examination of the predictive performance of panel forecasting methods based on individual, pooling, fixed effects, and empirical Bayes estimation, and propose optimal weights for forecast combination schemes. We consider linear panel data models, allowing for weakly exogenous regressors and correlated heterogeneity. We quantify the gains from exploiting panel data and demonstrate how forecasting performance depends on the degree of parameter heterogeneity, whether such heterogeneity is correlated with the regressors, the goodness of fit of the model, and the dimensions of the data. Monte Carlo simulations and empirical applications to house prices and CPI inflation show that empirical Bayes and forecast combination methods perform best overall and rarely produce the least accurate forecasts for individual series.
- [10] arXiv:2510.19672 (replaced)
Title: Policy Learning with Abstention
Comments: Accepted at AISTATS 2025
Subjects: Machine Learning (cs.LG); Econometrics (econ.EM); Machine Learning (stat.ML)
Policy learning algorithms are widely used in areas such as personalized medicine and advertising to develop individualized treatment regimes. However, most methods force a decision even when predictions are uncertain, which is risky in high-stakes settings. We study policy learning with abstention, where a policy may defer to a safe default or an expert. When a policy abstains, it receives a small additive reward on top of the value of a random guess. We propose a two-stage learner that first identifies a set of near-optimal policies and then constructs an abstention rule from their disagreements. We establish fast O(1/n)-type regret guarantees when propensities are known, and extend these guarantees to the unknown-propensity case via a doubly robust (DR) objective. We further show that abstention is a versatile tool with direct applications to other core problems in policy learning: it yields improved guarantees under margin conditions without the common realizability assumption, connects to distributionally robust policy learning by hedging against small data shifts, and supports safe policy improvement by ensuring improvement over a baseline policy with high probability.
- [11] arXiv:2512.21080 (replaced)
Title: LLM Personas as a Substitute for Field Experiments in Method Benchmarking
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Econometrics (econ.EM)
Field experiments (A/B tests) are often the most credible benchmark for methods (algorithms) in societal systems, but their cost and latency bottleneck rapid methodological progress. LLM-based persona simulation offers a cheap synthetic alternative, yet it is unclear whether replacing humans with personas preserves the benchmark interface that adaptive methods optimize against. We prove an if-and-only-if characterization: when (i) methods observe only the aggregate outcome (aggregate-only observation) and (ii) evaluation depends only on the submitted artifact and not on the method's identity or provenance (method-blind evaluation), swapping humans for personas is just panel change from the method's point of view, indistinguishable from changing the evaluation population (e.g., New York to Jakarta). Furthermore, we move from validity to usefulness: we define an information-theoretic discriminability of the induced aggregate channel and show that making persona benchmarking as decision-relevant as a field experiment is fundamentally a sample-size question, yielding explicit bounds on the number of independent persona evaluations required to reliably distinguish meaningfully different methods at a chosen resolution.