Junting Duan

Ph.D. student
Management Science and Engineering
Stanford University

duanjt@stanford.edu

I am a fifth-year Ph.D. student in the Department of Management Science and Engineering at Stanford University, advised by Professor Markus Pelger. I will be on the academic job market in Fall 2025. My research interests lie broadly in data-driven decision-making, machine learning, and statistical inference, with a particular focus on applications in causal inference and finance.

I received my B.S. from the School of Mathematical Sciences at Peking University.

Journal Publications & Working Papers

Imputation-Powered Inference for Missing Covariates
Junting Duan*, Markus Pelger*
Working Paper
Abstract: A prevalent problem in empirical research is missing covariate data as an input for model estimation. This paper develops a novel framework for optimally using partially observed covariate data for estimation and inference in a downstream model. Commonly used approaches that either discard all partially observed samples or naively treat imputed values as real observations are generally inefficient or biased. Our novel approach combines a bias correction with an optimal weighting scheme for imputed values that balances the efficiency trade-off between imputation error and effective sample size. Our method ensures valid inference while improving statistical efficiency by leveraging all available data. Our framework accommodates general missing data patterns for large panels and can be combined with a broad class of imputation methods. We establish the asymptotic normality of the proposed estimator under general assumptions. In simulations, we demonstrate the superior performance of our method over the naive approaches, as it achieves both lower bias and variance, while being robust to the imputation quality.

Learning the Values of Illiquid Bonds with Statistical Guarantees
Junting Duan*, Yang Fan*, Kay Giesecke*
Working Paper
Abstract: Mortgage, corporate, municipal and other bonds are often illiquid. We develop and test a scalable, assumption-light semiparametric method for learning the values of non-traded bonds from contemporaneous and past trade data. The method is based on a conditional latent factor model for the quantiles of bond values, with factor exposures depending on observable bond attributes. Our training procedure leverages deep neural networks for factor exposures and employs an alternating optimization procedure. Simultaneous training for several quantiles delivers a robust point estimate and a prediction interval for the value of a non-traded bond. The interval quantifies prediction uncertainty and has attribute-dependent length; conformal prediction techniques yield a finite-sample coverage guarantee. In an empirical application, we estimate pay-ups for Agency MBS pools using data from over 4 million TRACE trade records during the period 2011–22 along with data on pool attributes from eMBS/ICE. Our method delivers accurate out-of-sample price estimates and valid prediction intervals.

Automatic Doubly Robust Forest
Zhaomeng Chen*, Junting Duan*, Victor Chernozhukov, Vasilis Syrgkanis
Working Paper
Abstract: This paper proposes the automatic Doubly Robust Random Forest (DRRF) algorithm for estimating the conditional expectation of a moment functional in the presence of high-dimensional nuisance functions. DRRF extends the automatic debiasing framework based on the Riesz representer to the conditional setting and enables nonparametric, forest-based estimation (Athey et al., 2019; Oprescu et al., 2019). In contrast to existing methods, DRRF does not require prior knowledge of the form of the debiasing term or impose restrictive parametric or semi-parametric assumptions on the target quantity. Additionally, it is computationally efficient in making predictions at multiple query points. We establish consistency and asymptotic normality results for the DRRF estimator under general assumptions, allowing for the construction of valid confidence intervals. Through extensive simulations in heterogeneous treatment effect (HTE) estimation, we demonstrate the superior performance of DRRF over benchmark approaches in terms of estimation accuracy, robustness, and computational efficiency.

Factor Analysis for Causal Inference on Large Non-Stationary Panels with Endogenous Treatment
Junting Duan*, Markus Pelger*, Ruoxuan Xiong*
Management Science, Revise & Resubmit
Abstract: This paper studies the imputation and inference for large-dimensional non-stationary panel data with missing observations. We propose a novel method, Within-Transform-PCA (wi-PCA), to estimate an approximate latent factor structure and non-stationary two-way fixed effects under general missing patterns. The missing patterns can depend on the latent factor model and two-way fixed effects. Our method combines a novel within-transformation for the estimation of two-way fixed effects with a PCA on within-transformed data. We provide entry-wise inferential theory for the values imputed with wi-PCA. The key application of wi-PCA is the estimation of counterfactuals on causal panels, where we allow for two-way endogenous treatment effects, time trends and general latent confounders. In an empirical study of the liberalization of marijuana, we show that wi-PCA yields more accurate estimates of treatment effects and more credible economic conclusions compared to its two special cases of conventional difference-in-differences and PCA.

Change-Point Testing for Risk Measures in Time Series
Lin Fan, Junting Duan, Peter Glynn, Markus Pelger
Journal of Financial Econometrics, Revise & Resubmit
Abstract: We propose novel methods for change-point testing for nonparametric estimators of expected shortfall and related risk measures in weakly dependent time series. We can detect general multiple structural changes in the tails of marginal distributions of time series under general assumptions. Self-normalization allows us to avoid the issues of standard error estimation. The theoretical foundations for our methods are functional central limit theorems, which we develop under weak assumptions. An empirical study of S&P 500 and US Treasury bond returns illustrates the practical use of our methods in detecting and quantifying instability in the tails of financial time series.

Target PCA: Transfer Learning Large Dimensional Panel Data
Junting Duan*, Markus Pelger*, Ruoxuan Xiong*
Journal of Econometrics, 2023
Abstract: This paper develops a novel method to estimate a latent factor model for a large target panel with missing observations by optimally using the information from auxiliary panel data sets. We refer to our estimator as target-PCA. Transfer learning from auxiliary panel data allows us to deal with a large fraction of missing observations and weak signals in the target panel. We show that our estimator is more efficient and can consistently estimate weak factors, which are not identifiable with conventional methods. We provide the asymptotic inferential theory for target-PCA under very general assumptions on the approximate factor model and missing patterns. In an empirical study of imputing data in a mixed-frequency macroeconomic panel, we demonstrate that target-PCA significantly outperforms all benchmark methods.

Education

2026 (Expected): Ph.D., Department of Management Science and Engineering, Stanford University
2020: B.S., School of Mathematical Sciences, Peking University

Teaching

MS&E 245A Investment Science, Teaching Assistant, Fall 2021-2025
MS&E 221 Stochastic Modeling, Teaching Assistant, Spring 2023-2025
MS&E 211 Introduction to Optimization, Winter 2021-2022