Journal of the American Statistical Association
S Yao, B Rava, X Tong, G James
December 9, 2023
Label noise in data has long been an important problem in supervised learning applications as it affects the effectiveness of many widely used classification methods. Recently, important real-world applications, such as medical diagnosis and cybersecurity, have generated renewed interest in the NeymanâPearson (NP) classification paradigm, which constrains the more severe type of error (e.g., the Type I error) under a preferred level while minimizing the other (e.g., the Type II error).
Journal of the American Statistical Association
L Fu, B Gang, GM James, W Sun
December 4, 2022
Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data.
Journal of the American Statistical Association
GM James, P Radchenko, B Rava
August 17, 2022
We consider the common setting where one observes probability estimates for a large number of events, such as default risks for numerous bonds. Unfortunately, even with unbiased estimates, selecting events corresponding to the most extreme probabilities can result in systematically underestimating the true level of uncertainty. We develop an empirical Bayes approach âexcess certainty adjusted probabilitiesâ (ECAP), using a variant of Tweedieâs formula, which updates probability estimates to correct for selection bias.
Journal of Marketing
D Chandrasekaran, GJ Tellis, GM James
May 17, 2022
When faced with new technologies, the incumbentsâ dilemma is whether to embrace the new technology, stick with their old technology, or invest in both. The entrantsâ dilemma is whether to target a niche and avoid incumbent reaction or target the mass market and incur the incumbentâs wrath. The solution is knowing to what extent the new technology cannibalizes the old one or whether both technologies may exist in tandem. The authors develop a generalized model of the diffusion of successive technologies, which allows for the rate of disengagement from the old technology to differ from the rate of adoption of the new. The model helps managers estimate evolving proportions of segments that play different roles in the competition between technologies and predict technological leapfrogging, cannibalization, and coexistence.
Biometrika
Qiao, X., Qian, C., James, G. and Guo, S.
October 9, 2020
We consider estimating a functional graphical model from multivariate functional observations. In functional data analysis, the classical assumption is that each function has been measured over a densely sampled grid. However, in practice the functions have often been observed, with measurement error, at a relatively small number of points. In this paper, we propose a class of doubly functional graphical models to capture the evolving conditional dependence relationship among a large number of sparsely or densely sampled functions.
Journal of the American Statistical Association
GM James, C Paulson, P Rusmevichientong
June 19, 2020
Firms are increasingly transitioning advertising budgets to Internet display campaigns, but this transition poses new challenges. These campaigns use numerous potential metrics for success (e.g., reach or click rate), and because each website represents a separate advertising opportunity, this is also an inherently high-dimensional problem. Further, advertisers often have constraints they wish to place on their campaign, such as targeting specific sub-populations or websites. These challenges require a method flexible enough to accommodate thousands of websites, as well as numerous metrics and campaign constraints. Motivated by this application, we consider the general constrained high-dimensional problem, where the parameters satisfy linear constraints. We develop the Penalized and Constrained optimization method (PaC) to compute the solution path for high-dimensional, linearly constrained criteria.
Journal of Marketing Research
C Paulson, L Luo, GM James
August 1, 2018
In today's digital market, the number of websites available for advertising has ballooned into the millions. Consequently, firms often turn to ad agencies and demand-side platforms (DSPs) to decide how to allocate their Internet display advertising budgets. Nevertheless, most extant DSP algorithms are rule-based and strictly proprietary. This article is among the first efforts in marketing to develop a nonproprietary algorithm for optimal budget allocation of Internet display ads within the context of programmatic advertising. Unlike many DSP algorithms that treat each ad impression independently, this method explicitly accounts for viewership correlations across websites.
The Annals of Statistics
Y Fan, GM James, P Radchenko
October 1, 2015
We suggest a new method, called Functional Additive Regression, or FAR, for efficiently performing high-dimensional functional regression. We demonstrate that FAR can be implemented with a wide range of penalty functions using a highly efficient coordinate descent algorithm. Theoretical results are developed which provide motivation for the FAR optimization criterion. Finally, we show through simulations and two real data sets that FAR can significantly outperform competing methods.