## RESEARCH

### Job Market Paper

Type Fixed Effects and Rational Addiction: A GMM Framework for Latent Type Heterogeneity (Download)

Abstract. This paper reexamines Rational Addiction (RA) by introducing the type fixed effects (TFE) panel model. The TFE model incorporates heterogeneous coefficients and time-varying patterns of heterogeneity, which reflect differences in preferences and the addiction process. The model assumes the existence of a latent, time-invariant continuous variable, referred to as a "type", which drives the heterogeneity in the parameters. Smoothness of the parameters as functions of the type is key to identification, allowing individuals of similar types to have similar parameter values. Correlation between the parameters, covariates, and instruments stems from type heterogeneity. I propose the type fixed effects generalized method of moments (TFE-GMM) estimator and establish its consistency. I provide fast computation procedures based on the stochastic gradient descent algorithm. Simulations demonstrate good performance of this estimator. Estimating the model on yearly household cigarette purchase data shows that most households follow cyclical consumption patterns and are insensitive to price changes, which supports educational interventions to curb smoking.

### Publications

A solution for the greedy approximation of a step function with a waveform dictionary. Communications in Nonlinear Science and Numerical Simulation, 2022, 106890. (arXiv, Journal). With Pierluigi Vellucci.

Abstract. In this paper we consider a step function characterized by an arbitrary sequence of real-valued scalars and approximate it with a matching pursuit (MP) algorithm. We utilize a waveform dictionary with rectangular window functions as part of this algorithm. We show that the waveform dictionary is not necessary when all of the scalars are either nonpositive or nonnegative, and that the parameters of a wavelet dictionary on an integer lattice yield a closed-form solution for the initial optimization problem as part of the MP. Additionally, for any real-valued scalar sequence, we provide a solution with a related wavelet dictionary at each iteration of the algorithm. This allows for practical calculation of the approximating function, which we use to provide examples on simulated and real univariate time series data that display discontinuities in their underlying structure, where the step function can be thought of as a sample from a signal of interest.
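To make the greedy idea concrete, the following is a minimal sketch of matching pursuit on a sampled step function. It is not the paper's method: it assumes a simplified dictionary of unit-norm rectangular windows on the integer lattice (no modulation), and it finds the best atom by brute-force search rather than by the closed-form solution derived in the paper. The function name `matching_pursuit_step` and the sample values in `y` are hypothetical.

```python
import math

def matching_pursuit_step(y, n_iter):
    """Greedy matching pursuit with unit-norm rectangular atoms on [a, b).

    Assumption: the dictionary contains only indicator windows on the
    integer lattice, a simplification of a waveform dictionary.
    """
    n = len(y)
    residual = list(y)
    approx = [0.0] * n
    for _ in range(n_iter):
        # Prefix sums give the inner product with any window in O(1).
        prefix = [0.0]
        for v in residual:
            prefix.append(prefix[-1] + v)
        best = (0.0, 0, 1)  # (inner product, a, b)
        for a in range(n):
            for b in range(a + 1, n + 1):
                ip = (prefix[b] - prefix[a]) / math.sqrt(b - a)
                if abs(ip) > abs(best[0]):
                    best = (ip, a, b)
        ip, a, b = best
        # Coefficient times atom amplitude: the constant added on [a, b).
        height = ip / math.sqrt(b - a)
        for i in range(a, b):
            approx[i] += height
            residual[i] -= height
    return approx, residual

# Hypothetical sampled step function.
y = [1.0, 1.0, 3.0, 3.0, 3.0, 0.0]
approx, residual = matching_pursuit_step(y, n_iter=3)
energy = sum(r * r for r in residual)
```

Each iteration removes the projection of the residual onto the selected atom, so the residual energy never increases from one iteration to the next.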

### Working Papers

Lorenz map, inequality ordering and curves based on multidimensional rearrangements (arXiv), Revise & Resubmit (RESTAT). With Yanqin Fan, Marc Henry, and Brendan Pass.

Abstract. We propose a multivariate extension of the Lorenz curve based on multivariate rearrangements of optimal transport theory. We define a vector Lorenz map as the integral of the vector quantile map associated to a multivariate resource allocation. Each component of the Lorenz map is the cumulative share of each resource, as in the traditional univariate case. The pointwise ordering of such Lorenz maps defines a new multivariate majorization order. We define a multi-attribute Gini index and complete ordering based on the Lorenz map. We formulate income egalitarianism and show that the class of egalitarian allocations is maximal with respect to our inequality ordering over a large class of allocations. We propose the level sets of an Inverse Lorenz Function as a practical tool to visualize and compare inequality in two dimensions, and apply it to income-wealth inequality in the United States between 1989 and 2019.
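For reference, the traditional univariate objects that the abstract generalizes can be sketched in a few lines. This illustrates only the standard one-dimensional Lorenz curve and Gini index, not the paper's vector Lorenz map or multi-attribute Gini index. The function names and the sample allocations are hypothetical.

```python
def lorenz_curve(incomes):
    """Points (population share, cumulative income share), poorest first."""
    xs = sorted(incomes)
    total = sum(xs)
    n = len(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points

def gini(incomes):
    """Gini index: one minus twice the area under the Lorenz curve."""
    pts = lorenz_curve(incomes)
    area = 0.0
    for (p0, l0), (p1, l1) in zip(pts, pts[1:]):
        area += (p1 - p0) * (l0 + l1) / 2.0  # trapezoid rule
    return 1.0 - 2.0 * area

# Hypothetical allocations: perfectly equal versus fully concentrated.
equal = gini([1.0, 1.0, 1.0, 1.0])
concentrated = gini([0.0, 0.0, 0.0, 4.0])
```

An egalitarian allocation puts the Lorenz curve on the diagonal, so its Gini index is zero, while concentrating the resource on one individual pushes the index toward one.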

Unobserved Grouped Heteroskedasticity and Fixed Effects (arXiv)

Abstract. This paper extends the linear grouped fixed effects (GFE) panel model to allow for heteroskedasticity from a discrete latent group variable. Key features of GFE are preserved: individuals belong to one of a finite number of groups, and group membership is unrestricted and estimated. Ignoring group heteroskedasticity may lead to poor classification, which worsens the finite-sample bias and standard errors of estimators. I introduce the “weighted grouped fixed effects” (WGFE) estimator that minimizes a weighted average of group sum of squared residuals. I establish √NT-consistency and normality under a concept of group separation based on second moments. A test of group homoskedasticity is discussed. A fast computation procedure is provided. Simulations show that WGFE outperforms alternatives that exclude second moment information. I demonstrate this approach by considering the link between income and democracy and the effect of unionization on earnings.

### In Progress

Switching, Quitting, and Relapse: A Demand Model for Nicotine Products and Consumer Type-Heterogeneity

Health concerns regarding tobacco products have prompted firms to introduce supposedly healthier alternatives, such as e-cigarettes and nicotine pouches, which eliminate most harmful ingredients but retain nicotine. Nicotine addiction, a major driver of consumption, varies widely among individuals. Consumers often recognize the harm caused by these products and may attempt to quit or explore alternatives perceived as safer in response to relapse. Addressing this public health issue requires effective strategies to reduce consumption, but the complexities of nicotine addiction and its dynamics make analysis challenging. To account for this, I introduce a demand model of continuous type heterogeneity in which consumers of different types may differ in their preferences for substituting among nicotine products, their propensity to quit, and the potency of the addiction that influences their choices. In this discrete choice framework, individuals of similar types have comparable parameter values, as in type fixed effects models. Understanding the choice between switching, quitting, and relapse while controlling for type heterogeneity may yield insights into the role of various government interventions to reduce the use of nicotine products.

Variable Productivity Heterogeneity from Farmer Skill Clusters: A case against misallocation in Uganda

Large income disparities between countries may be attributed to productivity misallocation. Gollin and Udry (2021) analyze data from farmers in Uganda and Tanzania and find significant productivity differences among farmers. They break down total factor productivity into heterogeneity, measurement error, and misallocation components, with heterogeneity and measurement error accounting for a substantial portion of the dispersion. While they employ various unit fixed effects, I propose using the time-varying unobserved group fixed effects of Bonhomme and Manresa (2015) to capture "skill groups" of farmers that evolve over time. Groups may be generated along the lines of how skills are disseminated or improved, capturing dynamic information on the productivity of groups of farmers over time, potentially in a "learning-by-doing" manner. I am interested in how the dispersion changes once dynamic group productivity changes are accounted for. To capture the variability in productivity these groups might exhibit, I assume unobserved group heteroskedasticity as in my paper Rivero (2023). To account for measurement error, I propose a novel GMM estimator with a weight matrix that also depends on the unobserved groupings. I expect the results of Gollin and Udry (2021) to be strengthened, and the analysis to offer insights into how the heterogeneity evolves over time.

Latent Groups with Many Heterogeneous Moments

In the literature on panel data models with discrete unobservables, the cases of group-heterogeneous first and second moments are known (Bonhomme and Manresa (2015) and Rivero (2023), respectively). This paper extends the literature to heterogeneous moments beyond the second by introducing a novel estimator based on the Kantorovich-Wasserstein distance between probability distributions.
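As a small illustration of why a distance between distributions can detect heterogeneity that the first two moments miss, the sketch below computes the one-dimensional Wasserstein-1 distance between two equal-sized samples. In that case the distance equals the average absolute difference of the sorted values. This is not the estimator proposed in the paper. The function name `wasserstein_1d` and the two samples are hypothetical.

```python
import math

def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two empirical distributions.

    Assumption: both samples have the same size, so the optimal
    coupling simply matches sorted values in order.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)

# Two hypothetical groups with the same mean (0) and the same
# second moment (1) but different distributions.
group_a = [-1.0, -1.0, 1.0, 1.0]
group_b = [-math.sqrt(2.0), 0.0, 0.0, math.sqrt(2.0)]

distance = wasserstein_1d(group_a, group_b)
```

The two groups agree on mean and variance, yet the distance between them is strictly positive, which is the kind of information a moment-based grouping up to the second moment would not use.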

Generalized Basis Expansions for the Method of Sieves

The concept of frames in applied mathematics provides a flexible way to represent functions. Frames are collections of functions that can be used to represent any function in some "nice" function space as a series expansion, without the need for linear independence. The lack of linear independence implies redundancy in the set, resulting in non-unique representations. While this may seem less desirable, it can be advantageous over a basis when a large number of basis coefficients must be calculated. Bases are often used in sieve approximations because they are known to recover the true regression function, but the choice of complexity can be difficult. In principle, a sieve approximation based on frames could offer more flexibility since the choice of complexity may not be so costly. For example, a wavelet frame may contain a redundant number of translations and dilations of the mother wavelet so, for any fixed complexity, it may be possible to optimally select the pairs of dilations and translations to improve performance of the coefficient estimators.
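Redundancy and non-uniqueness can be seen in a small finite-dimensional example. The sketch below uses three equally spaced unit vectors in the plane (often called the Mercedes-Benz frame), which form a tight frame with frame bound 3/2. It is a toy illustration in two dimensions, not a sieve estimator or a wavelet frame, and the names `analyze`, `synthesize`, and the vector `x` are hypothetical.

```python
import math

# Three unit vectors at 120-degree spacing: a tight frame for the plane.
frame = [
    (math.cos(math.pi / 2 + 2 * math.pi * k / 3),
     math.sin(math.pi / 2 + 2 * math.pi * k / 3))
    for k in range(3)
]

def analyze(x):
    """Inner products of x with each frame vector."""
    return [x[0] * f[0] + x[1] * f[1] for f in frame]

def synthesize(coeffs):
    """Linear combination of the frame vectors with the given coefficients."""
    return (
        sum(c * f[0] for c, f in zip(coeffs, frame)),
        sum(c * f[1] for c, f in zip(coeffs, frame)),
    )

x = (0.7, -1.2)  # hypothetical vector to represent

# Canonical coefficients: scale the inner products by 1 / (frame bound).
canonical = [2.0 / 3.0 * c for c in analyze(x)]

# The frame vectors sum to zero, so shifting every coefficient by the
# same constant yields a different representation of the same vector.
shifted = [c + 5.0 for c in canonical]

x_from_canonical = synthesize(canonical)
x_from_shifted = synthesize(shifted)
```

Both coefficient vectors reconstruct `x`, which shows the non-unique representations that a redundant set allows and that a basis rules out.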