# Ordinal scale

Many applied studies collect one or more ordered categorical predictors, which do not fit neatly within classic regression frameworks. In this paper, we discuss the benefit of taking a smoothing spline approach to the modeling of ordinal predictors. The purpose of this paper is to provide theoretical insight into the ordinal smoothing spline, as well as examples revealing the potential of the ordinal smoothing spline for various types of applied research.

Specifically, we (i) derive the analytical form of the ordinal smoothing spline reproducing kernel, (ii) propose an ordinal smoothing spline isotonic regression estimator, (iii) prove an asymptotic equivalence between the ordinal and linear smoothing spline reproducing kernel functions, (iv) develop large sample approximations for the ordinal smoothing spline, and (v) demonstrate the use of ordinal smoothing splines for isotonic regression and semiparametric regression with multiple predictors. Our results reveal that the ordinal smoothing spline offers a flexible approach for incorporating ordered predictors in regression models, and has the benefit of being invariant to any monotonic transformation of the predictor scores.

The General Linear Model (GLM; see [ 1 ]) is one of the most widely applied statistical methods, with applications common in psychology [ 2 ], education [ 3 ], medicine [ 4 ], business [ 5 ], and several other disciplines. The GLM's popularity in applied research is likely due to a combination of the model's interpretability and flexibility, as well as its easy availability through R [ 6 ] and commercial statistical software.

The GLM and its generalized extension (GzLM; see [ 7 ]) are well-equipped for modeling relationships between variables of mixed types. However, many studies collect one or more ordered categorical variables, which do not fit neatly within the GLM framework. For example, in finance it is typical to rate the risk of investments on an ordinal scale (very low risk, low risk, …, high risk, very high risk), and a typical goal is to model expected returns given an investment's risk.

In medical studies, severity of symptoms (very low, low, …, high, very high) and adherence to treatment (never, rarely, …, almost always, always) are often measured on ordinal scales, and a typical goal is to study patient outcomes in response to different treatments after controlling for symptom severity and treatment adherence.

Psychological attributes such as personality and intelligence are typically measured on an ordinal scale.

The examples mentioned in the previous paragraph represent just a glimpse of the many ways in which ordinal variables are relevant to our day-to-day financial, physical, and mental health. When it comes to modeling ordinal outcome (response) variables, there is a multitude of potential methods discussed in the literature (see [ 8 - 12 ]). In nearly all cases, ordinal predictors are treated as either nominal (unordered) or continuous variables in regression models, which can lead to convoluted and possibly misleading results.

We refer to this method as naive for two reasons. Although the dummy coding approach will suffice for certain applications, this method is far from ideal. Furthermore, if the number of levels K is large, the dummy coding approach could be infeasible. Another possibility for including an ordinal predictor X in a regression model is to simply treat X as a continuous variable. In some cases, researchers make an effort to code the levels of an ordinal predictor X such that the relationship between X and Y is approximately linear.

However, this approach is problematic for several reasons: (i) the slope coefficient in such a model has no meaning, (ii) different researchers could concoct different coding schemes for the same data, which would hinder research comparability and reproducibility, and (iii) the ordinal nature of the predictor X is ignored, which is undesirable.
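Point (ii) can be illustrated with a small sketch. The data, the two coding schemes, and the helper `linear_fit` below are all hypothetical and are not taken from the paper; the sketch only shows that two order-preserving codings of the same five levels give different slopes and different fitted values when X is treated as continuous.

```python
import numpy as np

# Hypothetical ordinal predictor with K = 5 levels, 4 observations per level.
levels = np.array([1, 2, 3, 4, 5])
x = np.repeat(levels, 4)
rng = np.random.default_rng(0)
y = np.sqrt(x) + rng.normal(scale=0.1, size=x.size)  # made-up response

# Two monotonic (order-preserving) codings of the same levels.
coding_a = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
coding_b = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16}

def linear_fit(codes, y):
    """Least-squares line through (codes, y); returns slope and fitted values."""
    slope, intercept = np.polyfit(codes, y, deg=1)
    return slope, intercept + slope * codes

xa = np.array([coding_a[int(v)] for v in x], dtype=float)
xb = np.array([coding_b[int(v)] for v in x], dtype=float)
slope_a, fit_a = linear_fit(xa, y)
slope_b, fit_b = linear_fit(xb, y)

print("slope under coding A:", slope_a)
print("slope under coding B:", slope_b)
```

Because the two codings are not affine transformations of each other, both the slope and the predicted values change, even though the ordering of the levels is identical.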

Penalized regression provides a promising framework for including ordinal predictors in regression models [ 13 ], given that an appropriate penalty can simultaneously induce order information on the solution and stabilize the estimation.

Gertheiss and Tutz [ 13 ] discuss how a binary design matrix in combination with a squared difference penalty can be used to fit regression models with ordinal predictors. This approach is implemented in the R package ordPens [ 14 ], which fits models containing additive effects of ordinal and metric (continuous) predictors.
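The idea of pairing a binary design matrix with a squared difference penalty can be sketched as follows. This is a minimal illustration under stated assumptions, not the ordPens implementation: the function name `difference_penalty_fit`, the full indicator coding (one column per level, no reference category), the first-order difference penalty, and the simulated data are all choices made here for illustration.

```python
import numpy as np

def difference_penalty_fit(x, y, K, lam):
    """Penalized least squares for one ordinal predictor with levels 1..K.

    Minimizes ||y - Z f||^2 + lam * sum_k (f[k+1] - f[k])^2, where Z is the
    n x K binary indicator matrix. Returns the K estimated level effects.
    Assumes lam > 0, or that every level is observed when lam == 0.
    """
    n = x.size
    Z = np.zeros((n, K))
    Z[np.arange(n), x - 1] = 1.0        # binary design: one column per level
    D = np.diff(np.eye(K), axis=0)      # (K-1) x K first-difference matrix
    A = Z.T @ Z + lam * (D.T @ D)
    return np.linalg.solve(A, Z.T @ y)

# Hypothetical data: 6 ordered levels, 10 observations per level.
K = 6
x = np.tile(np.arange(1, K + 1), 10)
rng = np.random.default_rng(1)
y = np.sqrt(x) + rng.normal(scale=0.2, size=x.size)

f_unpenalized = difference_penalty_fit(x, y, K, lam=0.0)   # level means
f_smooth = difference_penalty_fit(x, y, K, lam=5.0)        # shrunk toward neighbors
f_flat = difference_penalty_fit(x, y, K, lam=1e8)          # nearly constant
```

With `lam = 0` the estimates are the per-level sample means, and as `lam` grows the adjacent levels are pulled together until the fit approaches the overall mean, which is how the penalty both encodes order information and stabilizes estimation.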

As a result, this approach and the ordPens R package offer no method for examining interaction effects between multiple ordinal predictors or interaction effects between ordinal and metric predictors. In this paper, we discuss the benefits of taking a smoothing spline approach [ 15, 16 ] to the modeling of ordinal predictors.

This approach has been briefly mentioned as a possibility [ 15 ]. Expanding the work of Gertheiss and Tutz [ 13 ] and Gu [ 15 ], we derive the analytical form of the ordinal smoothing spline reproducing kernel (Section 3). Our results reveal that the reproducing kernel function only depends on rank information, so the ordinal smoothing spline estimator is invariant to any monotonic transformation of the predictor scores.
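The rank-only dependence can be checked numerically. The sketch below does not use the closed-form kernel derived in the paper (which is not reproduced in this text). Instead it assumes that the penalty is the sum of squared differences between adjacent levels, acting on functions that sum to zero across the K levels. Under that assumption the reproducing kernel matrix is the Moore-Penrose pseudoinverse of D'D, where D is the first-difference matrix. The function names are hypothetical, and `kernel_from_scores` assumes that every level appears at least once in the observed scores.

```python
import numpy as np

def ordinal_kernel_matrix(K):
    """K x K kernel matrix for levels 1..K under a first-difference penalty.

    Assumption: penalty J(f) = sum_k (f[k+1] - f[k])^2 on sum-to-zero f,
    so the kernel is the pseudoinverse of D'D.
    """
    D = np.diff(np.eye(K), axis=0)
    return np.linalg.pinv(D.T @ D)

def kernel_from_scores(scores):
    """Kernel matrix between observations, computed from ranks only."""
    scores = np.asarray(scores).ravel()
    levels, ranks = np.unique(scores, return_inverse=True)  # ranks in 0..K-1
    ranks = np.asarray(ranks).ravel()
    Q = ordinal_kernel_matrix(levels.size)
    return Q[np.ix_(ranks, ranks)]

# Raw scores and a monotonic transformation of them give the same kernel.
scores = np.array([1, 3, 2, 5, 4, 2])
Q_raw = kernel_from_scores(scores)
Q_exp = kernel_from_scores(np.exp(scores))
print(np.array_equal(Q_raw, Q_exp))
```

Because only the ranks of the scores enter the computation, any strictly increasing recoding of the levels leaves the kernel matrix, and therefore any estimator built from it, unchanged.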

We also propose an ordinal smoothing spline isotonic regression estimator (Section 3).

Furthermore, we prove an asymptotic equivalence between the ordinal and linear smoothing spline reproducing kernel functions and develop large sample approximations for the ordinal smoothing spline (Section 3). Finally, we demonstrate the potential of the ordinal smoothing spline for applied research via a simulation study and two real data examples. Our simulation study (Section 4) reveals that the ordinal smoothing spline can outperform the linear smoothing spline and classic isotonic regression algorithms when analyzing monotonic functions with various degrees of smoothness.

Our real data results (Section 5) demonstrate that the ordinal smoothing spline, in combination with the powerful smoothing spline ANOVA framework [ 15 ], provides an appealing approach for including ordinal predictors in regression models.

The symmetric and non-negative definite function. Assume a nonparametric regression model (see [ 15, 16, 20 - 23 ]). Note that, by definition, the quadratic penalty functional J is the inner-product of the contrast space H1. The essential components for the formation of a smoothing spline include: Table 1 provides the information needed to form three common smoothing splines (nominal, polynomial, and thin-plate), as well as the ordinal smoothing spline.

See Gu [ 15 ] and Helwig and Ma [ 26 ] for more information about nominal and polynomial smoothing splines, and see Gu [ 15 ], Helwig and Ma [ 26 ], Duchon [ 27 ], Meinguet [ 28 ], and Wood [ 29 ] for more information about thin-plate splines.

More information about the ordinal smoothing spline will be provided in Section 3. If the predictors are all continuous with similar scale, a thin-plate spline can be used. Specifically, when solving the penalized least squares functional, we can define.

The fitted values can be written as. The smoothing spline solution in Equation can be interpreted as a Bayesian estimate of a Gaussian process [ 36 - 40 ]. The corresponding Bayesian covariance matrix estimator is. See the Supplementary Material for a proof. Note that Theorem 3. When applying the Bayesian interpretation in Section 2.

Using the reproducing kernel definition in the first line of Theorem 3. Specifically, we could reparameterize the ordinal smoothing spline problem as. See the Supplementary Material for the proof. Bottom: Frobenius norm between the standardized ordinal and linear smoothing spline reproducing kernel matrices Q as a function of K.

If the number of elements of the ordered set K is quite large, then using all K levels as knots would be computationally costly. When monotonicity constraints are needed, the ordinal smoothing spline would be computationally costly for large K.
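A monotonicity-constrained fit of the kind discussed above can be sketched with a step-function basis and non-negative increments. This is a stand-in under stated assumptions, not the paper's reparameterization (whose exact form is not shown in this text): the function name `monotone_ordinal_fit`, the squared-increment penalty, and the plain projected-gradient solver are choices made here for illustration, and all K levels are used as knots.

```python
import numpy as np

def monotone_ordinal_fit(x, y, K, lam, n_iter=5000):
    """Non-decreasing penalized fit for an ordinal predictor with levels 1..K.

    Writes f = cumsum(theta), where theta[0] is the value at level 1 and
    theta[1:] >= 0 are increments between adjacent levels. Minimizes
    0.5 * ||y - Z f||^2 + 0.5 * lam * sum(theta[1:]^2) by projected gradient.
    Assumes lam > 0.
    """
    n = x.size
    Z = np.zeros((n, K))
    Z[np.arange(n), x - 1] = 1.0
    steps = np.tril(np.ones((K, K)))     # column j is the step 1{level index >= j}
    B = Z @ steps                        # design matrix in the theta parameterization
    H = B.T @ B + np.diag(np.r_[0.0, np.full(K - 1, lam)])
    g = B.T @ y
    step_size = 1.0 / np.linalg.eigvalsh(H)[-1]   # 1 / largest eigenvalue
    theta = np.zeros(K)
    theta[0] = y.mean()
    for _ in range(n_iter):
        theta = theta - step_size * (H @ theta - g)
        theta[1:] = np.maximum(theta[1:], 0.0)    # project onto increments >= 0
    return np.cumsum(theta)              # level values, non-decreasing by construction

# Hypothetical noise-free data whose level means dip at level 3.
x = np.tile(np.arange(1, 6), 10)
dip_means = np.array([1.0, 3.0, 2.0, 4.0, 5.0])
y_dip = dip_means[x - 1]
fit_dip = monotone_ordinal_fit(x, y_dip, K=5, lam=1e-8)
print(np.round(fit_dip, 3))
```

The solver works with K coefficients, one per level, which is why using every level as a knot becomes costly when K is large and why a knot-based approximation is attractive in that setting.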

This is because the factorization of the reproducing kernel into the outer product of monotonic functions, i. For a scalable approximation to the ordinal smoothing spline isotonic regression estimator, the penalty functional itself can be approximated such as. The modified RKHS is. Using the modified reproducing kernel, the reparameterized ordinal smoothing spline problem is.

To investigate the performance of the ordinal smoothing spline, we designed a simulation study that manipulated two conditions: We compared four different methods as a part of the simulation: Methods (a) to (c) were fit using the bigsplines package [ 41 ] in R.

For the smoothing spline methods, we used the same sequence of 20 points as knots to ensure that differences in the results are not confounded by differences in the knot locations. The data generating functions from the simulation study. The RMSE for each method is displayed in Table 2 and Figure 4, which clearly reveal the benefit of the ordinal smoothing spline. Figure 4 also reveals that the monotonicity constrained estimator using the knot-approximated reproducing kernel function described in Section 3.

Furthermore, the simulation results in Figure 4 demonstrate that the ordinal smoothing spline systematically outperforms the default isotonic regression routine.

Thus, the ordinal smoothing spline offers an effective alternative to classic nonparametric and isotonic regression methods, and—unlike the linear smoothing spline—the ordinal smoothing spline has the benefit of being invariant to any monotonic transformation of x.

Median root mean squared error (RMSE) across simulation replications. To demonstrate the power of the monotonic ordinal smoothing spline, we use open source data to examine the relationship between income and educational attainment. The predicted mean incomes for each educational attainment level and sex are plotted in Figure 5, which shows some striking trends.

This disturbing trend continues to magnify at each education level, such that women receive a smaller expected return on their education. Monotonic ordinal smoothing spline solution showing the relationship between education and income for males (blue circles) and females (red triangles).

As a second example, we use math performance data from Portuguese secondary students. In this example, we use the math performance data student-mat. We focus on predicting the students' scores on the first exam during the period G1. Unlike Cortez et al. By discovering factors that relate to poor math performance on the first exam, it may be possible to create student-specific interventions.

In Table 3, we describe the 15 predictor variables that we include in our model. To model the math exam scores, we fit a regression model of the form. We used a cubic smoothing spline marginal reproducing kernel for the integer valued variables (age, failures, absences) because these variables are measured on a ratio scale. We used the ordinal smoothing spline reproducing kernel see Theorem 3.

The full smoothing spline solution was fit. The smoothing parameters were chosen by minimizing the GCV criterion [ 35 ]. This approach has been shown to produce results that are essentially identical to the fully optimal solution (see [ 15, 26 ]). Examining the top row of Figure 6, it is evident that only two of the six parametric effects are significant. The signs and magnitudes of these significant coefficients indicate that (i) males tend to get higher math exam scores than females, and (ii) students who receive extra educational support from their families tend to get lower math exam scores.

This second point may seem counter-intuitive, because one may think that extra educational support should lead to higher grades.

However, it is likely that the students who receive extra support are receiving this extra support for a reason. Examination of the remaining subplots in Figure 6 reveals that the number of prior course failures has the largest negative effect on the expected math scores, which is not surprising.

Given the other effects in the model, the number of absences has no effect on the expected math scores, which is surprising. Having a mother who completed higher education increases a student's expected math exam score, whereas there were no significant differences between the other four lesser levels of the mother's education.

Studying for 5 or more hours per week increases a student's expected scores, whereas studying less than 2 hours per week decreases a student's expected scores. Given the other effects, travel time to school and a student's health did not have significant effects on the math exam scores. Our simulation and real data examples reveal the flexibility and practical potential of the ordinal smoothing spline.

The simulation study investigated the potential of the ordinal smoothing spline for isotonic regression, as well as the relationship between the ordinal and linear smoothing spline reproducing kernel functions.

The simulation results demonstrated that (i) the ordinal smoothing spline can outperform the linear smoothing spline at small samples, (ii) the ordinal smoothing spline performs similarly to the linear smoothing spline for large K, and (iii) monotonic ordinal smoothing splines can outperform standard isotonic regression approaches. Thus, the simulation study illustrates the results in Theorems 3. The first example (income by education and sex) offers a practical example of the potential of the ordinal smoothing spline for discovering monotonic trends in data.

