BAYESIAN ESTIMATION OF RANDOM PARAMETER MODELS OF RESPONSES WITH NORMAL AND SKEW-t DISTRIBUTIONS: EVIDENCE FROM MONTE CARLO SIMULATION

Random parameter models have been found to outperform fixed parameter models in estimating dose-response relationships with independent errors. A major restriction, however, is that the responses are assumed to be normally and symmetrically distributed. The purpose of this paper is to analyze Bayesian inference of random parameter response models in the case of independent responses with normal and skewed, heavy-tailed distributions by way of Monte Carlo simulation. Three types of Bayesian estimators are considered: one applying a normal, symmetrical prior distribution; a second applying a Skew-normal prior; and a third applying a Skew-t prior. We use the relative bias (RelBias) and the root mean squared error (RMSE) as evaluation criteria. We consider the commonly applied linear Quadratic and the nonlinear Spillman-Mitscherlich dose-response models. One simulation examines the performance of the estimators in the case of independent, normally and symmetrically distributed responses; the other in the case of independent responses following a heavy-tailed Skew-t distribution. The main finding is that the estimator based on the Skew-t prior outperforms the alternative estimators applying the normal and Skew-normal priors for skewed, heavy-tailed data. For normal data, the Skew-t prior performs approximately as well as the Skew-normal and normal priors. Furthermore, it is more efficient than its alternatives. Overall, the Skew-t prior seems preferable to the normal and Skew-normal priors for dose-response modeling.

2000 Mathematics Subject Classification: 62F15, 62H10, 62P10, 62P12. Received: 19-05-2017; revised: 04-09-2017; accepted: 04-09-2017.


Introduction
The linear Quadratic and the nonlinear Spillman-Mitscherlich model are commonly applied to analyze dose-response relationships with independent errors in a large variety of fields, including environmental sciences, biology, public health, and agricultural sciences (de Souza et al. [7]; Pinheiro et al. [23]; WHO [37]). The model parameters are usually estimated by means of least squares under the assumptions of fixed parameters and errors that are independently and normally distributed with constant variances (Lopez-Bellido et al. [17]; Sain and Jauregui [28]).
A limitation of the standard fixed parameter models is that they preclude the variability of the parameters that may exist among subjects. A model that does not have this limitation is the random parameter response model (Makowski and Wallach [19]; Makowski and Lavielle [20]; Plan et al. [24]; Tumusiime et al. [34]; Wallach [35]). This model type assumes that the response functions are common to all subjects, but that the parameters vary between subjects. For this purpose, a random component that represents inter-individual variability is associated with the coefficients. Random parameter models have been found to outperform fixed parameter models (Boyer et al. [5]; Makowski et al. [18]; Makowski and Wallach [19]; Tumusiime et al. [34]).
The model parameters and the random errors are usually based on the assumption of independently, symmetrically, normally distributed responses (Boyer et al. [5]; Makowski and Wallach [19]; Makowski and Lavielle [20]; Plan et al. [24]; Tumusiime et al. [34]). However, the assumption of normality may be too restrictive in many applications (Arellano-Valle et al. [1]-[2]; Jara et al. [11]; Ouedraogo and Brorsen [21]). Lachos et al. [14] proposed skewed linear mixed dose-response models for cases where there is evidence of departure from symmetry or normality. The present paper deals with responses that follow the asymmetric, heavy-tailed Skew-t distribution. The paper also considers responses that follow a normal distribution.
Random parameter dose-response models can be estimated by maximum likelihood (ML). However, for models that are nonlinear in the parameters, ML may lead to non-unique solutions (Brorsen [6]; Tembo et al. [33]). In addition, convergence may be problematic even with careful scaling and good starting values. Alternatively, Bayesian methods, for which convergence of nonlinear estimation is not an issue, can be used (Brorsen [6]; Ouedraogo and Brorsen [21]). An additional advantage of Bayesian methods is that the results are valid in small samples, which are quite common in dose-response modeling.
The objective of this paper is to investigate the performance of the Bayesian estimator with Skew-t prior of the random parameter linear Quadratic and nonlinear Spillman-Mitscherlich models when the response follows (i) an asymmetric, heavy-tailed Skew-t distribution or (ii) a normal distribution. In addition to the Skew-t prior, we also consider the commonly used normal, symmetrical prior and the Skew-normal prior.
The remainder of the paper is organized as follows. In Section 2, we briefly introduce the Skew-Normal (SN) and Skew-t (S-t) distributions and specify the linear Quadratic and nonlinear Spillman-Mitscherlich models, which in practice are most frequently used to model dose-response relationships. Section 3 presents the Bayesian inference approach as well as the model comparison criteria. Section 4 outlines the simulation framework and Section 5 the simulation results. Conclusions follow in Section 6.
2. The Skew-Normal (SN) and Skew-t (S-t) Distributions and the Linear Quadratic and Nonlinear Spillman-Mitscherlich Models

2.1. The Independent Skew-Normal (SN) and Skew-t (S-t) Distributions. Lachos et al. [14] defined the family of Skew-normal independent distributions as the class of random vectors of the form

    Y = µ + U^{-1/2} Z,    (1)

where µ is a location vector; U is a positive random variable, independent of the random vector Z, with cumulative distribution function (cdf) H(u|ν) and probability density function (pdf) h(u|ν); ν is a scalar or vector of parameters indexing the distribution of U; and Z is a multivariate Skew-normal random vector with location 0, scale matrix Σ, and skewness parameter vector λ, i.e., Z ∼ SN_p(0, Σ, λ). When U = u, Y follows a multivariate Skew-normal distribution with location vector µ, scale matrix u^{-1}Σ, and skewness parameter vector λ, with pdf

    f(y|u) = 2 φ_p(y; µ, u^{-1}Σ) Φ(u^{1/2} λ^T Σ^{-1/2}(y − µ)),

where φ_p(·; µ, Σ) denotes the pdf of the p-variate normal distribution N_p(µ, Σ), with mean vector µ and covariance matrix Σ, and Φ(·) represents the cdf of the standard univariate normal distribution. We will use the notation Y ∼ SN_p(µ, Σ, λ, H).
When λ = 0, the class of SN distributions reduces to the class of normal independent (N) distributions (Lachos et al. [15]; Lange and Sinsheimer [16]; Rosa et al. [27]), i.e., the class of thick-tailed distributions represented by the pdf

    f(y) = ∫_0^∞ φ_p(y; µ, u^{-1}Σ) dH(u|ν).

We will use the notation Y ∼ N_p(µ, Σ, H) for this case.
In the mixture model (1), when U = 1, Y follows a multivariate Skew-normal distribution with location vector µ, scale matrix Σ, and skewness parameter vector λ, i.e., Y ∼ SN_p(µ, Σ, λ). The pdf of Y is

    f(y) = 2 φ_p(y; µ, Σ) Φ(λ^T Σ^{-1/2}(y − µ)).

When U ∼ Gamma(ν/2, ν/2), Y follows the multivariate Skew-t distribution, Y ∼ St_p(µ, Σ, λ, ν), with pdf

    f(y) = 2 t_p(y; µ, Σ, ν) T(A ((ν + p)/(ν + d))^{1/2}; ν + p),

where t_p(·; µ, Σ, ν) and T(·; ν) denote the pdf of the p-variate Student-t distribution and the cdf of the standard univariate t-distribution, respectively, A = λ^T Σ^{-1/2}(y − µ), and d = (y − µ)^T Σ^{-1}(y − µ) is the Mahalanobis distance. A convenient stochastic representation of Y for simulation purposes, particularly data generation, follows from Bandyopadhyay et al. [3]-[4]: draw U ∼ Gamma(ν/2, ν/2) and Z ∼ SN_p(0, Σ, λ) independently and set Y = µ + U^{-1/2} Z. As ν ↑ ∞, we get the Skew-normal distribution. When λ = 0, the Skew-t distribution reduces to the Student-t distribution.
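The stochastic representation above makes data generation straightforward. As an illustration (a sketch in Python rather than the R/JAGS tooling used later in the paper; all parameter values here are arbitrary), the following draws univariate Skew-t variates by combining the standard Skew-normal representation Z = δ|X0| + (1 − δ²)^{1/2} X1, with δ = λ/(1 + λ²)^{1/2}, with a Gamma(ν/2, ν/2) mixing variable:

```python
import numpy as np

def rskew_t(n, mu=0.0, sigma=1.0, lam=3.0, nu=4.0, seed=None):
    """Draw n univariate Skew-t variates via the scale-mixture
    representation Y = mu + sigma * U**(-1/2) * Z, where Z is a
    standard Skew-normal variate with skewness lam and
    U ~ Gamma(nu/2, rate=nu/2)."""
    rng = np.random.default_rng(seed)
    delta = lam / np.sqrt(1.0 + lam**2)
    # Stochastic representation of the Skew-normal:
    # Z = delta*|X0| + sqrt(1 - delta^2)*X1, with X0, X1 iid N(0, 1)
    x0 = rng.standard_normal(n)
    x1 = rng.standard_normal(n)
    z = delta * np.abs(x0) + np.sqrt(1.0 - delta**2) * x1
    # numpy's gamma is parameterized by shape and SCALE, so a
    # rate of nu/2 corresponds to scale 2/nu.
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    return mu + sigma * z / np.sqrt(u)

y = rskew_t(100_000, lam=3.0, nu=4.0, seed=1)
sample_skew = np.mean(((y - y.mean()) / y.std()) ** 3)
print(sample_skew)  # clearly positive: the draws are right-skewed
```

With λ = 3 and ν = 4 (the values used in the simulations of Section 4), the draws are right-skewed and heavy-tailed.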
We first consider the general mixed model in which the random errors and the random parameters are independent. The model reads

    Y_i = η(ϕ_i, X_i) + ε_i,   ϕ_i = A_i β + B_i b_i,   i = 1, ..., n,    (3)

with b_i ∼ SN_q(0, D, λ) and ε_i ∼ N_{n_i}(0, σ_e² I_{n_i}), where η(·) is the nonlinear or linear function of the random parameters ϕ_i and covariate vector X_i; A_i and B_i are known design matrices of dimensions n_i × p and n_i × q, respectively; β is the p × 1 vector of fixed parameter components (means); b_i = (b_1i, ..., b_qi)^T is the vector of random parameter components; ε_i = (ε_i1, ..., ε_in_i)^T is the vector of random errors; and I_{n_i} denotes the identity matrix. The matrix D = D(α), with unknown parameter α, is the q × q unstructured dispersion matrix of b_i, σ_e² is the unknown variance of the error term, and λ is the skewness parameter vector corresponding to the random components b_i.
We assume that E(b_i) = E(ε_i) = 0 and that b_i and ε_i are uncorrelated, i.e., Cov(b_i, ε_i) = 0. The model takes the within-subject errors ε_i to be symmetrically distributed and the random parameters b_i to be asymmetrically distributed with mean zero (Bandyopadhyay et al. [4]; Lachos et al. [14]-[15]). When η(·) is a nonlinear parametric function, we have the SN-NonLinear Mixed Model (SN-NLMM); if η(·) is a linear parametric function, we have the SN-Linear Mixed Model (SN-LMM).
The general framework (3) gives the linear Quadratic and the nonlinear Spillman-Mitscherlich mixed models as follows:

1. The linear Quadratic mixed model:

    Y_i = (γ_1 + b_1i) + (γ_2 + b_2i) X_i + (γ_3 + b_3i) X_i² + ε_i,    (4)

where for i = 1, 2, ..., n, Y_i is the response; X_i the dose; γ_1 the fixed intercept; γ_2 the fixed linear response coefficient; γ_3 the fixed quadratic response coefficient; b_1i, b_2i, and b_3i the random response coefficients; and ε_i the random error term (Park et al. [22]; Tumusiime et al. [34]).

2. The nonlinear Spillman-Mitscherlich mixed model:

    Y_i = (β_1 + b_1i) − (β_2 + b_2i)(β_3 + b_3i)^{X_i} + ε_i,    (5)

where the variables are as in (4); β_1 is the fixed maximum or potential response obtainable by the stimulus; β_2 the fixed response increase induced by the stimulus; β_3 the ratio of successive increments in output β_1 to total output Y; b_1i, b_2i, and b_3i the random components; and ε_i the random error term (Tumusiime et al. [34]).

3. Bayesian Inference, Gibbs Sampler, and Simulation Evaluation Criteria

3.1. Prior distributions and joint posterior density. As explained in the Introduction, we apply Bayesian inference to overcome the limitations of maximum likelihood. In spite of its advantages, Bayesian analysis also has some limitations. A major limitation, which has hampered widespread implementation of the Bayesian approach, is that obtaining the posterior distribution often requires the integration of high-dimensional functions, which can be analytically difficult. For simple models, the likelihood functions are standard and, if one uses conjugate priors, deriving the posterior density analytically poses no major problems (this is the main reason why conjugate priors are widely employed in Bayesian analysis). But Bayesian estimation quickly becomes challenging when working with more complicated, possibly high-dimensional models, or when one uses non-conjugate priors. Then an analytical solution is not easy or may even be impossible. As a way out, Bayesian estimation using Markov Chain Monte Carlo (MCMC) simulation can be applied. Given a
complex multivariate distribution, it is simpler to sample from a conditional distribution than to marginalize by integrating over a joint distribution. The MCMC approach proceeds on the basis of sampling from the complex distribution of interest. The algorithm departs from the previous sample value to generate the next sample value, thus generating a Markov chain. Specifically, let θ be the parameter of interest and let y_1, ..., y_n be the numerical values of a sample from the distribution f(y_1, ..., y_n|θ). Suppose we sample (with replacement) S independent, random θ-values from the posterior distribution f(θ|y_1, ..., y_n): θ^(1), ..., θ^(S) ∼ i.i.d. f(θ|y_1, ..., y_n). Then the empirical distribution of θ^(1), ..., θ^(S) approximates f(θ|y_1, ..., y_n), with the approximation improving with increasing S. The empirical distribution of θ^(1), ..., θ^(S) is known as a Monte Carlo approximation to f(θ|y_1, ..., y_n). Let g(θ) be (just about) any function of θ. The law of large numbers says that if θ^(1), ..., θ^(S) are i.i.d. samples from f(θ|y_1, ..., y_n), then

    (1/S) Σ_{s=1}^S g(θ^(s)) → E[g(θ)|y_1, ..., y_n]   as S → ∞.

For further details we refer to Gelman et al. [8].
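A toy example of such a Monte Carlo approximation (our own illustration, not from the paper): with a conjugate Beta prior on a Bernoulli success probability, the posterior is known exactly, so the i.i.d.-sampling approximation of the posterior mean can be checked against the analytic value.

```python
import numpy as np

# Conjugate Beta-Bernoulli example: prior theta ~ Beta(1, 1);
# data: 7 successes in 10 trials. The posterior is Beta(8, 4),
# so E[theta | y] = 8/12 exactly.
rng = np.random.default_rng(0)
k, n = 7, 10
a_post, b_post = 1 + k, 1 + (n - k)

S = 200_000
theta = rng.beta(a_post, b_post, size=S)    # i.i.d. posterior draws

mc_mean = theta.mean()                       # Monte Carlo approximation
exact_mean = a_post / (a_post + b_post)      # analytic posterior mean
print(mc_mean, exact_mean)
```

Increasing S shrinks the Monte Carlo error at the usual S^{-1/2} rate.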
For many multi-parameter models the joint posterior distribution is nonstandard (i.e., not a density like the normal or gamma distribution) and thus difficult to sample from directly. The composition method is impossible or hard to apply in that case, because it is hard to obtain the marginal distributions of the random variables of interest. That is, it may be difficult to apply the composition method to generate independent observations from such a density p(θ|y), because the joint posterior distribution cannot be written as the product of marginal and conditional distributions. An alternative solution in that case consists of generating a sample of correlated values which approximately come from the joint posterior distribution. Even if the sample observations are dependent, Monte Carlo integration can be applied if the observations can be generated so that their joint density is roughly the same as the joint density of a random sample. The standard MCMC algorithms are:
• Metropolis
• Metropolis-Hastings
• Gibbs sampler
The Metropolis sampler obtains the state of the chain at t + 1 by sampling a candidate point θ_new from a proposal distribution q(·|θ^(t)) which depends only on the previous state θ^(t). The Metropolis algorithm can draw samples from any probability distribution f(θ|y) (the target distribution), provided we can compute the value of a function that is proportional to the density of f. The lax requirement that this function should be merely proportional to the target density, rather than exactly equal to it, makes the Metropolis algorithm particularly useful, because calculating the necessary normalization factor is often extremely difficult in practice. The algorithm works by generating a sequence of sample values in such a way that, as more and more sample values are produced, the distribution of values more closely approximates the target distribution f(θ|y). These sample values are produced iteratively, with the distribution of the next sample
being dependent only on the current sample value (thus making the sequence of samples a Markov chain). Specifically, at each iteration the algorithm picks a candidate for the next sample value based on the current sample value. Then, with some probability, the candidate is either accepted (in which case the candidate value is used in the next iteration) or rejected (in which case the candidate value is discarded and the current value is reused in the next iteration). The probability of acceptance is determined by comparing the values of the (unnormalized) target density f(θ|y) at the current and candidate sample values.
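A minimal random-walk Metropolis sketch in Python (illustrative only: the target, step size, and burn-in below are our choices, and the paper itself fits its models with JAGS). The target is an unnormalized N(0, 1) density, underlining that only proportionality to the target is needed:

```python
import numpy as np

def log_target(x):
    # log of exp(-x^2/2); the normalizing constant is dropped
    return -0.5 * x**2

def metropolis(n_iter, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        cand = x + step * rng.standard_normal()   # symmetric proposal
        # Accept with probability min(1, f(cand)/f(x)),
        # computed on the log scale for numerical stability
        if np.log(rng.uniform()) < log_target(cand) - log_target(x):
            x = cand
        chain[t] = x                              # reuse x if rejected
    return chain

chain = metropolis(50_000, step=2.4)
kept = chain[10_000:]                             # discard burn-in
print(kept.mean(), kept.std())                    # near 0 and 1
```

The rejected-candidate branch is what distinguishes the Metropolis chain from naive resampling: repeated values are a legitimate part of the sample.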
The Metropolis sampler is based on a symmetric random-walk proposal distribution. A more general sampler is the Metropolis-Hastings algorithm, which allows an asymmetric proposal distribution and accepts a candidate θ_new with probability

    α = min{1, [f(θ_new|y) q(θ^(t)|θ_new)] / [f(θ^(t)|y) q(θ_new|θ^(t))]}.

The Gibbs sampler is a special (simple) case of the Metropolis-Hastings sampler in which the proposal distributions exactly match the posterior conditional distributions, so that proposals are accepted 100% of the time. It decomposes the joint posterior distribution into full conditional distributions for each parameter in the model and then samples from them. (A full conditional distribution is the conditional distribution of a parameter given all of the other parameters in the model.)
The Gibbs sampler is efficient when the parameters are not highly dependent on each other and the full conditional distributions are easy to sample from. It is a popular sampling algorithm because, unlike the Metropolis method, it does not require a separate proposal distribution: the full conditional distributions are typically standard distributions (e.g., normal or gamma). However, while deriving the full conditional distributions can be relatively easy, it is not always possible to find an efficient way to sample from them. The Gibbs sampler proceeds as follows:
1. Set t = 0 and choose an arbitrary initial value θ^(0) = {θ_1^(0), ..., θ_k^(0)}.
2. Generate each component of θ in turn:
• draw θ_1^(t+1) from f(θ_1|θ_2^(t), ..., θ_k^(t));
• draw θ_2^(t+1) from f(θ_2|θ_1^(t+1), θ_3^(t), ..., θ_k^(t));
• ...
• draw θ_k^(t+1) from f(θ_k|θ_1^(t+1), ..., θ_{k−1}^(t+1)).
3. Set t = t + 1. If t < S, the number of desired samples, return to step 2. Otherwise, stop.
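The steps above can be sketched for a standard bivariate normal target with correlation ρ, in the spirit of the Rizzo [26] example mentioned below (which is in R; this Python version and its settings are our own). The full conditionals are X_1|X_2 = x_2 ∼ N(ρ x_2, 1 − ρ²), and symmetrically for X_2:

```python
import numpy as np

def gibbs_bvn(n_iter, rho=0.8, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho,
    alternating draws from the two full conditional distributions."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1.0 - rho**2)        # conditional standard deviation
    x1, x2 = 0.0, 0.0                 # arbitrary initial values (step 1)
    out = np.empty((n_iter, 2))
    for t in range(n_iter):           # steps 2-3
        x1 = rng.normal(rho * x2, sd)   # draw from f(x1 | x2)
        x2 = rng.normal(rho * x1, sd)   # draw from f(x2 | x1), using new x1
        out[t] = x1, x2
    return out

draws = gibbs_bvn(20_000, rho=0.8)[5_000:]   # drop burn-in
print(np.corrcoef(draws.T)[0, 1])            # near 0.8
```

Note that every proposal is accepted: sampling from the exact full conditionals makes an accept/reject step unnecessary.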
Software such as JAGS (Just Another Gibbs Sampler) applies Gibbs sampling to implement Bayesian inference based on Markov Chain Monte Carlo simulation. In Appendix A we present an example of an R program of the Gibbs sampler for a bivariate distribution, adapted from Rizzo [26].
The challenge of MCMC simulation is the construction of a Markov chain whose values converge to the target distribution. The general approach is the Metropolis-Hastings sampling procedure. This algorithm simulates samples from a probability distribution by making use of the full joint density function and independent proposal distributions for each of the variables of interest. Below, we apply the Gibbs sampler, which is a special case of the Metropolis-Hastings sampling procedure. Gibbs sampling decomposes the joint posterior distribution into full conditional distributions for each parameter in the model and then samples from them. The proposal distributions in the Gibbs sampler exactly match the posterior conditional distributions. The sampler is usually efficient when the parameters are not highly dependent on each other and the full conditional distributions are easy to sample from.

4. Simulation Setup
In the simulations, we consider the two most common dose-response models, i.e., the linear Quadratic model and the nonlinear Spillman-Mitscherlich model (Models (4) and (5), respectively).
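For reference, the mean response functions of models (4) and (5) can be written as small helpers (an illustrative sketch; the parameter values below are arbitrary and not taken from the paper):

```python
import numpy as np

def quadratic(x, g1, g2, g3):
    """Linear Quadratic model: E[Y] = g1 + g2*x + g3*x^2."""
    return g1 + g2 * x + g3 * x**2

def spillman_mitscherlich(x, b1, b2, b3):
    """Spillman-Mitscherlich model: E[Y] = b1 - b2*b3^x,
    with b1 the potential response and 0 < b3 < 1."""
    return b1 - b2 * b3**x

doses = np.array([0.0, 50.0, 100.0])
yq = quadratic(doses, 2.0, 0.05, -0.0002)          # concave in the dose
ysm = spillman_mitscherlich(doses, 10.0, 8.0, 0.97)  # approaches b1
print(yq)
print(ysm)
```

In the random parameter versions, each coefficient is perturbed by a subject-specific component b_ki before evaluating these functions.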
The following A_i and B_i matrices and scale matrix D of the random components were applied. To get insight into the performance of the estimators under increasing variance, we analyzed small, medium, and large scale D matrices (scenarios 1-3) as follows:

Simulation    σ_b1    σ_b2    σ_b3    σ_e
Scenario 1    0.1     0.01    0.005   0.5
Scenario 2    1       0.1     0.05    1
Scenario 3    1.5     0.2     0.10    0.75

To analyze the Skew-t distributions, we generated β_k + b_ki and γ_k + b_ki, k = 1, 2, 3, according to the multivariate (right) Skew-t distribution St_3(0, σ_bk, 3, 4) and the ε_i according to the t-distribution ε_i ∼ t_1(0, σ², 4). For the multivariate normal distributions, we generated β_k + b_ki and γ_k + b_ki according to the multivariate normal distribution N_3(0, σ_bk) and the ε_i according to the normal distribution N_1(0, σ²). For each of the 100 simulated data sets, the linear Quadratic and the Spillman-Mitscherlich random parameter models were estimated under the assumption that (1) the density of the random components was the Skew-t and the density of the errors the t-distribution, or (2) the random components and the errors were normally distributed (N).
The following independent priors were considered to analyze the Gibbs sampler: β_k ∼ N(0, 10³), γ_k ∼ N(0, 10³), σ_e² ∼ IG(0.01, 0.01), Γ ∼ IW_3(H) with H = diag(0.01) for the normal, Skew-normal, and Skew-t priors; ∆ ∼ N(0, 0.001) for the Skew-normal and Skew-t priors; and ν ∼ Exp(0.1; (2, ∞)) for the Skew-t prior. For these prior densities, we generated two parallel independent runs of the Gibbs sampler chain of size 25,000 for each parameter. We disregarded the first 5,000 iterations to eliminate the effect of the initial values. To avoid potential autocorrelation, we used a thinning of 10. We assessed chain convergence using trace plots, autocorrelation plots, and the Brooks-Gelman-Rubin scale reduction factor R (Gelman et al. [8]). We fitted the models using the R2jags package available in R (Su and Yajima [31]). We computed the relative bias (RelBias) and the root mean square error (RMSE) for each parameter estimate over the 100 samples of each simulation. These statistics are defined as

    RelBias(θ) = (1/N) Σ_{j=1}^N (θ̂_j − θ)/θ,    RMSE(θ) = ((1/N) Σ_{j=1}^N (θ̂_j − θ)²)^{1/2},

where θ̂_j is the estimate of θ for the j-th sample and N = 100. Note also that for the Spillman-Mitscherlich model with T = 10 and small-D, the RelBias of β_3 in the case of the normal prior is smaller than in the case of the Skew-normal and Skew-t priors, while for large-D the RelBias of β_1 and β_2 is similar for the three priors. Moreover, for the Spillman-Mitscherlich model with large-D, the RelBias and RMSE of β_1 and β_2 of the three priors worsen as the number of observations increases up to 30, but improve as T increases further. This sample-size bias inconsistency was also observed by Hagenbuch [9]. Apparently, one should increase the sample size substantially to reduce the bias.
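The two evaluation statistics are simple to compute from a set of simulated estimates; the following sketch (with made-up estimates of a true value θ = 1) mirrors their definitions:

```python
import numpy as np

def rel_bias(estimates, theta):
    """Average relative deviation of the estimates from the true value."""
    return np.mean((np.asarray(estimates) - theta) / theta)

def rmse(estimates, theta):
    """Root mean squared error of the estimates."""
    return np.sqrt(np.mean((np.asarray(estimates) - theta) ** 2))

est = [1.1, 0.9, 1.2, 0.8]     # four hypothetical estimates of theta = 1
print(rel_bias(est, 1.0))       # positive and negative errors cancel
print(rmse(est, 1.0))           # squared errors do not cancel
```

The contrast is the point of reporting both: RelBias measures systematic over- or under-estimation, while RMSE also penalizes dispersion of the estimates.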

5. Main Results
From the above it follows that, apart from some minor exceptions, for skewed, heavy-tailed data the Bayesian estimator with Skew-t prior is more accurate than with the normal and Skew-normal priors. (Note that, because of different scales, inferences regarding the variance components are not feasible (Lachos et al. [13]-[14]).) Tables 3 and 4 show the overall fit statistics for the Spillman-Mitscherlich and linear Quadratic models. For both models, the DIC, EAIC, and EBIC all tend to favor the Skew-t model for all sample sizes (T) and for the three variance scenarios. The percentage (%) of samples for which the criteria choose the Skew-t model as the best model increases with an increasing number of observations; for T = 75 the percentage is 100% (in the tables, the percentage within brackets is the share of samples for which the criterion selected the given prior). Note also that for Spillman-Mitscherlich data, T = 10 and small-D, the DIC selects the S-t model as the best model while the EAIC and EBIC select the normal model. For T = 30 and large-D, all the measures favor the normal model. The different results are probably a consequence of the fact that the three measures penalize model complexity differently. According to Spiegelhalter et al. [30], the AIC is based on the number of parameters, the BIC on the log sample size, and the DIC on the effective number of parameters, i.e., on p_D = E[D(θ)] − D(E[θ]), where E[D(θ)] is the posterior mean of the deviance and D(E[θ]) is the deviance evaluated at the posterior mean of the model parameters. Some studies showed that, compared to the DIC, the AIC and BIC favor simpler models (i.e., models with fewer parameters) (Ward [36]; Spiegelhalter et al. [30]).
The above results show that the average RelBias (in absolute value) and the average RMSE (over all T, D, and the three parameters) of the Spillman-Mitscherlich model are larger than those of the linear Quadratic model. Moreover, for both models, the values for SN are slightly smaller than those for S-t. Note that for small-D and T = 75 the average RelBias over all three parameters of the normal prior is smaller than that of the Skew-normal and Skew-t priors. For the average RMSEs of the three priors, however, the opposite holds. Table 6 and Figure 2 show that, in the case of normal data, for the linear Quadratic model the average RelBias (in absolute value) (over all T, D, and the three parameters) of the normal prior (N) is larger than that of the Skew-normal prior (SN), but slightly smaller than that of the Skew-t prior. However, the RMSE of the Skew-t prior is smaller than that of the normal and Skew-normal priors. Moreover, the RMSE of the normal prior is smaller than that of the Skew-normal prior.
Tables 7 and 8 show the overall fit statistics for the Spillman-Mitscherlich and linear Quadratic models for normal data. For T = 10 and all D, the DIC, EAIC, and EBIC favor the normal prior for both models (with some minor exceptions). For T = 30 and 75, all measures favor the Skew-t prior, except for the Spillman-Mitscherlich model with T = 30 and small-D, where the normal prior is selected by all criteria.

6. Concluding Remarks
This paper analyzed, by way of Monte Carlo simulation, Bayesian inference of random parameter dose-response models with (i) normal and (ii) skewed, heavy-tailed (Skew-t) distributions of the random response parameter components and independently normally distributed errors. The data generating models were the Skew-t distribution and the normal distribution. The commonly applied linear Quadratic and the nonlinear Spillman-Mitscherlich dose-response models were estimated by means of Bayesian methods. Three priors were considered: a normal, symmetric prior; a Skew-normal prior; and, finally, a Skew-t prior. The first set of simulations examined the performance of the three priors in the case of Skew-t data, the second in the case of normal data.
The simulation results showed that the Skew-t prior is more accurate and efficient than the normal and Skew-normal priors in the case of skewed, heavy-tailed data. For random components that follow a normal distribution, the Skew-t prior obtains results comparable to the Skew-normal prior and is more accurate and efficient than the normal prior in the case of the nonlinear Spillman-Mitscherlich model. For the linear Quadratic model, the Skew-normal prior is more accurate than the normal prior and slightly more accurate than the Skew-t prior. Furthermore, the Skew-t prior is more efficient than the normal and Skew-normal priors. Overall, the Skew-t prior seems preferable to the normal and Skew-normal alternatives for dose-response modeling, especially because skewed response data are more common than normal response data and the linear Quadratic model is preferable to the nonlinear Spillman-Mitscherlich model in many cases.

3.2. Model comparison criteria. For model comparison, we use the deviance information criterion (DIC), the expected Akaike information criterion (EAIC), and the expected Bayesian information criterion (EBIC). These are based on the posterior mean of the deviance, which can be approximated as D̄ = Σ_{q=1}^Q D(θ_q)/Q, where D(θ) = −2 Σ_{i=1}^n log f(y_i|θ) and Q is the number of iterations. The EAIC, EBIC, and DIC can be estimated from the MCMC output as

    EAIC = D̄ + 2p,    EBIC = D̄ + p log(N),    DIC = D̄ + p_V,

where D̄ is the posterior mean of the deviance, p the number of parameters in the model, N the total number of observations, and p_V the effective number of parameters, defined as Var(D)/2 (Plummer [25]; Spiegelhalter et al. [29]-[30]).
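These criteria are easy to compute from the kept MCMC iterations; the following sketch (with made-up deviance values) follows the formulas above:

```python
import numpy as np

def fit_criteria(deviance, p, n_obs):
    """EAIC, EBIC, and DIC from an MCMC deviance sample.
    deviance: array of D(theta_q) over the Q kept iterations;
    p: number of parameters; n_obs: total number of observations N."""
    d_bar = np.mean(deviance)         # posterior mean of the deviance
    p_v = np.var(deviance) / 2.0      # effective number of parameters
    return {
        "EAIC": d_bar + 2 * p,
        "EBIC": d_bar + p * np.log(n_obs),
        "DIC": d_bar + p_v,
    }

dev = np.array([210.0, 214.0, 208.0, 212.0])   # hypothetical D(theta_q)
crit = fit_criteria(dev, p=5, n_obs=30)
print(crit)
```

In all three criteria, smaller values indicate a better-fitting model, which is how the tables below are read.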

5.1. (Right) Skewed-t response data. Tables 1 and 2 and Figure 1 show that, for the nonlinear Spillman-Mitscherlich model and right-skewed, heavy-tailed response data, the average RelBias (in absolute value) and the average RMSE over all T, D, and the three parameters of the normal prior (N) are larger than those of the Skew-normal prior (SN). For the linear Quadratic model, however, the opposite holds. Moreover, the average RelBias (in absolute value) and the average RMSE of the Skew-t prior (S-t) have the smallest values for all sample sizes (T) and for the three variance scenarios for both models.

Figure 1. Average RelBias and RMSE of the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the Spillman-Mitscherlich and linear Quadratic models for right-skewed data

Figure 2. Average RelBias and RMSE of the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the Spillman-Mitscherlich and linear Quadratic models for normal data

Table 1. RelBias and RMSE of the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the Spillman-Mitscherlich model for right-skewed data

Table 2. RelBias and RMSE of the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the linear Quadratic model for right-skewed data

Table 4. EAIC and EBIC for the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the linear Quadratic model for right-skewed data

The DIC, EAIC, and EBIC all tend to favor the Skew-t model for all sample sizes (T) and for the three variance scenarios, although there are some minor exceptions for the Spillman-Mitscherlich model. These results indicate a major drawback of the nonlinear mixed model. According to Harring and Liu [10], estimation (including Bayesian estimation) of the model parameters of a nonlinear mixed model is not straightforward compared to its counterpart, the linear mixed model. The nonlinearity requires multidimensional integration to derive the needed marginal likelihood.

Table 5. RelBias and RMSE of the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the Spillman-Mitscherlich model for normal data

5.2. Normal data. From Table 5 and Figure 2 it follows that, in the case of the Spillman-Mitscherlich model and normally distributed data, the average RelBias (in absolute value) and the average RMSE of the normal prior over all T, D, and the three parameters are larger than those of the Skew-normal and Skew-t priors. Furthermore, the average RelBias (in absolute value) and average RMSE

Table 6. RelBias and RMSE for the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the linear Quadratic model for normal data

Table 7. DIC, EAIC, and EBIC for the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the Spillman-Mitscherlich model for normal data

Table 8. DIC, EAIC, and EBIC for the Normal (N), Skew-Normal (SN), and Skew-t (S-t) priors for the linear Quadratic model for normal data