## IPW Estimator

The inverse probability weighted estimator of the value of a fixed regime, $$\widehat{\mathcal{V}}_{IPW}(d)$$, is defined as

$\widehat{\mathcal{V}}_{IPW}(d) = n^{-1} \sum_{i=1}^{n} \frac{\mathcal{C}_{d,i}Y_{i}}{\left\{\prod_{k=2}^{K}\pi_{d,k}(\overline{X}_{ki});\widehat{\gamma}_{k})\right\}\pi_{d,1}(X_{1i};\widehat{\gamma}_{1})},$ where

$\mathcal{C}_{d,i} = \text{I}\{\overline{A}_{i} = \overline{d}(\overline{X}_{i})\}$ is the indicator of whether or not all treatments actually received coincide with the options dictated by $$d$$.

For binary treatments $$\mathcal{A}_{k} = \{0,1\}$$, $$k=1, \dots, K$$,

$\pi_{d,k}(h_{k}; \gamma_{k}) = \pi_{k}(h_{k}; \gamma_{k}) ~ \text{I}\{d_{k}(h_{k}) = 1\} + \{1-\pi_{k}(h_{k}; \gamma_{k})\} ~ \text{I}\{d_{k}(h_{k}) = 0\}.$

Recall that $$\pi_{k}(h_{k}; \gamma_{k})$$ is a model for

$\pi_{k}(h_{k}) = P(A_{k} = 1 | H_{k} = h_{k}), \quad k = 1, \ldots, K.$

To streamline the presentation of methods, we consider only one propensity score model for each decision point. Note that we are not demonstrating a definitive analysis that one might do in practice, in which the analyst would use all sorts of variable selection techniques, etc., to arrive at a posited model. Our objective is to illustrate how to apply the methods; we have chosen to use the data generating models, which are a class of models that are likely to be familiar to many readers and will not obfuscate the objective.

Click on each tab below to see each model and basic model diagnostic steps.

The posited model for the first decision point is the model used to generate the data

$\pi_{1}(h_{1};\gamma_{1}) = \frac{\exp(\gamma_{10} + \gamma_{11}~\text{CD4_0})}{1+\exp(\gamma_{10} + \gamma_{11}~\text{CD4_0})}.$

### Modeling Object

The parameters of this model will be estimated using maximum likelihood via R’s stats::glm() function. Predictions will be made using stats::predict.glm(), which by default returns predictions on the scale of the linear predictors. For methods implemented in DynTxRegime, prediction for the propensity score must always be on the scale of the response variable, i.e., the probabilities. The modeling object for this model is specified as

p1 <- modelObj::buildModelObj(model = ~ CD4_0,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))

### Model Diagnostics

Though the estimators are implemented in DynTxRegime such that the regression step is performed internally and model diagnostics can be performed post-analysis, it is convenient to do model diagnostics separately here. We will make use of modelObj::fit() to complete the regression step.

For $$\pi_{1}(h_{1};\gamma)$$ the regression can be completed as follows:

PS1 <- modelObj::fit(object = p1, data = dataMDP, response = dataMDP$A1) PS1 <- modelObj::fitObject(object = PS1) PS1  Call: glm(formula = YinternalY ~ CD4_0, family = "binomial", data = data) Coefficients: (Intercept) CD4_0 2.356590 -0.006604 Degrees of Freedom: 999 Total (i.e. Null); 998 Residual Null Deviance: 1315 Residual Deviance: 1228 AIC: 1232 where for convenience we have made use of modelObj’s fitObject() function to strip away the modeling object framework making PS1 an object of class “glm.” Though we know this model to be correctly specified, let’s examine the regression results. summary(object = PS1)  Call: glm(formula = YinternalY ~ CD4_0, family = "binomial", data = data) Deviance Residuals: Min 1Q Median 3Q Max -1.9010 -0.9414 -0.7107 1.2178 2.1054 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 2.3565904 0.3353980 7.026 2.12e-12 *** CD4_0 -0.0066038 0.0007588 -8.703 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 1314.7 on 999 degrees of freedom Residual deviance: 1227.6 on 998 degrees of freedom AIC: 1231.6 Number of Fisher Scoring iterations: 4 In comparing the null deviance (1314.7) and the residual deviance (1227.6), we see that including the independent variable, $$\text{CD4_0}$$, reduces the deviance relative to the simpler constant propensity score model. This result is consistent with the fact that it is the model used to generate our data. The posited model for the second decision point is the model used to generate the data $\pi_{2}(h_{2};\gamma_{2}) = \frac{\exp(\gamma_{20} + \gamma_{21}~\text{CD4_6})}{1+\exp(\gamma_{20} + \gamma_{21}~\text{CD4_6})}.$ ### Modeling Object The parameters of this model will be estimated using maximum likelihood via R’s stats::glm() function. Predictions will be made using stats::predict.glm(), which by default returns predictions on the scale of the linear predictors. For methods implemented in DynTxRegime, prediction for the propensity score must always be on the scale of the response variable, i.e., the probabilities. The modeling object for this model is specified as p2 <- modelObj::buildModelObj(model = ~ CD4_6, solver.method = 'glm', solver.args = list(family='binomial'), predict.method = 'predict.glm', predict.args = list(type='response')) ### Model Diagnostics Though the estimators are implemented in DynTxRegime such that the regression step is performed internally and model diagnostics can be performed post-analysis, it is convenient to do model diagnostics separately here. We will make use of modelObj::fit() to complete the regression step. For $$\pi_{2}(h_{2};\gamma_{2})$$ the regression can be completed as follows: PS2 <- modelObj::fit(object = p2, data = dataMDP, response = dataMDP$A2)
PS2 <- modelObj::fitObject(object = PS2)
PS2

Call:  glm(formula = YinternalY ~ CD4_6, family = "binomial", data = data)

Coefficients:
(Intercept)        CD4_6
0.818685    -0.004294

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      951.8
Residual Deviance: 911.4    AIC: 915.4

where for convenience we have made use of modelObj’s fitObject() function to strip away the modeling object framework making PS2 an object of class “glm.”

Though we know this model to be correctly specified, let’s examine the regression results.

summary(object = PS2)

Call:
glm(formula = YinternalY ~ CD4_6, family = "binomial", data = data)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-1.2416  -0.6751  -0.5610  -0.4157   2.2830

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.8186852  0.3733067   2.193   0.0283 *
CD4_6       -0.0042941  0.0006989  -6.144 8.05e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 951.82  on 999  degrees of freedom
Residual deviance: 911.44  on 998  degrees of freedom
AIC: 915.44

Number of Fisher Scoring iterations: 4

In comparing the null deviance (951.8) and the residual deviance (911.4), we see that including the independent variable, $$\text{CD4_6}$$ reduces the deviance relative to the simpler constant propensity score model. This result is consistent with the fact that it is the model used to generate our data.

The posited model for the final decision point is the model used to generate the data

$\pi_{3}(h_{3};\gamma_{3}) = \frac{\exp(\gamma_{30} + \gamma_{31}~\text{CD4_12})}{1+\exp(\gamma_{30} + \gamma_{31}~\text{CD4_12})}.$

### Modeling Object

The parameters of this model will be estimated using maximum likelihood via R’s stats::glm() function. Predictions will be made using stats::predict.glm(), which by default returns predictions on the scale of the linear predictors. For methods implemented in DynTxRegime, prediction for the propensity score must always be on the scale of the response variable, i.e., the probabilities. The modeling object for this model is specified as

p3 <- modelObj::buildModelObj(model = ~ CD4_12,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))

### Model Diagnostics

Though the estimators are implemented in DynTxRegime such that the regression step is performed internally and model diagnostics can be performed post-analysis, it is convenient to do model diagnostics separately here. We will make use of modelObj::fit() to complete the regression step.

For $$\pi_{3}(h_{3};\gamma_{3})$$ the regression can be completed as follows:

PS3 <- modelObj::fit(object = p3, data = dataMDP, response = dataMDP$A3) PS3 <- modelObj::fitObject(object = PS3) PS3  Call: glm(formula = YinternalY ~ CD4_12, family = "binomial", data = data) Coefficients: (Intercept) CD4_12 1.411752 -0.004945 Degrees of Freedom: 999 Total (i.e. Null); 998 Residual Null Deviance: 1252 Residual Deviance: 1203 AIC: 1207 where for convenience we have made use of modelObj’s fitObject() function to strip away the modeling object framework making PS3 an object of class “glm.” Though we know this model to be correctly specified, let’s examine the regression results. summary(object = PS3)  Call: glm(formula = YinternalY ~ CD4_12, family = "binomial", data = data) Deviance Residuals: Min 1Q Median 3Q Max -1.4176 -0.9026 -0.7373 1.2970 2.0555 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 1.411752 0.324038 4.357 1.32e-05 *** CD4_12 -0.004945 0.000734 -6.736 1.62e-11 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 1252.2 on 999 degrees of freedom Residual deviance: 1203.0 on 998 degrees of freedom AIC: 1207 Number of Fisher Scoring iterations: 4 In comparing the null deviance (1252.2) and the residual deviance (1203), we see that including the independent variable, $$\text{CD4_12}$$, reduces the deviance relative to the simpler constant propensity score model. This result is consistent with the fact that it is the model used to generate our data. For the inverse probability weighted estimator of the value of fixed regime $$d$$ for which $$\pi_{k}(h_{k})$$ are modeled using a logistic regression model and the parameters of which are estimated using maximum likelihood, the $$(1,1)$$ component of the sandwich estimator for the variance is defined as $\widehat{\Sigma}_{11} = A_{n} + (C_{n}^{\intercal}~D_{n}^{-1})~B_{n}~(C_{n}^{\intercal}~D_{n}^{-1})^{\intercal} ,$ where $$A_{n} = n^{-1} \sum_{i=1}^{n} A_{ni} A_{ni}$$ is the value estimator component $A_{ni} = \frac{C_{d,i}~Y_i}{\left\{\prod_{k=2}^{K}\pi_{d,k}(\overline{X}_{ki});\widehat{\gamma}_{k})\right\}\pi_{d,1}(X_{1i};\widehat{\gamma}_{1})} - \widehat{\mathcal{V}}_{IPW}(d).$ And, the components arising from the maximum likehood estimation of each of the $$k = 1, \dots, K$$ logistic regression models are $B_{n} = \left( \begin{array}{cc} B_{n1} & \cdots & 0\\ 0 & \ddots & 0\\ 0 & \cdots & B_{nK} \end{array} \right),$ where $$B_{nk} = n^{-1} \sum_{i=1}^{n} B_{nki} B_{nki}^{\intercal}$$, $$k=1,\dots,K$$ and $B_{nki} = \frac{ A_{ki} - \pi_{k}(\overline{X}_{ki};\widehat{\gamma}_{k})}{\pi_{k}(\overline{X}_{ki}; \widehat{\gamma}_{k})\{1-\pi_{k}(\overline{X}_{ki}; \widehat{\gamma}_{k})\}} \frac{\partial \pi_{k}(\overline{X}_{ki}; \widehat{\gamma}_{k})}{\partial \gamma_{k}}.$ Terms $$C_{n}$$ and $$D_{n}$$ are the derivatives with respect to the parameters of the propensity score models: $C_{n} = n^{-1}\sum_{i=1}^{n}\frac{\partial A_{ni}}{\partial \gamma} \quad \text{and} \quad D_{n} = n^{-1}\sum_{i=1}^{n} \frac{\partial B_{ni}}{\partial \gamma^{\intercal}}.$ $$\gamma = (\gamma_{1}^{\intercal}, \dots, \gamma_{K}^{\intercal})^{\intercal}$$. As was done in Chapters 2 and 3, we break the implementation of the value estimator and the sandwich estimator of the variance into multiple functions, each function calculating a specific component. In addition, we reuse function swv_ML() defined previously in Chapter 2. • value_IPW_se_md(): returns the value estimated using the inverse probability weighted estimator and its standard error estimated using the sandwich variance estimator. • value_IPW_swv_md() : returns the inverse probability weighted estimator component of the M-estimating equations and its derivatives, i.e., $$A_n$$ and $$C^{\intercal}_n$$ of the sandwich variance expression. • value_IPW_md(): returns the inverse probability weighted estimator of the value of a fixed regime. • swv_ML(): returns the component of the M-estimating question due to a maximum likelihood estimation of the parameters of a logistic regression model and the inverse of the derivatives of that component, i.e., $$B_n$$ and $$D^{-1}_n$$ of the sandwich variance expression. This implementation is not the most efficient implementation of the variance estimator. For example, the regression step is completed twice (once in value_IPW_swv_md() and once in swv_ML()). However, our objective is not to provide the most efficient implementation, but one that is general and reusable. #------------------------------------------------------------------------------# # IPW estimator for value of a fixed tx regime with binary tx at each of the # K>1 decision points # # ASSUMPTIONS # each of the tx is denoted by Ak k=(1,...,K) and is binary coded as {0,1} # # INPUTS # moPS : list of modeling objects for propensity score regressions # must be logistic models; kth element corresponds to kth decision # point # data : data.frame containing all covariates and txs # txs must be named 'Ak' (k=1,...,K) and coded 0/1 # y : outcome of interest # regime : matrix of 0/1 containing recommended txs # kth column corresponds to kth decision point # # RETURNS # a list containing # fitPS : list of value objects returned by propensity score regression # valueHat : estimated value #------------------------------------------------------------------------------# value_IPW_md <- function(moPS, data, y, regime) { # number of decision points K <- length(x = moPS) # number of patients in data set n <- nrow(x = data) # tx names in data set nms <- paste0('A',1L:K) #### Propensity Score Regression fitPS <- list() pid <- rep(x = 1.0, times = n) for (k in 1L:K) { # fit propensity score fitPS[[ k ]] <- modelObj::fit(object = moPS[[ k ]], data = data, response = data[,nms[k]]) # product of estimated propensity scores pk <- modelObj::predict(object = fitPS[[ k ]], newdata = data) if (is.matrix(x = pk)) pk <- pk[,ncol(x = pk), drop = TRUE] pid <- pid * {pk*{regime[,k] == 1L} + {1.0 - pk}*{regime[,k] == 0L}} } #### Value Estimate # indicator of tx received was that dictated by the regime Cd <- rowSums(x = regime == data[,nms]) == K value <- mean(x = Cd * y / pid) return( list( "fitPS" = fitPS, "valueHat" = value) ) } #------------------------------------------------------------------------------# # standard error for the IPW estimator of the value of a fixed regime # with K > 1 decision points estiamted using the sandwich estimator of variance # # ASSUMPTIONS # all K propensity score models are logistic models # txs are denoted by Ak (k=1,...,K) and are binary coded as {0,1} # # INPUTS # moPS : list of modeling objects for propensity score regressions # must be logistic models; kth element corresponds to kth decision # point # data : data.frame containing all covariates and txs # txs must be named 'Ak' (k=1,...,K) and coded 0/1 # y : outcome of interest # regime : matrix of 0/1 containing recommended txs # kth column corresponds to kth decision point # # RETURNS # a list containing # fitPS : list of value objects returned by propensity score regression # valueHat : estimated value # sigmaHat : estimated standard error #------------------------------------------------------------------------------# value_IPW_se_md <- function(moPS, data, y, regime) { # number of decision points K <- length(x = moPS) # number of patients in data set n <- nrow(x = data) # tx names in data set nms <- paste0('A',1L:K) #### ML components invdM <- matrix(data = 0.0, nrow = 0L, ncol = 0L) MM <- matrix(data = 0.0, nrow = 0L, ncol = 0L) for (k in 1L:K) { ML <- swv_ML(mo = moPS[[ k ]], data = data, y = data[,nms[k]]) invdM <- rbind(cbind(invdM, matrix(data = 0.0, nrow = nrow(x = invdM), ncol = ncol(x = ML$invdM))),
cbind(matrix(data = 0.0,
nrow = nrow(x = ML$invdM), ncol = ncol(x = invdM)), ML$invdM))
MM <- rbind(cbind(MM,
matrix(data = 0.0,
nrow = nrow(x = MM),
ncol = ncol(x = ML$MM))), cbind(matrix(data = 0.0, nrow = nrow(x = ML$MM),
ncol = ncol(x = MM)),
ML$MM)) } #### IPW value components IPW <- value_IPW_swv_md(moPS = moPS, data = data, y = y, regime = regime) #### variance estimator temp <- IPW$dMdG %*% invdM
sig11 <- drop(x = IPW$MM + temp %*% MM %*% t(x = temp)) sig11 <- sqrt(x = sig11 / n) return( c(IPW$value, "sigmaHat" = sig11) )
}

#------------------------------------------------------------------------------#
# component of sandwich estimator for IPW estimator of value of a fixed regime
#   with K > 1
#
# ASSUMPTIONS
#   all K propensity score models are logistic models
#   txs are denoted by Ak (k=1,...,K) and are binary coded as {0,1}
#
# INPUTS
#  moPS   : list of modeling objects for propensity score regressions
#           must be logistic models; kth element corresponds to kth decision
#           point
#  data   : data.frame containing all covariates and txs
#           txs must be named 'Ak' (k=1,...,K) and coded 0/1
#  y      : outcome of interest
#  regime : matrix of 0/1 containing recommended txs
#           kth column corresponds to kth decision point
#
# RETURNS
#  a list containing
#  value : list returned by value_IPW()
#     MM : M M^T matrix
#   dMdG : matrix of derivatives of M wrt gamma
#------------------------------------------------------------------------------#
value_IPW_swv_md <- function(moPS, data, y, regime) {

# estimated value
value <- value_IPW_md(moPS = moPS, data = data, y = y, regime = regime)

# number of decision points
K <- length(x = moPS)

# number of patients in data set
n <- nrow(x = data)

# tx names in data set
nms <- paste0('A',1L:K)

piMM <- list()
pid <- rep(x = 1.0, times = n)

for (k in 1L:K) {

# pi(x_k; gamma_k)
pk <- modelObj::predict(object = value$fitPS[[ k ]], newdata = data) if (is.matrix(x = pk)) pk <- pk[,ncol(x = pk), drop = TRUE] # propensity for receiving recommended tx pid <- pid * {pk*{regime[,k] == 1L} + {1.0 - pk}*{regime[,k] == 0L}} # derivative piMM[[ k ]] <- stats::model.matrix(object = modelObj::model(object = moPS[[ k ]]), data = data)*{-regime[,k] + pk} } # indicator of tx received was that dictated by the regime Cd <- as.integer(x = {rowSums(x = regime == data[,nms]) == K}) # value component of M MT mmValue <- mean(x = {Cd * y / pid - value$valueHat}^2)

#### derivative w.r.t. gamma
dMdG <- NULL

for (k in 1L:K) {
dMdG <- c(dMdG, colMeans(x = Cd * y / pid * piMM[[ k ]]))
}

return( list("value" = value,
"MM" = mmValue,
"dMdG" = dMdG) )

}

#------------------------------------------------------------------------------#
# components of the sandwich estimator for a maximum likelihood estimation of
#   a logistic regression model
#
# ASSUMPTIONS
#   mo is a logistic model with parameters to be estimated using ML
#
# INPUTS
#    mo : a modeling object specifying a regression step
#         *** must be a logistic model ***
#  data : a data.frame containing covariates and tx
#     y : a vector containing the binary outcome of interest
#
#
# RETURNS
# a list containing
#     MM : M^T M component from ML
#  invdM : inverse of the matrix of derivatives of M wrt gamma
#------------------------------------------------------------------------------#
swv_ML <- function(mo, data, y) {

# regression step
fit <- modelObj::fit(object = mo, data = data, response = y)

# yHat
p1 <- modelObj::predict(object = fit, newdata = data)
if (is.matrix(x = p1)) p1 <- p1[,ncol(x = p1), drop = TRUE]

# model.matrix for model
piMM <- stats::model.matrix(object = modelObj::model(object = mo),
data = data)

n <- nrow(x = piMM)
p <- ncol(x = piMM)

# ML M-estimator component
mMLi <- {y - p1} * piMM

# ML component of M MT
mmML <- sapply(X = 1L:n,
FUN = function(i){ mMLi[i,] %o% mMLi[i,] },
simplify = "array")

# derivative of ML component
dFun <- function(i){
- piMM[i,] %o% piMM[i,] * p1[i] * {1.0 - p1[i]}
}
dmML <- sapply(X = 1L:n, FUN = dFun, simplify = "array")

if( p == 1L ) {
mmML <- mean(x = mmML)
dmML <- mean(x = dmML)
} else {
mmML <- rowMeans(x = mmML, dim = 2L)
dmML <- rowMeans(x = dmML, dim = 2L)
}

# invert derivative ML component
invD <- solve(a = dmML)

return( list("MM" = mmML, "invdM" = invD) )

}

With this implementation, it is straightforward to estimate the value and the standard error when logistic regression models are posited for each of the $$K$$ decision points. The propensity score models considered are described and reviewed under tab $$\pi_{k}(h_{k};\gamma_{k})$$.

As was done for the single decision point estimator, we compare the estimated value under a variety of fixed regimes. Each tab below illustrates how to define the fixed regime under consideration and complete an analysis using the posited propensity score models.

First, we consider static regime $$d = \{d_{1}(x_{1}) \equiv 1, d_{2}(\overline{x}_{2}) \equiv 1, d_{3}(\overline{x}_{3}) \equiv 1\}$$. That is, all individuals are recommended to receive treatment $$A_{k} = 1$$, $$k = 1, 2, 3$$.

The estimated value of this fixed regime and its standard error are obtained as follows:

p1 <- modelObj::buildModelObj(model = ~ CD4_0,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))
p2 <- modelObj::buildModelObj(model = ~ CD4_6,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))
p3 <- modelObj::buildModelObj(model = ~ CD4_12,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))
regime <- matrix(data = 1L, nrow = nrow(x = dataMDP), ncol = 3L)
value_IPW_se_md(moPS = list(p1, p2, p3),
data = dataMDP,
y = dataMDP$Y, regime = regime) $fitPS
$fitPS[[1]] Call: glm(formula = YinternalY ~ CD4_0, family = "binomial", data = data) Coefficients: (Intercept) CD4_0 2.356590 -0.006604 Degrees of Freedom: 999 Total (i.e. Null); 998 Residual Null Deviance: 1315 Residual Deviance: 1228 AIC: 1232$fitPS[[2]]

Call:  glm(formula = YinternalY ~ CD4_6, family = "binomial", data = data)

Coefficients:
(Intercept)        CD4_6
0.818685    -0.004294

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      951.8
Residual Deviance: 911.4    AIC: 915.4

$fitPS[[3]] Call: glm(formula = YinternalY ~ CD4_12, family = "binomial", data = data) Coefficients: (Intercept) CD4_12 1.411752 -0.004945 Degrees of Freedom: 999 Total (i.e. Null); 998 Residual Null Deviance: 1252 Residual Deviance: 1203 AIC: 1207$valueHat
[1] 160.1139

$sigmaHat [1] 48.43737 We see that the inverse probability weighted estimator of the value under $$d = \{d_{1}(x_{1}) \equiv 1, d_{2}(\overline{x}_{2}\} \equiv 1, d_{3}(\overline{x}_{3}) \equiv 1)$$ is 160.11 (48.44) cells/mm$$^3$$ for this data set. Note that only 28 individuals in our data set follow this regime and thus contribute to the estimated value, which represents a very small percentage of the data. We have carried out a simulation study to evaluate the performance of the presented method. We generated 2 Monte Carlo data sets, each with $$n=1000$$. The Monte Carlo average estimated value and standard deviation of the estimated value are 13.99 (221.38). In addition, the Monte Carlo average of the standard errors is 145.48. Next, we consider static regime $$d = \{d_{1}(x_{1}) \equiv 0, d_{2}(\overline{x}_{2}) \equiv 0, d_{3}(\overline{x}_{3}) \equiv 0\}$$. That is, all individuals are recommended to receive treatment $$A_{k} = 0$$, $$k = 1, 2, 3$$. The estimated value of this fixed regime and its standard error are obtained as follows: p1 <- modelObj::buildModelObj(model = ~ CD4_0, solver.method = 'glm', solver.args = list(family='binomial'), predict.method = 'predict.glm', predict.args = list(type='response')) p2 <- modelObj::buildModelObj(model = ~ CD4_6, solver.method = 'glm', solver.args = list(family='binomial'), predict.method = 'predict.glm', predict.args = list(type='response')) p3 <- modelObj::buildModelObj(model = ~ CD4_12, solver.method = 'glm', solver.args = list(family='binomial'), predict.method = 'predict.glm', predict.args = list(type='response')) regime <- matrix(data = 0L, nrow = nrow(x = dataMDP), ncol = 3L) value_IPW_se_md(moPS = list(p1, p2, p3), data = dataMDP, y = dataMDP$Y,
regime = regime)
$fitPS$fitPS[[1]]

Call:  glm(formula = YinternalY ~ CD4_0, family = "binomial", data = data)

Coefficients:
(Intercept)        CD4_0
2.356590    -0.006604

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      1315
Residual Deviance: 1228     AIC: 1232

$fitPS[[2]] Call: glm(formula = YinternalY ~ CD4_6, family = "binomial", data = data) Coefficients: (Intercept) CD4_6 0.818685 -0.004294 Degrees of Freedom: 999 Total (i.e. Null); 998 Residual Null Deviance: 951.8 Residual Deviance: 911.4 AIC: 915.4$fitPS[[3]]

Call:  glm(formula = YinternalY ~ CD4_12, family = "binomial", data = data)

Coefficients:
(Intercept)       CD4_12
1.411752    -0.004945

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      1252
Residual Deviance: 1203     AIC: 1207

$valueHat [1] 1095.729$sigmaHat
[1] 60.60388

We see that the inverse probability weighted estimator of the value under $$d = \{d_{1}(x_{1}) \equiv 0, d_{2}(\overline{x}_{2}) \equiv 0, d_{3}(\overline{x}_{3}) \equiv 0\}$$ is 1095.73 (60.60) cells/mm$$^3$$. In the current data set, 376 individuals follow this regime; a small percentage of the total data, but significantly more than regime $$d = \{d_{1}(x_{1}) \equiv 1, d_{2}(\overline{x}_{2}) \equiv 1, d_{3}(\overline{x}_{3}) \equiv 1\}$$.

We have carried out a simulation study to evaluate the performance of the presented method. We generated 2 Monte Carlo data sets, each with $$n=1000$$. The Monte Carlo average estimated value and standard deviation of the estimated value are 1106.47 (28.75). In addition, the Monte Carlo average of the standard errors is 60.65.

The final fixed regime we consider is a covariate dependent regime $$d = \{d_{1}(x_{1}) = \text{I}(\text{CD4_0} < 250 ~\text{cells}/\text{mm}^{3}),$$$$d_{2}(\overline{x}_{2}) = \text{I}(\text{CD4_6} < 360~\text{cells}/\text{mm}^{3}),$$$$d_{3}(\overline{x}_{3}) = \text{I}(\text{CD4_12} < 300~\text{cells}/\text{mm}^{3})\}$$.

The estimated value of this fixed regime and its standard error are obtained as follows:

p1 <- modelObj::buildModelObj(model = ~ CD4_0,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))
p2 <- modelObj::buildModelObj(model = ~ CD4_6,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))
p3 <- modelObj::buildModelObj(model = ~ CD4_12,
solver.method = 'glm',
solver.args = list(family='binomial'),
predict.method = 'predict.glm',
predict.args = list(type='response'))
regime <- cbind(as.integer(x = {dataMDP$CD4_0 < 250.0}), as.integer(x = {dataMDP$CD4_6 < 360.0}),
as.integer(x = {dataMDP$CD4_12 < 300.0})) value_IPW_se_md(moPS = list(p1, p2, p3), data = dataMDP, y = dataMDP$Y,
regime = regime)
$fitPS$fitPS[[1]]

Call:  glm(formula = YinternalY ~ CD4_0, family = "binomial", data = data)

Coefficients:
(Intercept)        CD4_0
2.356590    -0.006604

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      1315
Residual Deviance: 1228     AIC: 1232

$fitPS[[2]] Call: glm(formula = YinternalY ~ CD4_6, family = "binomial", data = data) Coefficients: (Intercept) CD4_6 0.818685 -0.004294 Degrees of Freedom: 999 Total (i.e. Null); 998 Residual Null Deviance: 951.8 Residual Deviance: 911.4 AIC: 915.4$fitPS[[3]]

Call:  glm(formula = YinternalY ~ CD4_12, family = "binomial", data = data)

Coefficients:
(Intercept)       CD4_12
1.411752    -0.004945

Degrees of Freedom: 999 Total (i.e. Null);  998 Residual
Null Deviance:      1252
Residual Deviance: 1203     AIC: 1207

$valueHat [1] 1118.004$sigmaHat
[1] 61.41719

We see that the IPW estimator of the value under $$d = \{d_{1}(x_{1}) = \text{I}(\text{CD4_0} < 250 ~\text{cells}/\text{mm}^{3}),$$$$d_{2}(\overline{x}_{2}) = \text{I}(\text{CD4_6} < 360~\text{cells}/\text{mm}^{3}),$$$$d_{3}(\overline{x}_{3}) = \text{I}(\text{CD4_12} < 300~\text{cells}/\text{mm}^{3})\}$$ is 1118.00 (61.42) cells/mm$$^3$$. As we saw in the second example, 379 individuals follow this regime.

We have carried out a simulation study to evaluate the performance of the presented method. We generated 2 Monte Carlo data sets, each with $$n=1000$$. The Monte Carlo average estimated value and standard deviation of the estimated value are 1060.69 (41.98). In addition, the Monte Carlo average of the standard errors is 62.22.