Estimating Individualized Treatment Rules Using Outcome Weighted Learning

by Nadav Har-Tuv, March 2024


We can test the OWL method on simulated data. By designing the reward function ourselves, we know the optimal treatment for every patient, so we can train the OWL classifier on the data and check how closely it matches the optimal classifier.

For example:

I created 50 features, all sampled from a U([-1,1]) distribution, and assigned each patient one of three treatments {1, 2, 3} (coded 0, 1, 2 in the code below) uniformly at random.

The response is sampled from a normal distribution with mean μ and a small standard deviation (0.1 in the code below), where μ = (X₁ + X₂)·I(T=1) + (X₁ − X₂)·I(T=2) + (X₂ − X₁)·I(T=3)

# This code block creates the data for the simulation
import numpy as np

n_train = 500  # I purposely chose a small training set to simulate a medical trial
n_col = 50  # This is the number of features
n_test = 1000
X_train = np.random.uniform(low=-1, high=1, size=(n_train, n_col))
T = np.random.randint(3, size=n_train)  # Treatments 1, 2, 3 (coded 0, 1, 2), assigned uniformly at random
R_mean = (X_train[:, 0] + X_train[:, 1])*(T == 0) + (X_train[:, 0] - X_train[:, 1])*(T == 1) + (X_train[:, 1] - X_train[:, 0])*(T == 2)
R = np.random.normal(loc=R_mean, scale=.1)  # The standard deviation can be tweaked
X_test = np.random.uniform(low=-1, high=1, size=(n_test, n_col))

# The optimal classifier can be deduced from the design of R:
# it returns 0 when both features are positive, otherwise 1 or 2
# depending on which feature is larger
optimal_classifier = (1 - (X_test[:, 0] > 0)*(X_test[:, 1] > 0))*((X_test[:, 0] > X_test[:, 1]) + 2*(X_test[:, 1] > X_test[:, 0]))

It is not hard to see that the optimal treatment regime is to give treatment 1 if both X₁ and X₂ are positive, since then X₁ + X₂ exceeds both X₁ − X₂ and X₂ − X₁. If both are negative, give treatment 2 if X₂ < X₁ and treatment 3 if X₁ < X₂. If X₁ is positive and X₂ is negative, give treatment 2; if X₂ is positive and X₁ is negative, give treatment 3. A numerical sanity check of this rule is sketched below.
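We can also recover this rule numerically, by computing the mean reward of each treatment at every test point and taking the argmax. This snippet is a verification I added, not part of the original post; it reuses X_test and optimal_classifier from the block above:

# Sanity check: the argmax of the three mean rewards should reproduce
# the closed-form optimal classifier at (almost) every test point
mu = np.stack([
    X_test[:, 0] + X_test[:, 1],  # mean reward of treatment 1 (coded 0)
    X_test[:, 0] - X_test[:, 1],  # mean reward of treatment 2 (coded 1)
    X_test[:, 1] - X_test[:, 0],  # mean reward of treatment 3 (coded 2)
], axis=1)
print(np.mean(mu.argmax(axis=1) == optimal_classifier))  # should print 1.0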

We can also show this with an image. These are the regions of the optimal treatment, plotted over the (X₁, X₂) plane:

Optimal treatment ranges for combinations of X₁, X₂

I sampled 500 data points with 50 features and the reward function that I described above. I fit an OWL classifier with a Gaussian (‘rbf’) kernel and got the following classifications, which I visualized for values of X₁, X₂:

Classification of treatment groups visualized for values of X₁, X₂
# Code for the plot
import matplotlib.pyplot as plt
import seaborn as sns

kernel = 'rbf'
# gamma is a hyperparameter that should be tuned by cross-validation,
# but 1/n_features is a good place to start
gamma = 1/X_train.shape[1]
D = owl_classifier(X_train, T, R, kernel, gamma)  # owl_classifier was defined earlier in the article
prediction = D.predict(X_test)
sns.scatterplot(x=X_test[:, 0], y=X_test[:, 1], hue=prediction)
plt.show()
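The owl_classifier function itself was defined earlier in the article. So that this section stands on its own, here is a minimal sketch of what it could look like, assuming it wraps a weighted multiclass SVM in the spirit of the OWL formulation. The reward shift that keeps the SVM weights nonnegative is a common practical heuristic, not part of the formal method, and this is my sketch rather than the author's exact code:

# Minimal sketch of an OWL classifier (an assumption about the earlier
# definition, not the author's exact code)
from sklearn.svm import SVC

def owl_classifier(X, T, R, kernel, gamma, C=1.0):
    propensity = 1/3  # treatments were assigned uniformly at random
    # OWL weights each observation by reward over propensity; SVC needs
    # nonnegative weights, so shift the rewards (a practical heuristic)
    weights = (R - R.min()) / propensity
    model = SVC(kernel=kernel, gamma=gamma, C=C)
    model.fit(X, T, sample_weight=weights)
    return model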

In case you missed what happened here: the data contained 2 features that affect the response and 48 features of pure noise. The model managed to learn the effect of the two important features without us modeling this relationship in any way!
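We can quantify this with a quick check. The snippet below is mine, not the original post's; it reuses prediction, optimal_classifier, and X_test from above, and evaluates each rule with the known reward design of the simulation:

# Evaluation: agreement with the optimal rule, and the mean reward each
# rule would collect on the test set
def mean_reward(X, D):
    # Expected reward of following rule D, using the known reward design
    return np.mean((X[:, 0] + X[:, 1])*(D == 0)
                   + (X[:, 0] - X[:, 1])*(D == 1)
                   + (X[:, 1] - X[:, 0])*(D == 2))

print('agreement with optimal rule:', np.mean(prediction == optimal_classifier))
print('mean reward, learned rule:', mean_reward(X_test, prediction))
print('mean reward, optimal rule:', mean_reward(X_test, optimal_classifier))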

This is just one simple example. I made the reward function depend on X₁ and X₂ so that it is easy to understand and visualize, but feel free to try other reward functions and different classifiers.
