TL;DR: When a cluster_col is passed to CoxPHFitter.fit(), the value given for robust appears to have no effect. If a cluster_col is specified, then presumably the Huber sandwich estimator will always be used.
I was using cluster_col in the CoxPHFitter and saw in the docstring that the sandwich estimator automatically gets used. I was aiming to match the standard errors in a test case with a CoxTimeVaryingFitter model by setting robust to the same value in the CoxPHFitter and the CoxTimeVaryingFitter. (This explains my test data below.) However, I saw from issue #544 that robust has not been implemented for the CoxTimeVaryingFitter, leaving me only the option to set robust=False in the CoxPHFitter model. For the test case, I can just leave cluster_col unspecified. Still, I think an error or warning message should be raised when cluster_col is set and robust=False. It looks like this conditional needs to be edited.
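To make the suggestion concrete, here is a minimal sketch of the kind of check I have in mind. It is not the lifelines source: `resolve_robust` is a hypothetical helper, and it assumes `robust` would default to `None` rather than `False`, so that an explicit `robust=False` can be told apart from "not specified".

```python
from typing import Optional


def resolve_robust(robust: Optional[bool], cluster_col: Optional[str]) -> bool:
    """Hypothetical helper (not part of lifelines).

    Decides whether the sandwich estimator should be used, and rejects
    the contradictory combination of cluster_col with robust=False.
    Assumes robust defaults to None, meaning "not specified".
    """
    if cluster_col is not None:
        if robust is False:
            # The caller explicitly asked for non-robust errors,
            # but clustering implies the sandwich estimator.
            raise ValueError(
                "cluster_col requires the robust (sandwich) estimator; "
                "remove robust=False or drop cluster_col."
            )
        return True
    # No clustering: honor the caller's choice, treating None as False.
    return bool(robust)
```

With a check like this, `resolve_robust(False, "id")` would raise instead of silently returning robust standard errors, while `resolve_robust(None, "id")` would still turn the sandwich estimator on automatically.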
Here's a reproducible example with my comments:
import numpy.testing as npt
import pandas as pd
from lifelines import CoxPHFitter, CoxTimeVaryingFitter
from lifelines.datasets import load_stanford_heart_transplants
from lifelines.utils import to_long_format

stanford = load_stanford_heart_transplants()

# Keep only the last record for each subject, drop all covariate columns except age to simplify data
stanford_last = (
    stanford.groupby("id")
    .tail(1)
    .drop(["year", "surgery", "transplant"], axis="columns")
)

# Format the data for CPH model
stanford_last_cph_wid = stanford_last.rename(
    columns={"start": "W", "stop": "T", "event": "E"}
)
stanford_last_cph_wid.head()
Create a CoxPHFitter model and fit it with the cluster_col specified.
However, if both a cluster_col and robust are specified, the SE value is always the same (0.14374) regardless of the value for robust.
The standard error is different (0.13862) when cluster_col is not specified and robust is left at its default value of False.
lifelines version: 0.27.8