dowhy.causal_estimators package
Submodules
dowhy.causal_estimators.causalml module
- class dowhy.causal_estimators.causalml.Causalml(*args, **kwargs)[source]
Bases:
CausalEstimator
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
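The bootstrap options listed under params (num_simulations, sample_size_fraction, confidence_level) can be illustrated with a percentile bootstrap written in plain Python. This is a minimal sketch of the general technique, not DoWhy's internal implementation: the helper names diff_in_means and bootstrap_ci, the toy difference-in-means estimator, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def diff_in_means(rows):
    """Toy estimator: mean outcome of treated minus mean outcome of controls."""
    treated = [y for t, y in rows if t == 1]
    control = [y for t, y in rows if t == 0]
    return statistics.mean(treated) - statistics.mean(control)


def bootstrap_ci(rows, estimator, num_simulations=200, sample_size_fraction=1.0,
                 confidence_level=0.95, seed=0):
    """Percentile bootstrap interval for estimator(rows) (hypothetical helper)."""
    rng = random.Random(seed)
    sample_size = max(2, int(len(rows) * sample_size_fraction))
    estimates = []
    while len(estimates) < num_simulations:
        # Resample rows with replacement.
        sample = [rng.choice(rows) for _ in range(sample_size)]
        # Skip degenerate resamples that lack a treated or a control unit.
        if len({t for t, _ in sample}) < 2:
            continue
        estimates.append(estimator(sample))
    estimates.sort()
    alpha = 1.0 - confidence_level
    lower = estimates[int((alpha / 2) * (num_simulations - 1))]
    upper = estimates[int((1 - alpha / 2) * (num_simulations - 1))]
    return lower, upper


# Synthetic (treatment, outcome) pairs with a true effect of 2.0.
data_rng = random.Random(42)
rows = [(t, 2.0 * t + data_rng.gauss(0, 1)) for t in [0, 1] * 100]
point_estimate = diff_in_means(rows)
ci_lower, ci_upper = bootstrap_ci(rows, diff_in_means)
print(point_estimate, (ci_lower, ci_upper))
```

A smaller sample_size_fraction makes each resample cheaper to evaluate but widens the spread of the resampled estimates.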
dowhy.causal_estimators.distance_matching_estimator module
- class dowhy.causal_estimators.distance_matching_estimator.DistanceMatchingEstimator(*args, **kwargs)[source]
Bases:
CausalEstimator
Simple matching estimator for binary treatments based on a distance metric.
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
- Valid_Dist_Metric_Params = ['p', 'V', 'VI', 'w']
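As a rough illustration of distance-based matching for a binary treatment, the sketch below pairs each treated unit with its nearest control unit and averages the outcome differences, which gives an effect on the treated. It is not the DistanceMatchingEstimator implementation. The function att_by_nearest_neighbor and the toy data are hypothetical, and reading 'p' as the Minkowski exponent is an assumption based on common distance-metric conventions.

```python
def att_by_nearest_neighbor(units, p=2):
    """Average treated-minus-matched-control outcome difference.

    units: list of (treatment, outcome, covariates) tuples, treatment in {0, 1}.
    p: Minkowski exponent (p=2 is Euclidean distance); an assumed reading of 'p'.
    """
    def minkowski(a, b):
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

    treated = [u for u in units if u[0] == 1]
    control = [u for u in units if u[0] == 0]
    differences = []
    for _, y_treated, x_treated in treated:
        # Find the control unit closest to this treated unit in covariate space.
        _, y_match, _ = min(control, key=lambda c: minkowski(c[2], x_treated))
        differences.append(y_treated - y_match)
    return sum(differences) / len(differences)


# Toy data: two treated units, each with one obviously close control.
units = [
    (1, 5.0, (1.0, 0.0)),
    (1, 7.0, (3.0, 1.0)),
    (0, 3.0, (1.1, 0.1)),
    (0, 4.0, (2.9, 0.9)),
    (0, 9.0, (8.0, 8.0)),
]
att = att_by_nearest_neighbor(units)
print(att)
```

The distant control unit at (8.0, 8.0) is never selected as a match, so its outcome does not influence the estimate.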
dowhy.causal_estimators.econml module
- class dowhy.causal_estimators.econml.Econml(*args, **kwargs)[source]
Bases:
CausalEstimator
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.generalized_linear_model_estimator module
- class dowhy.causal_estimators.generalized_linear_model_estimator.GeneralizedLinearModelEstimator(*args, **kwargs)[source]
Bases:
RegressionEstimator
Compute effect of treatment using a generalized linear model such as logistic regression.
Implementation uses statsmodels.api.GLM. Needs an additional parameter, “glm_family” to be specified in method_params. The value of this parameter can be any valid statsmodels.api families object. For example, to use logistic regression, specify “glm_family” as statsmodels.api.families.Binomial().
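One way to read a treatment effect off a fitted logistic model is to compare the average predicted outcome when every unit is set to treatment_value against the average when every unit is set to control_value. The sketch below shows that calculation in plain Python. It assumes the coefficients have already been fitted (the values are hypothetical), uses the inverse-logit link that goes with a Binomial family by default, and does not claim to reproduce this estimator's exact computation.

```python
import math


def predict_probability(coefficients, treatment, confounder):
    """Inverse-logit of the linear predictor for one unit."""
    intercept, beta_treatment, beta_confounder = coefficients
    linear = intercept + beta_treatment * treatment + beta_confounder * confounder
    return 1.0 / (1.0 + math.exp(-linear))


def average_effect(coefficients, confounders, control_value=0, treatment_value=1):
    """Mean predicted outcome under treatment minus mean under control."""
    treated = [predict_probability(coefficients, treatment_value, w) for w in confounders]
    control = [predict_probability(coefficients, control_value, w) for w in confounders]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical, already-fitted coefficients: (intercept, treatment, confounder).
coefficients = (-1.0, 1.5, 0.8)
confounders = [-1.0, -0.5, 0.0, 0.5, 1.0]
effect = average_effect(coefficients, confounders)
print(round(effect, 3))
```

Unlike a linear model, the effect here is a difference in probabilities, so it depends on the confounder values over which it is averaged and not only on the treatment coefficient.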
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.instrumental_variable_estimator module
- class dowhy.causal_estimators.instrumental_variable_estimator.InstrumentalVariableEstimator(*args, **kwargs)[source]
Bases:
CausalEstimator
Compute effect of treatment using the instrumental variables method.
This is also a superclass that can be inherited by other specific methods.
Supports additional parameters that can be specified in the estimate_effect() method.
‘iv_instrument_name’: Name of the specific instrumental variable to be used. Needs to be one of the IVs identified in the identification step. Default is to use all the IV variables from the identification step.
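For a single binary instrument, one common form of the instrumental variables estimate is the Wald ratio: the instrument's effect on the outcome divided by its effect on the treatment. The sketch below is a plain-Python illustration on synthetic data in which an unobserved variable U confounds treatment and outcome. The function wald_estimate and the data-generating process are assumptions for illustration, not this class's code.

```python
def mean(values):
    return sum(values) / len(values)


def wald_estimate(instrument, treatment, outcome):
    """(E[Y|Z=1] - E[Y|Z=0]) / (E[T|Z=1] - E[T|Z=0]) for a binary instrument Z."""
    y1 = mean([y for z, y in zip(instrument, outcome) if z == 1])
    y0 = mean([y for z, y in zip(instrument, outcome) if z == 0])
    t1 = mean([t for z, t in zip(instrument, treatment) if z == 1])
    t0 = mean([t for z, t in zip(instrument, treatment) if z == 0])
    return (y1 - y0) / (t1 - t0)


# Synthetic data: Z shifts T, U confounds T and Y, and Z affects Y only through T.
# The true effect of T on Y is 3.0.
instrument, treatment, outcome = [], [], []
for z in (0, 1):
    for u in (-1.0, -0.5, 0.0, 0.5, 1.0):
        t = 1.0 * z + u
        y = 3.0 * t + 2.0 * u
        instrument.append(z)
        treatment.append(t)
        outcome.append(y)

iv_estimate = wald_estimate(instrument, treatment, outcome)

# Naive regression slope of Y on T, which absorbs the confounding through U.
t_bar, y_bar = mean(treatment), mean(outcome)
naive_slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(treatment, outcome))
               / sum((t - t_bar) ** 2 for t in treatment))
print(iv_estimate, naive_slope)
```

In this construction the confounder is balanced across the two instrument groups, so the Wald ratio recovers the true effect while the naive slope overstates it.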
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.linear_regression_estimator module
- class dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator(*args, **kwargs)[source]
Bases:
RegressionEstimator
Compute effect of treatment using linear regression.
Fits a regression model for estimating the outcome using treatment(s) and confounders. For a univariate treatment, the treatment effect is equivalent to the coefficient of the treatment variable.
Simple method to show the implementation of a causal inference method that can handle multiple treatments and heterogeneity in treatment. Requires a strong assumption that all relationships from (T, W) to Y are linear.
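The claim that, for a univariate treatment, the effect equals the treatment coefficient can be checked on synthetic data where the outcome is an exact linear function of a treatment T and a confounder W. The sketch below fits ordinary least squares through the normal equations using only the standard library. The helper names and data are hypothetical, and a real analysis would rely on a regression library instead.

```python
def solve_linear_system(matrix, vector):
    """Gauss-Jordan elimination with partial pivoting for a small square system."""
    n = len(vector)
    augmented = [row[:] + [value] for row, value in zip(matrix, vector)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(augmented[r][col]))
        augmented[col], augmented[pivot] = augmented[pivot], augmented[col]
        for row in range(n):
            if row != col:
                factor = augmented[row][col] / augmented[col][col]
                augmented[row] = [a - factor * b
                                  for a, b in zip(augmented[row], augmented[col])]
    return [augmented[i][n] / augmented[i][i] for i in range(n)]


def ols_coefficients(features, outcomes):
    """Least-squares fit via the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(row) for row in features]  # prepend an intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcomes)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data: each row is (treatment T, confounder W); Y = 1 + 2*T + 3*W exactly.
features = [(0, 0.0), (1, 0.5), (0, 1.0), (1, 1.5), (0, 2.0), (1, 3.0)]
outcomes = [1 + 2 * t + 3 * w for t, w in features]
intercept, treatment_coefficient, confounder_coefficient = ols_coefficients(features, outcomes)
print(round(treatment_coefficient, 6))
```

Because W is included as a regressor, the treatment coefficient matches the true effect of 2 even though treated units tend to have larger W.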
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.propensity_score_estimator module
- class dowhy.causal_estimators.propensity_score_estimator.PropensityScoreEstimator(*args, propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]
Bases:
CausalEstimator
Base class for estimators that estimate effects based on propensity of treatment assignment.
Supports additional parameters that can be specified in the estimate_effect() method.
‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None
‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True
‘propensity_score_column’: column name that stores the propensity score. Default='propensity_score'
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.propensity_score_matching_estimator module
- class dowhy.causal_estimators.propensity_score_matching_estimator.PropensityScoreMatchingEstimator(*args, propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]
Bases:
PropensityScoreEstimator
Estimate effect of treatment by finding matching treated and control units based on propensity score.
Straightforward application of the back-door criterion.
Supports additional parameters that can be specified in the estimate_effect() method.
‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None
‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True
‘propensity_score_column’: column name that stores the propensity score. Default='propensity_score'
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap", which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
  - num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
  - num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
  - sample_size_fraction: The size of the sample for the bootstrap estimator.
  - confidence_level: The confidence level of the confidence interval estimate.
  - num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.propensity_score_stratification_estimator module
- class dowhy.causal_estimators.propensity_score_stratification_estimator.PropensityScoreStratificationEstimator(*args, num_strata='auto', clipping_threshold=10, propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]
Bases:
PropensityScoreEstimator
Estimate effect of treatment by stratifying the data into bins with identical common causes.
Straightforward application of the back-door criterion.
Supports additional parameters that can be specified in the estimate_effect() method.
‘num_strata’: Number of bins by which data will be stratified. Default='auto', as in the class signature above.
‘clipping_threshold’: Minimum number of treated or control units per stratum. Default=10
‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None
‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True
‘propensity_score_column’: column name that stores the propensity score. Default='propensity_score'
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=”bootstrap” that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals=”bootstrap”. The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
sample_size_fraction: The size of the sample for the bootstrap estimator, as a fraction of the data.
confidence_level: The confidence level of the confidence interval estimate.
num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.propensity_score_weighting_estimator module
- class dowhy.causal_estimators.propensity_score_weighting_estimator.PropensityScoreWeightingEstimator(*args, min_ps_score=0.05, max_ps_score=0.95, weighting_scheme='ips_weight', propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]
Bases:
PropensityScoreEstimator
Estimate effect of treatment by weighting the data by the inverse probability of occurrence.
Straightforward application of the back-door criterion.
Supports additional parameters that can be specified in the estimate_effect() method.
‘min_ps_score’: Lower bound used to clip the propensity score. Default=0.05
‘max_ps_score’: Upper bound used to clip the propensity score. Default=0.95
‘weighting_scheme’: Name of the weighting method to use. Can be inverse propensity score (“ips_weight”, default), stabilized IPS score (“ips_stabilized_weight”), or normalized IPS score (“ips_normalized_weight”)
‘propensity_score_model’: The model used to compute the propensity score. Can be any classification model that supports the fit() and predict_proba() methods. If None, LogisticRegression is used. Default=None
‘recalculate_propensity_score’: If True, force the estimator to calculate the propensity score. To use a pre-computed propensity score, set this value to False. Default=True
‘propensity_score_column’: column name that stores the propensity score. Default=’propensity_score’
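The role of min_ps_score and max_ps_score, and the difference between plain and normalized weights, can be sketched as follows. The code uses the textbook Horvitz-Thompson and self-normalized forms of inverse propensity weighting; the exact formulas DoWhy applies for each weighting_scheme may differ, and the function names and the normalize flag are hypothetical.

```python
def clip(value, lower, upper):
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def ips_effect(units, min_ps_score=0.05, max_ps_score=0.95, normalize=False):
    """Inverse-propensity-weighted effect estimate (illustrative sketch).

    units: list of (propensity_score, treatment, outcome) tuples.
    normalize=False divides weighted sums by the number of units;
    normalize=True divides each group's sum by that group's total weight.
    """
    n = len(units)
    treated_sum = control_sum = 0.0
    treated_weight = control_weight = 0.0
    for ps, t, y in units:
        # Clipping bounds the weights, since 1/ps grows without limit near 0.
        ps = clip(ps, min_ps_score, max_ps_score)
        if t == 1:
            weight = 1.0 / ps
            treated_sum += weight * y
            treated_weight += weight
        else:
            weight = 1.0 / (1.0 - ps)
            control_sum += weight * y
            control_weight += weight
    if normalize:
        return treated_sum / treated_weight - control_sum / control_weight
    return treated_sum / n - control_sum / n


# Toy data: propensity k/10, k of every ten units treated,
# outcome = 2 * treatment + k (true effect 2).
units = []
for k in (1, 3, 5, 7, 9):
    for i in range(10):
        t = 1 if i < k else 0
        units.append((k / 10, t, 2 * t + k))
print(ips_effect(units), ips_effect(units, normalize=True))
```

Clipping trades a small bias for lower variance: a unit with a propensity score of 0.01 would otherwise receive a weight of 100.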
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=”bootstrap” that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals=”bootstrap”. The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
sample_size_fraction: The size of the sample for the bootstrap estimator, as a fraction of the data.
confidence_level: The confidence level of the confidence interval estimate.
num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
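The significance test controlled by test_significance and num_null_simulations can be pictured as a comparison of the observed estimate against estimates obtained under a simulated null. The sketch below assumes the null is simulated by shuffling the outcome column, which is one common construction and is not confirmed here as DoWhy's exact procedure; all function names are hypothetical.

```python
import random


def difference_in_means(treatments, outcomes):
    """Toy estimator: mean outcome of treated units minus mean of controls."""
    treated = [y for t, y in zip(treatments, outcomes) if t == 1]
    control = [y for t, y in zip(treatments, outcomes) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def null_simulation_p_value(treatments, outcomes, estimator,
                            num_null_simulations=1000, seed=0):
    """Two-sided p-value from re-estimating on shuffled outcomes."""
    rng = random.Random(seed)
    observed = estimator(treatments, outcomes)
    shuffled = list(outcomes)
    at_least_as_extreme = 0
    for _ in range(num_null_simulations):
        # Shuffling breaks any link between treatment and outcome.
        rng.shuffle(shuffled)
        if abs(estimator(treatments, shuffled)) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / num_null_simulations


treatments = [1] * 10 + [0] * 10
outcomes = [10 + i for i in range(10)] + [i for i in range(10)]
print(null_simulation_p_value(treatments, outcomes, difference_in_means))
```

More null simulations give a finer-grained p-value at the cost of re-running the estimator that many times.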
dowhy.causal_estimators.regression_discontinuity_estimator module
- class dowhy.causal_estimators.regression_discontinuity_estimator.RegressionDiscontinuityEstimator(*args, **kwargs)[source]
Bases:
CausalEstimator
Compute effect of treatment using the regression discontinuity method.
Estimates effect by transforming the problem to an instrumental variables problem.
Supports additional parameters that can be specified in the estimate_effect() method.
‘rd_variable_name’: name of the variable on which the discontinuity occurs. This is the instrument.
‘rd_threshold_value’: Threshold at which the discontinuity occurs.
‘rd_bandwidth’: Distance from the threshold within which confounders can be considered the same between treatment and control. The band considered is (threshold ± bandwidth).
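The transformation to an instrumental variables problem can be sketched with a Wald ratio: within the band, the indicator "rd variable above the threshold" serves as a binary instrument. This is an illustration under assumptions rather than DoWhy's code; the function name, the row layout, and the choice of an inclusive band are all hypothetical.

```python
def regression_discontinuity_effect(rows, rd_threshold_value, rd_bandwidth):
    """Wald estimate using 'above threshold' as a binary instrument.

    rows: list of (rd_variable, treatment, outcome) tuples.
    Only rows within threshold +/- bandwidth (inclusive, by assumption) are used.
    """
    lower = rd_threshold_value - rd_bandwidth
    upper = rd_threshold_value + rd_bandwidth
    local = [row for row in rows if lower <= row[0] <= upper]
    above = [(t, y) for x, t, y in local if x > rd_threshold_value]
    below = [(t, y) for x, t, y in local if x <= rd_threshold_value]
    if not above or not below:
        raise ValueError("band must contain rows on both sides of the threshold")

    def mean(values):
        return sum(values) / len(values)

    outcome_jump = mean([y for _, y in above]) - mean([y for _, y in below])
    treatment_jump = mean([t for t, _ in above]) - mean([t for t, _ in below])
    if treatment_jump == 0:
        raise ValueError("treatment rate does not change at the threshold")
    # Ratio of the jump in outcome to the jump in treatment uptake.
    return outcome_jump / treatment_jump


# Toy data: treatment mostly switches on above 50, with two exceptions
# inside the band; outcome = 1 + 3 * treatment (true effect 3).
rows = []
for x in range(100):
    t = 1 if x > 50 else 0
    if x in (45, 55):
        t = 1 - t  # imperfect compliance near the threshold
    rows.append((x, t, 1 + 3 * t))
print(regression_discontinuity_effect(rows, rd_threshold_value=50, rd_bandwidth=5))
```

A narrower bandwidth makes the "same confounders on both sides" assumption more plausible but leaves fewer rows to estimate from.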
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=”bootstrap” that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals=”bootstrap”. The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
sample_size_fraction: The size of the sample for the bootstrap estimator, as a fraction of the data.
confidence_level: The confidence level of the confidence interval estimate.
num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
dowhy.causal_estimators.regression_estimator module
- class dowhy.causal_estimators.regression_estimator.RegressionEstimator(*args, **kwargs)[source]
Bases:
CausalEstimator
Compute effect of treatment using some regression function.
Fits a regression model for estimating the outcome using treatment(s) and confounders.
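For a linear model, fitting the outcome on treatment and confounders means the coefficient on the treatment is read off as the effect estimate. The sketch below solves the least-squares normal equations by hand for one treatment and one confounder; the helper names are hypothetical and this is not the class's actual implementation.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination with pivoting."""
    size = len(vector)
    aug = [list(row) + [value] for row, value in zip(matrix, vector)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(size):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def ols_coefficients(features, targets):
    """Least-squares coefficients, intercept first, via the normal equations."""
    design = [[1.0] + list(row) for row in features]
    width = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(width)]
           for i in range(width)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(width)]
    return solve_linear_system(xtx, xty)


def regression_effect(treatment, confounder, outcome):
    """Coefficient on treatment when regressing outcome on both variables."""
    coefficients = ols_coefficients(list(zip(treatment, confounder)), outcome)
    return coefficients[1]  # order: intercept, treatment, confounder


# Toy data: the confounder raises both treatment uptake and the outcome.
confounder = [0, 1, 2, 3, 4, 5, 6, 7]
treatment = [0, 0, 1, 0, 1, 1, 0, 1]
outcome = [1 + 2 * t + 0.5 * w for t, w in zip(treatment, confounder)]
print(regression_effect(treatment, confounder, outcome))
```

Leaving the confounder out of the regression would attribute part of its influence on the outcome to the treatment.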
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=”bootstrap” that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals=”bootstrap”. The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
sample_size_fraction: The size of the sample for the bootstrap estimator, as a fraction of the data.
confidence_level: The confidence level of the confidence interval estimate.
num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
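The num_quantiles_to_discretize_cont_cols parameter describes splitting a numeric effect modifier into quantile bins so that a separate effect can be reported per bin. A minimal sketch of that idea follows; the rank-based binning, the function names, and the difference-in-means estimate per bin are assumptions for illustration, not the library's procedure.

```python
def quantile_bins(values, num_quantiles):
    """Assign each value a bin index in 0..num_quantiles-1 by its rank.

    Ties are split by position, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * num_quantiles // len(values)
    return bins


def conditional_effects(modifier, treatment, outcome, num_quantiles=4):
    """Difference in mean outcome (treated minus control) per quantile bin."""
    bins = quantile_bins(modifier, num_quantiles)
    effects = {}
    for b in range(num_quantiles):
        treated = [y for k, t, y in zip(bins, treatment, outcome)
                   if k == b and t == 1]
        control = [y for k, t, y in zip(bins, treatment, outcome)
                   if k == b and t == 0]
        if treated and control:  # skip bins missing one of the groups
            effects[b] = sum(treated) / len(treated) - sum(control) / len(control)
    return effects


# Toy data: the effect is 1 for low modifier values and 3 for high ones.
modifier = [0, 1, 2, 3, 4, 5, 6, 7]
treatment = [0, 1, 0, 1, 0, 1, 0, 1]
outcome = [10 + t * (1 if m < 4 else 3) for m, t in zip(modifier, treatment)]
print(conditional_effects(modifier, treatment, outcome, num_quantiles=2))
```

More quantiles give a finer picture of how the effect varies, but each bin then holds fewer units.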
dowhy.causal_estimators.two_stage_regression_estimator module
- class dowhy.causal_estimators.two_stage_regression_estimator.TwoStageRegressionEstimator(*args, **kwargs)[source]
Bases:
CausalEstimator
Compute treatment effect whenever the effect is fully mediated by another variable (front-door) or when there is an instrument available.
Currently only supports a linear model for the effects.
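For the instrument case, a linear two-stage procedure can be sketched as textbook two-stage least squares: regress the treatment on the instrument, then regress the outcome on the fitted treatment. The code below is an illustration with a single instrument and hypothetical function names; it does not reproduce the class's front-door (mediator) path or its internals.

```python
def simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("regressor has no variation")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(instrument, treatment, outcome):
    """Slope of outcome on the part of treatment explained by the instrument."""
    # Stage 1: predict the treatment from the instrument.
    intercept, slope = simple_regression(instrument, treatment)
    fitted_treatment = [intercept + slope * z for z in instrument]
    # Stage 2: regress the outcome on the predicted treatment.
    _, effect = simple_regression(fitted_treatment, outcome)
    return effect


# Toy data: u is an unobserved confounder, z an instrument balanced against u.
instrument = [0, 0, 1, 1]
unobserved = [0, 1, 0, 1]
treatment = [z + u for z, u in zip(instrument, unobserved)]
outcome = [2 * t + 5 * u for t, u in zip(treatment, unobserved)]
print(two_stage_least_squares(instrument, treatment, outcome))
print(simple_regression(treatment, outcome)[1])  # naive slope, for comparison
```

The second print shows the slope from regressing the outcome directly on the treatment, which absorbs the confounder's influence.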
Initializes an estimator with data and names of relevant variables.
This method is called from the constructors of its child classes.
- Parameters
data – data frame containing the data
identified_estimand – probability expression representing the target identified estimand to estimate.
treatment – name of the treatment variable
outcome – name of the outcome variable
control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.
treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.
test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=”bootstrap” that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.
evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect
confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals=”bootstrap”. The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.
target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.
effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.
params – (optional) Additional method parameters:
num_null_simulations: The number of simulations for testing the statistical significance of the estimator.
num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.
sample_size_fraction: The size of the sample for the bootstrap estimator, as a fraction of the data.
confidence_level: The confidence level of the confidence interval estimate.
num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.
- Returns
an instance of the estimator class.
- DEFAULT_FIRST_STAGE_MODEL
alias of
LinearRegressionEstimator
- DEFAULT_SECOND_STAGE_MODEL
alias of
LinearRegressionEstimator