dowhy.causal_estimators package

Submodules

dowhy.causal_estimators.causalml module

class dowhy.causal_estimators.causalml.Causalml(*args, **kwargs)[source]

Bases: CausalEstimator

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.
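
The bootstrap options named in the parameters above (num_simulations, sample_size_fraction and confidence_level) can be illustrated with a small percentile bootstrap. This is a minimal standard-library sketch, not the library's implementation: the function name, the default values and the per-unit effect data are hypothetical, and sample_size_fraction is assumed to mean a fraction of the input size.

```python
import random
import statistics


def bootstrap_confidence_interval(values, estimator, num_simulations=200,
                                  sample_size_fraction=1.0,
                                  confidence_level=0.95, seed=0):
    """Percentile bootstrap interval for estimator(values) (illustrative only)."""
    rng = random.Random(seed)
    # Assumption: the fraction scales the number of rows drawn per resample.
    sample_size = max(1, int(len(values) * sample_size_fraction))
    estimates = []
    for _ in range(num_simulations):
        # Draw with replacement and re-run the estimator on the resample.
        resample = [rng.choice(values) for _ in range(sample_size)]
        estimates.append(estimator(resample))
    estimates.sort()
    alpha = 1.0 - confidence_level
    lower_idx = int((alpha / 2) * (num_simulations - 1))
    upper_idx = int((1 - alpha / 2) * (num_simulations - 1))
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical per-unit effects, used only to exercise the function.
unit_effects = [1.8, 2.1, 2.4, 1.9, 2.0, 2.2, 2.3, 1.7, 2.0, 2.1]
low, high = bootstrap_confidence_interval(unit_effects, statistics.mean)
print(low, high)
```

Raising num_simulations smooths the interval endpoints at the cost of more estimator runs.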

construct_symbolic_estimator(estimand)[source]

dowhy.causal_estimators.distance_matching_estimator module

class dowhy.causal_estimators.distance_matching_estimator.DistanceMatchingEstimator(*args, **kwargs)[source]

Bases: CausalEstimator

Simple matching estimator for binary treatments based on a distance metric.

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.

Valid_Dist_Metric_Params = ['p', 'V', 'VI', 'w']

construct_symbolic_estimator(estimand)[source]
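
The idea behind DistanceMatchingEstimator, matching units on a distance metric for a binary treatment, can be sketched in a few lines. The sketch below is an assumption-laden illustration rather than the class's actual code: it uses a Minkowski distance with exponent p (assumed to correspond to the 'p' entry in Valid_Dist_Metric_Params), one-nearest-neighbour matching with replacement, and made-up data.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance between two covariate tuples (p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def matched_att(units, p=2):
    """Effect on the treated via 1-nearest-neighbour matching.

    units: list of (treated, covariates, outcome) tuples (hypothetical layout).
    """
    treated = [u for u in units if u[0]]
    controls = [u for u in units if not u[0]]
    differences = []
    for _, x_treated, y_treated in treated:
        # Closest control unit in covariate space, matched with replacement.
        nearest = min(controls,
                      key=lambda c: minkowski_distance(x_treated, c[1], p))
        differences.append(y_treated - nearest[2])
    return sum(differences) / len(differences)


# Made-up data: each treated unit has one obviously close control.
units = [
    (True, (1.0, 0.0), 5.0),
    (True, (3.0, 1.0), 9.0),
    (False, (1.1, 0.1), 3.0),
    (False, (2.9, 1.0), 7.0),
    (False, (10.0, 5.0), 20.0),
]
att = matched_att(units)
print(att)
```

The distant control at (10.0, 5.0) is never selected, which is the point of matching: poorly comparable units do not contribute to the estimate.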

dowhy.causal_estimators.econml module

class dowhy.causal_estimators.econml.Econml(*args, **kwargs)[source]

Bases: CausalEstimator

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.
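
The test_significance parameter above mentions a p-value obtained from num_null_simulations simulated null estimates. One way such a test can work is to shuffle the outcome so that any real effect is destroyed, re-estimate, and count how often the null estimate is at least as extreme as the observed one. The sketch below assumes that permutation scheme and a simple difference in means as the estimator; neither is confirmed by the text above, and the data are made up.

```python
import random


def difference_in_means(treatment, outcome):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for t, y in zip(treatment, outcome) if t == 1]
    control = [y for t, y in zip(treatment, outcome) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def null_simulation_p_value(treatment, outcome, num_null_simulations=1000, seed=0):
    """Two-sided p-value from estimates computed on shuffled outcomes."""
    rng = random.Random(seed)
    observed = difference_in_means(treatment, outcome)
    shuffled = list(outcome)
    extreme = 0
    for _ in range(num_null_simulations):
        # Shuffling breaks any link between treatment and outcome.
        rng.shuffle(shuffled)
        if abs(difference_in_means(treatment, shuffled)) >= abs(observed):
            extreme += 1
    return extreme / num_null_simulations


# Made-up data with a clear gap between treated and control outcomes.
treatment = [1] * 6 + [0] * 6
outcome = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 3.0, 3.2, 2.9, 3.1, 2.8, 3.0]
p_value = null_simulation_p_value(treatment, outcome)
print(p_value)
```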

construct_symbolic_estimator(estimand)[source]

dowhy.causal_estimators.generalized_linear_model_estimator module

class dowhy.causal_estimators.generalized_linear_model_estimator.GeneralizedLinearModelEstimator(*args, **kwargs)[source]

Bases: RegressionEstimator

Compute effect of treatment using a generalized linear model such as logistic regression.

Implementation uses statsmodels.api.GLM. Needs an additional parameter, “glm_family” to be specified in method_params. The value of this parameter can be any valid statsmodels.api families object. For example, to use logistic regression, specify “glm_family” as statsmodels.api.families.Binomial().
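
To show how a fitted generalized linear model can yield a treatment effect, the sketch below compares average predictions with the treatment set to the treated value and to the control value under a logistic link. It is a hand-rolled illustration, not the statsmodels-based implementation: the coefficients are hypothetical stand-ins for a fitted model, and the prediction-difference approach is an assumption about how the effect is read off.

```python
import math


def sigmoid(z):
    """Inverse of the logit link used by logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def average_prediction(confounders, treatment, intercept, beta_t, beta_w):
    """Mean predicted outcome probability with treatment fixed for every unit."""
    predictions = [
        sigmoid(intercept + beta_t * treatment + beta_w * w) for w in confounders
    ]
    return sum(predictions) / len(predictions)


# Hypothetical confounder values and coefficients (stand-ins for a fitted GLM).
confounders = [-1.0, -0.5, 0.0, 0.5, 1.0]
coefficients = {"intercept": -0.2, "beta_t": 1.0, "beta_w": 0.8}

# Effect = average prediction at treatment_value=1 minus at control_value=0.
effect = (average_prediction(confounders, 1, **coefficients)
          - average_prediction(confounders, 0, **coefficients))
print(effect)
```

Unlike linear regression, the effect here is not equal to the treatment coefficient, because the link function is nonlinear and the difference depends on the confounder values.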

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

dowhy.causal_estimators.instrumental_variable_estimator module

class dowhy.causal_estimators.instrumental_variable_estimator.InstrumentalVariableEstimator(*args, **kwargs)[source]

Bases: CausalEstimator

Compute effect of treatment using the instrumental variables method.

This is also a superclass that can be inherited by other specific methods.

Supports additional parameters that can be specified in the estimate_effect() method.

  • ‘iv_instrument_name’: Name of the specific instrumental variable to be used. Needs to be one of the IVs identified in the identification step. Default is to use all the IV variables from the identification step.
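
For a binary instrument, the instrumental variables idea is often expressed as the Wald estimator: the instrument's effect on the outcome divided by its effect on the treatment. The sketch below illustrates that ratio on made-up data. It is an assumption that this matches what the class computes for a given instrument, and the function and variable names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def wald_estimate(instrument, treatment, outcome):
    """Ratio of the instrument's effect on Y to its effect on T (binary instrument)."""
    y_z1 = [y for z, y in zip(instrument, outcome) if z == 1]
    y_z0 = [y for z, y in zip(instrument, outcome) if z == 0]
    t_z1 = [t for z, t in zip(instrument, treatment) if z == 1]
    t_z0 = [t for z, t in zip(instrument, treatment) if z == 0]
    return (mean(y_z1) - mean(y_z0)) / (mean(t_z1) - mean(t_z0))


# Made-up data: the outcome is 3 * treatment plus a term unrelated to the instrument.
instrument = [0, 0, 0, 0, 1, 1, 1, 1]
treatment = [0, 0, 0, 1, 1, 1, 1, 0]
background = [1, 2, 1, 2, 1, 2, 1, 2]
outcome = [3 * t + b for t, b in zip(treatment, background)]

effect = wald_estimate(instrument, treatment, outcome)
print(effect)
```

The denominator is the instrument's effect on the treatment. When it is close to zero (a weak instrument), the ratio becomes unstable.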

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

dowhy.causal_estimators.linear_regression_estimator module

class dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator(*args, **kwargs)[source]

Bases: RegressionEstimator

Compute effect of treatment using linear regression.

Fits a regression model for estimating the outcome using treatment(s) and confounders. For a univariate treatment, the treatment effect is equivalent to the coefficient of the treatment variable.

Simple method to show the implementation of a causal inference method that can handle multiple treatments and heterogeneity in treatment. Requires a strong assumption that all relationships from (T, W) to Y are linear.
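
The statement that, for a univariate treatment, the effect equals the coefficient of the treatment variable can be checked on a toy example. The sketch below recovers that coefficient by partialling the confounder W out of both T and Y (the Frisch-Waugh-Lovell approach), which gives the same coefficient as regressing Y on T and W jointly. It is a standard-library illustration with made-up data, not the estimator's own code.

```python
def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Ordinary least squares slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance = sum((xi - x_bar) ** 2 for xi in x)
    return covariance / variance


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def treatment_coefficient(t, w, y):
    """Coefficient of t in the regression y ~ t + w, via partialling out w."""
    return slope(residuals(w, t), residuals(w, y))


# Made-up data generated as y = 1 + 2*t + 3*w, so the true effect of t is 2.
w = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
t = [1.0, 0.0, 2.0, 1.0, 3.0, 2.0]
y = [1 + 2 * ti + 3 * wi for ti, wi in zip(t, w)]

adjusted = treatment_coefficient(t, w, y)
naive = slope(t, y)  # ignores the confounder w
print(adjusted, naive)
```

The naive slope differs from 2 because W influences both T and Y. Adjusting for W recovers the true coefficient, provided the linearity assumption holds.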

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

dowhy.causal_estimators.propensity_score_estimator module

class dowhy.causal_estimators.propensity_score_estimator.PropensityScoreEstimator(*args, propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]

Bases: CausalEstimator

Base class for estimators that estimate effects based on propensity of treatment assignment.

Supports additional parameters that can be specified in the estimate_effect() method.

  • ‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None

  • ‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True

  • ‘propensity_score_column’: column name that stores the propensity score. Default=‘propensity_score’

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

A symbolic string that conveys what each estimator does. For instance, linear regression is expressed as y ~ bx + e

dowhy.causal_estimators.propensity_score_matching_estimator module

class dowhy.causal_estimators.propensity_score_matching_estimator.PropensityScoreMatchingEstimator(*args, propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]

Bases: PropensityScoreEstimator

Estimate effect of treatment by finding matching treated and control units based on propensity score.

Straightforward application of the back-door criterion.

Supports additional parameters that can be specified in the estimate_effect() method.

  • ‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None

  • ‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True

  • ‘propensity_score_column’: column name that stores the propensity score. Default=‘propensity_score’
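
Matching on the propensity score can be sketched with precomputed scores, roughly the situation described by recalculate_propensity_score=False. The example below pairs each treated row with the control row whose score is closest and averages the outcome differences. It is an illustrative sketch only: the row layout, the "treated" and "y" keys, and the one-nearest-neighbour rule with replacement are assumptions, and the data are made up.

```python
def att_by_score_matching(rows, score_column="propensity_score"):
    """Effect on the treated via nearest-neighbour matching on a score column."""
    treated = [r for r in rows if r["treated"] == 1]
    controls = [r for r in rows if r["treated"] == 0]
    differences = []
    for unit in treated:
        # Control row with the closest propensity score (with replacement).
        match = min(controls,
                    key=lambda c: abs(c[score_column] - unit[score_column]))
        differences.append(unit["y"] - match["y"])
    return sum(differences) / len(differences)


# Made-up rows with a precomputed propensity score column.
rows = [
    {"treated": 1, "propensity_score": 0.80, "y": 10.0},
    {"treated": 1, "propensity_score": 0.55, "y": 8.0},
    {"treated": 0, "propensity_score": 0.78, "y": 7.0},
    {"treated": 0, "propensity_score": 0.50, "y": 6.0},
    {"treated": 0, "propensity_score": 0.10, "y": 1.0},
]
att = att_by_score_matching(rows)
print(att)
```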

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance=“bootstrap”, which estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations, that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by setting confidence_intervals=“bootstrap”. The bootstrap method takes two arguments (num_simulations and sample_size_fraction) that can optionally be specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of the conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

A symbolic string that conveys what each estimator does. For instance, linear regression is expressed as y ~ bx + e

dowhy.causal_estimators.propensity_score_stratification_estimator module

class dowhy.causal_estimators.propensity_score_stratification_estimator.PropensityScoreStratificationEstimator(*args, num_strata='auto', clipping_threshold=10, propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]

Bases: PropensityScoreEstimator

Estimate effect of treatment by stratifying the data into bins with identical common causes.

Straightforward application of the back-door criterion.

Supports additional parameters that can be specified in the estimate_effect() method.

  • ‘num_strata’: Number of bins by which data will be stratified. Default=‘auto’, as given in the class signature.

  • ‘clipping_threshold’: Minimum number of treated or control units per stratum. Default=10

  • ‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None

  • ‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True

  • ‘propensity_score_column’: Name of the column that stores the propensity score. Default=‘propensity_score’
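
As a rough illustration of how num_strata and clipping_threshold interact, the sketch below stratifies units on a precomputed propensity score and averages the within-stratum differences. It assumes equal-frequency strata, drops any stratum with too few treated or control units, and weights the remaining strata by size. These choices, the function name and the toy data are assumptions for illustration, not a description of DoWhy's internals.

```python
def stratified_effect(rows, num_strata=5, clipping_threshold=2):
    """rows: (treatment, outcome, propensity_score) tuples, treatment in {0, 1}."""
    ordered = sorted(rows, key=lambda row: row[2])
    n = len(ordered)
    # Equal-frequency strata over the sorted propensity scores.
    strata = [ordered[(k * n) // num_strata:((k + 1) * n) // num_strata]
              for k in range(num_strata)]
    weighted_sum = 0.0
    kept_units = 0
    for stratum in strata:
        treated = [y for t, y, _ in stratum if t == 1]
        control = [y for t, y, _ in stratum if t == 0]
        # Skip strata with too few treated or control units.
        if len(treated) < clipping_threshold or len(control) < clipping_threshold:
            continue
        difference = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += difference * len(stratum)
        kept_units += len(stratum)
    if kept_units == 0:
        raise ValueError("no stratum satisfies the clipping threshold")
    return weighted_sum / kept_units


# Toy data: the outcome depends on the propensity score (confounding),
# and the true treatment effect is 2.0 in every stratum.
rows = []
for k in range(5):
    ps = (k + 2) / 8
    rows += [(1, 2.0 + 10 * ps, ps)] * (k + 2)   # treated units
    rows += [(0, 10 * ps, ps)] * (6 - k)         # control units

treated_outcomes = [y for t, y, _ in rows if t == 1]
control_outcomes = [y for t, y, _ in rows if t == 0]
naive_effect = (sum(treated_outcomes) / len(treated_outcomes)
                - sum(control_outcomes) / len(control_outcomes))
effect = stratified_effect(rows, num_strata=5, clipping_threshold=2)
```

On this toy data the unadjusted difference in means is biased upward, while the stratified estimate recovers the effect built into the data.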

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap" that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

A symbolic string that conveys what each estimator does. For instance, linear regression is expressed as y ~ bx + e

dowhy.causal_estimators.propensity_score_weighting_estimator module

class dowhy.causal_estimators.propensity_score_weighting_estimator.PropensityScoreWeightingEstimator(*args, min_ps_score=0.05, max_ps_score=0.95, weighting_scheme='ips_weight', propensity_score_model=None, recalculate_propensity_score=True, propensity_score_column='propensity_score', **kwargs)[source]

Bases: PropensityScoreEstimator

Estimate effect of treatment by weighting the data by inverse probability of occurrence.

Straightforward application of the back-door criterion.

Supports additional parameters that can be specified in the estimate_effect() method.

  • ‘min_ps_score’: Lower bound used to clip the propensity score. Default=0.05

  • ‘max_ps_score’: Upper bound used to clip the propensity score. Default=0.95

  • ‘weighting_scheme’: Name of the weighting method to use. Can be inverse propensity score (“ips_weight”, default), stabilized IPS score (“ips_stabilized_weight”), or normalized IPS score (“ips_normalized_weight”).

  • ‘propensity_score_model’: The model used to compute propensity score. Could be any classification model that supports fit() and predict_proba() methods. If None, use LogisticRegression model as the default. Default=None

  • ‘recalculate_propensity_score’: If true, force the estimator to calculate the propensity score. To use pre-computed propensity score, set this value to false. Default=True

  • ‘propensity_score_column’: Name of the column that stores the propensity score. Default=‘propensity_score’
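
The effect of min_ps_score and max_ps_score, and the difference between plain and normalized inverse propensity weights, can be sketched as follows. The formulas are the textbook Horvitz-Thompson (plain) and Hajek (normalized) forms for the average treatment effect. Whether they match DoWhy's “ips_weight” and “ips_normalized_weight” schemes term for term is an assumption, and the stabilized scheme is not shown. Function names and toy data are hypothetical.

```python
def clip_propensity(ps, min_ps_score=0.05, max_ps_score=0.95):
    """Bound a propensity score so that its inverse weight cannot explode."""
    return min(max(ps, min_ps_score), max_ps_score)


def ipw_ate(rows, min_ps_score=0.05, max_ps_score=0.95, normalized=False):
    """rows: (treatment, outcome, propensity_score) tuples, treatment in {0, 1}."""
    treated_total = control_total = 0.0
    treated_weight = control_weight = 0.0
    for t, y, ps in rows:
        ps = clip_propensity(ps, min_ps_score, max_ps_score)
        if t == 1:
            weight = 1.0 / ps            # treated units weighted by 1 / e(x)
            treated_total += weight * y
            treated_weight += weight
        else:
            weight = 1.0 / (1.0 - ps)    # controls weighted by 1 / (1 - e(x))
            control_total += weight * y
            control_weight += weight
    if normalized:
        # Hajek form: divide by the sum of weights in each group.
        return treated_total / treated_weight - control_total / control_weight
    # Horvitz-Thompson form: divide by the total number of units.
    return treated_total / len(rows) - control_total / len(rows)


# Toy data with a true effect of 2.0 and confounding through the score.
rows = []
for k in range(5):
    ps = (k + 2) / 8
    rows += [(1, 2.0 + 10 * ps, ps)] * (k + 2)
    rows += [(0, 10 * ps, ps)] * (6 - k)

plain_estimate = ipw_ate(rows)
normalized_estimate = ipw_ate(rows, normalized=True)
```

Clipping trades a small bias for lower variance: a score of 0.01 would otherwise give a treated unit a weight of 100.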

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap" that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

A symbolic string that conveys what each estimator does. For instance, linear regression is expressed as y ~ bx + e

dowhy.causal_estimators.regression_discontinuity_estimator module

class dowhy.causal_estimators.regression_discontinuity_estimator.RegressionDiscontinuityEstimator(*args, **kwargs)[source]

Bases: CausalEstimator

Compute effect of treatment using the regression discontinuity method.

Estimates effect by transforming the problem to an instrumental variables problem.

Supports additional parameters that can be specified in the estimate_effect() method.

  • ‘rd_variable_name’: name of the variable on which the discontinuity occurs. This is the instrument.

  • ‘rd_threshold_value’: Threshold at which the discontinuity occurs.

  • ‘rd_bandwidth’: Distance from the threshold within which confounders can be considered the same between treatment and control. The band considered is threshold ± bandwidth.
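
To make the three parameters concrete, the sketch below keeps only units within rd_bandwidth of rd_threshold_value, treats "at or above the threshold" as a binary instrument, and applies the Wald ratio used in instrumental-variable estimation. Whether the band is inclusive, which side of the threshold counts as "above", and the function name are all assumptions for illustration. DoWhy's estimator may differ in these details.

```python
def mean(values):
    return sum(values) / len(values)


def rd_wald_estimate(rows, rd_threshold_value, rd_bandwidth):
    """rows: (running_variable, treatment, outcome) tuples."""
    # Keep units inside the band threshold +/- bandwidth (inclusive here).
    band = [row for row in rows if abs(row[0] - rd_threshold_value) <= rd_bandwidth]
    above = [row for row in band if row[0] >= rd_threshold_value]
    below = [row for row in band if row[0] < rd_threshold_value]
    if not above or not below:
        raise ValueError("band must contain units on both sides of the threshold")
    # Jump in the outcome and in treatment uptake across the threshold.
    outcome_jump = mean([y for _, _, y in above]) - mean([y for _, _, y in below])
    treatment_jump = mean([t for _, t, _ in above]) - mean([t for _, t, _ in below])
    if treatment_jump == 0:
        raise ValueError("treatment uptake does not change at the threshold")
    # Wald ratio: outcome jump per unit of treatment-uptake jump.
    return outcome_jump / treatment_jump


# Toy fuzzy design: most, but not all, units at or above 50 are treated,
# and treatment raises the outcome by 3.0.
rows = []
for x in range(40, 61):
    t = 1 if (x >= 50 and x % 4 != 0) else 0
    rows.append((x, t, 1.0 + 3.0 * t))

rd_effect = rd_wald_estimate(rows, rd_threshold_value=50, rd_bandwidth=5)
```

A narrower band makes the "same confounders on both sides" assumption more plausible but leaves fewer units to estimate from.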

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap" that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of conditional treatment effect over it.

Returns

an instance of the estimator class.

construct_symbolic_estimator(estimand)[source]

dowhy.causal_estimators.regression_estimator module

class dowhy.causal_estimators.regression_estimator.RegressionEstimator(*args, **kwargs)[source]

Bases: CausalEstimator

Compute effect of treatment using some regression function.

Fits a regression model for estimating the outcome using treatment(s) and confounders.
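
For the linear case, the idea can be sketched with one treatment and one confounder: fit ordinary least squares on both, then scale the treatment coefficient by the difference between treatment_value and control_value. The closed-form two-regressor fit, the function names and the toy data below are assumptions for illustration, not the library's code.

```python
def centered_cross_product(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))


def fit_two_regressor_ols(outcome, treatment, confounder):
    """Return OLS coefficients (b_treatment, b_confounder); intercept is implicit."""
    s_tt = centered_cross_product(treatment, treatment)
    s_ww = centered_cross_product(confounder, confounder)
    s_tw = centered_cross_product(treatment, confounder)
    s_ty = centered_cross_product(treatment, outcome)
    s_wy = centered_cross_product(confounder, outcome)
    determinant = s_tt * s_ww - s_tw ** 2
    if determinant == 0:
        raise ValueError("treatment and confounder are collinear")
    b_treatment = (s_ww * s_ty - s_tw * s_wy) / determinant
    b_confounder = (s_tt * s_wy - s_tw * s_ty) / determinant
    return b_treatment, b_confounder


# Toy data: outcome = 1 + 2 * treatment + 3 * confounder, and the confounder
# is correlated with treatment, so the unadjusted slope is biased.
treatment = [0, 0, 1, 1, 0, 1, 1, 1]
confounder = [0, 1, 0, 1, 2, 2, 3, 3]
outcome = [1 + 2 * t + 3 * w for t, w in zip(treatment, confounder)]

b_treatment, b_confounder = fit_two_regressor_ols(outcome, treatment, confounder)
control_value, treatment_value = 0, 1
effect = b_treatment * (treatment_value - control_value)

# Slope from regressing the outcome on treatment alone, for comparison.
naive_slope = (centered_cross_product(treatment, outcome)
               / centered_cross_product(treatment, treatment))
```

Including the confounder as a regressor is what turns the fitted treatment coefficient into a back-door adjusted estimate under the linearity assumption.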

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap" that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of conditional treatment effect over it.

Returns

an instance of the estimator class.

dowhy.causal_estimators.two_stage_regression_estimator module

class dowhy.causal_estimators.two_stage_regression_estimator.TwoStageRegressionEstimator(*args, **kwargs)[source]

Bases: CausalEstimator

Compute treatment effect whenever the effect is fully mediated by another variable (front-door) or when there is an instrument available.

Currently only supports a linear model for the effects.
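
The instrument case can be illustrated with a generic two-stage least squares sketch: regress the treatment on the instrument, then regress the outcome on the fitted treatment. This is the textbook procedure under a linear model, shown with hypothetical helper names and toy data. It is not taken from the estimator's source, and the front-door (mediator) case is not covered here.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(instrument, treatment, outcome):
    # First stage: predict treatment from the instrument.
    intercept_1, slope_1 = simple_ols(instrument, treatment)
    predicted_treatment = [intercept_1 + slope_1 * z for z in instrument]
    # Second stage: regress the outcome on the predicted treatment.
    _, slope_2 = simple_ols(predicted_treatment, outcome)
    return slope_2


# Toy data: an unobserved confounder drives both treatment and outcome,
# the instrument is uncorrelated with it, and the true effect is 3.0.
instrument = [0, 1] * 4
confounder = [-1, -1, 1, 1] * 2
treatment = [2 * z + u for z, u in zip(instrument, confounder)]
outcome = [3 * t + 4 * u for t, u in zip(treatment, confounder)]

iv_effect = two_stage_least_squares(instrument, treatment, outcome)
_, naive_slope = simple_ols(treatment, outcome)
```

Because the second stage only uses the variation in treatment that the instrument explains, the confounder's contribution drops out, while the direct regression of outcome on treatment stays biased.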

Initializes an estimator with data and names of relevant variables.

This method is called from the constructors of its child classes.

Parameters
  • data – data frame containing the data

  • identified_estimand – probability expression representing the target identified estimand to estimate.

  • treatment – name of the treatment variable

  • outcome – name of the outcome variable

  • control_value – Value of the treatment in the control group, for effect estimation. If treatment is multi-variate, this can be a list.

  • treatment_value – Value of the treatment in the treated group, for effect estimation. If treatment is multi-variate, this can be a list.

  • test_significance – Binary flag or a string indicating whether to test significance and by which method. All estimators support test_significance="bootstrap" that estimates a p-value for the obtained estimate using the bootstrap method. Individual estimators can override this to support custom testing methods. The bootstrap method supports an optional parameter, num_null_simulations that can be specified through the params dictionary. If False, no testing is done. If True, significance of the estimate is tested using the custom method if available, otherwise by bootstrap.

  • evaluate_effect_strength – (Experimental) whether to evaluate the strength of effect

  • confidence_intervals – Binary flag or a string indicating whether the confidence intervals should be computed and which method should be used. All methods support estimation of confidence intervals using the bootstrap method by using the parameter confidence_intervals="bootstrap". The bootstrap method takes in two arguments (num_simulations and sample_size_fraction) that can be optionally specified in the params dictionary. Estimators may also override this to implement their own confidence interval method. If this parameter is False, no confidence intervals are computed. If True, confidence intervals are computed by the estimator’s specific method if available, otherwise through bootstrap.

  • target_units – The units for which the treatment effect should be estimated. This can be a string for common specifications of target units (namely, “ate”, “att” and “atc”). It can also be a lambda function that can be used as an index for the data (pandas DataFrame). Alternatively, it can be a new DataFrame that contains values of the effect_modifiers and effect will be estimated only for this new data.

  • effect_modifiers – Variables on which to compute separate effects, or return a heterogeneous effect function. Not all methods support this currently.

  • params – (optional) Additional method parameters:

      • num_null_simulations: The number of simulations for testing the statistical significance of the estimator.

      • num_simulations: The number of simulations for finding the confidence interval (and/or standard error) for an estimate.

      • sample_size_fraction: The size of the sample for the bootstrap estimator.

      • confidence_level: The confidence level of the confidence interval estimate.

      • num_quantiles_to_discretize_cont_cols: The number of quantiles into which a numeric effect modifier is split, to enable estimation of conditional treatment effect over it.

Returns

an instance of the estimator class.

DEFAULT_FIRST_STAGE_MODEL

alias of LinearRegressionEstimator

DEFAULT_SECOND_STAGE_MODEL

alias of LinearRegressionEstimator

build_first_stage_features()[source]

construct_symbolic_estimator(first_stage_symbolic, second_stage_symbolic, total_effect_symbolic=None, estimand_type=None)[source]

Module contents

dowhy.causal_estimators.get_class_object(method_name, *args, **kwargs)[source]