LinearRegressionTrainingSummary#

class pyspark.ml.regression.LinearRegressionTrainingSummary(java_obj=None)#

Linear regression training results. Currently, the training summary ignores the training weights except for the objective trace.

New in version 2.0.0.
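A minimal sketch of reading a few fields from a training summary. It assumes `model` behaves like a fitted `LinearRegressionModel`, whose `hasSummary` and `summary` properties expose this object. The helper name `basic_fit_report` and the stand-in objects are hypothetical and exist only so the snippet runs without a Spark session.

```python
from types import SimpleNamespace


def basic_fit_report(model):
    """Collect a few headline metrics from a fitted model's training summary.

    `model` is assumed to behave like a fitted LinearRegressionModel:
    `model.hasSummary` is a bool and `model.summary` is the training summary.
    """
    if not model.hasSummary:
        return None
    s = model.summary
    return {
        "numInstances": s.numInstances,
        "rmse": s.rootMeanSquaredError,
        "r2": s.r2,
    }


# Stand-in for a fitted model (hypothetical values), used only for illustration.
fake_summary = SimpleNamespace(numInstances=100, rootMeanSquaredError=0.5, r2=0.9)
fake_model = SimpleNamespace(hasSummary=True, summary=fake_summary)
report = basic_fit_report(fake_model)
```

With a real model, the same function would be called as `basic_fit_report(model)` after `model = lr.fit(df)`.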

Attributes

coefficientStandardErrors

Standard error of estimated coefficients and intercept.

degreesOfFreedom

Degrees of freedom.

devianceResiduals

The weighted residuals, the usual residuals rescaled by the square root of the instance weights.

explainedVariance

Returns the explained variance regression score.

featuresCol

Field in "predictions" which gives the features of each instance as a vector.

labelCol

Field in "predictions" which gives the true label of each instance.

meanAbsoluteError

Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.

meanSquaredError

Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.

numInstances

Number of instances in the DataFrame predictions.

objectiveHistory

Objective function (scaled loss + regularization) at each iteration.

pValues

Two-sided p-value of estimated coefficients and intercept.

predictionCol

Field in "predictions" which gives the predicted value of the label at each instance.

predictions

DataFrame output by the model's transform method.

r2

Returns R^2, the coefficient of determination.

r2adj

Returns Adjusted R^2, the adjusted coefficient of determination.

residuals

Residuals (label - predicted value).

rootMeanSquaredError

Returns the root mean squared error, which is defined as the square root of the mean squared error.

tValues

T-statistic of estimated coefficients and intercept.

totalIterations

Number of training iterations until termination.

Attributes Documentation

coefficientStandardErrors#

Standard error of estimated coefficients and intercept. This value is only available when using the “normal” solver.

If LinearRegression.fitIntercept is set to True, then the last element returned corresponds to the intercept.

New in version 2.0.0.
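Because the intercept's entry comes last, it can help to label the values before reading them. The following is a sketch only: `label_statistics`, the feature names, and the numbers are hypothetical, and `values` stands in for a list such as `coefficientStandardErrors`.

```python
def label_statistics(feature_names, values, fit_intercept=True):
    """Pair each statistic with the term it belongs to.

    `values` is assumed to hold one entry per coefficient, followed by the
    intercept's entry when fit_intercept is True.
    """
    names = list(feature_names) + (["(intercept)"] if fit_intercept else [])
    if len(names) != len(values):
        raise ValueError("expected %d values, got %d" % (len(names), len(values)))
    return dict(zip(names, values))


# Hypothetical standard errors for two features plus an intercept.
labeled = label_statistics(["age", "income"], [0.12, 0.03, 1.5])
```

The same helper applies to any list on this page that follows the "intercept last" layout.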

degreesOfFreedom#

Degrees of freedom.

New in version 2.2.0.

devianceResiduals#

The weighted residuals, the usual residuals rescaled by the square root of the instance weights.

New in version 2.0.0.
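The rescaling described above can be sketched in plain Python. This illustrates only the per-instance arithmetic (residual times the square root of the weight); it makes no claim about the shape of the value the attribute returns. The function name and data are hypothetical.

```python
import math


def weighted_residuals(labels, predictions, weights):
    """Rescale each residual (label - prediction) by sqrt(instance weight)."""
    return [
        (y - y_hat) * math.sqrt(w)
        for y, y_hat, w in zip(labels, predictions, weights)
    ]


# Hypothetical data: the second instance has weight 4, so its residual doubles.
wr = weighted_residuals([3.0, 5.0], [2.0, 4.0], [1.0, 4.0])
```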

explainedVariance#

Returns the explained variance regression score. explainedVariance = \(1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)}\)

Notes

This ignores instance weights (setting all to 1.0) from LinearRegression.weightCol. This will change in later Spark versions.

For additional information, see Explained variation on Wikipedia.

New in version 2.0.0.
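As a worked illustration of the formula stated above, the sketch below evaluates it on a small hypothetical data set with every instance weighted equally. It follows the formula as written here and is not necessarily how Spark computes the value internally; the function names are assumptions.

```python
def variance(xs):
    """Population variance (every instance weighted equally)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def explained_variance(labels, predictions):
    """1 - Var(y - y_hat) / Var(y), following the formula in the text."""
    errors = [y - y_hat for y, y_hat in zip(labels, predictions)]
    return 1.0 - variance(errors) / variance(labels)


# Hypothetical labels and predictions; only the last prediction is off.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.0, 2.0, 3.0, 5.0]
score = explained_variance(y, y_hat)  # Var(errors) = 0.1875, Var(y) = 1.25
```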

featuresCol#

Field in “predictions” which gives the features of each instance as a vector.

New in version 2.0.0.

labelCol#

Field in “predictions” which gives the true label of each instance.

New in version 2.0.0.

meanAbsoluteError#

Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.

Notes

This ignores instance weights (setting all to 1.0) from LinearRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.

meanSquaredError#

Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.

Notes

This ignores instance weights (setting all to 1.0) from LinearRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.

numInstances#

Number of instances in the DataFrame predictions.

New in version 2.0.0.

objectiveHistory#

Objective function (scaled loss + regularization) at each iteration. This value is only available when using the “l-bfgs” solver.

New in version 2.0.0.
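One way to use the objective trace is to check how much each iteration still improves the objective. The sketch below assumes `objective_history` is a plain list of floats such as the one this attribute provides; the helper name and the sample values are hypothetical.

```python
def relative_improvements(objective_history):
    """Relative decrease of the objective between consecutive iterations."""
    return [
        (prev - curr) / abs(prev)
        for prev, curr in zip(objective_history, objective_history[1:])
        if prev != 0
    ]


# Hypothetical objective trace; in practice this would be summary.objectiveHistory.
history = [0.5, 0.25, 0.2, 0.2]
steps = relative_improvements(history)
```

A final entry close to zero suggests the optimizer had stopped making progress by the time it terminated.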

pValues#

Two-sided p-value of estimated coefficients and intercept. This value is only available when using the “normal” solver.

If LinearRegression.fitIntercept is set to True, then the last element returned corresponds to the intercept.

New in version 2.0.0.
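A common use of the p-values is to pick out the terms that pass a significance threshold. This is a sketch under assumptions: `significant_terms`, the threshold `alpha=0.05`, the term names, and the p-values are all hypothetical, with the intercept's p-value placed last as described above.

```python
def significant_terms(names, p_values, alpha=0.05):
    """Return the names whose two-sided p-value is below alpha."""
    return [name for name, p in zip(names, p_values) if p < alpha]


# Hypothetical p-values for two features followed by the intercept.
terms = significant_terms(["age", "income", "(intercept)"], [0.001, 0.4, 0.03])
```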

predictionCol#

Field in “predictions” which gives the predicted value of the label at each instance.

New in version 2.0.0.

predictions#

DataFrame output by the model’s transform method.

New in version 2.0.0.

r2#

Returns R^2, the coefficient of determination.

Notes

This ignores instance weights (setting all to 1.0) from LinearRegression.weightCol. This will change in later Spark versions.

See also the Wikipedia article on the coefficient of determination.

New in version 2.0.0.
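For reference, the textbook unweighted definition for a model with an intercept, R^2 = 1 - SS_res / SS_tot, can be sketched as follows. This is an illustration on hypothetical data, not a reproduction of Spark's internal computation; the function name is an assumption.

```python
def r_squared(labels, predictions):
    """Unweighted R^2 = 1 - SS_res / SS_tot (textbook definition)."""
    mean_y = sum(labels) / len(labels)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(labels, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in labels)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: SS_res = 1.0 and SS_tot = 5.0.
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```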

r2adj#

Returns Adjusted R^2, the adjusted coefficient of determination.

Notes

This ignores instance weights (setting all to 1.0) from LinearRegression.weightCol. This will change in later Spark versions.

See also the Wikipedia article on the coefficient of determination, section Adjusted R^2.

New in version 2.4.0.
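The usual textbook adjustment, for a model with an intercept fitted on n instances and p features, is 1 - (1 - R^2)(n - 1)/(n - p - 1). The sketch below evaluates that expression; it assumes an intercept is fitted and is not claimed to match Spark's handling of every configuration. The function and parameter names are hypothetical.

```python
def adjusted_r_squared(r2, num_instances, num_features):
    """Textbook adjusted R^2 for a model that includes an intercept term."""
    return 1.0 - (1.0 - r2) * (num_instances - 1) / (
        num_instances - num_features - 1
    )


# Hypothetical fit: R^2 = 0.8 on 11 instances with 2 features.
adj = adjusted_r_squared(0.8, num_instances=11, num_features=2)
```

The adjusted value is never larger than the plain R^2 when at least one feature is present, which is what makes it useful for comparing models with different numbers of features.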

residuals#

Residuals (label - predicted value).

New in version 2.0.0.

rootMeanSquaredError#

Returns the root mean squared error, which is defined as the square root of the mean squared error.

Notes

This ignores instance weights (setting all to 1.0) from LinearRegression.weightCol. This will change in later Spark versions.

New in version 2.0.0.
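The three error metrics on this page (meanAbsoluteError, meanSquaredError, rootMeanSquaredError) are related as in the following sketch, which treats every instance as having weight 1.0, in line with the notes above. The function name and data are hypothetical.

```python
import math


def error_metrics(labels, predictions):
    """Unweighted MAE, MSE and RMSE for paired labels and predictions."""
    errors = [y - y_hat for y, y_hat in zip(labels, predictions)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical data with a single error of size 2 among four instances.
metrics = error_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

Here the single large error pulls the RMSE (1.0) well above the MAE (0.5), which illustrates how squaring penalizes large errors more heavily.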

tValues#

T-statistic of estimated coefficients and intercept. This value is only available when using the “normal” solver.

If LinearRegression.fitIntercept is set to True, then the last element returned corresponds to the intercept.

New in version 2.0.0.
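Under the usual definition, a t-statistic for the null hypothesis that a term is zero is the estimate divided by its standard error. The sketch below applies that definition to hypothetical numbers; it assumes the estimates and standard errors are listed in the same order, with the intercept last.

```python
def t_statistics(estimates, standard_errors):
    """Estimate divided by standard error, element by element."""
    return [b / se for b, se in zip(estimates, standard_errors)]


# Hypothetical estimates and standard errors (two coefficients, then intercept).
t = t_statistics([2.0, -0.5, 3.0], [0.5, 0.25, 1.5])
```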

totalIterations#

Number of training iterations until termination. This value is only available when using the “l-bfgs” solver.

New in version 2.0.0.