LassoWithSGD
- class pyspark.mllib.regression.LassoWithSGD
Train a regression model with L1-regularization using Stochastic Gradient Descent.
New in version 0.9.0.
Deprecated since version 2.0.0: Use pyspark.ml.regression.LinearRegression with elasticNetParam = 1.0. Note that the default regParam is 0.01 for LassoWithSGD, but 0.0 for LinearRegression.
Methods
- train(data[, iterations, step, regParam, ...]): Train a regression model with L1-regularization using Stochastic Gradient Descent.
Methods Documentation
- classmethod train(data, iterations=100, step=1.0, regParam=0.01, miniBatchFraction=1.0, initialWeights=None, intercept=False, validateData=True, convergenceTol=0.001)
Train a regression model with L1-regularization using Stochastic Gradient Descent. This solves the L1-regularized least squares regression formulation:
f(weights) = 1/(2n) ||A weights - y||^2 + regParam ||weights||_1
Here the data matrix A has n rows, and the input RDD holds the set of rows of A, each with its corresponding right-hand side label y. See also the documentation for the precise formulation.
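As a rough illustration of the objective above, the following plain-Python sketch evaluates f(weights) for a small in-memory dataset. It does not use Spark. The function name `lasso_objective` and the (label, features) tuple layout are assumptions made for this example only.

```python
def lasso_objective(rows, weights, reg_param=0.01):
    """Evaluate f(weights) = 1/(2n) * ||A weights - y||^2 + regParam * ||weights||_1.

    rows is a list of (label, features) pairs, one pair per row of A.
    (Hypothetical helper for illustration; not part of pyspark.)
    """
    n = len(rows)
    # Sum of squared residuals (A weights - y)_i^2 over all rows.
    squared_error = sum(
        (sum(w * x for w, x in zip(weights, features)) - label) ** 2
        for label, features in rows
    )
    # L1 norm of the weight vector.
    l1_norm = sum(abs(w) for w in weights)
    return squared_error / (2 * n) + reg_param * l1_norm


# Two rows, two features: residuals are 0 and -1 for weights [1.0, 1.0].
example_rows = [(1.0, [1.0, 0.0]), (2.0, [0.0, 1.0])]
print(lasso_objective(example_rows, [1.0, 1.0], reg_param=0.01))
```

The penalty term grows with the absolute size of every weight, which is why larger regParam values tend to push small weights to exactly zero.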
New in version 0.9.0.
- Parameters
- data : pyspark.RDD
The training data, an RDD of LabeledPoint.
- iterations : int, optional
The number of iterations. (default: 100)
- step : float, optional
The step parameter used in SGD. (default: 1.0)
- regParam : float, optional
The regularizer parameter. (default: 0.01)
- miniBatchFraction : float, optional
Fraction of data to be used for each SGD iteration. (default: 1.0)
- initialWeights : pyspark.mllib.linalg.Vector or convertible, optional
The initial weights. (default: None)
- intercept : bool, optional
Boolean parameter which indicates the use or not of the augmented representation for training data (i.e. whether bias features are activated or not). (default: False)
- validateData : bool, optional
Boolean parameter which indicates if the algorithm should validate data before training. (default: True)
- convergenceTol : float, optional
A condition which decides iteration termination. (default: 0.001)
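To make the roles of these parameters more concrete, here is a minimal single-machine sketch of a mini-batch SGD loop with an L1 penalty. It is not Spark's implementation: the step decay (step / sqrt(t)), the soft-thresholding update, and the relative-change stopping rule are assumptions chosen for illustration, and `lasso_sgd`, `soft_threshold`, and the `seed` argument are hypothetical names that do not exist in pyspark. The intercept and validateData options are left out for brevity.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_sgd(rows, iterations=100, step=1.0, reg_param=0.01,
              mini_batch_fraction=1.0, initial_weights=None,
              convergence_tol=0.001, seed=42):
    """Fit weights on rows, a list of (label, features) pairs.

    Hypothetical illustration only; parameter names mirror the
    documented ones but the update rule is an assumption.
    """
    rng = random.Random(seed)
    dim = len(rows[0][1])
    weights = list(initial_weights) if initial_weights is not None else [0.0] * dim
    for t in range(1, iterations + 1):
        # miniBatchFraction: each row joins the batch with this probability;
        # 1.0 means every row is used in every iteration.
        batch = [row for row in rows if rng.random() < mini_batch_fraction]
        if not batch:
            continue
        # Gradient of the least squares term, averaged over the batch.
        grad = [0.0] * dim
        for label, features in batch:
            error = sum(w * x for w, x in zip(weights, features)) - label
            for j, x in enumerate(features):
                grad[j] += error * x / len(batch)
        # step: assumed to decay with the iteration count.
        step_t = step / math.sqrt(t)
        # regParam: applied through soft-thresholding after the gradient step.
        new_weights = [soft_threshold(w - step_t * g, step_t * reg_param)
                       for w, g in zip(weights, grad)]
        # convergenceTol: stop once the weights barely move (assumed rule).
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_weights, weights)))
        norm = math.sqrt(sum(w * w for w in new_weights))
        weights = new_weights
        if diff < convergence_tol * max(norm, 1.0):
            break
    return weights


# Labels depend only on the first feature (y = 2 * x0); the second is irrelevant.
rows = [(2.0, [1.0, 1.0]), (2.0, [1.0, -1.0]),
        (-2.0, [-1.0, 1.0]), (-2.0, [-1.0, -1.0])]
print(lasso_sgd(rows))                 # default regParam
print(lasso_sgd(rows, reg_param=3.0))  # heavy regularization
```

With a SparkContext available, the corresponding library call would be along the lines of LassoWithSGD.train(rdd, iterations=100, step=1.0, regParam=0.01), where rdd is an RDD of LabeledPoint as described for the data parameter.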