Basis Functions

Basis functions control how smooth terms behave in a GAM. Different basis types suit different kinds of data and modeling requirements:

ThinPlateSpline

ThinPlateSpline(*, shrinkage: bool | None = False, m: int | None = None)

Thin plate regression spline basis.

Parameters:

  • shrinkage (bool | None, default: False ) –

    If True, the penalty is modified so that the term is shrunk to zero for a high enough smoothing parameter.

  • m (int | None, default: None ) –

    The order of the derivative in the thin plate spline penalty. If \(d\) is the number of covariates for the smooth term, this must satisfy \(m>(d+1)/2\). If left as None, the smallest value satisfying \(m>(d+1)/2\) is used, which produces "visually smooth" functions.
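
The constraint on m can be checked with integer arithmetic, since \(m>(d+1)/2\) is equivalent to \(2m>d+1\). The following is a minimal sketch of that rule only. The helper names are illustrative and are not part of the library.

```python
def default_tps_order(d: int) -> int:
    """Smallest integer m with m > (d + 1) / 2, i.e. 2 * m > d + 1."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    return (d + 1) // 2 + 1


def is_valid_tps_order(m: int, d: int) -> bool:
    """Check the constraint m > (d + 1) / 2 without floating point."""
    return 2 * m > d + 1


# Smallest admissible penalty order for one to four covariates.
for d in range(1, 5):
    print(d, default_tps_order(d))
```

Under this rule, one or two covariates give m=2 (the familiar second derivative penalty), while three or four covariates give m=3.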

CubicSpline

CubicSpline(*, shrinkage: bool = False, cyclic: bool = False)

Cubic regression spline basis.

Cubic splines use piecewise cubic polynomials with knots placed throughout the data range. They tend to be computationally efficient, but often perform slightly worse than thin plate splines and are limited to univariate smooths. This restriction does not prevent their use in multivariate T smooths, which are constructed from marginal bases.

Parameters:

  • cyclic (bool, default: False ) –

    If True, creates a cyclic spline where the function values and derivatives match at the boundaries. Use for periodic data like time of day, angles, or seasonal patterns. Default is False.

  • shrinkage (bool, default: False ) –

    If True, adds a penalty to the null space (linear component). Helps with model selection and identifiability. Default is False. Cannot be used with cyclic=True.

Raises:

  • ValueError

    If both cyclic and shrinkage are True (incompatible options)

BSpline

BSpline(*, degree: int = 3, penalty_orders: Iterable[int] | None = None)

B-spline basis with derivative-based penalties.

These bases are univariate, although univariate bases can still serve as the marginals of multivariate smooths constructed with T. BSpline(degree=3, penalty_orders=[2]) constructs a conventional cubic spline.

Parameters:

  • degree (int, default: 3 ) –

    The degree of the B-spline basis (e.g. 3 for a cubic spline).

  • penalty_orders (Iterable[int] | None, default: None ) –

    The derivative orders to penalize. Defaults to [degree - 1].

PSpline

PSpline(*, degree: int = 3, penalty_order: int | None = None)

P-spline (penalized spline) basis as proposed by Eilers and Marx (1996).

Uses B-spline bases penalized by discrete penalties applied directly to the basis coefficients. Note that for most use cases, splines with derivative-based penalties (e.g. ThinPlateSpline or CubicSpline) tend to yield better MSE performance. PSpline(degree=3, penalty_order=2) is cubic-spline-like.

Parameters:

  • degree (int, default: 3 ) –

    Degree of the B-spline basis (e.g. 3 for cubic).

  • penalty_order (int | None, default: None ) –

    The difference order to penalize. An order of 0 gives a ridge penalty. Defaults to degree - 1.
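
To make the "discrete penalty" concrete, the sketch below builds the difference operator D of a given order and evaluates the quadratic penalty, the sum of squared order-th differences of the coefficients. It is a plain-Python illustration of the idea from Eilers and Marx, not the code mgcv runs, and the function names are hypothetical.

```python
def difference_matrix(n_coef: int, order: int) -> list[list[int]]:
    """Rows of the order-th difference operator acting on n_coef coefficients."""
    if order < 0 or order >= n_coef:
        raise ValueError("order must satisfy 0 <= order < n_coef")
    # Start from the identity; each pass replaces the rows by adjacent differences.
    rows = [[1 if i == j else 0 for j in range(n_coef)] for i in range(n_coef)]
    for _ in range(order):
        rows = [[b - a for a, b in zip(upper, lower)]
                for upper, lower in zip(rows, rows[1:])]
    return rows


def difference_penalty(n_coef: int, order: int) -> list[list[int]]:
    """Penalty matrix D^T D for the order-th difference operator D."""
    d = difference_matrix(n_coef, order)
    return [[sum(row[i] * row[j] for row in d) for j in range(n_coef)]
            for i in range(n_coef)]


def penalty_value(beta: list[float], order: int) -> float:
    """Sum of squared order-th differences of beta, i.e. beta^T D^T D beta."""
    d = difference_matrix(len(beta), order)
    return sum(sum(w * b for w, b in zip(row, beta)) ** 2 for row in d)


print(difference_penalty(4, 1))
print(penalty_value([0, 1, 2, 3], 2))  # linear coefficients are unpenalized at order 2
```

An order of 0 leaves D as the identity, so the penalty reduces to the sum of squared coefficients, which is the ridge case mentioned above.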

DuchonSpline

DuchonSpline(*, m: int = 2, s: float | int = 0)

Duchon spline basis - a generalization of thin plate splines.

These smoothers allow the use of lower orders of derivative in the penalty than conventional thin plate splines, while still yielding continuous functions.

The description, adapted from mgcv, is as follows. Duchon’s (1977) construction generalizes the usual thin plate spline penalty. The usual thin plate spline penalty is given by the integral of the squared Euclidean norm of a vector of mixed partial \(m\)-th order derivatives of the function w.r.t. its arguments. Duchon re-expresses this penalty in the Fourier domain, and then weights the squared norm in the integral by the Euclidean norm of the Fourier frequencies, raised to the power \(2s\), where \(s\) is a user-selected constant.

If \(d\) is the number of arguments of the smooth:

  • It is required that \(-d/2 < s < d/2\).
  • If \(s=0\) then the usual thin plate spline is recovered.
  • To obtain continuous functions we further require that \(m + s > d/2\).

For example, DuchonSpline(m=1, s=d/2) can be used in order to use first derivative penalization for any \(d\), and still yield continuous functions.

Parameters:

  • m

    Order of derivative to penalize.

  • s

    \(s\) as described above, should be an integer divided by 2.
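
The constraints on \(m\) and \(s\) listed above can be checked mechanically. The sketch below is a hypothetical helper, not a library function, that reports which of the stated conditions a given pair violates for \(d\) covariates. It uses exact fractions so that half-integers compare reliably, and it applies the strict bound \(-d/2 < s < d/2\) literally, so it flags \(s = d/2\) itself; \(s = (d-1)/2\) is the largest half-integer that passes.

```python
from fractions import Fraction


def duchon_violations(m: int, s, d: int) -> list[str]:
    """Return the stated constraints that (m, s) violates for d covariates."""
    s = Fraction(s)
    half_d = Fraction(d, 2)
    problems = []
    if (2 * s).denominator != 1:
        problems.append("s is not an integer divided by 2")
    if not (-half_d < s < half_d):
        problems.append("s is outside (-d/2, d/2)")
    if not (m + s > half_d):
        problems.append("m + s <= d/2, so continuity is not guaranteed")
    return problems


# s = 0 recovers the thin plate case, which needs m > d/2 for continuity.
print(duchon_violations(m=1, s=0, d=2))
# First derivative penalization with a positive s satisfies every condition.
print(duchon_violations(m=1, s=Fraction(1, 2), d=2))
```

The first call reports the continuity violation and the second returns an empty list, which shows how a positive \(s\) permits a lower derivative order than the thin plate spline allows.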

SplineOnSphere

SplineOnSphere(*, m: int = 0)

Isotropic smooth for data on a sphere (latitude/longitude coordinates).

This should be used with exactly two variables, where the first represents latitude on the interval [-90, 90] and the second represents longitude on the interval [-180, 180].

Parameters:

  • m

    An integer in [-1, 4]. Setting m=-1 uses DuchonSpline(m=2,s=1/2). Setting m=0 signals to use the 2nd order spline on the sphere, computed by Wendelberger’s (1981) method. For m>0, (m+2)/2 is the penalty order, with m=2 equivalent to the usual second derivative penalty.
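
Because the variable order matters (latitude first, longitude second), a quick range check before fitting can catch swapped columns. The sketch below also evaluates the \((m+2)/2\) rule for \(m>0\). Both helpers are hypothetical conveniences for illustration and are not part of the library.

```python
from fractions import Fraction


def check_lat_lon(lat, lon) -> None:
    """Raise ValueError if any coordinate falls outside the documented ranges."""
    for value in lat:
        if not -90 <= value <= 90:
            raise ValueError(f"latitude {value} is outside [-90, 90]")
    for value in lon:
        if not -180 <= value <= 180:
            raise ValueError(f"longitude {value} is outside [-180, 180]")


def sphere_penalty_order(m: int) -> Fraction:
    """Penalty order (m + 2) / 2, which the text defines for m > 0."""
    if not 1 <= m <= 4:
        raise ValueError("the (m + 2) / 2 rule applies only for m in 1..4")
    return Fraction(m + 2, 2)


check_lat_lon(lat=[51.5, -33.9], lon=[-0.1, 151.2])  # passes silently
print(sphere_penalty_order(2))  # the usual second derivative penalty
```

A swapped pair such as lat=[151.2] fails the latitude check, although swapped values that happen to lie within [-90, 90] would still pass unnoticed.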

RandomEffect

RandomEffect(*args, **kwargs)

Random effect basis for correlated grouped data.

This can be used with any mixture of numeric or categorical variables. Acts similarly to an Interaction but penalizes the corresponding coefficients with a multiple of the identity matrix (i.e. a ridge penalty), corresponding to an assumption of i.i.d. normality of the parameters.
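
The effect of the ridge penalty is easiest to see for a single grouping factor. The sketch below makes several simplifying assumptions: a Gaussian response, no other model terms, and a fixed penalty weight lam, whereas in a fitted model the smoothing parameter is estimated. Under those assumptions, minimizing the penalized sum of squares has a closed form in which each group effect is the group total divided by the group size plus lam. The function name is illustrative only.

```python
from collections import defaultdict


def ridge_group_effects(groups, y, lam: float) -> dict:
    """Minimise sum((y_i - b[g_i])**2) + lam * sum(b[g]**2) in closed form."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, value in zip(groups, y):
        totals[g] += value
        counts[g] += 1
    # Setting the derivative with respect to b[g] to zero gives
    # b[g] = total_g / (n_g + lam).
    return {g: totals[g] / (counts[g] + lam) for g in totals}


groups = ["a", "a", "b"]
y = [2.0, 4.0, 3.0]
print(ridge_group_effects(groups, y, lam=0.0))  # unpenalized group means
print(ridge_group_effects(groups, y, lam=1.0))  # shrunk toward zero
```

With lam=1, group "b" (one observation) is halved while group "a" (two observations) shrinks by a third, so sparsely observed groups are pulled toward zero more strongly.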

MarkovRandomField

MarkovRandomField(*, polys: list[ndarray])

Markov Random Field basis for discrete spatial data with neighborhood structure.

The smoothing penalty encourages similar values in neighboring locations. When using this basis, the variable passed to S should be a categorical variable representing the area labels.

Parameters:

  • polys (list[ndarray]) –

    List of numpy arrays defining the spatial polygons or neighborhood structure. Each array represents the boundary or connectivity information for a spatial unit.
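
One common way to encode "similar values in neighboring locations" is a penalty equal to the sum of squared differences between the coefficients of adjacent areas. The sketch below builds that penalty matrix from a neighbor mapping. The mapping format (area label to set of adjacent labels, assumed symmetric) is a hypothetical input chosen for illustration, not the polys format, and the code is not the library's implementation.

```python
def neighbour_penalty(neighbours: dict[str, set[str]]):
    """Return (labels, P) where beta^T P beta sums squared neighbour differences."""
    labels = sorted(neighbours)
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    penalty = [[0] * n for _ in range(n)]
    for label, adjacent in neighbours.items():
        i = index[label]
        for other in adjacent:
            penalty[i][index[other]] = -1  # off-diagonal entry for each neighbour
            penalty[i][i] += 1             # diagonal counts the neighbours
    return labels, penalty


def quadratic_form(penalty, beta) -> float:
    """Evaluate beta^T P beta."""
    return sum(beta[i] * penalty[i][j] * beta[j]
               for i in range(len(beta)) for j in range(len(beta)))


# Three areas in a row: a touches b, and b touches c.
labels, P = neighbour_penalty({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}})
print(labels, P)
print(quadratic_form(P, [0, 1, 3]))  # (0 - 1)**2 + (1 - 3)**2
```

A constant coefficient vector incurs no penalty under this construction, so only differences between adjacent areas are smoothed.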

RandomWigglyCurve

RandomWigglyCurve(bs: BasisLike = <factory>)

S for each level of a categorical variable.

When using this basis, the first variable of the smooth should be a numeric variable, and the second should be a categorical variable.

Unlike using a categorical by variable e.g. S(x, by="group"):

  • The terms share a smoothing parameter.
  • The terms are fully penalized, with separate penalties on each null space component (e.g. intercepts). The terms are non-centered, and can be used with an intercept without introducing indeterminacy, due to the penalization.

Parameters:

  • bs (BasisLike, default: <factory> ) –

    Any singly penalized basis function. Defaults to ThinPlateSpline. Only the type of the basis is passed to mgcv (i.e. what is returned by str(bs)). This is a limitation of mgcv, which provides no way to pass further details for setting up the basis function.

BasisLike

BasisLike(*args, **kwargs)

Protocol defining the interface for GAM basis functions.

All basis function classes must implement this protocol to be usable with smooth terms. The protocol ensures basis functions can be converted to appropriate mgcv R syntax and provide any additional parameters needed.