Distributions

Univariate Finite Distribution

hyperparameters

Distributions.samplerMethod
sampler(r::NominalRange, probs::AbstractVector{<:Real})
sampler(r::NominalRange)
sampler(r::NumericRange{T}, d)

Construct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:

rand(s)             # for one sample
rand(s, n)          # for n samples
rand(rng, s [, n])  # to specify an RNG

The argument probs can be any probability vector with the same length as r.values. The second sampler method above calls the first with a uniform probs vector.

The argument d can be either an arbitrary instance of UnivariateDistribution from the Distributions.jl package, or one of a Distributions.jl types for which fit(d, ::NumericRange) is defined. These include: Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight, Normal, Gamma, InverseGaussian, Logistic, LogNormal, Cauchy, Gumbel, Laplace, and Poisson; but see the doc-string for Distributions.fit for an up-to-date list.

If d is an instance, then sampling is from a truncated form of the supplied distribution d, the truncation bounds being r.lower and r.upper (the attributes r.origin and r.unit attributes are ignored). For discrete numeric ranges (T <: Integer) the samples are rounded.

If d is a type then a suitably truncated distribution is automatically generated using Distributions.fit(d, r).

Important. Values are generated with no regard to r.scale, except in the special case r.scale is a callable object f. In that case, f is applied to all values generated by rand as described above (prior to rounding, in the case of discrete numeric ranges).

Examples

julia> r = range(Char, :letter, values=collect("abc"))
julia> s = sampler(r, [0.1, 0.2, 0.7])
julia> samples =  rand(s, 1000);
julia> StatsBase.countmap(samples)
Dict{Char,Int64} with 3 entries:
  'a' => 107
  'b' => 205
  'c' => 688

julia> r = range(Int, :k, lower=2, upper=6) # numeric but discrete
julia> s = sampler(r, Normal)
julia> samples = rand(s, 1000);
julia> UnicodePlots.histogram(samples)
           ┌                                        ┐
[2.0, 2.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 119
[2.5, 3.0) ┤ 0
[3.0, 3.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 296
[3.5, 4.0) ┤ 0
[4.0, 4.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 275
[4.5, 5.0) ┤ 0
[5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221
[5.5, 6.0) ┤ 0
[6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89
           └                                        ┘
source
MLJBase.iteratorMethod
iterator([rng, ], r::NominalRange, [,n])
iterator([rng, ], r::NumericRange, n)

Return an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:

  1. First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:

    r.lowerr.upperLU
    finitefiniter.lowerr.upper
    -Inffiniter.upper - 2r.unitr.upper
    finiteInfr.lowerr.lower + 2r.unit
    -InfInfr.origin - r.unitr.origin + r.unit
  2. If a callable f is provided as scale, then a uniform spacing is always applied in (1) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)

  3. If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exacltly n of them).

  4. Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.

source
MLJBase.scaleMethod
scale(r::ParamRange)

Return the scale associated with a ParamRange object r. The possible return values are: :none (for a NominalRange), :linear, :log, :log10, :log2, or :custom (if r.scale is a callable object).

source
StatsAPI.fitMethod
Distributions.fit(D, r::MLJBase.NumericRange)

Fit and return a distribution d of type D to the one-dimensional range r.

Only types D in the table below are supported.

The distribution d is constructed in two stages. First, a distributon d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.

Distribution type DCharacterization of d0
Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweightminimum(d) = r.lower, maximum(d) = r.upper
Normal, Gamma, InverseGaussian, Logistic, LogNormalmean(d) = r.origin, std(d) = r.unit
Cauchy, Gumbel, Laplace, (Normal)Dist.location(d) = r.origin, Dist.scale(d) = r.unit
PoissonDist.mean(d) = r.unit

Here Dist = Distributions.

source
Base.rangeMethod
r = range(model, :hyper; values=nothing)

Define a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.

A nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.

r = range(model, :hyper; upper=nothing, lower=nothing,
          scale=nothing, values=nothing)

Assuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. Note that r is not directly iteratable but iterator(r, n)is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear,:log, :logminus, :log10, :log10minus, :log2, or a callable object.

Note that r is not directly iterable, but iterator(r, n) is, for given resolution (length) n.

By default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.

A nested hyperparameter is specified using dot notation (see above).

If scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.

If values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).

See also: iterator, sampler

source

Utility functions