Distributions
Univariate Finite Distribution
hyperparameters
Distributions.sampler - Method
sampler(r::NominalRange, probs::AbstractVector{<:Real})
sampler(r::NominalRange)
sampler(r::NumericRange{T}, d)
Construct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:
rand(s) # for one sample
rand(s, n) # for n samples
rand(rng, s [, n]) # to specify an RNG
The argument probs can be any probability vector with the same length as r.values. The second sampler method above calls the first with a uniform probs vector.
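To make the role of probs concrete, here is a minimal sketch in plain Julia of sampling from a finite set of values according to a probability vector. The helper draw is hypothetical and is not part of MLJBase; it only illustrates the idea (inverting the cumulative sum) and says nothing about how sampler is implemented internally.

```julia
using Random

# Hypothetical helper, not part of MLJBase: draw one element of `values`
# according to the probability vector `probs` by inverting the cumulative sum.
function draw(rng::AbstractRNG, values::AbstractVector, probs::AbstractVector{<:Real})
    length(values) == length(probs) ||
        throw(ArgumentError("probs must have the same length as values"))
    all(>=(0), probs) && isapprox(sum(probs), 1) ||
        throw(ArgumentError("probs must be a probability vector"))
    u = rand(rng)                              # uniform on [0, 1)
    i = searchsortedfirst(cumsum(probs), u)    # first index with cumulative mass >= u
    return values[min(i, length(values))]      # guard against rounding in cumsum
end

# With no `probs`, fall back to a uniform probability vector.
draw(rng::AbstractRNG, values::AbstractVector) =
    draw(rng, values, fill(1 / length(values), length(values)))

rng = MersenneTwister(123)
letters = collect("abc")
samples = [draw(rng, letters, [0.1, 0.2, 0.7]) for _ in 1:1000]
counts = Dict(c => count(==(c), samples) for c in letters)
```

The two-argument method mirrors the statement above that omitting probs is equivalent to supplying a uniform vector.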
The argument d can be either an arbitrary instance of UnivariateDistribution from the Distributions.jl package, or one of the Distributions.jl types for which fit(d, ::NumericRange) is defined. These include: Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight, Normal, Gamma, InverseGaussian, Logistic, LogNormal, Cauchy, Gumbel, Laplace, and Poisson; but see the doc-string for Distributions.fit for an up-to-date list.
If d is an instance, then sampling is from a truncated form of the supplied distribution d, the truncation bounds being r.lower and r.upper (the attributes r.origin and r.unit are ignored). For discrete numeric ranges (T <: Integer) the samples are rounded.
If d is a type then a suitably truncated distribution is automatically generated using Distributions.fit(d, r).
Important. Values are generated with no regard to r.scale, except in the special case that r.scale is a callable object f. In that case, f is applied to all values generated by rand as described above (prior to rounding, in the case of discrete numeric ranges).
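The order of operations described above (truncate, apply a callable scale f, then round for discrete ranges) can be sketched in plain Julia as follows. This is an illustration only: the helpers rand_truncated_normal and draw_samples are hypothetical, and the truncation is done by simple rejection sampling, which is a stand-in and not necessarily how Distributions.jl truncates a distribution.

```julia
using Random

# Hypothetical helper, not part of MLJBase or Distributions.jl: sample from a
# normal distribution truncated to [lower, upper] by simple rejection.
function rand_truncated_normal(rng::AbstractRNG, mu, sigma, lower, upper)
    while true
        x = mu + sigma * randn(rng)
        lower <= x <= upper && return x
    end
end

# Draw n samples, apply an optional callable scale `f`, then round if the
# range is discrete. The order (truncate, apply f, round) follows the text.
function draw_samples(rng, n; mu, sigma, lower, upper, f=identity, discrete=false)
    xs = [rand_truncated_normal(rng, mu, sigma, lower, upper) for _ in 1:n]
    ys = f.(xs)
    return discrete ? round.(Int, ys) : ys
end

rng = MersenneTwister(42)

# Discrete range with bounds 2 and 6: samples are rounded to integers.
ks = draw_samples(rng, 1000; mu=4.0, sigma=1.0, lower=2, upper=6, discrete=true)

# Callable scale: f changes the effective range of the generated values.
zs = draw_samples(rng, 1000; mu=0.0, sigma=1.0, lower=-2, upper=2, f=x -> 10.0^x)
```

In the second call the samples lie between roughly 0.01 and 100, not between -2 and 2, which is the sense in which a callable scale alters the values themselves.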
Examples
julia> r = range(Char, :letter, values=collect("abc"))
julia> s = sampler(r, [0.1, 0.2, 0.7])
julia> samples = rand(s, 1000);
julia> StatsBase.countmap(samples)
Dict{Char,Int64} with 3 entries:
'a' => 107
'b' => 205
'c' => 688
julia> r = range(Int, :k, lower=2, upper=6) # numeric but discrete
julia> s = sampler(r, Normal)
julia> samples = rand(s, 1000);
julia> UnicodePlots.histogram(samples)
┌ ┐
[2.0, 2.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 119
[2.5, 3.0) ┤ 0
[3.0, 3.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 296
[3.5, 4.0) ┤ 0
[4.0, 4.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 275
[4.5, 5.0) ┤ 0
[5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221
[5.5, 6.0) ┤ 0
[6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89
└ ┘
MLJBase.iterator - Method
iterator([rng, ], r::NominalRange, [,n])
iterator([rng, ], r::NumericRange, n)
Return an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:
1. First, exactly n values are generated between L and U, with a spacing determined by r.scale (uniform if scale=:linear), where L and U are given by the following table:

r.lower | r.upper | L | U |
---|---|---|---|
finite | finite | r.lower | r.upper |
-Inf | finite | r.upper - 2r.unit | r.upper |
finite | Inf | r.lower | r.lower + 2r.unit |
-Inf | Inf | r.origin - r.unit | r.origin + r.unit |

2. If a callable f is provided as scale, then a uniform spacing is always applied in (1) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)

3. If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exactly n of them).

4. Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.
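The four steps above can be sketched in plain Julia. The helpers endpoints and grid are hypothetical and only restate the documented behaviour for the :linear scale and for a callable scale; the logarithmic scales are omitted, and nothing here is taken from the MLJBase source.

```julia
using Random

# Hypothetical helper: end points (L, U) read off the table above.
function endpoints(lower, upper; origin=nothing, unit=nothing)
    if isfinite(lower) && isfinite(upper)
        return lower, upper
    elseif isfinite(upper)              # lower == -Inf
        return upper - 2unit, upper
    elseif isfinite(lower)              # upper == Inf
        return lower, lower + 2unit
    else                                # doubly unbounded
        return origin - unit, origin + unit
    end
end

# Hypothetical helper: `scale` is either :linear or a callable object.
function grid(lower, upper, n; origin=nothing, unit=nothing,
              scale=:linear, discrete=false, rng=nothing)
    L, U = endpoints(lower, upper; origin=origin, unit=unit)
    vals = collect(range(L, U, length=n))              # step 1: n evenly spaced values
    scale === :linear || (vals = scale.(vals))         # step 2: broadcast a callable scale
    discrete && (vals = unique(round.(Int, vals)))     # step 3: round, drop duplicates
    return rng === nothing ? vals : shuffle(rng, vals) # step 4: random order if rng given
end

grid(1, 10, 4)                       # [1.0, 4.0, 7.0, 10.0]
grid(1, 3, 5; discrete=true)         # [1, 2, 3]: fewer than n values after rounding
grid(0, 2, 3; scale=x -> 10.0^x)     # callable scale changes the effective range
grid(1, 10, 4; rng=MersenneTwister(0))  # same values, shuffled
```

The discrete case shows why the text says "approximately n": rounding five evenly spaced values between 1 and 3 leaves only three distinct integers.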
MLJBase.scale - Method
scale(r::ParamRange)
Return the scale associated with a ParamRange object r. The possible return values are: :none (for a NominalRange), :linear, :log, :log10, :log2, or :custom (if r.scale is a callable object).
StatsAPI.fit - Method
Distributions.fit(D, r::MLJBase.NumericRange)
Fit and return a distribution d of type D to the one-dimensional range r.
Only types D in the table below are supported.
The distribution d is constructed in two stages. First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.
Distribution type D | Characterization of d0 |
---|---|
Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight | minimum(d) = r.lower, maximum(d) = r.upper |
Normal, Gamma, InverseGaussian, Logistic, LogNormal | mean(d) = r.origin, std(d) = r.unit |
Cauchy, Gumbel, Laplace, (Normal) | Dist.location(d) = r.origin, Dist.scale(d) = r.unit |
Poisson | Dist.mean(d) = r.unit |
Here Dist = Distributions.
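As a worked illustration of the second row of the table, the conditions mean(d) = r.origin and std(d) = r.unit determine the parameters of d0 by moment matching. The sketch below does this for Gamma and LogNormal using the standard formulas for their mean and variance. The helper names are hypothetical, a positive origin is assumed, and whether MLJBase computes the parameters in exactly this way is an assumption.

```julia
# Hypothetical helpers, not part of MLJBase: parameters giving
# mean = origin and standard deviation = unit (origin > 0 assumed).

# A Gamma distribution with shape a and scale t has mean a*t and variance a*t^2.
gamma_params(origin, unit) = (shape = (origin / unit)^2, scale = unit^2 / origin)

# A LogNormal distribution with parameters mu, sigma has mean exp(mu + sigma^2/2)
# and variance (exp(sigma^2) - 1) * exp(2mu + sigma^2).
function lognormal_params(origin, unit)
    sigma2 = log1p((unit / origin)^2)
    return (mu = log(origin) - sigma2 / 2, sigma = sqrt(sigma2))
end

g = gamma_params(4.0, 2.0)        # shape 4.0, scale 1.0
ln = lognormal_params(4.0, 2.0)

# Check the moments against the targets origin = 4, unit = 2.
gamma_mean = g.shape * g.scale
gamma_std = sqrt(g.shape) * g.scale
lognormal_mean = exp(ln.mu + ln.sigma^2 / 2)
lognormal_std = sqrt((exp(ln.sigma^2) - 1) * exp(2 * ln.mu + ln.sigma^2))
```

The resulting d0 would then be truncated between r.lower and r.upper, as described above, so the final distribution d need not have exactly these moments.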
Base.range - Method
r = range(model, :hyper; values=...)
Define a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.
A nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.
r = range(model, :hyper; upper=..., lower=..., unit=..., origin=...,
          scale=nothing)
Define a one-dimensional NumericRange object for a Real property hyper of model. Note that r is not directly iterable but iterator(r, n) is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear, :log, :logminus, :log10, :log10minus, :log2, or a callable object.
By default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper of model at the time of construction. To override this behaviour (for instance if model is not available), specify a type in place of model, so that the behaviour is determined by the specified type.
A nested hyperparameter is specified using dot notation (see above).
keyword options
If scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.
You must specify at least two of the keyword arguments upper, lower, unit and origin. The last two parameters are used when fitting some distributions to unbounded NumericRange objects. See Distributions.fit for details.
A range is unbounded if lower=-Inf or upper=Inf, or one of these is left unspecified. In this case, both unit and origin must be specified. In the bounded case, these have the radius and midpoint of [lower, upper] as fallbacks.
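The keyword defaults described above can be summarized in a few lines of plain Julia. The helpers default_scale, fallback_origin and fallback_unit are hypothetical names that restate the documented rules; they are not functions exported by MLJBase.

```julia
# Hypothetical helper, not part of MLJBase: the scale chosen when `scale`
# is left unspecified, following the four cases listed above.
function default_scale(lower, upper)
    if isfinite(lower) && isfinite(upper)
        return :linear          # bounded
    elseif isfinite(lower)
        return :log             # right-unbounded (upper == Inf)
    elseif isfinite(upper)
        return :log10minus      # left-unbounded (lower == -Inf)
    else
        return :linear          # doubly unbounded
    end
end

# Fallbacks for a bounded range: midpoint and radius of [lower, upper].
fallback_origin(lower, upper) = (lower + upper) / 2
fallback_unit(lower, upper) = (upper - lower) / 2

default_scale(2, 6)          # :linear
default_scale(1.0, Inf)      # :log
default_scale(-Inf, 0.0)     # :log10minus
fallback_origin(2, 6)        # 4.0
fallback_unit(2, 6)          # 2.0
```

For an unbounded range there is no midpoint or radius to fall back on, which is why unit and origin must then be given explicitly.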
If values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).