Batteries.Data.RunningStats

Running Statistics

This module implements Welford's one-pass algorithm for calculating the mean and standard deviation of a sample or a population. Because each data point is folded into a fixed-size running state as it arrives, the data itself never needs to be stored.

The algorithm uses the following recurrence formulas for the mean μ, the variance σ² and the sample variance s²:

  μₖ = μₖ₋₁ + (xₖ - μₖ₋₁)/k
  σ²ₖ = σ²ₖ₋₁*(k-1)/k + (xₖ - μₖ₋₁)*(xₖ - μₖ)/k
  s²ₖ = s²ₖ₋₁*(k-2)/(k-1) + (xₖ - μₖ₋₁)²/k
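
As a quick check of these recurrences, consider the three values 2, 4, 6 (an illustrative data set, not taken from the module). Starting from μ₁ = 2 and σ²₁ = 0, and noting that the first term of the s² recurrence vanishes at k = 2:

  k = 2:  μ₂ = 2 + (4 - 2)/2 = 3
          σ²₂ = 0*(1/2) + (4 - 2)*(4 - 3)/2 = 1
          s²₂ = (4 - 2)²/2 = 2
  k = 3:  μ₃ = 3 + (6 - 3)/3 = 4
          σ²₃ = 1*(2/3) + (6 - 3)*(6 - 4)/3 = 8/3
          s²₃ = 2*(1/2) + (6 - 3)²/3 = 4

A direct computation agrees: the mean is 4 and the squared deviations sum to 4 + 0 + 4 = 8, giving 8/3 for the variance and 8/2 = 4 for the sample variance.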

To improve performance, Welford's algorithm keeps track of the two running quantities:

  Mₖ = Mₖ₋₁ + (xₖ - Mₖ₋₁)/k
  Sₖ = Sₖ₋₁ + (xₖ - Mₖ₋₁)*(xₖ - Mₖ)

Then: μₖ = Mₖ, σ²ₖ = Sₖ/k, s²ₖ = Sₖ/(k-1).
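
The two-quantity update can be sketched in Lean 4 roughly as follows. This is a minimal illustration under stated assumptions, not the module's actual definitions: the structure `WelfordState`, its fields `m` and `s`, and the functions `push`, `variance` and `sampleVariance` are hypothetical names chosen to mirror Mₖ and Sₖ in the formulas above.

```lean
/-- Hypothetical running state (not the Batteries structure):
`count` = k, `m` = Mₖ, `s` = Sₖ. -/
structure WelfordState where
  count : Nat := 0
  m : Float := 0.0
  s : Float := 0.0

namespace WelfordState

/-- Fold one data point `x` into the state using the two recurrences. -/
def push (st : WelfordState) (x : Float) : WelfordState :=
  let k := st.count + 1
  -- Mₖ = Mₖ₋₁ + (xₖ - Mₖ₋₁)/k
  let m' := st.m + (x - st.m) / k.toFloat
  -- Sₖ = Sₖ₋₁ + (xₖ - Mₖ₋₁)*(xₖ - Mₖ)
  let s' := st.s + (x - st.m) * (x - m')
  { count := k, m := m', s := s' }

/-- Population variance σ²ₖ = Sₖ/k. -/
def variance (st : WelfordState) : Float :=
  st.s / st.count.toFloat

/-- Sample variance s²ₖ = Sₖ/(k-1); only meaningful for k ≥ 2. -/
def sampleVariance (st : WelfordState) : Float :=
  st.s / (st.count.toFloat - 1.0)

end WelfordState

/-- Illustrative data set: fold the values 2, 4, 6 into an empty state. -/
def demo : WelfordState :=
  ([2.0, 4.0, 6.0] : List Float).foldl WelfordState.push {}

#eval demo.m
#eval demo.variance
#eval demo.sampleVariance
```

For this input the three evaluations should give a mean of 4, a variance of 8/3 and a sample variance of 4, up to floating-point formatting. Note that the sketch does not guard against division by zero: with fewer than two data points `sampleVariance` does not produce a meaningful value.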

Compute running statistics of a data stream using Welford's algorithm.

  • init :: (
    • count : Nat

      Number of data points.

    • mean : Float

      Mean of data points.

    • var : Float

      Variance of data points times the number of data points (the running quantity Sₖ above).

  • )

@[inline]

Add a new data point to running statistics.

@[inline]

Variance of running data stream.

@[inline]

Unbiased variance of running data stream.

@[inline]

Standard deviation of running data stream.