diff --git a/REQUIRE b/REQUIRE
index 4090c41..ea7917e 100644
--- a/REQUIRE
+++ b/REQUIRE
@@ -7,3 +7,4 @@ Documenter
 DocumenterTools
 Plots
 StatsBase
+Date
diff --git a/docs/assets/documenter.css b/docs/assets/documenter.css
index 7cd2662..d1f8049 100644
--- a/docs/assets/documenter.css
+++ b/docs/assets/documenter.css
@@ -83,6 +83,10 @@ img {
   max-width: 100%;
 }
 
+video {
+  max-width: 100%;
+}
+
 table {
   border-collapse: collapse;
   margin: 1em 0;
diff --git a/docs/man/AnomalyDetection/index.html b/docs/man/AnomalyDetection/index.html
index 4110914..5a4071e 100644
--- a/docs/man/AnomalyDetection/index.html
+++ b/docs/man/AnomalyDetection/index.html
@@ -1,2 +1,2 @@
-Anomaly Detection · ChemometricsTools

Anomaly Detection

Anomaly Detection API Reference

ChemometricsTools has a few anomaly detection methods. Feel free to read the API below. If that's too abstract, check out the shoot-out example: AnomalyDetection

Functions

Hotelling(X, pca::PCA; Quantile = 0.05, Variance = 1.0)

Computes Hotelling's T-squared (Tsq) statistic and its upper control limit cutoff for a PCA object, using a specified Quantile and cumulative explained variance (Variance), for new or old data X.

A review of PCA-based statistical process monitoring methods for time-dependent, high-dimensional data. Bart De Ketelaere. https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf

source
Leverage(pca::PCA)

Calculates the leverage of samples in a pca object.

source
Q(X, pca::PCA; Quantile = 0.95, Variance = 1.0)

Computes the Q-statistic and its upper control limit cutoff for a PCA object, using a specified Quantile and cumulative explained variance (Variance), for new or old data X.

A review of PCA-based statistical process monitoring methods for time-dependent, high-dimensional data. Bart De Ketelaere. https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf

source
+Anomaly Detection · ChemometricsTools

Anomaly Detection

Anomaly Detection API Reference

ChemometricsTools has a few anomaly detection methods. Feel free to read the API below. If that's too abstract, check out the shoot-out example: AnomalyDetection

Functions

Hotelling(X, pca::PCA; Quantile = 0.05, Variance = 1.0)

Computes Hotelling's T-squared (Tsq) statistic and its upper control limit cutoff for a PCA object, using a specified Quantile and cumulative explained variance (Variance), for new or old data X.

A review of PCA-based statistical process monitoring methods for time-dependent, high-dimensional data. Bart De Ketelaere. https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf
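
As a rough illustration of the statistic described above, the following sketch computes a per-sample Hotelling T-squared value from an SVD-based PCA using only the standard library. The name `hotelling_t2` and the argument `k` are hypothetical and are not part of ChemometricsTools; the control limit is left out because it needs an F-distribution quantile, and the package's own scaling may differ.

```julia
using LinearAlgebra, Statistics

# Hotelling's T-squared per sample from a PCA of mean-centered data (sketch).
# `hotelling_t2` and `k` are illustrative names, not the package API.
function hotelling_t2(X::AbstractMatrix, k::Int)
    Xc = X .- mean(X, dims = 1)                        # center each column
    F = svd(Xc)
    scores = F.U[:, 1:k] .* F.S[1:k]'                  # scores on the first k components
    score_var = (F.S[1:k] .^ 2) ./ (size(X, 1) - 1)    # variance of each score column
    # Sum of squared scores, each scaled by its component variance.
    return vec(sum((scores .^ 2) ./ score_var', dims = 2))
end

X = [2.0 1.0 0.5;
     1.5 0.9 0.4;
     3.1 1.2 0.7;
     2.2 1.1 0.6;
     9.0 0.2 3.0]   # the last row is a deliberately unusual sample
t2 = hotelling_t2(X, 2)
```

Samples whose value exceeds the chosen control limit would be flagged as anomalous.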

Leverage(pca::PCA)

Calculates the leverage of samples in a pca object.

Q(X, pca::PCA; Quantile = 0.95, Variance = 1.0)

Computes the Q-statistic and its upper control limit cutoff for a PCA object, using a specified Quantile and cumulative explained variance (Variance), for new or old data X.

A review of PCA-based statistical process monitoring methods for time-dependent, high-dimensional data. Bart De Ketelaere. https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf

diff --git a/docs/man/ClassificationModels/index.html b/docs/man/ClassificationModels/index.html
index a0a9ca0..e5a25c9 100644
--- a/docs/man/ClassificationModels/index.html
+++ b/docs/man/ClassificationModels/index.html
@@ -1,2 +1,2 @@
-Classification Models · ChemometricsTools

Classification Models

Classification Models API Reference

Functions

GaussianDiscriminant(M, X, Y; Factors = nothing)

Returns a GaussianDiscriminant classification model on basis object M (PCA, LDA) and one hot encoded Y.

source
( model::GaussianDiscriminant )( Z; Factors = size(model.ProjectedClassMeans)[2] )

Returns a 1 hot encoded inference from Z using a GaussianDiscriminant object. This function enforces positive definiteness in the class covariance matrices.

source
GaussianNaiveBayes(X,Y)

Returns a GaussianNaiveBayes classification model object from X and one hot encoded Y.

source
(gnb::GaussianNaiveBayes)(X)

Returns a 1 hot encoded inference from X using a GaussianNaiveBayes object.

source
KNN( X, Y; DistanceType::String )

DistanceType can be "euclidean" or "manhattan". Y must be one hot encoded.

Returns a KNN classification model.

source
( model::KNN )( Z; K = 1 )

Returns a 1 hot encoded inference from Z with K Nearest Neighbors, using a KNN object.

source
( model::LogisticRegression )( X )

Returns a 1 hot encoded inference from X using a LogisticRegression object.

source
ProbabilisticNeuralNetwork( X, Y )

Stores data for a PNN. Y must be one hot encoded.

Returns a PNN classification model.

source
(PNN::ProbabilisticNeuralNetwork)(X; sigma = 0.1)

Returns a 1 hot encoded inference from X with a probabilistic neural network.

source
ConfidenceEllipse(cov, mean, confidence, axis = [1,2]; pointestimate = 180 )

Returns a 2-D array whose columns are X & Y coordinates of a confidence ellipse. The ellipse is generated by the covariance matrix, mean vector, and the number of points to include in the plot.

source
LinearPerceptron(X, Y; LearningRate = 1e-3, MaxIters = 5000)

Returns a batch trained LinearPerceptron classification model object from X and one hot encoded Y.

source
LinearPerceptronsgd(X, Y; LearningRate = 1e-3, MaxIters = 5000)

Returns a SGD trained LinearPerceptron classification model object from X and one hot encoded Y.

source
MultinomialSoftmaxRegression(X, Y; LearnRate = 1e-3, maxiters = 1000, L2 = 0.0)

Returns a LogisticRegression classification model made by Stochastic Gradient Descent.

source
(L::linearperceptron)(X)

Returns a 1 hot encoded inference from X using a LinearPerceptron object.

source
+Classification Models · ChemometricsTools

Classification Models

Classification Models API Reference

Functions

GaussianDiscriminant(M, X, Y; Factors = nothing)

Returns a GaussianDiscriminant classification model on basis object M (PCA, LDA) and one hot encoded Y.

( model::GaussianDiscriminant )( Z; Factors = size(model.ProjectedClassMeans)[2] )

Returns a 1 hot encoded inference from Z using a GaussianDiscriminant object. This function enforces positive definiteness in the class covariance matrices.

GaussianNaiveBayes(X,Y)

Returns a GaussianNaiveBayes classification model object from X and one hot encoded Y.

(gnb::GaussianNaiveBayes)(X)

Returns a 1 hot encoded inference from X using a GaussianNaiveBayes object.

KNN( X, Y; DistanceType::String )

DistanceType can be "euclidean" or "manhattan". Y must be one hot encoded.

Returns a KNN classification model.

( model::KNN )( Z; K = 1 )

Returns a 1 hot encoded inference from Z with K Nearest Neighbors, using a KNN object.
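
To make the idea of a one hot encoded K-nearest-neighbour inference concrete, here is a minimal standard-library sketch. `knn_onehot` is a hypothetical helper and not the package's KNN type; it assumes squared Euclidean distance and majority voting, which may not match the package's tie handling.

```julia
# K-nearest-neighbour voting with one hot labels (sketch).
# `knn_onehot` is a hypothetical helper, not the package's KNN object.
function knn_onehot(Xtrain, Ytrain, Z; K = 1)
    out = zeros(Int, size(Z, 1), size(Ytrain, 2))
    for i in 1:size(Z, 1)
        # Squared Euclidean distance from query row i to every training row.
        d = vec(sum((Xtrain .- Z[i:i, :]) .^ 2, dims = 2))
        nearest = partialsortperm(d, 1:K)                 # indices of the K closest rows
        votes = vec(sum(Ytrain[nearest, :], dims = 1))    # per-class vote counts
        out[i, argmax(votes)] = 1
    end
    return out
end

Xtrain = [0.0 0.0; 0.1 0.2; 5.0 5.0; 5.2 4.9]
Ytrain = [1 0; 1 0; 0 1; 0 1]
pred = knn_onehot(Xtrain, Ytrain, [0.2 0.1; 4.8 5.1]; K = 1)
```

Each row of `pred` holds a single 1 in the column of the winning class.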

( model::LogisticRegression )( X )

Returns a 1 hot encoded inference from X using a LogisticRegression object.

ProbabilisticNeuralNetwork( X, Y )

Stores data for a PNN. Y must be one hot encoded.

Returns a PNN classification model.

(PNN::ProbabilisticNeuralNetwork)(X; sigma = 0.1)

Returns a 1 hot encoded inference from X with a probabilistic neural network.

ConfidenceEllipse(cov, mean, confidence, axis = [1,2]; pointestimate = 180 )

Returns a 2-D array whose columns are X & Y coordinates of a confidence ellipse. The ellipse is generated by the covariance matrix, mean vector, and the number of points to include in the plot.

LinearPerceptron(X, Y; LearningRate = 1e-3, MaxIters = 5000)

Returns a batch trained LinearPerceptron classification model object from X and one hot encoded Y.

LinearPerceptronsgd(X, Y; LearningRate = 1e-3, MaxIters = 5000)

Returns a SGD trained LinearPerceptron classification model object from X and one hot encoded Y.

MultinomialSoftmaxRegression(X, Y; LearnRate = 1e-3, maxiters = 1000, L2 = 0.0)

Returns a LogisticRegression classification model made by Stochastic Gradient Descent.

(L::linearperceptron)(X)

Returns a 1 hot encoded inference from X using a LinearPerceptron object.

diff --git a/docs/man/Clustering/index.html b/docs/man/Clustering/index.html
index fd251e0..ffb5f1e 100644
--- a/docs/man/Clustering/index.html
+++ b/docs/man/Clustering/index.html
@@ -8,4 +8,4 @@
 #BCSS = BetweenClusterSS( km )
 push!(ExplainedVar, WCSS / TCSS)
 end
- scatter(ExplainedVar, title = "Elbow Plot", ylabel = "WCSS/TCSS", xlabel = "Clusters (#)", label = "K-means" )

Functions

ChemometricsTools.BetweenClusterSSMethod.
BetweenClusterSS( Clustered::ClusterModel )

Returns a scalar of the between cluster sum of squares for a ClusterModel object.

source
ChemometricsTools.KMeansMethod.
KMeans( X, Clusters; tolerance = 1e-8, maxiters = 200 )

Returns a ClusterModel object after finding clusterings for data in X via MacQueen's K-Means algorithm. Clusters is the K parameter, or the # of clusters.

MacQueen, J. B. (1967). Some Methods for classification and Analysis of Multivariate Observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. 1. University of California Press. pp. 281–297.

source
ChemometricsTools.TotalClusterSSMethod.
TotalClusterSS( Clustered::ClusterModel )

Returns a scalar of the total sum of squares for a ClusterModel object.

source
ChemometricsTools.WithinClusterSSMethod.
WithinClusterSS( Clustered::ClusterModel )

Returns a scalar of the within cluster sum of squares for a ClusterModel object.

source
+ scatter(ExplainedVar, title = "Elbow Plot", ylabel = "WCSS/TCSS", xlabel = "Clusters (#)", label = "K-means" )

Functions

ChemometricsTools.BetweenClusterSSMethod.
BetweenClusterSS( Clustered::ClusterModel )

Returns a scalar of the between cluster sum of squares for a ClusterModel object.

ChemometricsTools.KMeansMethod.
KMeans( X, Clusters; tolerance = 1e-8, maxiters = 200 )

Returns a ClusterModel object after finding clusterings for data in X via MacQueen's K-Means algorithm. Clusters is the K parameter, or the # of clusters.

MacQueen, J. B. (1967). Some Methods for classification and Analysis of Multivariate Observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. 1. University of California Press. pp. 281–297.

ChemometricsTools.TotalClusterSSMethod.
TotalClusterSS( Clustered::ClusterModel )

Returns a scalar of the total sum of squares for a ClusterModel object.

ChemometricsTools.WithinClusterSSMethod.
WithinClusterSS( Clustered::ClusterModel )

Returns a scalar of the within cluster sum of squares for a ClusterModel object.
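
The three sum-of-squares quantities above are related by total = within + between, which is what makes the WCSS/TCSS ratio in the elbow plot meaningful. The sketch below shows that decomposition for a fixed assignment; `cluster_ss` and `labels` are hypothetical names, and the package computes these from a ClusterModel rather than from a label vector.

```julia
using Statistics

# Sum-of-squares decomposition for a given cluster assignment (sketch).
# `cluster_ss` and `labels` are illustrative; they are not the package API.
function cluster_ss(X::AbstractMatrix, labels::AbstractVector{Int})
    grand = mean(X, dims = 1)                 # overall centroid
    tcss = sum((X .- grand) .^ 2)
    wcss = 0.0
    bcss = 0.0
    for k in unique(labels)
        Xk = X[labels .== k, :]
        centroid = mean(Xk, dims = 1)
        wcss += sum((Xk .- centroid) .^ 2)                     # spread inside cluster k
        bcss += size(Xk, 1) * sum((centroid .- grand) .^ 2)    # cluster k versus the whole
    end
    return (total = tcss, within = wcss, between = bcss)
end

X = [1.0 1.0; 1.2 0.8; 5.0 5.0; 5.1 4.9; 9.0 1.0]
ss = cluster_ss(X, [1, 1, 2, 2, 3])
ratio = ss.within / ss.total   # the quantity plotted on the elbow plot's y axis
```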

diff --git a/docs/man/CurveResolution/index.html b/docs/man/CurveResolution/index.html
index 08c15d5..6f3c83c 100644
--- a/docs/man/CurveResolution/index.html
+++ b/docs/man/CurveResolution/index.html
@@ -1,2 +1,2 @@
-Curve Resolution · ChemometricsTools

Curve Resolution

Curve Resolution Models API Reference

Functions

BTEM(X, bands = nothing; Factors = 3, particles = 50, maxiters = 1000)

Returns a single recovered spectrum from a 2-Array X, the selected bands, and the number of Factors, using a Particle Swarm Optimizer. Note: This is not the function used in the original paper. This will be updated... it was written from memory. Also, the original method uses Simulated Annealing, not PSO.

Band-Target Entropy Minimization (BTEM): An Advanced Method for Recovering Unknown Pure Component Spectra. Application to the FTIR Spectra of Unstable Organometallic Mixtures. Wee Chew, Effendi Widjaja, and Marc Garland. Organometallics 2002 21 (9), 1982-1990. DOI: 10.1021/om0108752

source
BTEMobjective( a, X )

Returns the scalar BTEM objective function obtained from the linear combination vector a and loadings X. Note: This is not the function used in the original paper. This will be updated... it was written from memory.

source
FNNLS( A, b; LHS = false, maxiters = 500 )

Uses an implementation of Bro et al.'s Fast Non-Negative Least Squares on the matrix A and vector b. Returns regression coefficients in the form of a vector.

Bro, R., de Jong, S. (1997) A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics, 11, 393-401.

source
MCRALS(X, C, S = nothing; norm = (false, false), Factors = 1, maxiters = 20, nonnegative = (false, false) )

Performs Multivariate Curve Resolution using Alternating Least Squares on X taking initial estimates for S or C. S or C can be constrained by their norm, or by nonnegativity using nonnegative arguments. The number of resolved Factors can also be set.

Tauler, R., Izquierdo-Ridorsa, A., Casassas, E. Simultaneous analysis of several spectroscopic titrations with self-modelling curve resolution. Chemometrics and Intelligent Laboratory Systems. 18, 3, (1993), 293-300.

source
NMF(X; Factors = 1, tolerance = 1e-7, maxiters = 200)

Performs a variation of non-negative matrix factorization on Array X and returns a 2-Tuple of (Concentration Profile, Spectra). Note: This is not a coordinate descent based NMF. This is a simple, fast version which works well enough for chemical signals.

Algorithms for non-negative matrix factorization. Daniel D. Lee, H. Sebastian Seung. NIPS'00 Proceedings of the 13th International Conference on Neural Information Processing Systems. 535-54

source
SIMPLISMA(X; Factors = 1, alpha = 0.05, includedvars = 1:size(X)[2], SecondDeriv = true)

Performs SIMPLISMA on Array X using either the raw spectra or the Second Derivative spectra. alpha can be set to reduce contributions of baseline, and a list of included variables in the determination of pure variables may also be provided. Returns a tuple of the following form: (Concentration Profile, Pure Spectral Estimates, Pure Variables).

W. Windig, Spectral Data Files for Self-Modeling Curve Resolution with Examples Using the SIMPLISMA Approach, Chemometrics and Intelligent Laboratory Systems, 36, 1997, 3-16.

source
+Curve Resolution · ChemometricsTools

Curve Resolution

Curve Resolution Models API Reference

Functions

BTEM(X, bands = nothing; Factors = 3, particles = 50, maxiters = 1000)

Returns a single recovered spectrum from a 2-Array X, the selected bands, and the number of Factors, using a Particle Swarm Optimizer. Note: This is not the function used in the original paper. This will be updated... it was written from memory. Also, the original method uses Simulated Annealing, not PSO.

Band-Target Entropy Minimization (BTEM): An Advanced Method for Recovering Unknown Pure Component Spectra. Application to the FTIR Spectra of Unstable Organometallic Mixtures. Wee Chew, Effendi Widjaja, and Marc Garland. Organometallics 2002 21 (9), 1982-1990. DOI: 10.1021/om0108752

BTEMobjective( a, X )

Returns the scalar BTEM objective function obtained from the linear combination vector a and loadings X. Note: This is not the function used in the original paper. This will be updated... it was written from memory.

FNNLS( A, b; LHS = false, maxiters = 500 )

Uses an implementation of Bro et al.'s Fast Non-Negative Least Squares on the matrix A and vector b. Returns regression coefficients in the form of a vector.

Bro, R., de Jong, S. (1997) A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics, 11, 393-401.

MCRALS(X, C, S = nothing; norm = (false, false), Factors = 1, maxiters = 20, nonnegative = (false, false) )

Performs Multivariate Curve Resolution using Alternating Least Squares on X taking initial estimates for S or C. S or C can be constrained by their norm, or by nonnegativity using nonnegative arguments. The number of resolved Factors can also be set.

Tauler, R., Izquierdo-Ridorsa, A., Casassas, E. Simultaneous analysis of several spectroscopic titrations with self-modelling curve resolution. Chemometrics and Intelligent Laboratory Systems. 18, 3, (1993), 293-300.

NMF(X; Factors = 1, tolerance = 1e-7, maxiters = 200)

Performs a variation of non-negative matrix factorization on Array X and returns a 2-Tuple of (Concentration Profile, Spectra). Note: This is not a coordinate descent based NMF. This is a simple, fast version which works well enough for chemical signals.

Algorithms for non-negative matrix factorization. Daniel D. Lee, H. Sebastian Seung. NIPS'00 Proceedings of the 13th International Conference on Neural Information Processing Systems. 535-54
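
For readers unfamiliar with the cited Lee and Seung approach, the sketch below shows their multiplicative update rules for the Frobenius-norm objective. It is only an illustration under assumptions: `nmf_multiplicative`, the random initialization, and the `tiny` guard are hypothetical choices, and the package's "variation" may differ in initialization, stopping rule, and normalization.

```julia
using LinearAlgebra, Random

# Lee & Seung multiplicative updates for X ≈ C * S with C, S ≥ 0 (sketch).
# Not the package's NMF; names and defaults here are illustrative.
function nmf_multiplicative(X; factors = 1, maxiters = 200, rng = MersenneTwister(1))
    n, p = size(X)
    C = rand(rng, n, factors)     # concentration profiles
    S = rand(rng, factors, p)     # spectra
    tiny = 1e-12                  # guards against division by zero
    for _ in 1:maxiters
        S .*= (C' * X) ./ (C' * C * S .+ tiny)
        C .*= (X * S') ./ (C * S * S' .+ tiny)
    end
    return C, S
end

X = [1.0, 2.0, 3.0] * [1.0 2.0]   # an exactly rank-one, non-negative matrix
C, S = nmf_multiplicative(X; factors = 1, maxiters = 50)
residual = norm(X - C * S)
```

Because every update multiplies by a non-negative ratio, non-negative starting values stay non-negative throughout.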

SIMPLISMA(X; Factors = 1, alpha = 0.05, includedvars = 1:size(X)[2], SecondDeriv = true)

Performs SIMPLISMA on Array X using either the raw spectra or the Second Derivative spectra. alpha can be set to reduce contributions of baseline, and a list of included variables in the determination of pure variables may also be provided. Returns a tuple of the following form: (Concentration Profile, Pure Spectral Estimates, Pure Variables).

W. Windig, Spectral Data Files for Self-Modeling Curve Resolution with Examples Using the SIMPLISMA Approach, Chemometrics and Intelligent Laboratory Systems, 36, 1997, 3-16.

diff --git a/docs/man/Dists/index.html b/docs/man/Dists/index.html
index e889118..1ecd20a 100644
--- a/docs/man/Dists/index.html
+++ b/docs/man/Dists/index.html
@@ -1,2 +1,2 @@
-Distance Measures · ChemometricsTools

Distance Measures

Distances API Reference

Functions

(K::Kernel)(X)

This is a convenience function to allow for one-line construction of kernels from a Kernel object K and new data X.

source
NearestNeighbors(DistanceMatrix)

Returns the nearest neighbor adjacency matrix from a given DistanceMatrix.

source
CauchyKernel(X, Y, sigma)

Creates a Cauchy kernel from Arrays X and Y using hyperparameter sigma.

source
CauchyKernel(X, sigma)

Creates a Cauchy kernel from Array X using hyperparameter sigma.

source
EuclideanDistance(X, Y)

Returns the euclidean distance matrix of X and Y such that the columns are the samples in Y.

source
EuclideanDistance(X)

Returns the Gram aka the Euclidean distance matrix of X.

source
GaussianKernel(X, Y, sigma)

Creates a Gaussian/RBF kernel from Arrays X and Y with hyperparameter sigma.

source
GaussianKernel(X, sigma)

Creates a Gaussian/RBF kernel from Array X using hyperparameter sigma.

source
InClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)

Computes the in class Adjacency matrix with K nearest neighbors.

source
LinearKernel(X, Y, c)

Creates a Linear kernel from Arrays X and Y with hyperparameter c.

source
LinearKernel(X, c)

Creates a Linear kernel from Array X and hyperparameter c.

source
ManhattanDistance(X, Y)

Returns the Manhattan distance matrix of X and Y such that the columns are the samples in Y.

source
ManhattanDistance(X)

Returns the Manhattan distance matrix of X.

source
NearestNeighbors(DistanceMatrix, N)

Returns a matrix with as many rows as DistanceMatrix and N columns. For each row, this finds the columns which have the smallest distances.

source
OutOfClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)

Computes the out of class Adjacency matrix with K nearest neighbors.

source
SquareEuclideanDistance(X, Y)

Returns the squared euclidean distance matrix of X and Y such that the columns are the samples in Y.

source
SquareEuclideanDistance(X)

Returns the squared Gram aka the Euclidean distance matrix of X.

source
+Distance Measures · ChemometricsTools

Distance Measures

Distances API Reference

Functions

(K::Kernel)(X)

This is a convenience function to allow for one-line construction of kernels from a Kernel object K and new data X.

NearestNeighbors(DistanceMatrix)

Returns the nearest neighbor adjacency matrix from a given DistanceMatrix.

CauchyKernel(X, Y, sigma)

Creates a Cauchy kernel from Arrays X and Y using hyperparameter sigma.

CauchyKernel(X, sigma)

Creates a Cauchy kernel from Array X using hyperparameter sigma.

EuclideanDistance(X, Y)

Returns the euclidean distance matrix of X and Y such that the columns are the samples in Y.

EuclideanDistance(X)

Returns the Gram aka the Euclidean distance matrix of X.

GaussianKernel(X, Y, sigma)

Creates a Gaussian/RBF kernel from Arrays X and Y with hyperparameter sigma.

GaussianKernel(X, sigma)

Creates a Gaussian/RBF kernel from Array X using hyperparameter sigma.
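
As an illustration of how a distance matrix feeds a Gaussian/RBF kernel, here is a small standard-library sketch. The helper names `sq_euclidean` and `gaussian_kernel` are hypothetical, and the `2 * sigma^2` scaling is one common parameterization assumed here; the package's exact use of sigma is not shown in this reference.

```julia
# Pairwise squared Euclidean distances (sketch).
# Rows of X index the output rows; rows (samples) of Y index the output columns.
function sq_euclidean(X::AbstractMatrix, Y::AbstractMatrix)
    return [sum(abs2, X[i, :] .- Y[j, :]) for i in 1:size(X, 1), j in 1:size(Y, 1)]
end

# One common RBF form; the scaling of sigma is an assumption of this sketch.
gaussian_kernel(X, Y, sigma) = exp.(-sq_euclidean(X, Y) ./ (2 * sigma^2))

X = [0.0 0.0; 3.0 4.0; 1.0 1.0]
D = sq_euclidean(X, X)
K = gaussian_kernel(X, X, 2.0)
```

The kernel of a dataset with itself is symmetric with ones on the diagonal, since every sample is at distance zero from itself.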

InClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)

Computes the in class Adjacency matrix with K nearest neighbors.

LinearKernel(X, Y, c)

Creates a Linear kernel from Arrays X and Y with hyperparameter c.

LinearKernel(X, c)

Creates a Linear kernel from Array X and hyperparameter c.

ManhattanDistance(X, Y)

Returns the Manhattan distance matrix of X and Y such that the columns are the samples in Y.

ManhattanDistance(X)

Returns the Manhattan distance matrix of X.

NearestNeighbors(DistanceMatrix, N)

Returns a matrix with as many rows as DistanceMatrix and N columns. For each row, this finds the columns which have the smallest distances.

OutOfClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)

Computes the out of class Adjacency matrix with K nearest neighbors.

SquareEuclideanDistance(X, Y)

Returns the squared euclidean distance matrix of X and Y such that the columns are the samples in Y.

SquareEuclideanDistance(X)

Returns the squared Gram aka the Euclidean distance matrix of X.

diff --git a/docs/man/Ensemble/index.html b/docs/man/Ensemble/index.html
index fda97bc..bbe1531 100644
--- a/docs/man/Ensemble/index.html
+++ b/docs/man/Ensemble/index.html
@@ -1,2 +1,2 @@
-Ensemble Models · ChemometricsTools

Ensemble Models

Ensemble Models API Reference

Functions

RandomForest(x, y, mode = :classification; gainfn = entropy, trees = 50, maxdepth = 10,  minbranchsize = 5, samples = 0.7, maxvars = nothing)

Returns a classification (mode = :classification) or a regression (mode = :regression) random forest model. The gainfn can be entropy or gini for classification or ssd for regression. If maxvars is not provided it will default to sqrt(variables) for classification or variables/3 for regression.

The returned object can be used for inference by calling new data on the object as a function.

Breiman, L. Machine Learning (2001) 45: 5. https://doi.org/10.1023/A:1010933404324

source
(RF::RandomForest)(X)

Returns bagged prediction vector of random forest model.

source
MakeIntervals( columns::Int, intervalsize::Union{Array, Tuple} = [20, 50, 100] )

Creates a Dictionary whose keys are the interval sizes and whose values are arrays of intervals from the range: 1 - columns of size intervalsize.

source
MakeIntervals( columns::Int, intervalsize::Int = 20 )

Returns a 1-Array of intervals from the range: 1 - columns of size intervalsize.

source
stackedweights(ErrVec; power = 2)

Weights stacked interval errors by the reciprocal power specified. Used for SIPLS, SISPLS, etc.

Ni, W. , Brown, S. D. and Man, R. (2009), Stacked partial least squares regression analysis for spectral calibration and prediction. J. Chemometrics, 23: 505-517. doi:10.1002/cem.1246

source
+Ensemble Models · ChemometricsTools

Ensemble Models

Ensemble Models API Reference

Functions

RandomForest(x, y, mode = :classification; gainfn = entropy, trees = 50, maxdepth = 10,  minbranchsize = 5, samples = 0.7, maxvars = nothing)

Returns a classification (mode = :classification) or a regression (mode = :regression) random forest model. The gainfn can be entropy or gini for classification or ssd for regression. If maxvars is not provided it will default to sqrt(variables) for classification or variables/3 for regression.

The returned object can be used for inference by calling new data on the object as a function.

Breiman, L. Machine Learning (2001) 45: 5. https://doi.org/10.1023/A:1010933404324

(RF::RandomForest)(X)

Returns bagged prediction vector of random forest model.

MakeIntervals( columns::Int, intervalsize::Union{Array, Tuple} = [20, 50, 100] )

Creates a Dictionary whose keys are the interval sizes and whose values are arrays of intervals from the range: 1 - columns of size intervalsize.

MakeIntervals( columns::Int, intervalsize::Int = 20 )

Returns a 1-Array of intervals from the range: 1 - columns of size intervalsize.
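
The two interval constructors above can be pictured with the following sketch. `make_intervals` is a hypothetical stand-in, and truncating the last interval when the column count is not a multiple of the interval size is an assumption; the package may handle the remainder differently.

```julia
# Consecutive column intervals of a fixed width (sketch).
# Assumption: the final interval is truncated when `columns` is not a multiple of `width`.
make_intervals(columns::Int, width::Int) =
    [i:min(i + width - 1, columns) for i in 1:width:columns]

# Several widths at once, returned as a Dict keyed by width.
make_intervals(columns::Int, widths::Union{AbstractVector{Int}, Tuple}) =
    Dict(w => make_intervals(columns, w) for w in widths)

single = make_intervals(10, 4)
multi = make_intervals(10, [2, 5])
```

Each interval can then be used to index a block of columns, for example when fitting one sub-model per spectral window.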

stackedweights(ErrVec; power = 2)

Weights stacked interval errors by the reciprocal power specified. Used for SIPLS, SISPLS, etc.

Ni, W. , Brown, S. D. and Man, R. (2009), Stacked partial least squares regression analysis for spectral calibration and prediction. J. Chemometrics, 23: 505-517. doi:10.1002/cem.1246
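
A minimal sketch of reciprocal-power weighting follows. `stacked_weights` is a hypothetical name, and normalizing the weights so that they sum to one is an assumption of this sketch; the package function may return unnormalized values.

```julia
# Reciprocal-power weighting of per-interval errors (sketch).
# Assumption: weights are normalized to sum to one.
function stacked_weights(errors::AbstractVector{<:Real}; power = 2)
    s = (1 ./ errors) .^ power    # small errors receive large weights
    return s ./ sum(s)
end

w = stacked_weights([1.0, 2.0, 4.0])
```

With `power = 2`, halving an interval's error quadruples its unnormalized weight, so the best-performing intervals dominate the stacked prediction.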

diff --git a/docs/man/FullAPI/index.html b/docs/man/FullAPI/index.html
index 59058f3..835e760 100644
--- a/docs/man/FullAPI/index.html
+++ b/docs/man/FullAPI/index.html
@@ -1,2 +1,2 @@
-Full API · ChemometricsTools

Full API

API

CanonicalCorrelationAnalysis(A, B)

Returns a CanonicalCorrelationAnalysis object which contains (U, V, r) from Arrays A and B. Currently Untested for correctness but should compute....

source
GaussianBand(sigma,amplitude,center)

Constructs a Gaussian kernel generator.

source
(B::GaussianBand)(X::Float64)

Returns the scalar probability associated with a GaussianBand object (kernel) at a location in space(X).

source
LDA(X, Y; Factors = 1)

Computes a LinearDiscriminantAnalysis transform from X with a user specified number of latent variables (Factors). Returns an LDA object.

source
( model::LDA )( Z; Factors = length(model.Values) )

Calling an LDA object on new data brings the new data Z into the LDA basis.

source
LorentzianBand(gamma,amplitude,center)

Constructs a Lorentzian kernel generator.

source
(B::LorentzianBand)(X::Float64)

Returns the probability associated with a LorentzianBand object (kernel) at a location in space(X).

source
PCA(X; Factors = minimum(size(X)) - 1)

Computes a PCA from X using LinearAlgebra's SVD algorithm with a user specified number of latent variables (Factors). Returns a PCA object.

source
(T::PCA)(Z::Array; Factors = length(T.Values), inverse = false)

Calling a PCA object on new data brings the new data Z into or out of (inverse = true) the PCA basis.

source
(U::Universe)(Band...)

A Universe object's internal "spectra" can be updated to include the additive contribution of many Band-like objects.

source
Universe(mini, maxi; width = nothing, bins = nothing)

Creates a 1-D discretized segment that starts at mini and ends at maxi. The width of the bins for the discretization can either be provided or inferred from the number of bins. Returns a Universe object.

source
(U::Universe)(Band::Union{ GaussianBand, LorentzianBand})

A Universe object's internal "spectra" can be updated to include the additive contribution of any Band-like object.

source
AssessHealth( X )

Returns a somewhat detailed Dict containing information about the 'health' of a dataset. The following is included:

- PercentMissing: percent of missing entries (includes nothing, inf / nan) in the dataset
- EmptyColumns: the columns which have only 1 value
- RankEstimate: an estimate of the rank of X
- (optional) Duplicates: returns the rows of duplicate observations

source
ExplainedVariance(lda::LDA)

Calculates the explained variance of each singular value in an LDA object.

source
ExplainedVariance(PCA::PCA)

Calculates the explained variance of each singular value in a pca object.

source
HLDA(X, YHOT; K = 1, Factors = 1)

Computes a Hierarchical LinearDiscriminantAnalysis transform from X with a user specified number of latent variables (Factors). The adjacency matrices are created from K nearest neighbors.

Returns an LDA object. Note: this can be used with any other LDA functions such as Gaussian discriminants or explained variance.

Lu D, Ding C, Xu J, Wang S. Hierarchical Discriminant Analysis. Sensors (Basel). 2018 Jan 18;18(1). pii: E279. doi: 10.3390/s18010279.

source
PCA_NIPALS(X; Factors = minimum(size(X)) - 1, tolerance = 1e-7, maxiters = 200)

Computes a PCA from X using the NIPALS algorithm with a user specified number of latent variables (Factors). The tolerance is the minimum change in the F norm before ceasing execution. Returns a PCA object.

source
RAFFT(raw, reference; maxlags::Int = 500, lookahead::Int = 1, minlength::Int = 20, mincorr::Float64 = 0.05)

RAFFT corrects shifts in the raw spectral bands to be similar to those in a given reference spectra through the use of "recursive alignment by FFT". It returns an array of corrected spectra/chromatograms. The number of maximum lags can be specified, the lookahead parameter ensures that additional recursive executions are performed so the first solution found is not preemptively accepted, the minimum segment length(minlength) can also be specified if FWHM are estimable, and the minimum cross correlation(mincorr) for a match can dictate whether peaks were found to align or not.

Note This method works best with flat baselines because it repeats last known values when padding aligned spectra. It is highly efficient, and in my tests does a good job, but other methods definitely exist. Let me know if other peak Alignment methods are important for your work-flow, I'll see if I can implement them.

Application of Fast Fourier Transform Cross-Correlation for the Alignment of Large Chromatographic and Spectral Datasets. Jason W. H. Wong, Caterina Durante, and Hugh M. Cartwright. Analytical Chemistry 2005 77 (17), 5655-5661

source
findpeaks( vY; m = 3)

Finds the indices of peaks in a vector vY with a window span of 2m. Original R function by Stas_G:(https://stats.stackexchange.com/questions/22974/how-to-find-local-peaks-valleys-in-a-series-of-data) This version is based on a C++ variant by me.

source
+Full API · ChemometricsTools

Full API

API

BlandAltman(Y1, Y2; Confidence = 1.96)

Returns a Plot object of a Bland-Altman plot between vectors Y1 and Y2 with a confidence limit of Confidence.

Bounds(dims)

Constructor for a Bounds object. Returns a Bounds object with a lower bound of [lower...] and an upper bound of [upper...], each with length of dims.

Bounds(dims)

Default constructor for a Bounds object. Returns a Bounds object with a lower bound of [0...] and an upper bound of [1...], each with length of dims.

CORAL(X1, X2; lambda = 1.0)

Performs CORAL to facilitate covariance based transfer from X1 to X2 with regularization parameter lambda. Returns a CORAL object.

Correlation Alignment for Unsupervised Domain Adaptation. Baochen Sun, Jiashi Feng, Kate Saenko. https://arxiv.org/abs/1612.01939

(C::CORAL)(Z)

Applies the transform from a learned CORAL object to new data Z.

CanonicalCorrelationAnalysis(A, B)

Returns a CanonicalCorrelationAnalysis object which contains (U, V, r) from Arrays A and B. Currently Untested for correctness but should compute....

(T::Center)(Z; inverse = false)

Centers data in array Z column-wise according to learned mean centers in Center object T.

(T::CenterScale)(Z; inverse = false)

Centers and scales data in array Z column-wise according to learned measures of central tendency in CenterScale object T.

ClassicLeastSquares( X, Y; Bias = false )

Makes a ClassicLeastSquares regression model of the form Y = AX with or without a Bias term. Returns a CLS object.

(M::ClassicLeastSquares)(X)

Makes an inference from X using a ClassicLeastSquares object.

GaussianBand(sigma,amplitude,center)

Constructs a Gaussian kernel generator.

(B::GaussianBand)(X::Float64)

Returns the scalar probability associated with a GaussianBand object (kernel) at a location in space(X).

KFoldsValidation(K::Int, x, y)

Returns a KFoldsValidation iterator with K folds. Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY)).
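
The shape of the tuples described above can be seen in this standard-library sketch. `kfold_splits` is a hypothetical helper that builds all splits eagerly from contiguous, unshuffled blocks of rows; the package's iterator is lazy and may order or shuffle the folds differently.

```julia
# Contiguous, unshuffled K-fold splits over the rows of x and y (sketch).
# `kfold_splits` is illustrative and is not the package's KFoldsValidation.
function kfold_splits(K::Int, x::AbstractMatrix, y::AbstractMatrix)
    n = size(x, 1)
    edges = round.(Int, range(0, stop = n, length = K + 1))   # fold boundaries
    splits = []
    for k in 1:K
        hold = (edges[k] + 1):edges[k + 1]     # rows held out for validation
        train = setdiff(1:n, hold)             # every other row
        push!(splits, ((x[train, :], y[train, :]), (x[hold, :], y[hold, :])))
    end
    return splits
end

x = reshape(collect(1.0:20.0), 10, 2)
y = reshape(collect(1.0:10.0), 10, 1)
splits = kfold_splits(5, x, y)

# Destructuring follows the ((TrainX, TrainY), (ValidateX, ValidateY)) form.
for ((trainx, trainy), (valx, valy)) in splits
    @assert size(trainx, 1) + size(valx, 1) == size(x, 1)
end
```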

LDA(X, Y; Factors = 1)

Computes a LinearDiscriminantAnalysis transform from X with a user specified number of latent variables (Factors). Returns an LDA object.

( model::LDA )( Z; Factors = length(model.Values) )

Calling an LDA object on new data brings the new data Z into the LDA basis.

LSSVM( X, Y, Penalty; KernelParameter = 0.0, KernelType = "linear" )

Makes an LSSVM model of the form Y = AK with a bias term and an L2 Penalty, using a user specified Kernel ("linear" or "gaussian"). Returns an LSSVM Wrapper for a CLS object.

(M::LSSVM)(X)

Makes an inference from X using a LSSVM object.

LorentzianBand(gamma,amplitude,center)

Constructs a Lorentzian kernel generator.

(B::LorentzianBand)(X::Float64)

Returns the probability associated with a LorentzianBand object (kernel) at a location in space(X).

MultiCenter(Z, mode = 1)

Acquires the mean of the specified mode in Z and returns a transform that will remove those means from any future data.

(T::MultiCenter)(Z; inverse = false)

Centers data in Tensor Z mode-wise according to learned centers in MultiCenter object T.

MultiScale(Z, mode = 1)

Acquires the standard deviations of the specified mode in Z and returns a transform that will scale any future data by those standard deviations.

(T::MultiScale)(Z; inverse = false)

Scales data in Tensor Z mode-wise according to learned standard deviations in MultiScale object T.

(T::MultiplicativeScatterCorrection)(Z)

Applies MultiplicativeScatterCorrection from a stored object T to Array Z.

OrthogonalSignalCorrection(X, Y; Factors = 1)

Performs Thomas Fearn's Orthogonal Signal Correction to an endogenous X and exogenous Y. The number of Factors are the number of orthogonal components to be removed from X. This function returns an OSC object.

Tom Fearn. On orthogonal signal correction. Chemometrics and Intelligent Laboratory Systems. Volume 50, Issue 1, 2000, Pages 47-52.

(OSC::OrthogonalSignalCorrection)(Z; Factors = 2)

Applies the transform from a learned orthogonal signal correction object OSC to new data Z.

PCA(X; Factors = minimum(size(X)) - 1)

Computes a PCA from X using LinearAlgebra's SVD algorithm with a user-specified number of latent variables (Factors). Returns a PCA object.

(T::PCA)(Z::Array; Factors = length(T.Values), inverse = false)

Calling a PCA object on new data brings the new data Z into or out of (inverse = true) the PCA basis.
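
For readers unfamiliar with SVD-based PCA, here is a minimal sketch of the underlying linear algebra using the LinearAlgebra standard library. It assumes the data are already column-centered; the variable names are illustrative and this is not the package's implementation.

```julia
using LinearAlgebra

X = [2.0 0.0; 0.0 1.0; -2.0 0.0; 0.0 -1.0]   # already column-centered
factors = 1

F = svd(X)
loadings = F.Vt[1:factors, :]                           # factors x variables
scores = F.U[:, 1:factors] * Diagonal(F.S[1:factors])   # observations x factors

projected = X * loadings'            # bring data into the PCA basis
reconstructed = scores * loadings    # back out of the basis (lossy when factors < rank)
explained = F.S .^ 2 ./ sum(F.S .^ 2)   # share of variance per singular value
```

Projecting with the loadings and multiplying the left singular vectors by the singular values give the same scores, which is why a stored basis is enough to transform new data.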

PartialLeastSquares( X, Y; Factors = minimum(size(X)) - 2, tolerance = 1e-8, maxiters = 200 )

Returns a PartialLeastSquares regression model object from arrays X and Y.

  1. Paul Geladi and Bruce R. Kowalski. Partial Least-Squares Regression: A Tutorial. Analytica Chimica Acta, 186 (1986).
  2. Martens H., Næs T. Multivariate Calibration. Wiley: New York, 1989.
  3. Rolf Ergon. Re-interpretation of NIPALS results solves PLSR inconsistency problem. Journal of Chemometrics 2009; Vol. 23/1: 72-75.

(M::PartialLeastSquares)

Makes an inference from X using a PartialLeastSquares object.

Particle(ProblemBounds, VelocityBounds)

Default constructor for a Particle object. It creates a random, uniformly distributed particle within the specified ProblemBounds, and limits its velocity to the specified VelocityBounds.

(M::PrincipalComponentRegression)( X )

Makes an inference from X using a PrincipalComponentRegression object.

PrincipalComponentRegression(PCAObject, Y )

Makes a PrincipalComponentRegression model object from a PCA Object and property value Y.

QQ( Y1, Y2; Quantiles = collect( 1 : 99 ) ./ 100 )

Returns a Plot object of a Quantile-Quantile plot between vectors Y1 and Y2 at the desired Quantiles.

(T::QuantileTrim)(X, inverse = false)

Trims data in array X column-wise according to learned quantiles in QuantileTrim object T. This function does NOT have an inverse.

QuantileTrim(Z; quantiles::Tuple{Float64,Float64} = (0.05, 0.95) )

Trims values above or below the specified columnwise quantiles to the quantile values themselves.
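
The trimming described here amounts to clamping each column to its own quantiles. A minimal sketch with the Statistics standard library follows; the helper name quantiletrim is hypothetical, and the package may compute its quantiles differently.

```julia
using Statistics

# Hypothetical helper: clamp every column to its own lower and upper quantiles.
function quantiletrim(Z::AbstractMatrix; quantiles = (0.05, 0.95))
    out = float.(Z)   # work on a floating point copy
    for j in 1:size(Z, 2)
        lo, hi = quantile(Z[:, j], [quantiles[1], quantiles[2]])
        out[:, j] = clamp.(Z[:, j], lo, hi)
    end
    return out
end

Z = reshape([1.0, 2.0, 3.0, 4.0, 100.0], 5, 1)
trimmed = quantiletrim(Z; quantiles = (0.25, 0.75))   # the outlier 100.0 is pulled in
```

Because clamping discards how far a value lay beyond the quantile, the operation cannot be undone.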

(T::RangeNorm)(Z; inverse = false)

Scales and shifts data in array Z column-wise according to learned min-maxes in RangeNorm object T.

RidgeRegression( X, Y, Penalty; Bias = false )

Makes a RidgeRegression model of the form Y = AX with or without a Bias term and has an L2 Penalty. Returns a CLS object.

(M::RidgeRegression)(X)

Makes an inference from X using a RidgeRegression object which wraps a ClassicLeastSquares object.

RollingWindow(samples::Int,windowsize::Int,skip::Int)

Creates a RollingWindow iterator from a number of samples and a static windowsize where every iteration skip steps are skipped. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.

RollingWindow(samples::Int,windowsize::Int)

Creates a RollingWindow iterator from a number of samples and a static windowsize. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.
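
The index ranges such an iterator might yield can be sketched as below. The helper name rollingwindows is hypothetical, and it assumes the window advances by skip samples per iteration (1 when omitted), which may differ from the package's exact stepping.

```julia
# Hypothetical helper: index ranges of a fixed-size window that advances
# `skip` samples per iteration.
function rollingwindows(samples::Int, windowsize::Int, skip::Int = 1)
    laststart = samples - windowsize + 1
    return [start:(start + windowsize - 1) for start in 1:skip:laststart]
end

w1 = rollingwindows(6, 3)      # windows 1:3, 2:4, 3:5, 4:6
w2 = rollingwindows(6, 3, 2)   # windows 1:3, 3:5
```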

RunningMean(x)

Constructs a running mean object with an initial scalar value of x.

RunningVar(x)

Constructs a RunningVar object with an initial scalar value of x. Note: RunningVar objects implicitly calculate the running mean.
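
One common way to maintain a running mean and variance together is Welford's online update, sketched below. Whether the package uses this exact update is an assumption; the OnlineVar type and the update! and variance functions are hypothetical names, and the sample (n - 1) denominator is also assumed.

```julia
# Hypothetical accumulator using Welford's online update.
mutable struct OnlineVar
    n::Int          # observations seen so far
    mean::Float64   # running mean
    m2::Float64     # running sum of squared deviations from the mean
end

# Start from a single scalar observation.
OnlineVar(x::Real) = OnlineVar(1, float(x), 0.0)

function update!(rv::OnlineVar, x::Real)
    rv.n += 1
    delta = x - rv.mean
    rv.mean += delta / rv.n
    rv.m2 += delta * (x - rv.mean)   # uses the already updated mean
    return rv
end

# Sample variance; not defined for a single observation.
variance(rv::OnlineVar) = rv.m2 / (rv.n - 1)

rv = OnlineVar(2.0)
for x in (4.0, 4.0, 6.0)
    update!(rv, x)
end
```

The mean comes along for free, which matches the note that a running variance implicitly tracks the running mean.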

(T::Scale)(Z; inverse = false)

Scales data in array Z column-wise according to learned standard deviations in Scale object T.

TransferByOrthogonalProjection(X1, X2; Factors = 1)

Performs Thomas Fearn's Transfer By Orthogonal Projection to facilitate transfer from X1 to X2. Returns a TransferByOrthogonalProjection object.

Anne Andrew, Tom Fearn. Transfer by orthogonal projection: making near-infrared calibrations robust to between-instrument variation. Chemometrics and Intelligent Laboratory Systems. Volume 72, Issue 1, 2004, Pages 51-56.

(TbOP::TransferByOrthogonalProjection)(X1; Factors = TbOP.Factors)

Applies the transform from a learned transfer by orthogonal projection object TbOP to new data X1.

(U::Universe)(Band...)

A Universe object's internal "spectra" can be updated to include the additive contribution of many Band-like objects.

Universe(mini, maxi; width = nothing, bins = nothing)

Creates a 1-D discretized segment that starts at mini and ends at maxi. The width of the bins for the discretization can either be provided or inferred from the number of bins. Returns a Universe object.

(U::Universe)(Band::Union{ GaussianBand, LorentzianBand})

A Universe object's internal "spectra" can be updated to include the additive contribution of any Band-like object.

ALSSmoother(X; lambda = 100, p = 0.001, maxiters = 10)

Applies an asymmetric least squares smoothing function to a 2-Array X. The lambda, p, and maxiters parameters control the smoothness. See the reference below for more information.

Paul H. C. Eilers, Hans F.M. Boelens. Baseline Correction with Asymmetric Least Squares Smoothing. 2005

AssessHealth( X )

Returns a somewhat detailed Dict containing information about the 'health' of a dataset. The following entries are included:
- PercentMissing: percent of missing entries (includes nothing, inf / nan) in the dataset
- EmptyColumns: the columns which have only 1 value
- RankEstimate: an estimate of the rank of X
- (optional) Duplicates: returns the rows of duplicate observations

ClassificationTree(x, y; gainfn = entropy, maxdepth = 4, minbranchsize = 3)

Builds a CART object using either gini or entropy as a partitioning method. Y must be a one hot encoded 2-Array. Predictions can be formed by calling the following function from the CART object: (M::CART)(x).

Note: this is a purely nonrecursive decision tree. The Julia compiler doesn't like storing structs of nested things. I wrote it the recursive way in the past and it was quite slow; I think this is also true of interpreted languages like R/Python. So here it is: nonrecursive trees!

ColdToHot(Y, Schema::ClassificationLabel)

Turns a cold encoded Y vector into a one hot encoded array.

DirectStandardization(InstrumentX1, InstrumentX2; Factors = minimum(collect(size(InstrumentX1))) - 1)

Makes a DirectStandardization object to facilitate the transfer from Instrument #2 to Instrument #1. The returned object can be used to transfer unseen data to the approximated space of Instrument #1. The number of Factors used are those from the internal orthogonal basis.

Yongdong Wang and Bruce R. Kowalski, "Calibration Transfer and Measurement Stability of Near-Infrared Spectrometers," Appl. Spectrosc. 46, 764-771 (1992)

EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))

Constructs an exponentially weighted moving average object from a vector of scalar property values Initial and the decay parameter Lambda. This computes the running statistics necessary for creating the EWMA model using the interval provided and updates the center value to the mean of the provided values.

EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))

Constructs an exponentially weighted moving average object from an initial scalar property value Initial and the decay parameter Lambda. This defaults the center value to be the initial value.

EmpiricalQuantiles(X, quantiles)

Finds the column-wise quantiles of 2-Array X and returns them in a 2-Array of size quantiles by variables. Note: This copies the array. Use a subset if memory is a concern.

ExplainedVariance(lda::LDA)

Calculates the explained variance of each singular value in an LDA object.

ExplainedVariance(PCA::PCA)

Calculates the explained variance of each singular value in a pca object.

ExtremeLearningMachine(X, Y, ReservoirSize = 10; ActivationFn = sigmoid)

Returns an ELM regression model object from arrays X and Y, with a user-specified ReservoirSize and ActivationFn.

Extreme learning machine: a new learning scheme of feedforward neural networks. Guang-Bin Huang ; Qin-Yu Zhu ; Chee-Kheong Siew. 2004 IEEE International Joint...

FirstDerivative(X)

Uses the finite difference method to compute the first derivative for every row in X. Note: This operation results in the loss of a column dimension.
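
The loss of a column follows directly from differencing neighbouring columns. A minimal sketch, assuming a unit spacing between columns; the helper name firstdiff is hypothetical and is not the package's function.

```julia
# Row-wise first difference: column j + 1 minus column j, for every row.
firstdiff(X::AbstractMatrix) = X[:, 2:end] .- X[:, 1:end-1]

X = [1.0 4.0 9.0 16.0;
     0.0 1.0 2.0 3.0]
D = firstdiff(X)    # 2 x 3: one column fewer than X
D2 = firstdiff(D)   # differencing again removes a second column
```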

FractionalDerivative(Y, X = 1 : length(Y); Order = 0.5)

Calculates the Grünwald-Letnikov fractional order derivative on every row of Array Y. Array X is a vector that has the spacing between column-wise entries in Y. X can be a scalar if that spacing is constant (common in spectroscopy). Order is the fractional order of the derivative. Note: This operation results in the loss of a column dimension.

The Fractional Calculus, by Oldham, K.; and Spanier, J. Hardcover: 234 pages. Publisher: Academic Press, 1974. ISBN 0-12-525550-0

HLDA(X, YHOT; K = 1, Factors = 1)

Computes a Hierarchical LinearDiscriminantAnalysis transform from X with a user-specified number of latent variables (Factors). The adjacency matrices are created from K nearest neighbors.

Returns an LDA object. Note: this can be used with any other LDA functions such as Gaussian discriminants or explained variance.

Lu D, Ding C, Xu J, Wang S. Hierarchical Discriminant Analysis. Sensors (Basel). 2018 Jan 18;18(1). pii: E279. doi: 10.3390/s18010279.

HighestVote(yhat)

Returns the column index for each row that has the highest value in one hot encoded yhat. Returns a one cold encoded vector.

HighestVoteOneHot(yhat)

Turns the highest column-wise value to a 1 and the others to zeros per row in a one hot encoded yhat. Returns a one hot encoded array.

HotToCold(Y, Schema::ClassificationLabel)

Turns a one hot encoded Y array into a cold encoded vector.
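
To make the two encodings concrete, the sketch below converts between them. It assumes cold labels are the integers 1 to C and does not use the package's ClassificationLabel schema; the helper names coldtohot and hottocold are hypothetical.

```julia
# Cold labels (integers 1..C) to a one hot matrix with one column per class.
function coldtohot(y::AbstractVector{<:Integer}, classes::Int)
    hot = zeros(Int, length(y), classes)
    for (row, label) in enumerate(y)
        hot[row, label] = 1
    end
    return hot
end

# One hot (or score) matrix back to cold labels: the column index of each row's maximum.
hottocold(hot::AbstractMatrix) = [argmax(hot[row, :]) for row in 1:size(hot, 1)]

y = [2, 1, 3]
hot = coldtohot(y, 3)
cold = hottocold(hot)
```

Taking the per-row argmax is also how a highest-vote rule turns predicted scores into cold labels.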

IntervalOverlay(Spectra, Intervals, Err)

Displays the relative error (Err) of each interval on top of a Spectra.

IsColdEncoded(Y)

Returns a boolean true if the array Y is cold encoded, and false if not.

KennardStone(X, TrainSamples; distance = "euclidean")

Returns the indices of the Kennard-Stone sampled exemplars (E), and those not sampled (O) as a 2-Tuple (E, O).

R. W. Kennard & L. A. Stone (1969) Computer Aided Design of Experiments, Technometrics, 11(1), 137-148, DOI: 10.1080/00401706.1969.10490666
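
The selection rule can be sketched as follows: seed with the two most distant observations, then repeatedly add the candidate whose nearest selected exemplar is farthest away. This is a simplified illustration with Euclidean distances that assumes at least two samples are requested; the helper name kennardstone is hypothetical and the package's implementation may differ in details such as tie-breaking.

```julia
using LinearAlgebra

function kennardstone(X::AbstractMatrix, trainsamples::Int)
    n = size(X, 1)
    # Pairwise Euclidean distances between rows.
    D = [norm(X[i, :] .- X[j, :]) for i in 1:n, j in 1:n]
    # Seed with the two most distant observations.
    a, b = Tuple(argmax(D))
    selected = [a, b]
    remaining = setdiff(1:n, selected)
    while length(selected) < trainsamples
        # For each candidate, the distance to its nearest selected exemplar.
        nearest = [minimum(D[c, selected]) for c in remaining]
        pick = remaining[argmax(nearest)]
        push!(selected, pick)
        remaining = setdiff(remaining, [pick])
    end
    return (selected, remaining)
end

X = [0.0 0.0; 1.0 0.0; 5.0 0.0; 10.0 0.0]
E, O = kennardstone(X, 3)   # exemplars and the observations left over
```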

KernelRidgeRegression( X, Y, Penalty; KernelParameter = 0.0, KernelType = "linear" )

Makes a KernelRidgeRegression model of the form Y = AK using a user-specified kernel ("linear" or "gaussian") and an L2 Penalty. Returns a KRR wrapper for a CLS object.

LabelEncoding(HotOrCold)

Determines if an Array, Y, is one hot encoded or cold encoded by its dimensions. Returns a ClassificationLabel object/schema to convert between the formats.

LeaveOneOut(x, y)

Returns a KFoldsValidation iterator with leave one out folds. Because it is an iterator, it can be used in for loops; see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples of the following form: ((TrainX, TrainY), (ValidateX, ValidateY)).

Lifeform(size, onlikelihood, initialscore)

Constructor for a BinaryLifeForm struct. Binary life forms are basically wrappers for a binary vector, each element of which has a likelihood of being 1 (onlikelihood). Each life form also has a score based on its "fitness". Because the GAs in this package can be used to minimize or maximize, initialscore is an open parameter, but Inf/-Inf is a good choice.

Limits(P::ewma; k = 3.0)

This function returns the upper and lower control limits with a k span of variance for an EWMA object P.
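
For orientation, the textbook EWMA recursion and its asymptotic control limits can be sketched as below. Whether the package uses exactly these formulas is an assumption, and the names ewmastep and ewmalimits are hypothetical.

```julia
# One EWMA step: blend the new observation with the previous smoothed value.
ewmastep(z, x, lambda) = lambda * x + (1 - lambda) * z

# Asymptotic control limits around `center` for a process variance `sigma2`.
function ewmalimits(center, sigma2, lambda; k = 3.0)
    halfwidth = k * sqrt(sigma2 * lambda / (2 - lambda))
    return (center - halfwidth, center + halfwidth)
end

# Start at 10.0, then smooth the observations 12.0 and 8.0 with lambda = 0.5.
z = foldl((prev, x) -> ewmastep(prev, x, 0.5), (12.0, 8.0); init = 10.0)
lower, upper = ewmalimits(10.0, 4.0, 0.5)
```

A smaller lambda gives more weight to history, which narrows the limits and makes the chart more sensitive to small sustained shifts.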

Logit(Z; inverse = false)

Logit transforms (ln(X / (1 - X))) every element in Z. The inverse may also be applied. Warning: This can return Infs and NaNs if elements of Z are not suited to the transform.
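
A minimal sketch of the transform and its inverse on plain scalars; the function names logit and invlogit here are illustrative definitions, not the package's Logit.

```julia
# Logit and its inverse (the logistic function).
logit(x) = log(x / (1 - x))
invlogit(z) = 1 / (1 + exp(-z))

Z = [0.1, 0.5, 0.9]
L = logit.(Z)          # broadcast over every element
back = invlogit.(L)    # recovers Z up to floating point error

edge = logit(1.0)      # 1.0 / 0.0 is Inf, so the result is Inf
```

Values strictly between 0 and 1 are the safe inputs; the endpoints map to infinities.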

MAE( y, yhat )

Calculates Mean Absolute Error from vectors Y and YHat.

MAPE( y, yhat )

Calculates Mean Absolute Percent Error from vectors Y and YHat.

ME( y, yhat )

Calculates Mean Error from vectors Y and YHat.

MSE( y, yhat )

Calculates Mean Squared Error from vectors Y and YHat
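
The error metrics in this part of the reference follow the usual definitions, sketched below with the Statistics standard library. The lowercase names are illustrative, and the sign convention for the mean error (y minus yhat) is an assumption.

```julia
using Statistics

me(y, yhat)   = mean(y .- yhat)            # mean error (signed)
mae(y, yhat)  = mean(abs.(y .- yhat))      # mean absolute error
mse(y, yhat)  = mean((y .- yhat) .^ 2)     # mean squared error
rmse(y, yhat) = sqrt(mse(y, yhat))         # root mean squared error

y    = [1.0, 2.0, 3.0, 4.0]
yhat = [1.0, 2.0, 4.0, 6.0]
```

With these inputs the residuals are 0, 0, -1 and -2, so the absolute and squared metrics weight the largest miss differently.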

Mean(rv::RunningMean)

Returns the current mean inside of a RunningMean object.

Mean(rv::RunningVar)

Returns the current mean inside of a RunningVar object.

MultiNorm(T)

Computes the equivalent of the Frobenius norm on a tensor T. Returns a scalar.

MultiPCA(X; Factors = 2)

Performs multiway PCA aka Higher Order SVD aka Tucker, etc. The number of factors decomposed can be a scalar(repeated across all modes) or a vector/tuple for each mode.

Returns a tuple of (Core Tensor, Basis Tensors)

MulticlassStats(Y, GT, schema; Microaverage = true)

Calculates many essential classification statistics based on predicted values Y, and ground truth values GT, using the encoding schema. Returns a tuple whose first entry is a dictionary of averaged statistics, and whose second entry is a dictionary of the form "Class" => Statistics Dictionary ...

MulticlassThreshold(yhat; level = 0.5)

Effectively does the same thing as Threshold() but per-row across columns.

Warning: this function can allow for no class assignments. HighestVote is preferred.

Mutate( L::BinaryLifeform, amount = 0.05 )

Assesses each element in the gene vector (inside of L). If a randomly drawn value has a binomial probability of amount the element is mutated.

OneHotOdds(Y)

Calculates the odds of a one-hot formatted probability matrix. Returns a tuple.

PCA_NIPALS(X; Factors = minimum(size(X)) - 1, tolerance = 1e-7, maxiters = 200)

Computes a PCA from X using the NIPALS algorithm with a user-specified number of latent variables (Factors). The tolerance is the minimum change in the F norm before ceasing execution. Returns a PCA object.

PSO(fn, Bounds, VelRange, Particles; tolerance = 1e-6, maxiters = 1000, InertialDecay = 0.5, PersonalWeight = 0.5, GlobalWeight = 0.5, InternalParams = nothing)

Minimizes function fn within the user-specified Bounds via a Particle Swarm Optimizer. The particle velocities are limited to the VelRange. The number of particles is defined by the Particles parameter.

Returns a Tuple of the following form: ( GlobalBestPos, GlobalBestScore, P ), where P is an array of the particles used in the optimization.

Note: if the optimization function requires an additional constant parameter, please pass that parameter to InternalParams. This will only work if the optimized parameter (o) and constant parameter (c) for the function of interest have the following format: F(o,c)

Kennedy, J.; Eberhart, R. (1995). Particle Swarm Optimization. Proceedings of IEEE International Conference on Neural Networks. IV. pp. 1942–1948. doi:10.1109/ICNN.1995.488968

PearsonCorrelationCoefficient( y, yhat )

Calculates The Pearson Correlation Coefficient from vectors Y and YHat

PercentRMSE( y, yhat )

Calculates Percent Root Mean Squared Error from vectors Y and YHat

PerfectSmoother(X; lambda = 100)

Applies an asymmetric least squares smoothing function to a 2-Array X. The lambda parameter controls the smoothness. See the reference below for more information.

Paul H. C. Eilers. "A Perfect Smoother". Analytical Chemistry, 2003, 75 (14), pp 3631–3636.

Pipeline( X, FnStack... )

Construct a pipeline object from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data.

Pipeline(Transforms)

Constructs a transformation pipeline from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data.
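
Conceptually, a pipeline threads the data through each transform in order. A minimal sketch using a fold over plain functions; the helper name applypipeline is hypothetical, and the package's pipeline object additionally stores the transforms and supports inversion.

```julia
# Apply each function to the result of the previous one, left to right.
applypipeline(X, fns...) = foldl((data, f) -> f(data), fns; init = X)

addone(x) = x .+ 1
double(x) = x .* 2

result = applypipeline([1.0, 4.0], addone, double)   # (x + 1) * 2 elementwise
```

Order matters: swapping the two steps would compute x * 2 + 1 instead.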

PipelineInPlace( X, FnStack...)

Construct a pipeline object from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data. This function makes "inplace" changes to the Array X as though it has been sent through the pipeline. This is more efficient if memory is a concern, but can irreversibly transform data in memory depending on the transforms in the pipeline.

RAFFT(raw, reference; maxlags::Int = 500, lookahead::Int = 1, minlength::Int = 20, mincorr::Float64 = 0.05)

RAFFT corrects shifts in the raw spectral bands to be similar to those in a given reference spectra through the use of "recursive alignment by FFT". It returns an array of corrected spectra/chromatograms. The number of maximum lags can be specified, the lookahead parameter ensures that additional recursive executions are performed so the first solution found is not preemptively accepted, the minimum segment length(minlength) can also be specified if FWHM are estimable, and the minimum cross correlation(mincorr) for a match can dictate whether peaks were found to align or not.

Note: This method works best with flat baselines because it repeats last known values when padding aligned spectra. It is highly efficient, and in my tests does a good job, but other methods definitely exist. Let me know if other peak alignment methods are important for your workflow, and I'll see if I can implement them.

Application of Fast Fourier Transform Cross-Correlation for the Alignment of Large Chromatographic and Spectral Datasets. Jason W. H. Wong, Caterina Durante, and Hugh M. Cartwright. Analytical Chemistry 2005 77 (17), 5655-5661

RMSE( y, yhat )

Calculates Root Mean Squared Error from vectors Y and YHat

RSquare( y, yhat )

Calculates R^2 from Y and YHat

Remove!(RM::RunningMean, x)

Removes an observation (x) from a RunningMean object (RM) and recalculates the mean in place.

Remove!(RM::RunningMean, x)

Removes an observation (x) from a RunningMean object (RM) and returns the new RunningMean object.

SSE( y, yhat )

Calculates Sum of Squared Errors from vectors Y and YHat

SSReg( y, yhat )

Calculates Sum of Squared Deviations due to Regression from vectors Y and YHat

SSRes( y, yhat )

Calculates Sum of Squared Residuals from vectors Y and YHat

SSTotal( y, yhat )

Calculates Total Sum of Squared Deviations from vectors Y and YHat

SampleSkewness(X)

Returns a measure of skewness for vector X that is corrected for a sample of the population.

Joanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.

SavitzkyGolay(X, Delta, PolyOrder, windowsize)

Performs Savitzky-Golay smoothing across every row in an Array X. The window size is the size of the convolution filter, PolyOrder is the order of the polynomial, and Delta is the order of the derivative.

Savitzky, A.; Golay, M.J.E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures". Analytical Chemistry. 36 (8): 1627–39. doi:10.1021/ac60214a047.

Scale1Norm(X)

Scales the columns of X by the 1-Norm of each row. Returns the scaled array.

Scale2Norm(X)

Scales the columns of X by the 2-Norm of each row. Returns the scaled array.

ScaleInfNorm(X)

Scales the columns of X by the Inf-Norm of each row. Returns the scaled array.

ScaleMinMax(X)

Scales the columns of X by the Min and Max of each row such that no observation is greater than 1 or less than zero. Returns the scaled array.

SecondDerivative(X)

Uses the finite difference method to compute the second derivative for every row in X. Note: This operation results in the loss of two columns.

Shuffle!( X, Y )

Shuffles the rows of the X and Y data without replacement, in place. In place means that this function alters the order of the data in memory and does not return anything.

Shuffle( X, Y )

Shuffles the rows of the X and Y data without replacement. It returns a 2-Tuple of the shuffled set.

SinglePointCrossOver( L1::BinaryLifeform, L2::BinaryLifeform )

Creates two offspring (new BinaryLifeForms) by mixing the genes from L1 and L2 after a random position in the vector.
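
The two genetic operators can be illustrated on plain Bool vectors. This sketch takes the crossover position as an explicit argument so the result is reproducible, whereas the function described above draws it at random. It also assumes that mutating a gene means flipping it; the names crossover and mutate are hypothetical.

```julia
# Swap the tails of two gene vectors after position `point`.
function crossover(g1::AbstractVector{Bool}, g2::AbstractVector{Bool}, point::Int)
    child1 = vcat(g1[1:point], g2[point+1:end])
    child2 = vcat(g2[1:point], g1[point+1:end])
    return child1, child2
end

# Flip each gene independently with probability `amount`.
mutate(genes::AbstractVector{Bool}, amount) = [rand() < amount ? !g : g for g in genes]

g1 = fill(true, 4)
g2 = fill(false, 4)
c1, c2 = crossover(g1, g2, 1)
```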

Skewness(X)

Returns a measure of skewness for a population vector X.

Joanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.

SplitByProportion(X::Array, Y::Array,Proportion::Float64 = 0.5)

Splits an X and Associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on the Proportion. The form of the output is the following: ( (X1, Y1), (X2, Y2) )

SplitByProportion(X::Array, Proportion::Float64 = 0.5)

Splits X Array along the observations dimension into a 2-Tuple based on the Proportion. The form of the output is the following: ( X1, X2 )

StandardNormalVariate(X)

Scales the columns of X by the mean and standard deviation of each row. Returns the scaled array.
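
Standard normal variate scaling standardizes each row (each spectrum) by its own mean and standard deviation. A minimal sketch with the Statistics standard library, assuming the sample standard deviation; the helper name snv is hypothetical.

```julia
using Statistics

# Subtract each row's mean and divide by each row's standard deviation.
snv(X::AbstractMatrix) = (X .- mean(X, dims = 2)) ./ std(X, dims = 2)

X = [1.0 2.0 3.0;
     10.0 20.0 30.0]
S = snv(X)   # both rows collapse to the same shape
```

The second row is the first scaled by ten, so after the transform the two rows are identical, which is the multiplicative effect this scaling is meant to remove.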

StatsDictToDataFrame(DictOfStats, schema)

Converts a dictionary of statistics which is returned from MulticlassStats into a labelled dataframe. This is an intermediate step for automated report generation.

StatsFromTFPN(TP, TN, FP, FN)

Calculates many essential classification statistics based on the numbers of True Positive(TP), True Negative(TN), False Positive(FP), and False Negative(FN) examples.

Threshold(yhat; level = 0.5)

For a binary vector yhat, this decides if the label is a 0 or a 1 based on its value relative to a threshold level.

Update!(RM::RunningMean, x)

Adds new observation(x) to a RunningMean object(RM) in place.

Update!(RV::RunningVar, x)

Adds new observation(x) to a RunningVar object(RV) and updates it in place.

Update!(RM::RunningMean, x)

Adds new observation(x) to a RunningMean object(RM) and returns the new object.

Variance(P::ewma)

This function returns the EWMA control variance.

Variance(rv::RunningVar)

Returns the current variance inside of a RunningVar object.

VenetianBlinds(X,Y)

Splits an X and associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on whether the observation index is even or odd. The form of the output is the following: ( (X1, Y1), (X2, Y2) )

VenetianBlinds(X)

Splits an X Array along the observations dimension into a 2-Tuple based on whether the observation index is even or odd. The form of the output is the following: ( X1, X2 )

boxcar(X; windowsize = 3, fn = mean)

Applies a boxcar function (fn) to each window of size windowsize to every row in X. Note: the function provided must support a dims argument/parameter.

entropy(v)

Calculates the Shannon-Entropy of a probability vector v. Returns a scalar. A common gain function used in tree methods.

findpeaks( vY; m = 3)

Finds the indices of peaks in a vector vY with a window span of 2m. Original R function by Stas_G:(https://stats.stackexchange.com/questions/22974/how-to-find-local-peaks-valleys-in-a-series-of-data) This version is based on a C++ variant by me.

gini(p)

Calculates the GINI coefficient of a probability vector p. Returns a scalar. A common gain function used in tree methods.
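
For reference, the two gain functions are commonly defined as below for tree splitting. The sketch assumes entropy is measured in bits and that the Gini measure is the Gini impurity used by CART; the package's log base and exact definition may differ, and the names shannon and giniimpurity are hypothetical.

```julia
# Shannon entropy in bits; zero-probability entries contribute nothing.
shannon(v) = -sum(p * log2(p) for p in v if p > 0)

# Gini impurity: the chance that two independent draws from p disagree.
giniimpurity(p) = 1 - sum(abs2, p)

even = [0.5, 0.5]   # maximally mixed two-class node
pure = [1.0, 0.0]   # perfectly pure node
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why either can rank candidate splits.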

offsetToZero(X)

Ensures that no observation(row) of Array X is less than zero, by ensuring the minimum value of each row is zero.

plotchem(QQ::{QQ, BlandAltman}; title )

returns either a QQ Plot or a Bland-Altman plot with the defined title

rbinomial( p, size... )

Makes an N-dimensional array of size(s) size with a probability of being a 1 over a 0 of 1 p.

sigmoid(x)

Applies the sigmoid function to a scalar value X. Returns a scalar. Can be broad-casted over an Array.

ssd(p)

Calculates the sum squared deviations from a decision tree split. Accepts a vector of values, and the mean of that vector. Returns a scalar. A common gain function used in tree methods.

(DSX::DirectStandardizationXform)(X; Factors = length(DSX.pca.Values))

Applies the transform from a learned direct standardization object DSX to new data X.

(M::ELM)(X)

Makes an inference from X using an ELM object.

(M::KRR)(X)

Makes an inference from X using a KRR object which wraps a ClassicLeastSquares object.

EWMA(P::ewma)(New; train = true)

Provides an EWMA score for a New scalar value. If train == true the model is updated to include this new value.

(P::pipeline)(X; inverse = false)

Applies the stored transformations in a pipeline object P to data in X. The inverse flag can allow for the transformations to be reversed provided they are invertible functions.

ChangeCenter(P::ewma, new::Float64)

This is a convenience function to update the center of a P EWMA model, to a new scalar value.

diff --git a/docs/man/GeneticAlgorithms/index.html b/docs/man/GeneticAlgorithms/index.html index e7db715..3e77f34 100644 --- a/docs/man/GeneticAlgorithms/index.html +++ b/docs/man/GeneticAlgorithms/index.html @@ -1,2 +1,2 @@ -Genetic Algorithms · ChemometricsTools

Genetic Algorithms

Genetic Algorithms API Reference

Functions

Lifeform(size, onlikelihood, initialscore)

Constructor for a BinaryLifeForm struct. Binary life forms are basically wrappers for a binary vector, each element of which has a likelihood of being 1 (onlikelihood). Each life form also has a score based on its "fitness". Because the GAs in this package can be used to minimize or maximize, initialscore is an open parameter, but Inf/-Inf is a good choice.

source
Mutate( L::BinaryLifeform, amount = 0.05 )

Assesses each element in the gene vector (inside of L). If a randomly drawn value has a binomial probability of amount the element is mutated.

source
SinglePointCrossOver( L1::BinaryLifeform, L2::BinaryLifeform )

Creates two offspring (new BinaryLifeForms) by mixing the genes from L1 and L2 after a random position in the vector.

source
+Genetic Algorithms · ChemometricsTools

Genetic Algorithms

Genetic Algorithms API Reference

Functions

diff --git a/docs/man/MultiWay/index.html b/docs/man/MultiWay/index.html index 5e118a3..be64593 100644 --- a/docs/man/MultiWay/index.html +++ b/docs/man/MultiWay/index.html @@ -1,2 +1,2 @@ -MultiWay · ChemometricsTools

MultiWay

Multiway API Reference

Functions

MultiCenter(Z, mode = 1)

Acquires the mean of the specified mode in Z and returns a transform that will remove those means from any future data.

source
(T::MultiCenter)(Z; inverse = false)

Centers data in Tensor Z mode-wise according to learned centers in MultiCenter object T.

source
MultiScale(Z, mode = 1)

Acquires the standard deviations of the specified mode in Z and returns a transform that will scale by those standard deviations from any future data.

source
(T::MultiScale)(Z; inverse = false)

Scales data in Tensor Z mode-wise according to learned standard deviations in MultiScale object T.

source
MultiNorm(T)

Computes the equivalent of the Frobenius norm on a tensor T. Returns a scalar.

source
MultiPCA(X; Factors = 2)

Performs multiway PCA aka Higher Order SVD aka Tucker, etc. The number of factors decomposed can be a scalar(repeated across all modes) or a vector/tuple for each mode.

Returns a tuple of (Core Tensor, Basis Tensors)

source
+MultiWay · ChemometricsTools

MultiWay

Multiway API Reference

Functions

diff --git a/docs/man/PSO/index.html b/docs/man/PSO/index.html index 22386c9..edf1307 100644 --- a/docs/man/PSO/index.html +++ b/docs/man/PSO/index.html @@ -1,2 +1,2 @@ -PSO · ChemometricsTools

PSO

Particle Swarm Optimizer API Reference

Functions

Bounds(dims)

Constructor for a Bounds object. Returns a bounds object with a lower bound of [lower...] and upper bound[upper...] with length of dims.

source
Bounds(dims)

Default constructor for a Bounds object. Returns a bounds object with a lower bound of [0...] and upper bound[1...] with length of dims.

source
Particle(ProblemBounds, VelocityBounds)

Default constructor for a Particle object. It creates a random, uniformly distributed particle within the specified ProblemBounds, and limits its velocity to the specified VelocityBounds.

source
PSO(fn, Bounds, VelRange, Particles; tolerance = 1e-6, maxiters = 1000, InertialDecay = 0.5, PersonalWeight = 0.5, GlobalWeight = 0.5, InternalParams = nothing)

Minimizes function fn within the user-specified Bounds via a Particle Swarm Optimizer. The particle velocities are limited to the VelRange. The number of particles is defined by the Particles parameter.

Returns a Tuple of the following form: ( GlobalBestPos, GlobalBestScore, P ), where P is an array of the particles used in the optimization.

Note: if the optimization function requires an additional constant parameter, please pass that parameter to InternalParams. This will only work if the optimized parameter (o) and constant parameter (c) for the function of interest have the following format: F(o,c)

Kennedy, J.; Eberhart, R. (1995). Particle Swarm Optimization. Proceedings of IEEE International Conference on Neural Networks. IV. pp. 1942–1948. doi:10.1109/ICNN.1995.488968

source
+PSO · ChemometricsTools

PSO

Particle Swarm Optimizer API Reference

Functions

diff --git a/docs/man/Plotting/index.html b/docs/man/Plotting/index.html new file mode 100644 index 0000000..17551d2 --- /dev/null +++ b/docs/man/Plotting/index.html @@ -0,0 +1,2 @@ + +Plotting Tools API Reference · ChemometricsTools

Plotting Tools API Reference

Plotting Tools API Reference

Functions

diff --git a/docs/man/Preprocess/index.html b/docs/man/Preprocess/index.html index 5e89bab..2726a41 100644 --- a/docs/man/Preprocess/index.html +++ b/docs/man/Preprocess/index.html @@ -1,2 +1,2 @@ -Preprocessing · ChemometricsTools

Preprocessing

Preprocessing API Reference

Functions

CORAL(X1, X2; lambda = 1.0)

Performs CORAL to facilitate covariance based transfer from X1 to X2 with regularization parameter lambda. Returns a CORAL object.

Correlation Alignment for Unsupervised Domain Adaptation. Baochen Sun, Jiashi Feng, Kate Saenko. https://arxiv.org/abs/1612.01939

source
(C::CORAL)(Z)

Applies the transform from a learned CORAL object to new data Z.

source
(T::MultiplicativeScatterCorrection)(Z)

Applies MultiplicativeScatterCorrection from a stored object T to Array Z.

source
OrthogonalSignalCorrection(X, Y; Factors = 1)

Performs Thomas Fearn's Orthogonal Signal Correction to an endogenous X and exogenous Y. The number of Factors are the number of orthogonal components to be removed from X. This function returns an OSC object.

Tom Fearn. On orthogonal signal correction. Chemometrics and Intelligent Laboratory Systems. Volume 50, Issue 1, 2000, Pages 47-52.

source
(OSC::OrthogonalSignalCorrection)(Z; Factors = 2)

Applies the transform from a learned orthogonal signal correction object OSC to new data Z.

source
TransferByOrthogonalProjection(X1, X2; Factors = 1)

Performs Thomas Fearn's Transfer By Orthogonal Projection to facilitate transfer from X1 to X2. Returns a TransferByOrthogonalProjection object.

Anne Andrew, Tom Fearn. Transfer by orthogonal projection: making near-infrared calibrations robust to between-instrument variation. Chemometrics and Intelligent Laboratory Systems. Volume 72, Issue 1, 2004, Pages 51-56.

source
(TbOP::TransferByOrthogonalProjection)(X1; Factors = TbOP.Factors)

Applies the transform from a learned transfer by orthogonal projection object TbOP to new data X1.

source
ALSSmoother(X; lambda = 100, p = 0.001, maxiters = 10)

Applies an asymmetric least squares smoothing function to a 2-Array X. The lambda, p, and maxiters parameters control the smoothness. See the reference below for more information.

Paul H. C. Eilers, Hans F.M. Boelens. Baseline Correction with Asymmetric Least Squares Smoothing. 2005

source
DirectStandardization(InstrumentX1, InstrumentX2; Factors = minimum(collect(size(InstrumentX1))) - 1)

Makes a DirectStandardization object to facilitate the transfer from Instrument #2 to Instrument #1. The returned object can be used to transfer unseen data to the approximated space of instrument 1. The number of Factors used are those from the internal orthogonal basis.

Yongdong Wang and Bruce R. Kowalski, "Calibration Transfer and Measurement Stability of Near-Infrared Spectrometers," Appl. Spectrosc. 46, 764-771 (1992)

source
FirstDerivative(X)

Uses the finite difference method to compute the first derivative for every row in X. Note: This operation results in the loss of a column dimension.
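The column loss mentioned above is easy to see in a minimal sketch. This assumes a plain forward difference between neighbouring columns; the helper name `first_difference` is hypothetical and this is not the package's implementation.

```julia
# Hypothetical helper, not the package's code:
# forward finite difference along each row; the result has one fewer column.
first_difference(X::AbstractMatrix) = diff(X, dims = 2)

X = [1.0 4.0 9.0 16.0;
     2.0 2.0 2.0  2.0]
D = first_difference(X)   # 2x3 array: one column fewer than X
```

Applying the same difference twice gives a second derivative estimate and drops two columns.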

source
FractionalDerivative(Y, X = 1 : length(Y); Order = 0.5)

Calculates the Grünwald-Letnikov fractional order derivative on every row of Array Y. Array X is a vector that has the spacing between column-wise entries in Y. X can be a scalar if that spacing is constant (common in spectroscopy). Order is the fractional order of the derivative. Note: This operation results in the loss of a column dimension.

The Fractional Calculus, by Oldham, K.; and Spanier, J. Hardcover: 234 pages. Publisher: Academic Press, 1974. ISBN 0-12-525550-0

source
PerfectSmoother(X; lambda = 100)

Applies an asymmetric least squares smoothing function to a 2-Array X. The lambda parameter controls the smoothness. See the reference below for more information.

Paul H. C. Eilers. "A Perfect Smoother". Analytical Chemistry, 2003, 75 (14), pp 3631–3636.

source
SavitzkyGolay(X, Delta, PolyOrder, windowsize)

Performs Savitzky-Golay smoothing across every row in an Array X. The window size is the size of the convolution filter, PolyOrder is the order of the polynomial, and Delta is the order of the derivative.

Savitzky, A.; Golay, M.J.E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures". Analytical Chemistry. 36 (8): 1627–39. doi:10.1021/ac60214a047.

source
Scale1Norm(X)

Scales the columns of X by the 1-Norm of each row. Returns the scaled array.

source
Scale2Norm(X)

Scales the columns of X by the 2-Norm of each row. Returns the scaled array.

source
ScaleInfNorm(X)

Scales the columns of X by the Inf-Norm of each row. Returns the scaled array.

source
ScaleMinMax(X)

Scales the columns of X by the Min and Max of each row such that no observation is greater than 1 or less than zero. Returns the scaled array.

source
SecondDerivative(X)

Uses the finite difference method to compute the second derivative for every row in X. Note: This operation results in the loss of two columns.

source
StandardNormalVariate(X)

Scales the columns of X by the mean and standard deviation of each row. Returns the scaled array.
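As a rough illustration of the row-wise scaling described above, here is a minimal sketch. It assumes the sample standard deviation (Julia's default `std`); the helper name `snv_rows` is hypothetical and the package may differ in detail.

```julia
using Statistics

# Hypothetical helper: subtract each row's mean and divide by each row's
# standard deviation (sample standard deviation, the default for `std`).
snv_rows(X::AbstractMatrix) = (X .- mean(X, dims = 2)) ./ std(X, dims = 2)

X = [ 1.0  2.0  3.0;
     10.0 20.0 30.0]
Z = snv_rows(X)   # both rows end up on the same scale
```

After the transform every row has zero mean and unit standard deviation, which is why the two rows above become identical.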

source
boxcar(X; windowsize = 3, fn = mean)

Applies a boxcar function (fn) to each window of size windowsize to every row in X. Note: the function provided must support a dims argument/parameter.

source
offsetToZero(X)

Ensures that no observation(row) of Array X is less than zero, by ensuring the minimum value of each row is zero.

source
(DSX::DirectStandardizationXform)(X; Factors = length(DSX.pca.Values))

Applies the transform from a learned direct standardization object DSX to new data X.

source
+Preprocessing · ChemometricsTools

Preprocessing

Preprocessing API Reference

Functions

diff --git a/docs/man/RegressionModels/index.html b/docs/man/RegressionModels/index.html index a06f328..dd65fb5 100644 --- a/docs/man/RegressionModels/index.html +++ b/docs/man/RegressionModels/index.html @@ -1,2 +1,2 @@ -Regression Models · ChemometricsTools

Regression Models

Regression Models API Reference

Functions

ClassicLeastSquares( X, Y; Bias = false )

Makes a ClassicLeastSquares regression model of the form Y = AX with or without a Bias term. Returns a CLS object.

source
(M::ClassicLeastSquares)(X)

Makes an inference from X using a ClassicLeastSquares object.

source
LSSVM( X, Y, Penalty; KernelParameter = 0.0, KernelType = "linear" )

Makes an LSSVM model of the form Y = AK with a bias term using a user specified Kernel ("linear" or "gaussian") and an L2 Penalty. Returns an LSSVM wrapper for a CLS object.

source
(M::LSSVM)(X)

Makes an inference from X using an LSSVM object.

source
PartialLeastSquares( X, Y; Factors = minimum(size(X)) - 2, tolerance = 1e-8, maxiters = 200 )

Returns a PartialLeastSquares regression model object from arrays X and Y.

  1. Paul Geladi and Bruce R. Kowalski. Partial Least-Squares Regression: A Tutorial. Analytica Chimica Acta, 186 (1986).
  2. Martens H., Næs T. Multivariate Calibration. Wiley: New York, 1989.
  3. Re-interpretation of NIPALS results solves PLSR inconsistency problem. Rolf Ergon. Published in Journal of Chemometrics 2009; Vol. 23/1: 72-75
source
(M::PartialLeastSquares)

Makes an inference from X using a PartialLeastSquares object.

source
(M::PrincipalComponentRegression)( X )

Makes an inference from X using a PrincipalComponentRegression object.

source
PrincipalComponentRegression(PCAObject, Y )

Makes a PrincipalComponentRegression model object from a PCA Object and property value Y.

source
RidgeRegression( X, Y, Penalty; Bias = false )

Makes a RidgeRegression model of the form Y = AX with or without a Bias term and has an L2 Penalty. Returns a CLS object.

source
(M::RidgeRegression)(X)

Makes an inference from X using a RidgeRegression object which wraps a ClassicLeastSquares object.

source
ExtremeLearningMachine(X, Y, ReservoirSize = 10; ActivationFn = sigmoid)

Returns an ELM regression model object from arrays X and Y, with a user specified ReservoirSize and ActivationFn.

Extreme learning machine: a new learning scheme of feedforward neural networks. Guang-Bin Huang ; Qin-Yu Zhu ; Chee-Kheong Siew. 2004 IEEE International Joint...

source
KernelRidgeRegression( X, Y, Penalty; KernelParameter = 0.0, KernelType = "linear" )

Makes a KernelRidgeRegression model of the form Y = AK using a user specified Kernel ("linear" or "gaussian") and an L2 Penalty. Returns a KRR wrapper for a CLS object.

source
sigmoid(x)

Applies the sigmoid function to a scalar value x. Returns a scalar. Can be broadcast over an Array.

source
(M::ELM)(X)

Makes an inference from X using an ELM object.

source
(M::KRR)(X)

Makes an inference from X using a KRR object which wraps a ClassicLeastSquares object.

source
+Regression Models · ChemometricsTools

Regression Models

Regression Models API Reference

Functions

diff --git a/docs/man/Sampling/index.html b/docs/man/Sampling/index.html index 9f32f84..d0f9bb9 100644 --- a/docs/man/Sampling/index.html +++ b/docs/man/Sampling/index.html @@ -1,2 +1,2 @@ -Sampling · ChemometricsTools

Sampling

Sampling API Reference

Functions

KennardStone(X, TrainSamples; distance = "euclidean")

Returns the indices of the Kennard-Stone sampled exemplars (E), and those not sampled (O) as a 2-Tuple (E, O).

R. W. Kennard & L. A. Stone (1969) Computer Aided Design of Experiments, Technometrics, 11:1, 137-148, DOI: 10.1080/00401706.1969.10490666

source
SplitByProportion(X::Array, Proportion::Float64 = 0.5)

Splits X Array along the observations dimension into a 2-Tuple based on the Proportion. The form of the output is the following: ( X1, X2 )

source
SplitByProportion(X::Array, Y::Array,Proportion::Float64 = 0.5)

Splits an X and Associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on the Proportion. The form of the output is the following: ( (X1, Y1), (X2, Y2) )
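The output shape described above can be sketched as follows. This is only an illustration: the helper name `split_rows` is hypothetical, and it assumes the first `floor(n * proportion)` rows form the first set with no shuffling, which may not match the package's rounding or ordering.

```julia
# Hypothetical helper: split rows of X and Y at a proportion of the observations.
# Assumption: the first floor(n * proportion) rows go to the first set, unshuffled.
function split_rows(X::AbstractMatrix, Y::AbstractVecOrMat, proportion::Float64 = 0.5)
    n = size(X, 1)
    cut = floor(Int, n * proportion)
    first_rows, last_rows = 1:cut, (cut + 1):n
    return (X[first_rows, :], Y[first_rows, :]), (X[last_rows, :], Y[last_rows, :])
end

X = reshape(collect(1.0:16.0), 8, 2)   # 8 observations, 2 variables
Y = reshape(collect(1.0:8.0), 8, 1)    # 8 property values
(X1, Y1), (X2, Y2) = split_rows(X, Y, 0.25)
```

With 8 observations and a proportion of 0.25, the first pair holds 2 rows and the second pair holds the remaining 6.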

source
VenetianBlinds(X,Y)

Splits an X and associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on whether the row index is even or odd. The form of the output is the following: ( (X1,Y1), (X2, Y2) )
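A venetian blinds split is simple enough to sketch by hand. The helper name `venetian_split` is hypothetical, and which half (odd or even rows) comes first is an assumption here, not something the docstring states.

```julia
# Hypothetical helper: odd-indexed rows form the first set, even-indexed rows the second.
function venetian_split(X::AbstractMatrix)
    odd  = 1:2:size(X, 1)
    even = 2:2:size(X, 1)
    return X[odd, :], X[even, :]
end

X = reshape(collect(1.0:10.0), 5, 2)   # rows are observations
X1, X2 = venetian_split(X)             # rows 1, 3, 5 and rows 2, 4
```

Because the split is deterministic, it is only a fair train/test split when the row order carries no trend.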

source
VenetianBlinds(X)

Splits an X Array along the observations dimension into a 2-Tuple based on whether the row index is even or odd. The form of the output is the following: ( X1, X2 )

source
+Sampling · ChemometricsTools

Sampling

Sampling API Reference

Functions

diff --git a/docs/man/Stats/index.html b/docs/man/Stats/index.html index 53e5fd0..c2cae95 100644 --- a/docs/man/Stats/index.html +++ b/docs/man/Stats/index.html @@ -1,2 +1,2 @@ -Stats. · ChemometricsTools

Stats.

Stats API Reference

Functions

RunningMean(x)

Constructs a running mean object with an initial scalar value of x.

source
RunningVar(x)

Constructs a RunningVar object with an initial scalar value of x. Note: RunningVar objects implicitly calculate the running mean.
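To see why a running variance carries a running mean along with it, here is a minimal sketch of Welford's online update. The type `RunningStats` and the functions `update!` and `sample_variance` are hypothetical names for illustration; whether the package uses this exact recursion, or the sample rather than population variance, is an assumption.

```julia
# Hypothetical type for illustration (Welford's online algorithm).
mutable struct RunningStats
    n::Int          # number of observations seen so far
    mean::Float64   # running mean
    m2::Float64     # running sum of squared deviations from the mean
end

RunningStats(x::Real) = RunningStats(1, float(x), 0.0)

function update!(s::RunningStats, x::Real)
    s.n += 1
    delta = x - s.mean
    s.mean += delta / s.n           # the mean is updated first
    s.m2 += delta * (x - s.mean)    # then the squared deviations, using the new mean
    return s
end

# Assumption: sample variance (n - 1 in the denominator).
sample_variance(s::RunningStats) = s.n > 1 ? s.m2 / (s.n - 1) : 0.0

s = RunningStats(2.0)
for x in (4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    update!(s, x)
end
```

Each update costs constant time and memory, so the statistics never require storing the observations themselves.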

source
EmpiricalQuantiles(X, quantiles)

Finds the column-wise quantiles of 2-Array X and returns them in a 2-Array of size quantiles by variables. Note: This copies the array. Use a subset if memory is a concern.

source
Mean(rv::RunningMean)

Returns the current mean inside of a RunningMean object.

source
Mean(rv::RunningVar)

Returns the current mean inside of a RunningVar object.

source
Remove!(RM::RunningMean, x)

Removes an observation(x) from a RunningMean object(RM) and recalculates the mean in place.

source
Remove!(RM::RunningMean, x)

Removes an observation(x) from a RunningMean object(RM) and returns the new RunningMean object.

source
SampleSkewness(X)

Returns a measure of skewness for vector X that is corrected for a sample of the population.

Joanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.

source
Skewness(X)

Returns a measure of skewness for a population vector X.

Joanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.

source
Update!(RM::RunningMean, x)

Adds new observation(x) to a RunningMean object(RM) in place.

source
Update!(RV::RunningVar, x)

Adds new observation(x) to a RunningVar object(RV) and updates it in place.

source
Update!(RM::RunningMean, x)

Adds new observation(x) to a RunningMean object(RM) and returns the new object.

source
Variance(rv::RunningVar)

Returns the current variance inside of a RunningVar object.

source
rbinomial( p, size... )

Makes an N-dimensional array of size(s) size with a probability of being a 1 over a 0 of 1 p.

source
+Stats. · ChemometricsTools

Stats.

Stats API Reference

Functions

diff --git a/docs/man/TimeSeries/index.html b/docs/man/TimeSeries/index.html index 496639e..2762660 100644 --- a/docs/man/TimeSeries/index.html +++ b/docs/man/TimeSeries/index.html @@ -1,2 +1,2 @@ -Time Series · ChemometricsTools

Time Series

Time Series API Reference

Functions

RollingWindow(samples::Int,windowsize::Int,skip::Int)

Creates a RollingWindow iterator from a number of samples and a static windowsize where every iteration skip steps are skipped. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.

source
RollingWindow(samples::Int,windowsize::Int)

Creates a RollingWindow iterator from a number of samples and a static windowsize. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.
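The indices such an iterator yields can be pictured with a small sketch. The helper name `rolling_windows` is hypothetical, and it assumes `skip` is the step between consecutive window starts; the package's iterator may define skipping differently.

```julia
# Hypothetical helper: index ranges of a fixed-size window sliding over 1:samples.
# Assumption: `skip` is the step between consecutive window start positions.
rolling_windows(samples::Int, windowsize::Int, skip::Int = 1) =
    [start:(start + windowsize - 1) for start in 1:skip:(samples - windowsize + 1)]

every_step  = rolling_windows(6, 3)      # windows starting at 1, 2, 3, 4
every_other = rolling_windows(6, 3, 2)   # windows starting at 1 and 3
```

Each range can be used directly to index rows of a data array inside a for loop.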

source
EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))

Constructs an exponentially weighted moving average object from a vector of scalar property values Initial and the decay parameter Lambda. This computes the running statistics necessary for creating the EWMA model using the interval provided and updates the center value to the mean of the provided values.

source
EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))

Constructs an exponentially weighted moving average object from an initial scalar property value Initial and the decay parameter Lambda. This defaults the center value to be the initial value.

source
Limits(P::ewma; k = 3.0)

This function returns the upper and lower control limits with a k span of variance for an EWMA object P.
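For intuition, the textbook EWMA recursion and its asymptotic control limit half-width can be sketched as below. This is a sketch under assumptions: the helper names `ewma_step`, `ewma_series`, and `ewma_halfwidth` are hypothetical, and the package's ewma object may compute its limits differently (for example from its running variance).

```julia
# Hypothetical helpers illustrating the textbook EWMA chart.
# One update step: the new score blends the new value with the previous score.
ewma_step(z_prev, x, lambda) = lambda * x + (1 - lambda) * z_prev

function ewma_series(xs, lambda; z0 = first(xs))
    z = z0
    out = Float64[]
    for x in xs
        z = ewma_step(z, x, lambda)
        push!(out, z)
    end
    return out
end

# Asymptotic half-width of the control limits around the center:
# k * sigma * sqrt(lambda / (2 - lambda)).
ewma_halfwidth(sigma, lambda; k = 3.0) = k * sigma * sqrt(lambda / (2 - lambda))

scores = ewma_series([10.0, 12.0, 8.0], 0.5)
halfwidth = ewma_halfwidth(1.0, 0.5)
```

A smaller Lambda gives smoother scores and tighter limits; Lambda = 1 reduces the chart to plotting the raw values.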

source
Variance(P::ewma)

This function returns the EWMA control variance.

source
EWMA(P::ewma)(New; train = true)

Provides an EWMA score for a New scalar value. If train == true the model is updated to include this new value.

source
ChangeCenter(P::ewma, new::Float64)

This is a convenience function to update the center of a P EWMA model, to a new scalar value.

source
+Time Series · ChemometricsTools

Time Series

Time Series API Reference

Functions

diff --git a/docs/man/Training/index.html b/docs/man/Training/index.html index 9efe4c3..a170fde 100644 --- a/docs/man/Training/index.html +++ b/docs/man/Training/index.html @@ -1,2 +1,2 @@ -Training · ChemometricsTools

Training

Training API Reference

Functions

KFoldsValidation(K::Int, x, y)

Returns a KFoldsValidation iterator with K folds. Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY)).
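The shape of each iteration can be illustrated without the package. The sketch below assumes contiguous, unshuffled folds; `kfold_indices` is a hypothetical helper, not the package's iterator.

```julia
# Hypothetical helper: contiguous K-fold (train, validate) index pairs over 1:n.
function kfold_indices(n::Int, k::Int)
    bounds = [div(i * n, k) for i in 0:k]   # fold boundaries, e.g. 0, 3, 6, 10
    folds = Vector{Tuple{Vector{Int}, Vector{Int}}}()
    for i in 1:k
        validate = collect((bounds[i] + 1):bounds[i + 1])
        train = setdiff(1:n, validate)
        push!(folds, (train, validate))
    end
    return folds
end

X = reshape(collect(1.0:20.0), 10, 2)
y = collect(1.0:10.0)
folds = kfold_indices(10, 3)
for (train, validate) in folds
    TrainX, TrainY = X[train, :], y[train]
    ValidateX, ValidateY = X[validate, :], y[validate]
    # fit a model on (TrainX, TrainY) and score it on (ValidateX, ValidateY)
end
```

Every observation lands in exactly one validation fold. Setting k equal to the number of observations gives leave-one-out validation.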

source
LeaveOneOut(x, y)

Returns a KFoldsValidation iterator with leave one out folds. Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY)).

source
Shuffle!( X, Y )

Shuffles the rows of the X and Y data in place, without replacement. In place means that this function alters the order of the data in memory and does not return anything.

source
Shuffle( X, Y )

Shuffles the rows of the X and Y data without replacement. It returns a 2-Tuple of the shuffled set.

source
+Training · ChemometricsTools

Training

Training API Reference

Functions

diff --git a/docs/man/Transformations/index.html b/docs/man/Transformations/index.html index eae8ffd..34cdd37 100644 --- a/docs/man/Transformations/index.html +++ b/docs/man/Transformations/index.html @@ -1,2 +1,2 @@ -Transformations/Pipelines · ChemometricsTools

Transformations/Pipelines

Transformations/Pipelines API Reference

Functions

(T::Center)(Z; inverse = false)

Centers data in array Z column-wise according to learned mean centers in Center object T.

source
(T::CenterScale)(Z; inverse = false)

Centers and scales data in array Z column-wise according to learned measures of central tendency in CenterScale object T.

source
(T::QuantileTrim)(X, inverse = false)

Trims data in array X column-wise according to learned quantiles in QuantileTrim object T. This function does NOT have an inverse.

source
QuantileTrim(Z; quantiles::Tuple{Float64,Float64} = (0.05, 0.95) )

Trims values above or below the specified columnwise quantiles to the quantile values themselves.

source
(T::RangeNorm)(Z; inverse = false)

Scales and shifts data in array Z column-wise according to learned min-maxes in RangeNorm object T.

source
(T::Scale)(Z; inverse = false)

Scales data in array Z column-wise according to learned standard deviations in Scale object T.

source
Logit(Z; inverse = false)

Logit transforms (ln( X / (1 - X) )) every element in Z. The inverse may also be applied. Warning: This can return Infs and NaNs if elements of Z are not suited to the transform.
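The transform and its inverse are short enough to write out. The names `logit` and `inverse_logit` below are hypothetical standalone helpers for illustration, not the package's functions.

```julia
# Hypothetical standalone helpers for illustration.
logit(x) = log(x / (1 - x))
inverse_logit(z) = 1 / (1 + exp(-z))

Z = [0.1 0.5; 0.9 0.25]
L = logit.(Z)               # broadcast over every element
back = inverse_logit.(L)    # recovers Z up to floating point error
```

The transform is only finite for elements strictly between 0 and 1; the endpoints 0 and 1 map to -Inf and Inf in this sketch.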

source
Pipeline( X, FnStack... )

Constructs a pipeline object from a vector/tuple of Transforms. The Transforms are effectively a vector of functions which transform data.

source
Pipeline(Transforms)

Constructs a transformation pipeline from a vector/tuple of Transforms. The Transforms are effectively a vector of functions which transform data.

source
PipelineInPlace( X, FnStack...)

Constructs a pipeline object from a vector/tuple of Transforms. The Transforms are effectively a vector of functions which transform data. This function makes "inplace" changes to the Array X as though it has been sent through the pipeline. This is more efficient if memory is a concern, but can irreversibly transform data in memory depending on the transforms in the pipeline.

source
(P::pipeline)(X; inverse = false)

Applies the stored transformations in a pipeline object P to data in X. The inverse flag can allow for the transformations to be reversed provided they are invertible functions.

source
+Transformations/Pipelines · ChemometricsTools

Transformations/Pipelines

Transformations/Pipelines API Reference

Functions

diff --git a/docs/man/Trees/index.html b/docs/man/Trees/index.html index ae39963..a6d6dda 100644 --- a/docs/man/Trees/index.html +++ b/docs/man/Trees/index.html @@ -1,2 +1,2 @@ -Tree Methods · ChemometricsTools

Tree Methods

Tree Methods API Reference

Functions

ClassificationTree(x, y; gainfn = entropy, maxdepth = 4, minbranchsize = 3)

Builds a CART object using either gini or entropy as a partitioning method. y must be a one hot encoded 2-Array. Predictions can be formed by calling the following function from the CART object: (M::CART)(x).

Note: this is a purely nonrecursive decision tree. The Julia compiler doesn't like storing structs of nested things. I wrote it the recursive way in the past and it was quite slow; I think this is also true of interpreted languages like R/Python. So here it is, nonrecursive trees!

source
OneHotOdds(Y)

Calculates the odds of a one-hot formatted probability matrix. Returns a tuple.

source
entropy(v)

Calculates the Shannon-Entropy of a probability vector v. Returns a scalar. A common gain function used in tree methods.

source
gini(p)

Calculates the GINI coefficient of a probability vector p. Returns a scalar. A common gain function used in tree methods.
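Both gain functions reduce to one-liners on a probability vector. The sketch below uses hypothetical names and makes two assumptions the docstrings do not confirm: entropy is taken in base 2, and the Gini measure is the impurity form 1 - sum(p.^2) commonly used in tree methods.

```julia
# Hypothetical helpers for illustration.
# Assumption: base-2 logarithm; zero probabilities contribute nothing.
shannon_entropy(p) = -sum(x -> x > 0 ? x * log2(x) : 0.0, p)

# Assumption: Gini impurity, 1 minus the sum of squared class probabilities.
gini_impurity(p) = 1 - sum(abs2, p)

mixed = [0.5, 0.5]   # maximally mixed two-class node
pure  = [1.0, 0.0]   # pure node
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is what makes them usable as split criteria.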

source
ssd(p)

Calculates the sum squared deviations from a decision tree split. Accepts a vector of values, and the mean of that vector. Returns a scalar. A common gain function used in tree methods.

source
+Tree Methods · ChemometricsTools

Tree Methods

Tree Methods API Reference

Functions

diff --git a/docs/man/classMetrics/index.html b/docs/man/classMetrics/index.html index c116fd7..57796da 100644 --- a/docs/man/classMetrics/index.html +++ b/docs/man/classMetrics/index.html @@ -1,2 +1,2 @@ -Classification Metrics · ChemometricsTools

Classification Metrics

Classification Metrics API Reference

Functions

ColdToHot(Y, Schema::ClassificationLabel)

Turns a cold encoded Y vector into a one hot encoded array.

source
HighestVote(yhat)

Returns the column index for each row that has the highest value in one hot encoded yhat. Returns a one cold encoded vector.

source
HighestVoteOneHot(yhat)

Turns the highest column-wise value to a 1 and the others to zeros per row in a one hot encoded yhat. Returns a one hot encoded array.

source
HotToCold(Y, Schema::ClassificationLabel)

Turns a one hot encoded Y array into a cold encoded vector.

source
IsColdEncoded(Y)

Returns a boolean true if the array Y is cold encoded, and false if not.

source
LabelEncoding(HotOrCold)

Determines if an Array, Y, is one hot encoded or cold encoded by its dimensions. Returns a ClassificationLabel object/schema to convert between the formats.

source
MulticlassStats(Y, GT, schema; Microaverage = true)

Calculates many essential classification statistics based on predicted values Y, and ground truth values GT, using the encoding schema. Returns a tuple whose first entry is a dictionary of averaged statistics, and whose second entry is a dictionary of the form "Class" => Statistics Dictionary ...

source
MulticlassThreshold(yhat; level = 0.5)

Effectively does the same thing as Threshold() but per-row across columns.

Warning: this function can allow for no class assignments. HighestVote is preferred.

source
StatsDictToDataFrame(DictOfStats, schema)

Converts a dictionary of statistics which is returned from MulticlassStats into a labelled dataframe. This is an intermediate step for automated report generation.

source
StatsFromTFPN(TP, TN, FP, FN)

Calculates many essential classification statistics based on the numbers of True Positive(TP), True Negative(TN), False Positive(FP), and False Negative(FN) examples.
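A few of the usual statistics derived from these four counts are sketched below. The function name `basic_stats` and the dictionary keys are hypothetical; the docstring does not list which statistics the package returns or how it names them.

```julia
# Hypothetical helper: common statistics from confusion matrix counts.
function basic_stats(TP, TN, FP, FN)
    prec = TP / (TP + FP)   # fraction of predicted positives that are correct
    rec  = TP / (TP + FN)   # fraction of actual positives that are found
    return Dict(
        "Precision"   => prec,
        "Recall"      => rec,
        "Specificity" => TN / (TN + FP),
        "Accuracy"    => (TP + TN) / (TP + TN + FP + FN),
        "FMeasure"    => 2 * prec * rec / (prec + rec),
    )
end

stats = basic_stats(8, 9, 2, 1)
```

Note that these ratios are undefined (NaN) when a denominator is zero, for example when no positives are predicted.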

source
Threshold(yhat; level = 0.5)

For a binary vector yhat this decides if the label is a 0 or a 1 based on its value relative to a threshold level.

source
+Classification Metrics · ChemometricsTools

Classification Metrics

Classification Metrics API Reference

Functions

diff --git a/docs/man/regressMetrics/index.html b/docs/man/regressMetrics/index.html index 4fb83f6..6b9d788 100644 --- a/docs/man/regressMetrics/index.html +++ b/docs/man/regressMetrics/index.html @@ -1,2 +1,2 @@ -Regression Metrics · ChemometricsTools

Regression Metrics

Regression Metrics API Reference

Functions

MAE( y, yhat )

Calculates Mean Absolute Error from vectors Y and YHat

source
MAPE( y, yhat )

Calculates Mean Absolute Percent Error from vectors Y and YHat

source
ME( y, yhat )

Calculates Mean Error from vectors Y and YHat.

source
MSE( y, yhat )

Calculates Mean Squared Error from vectors Y and YHat

source
PearsonCorrelationCoefficient( y, yhat )

Calculates The Pearson Correlation Coefficient from vectors Y and YHat

source
PercentRMSE( y, yhat )

Calculates Percent Root Mean Squared Error from vectors Y and YHat

source
RMSE( y, yhat )

Calculates Root Mean Squared Error from vectors Y and YHat

source
RSquare( y, yhat )

Calculates R^2 from Y and YHat
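The metrics above follow their standard definitions, sketched here with hypothetical lowercase names so they do not clash with the package's functions. The R^2 form shown (1 - SSRes/SSTotal) is an assumption about which definition the package uses.

```julia
using Statistics

# Hypothetical helpers using the standard textbook definitions.
mae(y, yhat)  = mean(abs.(y .- yhat))
mse(y, yhat)  = mean((y .- yhat) .^ 2)
rmse(y, yhat) = sqrt(mse(y, yhat))

# Assumption: R^2 = 1 - (residual sum of squares) / (total sum of squares).
r_square(y, yhat) = 1 - sum((y .- yhat) .^ 2) / sum((y .- mean(y)) .^ 2)

y    = [1.0, 2.0, 3.0, 4.0]
yhat = [1.0, 2.0, 3.0, 6.0]   # one prediction is off by 2
```

For this example the single large error dominates: the squared-error metrics penalise it more heavily than the absolute-error one.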

source
SSE( y, yhat )

Calculates Sum of Squared Errors from vectors Y and YHat

source
SSReg( y, yhat )

Calculates Sum of Squared Deviations due to Regression from vectors Y and YHat

source
SSRes( y, yhat )

Calculates Sum of Squared Residuals from vectors Y and YHat

source
SSTotal( y, yhat )

Calculates Total Sum of Squared Deviations from vectors Y and YHat

source
+Regression Metrics · ChemometricsTools

Regression Metrics

Regression Metrics API Reference

Functions

diff --git a/docs/search_index.js b/docs/search_index.js index fe70252..d253a78 100644 --- a/docs/search_index.js +++ b/docs/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"#ChemometricsTools.jl-1","page":"Home","title":"ChemometricsTools.jl","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"A Chemometrics Suite for Julia.","category":"page"},{"location":"#","page":"Home","title":"Home","text":"This package offers access to essential chemometrics methods in a convenient and reliable way. It is a lightweight library written for performance and longevity. That being said, it's still a bit of a work in progress and if you find any bugs please make an issue!","category":"page"},{"location":"#Installation:-1","page":"Home","title":"Installation:","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"using Pkg\nPkg.add(\"ChemometricsTools\")","category":"page"},{"location":"#Support:-1","page":"Home","title":"Support:","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"This package was written in Julia 1.0.3 but should run fine in 1.1 or later releases. That's the beauty of from scratch code with minimal dependencies.","category":"page"},{"location":"#Ethos-1","page":"Home","title":"Ethos","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"Dependencies: Only base libraries (LinearAlgebra, StatsBase, Statistics, Plots) etc will be required. This is for longevity, and to provide a fast precompilation time. As wonderful as it is that other packages exist to do some of the internal operations this one needs, we won't have to worry about a breaking change made by an external author working out the kinks in a separate package. I want this to be long-term reliable without much upkeep. 
I'm a busy guy working a day job; I write this to warm-up before work, and unwind afterwards.","category":"page"},{"location":"#","page":"Home","title":"Home","text":"Arrays Only: In it's current state all of the algorithms available in this package operate exclusively on 1 or 2 Arrays. To be specific, the format of input arrays should be such that the number of rows are the observations, and the number of columns are the variables. This choice was made out of convenience and my personal bias. If enough users want DataFrames, Tables, JuliaDB formats, maybe this will change.","category":"page"},{"location":"#","page":"Home","title":"Home","text":"Center-Scaling: None of the methods in this package will center and scale for you. This package won't waste your time deciding if it should auto-center/scale large chunks of data every-time you do a regression/classification.","category":"page"},{"location":"#Why-Julia?-1","page":"Home","title":"Why Julia?","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"In Julia we can do mathematics like R or Matlab (no installations/imports), but write glue code as easily as python, with the expressiveness of scala, with (often) the performance of C/C++. Multidispatch makes recycling code painless, and broadcasting allows for intuitive application of operations across collections. I'm not a soft-ware engineer, but, these things have made Julia my language of choice. Try it for a week on Julia 1.0.3, if you don't get hooked, I'd be surprised.","category":"page"},{"location":"Demos/Transforms/#Transforms-Demo-1","page":"Transforms","title":"Transforms Demo","text":"","category":"section"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"Two design choices introduced in this package are \"Transformations\" and \"Pipelines\". Transformations are the smallest unit of a 'pipeline'. They are simply functions that have a deterministic inverse. 
For example if we mean center our data and store the mean vector, we can always invert the transform by adding the mean back to the data. That's effectively what transforms do, they provide to and from common data transformations used in chemometrics.","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"Let's start with a trivial example with faux data where a random matrix of data is center scaled and divided by the standard deviation(StandardNormalVariate):","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"FauxSpectra1 = randn(10,200);\nSNV = StandardNormalVariate(FauxSpectra1);\nTransformed1 = SNV(FauxSpectra1);","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"As can be seen the application of the StandardNormalVariate() function returns an object that is used to transform future data by the data it was created from. This object can be applied to new data as follows,","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"FauxSpectra2 = randn(10,200);\nTransformed2 = SNV(FauxSpectra2);","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"Transformations can also be inverted (with-in numerical noise). For example,","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"RMSE(FauxSpectra1, SNV(Transformed1; inverse = true)) < 1e-14\nRMSE(FauxSpectra2, SNV(Transformed2; inverse = true)) < 1e-14","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"We can use transformations to treat data from multiple sources the same way. This helps mitigate user-error for cases where test data is scaled based on training data, calibration transfer, etc. 
Pipelines are a logical and convenient extension of transformations.","category":"page"},{"location":"Demos/Pipelines/#Pipelines-Demo-1","page":"Pipelines","title":"Pipelines Demo","text":"","category":"section"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Multiple Transformations can be easily chained together and stored using \"Pipelines\". Preprocessing methods, or really any univariate function may be included in a pipeline, but that will likely mean it can no longer be inverted. Pipelines are basically convenience functions, but are somewhat flexible and can be used for automated searches,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"PreprocessPipe = Pipeline(FauxSpectra1, RangeNorm, Center);\nProcessed = PreprocessPipe(FauxSpectra1);","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Of course pipelines of transforms can also be inverted,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"RMSE( FauxSpectra1, PreprocessPipe(Processed; inverse = true) ) < 1e-14","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Pipelines can also be created and executed as an 'in place' operation for large datasets. This has the advantage that your data is transformed immediately without making copies in memory. This may be useful for large datasets and memory constrained environments. WARNING: be careful to only run the pipeline call or its inverse once! 
It is much safer to use the not inplace function outside of a REPL/script environment.","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"FauxSpectra = randn(10,200);\nOriginalCopy = copy(FauxSpectra);\nInPlacePipe = PipelineInPlace(FauxSpectra, Center, Scale);","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"See without returning the data or an extra function call we have transformed it according to the pipeline as it was instantiated...","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"FauxSpectra == OriginalCopy\n#Inplace transform the data back\nInPlacePipe(FauxSpectra; inverse = true)\nRMSE( OriginalCopy, FauxSpectra ) < 1e-14","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Pipelines are kind of flexible. We can put nontransform (operations that cannot be inverted) preprocessing steps in them as well. In the example below the first derivative is applied to the data, this irreversibly removes a column from the data,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"PreprocessPipe = Pipeline(FauxSpectra1, FirstDerivative, RangeNorm, Center);\nProcessed = PreprocessPipe(FauxSpectra1);\n#This should be equivalent to the following...\nSpectraDeriv = FirstDerivative(FauxSpectra1);\nAlternative = Pipeline(SpectraDeriv , RangeNorm, Center);\nProcessed == Alternative(SpectraDeriv)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Great right? 
Well what happens if we try to do the inverse of our pipeline with an irreversible function (First Derivative) in it?","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"PreprocessPipe(Processed; inverse = true)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Well we get an assertion error.","category":"page"},{"location":"Demos/Pipelines/#Automated-Pipeline-Example-1","page":"Pipelines","title":"Automated Pipeline Example","text":"","category":"section"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"We can take advantage of how pipelines are created; at their core they are tuples of transforms/functions. So if we can make an array of transforms and set some conditions they can be stored and applied to unseen data. A fun example of an automated transform pipeline is in the whimsical paper written by Willem Windig et. al. That paper is called 'Loopy Multiplicative Scatter Transform'. Below I'll show how we can implement that algorithm here (or anything similar) with ease. Loopy MSC: A Simple Way to Improve Multiplicative Scatter Correction. Willem Windig, Jeremy Shaver, Rasmus Bro. Applied Spectroscopy. 2008. 
Vol 62, issue: 10, 1153-1159","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"First let's look at the classic Diesel data before applying Loopy MSC (Image: rawspectra)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Alright, there is scatter, let's go for it,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"RealSpectra = convert(Array, CSV.read(\"/diesel_spectra.csv\"));\nCurrent = RealSpectra;\nLast = zeros(size(Current));\nTransformArray = [];\nwhile RMSE(Last, Current) > 1e-5\n if any(isnan.(Current))\n break\n else\n push!(TransformArray, MultiplicativeScatterCorrection( Current ) )\n Last = Current\n Current = TransformArray[end](Last)\n end\nend\n#Now we can make a pipeline object from the array of stored transforms\nLoopyPipe = Pipeline( Tuple( TransformArray ) );","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"For a sanity check we can ensure the output of the algorithm is the same as the new pipeline so it can be applied to new data.","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Current == LoopyPipe(RealSpectra)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Looks like our automation driven pipeline is equivalent to the loop it took to make it. More importantly did we remove scatter after 3 automated iterations of MSC? (Image: loopymsc)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Yes, yes we did. 
Pretty easy, right?","category":"page"},{"location":"Demos/ClassificationExample/#Classification-Demo:-1","page":"Classification","title":"Classification Demo:","text":"","category":"section"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"This demo shows an applied solution to a classification problem using real mid-infrared data. If you want to see the gamut of methods included in ChemometricsTools, check the classification shootout example. There's also a bunch of tools for changes of basis such as: principal components analysis, linear discriminant analysis, orthogonal signal correction, etc. With those kinds of tools we can reduce the dimensions of our data and make classes more separable. So separable that trivial classification methods like a Gaussian discriminant can get us pretty good results. Below is an example analysis performed on mid-infrared spectra of strawberry purees and adulterated strawberry purees (yes, fraudulent food items are a common concern).","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"(Image: Raw)","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"Use of Fourier transform infrared spectroscopy and partial least squares regression for the detection of adulteration of strawberry purées. 
J K Holland, E K Kemsley, R H Wilson","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"snv = StandardNormalVariate(Train);\nTrain_pca = PCA(snv(Train); Factors = 15);\n\nEnc = LabelEncoding(TrnLbl);\nHot = ColdToHot(TrnLbl, Enc);\n\nlda = LDA(Train_pca.Scores , Hot);\n#Project the training scores into the LDA space\nTrainS = lda(Train_pca.Scores);\nclassifier = GaussianDiscriminant(lda, TrainS, Hot)\nTrainPreds = classifier(TrainS; Factors = 2);","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"(Image: LDA of PCA)","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"Cool, right? Well, we can now apply the same transformations to the test set and pull some multivariate Gaussians over the train set classes to see how we do identifying fraudulent purees,","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"TestSet = Train_pca(snv(Test));\nTestSet = lda(TestSet);\nTestPreds = classifier(TestSet; Factors = 2);\nMulticlassStats(TestPreds .- 1, TstLbl , Enc)","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"If you're following along you'll get a ~92% F-measure depending on your random split. Not bad. You may also notice this package has a nice collection of performance metrics for classification on board. Anyways, I've gotten 100%'s with more advanced methods, but this is a cute way to show off some of the tools currently available.","category":"page"},{"location":"Demos/RegressionExample/#Regression/Training-Demo:-1","page":"Regression","title":"Regression/Training Demo:","text":"","category":"section"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"This demo shows a few ways to build a PLS regression model and perform cross validation. 
If you want to see the gamut of regression methods included in ChemometricsTools, check the regression shootout example.","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"There are a few built-ins to make training models a snap. Philosophically, I decided that making wrapper functions to perform Cross Validation is not fair to the end-user. There are many cases where we want specialized CV's but we don't want to write nested for-loops that run for hours then debug them... Similarly, most people don't want to spend their time hacking into rigid GridSearch objects, or scouring stack exchange / package documentation. Especially when it'd be easier to write an equivalent approach that is self documenting from scratch. Instead, I used Julia's iterators to make K-Fold validations convenient. Below is an example Partial Least Squares Regression CV.","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"#Split our data into two parts one 70% one 30%\n((TrainX,TrainY),(TestX, TestY)) = SplitByProportion(x, yprop, 0.7);\n#Preprocess it\nMSC_Obj = MultiplicativeScatterCorrection(TrainX);\nTrainX = MSC_Obj(TrainX);\nTestX = MSC_Obj(TestX);\n#Begin CV!\nLatentVariables = 22\nErr = repeat([0.0], LatentVariables);\n#Note this is the Julian way to nest two loops\nfor Lv in 1:LatentVariables, (Fold, HoldOut) in KFoldsValidation(20, TrainX, TrainY)\n PLSR = PartialLeastSquares(Fold[1], Fold[2]; Factors = Lv)\n Err[Lv] += SSE( PLSR(HoldOut[1]), HoldOut[2] )\nend\nscatter(Err, xlabel = \"Latent Variables\", ylabel = \"Cumulative SSE\", labels = [\"Error\"])\nBestLV = argmin(Err)\nPLSR = PartialLeastSquares(TrainX, TrainY; Factors = BestLV)\nRMSE( PLSR(TestX), TestY )","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"(Image: 
20folds)","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"That's great, right? But hey, that was kind of slow. Knowing what we know about ALS based models, we can do the same operation in linear time with respect to latent factors by computing the model with the most latent variables first and only recomputing the regression coefficients. An example of this is below,","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"Err = repeat([0.0], 22);\nModels = []\nfor Lv in 22:-1:1\n for ( i, ( Fold, HoldOut ) ) in enumerate(KFoldsValidation(20, TrainX, TrainY))\n if Lv == 22\n push!( Models, PartialLeastSquares(Fold[1], Fold[2]; Factors = Lv) )\n end\n Err[Lv] += SSE( Models[i]( HoldOut[1]; Factors = Lv), HoldOut[2] )\n end\nend","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"This approach is ~5 times faster on a single core (< 2 seconds), churns through 7Gb less data, and makes 1/5th the allocations (on this dataset at least). If you wanted, you could distribute the inner loop (using Distributed.jl) and see drastic speed ups!","category":"page"},{"location":"Demos/SIPLS/#Stacked-Interval-Partial-Least-Squares-1","page":"SIPLS","title":"Stacked Interval Partial Least Squares","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"Here's a post I kind of debated making... I once read a paper stating that SIPLS was \"too complicated\" to implement, and used that as an argument to favor other methods. SIPLS is actually pretty simple, highly effective, and it has statistical guarantees. What's complicated about SIPLS is providing it to end-users without shielding them from the internals, or leaving them with a pile of hard to read low level code. I decided the way to go for 'advanced' methods is to just provide convenience functions. 
Make life easier for an end-user that knows what they are doing. Demos are for helping ferry people along and showing at least one way to do things, but there's no golden ticket one-line generic code-base here. Providing one would be a disservice to people who would actually rely on using this sort of method.","category":"page"},{"location":"Demos/SIPLS/#Steps-to-SIPLS-1","page":"SIPLS","title":"4-Steps to SIPLS","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"Break the spectra's columnspace into intervals (the size can be CV'd but below I just picked one), then we CV PLS models inside each interval.\nOn a hold out set (or via pooling), we find the prediction error of our intervals.\nThose errors are then reciprocally weighted.\nApply those weights to future predictions via multiplication and sum the result of each interval model.","category":"page"},{"location":"Demos/SIPLS/#.-Crossvalidate-the-interval-models-1","page":"SIPLS","title":"1. Crossvalidate the interval models","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"MaxLvs = 10\nCVModels = []\nCVErr = []\nIntervals = MakeIntervals( size(calib1)[2], 30 );\nfor interval in Intervals\n IntervalError = repeat([0.0], MaxLvs);\n Models = []\n\n for Lv in MaxLvs:-1:1\n for ( i, ( Fold, HoldOut ) ) in enumerate(KFoldsValidation(10, calib1, caliby))\n if Lv == MaxLvs\n KFoldModel = PartialLeastSquares(Fold[1][:,interval], Fold[2]; Factors = Lv)\n push!( Models, KFoldModel )\n end\n\n Predictions = Models[i]( HoldOut[1][:, interval]; Factors = Lv)\n IntervalError[Lv] += SSE( Predictions, HoldOut[2])\n end\n end\n OptimalLv = argmin(IntervalError)\n push!(CVModels, PartialLeastSquares(calib1[:, interval], caliby; Factors = OptimalLv) )\n push!(CVErr, IntervalError[OptimalLv] )\nend","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"For fun, we can view the weights of each interval's 
relative error on the CV'd spectra with this lovely convenience function,","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"IntervalOverlay(calib1, Intervals, CVErr)","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"(Image: CVERR)","category":"page"},{"location":"Demos/SIPLS/#.-Validate-1","page":"SIPLS","title":"2. Validate","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"VErr = []\nIntervalError = repeat([0.0], MaxLvs);\nfor (model, interval) in enumerate(Intervals)\n push!(VErr, SSE( CVModels[model](valid1[:,interval]), validy) )\nend","category":"page"},{"location":"Demos/SIPLS/#.-Make-reciprocal-weights-1","page":"SIPLS","title":"3. Make reciprocal weights","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"StackedWeights = stackedweights(VErr);","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"We can recycle that same plot recipe to observe what this weighting function does for us. After calling the stacked weights function we can see how much each interval will contribute to our additive model. In essence, the weights make the intervals with lower error contribute more to the final stacked model, (Image: OS)","category":"page"},{"location":"Demos/SIPLS/#.-Pool-predictions-on-test-set-and-weight-results-1","page":"SIPLS","title":"4. 
Pool predictions on test set and weight results","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"Results = zeros(size(tst1)[1]);\nfor (model, interval) in enumerate(Intervals)\n Results += CVModels[model](tst1[:,interval]) .* StackedWeights[model]\nend\n\nRMSE( Results, tsty)","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"> 4.09","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"The RMSE from the SIPLS model is ~0.6 units lower than what we observe on the same dataset using base PLSR in my Calibration Transfer Demo. This is actually really fast to run too. Every line in this script (aside from importing CSV) runs in roughly 1-2 seconds.","category":"page"},{"location":"Demos/CalibXfer/#Direct-Standardization-Demo-1","page":"Calibration Transfer","title":"Direct Standardization Demo","text":"","category":"section"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"The point of this demo is to basically show off that ChemometricsTools contains some base methods for Calibration Transfer. If you don't know what that is, it's basically the subset of Chemometrics that focuses on transfer learning from data collected on one instrument to another. This saves time and money for instruments that need to be calibrated but perform routine analyses.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"This demo uses the 2002 pharmaceutical shoot-out data and predicts upon the first property value (pretty sure it's API content). The dataset contains the same samples of an unstated pharmaceutical measured on two spectrometers with experimentally determined property values. 
Our goal will be to use one model but adapt the domain from one of the spectrometers to the other.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"First let's look at our linear sources of variation to get a feel for the data,","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"pca = PCA(calib1; Factors = 20);\nplot(cumsum(ExplainedVariance(pca)), title = \"Scree plot\", xlabel = \"PC's\", ylabel = \"Variance Explained\")","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"(Image: scree)","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Yea so this isn't a true Scree plot, but it has the same information...","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Looks like after ~5 factors we have garbage w.r.t X decompositions, good to know. 
So I'd venture to guess a maximum of 15 Latent Variables for a PLS-1 regression is more than a good enough cut-off for cross-validation.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"MaxLvs = 15\nErr = repeat([0.0], MaxLvs);\nModels = []\nfor Lv in MaxLvs:-1:1\n for ( i, ( Fold, HoldOut ) ) in enumerate(KFoldsValidation(10, calib1, caliby))\n if Lv == MaxLvs\n push!( Models, PartialLeastSquares(Fold[1], Fold[2]; Factors = Lv) )\n end\n Err[Lv] += SSE( Models[i]( HoldOut[1]; Factors = Lv), HoldOut[2] )\n end\nend\n\nscatter(Err, xlabel = \"Latent Variables\", ylabel = \"Cumulative SSE\", labels = [\"Error\"])","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"(Image: cv)","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Great, looks like we can get by with 5-8 LVs. Let's fine tune our Latent Variables based on the hold out set to make our final PLSR model.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"PLSR1 = PartialLeastSquares(calib1, caliby; Factors = 8);\nfor vLv in 5:8\n println(\"LV: \", vLv)\n println(\"RMSEV: \", RMSE(PLSR1(valid1; Factors = vLv), validy))\nend","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Kind of hacky, but it works fine for a demo. We see that 7 factors are optimal on the hold out set, so that's what we'll use from here on,","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"println(\"RMSEP: \", RMSE(PLSR1(tst1; Factors = 7), tsty))","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"> RMSEP: 
4.76860402876937","category":"page"},{"location":"Demos/CalibXfer/#Getting-to-the-point-1","page":"Calibration Transfer","title":"Getting to the point","text":"","category":"section"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"So why do we need to do a calibration transfer? Same chemical, same type of measurements, even the same wavelengths are recorded and compared. Do the naive thing: apply this model to the measurements on instrument 2. See what error you get.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"println(\"RMSEP: \", RMSE(PLSR1(tst2; Factors = 7), tsty))","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"> RMSEP: 10.303430504546292","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"The prediction error roughly doubles (10.30 vs. 4.77). In this case it'd be hard to argue this is a useful model at all, especially if you check the residuals. It's pretty clear the contributions of variance across multiple instruments are not the same in this case.","category":"page"},{"location":"Demos/CalibXfer/#Now-for-calibration-transfer!-1","page":"Calibration Transfer","title":"Now for calibration transfer!","text":"","category":"section"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"So let's use DirectStandardization. First we'll find the optimal number of DirectStandardization Factors to include in our model. 
We can do that on our hold out set, which should be very fast, so we can get away with some inefficient code.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Factors = 1:15\nErr = repeat([0.0], length(Factors));\nfor F in Factors\n DS2to1 = DirectStandardization(calib1, calib2; Factors = F);\n cds2to1 = DS2to1(valid2; Factors = F)\n Err[F] = RMSE( PLSR1(cds2to1; Factors = 7), validy )\nend\nscatter(Err, title = \"Transferred Model Validation Error\", xlabel = \"Latent Factors\",\n ylabel = \"RMSE\", labels = [\"Error\"])","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"(Image: cv)","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"OptimalDSFactor = argmin(Err)\nDS2to1 = DirectStandardization(calib1, calib2; Factors = OptimalDSFactor);\ntds2to1 = DS2to1(tst2; Factors = OptimalDSFactor);","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Looks like 8 Factors in the DS transfer is pretty good. Let's see how the transferred data compares on the prediction set using the same model,","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"println(\"RMSEP: \", RMSE(PLSR1(tds2to1; Factors = 7), tsty))","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"> RMSEP: 5.693023386113084","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Voilà... So in conclusion we can transform the data from instrument 2 to be similar to that of instrument 1. 
The errors we see are effectively commensurate between the data sources with this transform, and without it the error is about 2x greater. Maybe the main point here is \"look, ChemometricsTools has some calibration transfer methods and the tools included work\". OSC, TOP, CORAL, etc. are also included.","category":"page"},{"location":"Demos/CurveResolution/#Curve-Resolution-Demo-1","page":"Curve Resolution","title":"Curve Resolution Demo","text":"","category":"section"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"ChemometricsTools has some curve resolution methods baked in. So far NMF, SIMPLISMA, and MCR-ALS are included. If you aren't familiar with them, they are used to extract spectral and concentration estimates from unknown mixtures in chemical signals. Below is an example of spectra which are composed of signals from a mixture of 3 components. I could write a volume analyzing this simple set, but this is just a show-case of some methods, how to call them, and what kind of results they might give you. 
The beauty of this example is that, we know what is in it, in a forensic or real-world situation we won't know what is in it, and we have to rely on domain knowledge, physical reasoning, and metrics to determine the validity of our results.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Anyways, because we know, the pure spectra look like the following: (Image: pure)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Note: There are three components (water, acetic acid, methanol), but their spectra were collected in duplicate.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"And the concentration profiles of the components follow the following simplex design, (Image: pureC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"But the models we are using will only see the following (no pure components) (Image: impure)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Raw = CSV.read(\"/triliq.csv\");\nMixture = collect(convert(Array, Raw)[:,1:end]);\npure = [10,11,20,21,28,29];\nPURE = Mixture[pure,:];\nimpure = [collect(1:9); collect(12:19);collect(22:27)];\nMixture = Mixture[impure,:];","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Great, so now let's run NMF, SIMPLISMA, and MCR-ALS with the SIMPLISMA estimates.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"( W_NMF, H_NMF ) = NMF(Mixture; Factors = 3, maxiters = 300, tolerance = 1e-8)\n(C_Simplisma,S_Simplisma, vars) = SIMPLISMA(Mixture; Factors = 18)\nvars\n#Find purest variables that are not neighbors with one another\ncuts = 
S_Simplisma[ [1,3,17], :];\n( C_MCRALS, S_MCRALS, err ) = MCRALS(Mixture, nothing, RangeNorm(cuts')(cuts')';\n Factors = 3, maxiters = 10,\n norm = (true, false),\n nonnegative = (true, true) )","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: NMFS)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: SIMPLISMAS)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: MCRALSS)","category":"page"},{"location":"Demos/CurveResolution/#Spectral-Recovery-Discussion-(Results-by-Eye):-1","page":"Curve Resolution","title":"Spectral Recovery Discussion (Results by Eye):","text":"","category":"section"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"As we can see, NMF does resolve a few components that resemble a few of the actual pure components, but it really butchers the 3rd. While SIMPLISMA does a good job, at finding spectra that look \"real\" there are characteristics missing from the true spectra. It must be stated; SIMPLISMA wasn't invented for NIR signals. Finding pure variables in dozens... err... hundreds of over-lapping bands isn't really ideal. 
However, MCR-ALS made quick work of those initial SIMPLISMA estimates and seems to have found some estimates that somewhat closely resemble the pure components.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"","category":"page"},{"location":"Demos/CurveResolution/#Concentration-Profile-Discussion-(Results-by-Eye):-1","page":"Curve Resolution","title":"Concentration Profile Discussion (Results by Eye):","text":"","category":"section"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: NMFC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: SIMPLISMAC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: MCRALSC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"SIMPLISMA basically botched this dataset with regard to the concentration profiles, while NMF and MCR-ALS do quite well. Of course preprocessing can help here, and tinkering too. Ultimately not bad, given the mixture components. I do have a paper that shows another approach to this problem, but it's doubtful I'd be allowed to rewrite the code; I think my university owns it!","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Casey Kneale, Steven D. 
Brown, Band target entropy minimization and target partial least squares for spectral recovery and quantitation, Analytica Chimica Acta, Volume 1031, 2018, Pages 38-46, ISSN 0003-2670, https://doi.org/10.1016/j.aca.2018.07.054.","category":"page"},{"location":"man/Preprocess/#Preprocessing-API-Reference-1","page":"Preprocessing","title":"Preprocessing API Reference","text":"","category":"section"},{"location":"man/Preprocess/#Functions-1","page":"Preprocessing","title":"Functions","text":"","category":"section"},{"location":"man/Preprocess/#","page":"Preprocessing","title":"Preprocessing","text":"Modules = [ChemometricsTools]\nPages = [\"Preprocess.jl\"]","category":"page"},{"location":"man/Preprocess/#ChemometricsTools.CORAL-Tuple{Any,Any}","page":"Preprocessing","title":"ChemometricsTools.CORAL","text":"CORAL(X1, X2; lambda = 1.0)\n\nPerforms CORAL to facilitate covariance based transfer from X1 to X2 with regularization parameter lambda. Returns a CORAL object.\n\nCorrelation Alignment for Unsupervised Domain Adaptation. Baochen Sun, Jiashi Feng, Kate Saenko. 
https://arxiv.org/abs/1612.01939\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.CORAL-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.CORAL","text":"(C::CORAL)(Z)\n\nApplies the transform from a learned CORAL object to new data Z.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.MultiplicativeScatterCorrection-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.MultiplicativeScatterCorrection","text":"(T::MultiplicativeScatterCorrection)(Z)\n\nApplies MultiplicativeScatterCorrection from a stored object T to Array Z.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.OrthogonalSignalCorrection-Tuple{Any,Any}","page":"Preprocessing","title":"ChemometricsTools.OrthogonalSignalCorrection","text":"OrthogonalSignalCorrection(X, Y; Factors = 1)\n\nPerforms Thomas Fearn's Orthogonal Signal Correction on an endogenous X and exogenous Y. Factors is the number of orthogonal components to be removed from X. This function returns an OSC object.\n\nTom Fearn. On orthogonal signal correction. Chemometrics and Intelligent Laboratory Systems. Volume 50, Issue 1, 2000, Pages 47-52.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.OrthogonalSignalCorrection-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.OrthogonalSignalCorrection","text":"(OSC::OrthogonalSignalCorrection)(Z; Factors = 2)\n\nApplies the transform from a learned orthogonal signal correction object OSC to new data Z.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.TransferByOrthogonalProjection-Tuple{Any,Any}","page":"Preprocessing","title":"ChemometricsTools.TransferByOrthogonalProjection","text":"TransferByOrthogonalProjection(X1, X2; Factors = 1)\n\nPerforms Thomas Fearn's Transfer By Orthogonal Projection to facilitate transfer from X1 to X2. 
Returns a TransferByOrthogonalProjection object.\n\nAnne Andrew, Tom Fearn. Transfer by orthogonal projection: making near-infrared calibrations robust to between-instrument variation. Chemometrics and Intelligent Laboratory Systems. Volume 72, Issue 1, 2004, Pages 51-56.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.TransferByOrthogonalProjection-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.TransferByOrthogonalProjection","text":"(TbOP::TransferByOrthogonalProjection)(X1; Factors = TbOP.Factors)\n\nApplies the transform from a learned transfer by orthogonal projection object TbOP to new data X1.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.ALSSmoother-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.ALSSmoother","text":"ALSSmoother(X; lambda = 100, p = 0.001, maxiters = 10)\n\nApplies an asymmetric least squares smoothing function to a 2-Array X. The lambda, p, and maxiters parameters control the smoothness. See the reference below for more information.\n\nPaul H. C. Eilers, Hans F.M. Boelens. Baseline Correction with Asymmetric Least Squares Smoothing. 2005.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.DirectStandardization-Tuple{Any,Any}","page":"Preprocessing","title":"ChemometricsTools.DirectStandardization","text":"DirectStandardization(InstrumentX1, InstrumentX2; Factors = minimum(collect(size(InstrumentX1))) - 1)\n\nMakes a DirectStandardization object to facilitate the transfer from Instrument #2 to Instrument #1. The returned object can be used to transfer unseen data to the approximated space of instrument 1. The number of Factors used are those from the internal orthogonal basis.\n\nYongdong Wang and Bruce R. Kowalski, \"Calibration Transfer and Measurement Stability of Near-Infrared Spectrometers,\" Appl. Spectrosc. 
46, 764-771 (1992)\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.FirstDerivative-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.FirstDerivative","text":"FirstDerivative(X)\n\nUses the finite difference method to compute the first derivative for every row in X. Note: This operation results in the loss of a column dimension.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.FractionalDerivative","page":"Preprocessing","title":"ChemometricsTools.FractionalDerivative","text":"FractionalDerivative(Y, X = 1 : length(Y); Order = 0.5)\n\nCalculates the Grünwald-Letnikov fractional order derivative on every row of Array Y. Array X is a vector that has the spacing between column-wise entries in Y. X can be a scalar if that is constant (common in spectroscopy). Order is the fractional order of the derivative. Note: This operation results in the loss of a column dimension.\n\nThe Fractional Calculus, by Oldham, K.; and Spanier, J. Hardcover: 234 pages. Publisher: Academic Press, 1974. ISBN 0-12-525550-0\n\n\n\n\n\n","category":"function"},{"location":"man/Preprocess/#ChemometricsTools.PerfectSmoother-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.PerfectSmoother","text":"PerfectSmoother(X; lambda = 100)\n\nApplies an asymmetric least squares smoothing function to a 2-Array X. The lambda parameter controls the smoothness. See the reference below for more information.\n\nPaul H. C. Eilers. \"A Perfect Smoother\". Analytical Chemistry, 2003, 75 (14), pp 3631–3636.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.SavitzkyGolay-Tuple{Any,Any,Any,Int64}","page":"Preprocessing","title":"ChemometricsTools.SavitzkyGolay","text":"SavitzkyGolay(X, Delta, PolyOrder, windowsize)\n\nPerforms Savitzky-Golay smoothing across every row in an Array X. 
The window size is the size of the convolution filter, PolyOrder is the order of the polynomial, and Delta is the order of the derivative.\n\nSavitzky, A.; Golay, M.J.E. (1964). \"Smoothing and Differentiation of Data by Simplified Least Squares Procedures\". Analytical Chemistry. 36 (8): 1627–39. doi:10.1021/ac60214a047.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.Scale1Norm-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.Scale1Norm","text":"Scale1Norm(X)\n\nScales the columns of X by the 1-Norm of each row. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.Scale2Norm-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.Scale2Norm","text":"Scale2Norm(X)\n\nScales the columns of X by the 2-Norm of each row. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.ScaleInfNorm-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.ScaleInfNorm","text":"ScaleInfNorm(X)\n\nScales the columns of X by the Inf-Norm of each row. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.ScaleMinMax-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.ScaleMinMax","text":"ScaleMinMax(X)\n\nScales the columns of X by the Min and Max of each row such that no observation is greater than 1 or less than zero. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.SecondDerivative-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.SecondDerivative","text":"SecondDerivative(X)\n\nUses the finite difference method to compute the second derivative for every row in X. 
Note: This operation results in the loss of two columns.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.StandardNormalVariate-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.StandardNormalVariate","text":"StandardNormalVariate(X)\n\nScales the columns of X by the mean and standard deviation of each row. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.boxcar-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.boxcar","text":"boxcar(X; windowsize = 3, fn = mean)\n\nApplies a boxcar function (fn) to each window of size windowsize to every row in X. Note: the function provided must support a dims argument/parameter.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.offsetToZero-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.offsetToZero","text":"offsetToZero(X)\n\nEnsures that no observation(row) of Array X is less than zero, by ensuring the minimum value of each row is zero.\n\n\n\n\n\n","category":"method"},{"location":"man/Preprocess/#ChemometricsTools.DirectStandardizationXform-Tuple{Any}","page":"Preprocessing","title":"ChemometricsTools.DirectStandardizationXform","text":"(DSX::DirectStandardizationXform)(X; Factors = length(DSX.pca.Values))\n\nApplies a the transform from a learned direct standardization object DSX to new data X.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#Transformations/Pipelines-API-Reference-1","page":"Transformations/Pipelines","title":"Transformations/Pipelines API Reference","text":"","category":"section"},{"location":"man/Transformations/#Functions-1","page":"Transformations/Pipelines","title":"Functions","text":"","category":"section"},{"location":"man/Transformations/#","page":"Transformations/Pipelines","title":"Transformations/Pipelines","text":"Modules = [ChemometricsTools]\nPages = 
[\"Transformations.jl\"]","category":"page"},{"location":"man/Transformations/#ChemometricsTools.Center-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.Center","text":"(T::Center)(Z; inverse = false)\n\nCenters data in array Z column-wise according to learned mean centers in Center object T.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.CenterScale-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.CenterScale","text":"(T::CenterScale)(Z; inverse = false)\n\nCenters and Scales data in array Z column-wise according to learned measures of central tendency in Scale object T.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.QuantileTrim","page":"Transformations/Pipelines","title":"ChemometricsTools.QuantileTrim","text":"(T::QuantileTrim)(X, inverse = false)\n\nTrims data in array X column-wise according to learned quantiles in QuantileTrim object T. This function does NOT have an inverse.\n\n\n\n\n\n","category":"type"},{"location":"man/Transformations/#ChemometricsTools.QuantileTrim-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.QuantileTrim","text":"QuantileTrim(Z; quantiles::Tuple{Float64,Float64} = (0.05, 0.95) )\n\nTrims values above or below the specified columnwise quantiles to the quantile values themselves.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.RangeNorm-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.RangeNorm","text":"(T::RangeNorm)(Z; inverse = false)\n\nScales and shifts data in array Z column-wise according to learned min-maxes in RangeNorm object T.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.Scale-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.Scale","text":"(T::Scale)(Z; inverse = false)\n\nScales data in array Z column-wise according to learned standard 
deviations in Scale object T.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.Logit-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.Logit","text":"Logit(Z; inverse = false)\n\nLogit transforms (ln( X / (1 - X) ))) every element in Z. The inverse may also be applied. Warning: This can return Infs and NaNs if elements of Z are not suited to the transform\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.Pipeline-Tuple{Any,Vararg{Any,N} where N}","page":"Transformations/Pipelines","title":"ChemometricsTools.Pipeline","text":"Pipeline( X, FnStack... )\n\nConstruct a pipeline object from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.Pipeline-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.Pipeline","text":"Pipeline(Transforms)\n\nConstructs a transformation pipeline from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.PipelineInPlace-Tuple{Any,Vararg{Any,N} where N}","page":"Transformations/Pipelines","title":"ChemometricsTools.PipelineInPlace","text":"PipelineInPlace( X, FnStack...)\n\nConstruct a pipeline object from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data. This function makes \"inplace\" changes to the Array X as though it has been sent through the pipeline. 
This is more efficient if memory is a concern, but can irreversibly transform data in memory depending on the transforms in the pipeline.\n\n\n\n\n\n","category":"method"},{"location":"man/Transformations/#ChemometricsTools.pipeline-Tuple{Any}","page":"Transformations/Pipelines","title":"ChemometricsTools.pipeline","text":"(P::pipeline)(X; inverse = false)\n\nApplies the stored transformations in a pipeline object P to data in X. The inverse flag can allow for the transformations to be reversed provided they are invertible functions.\n\n\n\n\n\n","category":"method"},{"location":"man/Sampling/#Sampling-API-Reference-1","page":"Sampling","title":"Sampling API Reference","text":"","category":"section"},{"location":"man/Sampling/#Functions-1","page":"Sampling","title":"Functions","text":"","category":"section"},{"location":"man/Sampling/#","page":"Sampling","title":"Sampling","text":"Modules = [ChemometricsTools]\nPages = [\"Sampling.jl\"]","category":"page"},{"location":"man/Sampling/#ChemometricsTools.KennardStone-Tuple{Any,Any}","page":"Sampling","title":"ChemometricsTools.KennardStone","text":"KennardStone(X, TrainSamples; distance = \"euclidean\")\n\nReturns the indices of the Kennard-Stone sampled exemplars (E), and those not sampled (O) as a 2-Tuple (E, O).\n\nR. W. Kennard & L. A. Stone (1969) Computer Aided Design of Experiments, Technometrics, 111, 137-148, DOI: 10.1080/00401706.1969.10490666\n\n\n\n\n\n","category":"method"},{"location":"man/Sampling/#ChemometricsTools.SplitByProportion","page":"Sampling","title":"ChemometricsTools.SplitByProportion","text":"SplitByProportion(X::Array, Proportion::Float64 = 0.5)\n\nSplits X Array along the observations dimension into a 2-Tuple based on the Proportion. 
The form of the output is the following: ( X1, X2 )\n\n\n\n\n\n","category":"function"},{"location":"man/Sampling/#ChemometricsTools.SplitByProportion","page":"Sampling","title":"ChemometricsTools.SplitByProportion","text":"SplitByProportion(X::Array, Y::Array,Proportion::Float64 = 0.5)\n\nSplits an X and Associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on the Proportion. The form of the output is the following: ( (X1, Y1), (X2, Y2) )\n\n\n\n\n\n","category":"function"},{"location":"man/Sampling/#ChemometricsTools.VenetianBlinds-Tuple{Any,Any}","page":"Sampling","title":"ChemometricsTools.VenetianBlinds","text":"VenetianBlinds(X,Y)\n\nSplits an X and associated Y Array along the observation dimension into a 2-Tuple of 2-Tuples based on whether the observation index is even or odd. The form of the output is the following: ( (X1,Y1), (X2, Y2) )\n\n\n\n\n\n","category":"method"},{"location":"man/Sampling/#ChemometricsTools.VenetianBlinds-Tuple{Any}","page":"Sampling","title":"ChemometricsTools.VenetianBlinds","text":"VenetianBlinds(X)\n\nSplits an X Array along the observations dimension into a 2-Tuple based on whether the observation index is even or odd. The form of the output is the following: ( X1, X2 )\n\n\n\n\n\n","category":"method"},{"location":"man/Training/#Training-API-Reference-1","page":"Training","title":"Training API Reference","text":"","category":"section"},{"location":"man/Training/#Functions-1","page":"Training","title":"Functions","text":"","category":"section"},{"location":"man/Training/#","page":"Training","title":"Training","text":"Modules = [ChemometricsTools]\nPages = [\"Training.jl\"]","category":"page"},{"location":"man/Training/#ChemometricsTools.KFoldsValidation-Tuple{Int64,Any,Any}","page":"Training","title":"ChemometricsTools.KFoldsValidation","text":"KFoldsValidation(K::Int, x, y)\n\nReturns a KFoldsValidation iterator with K folds. 
Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY)).\n\n\n\n\n\n","category":"method"},{"location":"man/Training/#ChemometricsTools.LeaveOneOut-Tuple{Any,Any}","page":"Training","title":"ChemometricsTools.LeaveOneOut","text":"LeaveOneOut(x, y)\n\nReturns a KFoldsValidation iterator with leave one out folds. Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY)).\n\n\n\n\n\n","category":"method"},{"location":"man/Training/#ChemometricsTools.Shuffle!-Tuple{Any,Any}","page":"Training","title":"ChemometricsTools.Shuffle!","text":"Shuffle!( X, Y )\n\nShuffles the rows of the X and Y data without replacement in place. In place means that this function alters the order of the data in memory and does not return anything.\n\n\n\n\n\n","category":"method"},{"location":"man/Training/#ChemometricsTools.Shuffle-Tuple{Any,Any}","page":"Training","title":"ChemometricsTools.Shuffle","text":"Shuffle( X, Y )\n\nShuffles the rows of the X and Y data without replacement. 
It returns a 2-Tuple of the shuffled set.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#Time-Series-API-Reference-1","page":"Time Series","title":"Time Series API Reference","text":"","category":"section"},{"location":"man/TimeSeries/#Functions-1","page":"Time Series","title":"Functions","text":"","category":"section"},{"location":"man/TimeSeries/#","page":"Time Series","title":"Time Series","text":"Modules = [ChemometricsTools]\nPages = [\"TimeSeries.jl\"]","category":"page"},{"location":"man/TimeSeries/#ChemometricsTools.RollingWindow-Tuple{Int64,Int64,Int64}","page":"Time Series","title":"ChemometricsTools.RollingWindow","text":"RollingWindow(samples::Int,windowsize::Int,skip::Int)\n\nCreates a RollingWindow iterator from a number of samples and a static windowsize where every iteration skip steps are skipped. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.RollingWindow-Tuple{Int64,Int64}","page":"Time Series","title":"ChemometricsTools.RollingWindow","text":"RollingWindow(samples::Int,windowsize::Int)\n\nCreates a RollingWindow iterator from a number of samples and a static windowsize. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.EWMA-Tuple{Array,Float64}","page":"Time Series","title":"ChemometricsTools.EWMA","text":"EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))\n\nConstructs an exponentially weighted moving average object from a vector of scalar property values Initial and the decay parameter Lambda. 
This computes the running statistics necessary for creating the EWMA model using the interval provided and updates the center value to the mean of the provided values.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.EWMA-Tuple{Float64,Float64}","page":"Time Series","title":"ChemometricsTools.EWMA","text":"EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))\n\nConstructs an exponentially weighted moving average object from an initial scalar property value Initial and the decay parameter Lambda. This defaults the center value to be the initial value.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.Limits-Tuple{ChemometricsTools.ewma}","page":"Time Series","title":"ChemometricsTools.Limits","text":"Limits(P::ewma; k = 3.0)\n\nThis function returns the upper and lower control limits with a k span of variance for an EWMA object P.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.Variance-Tuple{ChemometricsTools.ewma}","page":"Time Series","title":"ChemometricsTools.Variance","text":"Variance(P::ewma)\n\nThis function returns the EWMA control variance.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.ewma-Tuple{Any}","page":"Time Series","title":"ChemometricsTools.ewma","text":"EWMA(P::ewma)(New; train = true)\n\nProvides an EWMA score for a New scalar value. 
If train == true the model is updated to include this new value.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#ChemometricsTools.ChangeCenter-Tuple{ChemometricsTools.ewma,Float64}","page":"Time Series","title":"ChemometricsTools.ChangeCenter","text":"ChangeCenter(P::ewma, new::Float64)\n\nThis is a convenience function to update the center of a P EWMA model, to a new scalar value.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#Regression-Models-API-Reference-1","page":"Regression Models","title":"Regression Models API Reference","text":"","category":"section"},{"location":"man/RegressionModels/#Functions-1","page":"Regression Models","title":"Functions","text":"","category":"section"},{"location":"man/RegressionModels/#","page":"Regression Models","title":"Regression Models","text":"Modules = [ChemometricsTools]\nPages = [\"RegressionModels.jl\"]","category":"page"},{"location":"man/RegressionModels/#ChemometricsTools.ClassicLeastSquares-Tuple{Any,Any}","page":"Regression Models","title":"ChemometricsTools.ClassicLeastSquares","text":"ClassicLeastSquares( X, Y; Bias = false )\n\nMakes a ClassicLeastSquares regression model of the form Y = AX with or without a Bias term. Returns a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.ClassicLeastSquares-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.ClassicLeastSquares","text":"(M::ClassicLeastSquares)(X)\n\nMakes an inference from X using a ClassicLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.LSSVM-Tuple{Any,Any,Any}","page":"Regression Models","title":"ChemometricsTools.LSSVM","text":"LSSVM( X, Y, Penalty; KernelParameter = 0.0, KernelType = \"linear\" )\n\nMakes a LSSVM model of the form Y = AK with a bias term using a user specified Kernel(\"linear\", or \"gaussian\") and has an L2 Penalty. 
Returns a LSSVM Wrapper for a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.LSSVM-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.LSSVM","text":"(M::LSSVM)(X)\n\nMakes an inference from X using a LSSVM object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.PartialLeastSquares-Tuple{Any,Any}","page":"Regression Models","title":"ChemometricsTools.PartialLeastSquares","text":"PartialLeastSquares( X, Y; Factors = minimum(size(X)) - 2, tolerance = 1e-8, maxiters = 200 )\n\nReturns a PartialLeastSquares regression model object from arrays X and Y.\n\nPARTIAL LEAST-SQUARES REGRESSION: A TUTORIAL PAUL GELADI and BRUCE R.KOWALSKI. Analytica Chimica Acta, 186, (1986) PARTIAL LEAST-SQUARES REGRESSION:\nMartens H., NÊs T. Multivariate Calibration. Wiley: New York, 1989.\nRe-interpretation of NIPALS results solves PLSR inconsistency problem. Rolf Ergon. Published in Journal of Chemometrics 2009; Vol. 
23/1: 72-75\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.PartialLeastSquares-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.PartialLeastSquares","text":"(M::PartialLeastSquares)\n\nMakes an inference from X using a PartialLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.PrincipalComponentRegression-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.PrincipalComponentRegression","text":"(M::PrincipalComponentRegression)( X )\n\nMakes an inference from X using a PrincipalComponentRegression object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.PrincipalComponentRegression-Tuple{PCA,Any}","page":"Regression Models","title":"ChemometricsTools.PrincipalComponentRegression","text":"PrincipalComponentRegression(PCAObject, Y )\n\nMakes a PrincipalComponentRegression model object from a PCA Object and property value Y.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.RidgeRegression-Tuple{Any,Any,Any}","page":"Regression Models","title":"ChemometricsTools.RidgeRegression","text":"RidgeRegression( X, Y, Penalty; Bias = false )\n\nMakes a RidgeRegression model of the form Y = AX with or without a Bias term and has an L2 Penalty. 
Returns a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.RidgeRegression-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.RidgeRegression","text":"(M::RidgeRegression)(X)\n\nMakes an inference from X using a RidgeRegression object which wraps a ClassicLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.ExtremeLearningMachine","page":"Regression Models","title":"ChemometricsTools.ExtremeLearningMachine","text":"ExtremeLearningMachine(X, Y, ReservoirSize = 10; ActivationFn = sigmoid)\n\nReturns an ELM regression model object from arrays X and Y, with a user specified ReservoirSize and ActivationFn.\n\nExtreme learning machine: a new learning scheme of feedforward neural networks. Guang-Bin Huang ; Qin-Yu Zhu ; Chee-Kheong Siew. \t2004 IEEE International Joint...\n\n\n\n\n\n","category":"function"},{"location":"man/RegressionModels/#ChemometricsTools.KernelRidgeRegression-Tuple{Any,Any,Any}","page":"Regression Models","title":"ChemometricsTools.KernelRidgeRegression","text":"KernelRidgeRegression( X, Y, Penalty; KernelParameter = 0.0, KernelType = \"linear\" )\n\nMakes a KernelRidgeRegression model of the form Y = AK using a user specified Kernel(\"Linear\", or \"Gaussian\") and has an L2 Penalty. Returns a KRR Wrapper for a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.sigmoid-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.sigmoid","text":"sigmoid(x)\n\nApplies the sigmoid function to a scalar value X. Returns a scalar. 
Can be broadcast over an Array.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.ELM-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.ELM","text":"(M::ELM)(X)\n\nMakes an inference from X using an ELM object.\n\n\n\n\n\n","category":"method"},{"location":"man/RegressionModels/#ChemometricsTools.KRR-Tuple{Any}","page":"Regression Models","title":"ChemometricsTools.KRR","text":"(M::KRR)(X)\n\nMakes an inference from X using a KRR object which wraps a ClassicLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#Regression-Metrics-API-Reference-1","page":"Regression Metrics","title":"Regression Metrics API Reference","text":"","category":"section"},{"location":"man/regressMetrics/#Functions-1","page":"Regression Metrics","title":"Functions","text":"","category":"section"},{"location":"man/regressMetrics/#","page":"Regression Metrics","title":"Regression Metrics","text":"Modules = [ChemometricsTools]\nPages = [\"RegressionMetrics.jl\"]","category":"page"},{"location":"man/regressMetrics/#ChemometricsTools.MAE-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.MAE","text":"MAE( y, yhat )\n\nCalculates Mean Absolute Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.MAPE-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.MAPE","text":"MAPE( y, yhat )\n\nCalculates Mean Absolute Percent Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.ME-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.ME","text":"ME( y, yhat )\n\nCalculates Mean Error from vectors Y and YHat.\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.MSE-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.MSE","text":"MSE( y, yhat )\n\nCalculates Mean Squared Error from vectors Y and 
YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.PearsonCorrelationCoefficient-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.PearsonCorrelationCoefficient","text":"PearsonCorrelationCoefficient( y, yhat )\n\nCalculates The Pearson Correlation Coefficient from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.PercentRMSE-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.PercentRMSE","text":"PercentRMSE( y, yhat )\n\nCalculates Percent Root Mean Squared Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.RMSE-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.RMSE","text":"RMSE( y, yhat )\n\nCalculates Root Mean Squared Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.RSquare-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.RSquare","text":"RSquare( y, yhat )\n\nCalculates R^2 from Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.SSE-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.SSE","text":"SSE( y, yhat )\n\nCalculates Sum of Squared Errors from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.SSReg-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.SSReg","text":"SSReg( y, yhat )\n\nCalculates Sum of Squared Deviations due to Regression from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.SSRes-Tuple{Any,Any}","page":"Regression Metrics","title":"ChemometricsTools.SSRes","text":"SSRes( y, yhat )\n\nCalculates Sum of Squared Residuals from vectors Y and 
YHat\n\n\n\n\n\n","category":"method"},{"location":"man/regressMetrics/#ChemometricsTools.SSTotal-Tuple{Any}","page":"Regression Metrics","title":"ChemometricsTools.SSTotal","text":"SSTotal( y, yhat )\n\nCalculates Total Sum of Squared Deviations from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#Classification-Models-API-Reference-1","page":"Classification Models","title":"Classification Models API Reference","text":"","category":"section"},{"location":"man/ClassificationModels/#Functions-1","page":"Classification Models","title":"Functions","text":"","category":"section"},{"location":"man/ClassificationModels/#","page":"Classification Models","title":"Classification Models","text":"Modules = [ChemometricsTools]\nPages = [\"ClassificationModels.jl\"]","category":"page"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianDiscriminant-Tuple{Any,Any,Any}","page":"Classification Models","title":"ChemometricsTools.GaussianDiscriminant","text":"GaussianDiscriminant(M, X, Y; Factors = nothing)\n\nReturns a GaussianDiscriminant classification model on basis object M (PCA, LDA) and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianDiscriminant-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.GaussianDiscriminant","text":"( model::GaussianDiscriminant )( Z; Factors = size(model.ProjectedClassMeans)[2] )\n\nReturns a 1 hot encoded inference from Z using a GaussianDiscriminant object. 
This function enforces positive definiteness in the class covariance matrices.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianNaiveBayes-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.GaussianNaiveBayes","text":"GaussianNaiveBayes(X,Y)\n\nReturns a GaussianNaiveBayes classification model object from X and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianNaiveBayes-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.GaussianNaiveBayes","text":"(gnb::GaussianNaiveBayes)(X)\n\nReturns a 1 hot encoded inference from X using a GaussianNaiveBayes object.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.KNN","page":"Classification Models","title":"ChemometricsTools.KNN","text":"KNN( X, Y; DistanceType::String )\n\nDistanceType can be \"euclidean\", \"manhattan\". Y Must be one hot encoded.\n\nReturns a KNN classification model.\n\n\n\n\n\n","category":"type"},{"location":"man/ClassificationModels/#ChemometricsTools.KNN-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.KNN","text":"( model::KNN )( Z; K = 1 )\n\nReturns a 1 hot encoded inference from X with K Nearest Neighbors, using a KNN object.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.LogisticRegression-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.LogisticRegression","text":"( model::LogisticRegression )( X )\n\nReturns a 1 hot encoded inference from X using a LogisticRegression object.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.ProbabilisticNeuralNetwork","page":"Classification Models","title":"ChemometricsTools.ProbabilisticNeuralNetwork","text":"ProbabilisticNeuralNetwork( X, Y )\n\nStores data for a PNN. 
Y Must be one hot encoded.\n\nReturns a PNN classification model.\n\n\n\n\n\n","category":"type"},{"location":"man/ClassificationModels/#ChemometricsTools.ProbabilisticNeuralNetwork-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.ProbabilisticNeuralNetwork","text":"(PNN::ProbabilisticNeuralNetwork)(X; sigma = 0.1)\n\nReturns a 1 hot encoded inference from X with a probabilistic neural network.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.ConfidenceEllipse","page":"Classification Models","title":"ChemometricsTools.ConfidenceEllipse","text":"ConfidenceEllipse(cov, mean, confidence, axis = [1,2]; pointestimate = 180 )\n\nReturns a 2-D array whose columns are X & Y coordinates of a confidence ellipse. The ellipse is generated by the covariance matrix, mean vector, and the number of points to include in the plot.\n\n\n\n\n\n","category":"function"},{"location":"man/ClassificationModels/#ChemometricsTools.LinearPerceptronBatch-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.LinearPerceptronBatch","text":"LinearPerceptron(X, Y; LearningRate = 1e-3, MaxIters = 5000)\n\nReturns a batch trained LinearPerceptron classification model object from X and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.LinearPerceptronSGD-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.LinearPerceptronSGD","text":"LinearPerceptronsgd(X, Y; LearningRate = 1e-3, MaxIters = 5000)\n\nReturns a SGD trained LinearPerceptron classification model object from X and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.MultinomialSoftmaxRegression-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.MultinomialSoftmaxRegression","text":"MultinomialSoftmaxRegression(X, Y; LearnRate = 1e-3, maxiters = 1000, L2 = 0.0)\n\nReturns a LogisticRegression 
classification model made by Stochastic Gradient Descent.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.linearperceptron-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.linearperceptron","text":"(L::linearperceptron)(X)\n\nReturns a 1 hot encoded inference from X using a LinearPerceptron object.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#Classification-Metrics-API-Reference-1","page":"Classification Metrics","title":"Classification Metrics API Reference","text":"","category":"section"},{"location":"man/classMetrics/#Functions-1","page":"Classification Metrics","title":"Functions","text":"","category":"section"},{"location":"man/classMetrics/#","page":"Classification Metrics","title":"Classification Metrics","text":"Modules = [ChemometricsTools]\nPages = [\"ClassificationMetrics.jl\"]","category":"page"},{"location":"man/classMetrics/#ChemometricsTools.ColdToHot-Tuple{Any,ChemometricsTools.ClassificationLabel}","page":"Classification Metrics","title":"ChemometricsTools.ColdToHot","text":"ColdToHot(Y, Schema::ClassificationLabel)\n\nTurns a cold encoded Y vector into a one hot encoded array.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.HighestVote-Tuple{Any}","page":"Classification Metrics","title":"ChemometricsTools.HighestVote","text":"HighestVote(yhat)\n\nReturns the column index for each row that has the highest value in one hot encoded yhat. Returns a one cold encoded vector.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.HighestVoteOneHot-Tuple{Any}","page":"Classification Metrics","title":"ChemometricsTools.HighestVoteOneHot","text":"HighestVoteOneHot(yhat)\n\nTurns the highest column-wise value to a 1 and the others to zeros per row in a one hot encoded yhat. 
Returns a one hot encoded array.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.HotToCold-Tuple{Any,ChemometricsTools.ClassificationLabel}","page":"Classification Metrics","title":"ChemometricsTools.HotToCold","text":"HotToCold(Y, Schema::ClassificationLabel)\n\nTurns a one hot encoded Y array into a cold encoded vector.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.IsColdEncoded-Tuple{Any}","page":"Classification Metrics","title":"ChemometricsTools.IsColdEncoded","text":"IsColdEncoded(Y)\n\nReturns a boolean true if the array Y is cold encoded, and false if not.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.LabelEncoding-Tuple{Any}","page":"Classification Metrics","title":"ChemometricsTools.LabelEncoding","text":"LabelEncoding(HotOrCold)\n\nDetermines if an Array, Y, is one hot encoded, or cold encoded by its dimensions. Returns a ClassificationLabel object/schema to convert between the formats.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.MulticlassStats-Tuple{Any,Any,Any}","page":"Classification Metrics","title":"ChemometricsTools.MulticlassStats","text":"MulticlassStats(Y, GT, schema; Microaverage = true)\n\nCalculates many essential classification statistics based on predicted values Y, and ground truth values GT, using the encoding schema. Returns a tuple whose first entry is a dictionary of averaged statistics, and whose second entry is a dictionary of the form \"Class\" => Statistics Dictionary ...\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.MulticlassThreshold-Tuple{Any}","page":"Classification Metrics","title":"ChemometricsTools.MulticlassThreshold","text":"MulticlassThreshold(yhat; level = 0.5)\n\nEffectively does the same thing as Threshold() but per-row across columns.\n\nWarning: this function can allow for no class assignments. 
HighestVote is preferred\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.StatsDictToDataFrame-Tuple{Any,Any}","page":"Classification Metrics","title":"ChemometricsTools.StatsDictToDataFrame","text":"StatsDictToDataFrame(DictOfStats, schema)\n\nConverts a dictionary of statistics which is returned from MulticlassStats into a labelled dataframe. This is an intermediate step for automated report generation.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.StatsFromTFPN-NTuple{4,Any}","page":"Classification Metrics","title":"ChemometricsTools.StatsFromTFPN","text":"StatsFromTFPN(TP, TN, FP, FN)\n\nCalculates many essential classification statistics based on the numbers of True Positive(TP), True Negative(TN), False Positive(FP), and False Negative(FN) examples.\n\n\n\n\n\n","category":"method"},{"location":"man/classMetrics/#ChemometricsTools.Threshold-Tuple{Any}","page":"Classification Metrics","title":"ChemometricsTools.Threshold","text":"Threshold(yhat; level = 0.5)\n\nFor a binary vector yhat this decides if the label is a 0 or a 1 based on it's value relative to a threshold level.\n\n\n\n\n\n","category":"method"},{"location":"man/Trees/#Tree-Methods-API-Reference-1","page":"Tree Methods","title":"Tree Methods API Reference","text":"","category":"section"},{"location":"man/Trees/#Functions-1","page":"Tree Methods","title":"Functions","text":"","category":"section"},{"location":"man/Trees/#","page":"Tree Methods","title":"Tree Methods","text":"Modules = [ChemometricsTools]\nPages = [\"Trees.jl\"]","category":"page"},{"location":"man/Trees/#ChemometricsTools.ClassificationTree-Tuple{Any,Any}","page":"Tree Methods","title":"ChemometricsTools.ClassificationTree","text":"ClassificationTree(x, y; gainfn = entropy, maxdepth = 4, minbranchsize = 3)\n\nBuilds a CART object using either gini or entropy as a partioning method. Y must be a one hot encoded 2-Array. 
Predictions can be formed by calling the following function from the CART object: (M::CART)(x).\n\n*Note: this is a purely nonrecursive decision tree. The Julia compiler doesn't like storing structs of nested things. I wrote it the recursive way in the past and it was quite slow; I think this is also true of interpreted languages like R/Python... So here it is, nonrecursive trees!\n\n\n\n\n\n","category":"method"},{"location":"man/Trees/#ChemometricsTools.OneHotOdds-Tuple{Any}","page":"Tree Methods","title":"ChemometricsTools.OneHotOdds","text":"OneHotOdds(Y)\n\nCalculates the odds of a one-hot formatted probability matrix. Returns a tuple.\n\n\n\n\n\n","category":"method"},{"location":"man/Trees/#ChemometricsTools.entropy-Tuple{Any}","page":"Tree Methods","title":"ChemometricsTools.entropy","text":"entropy(v)\n\nCalculates the Shannon-Entropy of a probability vector v. Returns a scalar. A common gain function used in tree methods.\n\n\n\n\n\n","category":"method"},{"location":"man/Trees/#ChemometricsTools.gini-Tuple{Any}","page":"Tree Methods","title":"ChemometricsTools.gini","text":"gini(p)\n\nCalculates the GINI coefficient of a probability vector p. Returns a scalar. A common gain function used in tree methods.\n\n\n\n\n\n","category":"method"},{"location":"man/Trees/#ChemometricsTools.ssd-Tuple{Any,Any}","page":"Tree Methods","title":"ChemometricsTools.ssd","text":"ssd(p)\n\nCalculates the sum squared deviations from a decision tree split. Accepts a vector of values, and the mean of that vector. Returns a scalar. 
A common gain function used in tree methods.\n\n\n\n\n\n","category":"method"},{"location":"man/Ensemble/#Ensemble-Models-API-Reference-1","page":"Ensemble Models","title":"Ensemble Models API Reference","text":"","category":"section"},{"location":"man/Ensemble/#Functions-1","page":"Ensemble Models","title":"Functions","text":"","category":"section"},{"location":"man/Ensemble/#","page":"Ensemble Models","title":"Ensemble Models","text":"Modules = [ChemometricsTools]\nPages = [\"Ensembles.jl\"]","category":"page"},{"location":"man/Ensemble/#ChemometricsTools.RandomForest","page":"Ensemble Models","title":"ChemometricsTools.RandomForest","text":"RandomForest(x, y, mode = :classification; gainfn = entropy, trees = 50, maxdepth = 10, minbranchsize = 5, samples = 0.7, maxvars = nothing)\n\nReturns a classification (mode = :classification) or a regression (mode = :regression) random forest model. The gainfn can be entropy or gini for classification or ssd for regression. If the number of maximumvars is not provided it will default to sqrt(variables) for classification or variables/3 for regression.\n\nThe returned object can be used for inference by calling new data on the object as a function.\n\nBreiman, L. Machine Learning (2001) 45: 5. 
https://doi.org/10.1023/A:1010933404324\n\n\n\n\n\n","category":"type"},{"location":"man/Ensemble/#ChemometricsTools.RandomForest-Tuple{Any}","page":"Ensemble Models","title":"ChemometricsTools.RandomForest","text":"(RF::RandomForest)(X)\n\nReturns bagged prediction vector of random forest model.\n\n\n\n\n\n","category":"method"},{"location":"man/Ensemble/#ChemometricsTools.MakeIntervals","page":"Ensemble Models","title":"ChemometricsTools.MakeIntervals","text":"MakeIntervals( columns::Int, intervalsize::Union{Array, Tuple} = [20, 50, 100] )\n\nCreates an Dictionary whose key is the interval size and values are an array of intervals from the range: 1 - columns of size intervalsize.\n\n\n\n\n\n","category":"function"},{"location":"man/Ensemble/#ChemometricsTools.MakeIntervals","page":"Ensemble Models","title":"ChemometricsTools.MakeIntervals","text":"MakeIntervals( columns::Int, intervalsize::Int = 20 )\n\nReturns an 1-Array of intervals from the range: 1 - columns of size intervalsize.\n\n\n\n\n\n","category":"function"},{"location":"man/Ensemble/#ChemometricsTools.stackedweights-Tuple{Any}","page":"Ensemble Models","title":"ChemometricsTools.stackedweights","text":"stackedweights(ErrVec; power = 2)\n\nWeights stacked interval errors by the reciprocal power specified. Used for SIPLS, SISPLS, etc.\n\nNi, W. , Brown, S. D. and Man, R. (2009), Stacked partial least squares regression analysis for spectral calibration and prediction. J. Chemometrics, 23: 505-517. 
doi:10.1002/cem.1246\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#Clustering-API-Reference-1","page":"Clustering","title":"Clustering API Reference","text":"","category":"section"},{"location":"man/Clustering/#K-means-Elbow-Plot-Recipe-1","page":"Clustering","title":"K-means Elbow Plot Recipe","text":"","category":"section"},{"location":"man/Clustering/#","page":"Clustering","title":"Clustering","text":" using Plots\n ExplainedVar = []\n for K in 1:10\n km = KMeans( X, K; tolerance = 1e-14, maxiters = 1000 )\n TCSS = TotalClusterSS( km )\n WCSS = WithinClusterSS( km )\n #BCSS = BetweenClusterSS( km )\n push!(ExplainedVar, WCSS / TCSS)\n end\n scatter(ExplainedVar, title = \"Elbow Plot\", ylabel = \"WCSS/TCSS\", xlabel = \"Clusters (#)\", label = \"K-means\" )","category":"page"},{"location":"man/Clustering/#Functions-1","page":"Clustering","title":"Functions","text":"","category":"section"},{"location":"man/Clustering/#","page":"Clustering","title":"Clustering","text":"Modules = [ChemometricsTools]\nPages = [\"Clustering.jl\"]","category":"page"},{"location":"man/Clustering/#ChemometricsTools.BetweenClusterSS-Tuple{ChemometricsTools.ClusterModel}","page":"Clustering","title":"ChemometricsTools.BetweenClusterSS","text":"BetweenClusterSS( Clustered::ClusterModel )\n\nReturns a scalar of the between cluster sum of squares for a ClusterModel object.\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#ChemometricsTools.KMeans-Tuple{Any,Any}","page":"Clustering","title":"ChemometricsTools.KMeans","text":"KMeans( X, Clusters; tolerance = 1e-8, maxiters = 200 )\n\nReturns a ClusterModel object after finding clusterings for data in X via MacQueens K-Means algorithm. Clusters is the K parameter, or the # of clusters.\n\nMacQueen, J. B. (1967). Some Methods for classification and Analysis of Multivariate Observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. 1. University of California Press. pp. 
281–297.\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#ChemometricsTools.TotalClusterSS-Tuple{ChemometricsTools.ClusterModel}","page":"Clustering","title":"ChemometricsTools.TotalClusterSS","text":"TotalClusterSS( Clustered::ClusterModel )\n\nReturns a scalar of the total sum of squares for a ClusterModel object.\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#ChemometricsTools.WithinClusterSS-Tuple{ChemometricsTools.ClusterModel}","page":"Clustering","title":"ChemometricsTools.WithinClusterSS","text":"WithinClusterSS( Clustered::ClusterModel )\n\nReturns a scalar of the within cluster sum of squares for a ClusterModel object.\n\n\n\n\n\n","category":"method"},{"location":"man/MultiWay/#Multiway-API-Reference-1","page":"MultiWay","title":"Multiway API Reference","text":"","category":"section"},{"location":"man/MultiWay/#Functions-1","page":"MultiWay","title":"Functions","text":"","category":"section"},{"location":"man/MultiWay/#","page":"MultiWay","title":"MultiWay","text":"Modules = [ChemometricsTools]\nPages = [\"MultiWay.jl\"]","category":"page"},{"location":"man/MultiWay/#ChemometricsTools.MultiCenter","page":"MultiWay","title":"ChemometricsTools.MultiCenter","text":"MultiCenter(Z, mode = 1)\n\nAcquires the mean of the specified mode in Z and returns a transform that will remove those means from any future data.\n\n\n\n\n\n","category":"type"},{"location":"man/MultiWay/#ChemometricsTools.MultiCenter-Tuple{Any}","page":"MultiWay","title":"ChemometricsTools.MultiCenter","text":"(T::MultiCenter)(Z; inverse = false)\n\nCenters data in Tensor Z mode-wise according to learned centers in MultiCenter object T.\n\n\n\n\n\n","category":"method"},{"location":"man/MultiWay/#ChemometricsTools.MultiScale","page":"MultiWay","title":"ChemometricsTools.MultiScale","text":"MultiScale(Z, mode = 1)\n\nAcquires the standard deviations of the specified mode in Z and returns a transform that will scale by those standard deviations from any future 
data.\n\n\n\n\n\n","category":"type"},{"location":"man/MultiWay/#ChemometricsTools.MultiScale-Tuple{Any}","page":"MultiWay","title":"ChemometricsTools.MultiScale","text":"(T::MultiScale)(Z; inverse = false)\n\nScales data in Tensor Z mode-wise according to learned standard deviations in MultiScale object T.\n\n\n\n\n\n","category":"method"},{"location":"man/MultiWay/#ChemometricsTools.MultiNorm-Tuple{Any}","page":"MultiWay","title":"ChemometricsTools.MultiNorm","text":"MultiNorm(T)\n\nComputes the equivalent of the Frobenius norm on a tensor T. Returns a scalar.\n\n\n\n\n\n","category":"method"},{"location":"man/MultiWay/#ChemometricsTools.MultiPCA-Tuple{Any}","page":"MultiWay","title":"ChemometricsTools.MultiPCA","text":"MultiPCA(X; Factors = 2)\n\nPerforms multiway PCA aka Higher Order SVD aka Tucker, etc. The number of factors decomposed can be a scalar(repeated across all modes) or a vector/tuple for each mode.\n\nReturns a tuple of (Core Tensor, Basis Tensors)\n\n\n\n\n\n","category":"method"},{"location":"man/AnomalyDetection/#Anomaly-Detection-API-Reference-1","page":"Anomaly Detection","title":"Anomaly Detection API Reference","text":"","category":"section"},{"location":"man/AnomalyDetection/#","page":"Anomaly Detection","title":"Anomaly Detection","text":"ChemometricsTools has a few anomaly detection methods. Feel free to read the API below. 
If that's too abstract, check out the shoot-out example : AnomalyDetection","category":"page"},{"location":"man/AnomalyDetection/#Functions-1","page":"Anomaly Detection","title":"Functions","text":"","category":"section"},{"location":"man/AnomalyDetection/#","page":"Anomaly Detection","title":"Anomaly Detection","text":"Modules = [ChemometricsTools]\nPages = [\"AnomalyDetection.jl\"]","category":"page"},{"location":"man/AnomalyDetection/#ChemometricsTools.Hotelling-Tuple{Any,PCA}","page":"Anomaly Detection","title":"ChemometricsTools.Hotelling","text":"Hotelling(X, pca::PCA; Quantile = 0.05, Variance = 1.0)\n\nComputes the hotelling Tsq and upper control limit cut off of a pca object using a specified Quantile and cumulative variance explained Variance for new or old data X.\n\nA review of PCA-based statistical process monitoring methodsfor time-dependent, high-dimensional data. Bart De Ketelaere https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf\n\n\n\n\n\n","category":"method"},{"location":"man/AnomalyDetection/#ChemometricsTools.Leverage-Tuple{PCA}","page":"Anomaly Detection","title":"ChemometricsTools.Leverage","text":"Leverage(pca::PCA)\n\nCalculates the leverage of samples in a pca object.\n\n\n\n\n\n","category":"method"},{"location":"man/AnomalyDetection/#ChemometricsTools.Q-Tuple{Any,PCA}","page":"Anomaly Detection","title":"ChemometricsTools.Q","text":"Q(X, pca::PCA; Quantile = 0.95, Variance = 1.0)\n\nComputes the Q-statistic and upper control limit cut off of a pca object using a specified Quantile and cumulative variance explained Variance for new or old data X.\n\nA review of PCA-based statistical process monitoring methodsfor time-dependent, high-dimensional data. 
Bart De Ketelaere https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#Curve-Resolution-Models-API-Reference-1","page":"Curve Resolution","title":"Curve Resolution Models API Reference","text":"","category":"section"},{"location":"man/CurveResolution/#Functions-1","page":"Curve Resolution","title":"Functions","text":"","category":"section"},{"location":"man/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Modules = [ChemometricsTools]\nPages = [\"CurveResolution.jl\"]","category":"page"},{"location":"man/CurveResolution/#ChemometricsTools.BTEM","page":"Curve Resolution","title":"ChemometricsTools.BTEM","text":"BTEM(X, bands = nothing; Factors = 3, particles = 50, maxiters = 1000)\n\nReturns a single recovered spectra from a 2-Array X, the selected bands, number of Factors, using a Particle Swarm Optimizer. Note: This is not the function used in the original paper. This will be updated... it was written from memory. Also the original method uses Simulated Annealing not PSO. Band-Target Entropy Minimization (BTEM):  An Advanced Method for Recovering Unknown Pure Component Spectra. Application to the FTIR Spectra of Unstable Organometallic Mixtures. Wee Chew,Effendi Widjaja, and, and Marc Garland. Organometallics 2002 21 (9), 1982-1990. DOI: 10.1021/om0108752\n\n\n\n\n\n","category":"function"},{"location":"man/CurveResolution/#ChemometricsTools.BTEMobjective-Tuple{Any,Any}","page":"Curve Resolution","title":"ChemometricsTools.BTEMobjective","text":"BTEMobjective( a, X )\n\nReturns the scalar BTEM objective function obtained from the linear combination vector a and loadings X. Note: This is not the function used in the original paper. This will be updated... 
it was written from memory.\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#ChemometricsTools.FNNLS-Tuple{Any,Any}","page":"Curve Resolution","title":"ChemometricsTools.FNNLS","text":"FNNLS( A, b; LHS = false, maxiters = 500 )\n\nUses an implementation of Bro et. al's Fast Non-Negative Least Squares on the matrix A and vector b. Returns regression coefficients in the form of a vector. Bro, R., de Jong, S. (1997) A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics, 11, 393-401.\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#ChemometricsTools.MCRALS","page":"Curve Resolution","title":"ChemometricsTools.MCRALS","text":"MCRALS(X, C, S = nothing; norm = (false, false), Factors = 1, maxiters = 20, nonnegative = (false, false) )\n\nPerforms Multivariate Curve Resolution using Alternating Least Squares on X taking initial estimates for S or C. S or C can be constrained by their norm, or by nonnegativity using nonnegative arguments. The number of resolved Factors can also be set. Tauler, R. Izquierdo-Ridorsa, A. Casassas, E. Simultaneous analysis of several spectroscopic titrations with self-modelling curve resolution.Chemometrics and Intelligent Laboratory Systems. 18, 3, (1993), 293-300.\n\n\n\n\n\n","category":"function"},{"location":"man/CurveResolution/#ChemometricsTools.NMF-Tuple{Any}","page":"Curve Resolution","title":"ChemometricsTools.NMF","text":"NMF(X; Factors = 1, tolerance = 1e-7, maxiters = 200)\n\nPerforms a variation of non-negative matrix factorization on Array X and returns the a 2-Tuple of (Concentration Profile, Spectra) Note: This is not a coordinate descent based NMF. This is a simple fast version which works well enough for chemical signals Algorithms for non-negative matrix factorization. Daniel D. Lee. H. Sebastian Seung. NIPS'00 Proceedings of the 13th International Conference on Neural Information Processing Systems. 
535-54\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#ChemometricsTools.SIMPLISMA-Tuple{Any}","page":"Curve Resolution","title":"ChemometricsTools.SIMPLISMA","text":"SIMPLISMA(X; Factors = 1, alpha = 0.05, includedvars = 1:size(X)[2], SecondDeriv = true)\n\nPerforms SIMPLISMA on Array X using either the raw spectra or the Second Derivative spectra. alpha can be set to reduce contributions of baseline, and a list of included variables in the determination of pure variables may also be provided. Returns a tuple of the following form: (Concentraion Profile, Pure Spectral Estimates, Pure Variables) W. Windig, Spectral Data Files for Self-Modeling Curve Resolution with Examples Using the SIMPLISMA Approach, Chemometrics and Intelligent Laboratory Systems, 36, 1997, 3-16.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#Stats-API-Reference-1","page":"Stats.","title":"Stats API Reference","text":"","category":"section"},{"location":"man/Stats/#Functions-1","page":"Stats.","title":"Functions","text":"","category":"section"},{"location":"man/Stats/#","page":"Stats.","title":"Stats.","text":"Modules = [ChemometricsTools]\nPages = [\"InHouseStats.jl\"]","category":"page"},{"location":"man/Stats/#ChemometricsTools.RunningMean-Tuple{Any}","page":"Stats.","title":"ChemometricsTools.RunningMean","text":"RunningMean(x)\n\nConstructs a running mean object with an initial scalar value of x.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.RunningVar-Tuple{Any}","page":"Stats.","title":"ChemometricsTools.RunningVar","text":"RunningVar(x)\n\nConstructs a RunningVar object with an initial scalar value of x. 
Note: RunningVar objects implicitly calculate the running mean.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.EmpiricalQuantiles-Tuple{Any,Any}","page":"Stats.","title":"ChemometricsTools.EmpiricalQuantiles","text":"EmpiricalQuantiles(X, quantiles)\n\nFinds the column-wise quantiles of 2-Array X and returns them in a 2-Array of size quantiles by variables. *Note: This copies the array... Use a subset if memory is the concern. *\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Mean-Tuple{RunningMean}","page":"Stats.","title":"ChemometricsTools.Mean","text":"Mean(rv::RunningMean)\n\nReturns the current mean inside of a RunningMean object.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Mean-Tuple{RunningVar}","page":"Stats.","title":"ChemometricsTools.Mean","text":"Mean(rv::RunningVar)\n\nReturns the current mean inside of a RunningVar object.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Remove!-Tuple{RunningMean,Any}","page":"Stats.","title":"ChemometricsTools.Remove!","text":"Remove!(RM::RunningMean, x)\n\nRemoves an observation(x) from a RunningMean object(RM) and recalculates the mean in place.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Remove-Tuple{RunningMean,Any}","page":"Stats.","title":"ChemometricsTools.Remove","text":"Remove(RM::RunningMean, x)\n\nRemoves an observation(x) from a RunningMean object(RM) and returns the new RunningMean object.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.SampleSkewness-Tuple{Any}","page":"Stats.","title":"ChemometricsTools.SampleSkewness","text":"SampleSkewness(X)\n\nReturns a measure of skewness for vector X that is corrected for a sample of the population.\n\nJoanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. 
The Statistician 47(1): 183–189.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Skewness-Tuple{Any}","page":"Stats.","title":"ChemometricsTools.Skewness","text":"Skewness(X)\n\nreturns a measure of skewness for a population vector X.\n\nJoanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Update!-Tuple{RunningMean,Any}","page":"Stats.","title":"ChemometricsTools.Update!","text":"Update!(RM::RunningMean, x)\n\nAdds new observation(x) to a RunningMean object(RM) in place.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Update!-Tuple{RunningVar,Any}","page":"Stats.","title":"ChemometricsTools.Update!","text":"Update!(RV::RunningVar, x)\n\nAdds new observation(x) to a RunningVar object(RV) and updates it in place.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Update-Tuple{RunningMean,Any}","page":"Stats.","title":"ChemometricsTools.Update","text":"Update!(RM::RunningMean, x)\n\nAdds new observation(x) to a RunningMean object(RM) and returns the new object.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.Variance-Tuple{RunningVar}","page":"Stats.","title":"ChemometricsTools.Variance","text":"Variance(rv::RunningVar)\n\nReturns the current variance inside of a RunningVar object.\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#ChemometricsTools.rbinomial-Tuple{Any,Vararg{Any,N} where N}","page":"Stats.","title":"ChemometricsTools.rbinomial","text":"rbinomial( p, size... 
)\n\nMakes an N-dimensional array of size(s) size with a probability of being a 1 over a 0 of 1 p.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#Distances-API-Reference-1","page":"Distance Measures","title":"Distances API Reference","text":"","category":"section"},{"location":"man/Dists/#Functions-1","page":"Distance Measures","title":"Functions","text":"","category":"section"},{"location":"man/Dists/#","page":"Distance Measures","title":"Distance Measures","text":"Modules = [ChemometricsTools]\nPages = [\"DistanceMeasures.jl\"]","category":"page"},{"location":"man/Dists/#ChemometricsTools.Kernel-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.Kernel","text":"(K::Kernel)(X)\n\nThis is a convenience function to allow for one-line construction of kernels from a Kernel object K and new data X.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.AdjacencyMatrix-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.AdjacencyMatrix","text":"NearestNeighbors(DistanceMatrix)\n\nReturns the nearest neighbor adjacency matrix from a given DistanceMatrix.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.CauchyKernel-Tuple{Any,Any,Any}","page":"Distance Measures","title":"ChemometricsTools.CauchyKernel","text":"CauchyKernel(X, Y, sigma)\n\nCreates a Cauchy kernel from Arrays X and Y using hyperparameters sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.CauchyKernel-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.CauchyKernel","text":"CauchyKernel(X, sigma)\n\nCreates a Cauchy kernel from Array X using hyperparameters sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.EuclideanDistance-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.EuclideanDistance","text":"EuclideanDistance(X, Y)\n\nReturns the euclidean distance matrix of X and Y such that the columns are the samples in 
Y.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.EuclideanDistance-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.EuclideanDistance","text":"EuclideanDistance(X)\n\nReturns the Grahm aka the euclidean distance matrix of X.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.GaussianKernel-Tuple{Any,Any,Any}","page":"Distance Measures","title":"ChemometricsTools.GaussianKernel","text":"GaussianKernel(X, Y, sigma)\n\nCreates a Gaussian/RBF kernel from Arrays X and Y with hyperparameter sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.GaussianKernel-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.GaussianKernel","text":"GaussianKernel(X, sigma)\n\nCreates a Gaussian/RBF kernel from Array X using hyperparameter sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.InClassAdjacencyMatrix","page":"Distance Measures","title":"ChemometricsTools.InClassAdjacencyMatrix","text":"InClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)\n\nComputes the in class Adjacency matrix with K nearest neighbors.\n\n\n\n\n\n","category":"function"},{"location":"man/Dists/#ChemometricsTools.LinearKernel-Tuple{Any,Any,Any}","page":"Distance Measures","title":"ChemometricsTools.LinearKernel","text":"LinearKernel(X, Y, c)\n\nCreates a Linear kernel from Arrays X and Y with hyperparameter C.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.LinearKernel-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.LinearKernel","text":"LinearKernel(X, c)\n\nCreates a Linear kernel from Array X and hyperparameter C.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.ManhattanDistance-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.ManhattanDistance","text":"ManhattanDistance(X, Y)\n\nReturns the Manhattan distance matrix of X and Y such that the columns are the 
samples in Y.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.ManhattanDistance-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.ManhattanDistance","text":"ManhattanDistance(X)\n\nReturns the Manhattan distance matrix of X.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.NearestNeighbors-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.NearestNeighbors","text":"NearestNeighbors(DistanceMatrix, N)\n\nReturns a matrix of dimensions DistanceMatrix rows, by N columns. Basically this goes through each row and finds the ones corresponding column which has the smallest distance.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.OutOfClassAdjacencyMatrix","page":"Distance Measures","title":"ChemometricsTools.OutOfClassAdjacencyMatrix","text":"OutOfClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)\n\nComputes the out of class Adjacency matrix with K nearest neighbors.\n\n\n\n\n\n","category":"function"},{"location":"man/Dists/#ChemometricsTools.SquareEuclideanDistance-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.SquareEuclideanDistance","text":"SquareEuclideanDistance(X, Y)\n\nReturns the squared euclidean distance matrix of X and Y such that the columns are the samples in Y.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.SquareEuclideanDistance-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.SquareEuclideanDistance","text":"SquareEuclideanDistance(X)\n\nReturns the squared Grahm aka the euclidean distance matrix of X.\n\n\n\n\n\n","category":"method"},{"location":"man/PSO/#Particle-Swarm-Optimizer-API-Reference-1","page":"PSO","title":"Particle Swarm Optimizer API Reference","text":"","category":"section"},{"location":"man/PSO/#Functions-1","page":"PSO","title":"Functions","text":"","category":"section"},{"location":"man/PSO/#","page":"PSO","title":"PSO","text":"Modules = 
[ChemometricsTools]\nPages = [\"PSO.jl\"]","category":"page"},{"location":"man/PSO/#ChemometricsTools.Bounds-Tuple{Any,Any,Any}","page":"PSO","title":"ChemometricsTools.Bounds","text":"Bounds(dims)\n\nConstructor for a Bounds object. Returns a bounds object with a lower bound of [lower...] and upper bound[upper...] with length of dims.\n\n\n\n\n\n","category":"method"},{"location":"man/PSO/#ChemometricsTools.Bounds-Tuple{Any}","page":"PSO","title":"ChemometricsTools.Bounds","text":"Bounds(dims)\n\nDefault constructor for a Bounds object. Returns a bounds object with a lower bound of [0...] and upper bound[1...] with length of dims.\n\n\n\n\n\n","category":"method"},{"location":"man/PSO/#ChemometricsTools.Particle-Tuple{Any,Any}","page":"PSO","title":"ChemometricsTools.Particle","text":"Particle(ProblemBounds, VelocityBounds)\n\nDefault constructor for a Particle object. It creates a random, uniformly distributed particle within the specified ProblemBounds, and limits its velocity to the specified VelocityBounds.\n\n\n\n\n\n","category":"method"},{"location":"man/PSO/#ChemometricsTools.PSO-Tuple{Any,Bounds,Bounds,Int64}","page":"PSO","title":"ChemometricsTools.PSO","text":"PSO(fn, Bounds, VelRange, Particles; tolerance = 1e-6, maxiters = 1000, InertialDecay = 0.5, PersonalWeight = 0.5, GlobalWeight = 0.5, InternalParams = nothing)\n\nMinimizes function fn within the user specified Bounds via a Particle Swarm Optimizer. The particle velocities are limited to the VelRange. The number of particles is defined by the Particles parameter.\n\nReturns a Tuple of the following form: ( GlobalBestPos, GlobalBestScore, P ) Where P is an array of the particles used in the optimization.\n\n*Note: if the optimization function requires an additional constant parameter, please pass that parameter to InternalParams. This will only work if the optimized parameter(o) and constant parameter(c) for the function of interest have the following format: F(o,c) *\n\nKennedy, J.; Eberhart, R. 
(1995). Particle Swarm Optimization. Proceedings of IEEE International Conference on Neural Networks. IV. pp. 1942–1948. doi:10.1109/ICNN.1995.488968\n\n\n\n\n\n","category":"method"},{"location":"man/GeneticAlgorithms/#Genetic-Algorithms-API-Reference-1","page":"Genetic Algorithms","title":"Genetic Algorithms API Reference","text":"","category":"section"},{"location":"man/GeneticAlgorithms/#Functions-1","page":"Genetic Algorithms","title":"Functions","text":"","category":"section"},{"location":"man/GeneticAlgorithms/#","page":"Genetic Algorithms","title":"Genetic Algorithms","text":"Modules = [ChemometricsTools]\nPages = [\"SimpleGAs.jl\"]","category":"page"},{"location":"man/GeneticAlgorithms/#ChemometricsTools.Lifeform-Tuple{Any,Any,Any}","page":"Genetic Algorithms","title":"ChemometricsTools.Lifeform","text":"Lifeform(size, onlikelihood, initialscore)\n\nConstructor for a BinaryLifeForm struct. Binary life forms are basically wrappers for a binary vector, which has a likelihood for being 1(onlikelihood). Each life form also has a score based on it's \"fitness\". So the GA's in this package can be used to minimize or maximize this is an open parameter, but Inf/-Inf is a good initialscore.\n\n\n\n\n\n","category":"method"},{"location":"man/GeneticAlgorithms/#ChemometricsTools.Mutate","page":"Genetic Algorithms","title":"ChemometricsTools.Mutate","text":"Mutate( L::BinaryLifeform, amount = 0.05 )\n\nAssesses each element in the gene vector (inside of L). 
If a randomly drawn value has a binomial probability of amount the element is mutated.\n\n\n\n\n\n","category":"function"},{"location":"man/GeneticAlgorithms/#ChemometricsTools.SinglePointCrossOver-Tuple{BinaryLifeform,BinaryLifeform}","page":"Genetic Algorithms","title":"ChemometricsTools.SinglePointCrossOver","text":"SinglePointCrossOver( L1::BinaryLifeform, L2::BinaryLifeform )\n\nCreates two offspring (new BinaryLifeForms) by mixing the genes from L1 and L2 after a random position in the vector.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#API-1","page":"Full API","title":"API","text":"","category":"section"},{"location":"man/FullAPI/#","page":"Full API","title":"Full API","text":"CurrentModule = ChemometricsTools\nDocTestSetup = quote\n\tusing ChemometricsTools\nend","category":"page"},{"location":"man/FullAPI/#","page":"Full API","title":"Full API","text":"Modules = [ChemometricsTools]","category":"page"},{"location":"man/FullAPI/#ChemometricsTools.CanonicalCorrelationAnalysis-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.CanonicalCorrelationAnalysis","text":"CanonicalCorrelationAnalysis(A, B)\n\nReturns a CanonicalCorrelationAnalysis object which contains (U, V, r) from Arrays A and B. 
Currently Untested for correctness but should compute....\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.GaussianBand","page":"Full API","title":"ChemometricsTools.GaussianBand","text":"GaussianBand(sigma,amplitude,center)\n\nConstructs a Gaussian kernel generator.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.GaussianBand-Tuple{Float64}","page":"Full API","title":"ChemometricsTools.GaussianBand","text":"(B::GaussianBand)(X::Float64)\n\nReturns the scalar probability associated with a GaussianBand object (kernel) at a location in space(X).\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LDA-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.LDA","text":"LDA(X, Y; Factors = 1)\n\nCompute's a LinearDiscriminantAnalysis transform from x with a user specified number of latent variables(Factors). Returns an LDA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LDA-Tuple{Any}","page":"Full API","title":"ChemometricsTools.LDA","text":"( model::LDA )( Z; Factors = length(model.Values) )\n\nCalling a LDA object on new data brings the new data Z into the LDA basis.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LorentzianBand","page":"Full API","title":"ChemometricsTools.LorentzianBand","text":"LorentzianBand(gamma,amplitude,center)\n\nConstructs a Lorentzian kernel generator.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.LorentzianBand-Tuple{Float64}","page":"Full API","title":"ChemometricsTools.LorentzianBand","text":"(B::LorentzianBand)(X::Float64)\n\nReturns the probability associated with a LorentzianBand object (kernel) at a location in space(X).\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PCA-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PCA","text":"PCA(X; Factors = minimum(size(X)) - 1)\n\nCompute's a PCA from x using 
LinearAlgebra's SVD algorithm with a user specified number of latent variables(Factors). Returns a PCA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PCA-Tuple{Array}","page":"Full API","title":"ChemometricsTools.PCA","text":"(T::PCA)(Z::Array; Factors = length(T.Values), inverse = false)\n\nCalling a PCA object on new data brings the new data Z into or out of (inverse = true) the PCA basis.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Universe-Tuple","page":"Full API","title":"ChemometricsTools.Universe","text":"(U::Universe)(Band...)\n\nA Universe objects internal \"spectra\" can be updated to include the additive contribution of many Band-like objects.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Universe-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.Universe","text":"Universe(mini, maxi; width = nothing, bins = nothing)\n\nCreates a 1-D discretized segment that starts at mini and ends at maxi. The width of the bins for the discretization can either be provided or inferred from the number of bins. Returns a Universe object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Universe-Tuple{Union{GaussianBand, LorentzianBand}}","page":"Full API","title":"ChemometricsTools.Universe","text":"(U::Universe)(Band::Union{ GaussianBand, LorentzianBand})\n\nA Universe objects internal \"spectra\" can be updated to include the additive contribution of any Band-like object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.AssessHealth-Tuple{Any}","page":"Full API","title":"ChemometricsTools.AssessHealth","text":"AssessHealth( X )\n\nReturns a somewhat detailed Dict containing information about the 'health' of a dataset. 
What is included is the following: - PercentMissing: percent of missing entries (includes nothing, inf / nan) in the dataset - EmptyColumns: the columns which have only 1 value - RankEstimate: An estimate of the rank of X - (optional)Duplicates: returns the rows of duplicate observations\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ExplainedVariance-Tuple{LDA}","page":"Full API","title":"ChemometricsTools.ExplainedVariance","text":"ExplainedVariance(lda::LDA)\n\nCalculates the explained variance of each singular value in an LDA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ExplainedVariance-Tuple{PCA}","page":"Full API","title":"ChemometricsTools.ExplainedVariance","text":"ExplainedVariance(PCA::PCA)\n\nCalculates the explained variance of each singular value in a pca object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.HLDA-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.HLDA","text":"HLDA(X, YHOT; K = 1, Factors = 1)\n\nCompute's a Hierarchical LinearDiscriminantAnalysis transform from x with a user specified number of latent variables(Factors). The adjacency matrices are created from K nearest neighbors.\n\nReturns an LDA object. Note: this can be used with any other LDA functions such as Gaussian discriminants or explained variance.\n\nLu D, Ding C, Xu J, Wang S. Hierarchical Discriminant Analysis. Sensors (Basel). 2018 Jan 18;18(1). pii: E279. doi: 10.3390/s18010279.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PCA_NIPALS-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PCA_NIPALS","text":"PCA_NIPALS(X; Factors = minimum(size(X)) - 1, tolerance = 1e-7, maxiters = 200)\n\nCompute's a PCA from x using the NIPALS algorithm with a user specified number of latent variables(Factors). The tolerance is the minimum change in the F norm before ceasing execution. 
Returns a PCA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RAFFT-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.RAFFT","text":"RAFFT(raw, reference; maxlags::Int = 500, lookahead::Int = 1, minlength::Int = 20, mincorr::Float64 = 0.05)\n\nRAFFT corrects shifts in the raw spectral bands to be similar to those in a given reference spectra through the use of \"recursive alignment by FFT\". It returns an array of corrected spectra/chromatograms. The number of maximum lags can be specified, the lookahead parameter ensures that additional recursive executions are performed so the first solution found is not preemptively accepted, the minimum segment length(minlength) can also be specified if FWHM are estimable, and the minimum cross correlation(mincorr) for a match can dictate whether peaks were found to align or not.\n\nNote This method works best with flat baselines because it repeats last known values when padding aligned spectra. It is highly efficient, and in my tests does a good job, but other methods definitely exist. Let me know if other peak Alignment methods are important for your work-flow, I'll see if I can implement them.\n\nApplication of Fast Fourier Transform Cross-Correlation for the Alignment of Large Chromatographic and Spectral Datasets Jason W. H. Wong, Caterina Durante, and, Hugh M. Cartwright. Analytical Chemistry 2005 77 (17), 5655-5661\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.findpeaks-Tuple{Any}","page":"Full API","title":"ChemometricsTools.findpeaks","text":"findpeaks( vY; m = 3)\n\nFinds the indices of peaks in a vector vY with a window span of 2m. 
Original R function by Stas_G:(https://stats.stackexchange.com/questions/22974/how-to-find-local-peaks-valleys-in-a-series-of-data) This version is based on a C++ variant by me.\n\n\n\n\n\n","category":"method"}] +[{"location":"man/Preprocess/#Preprocessing-API-Reference-1","page":"Preprocessing","title":"Preprocessing API Reference","text":"","category":"section"},{"location":"man/Preprocess/#Functions-1","page":"Preprocessing","title":"Functions","text":"","category":"section"},{"location":"man/Preprocess/#","page":"Preprocessing","title":"Preprocessing","text":"Modules = [ChemometricsTools]\nPages = [\"Preprocess.jl\"]","category":"page"},{"location":"man/Transformations/#Transformations/Pipelines-API-Reference-1","page":"Transformations/Pipelines","title":"Transformations/Pipelines API Reference","text":"","category":"section"},{"location":"man/Transformations/#Functions-1","page":"Transformations/Pipelines","title":"Functions","text":"","category":"section"},{"location":"man/Transformations/#","page":"Transformations/Pipelines","title":"Transformations/Pipelines","text":"Modules = [ChemometricsTools]\nPages = [\"Transformations.jl\"]","category":"page"},{"location":"man/GeneticAlgorithms/#Genetic-Algorithms-API-Reference-1","page":"Genetic Algorithms","title":"Genetic Algorithms API Reference","text":"","category":"section"},{"location":"man/GeneticAlgorithms/#Functions-1","page":"Genetic Algorithms","title":"Functions","text":"","category":"section"},{"location":"man/GeneticAlgorithms/#","page":"Genetic Algorithms","title":"Genetic Algorithms","text":"Modules = [ChemometricsTools]\nPages = [\"SimpleGAs.jl\"]","category":"page"},{"location":"man/RegressionModels/#Regression-Models-API-Reference-1","page":"Regression Models","title":"Regression Models API Reference","text":"","category":"section"},{"location":"man/RegressionModels/#Functions-1","page":"Regression 
Models","title":"Functions","text":"","category":"section"},{"location":"man/RegressionModels/#","page":"Regression Models","title":"Regression Models","text":"Modules = [ChemometricsTools]\nPages = [\"RegressionModels.jl\"]","category":"page"},{"location":"man/classMetrics/#Classification-Metrics-API-Reference-1","page":"Classification Metrics","title":"Classification Metrics API Reference","text":"","category":"section"},{"location":"man/classMetrics/#Functions-1","page":"Classification Metrics","title":"Functions","text":"","category":"section"},{"location":"man/classMetrics/#","page":"Classification Metrics","title":"Classification Metrics","text":"Modules = [ChemometricsTools]\nPages = [\"ClassificationMetrics.jl\"]","category":"page"},{"location":"Demos/Pipelines/#Pipelines-Demo-1","page":"Pipelines","title":"Pipelines Demo","text":"","category":"section"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Multiple Transformations can be easily chained together and stored using \"Pipelines\". Preprocessing methods, or really any univariate function may be included in a pipeline, but that will likely mean it can no longer be inverted. 
Pipelines are basically convenience functions, but are somewhat flexible and can be used for automated searches,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"PreprocessPipe = Pipeline(FauxSpectra1, RangeNorm, Center);\nProcessed = PreprocessPipe(FauxSpectra1);","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Of course pipelines of transforms can also be inverted,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"RMSE( FauxSpectra1, PreprocessPipe(Processed; inverse = true) ) < 1e-14","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Pipelines can also be created and executed as an 'in place' operation for large datasets. This has the advantage that your data is transformed immediately without making copies in memory. This may be useful for large datasets and memory constrained environments. WARNING: be careful to only run the pipeline call or its inverse once! It is much safer to use the not inplace function outside of a REPL/script environment.","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"FauxSpectra = randn(10,200);\nOriginalCopy = copy(FauxSpectra);\nInPlacePipe = PipelineInPlace(FauxSpectra, Center, Scale);","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"See without returning the data or an extra function call we have transformed it according to the pipeline as it was instantiated...","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"FauxSpectra == OriginalCopy\n#Inplace transform the data back\nInPlacePipe(FauxSpectra; inverse = true)\nRMSE( OriginalCopy, FauxSpectra ) < 1e-14","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Pipelines are kind of flexible. 
We can put nontransform (operations that cannot be inverted) preprocessing steps in them as well. In the example below the first derivative is applied to the data, this irreversibly removes a column from the data,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"PreprocessPipe = Pipeline(FauxSpectra1, FirstDerivative, RangeNorm, Center);\nProcessed = PreprocessPipe(FauxSpectra1);\n#This should be equivalent to the following...\nSpectraDeriv = FirstDerivative(FauxSpectra1);\nAlternative = Pipeline(SpectraDeriv , RangeNorm, Center);\nProcessed == Alternative(SpectraDeriv)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Great right? Well what happens if we try to do the inverse of our pipeline with an irreversible function (First Derivative) in it?","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"PreprocessPipe(Processed; inverse = true)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Well we get an assertion error.","category":"page"},{"location":"Demos/Pipelines/#Automated-Pipeline-Example-1","page":"Pipelines","title":"Automated Pipeline Example","text":"","category":"section"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"We can take advantage of how pipelines are created; at their core they are tuples of transforms/functions. So if we can make an array of transforms and set some conditions they can be stored and applied to unseen data. A fun example of an automated transform pipeline is in the whimsical paper written by Willem Windig et. al. That paper is called 'Loopy Multiplicative Scatter Transform'. Below I'll show how we can implement that algorithm here (or anything similar) with ease. Loopy MSC: A Simple Way to Improve Multiplicative Scatter Correction. Willem Windig, Jeremy Shaver, Rasmus Bro. Applied Spectroscopy. 2008. 
Vol 62, issue: 10, 1153-1159","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"First let's look at the classic Diesel data before applying Loopy MSC (Image: rawspectra)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Alright, there is scatter, let's go for it,","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"RealSpectra = convert(Array, CSV.read(\"/diesel_spectra.csv\"));\nCurrent = RealSpectra;\nLast = zeros(size(Current));\nTransformArray = [];\nwhile RMSE(Last, Current) > 1e-5\n if any(isnan.(Current))\n break\n else\n push!(TransformArray, MultiplicativeScatterCorrection( Current ) )\n Last = Current\n Current = TransformArray[end](Last)\n end\nend\n#Now we can make a pipeline object from the array of stored transforms\nLoopyPipe = Pipeline( Tuple( TransformArray ) );","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"For a sanity check we can ensure the output of the algorithm is the same as the new pipeline so it can be applied to new data.","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Current == LoopyPipe(RealSpectra)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Looks like our automation driven pipeline is equivalent to the loop it took to make it. More importantly did we remove scatter after 3 automated iterations of MSC? (Image: loopymsc)","category":"page"},{"location":"Demos/Pipelines/#","page":"Pipelines","title":"Pipelines","text":"Yes, yes we did. 
Pretty easy right?","category":"page"},{"location":"man/PSO/#Particle-Swarm-Optimizer-API-Reference-1","page":"PSO","title":"Particle Swarm Optimizer API Reference","text":"","category":"section"},{"location":"man/PSO/#Functions-1","page":"PSO","title":"Functions","text":"","category":"section"},{"location":"man/PSO/#","page":"PSO","title":"PSO","text":"Modules = [ChemometricsTools]\nPages = [\"PSO.jl\"]","category":"page"},{"location":"man/MultiWay/#Multiway-API-Reference-1","page":"MultiWay","title":"Multiway API Reference","text":"","category":"section"},{"location":"man/MultiWay/#Functions-1","page":"MultiWay","title":"Functions","text":"","category":"section"},{"location":"man/MultiWay/#","page":"MultiWay","title":"MultiWay","text":"Modules = [ChemometricsTools]\nPages = [\"MultiWay.jl\"]","category":"page"},{"location":"man/regressMetrics/#Regression-Metrics-API-Reference-1","page":"Regression Metrics","title":"Regression Metrics API Reference","text":"","category":"section"},{"location":"man/regressMetrics/#Functions-1","page":"Regression Metrics","title":"Functions","text":"","category":"section"},{"location":"man/regressMetrics/#","page":"Regression Metrics","title":"Regression Metrics","text":"Modules = [ChemometricsTools]\nPages = [\"RegressionMetrics.jl\"]","category":"page"},{"location":"Demos/ClassificationExample/#Classification-Demo:-1","page":"Classification","title":"Classification Demo:","text":"","category":"section"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"This demo shows an applied solution to a classification problem using real mid-infrared data. If you want to see the gambit of methods included in ChemometricsTools check the classification shootout example. There's also a bunch of tools for changes of basis such as: principal components analysis, linear discriminant analysis, orthogonal signal correction, etc. 
With those kinds of tools we can reduce the dimensions of our data and make classes more separable. So separable that trivial classification methods like a Gaussian discriminant can get us pretty good results. Below is an example analysis performed on mid-infrared spectra of strawberry purees and adulterated strawberry purees (yes fraudulent food items are a common concern).","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"(Image: Raw)","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"Use of Fourier transform infrared spectroscopy and partial least squares regression for the detection of adulteration of strawberry purées. J K Holland, E K Kemsley, R H Wilson","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"snv = StandardNormalVariate(Train);\nTrain_pca = PCA(snv(Train);; Factors = 15);\n\nEnc = LabelEncoding(TrnLbl);\nHot = ColdToHot(TrnLbl, Enc);\n\nlda = LDA(Train_pca.Scores , Hot);\nclassifier = GaussianDiscriminant(lda, TrainS, Hot)\nTrainPreds = classifier(TrainS; Factors = 2);","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"(Image: LDA of PCA)","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"Cool right? 
Well, we can now apply the same transformations to the test set and pull some multivariate Gaussians over the train set classes to see how we do identifying fraudulent puree's,","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"TestSet = Train_pca(snv(Test));\nTestSet = lda(TestSet);\nTestPreds = classifier(TestS; Factors = 2);\nMulticlassStats(TestPreds .- 1, TstLbl , Enc)","category":"page"},{"location":"Demos/ClassificationExample/#","page":"Classification","title":"Classification","text":"If you're following along you'll get a ~92% F-measure depending on your random split. Not bad. You may also notice this package has a nice collection of performance metrics for classification on board. Anyways, I've gotten 100%'s with more advanced methods but this is a cute way to show off some of the tools currently available.","category":"page"},{"location":"man/Trees/#Tree-Methods-API-Reference-1","page":"Tree Methods","title":"Tree Methods API Reference","text":"","category":"section"},{"location":"man/Trees/#Functions-1","page":"Tree Methods","title":"Functions","text":"","category":"section"},{"location":"man/Trees/#","page":"Tree Methods","title":"Tree Methods","text":"Modules = [ChemometricsTools]\nPages = [\"Trees.jl\"]","category":"page"},{"location":"man/Training/#Training-API-Reference-1","page":"Training","title":"Training API Reference","text":"","category":"section"},{"location":"man/Training/#Functions-1","page":"Training","title":"Functions","text":"","category":"section"},{"location":"man/Training/#","page":"Training","title":"Training","text":"Modules = [ChemometricsTools]\nPages = [\"Training.jl\"]","category":"page"},{"location":"man/Ensemble/#Ensemble-Models-API-Reference-1","page":"Ensemble Models","title":"Ensemble Models API Reference","text":"","category":"section"},{"location":"man/Ensemble/#Functions-1","page":"Ensemble 
Models","title":"Functions","text":"","category":"section"},{"location":"man/Ensemble/#","page":"Ensemble Models","title":"Ensemble Models","text":"Modules = [ChemometricsTools]\nPages = [\"Ensembles.jl\"]","category":"page"},{"location":"man/Ensemble/#ChemometricsTools.RandomForest","page":"Ensemble Models","title":"ChemometricsTools.RandomForest","text":"RandomForest(x, y, mode = :classification; gainfn = entropy, trees = 50, maxdepth = 10, minbranchsize = 5, samples = 0.7, maxvars = nothing)\n\nReturns a classification (mode = :classification) or a regression (mode = :regression) random forest model. The gainfn can be entropy or gini for classification or ssd for regression. If the number of maximumvars is not provided it will default to sqrt(variables) for classification or variables/3 for regression.\n\nThe returned object can be used for inference by calling new data on the object as a function.\n\nBreiman, L. Machine Learning (2001) 45: 5. https://doi.org/10.1023/A:1010933404324\n\n\n\n\n\n","category":"type"},{"location":"man/Ensemble/#ChemometricsTools.RandomForest-Tuple{Any}","page":"Ensemble Models","title":"ChemometricsTools.RandomForest","text":"(RF::RandomForest)(X)\n\nReturns bagged prediction vector of random forest model.\n\n\n\n\n\n","category":"method"},{"location":"man/Ensemble/#ChemometricsTools.MakeIntervals","page":"Ensemble Models","title":"ChemometricsTools.MakeIntervals","text":"MakeIntervals( columns::Int, intervalsize::Union{Array, Tuple} = [20, 50, 100] )\n\nCreates an Dictionary whose key is the interval size and values are an array of intervals from the range: 1 - columns of size intervalsize.\n\n\n\n\n\n","category":"function"},{"location":"man/Ensemble/#ChemometricsTools.MakeIntervals","page":"Ensemble Models","title":"ChemometricsTools.MakeIntervals","text":"MakeIntervals( columns::Int, intervalsize::Int = 20 )\n\nReturns an 1-Array of intervals from the range: 1 - columns of size 
intervalsize.\n\n\n\n\n\n","category":"function"},{"location":"man/Ensemble/#ChemometricsTools.stackedweights-Tuple{Any}","page":"Ensemble Models","title":"ChemometricsTools.stackedweights","text":"stackedweights(ErrVec; power = 2)\n\nWeights stacked interval errors by the reciprocal power specified. Used for SIPLS, SISPLS, etc.\n\nNi, W. , Brown, S. D. and Man, R. (2009), Stacked partial least squares regression analysis for spectral calibration and prediction. J. Chemometrics, 23: 505-517. doi:10.1002/cem.1246\n\n\n\n\n\n","category":"method"},{"location":"man/Stats/#Stats-API-Reference-1","page":"Stats.","title":"Stats API Reference","text":"","category":"section"},{"location":"man/Stats/#Functions-1","page":"Stats.","title":"Functions","text":"","category":"section"},{"location":"man/Stats/#","page":"Stats.","title":"Stats.","text":"Modules = [ChemometricsTools]\nPages = [\"InHouseStats.jl\"]","category":"page"},{"location":"Demos/CurveResolution/#Curve-Resolution-Demo-1","page":"Curve Resolution","title":"Curve Resolution Demo","text":"","category":"section"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"ChemometricsTools has some curve resolution methods baked in. So far NMF, SIMPLISMA, and MCR-ALS are included. If you aren't familiar with them, they are used to extract spectral and concentration estimates from unknown mixtures in chemical signals. Below is an example of spectra which are composed of signals from a mixture of a 3 components. I could write a volume analyzing this simple set, but this is just a show-case of some methods and how to call them, what kind of results they might give you. 
The beauty of this example is that, we know what is in it, in a forensic or real-world situation we won't know what is in it, and we have to rely on domain knowledge, physical reasoning, and metrics to determine the validity of our results.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Anyways, because we know, the pure spectra look like the following: (Image: pure)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Note: There are three components (water, acetic acid, methanol), but their spectra were collected in duplicate.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"And the concentration profiles of the components follow the following simplex design, (Image: pureC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"But the models we are using will only see the following (no pure components) (Image: impure)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Raw = CSV.read(\"/triliq.csv\");\nMixture = collect(convert(Array, Raw)[:,1:end]);\npure = [10,11,20,21,28,29];\nPURE = Mixture[pure,:];\nimpure = [collect(1:9); collect(12:19);collect(22:27)];\nMixture = Mixture[impure,:];","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Great, so now let's run NMF, SIMPLISMA, and MCR-ALS with the SIMPLISMA estimates.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"( W_NMF, H_NMF ) = NMF(Mixture; Factors = 3, maxiters = 300, tolerance = 1e-8)\n(C_Simplisma,S_Simplisma, vars) = SIMPLISMA(Mixture; Factors = 18)\nvars\n#Find purest variables that are not neighbors with one another\ncuts = 
S_Simplisma[ [1,3,17], :];\n( C_MCRALS, S_MCRALS, err ) = MCRALS(Mixture, nothing, RangeNorm(cuts')(cuts')';\n Factors = 3, maxiters = 10,\n norm = (true, false),\n nonnegative = (true, true) )","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: NMFS)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: SIMPLISMAS)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: MCRALSS)","category":"page"},{"location":"Demos/CurveResolution/#Spectral-Recovery-Discussion-(Results-by-Eye):-1","page":"Curve Resolution","title":"Spectral Recovery Discussion (Results by Eye):","text":"","category":"section"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"As we can see, NMF does resolve a few components that resemble a few of the actual pure components, but it really butchers the 3rd. While SIMPLISMA does a good job, at finding spectra that look \"real\" there are characteristics missing from the true spectra. It must be stated; SIMPLISMA wasn't invented for NIR signals. Finding pure variables in dozens... err... hundreds of over-lapping bands isn't really ideal. 
However, MCR-ALS quickly made work of those initial SIMPLISMA estimates and seems to have found some estimates that somewhat closely resemble the pure components.","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"","category":"page"},{"location":"Demos/CurveResolution/#Concentration-Profile-Discussion-(Results-by-Eye):-1","page":"Curve Resolution","title":"Concentration Profile Discussion (Results by Eye):","text":"","category":"section"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: NMFC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: SIMPLISMAC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"(Image: MCRALSC)","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"SIMPLISMA basically botched this dataset with regards to the concentration profiles. While NMF and MCR-ALS do quite good. Of course preprocessing can help here, and tinkering too. Ultimately not bad, given the mixture components. I do have a paper that shows another approach to this problem doubtful I'd be allowed to rewrite the code, I think my university owns it!","category":"page"},{"location":"Demos/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Casey Kneale, Steven D. 
Brown, Band target entropy minimization and target partial least squares for spectral recovery and quantitation, Analytica Chimica Acta, Volume 1031, 2018, Pages 38-46, ISSN 0003-2670, https://doi.org/10.1016/j.aca.2018.07.054.","category":"page"},{"location":"man/Sampling/#Sampling-API-Reference-1","page":"Sampling","title":"Sampling API Reference","text":"","category":"section"},{"location":"man/Sampling/#Functions-1","page":"Sampling","title":"Functions","text":"","category":"section"},{"location":"man/Sampling/#","page":"Sampling","title":"Sampling","text":"Modules = [ChemometricsTools]\nPages = [\"Sampling.jl\"]","category":"page"},{"location":"man/ClassificationModels/#Classification-Models-API-Reference-1","page":"Classification Models","title":"Classification Models API Reference","text":"","category":"section"},{"location":"man/ClassificationModels/#Functions-1","page":"Classification Models","title":"Functions","text":"","category":"section"},{"location":"man/ClassificationModels/#","page":"Classification Models","title":"Classification Models","text":"Modules = [ChemometricsTools]\nPages = [\"ClassificationModels.jl\"]","category":"page"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianDiscriminant-Tuple{Any,Any,Any}","page":"Classification Models","title":"ChemometricsTools.GaussianDiscriminant","text":"GaussianDiscriminant(M, X, Y; Factors = nothing)\n\nReturns a GaussianDiscriminant classification model on basis object M (PCA, LDA) and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianDiscriminant-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.GaussianDiscriminant","text":"( model::GaussianDiscriminant )( Z; Factors = size(model.ProjectedClassMeans)[2] )\n\nReturns a 1 hot encoded inference from Z using a GaussianDiscriminant object. 
This function enforces positive definiteness in the class covariance matrices.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianNaiveBayes-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.GaussianNaiveBayes","text":"GaussianNaiveBayes(X,Y)\n\nReturns a GaussianNaiveBayes classification model object from X and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.GaussianNaiveBayes-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.GaussianNaiveBayes","text":"(gnb::GaussianNaiveBayes)(X)\n\nReturns a 1 hot encoded inference from X using a GaussianNaiveBayes object.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.KNN","page":"Classification Models","title":"ChemometricsTools.KNN","text":"KNN( X, Y; DistanceType::String )\n\nDistanceType can be \"euclidean\", \"manhattan\". Y Must be one hot encoded.\n\nReturns a KNN classification model.\n\n\n\n\n\n","category":"type"},{"location":"man/ClassificationModels/#ChemometricsTools.KNN-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.KNN","text":"( model::KNN )( Z; K = 1 )\n\nReturns a 1 hot encoded inference from X with K Nearest Neighbors, using a KNN object.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.LogisticRegression-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.LogisticRegression","text":"( model::LogisticRegression )( X )\n\nReturns a 1 hot encoded inference from X using a LogisticRegression object.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.ProbabilisticNeuralNetwork","page":"Classification Models","title":"ChemometricsTools.ProbabilisticNeuralNetwork","text":"ProbabilisticNeuralNetwork( X, Y )\n\nStores data for a PNN. 
Y Must be one hot encoded.\n\nReturns a PNN classification model.\n\n\n\n\n\n","category":"type"},{"location":"man/ClassificationModels/#ChemometricsTools.ProbabilisticNeuralNetwork-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.ProbabilisticNeuralNetwork","text":"(PNN::ProbabilisticNeuralNetwork)(X; sigma = 0.1)\n\nReturns a 1 hot encoded inference from X with a probabilistic neural network.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.ConfidenceEllipse","page":"Classification Models","title":"ChemometricsTools.ConfidenceEllipse","text":"ConfidenceEllipse(cov, mean, confidence, axis = [1,2]; pointestimate = 180 )\n\nReturns a 2-D array whose columns are X & Y coordinates of a confidence ellipse. The ellipse is generated by the covariance matrix, mean vector, and the number of points to include in the plot.\n\n\n\n\n\n","category":"function"},{"location":"man/ClassificationModels/#ChemometricsTools.LinearPerceptronBatch-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.LinearPerceptronBatch","text":"LinearPerceptron(X, Y; LearningRate = 1e-3, MaxIters = 5000)\n\nReturns a batch trained LinearPerceptron classification model object from X and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.LinearPerceptronSGD-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.LinearPerceptronSGD","text":"LinearPerceptronsgd(X, Y; LearningRate = 1e-3, MaxIters = 5000)\n\nReturns a SGD trained LinearPerceptron classification model object from X and one hot encoded Y.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.MultinomialSoftmaxRegression-Tuple{Any,Any}","page":"Classification Models","title":"ChemometricsTools.MultinomialSoftmaxRegression","text":"MultinomialSoftmaxRegression(X, Y; LearnRate = 1e-3, maxiters = 1000, L2 = 0.0)\n\nReturns a LogisticRegression 
classification model made by Stochastic Gradient Descent.\n\n\n\n\n\n","category":"method"},{"location":"man/ClassificationModels/#ChemometricsTools.linearperceptron-Tuple{Any}","page":"Classification Models","title":"ChemometricsTools.linearperceptron","text":"(L::linearperceptron)(X)\n\nReturns a 1 hot encoded inference from X using a LinearPerceptron object.\n\n\n\n\n\n","category":"method"},{"location":"Demos/CalibXfer/#Direct-Standardization-Demo-1","page":"Calibration Transfer","title":"Direct Standardization Demo","text":"","category":"section"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"The point of this demo is to basically show off that ChemometricsTools contains some base methods for Calibration Transfer. If you don't know what that is, it's basically the subset of Chemometrics that focuses on transfer learning data collected on one instrument to another. This saves time and money for instruments that need to be calibrated but perform routine analysis'.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"This demo uses the 2002 pharmaceutical shoot-out data and predicts upon the first property value(pretty sure its API content). The dataset contains the same samples of an unstated pharmaceutical measured on two spectrometers with experimentally determined property values. 
Our goal will be to use one model but adapt the domain from one of the spectrometers to the other.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"First let's look at our linear sources of variation to get a feel for the data,","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"pca = PCA(calib1; Factors = 20);\nplot(cumsum(ExplainedVariance(pca)), title = \"Scree plot\", xlabel = \"PC's\", ylabel = \"Variance Explained\")","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"(Image: scree)","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Yea so this isn't a true Scree plot, but it has the same information...","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Looks like after ~5 factors we have garbage w.r.t X decompositions, good to know. 
So I'd venture to guess a maximum of 15 Latent Variables for a PLS-1 regression is more than a good enough cut-off for cross-validation.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"MaxLvs = 15\nErr = repeat([0.0], MaxLvs);\nModels = []\nfor Lv in MaxLvs:-1:1\n for ( i, ( Fold, HoldOut ) ) in enumerate(KFoldsValidation(10, calib1, caliby))\n if Lv == MaxLvs\n push!( Models, PartialLeastSquares(Fold[1], Fold[2]; Factors = Lv) )\n end\n Err[Lv] += SSE( Models[i]( HoldOut[1]; Factors = Lv), HoldOut[2] )\n end\nend\n\nscatter(Err, xlabel = \"Latent Variables\", ylabel = \"Cumulative SSE\", labels = [\"Error\"])","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"(Image: cv)","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Great, it looks like we can get by with 5-8 LVs. Let's fine tune our Latent Variables based on the hold out set to make our final PLSR model.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"PLSR1 = PartialLeastSquares(calib1, caliby; Factors = 8);\nfor vLv in 5:8\n println(\"LV: \", vLv)\n println(\"RMSEV: \", RMSE(PLSR1(valid1; Factors = vLv), validy))\nend","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Kind of hacky, but it works fine for a demo. We see that 7 factors is optimal on the hold out set, so that's what we'll use from here on,","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"println(\"RMSEP: \", RMSE(PLSR1(tst1; Factors = 7), tsty))","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"> RMSEP: 
4.76860402876937","category":"page"},{"location":"Demos/CalibXfer/#Getting-to-the-point-1","page":"Calibration Transfer","title":"Getting to the point","text":"","category":"section"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"So why do we need to do a calibration transfer? Same chemical, same type of measurements, even the same wavelengths are recorded and compared. Do the naive thing, apply this model to the measurements on instrument 2. See what error you get.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"println(\"RMSEP: \", RMSE(PLSR1(tst2; Factors = 7), tsty))","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":">RMSEP: 10.303430504546292","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"The prediction error is about 2 fold, in this case it'd be hard to argue this is a useful model at all. Especially if you check the residuals. It's pretty clear the contributions of variance across multiple instruments are not the same in this case.","category":"page"},{"location":"Demos/CalibXfer/#Now-for-calibration-transfer!-1","page":"Calibration Transfer","title":"Now for calibration transfer!","text":"","category":"section"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"So let's use DirectStandardization. First we'll find the optimal number of DirectStandardization Factors to include in our model. 
We can do that on our hold out set, and since that is fast we can get away with some inefficient code.","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Factors = 1:15\nErr = repeat([0.0], length(Factors));\nfor F in Factors\n DS2to1 = DirectStandardization(calib1, calib2; Factors = F);\n cds2to1 = DS2to1(valid2; Factors = F)\n Err[F] = RMSE( PLSR1(cds2to1; Factors = 7), validy )\nend\nscatter(Err, title = \"Transferred Model Validation Error\", xlabel = \"Latent Factors\",\n ylabel = \"RMSE\", labels = [\"Error\"])","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"(Image: cv)","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"OptimalDSFactor = argmin(Err)\nDS2to1 = DirectStandardization(calib1, calib2; Factors = OptimalDSFactor);\ntds2to1 = DS2to1(tst2; Factors = OptimalDSFactor);","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Looks like 8 Factors in the DS transfer is pretty good. Let's see how the transferred data compares on the prediction set using the same model,","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"println(\"RMSEP: \", RMSE(PLSR1(tds2to1; Factors = 7), tsty))","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"> RMSEP: 5.693023386113084","category":"page"},{"location":"Demos/CalibXfer/#","page":"Calibration Transfer","title":"Calibration Transfer","text":"Voilà... So in conclusion we can transform the data from instrument 2 to be similar to that of instrument 1. 
The errors we see are effectively commensurate between the data sources with this transform, and without it the error is about 2x greater. Maybe the main point here is \"look ChemometricsTools has some calibration transfer methods and the tools included work\". OSC, TOP, CORAL, etc is also included.","category":"page"},{"location":"Demos/SIPLS/#Stacked-Interval-Partial-Least-Squares-1","page":"SIPLS","title":"Stacked Interval Partial Least Squares","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"Here's a post I kind of debated making... I once read a paper stating that SIPLS was \"too complicated\" to implement, and used that as an argument to favor other methods. SIPLS is actually pretty simple, highly effective, and it has statistical guarantees. What's complicated about SIPLS is providing it to end-users without shielding them from the internals, or leaving them with a pile of hard to read low level code. I decided, the way to go for 'advanced' methods, is to just provide convenience functions. Make life easier for an end-user that knows what they are doing. Demo's are for helping ferry people along and showing at least one way to do things, but there's no golden ticket one-line generic code-base here. 
Providing it, would be a mistake to people who would actually rely on using this sort of method.","category":"page"},{"location":"Demos/SIPLS/#Steps-to-SIPLS-1","page":"SIPLS","title":"4-Steps to SIPLS","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"Break the spectra's columnspace into invervals (the size can be CV'd but below I just picked one), then we CV PLS models inside each interval.\nOn a hold out set(or via pooling), we find the prediction error of our intervals\nThose errors are then reciprocally weighted\nApply those weights to future predictions via multiplication and sum the result of each interval model.","category":"page"},{"location":"Demos/SIPLS/#.-Crossvalidate-the-interval-models-1","page":"SIPLS","title":"1. Crossvalidate the interval models","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"MaxLvs = 10\nCVModels = []\nCVErr = []\nIntervals = MakeIntervals( size(calib1)[2], 30 );\nfor interval in Intervals\n IntervalError = repeat([0.0], MaxLvs);\n Models = []\n\n for Lv in MaxLvs:-1:1\n for ( i, ( Fold, HoldOut ) ) in enumerate(KFoldsValidation(10, calib1, caliby))\n if Lv == MaxLvs\n KFoldModel = PartialLeastSquares(Fold[1][:,interval], Fold[2]; Factors = Lv)\n push!( Models, KFoldModel )\n end\n\n Predictions = Models[i]( HoldOut[1][:, interval]; Factors = Lv)\n IntervalError[Lv] += SSE( Predictions, HoldOut[2])\n end\n end\n OptimalLv = argmin(IntervalError)\n push!(CVModels, PartialLeastSquares(calib1[:, interval], caliby; Factors = OptimalLv) )\n push!(CVErr, IntervalError[OptimalLv] )\nend","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"For fun, we can view the weights of each intervals relative error on the CV'd spectra with this lovely convenience function,","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"IntervalOverlay(calib1, Intervals, 
CVErr)","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"(Image: CVERR)","category":"page"},{"location":"Demos/SIPLS/#.-Validate-1","page":"SIPLS","title":"2. Validate","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"VErr = []\nIntervalError = repeat([0.0], MaxLvs);\nfor (model, interval) in enumerate(Intervals)\n push!(VErr, SSE( CVModels[model](valid1[:,interval]), validy) )\nend","category":"page"},{"location":"Demos/SIPLS/#.-Make-reciprocal-weights-1","page":"SIPLS","title":"3. Make reciprocal weights","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"StackedWeights = stackedweights(VErr);","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"We can recycle that same plot recipe to observe what this weighting function does for us. After calling the stacked weights function we can see how much each interval will contribute to our additive model. In essence, the weights make the intervals with lower error contribute more to the final stacked model. (Image: OS)","category":"page"},{"location":"Demos/SIPLS/#.-Pool-predictions-on-test-set-and-weight-results-1","page":"SIPLS","title":"4. Pool predictions on test set and weight results","text":"","category":"section"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"Results = zeros(size(tst1)[1]);\nfor (model, interval) in enumerate(Intervals)\n Results += CVModels[model](tst1[:,interval]) .* StackedWeights[model]\nend\n\nRMSE( Results, tsty)","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"> 4.09","category":"page"},{"location":"Demos/SIPLS/#","page":"SIPLS","title":"SIPLS","text":"The RMSE from the SIPLS model is ~0.6 units less than what we observe on the same dataset using base PLSR in my Calibration Transfer Demo. This is actually really fast to run too. 
Every line in this script (aside from importing CSV) runs in roughly ~1-2 seconds.","category":"page"},{"location":"Demos/Transforms/#Transforms-Demo-1","page":"Transforms","title":"Transforms Demo","text":"","category":"section"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"Two design choices introduced in this package are \"Transformations\" and \"Pipelines\". Transformations are the smallest unit of a 'pipeline'. They are simply functions that have a deterministic inverse. For example if we mean center our data and store the mean vector, we can always invert the transform by adding the mean back to the data. That's effectively what transforms do, they provide to and from common data transformations used in chemometrics.","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"Let's start with a trivial example with faux data where a random matrix of data is center scaled and divided by the standard deviation(StandardNormalVariate):","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"FauxSpectra1 = randn(10,200);\nSNV = StandardNormalVariate(FauxSpectra1);\nTransformed1 = SNV(FauxSpectra1);","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"As can be seen the application of the StandardNormalVariate() function returns an object that is used to transform future data by the data it was created from. This object can be applied to new data as follows,","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"FauxSpectra2 = randn(10,200);\nTransformed2 = SNV(FauxSpectra2);","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"Transformations can also be inverted (with-in numerical noise). 
For example,","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"RMSE(FauxSpectra1, SNV(Transformed1; inverse = true)) < 1e-14\nRMSE(FauxSpectra2, SNV(Transformed2; inverse = true)) < 1e-14","category":"page"},{"location":"Demos/Transforms/#","page":"Transforms","title":"Transforms","text":"We can use transformations to treat data from multiple sources the same way. This helps mitigate user-error for cases where test data is scaled based on training data, calibration transfer, etc. Pipelines are a logical and convenient extension of transformations.","category":"page"},{"location":"man/CurveResolution/#Curve-Resolution-Models-API-Reference-1","page":"Curve Resolution","title":"Curve Resolution Models API Reference","text":"","category":"section"},{"location":"man/CurveResolution/#Functions-1","page":"Curve Resolution","title":"Functions","text":"","category":"section"},{"location":"man/CurveResolution/#","page":"Curve Resolution","title":"Curve Resolution","text":"Modules = [ChemometricsTools]\nPages = [\"CurveResolution.jl\"]","category":"page"},{"location":"man/CurveResolution/#ChemometricsTools.BTEM","page":"Curve Resolution","title":"ChemometricsTools.BTEM","text":"BTEM(X, bands = nothing; Factors = 3, particles = 50, maxiters = 1000)\n\nReturns a single recovered spectra from a 2-Array X, the selected bands, number of Factors, using a Particle Swarm Optimizer. Note: This is not the function used in the original paper. This will be updated... it was written from memory. Also the original method uses Simulated Annealing not PSO. Band-Target Entropy Minimization (BTEM):  An Advanced Method for Recovering Unknown Pure Component Spectra. Application to the FTIR Spectra of Unstable Organometallic Mixtures. Wee Chew,Effendi Widjaja, and, and Marc Garland. Organometallics 2002 21 (9), 1982-1990. 
DOI: 10.1021/om0108752\n\n\n\n\n\n","category":"function"},{"location":"man/CurveResolution/#ChemometricsTools.BTEMobjective-Tuple{Any,Any}","page":"Curve Resolution","title":"ChemometricsTools.BTEMobjective","text":"BTEMobjective( a, X )\n\nReturns the scalar BTEM objective function obtained from the linear combination vector a and loadings X. Note: This is not the function used in the original paper. This will be updated... it was written from memory.\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#ChemometricsTools.FNNLS-Tuple{Any,Any}","page":"Curve Resolution","title":"ChemometricsTools.FNNLS","text":"FNNLS( A, b; LHS = false, maxiters = 500 )\n\nUses an implementation of Bro et. al's Fast Non-Negative Least Squares on the matrix A and vector b. Returns regression coefficients in the form of a vector. Bro, R., de Jong, S. (1997) A fast non-negativity-constrained least squares algorithm. Journal of Chemometrics, 11, 393-401.\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#ChemometricsTools.MCRALS","page":"Curve Resolution","title":"ChemometricsTools.MCRALS","text":"MCRALS(X, C, S = nothing; norm = (false, false), Factors = 1, maxiters = 20, nonnegative = (false, false) )\n\nPerforms Multivariate Curve Resolution using Alternating Least Squares on X taking initial estimates for S or C. S or C can be constrained by their norm, or by nonnegativity using nonnegative arguments. The number of resolved Factors can also be set. Tauler, R. Izquierdo-Ridorsa, A. Casassas, E. Simultaneous analysis of several spectroscopic titrations with self-modelling curve resolution.Chemometrics and Intelligent Laboratory Systems. 
18, 3, (1993), 293-300.\n\n\n\n\n\n","category":"function"},{"location":"man/CurveResolution/#ChemometricsTools.NMF-Tuple{Any}","page":"Curve Resolution","title":"ChemometricsTools.NMF","text":"NMF(X; Factors = 1, tolerance = 1e-7, maxiters = 200)\n\nPerforms a variation of non-negative matrix factorization on Array X and returns a 2-Tuple of (Concentration Profile, Spectra). Note: This is not a coordinate descent based NMF. This is a simple fast version which works well enough for chemical signals. Algorithms for non-negative matrix factorization. Daniel D. Lee. H. Sebastian Seung. NIPS'00 Proceedings of the 13th International Conference on Neural Information Processing Systems. 535-54\n\n\n\n\n\n","category":"method"},{"location":"man/CurveResolution/#ChemometricsTools.SIMPLISMA-Tuple{Any}","page":"Curve Resolution","title":"ChemometricsTools.SIMPLISMA","text":"SIMPLISMA(X; Factors = 1, alpha = 0.05, includedvars = 1:size(X)[2], SecondDeriv = true)\n\nPerforms SIMPLISMA on Array X using either the raw spectra or the Second Derivative spectra. alpha can be set to reduce contributions of baseline, and a list of included variables in the determination of pure variables may also be provided. Returns a tuple of the following form: (Concentration Profile, Pure Spectral Estimates, Pure Variables). W. Windig, Spectral Data Files for Self-Modeling Curve Resolution with Examples Using the SIMPLISMA Approach, Chemometrics and Intelligent Laboratory Systems, 36, 1997, 3-16.\n\n\n\n\n\n","category":"method"},{"location":"man/AnomalyDetection/#Anomaly-Detection-API-Reference-1","page":"Anomaly Detection","title":"Anomaly Detection API Reference","text":"","category":"section"},{"location":"man/AnomalyDetection/#","page":"Anomaly Detection","title":"Anomaly Detection","text":"ChemometricsTools has a few anomaly detection methods. Feel free to read the API below. 
If that's too abstract, check out the shoot-out example : AnomalyDetection","category":"page"},{"location":"man/AnomalyDetection/#Functions-1","page":"Anomaly Detection","title":"Functions","text":"","category":"section"},{"location":"man/AnomalyDetection/#","page":"Anomaly Detection","title":"Anomaly Detection","text":"Modules = [ChemometricsTools]\nPages = [\"AnomalyDetection.jl\"]","category":"page"},{"location":"man/AnomalyDetection/#ChemometricsTools.Hotelling-Tuple{Any,PCA}","page":"Anomaly Detection","title":"ChemometricsTools.Hotelling","text":"Hotelling(X, pca::PCA; Quantile = 0.05, Variance = 1.0)\n\nComputes the hotelling Tsq and upper control limit cut off of a pca object using a specified Quantile and cumulative variance explained Variance for new or old data X.\n\nA review of PCA-based statistical process monitoring methodsfor time-dependent, high-dimensional data. Bart De Ketelaere https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf\n\n\n\n\n\n","category":"method"},{"location":"man/AnomalyDetection/#ChemometricsTools.Leverage-Tuple{PCA}","page":"Anomaly Detection","title":"ChemometricsTools.Leverage","text":"Leverage(pca::PCA)\n\nCalculates the leverage of samples in a pca object.\n\n\n\n\n\n","category":"method"},{"location":"man/AnomalyDetection/#ChemometricsTools.Q-Tuple{Any,PCA}","page":"Anomaly Detection","title":"ChemometricsTools.Q","text":"Q(X, pca::PCA; Quantile = 0.95, Variance = 1.0)\n\nComputes the Q-statistic and upper control limit cut off of a pca object using a specified Quantile and cumulative variance explained Variance for new or old data X.\n\nA review of PCA-based statistical process monitoring methodsfor time-dependent, high-dimensional data. 
Bart De Ketelaere https://wis.kuleuven.be/stat/robust/papers/2013/deketelaere-review.pdf\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#Clustering-API-Reference-1","page":"Clustering","title":"Clustering API Reference","text":"","category":"section"},{"location":"man/Clustering/#K-means-Elbow-Plot-Recipe-1","page":"Clustering","title":"K-means Elbow Plot Recipe","text":"","category":"section"},{"location":"man/Clustering/#","page":"Clustering","title":"Clustering","text":" using Plots\n ExplainedVar = []\n for K in 1:10\n km = KMeans( X, K; tolerance = 1e-14, maxiters = 1000 )\n TCSS = TotalClusterSS( km )\n WCSS = WithinClusterSS( km )\n #BCSS = BetweenClusterSS( km )\n push!(ExplainedVar, WCSS / TCSS)\n end\n scatter(ExplainedVar, title = \"Elbow Plot\", ylabel = \"WCSS/TCSS\", xlabel = \"Clusters (#)\", label = \"K-means\" )","category":"page"},{"location":"man/Clustering/#Functions-1","page":"Clustering","title":"Functions","text":"","category":"section"},{"location":"man/Clustering/#","page":"Clustering","title":"Clustering","text":"Modules = [ChemometricsTools]\nPages = [\"Clustering.jl\"]","category":"page"},{"location":"man/Clustering/#ChemometricsTools.BetweenClusterSS-Tuple{ChemometricsTools.ClusterModel}","page":"Clustering","title":"ChemometricsTools.BetweenClusterSS","text":"BetweenClusterSS( Clustered::ClusterModel )\n\nReturns a scalar of the between cluster sum of squares for a ClusterModel object.\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#ChemometricsTools.KMeans-Tuple{Any,Any}","page":"Clustering","title":"ChemometricsTools.KMeans","text":"KMeans( X, Clusters; tolerance = 1e-8, maxiters = 200 )\n\nReturns a ClusterModel object after finding clusterings for data in X via MacQueens K-Means algorithm. Clusters is the K parameter, or the # of clusters.\n\nMacQueen, J. B. (1967). Some Methods for classification and Analysis of Multivariate Observations. 
Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. 1. University of California Press. pp. 281–297.\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#ChemometricsTools.TotalClusterSS-Tuple{ChemometricsTools.ClusterModel}","page":"Clustering","title":"ChemometricsTools.TotalClusterSS","text":"TotalClusterSS( Clustered::ClusterModel )\n\nReturns a scalar of the total sum of squares for a ClusterModel object.\n\n\n\n\n\n","category":"method"},{"location":"man/Clustering/#ChemometricsTools.WithinClusterSS-Tuple{ChemometricsTools.ClusterModel}","page":"Clustering","title":"ChemometricsTools.WithinClusterSS","text":"WithinClusterSS( Clustered::ClusterModel )\n\nReturns a scalar of the within cluster sum of squares for a ClusterModel object.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#Distances-API-Reference-1","page":"Distance Measures","title":"Distances API Reference","text":"","category":"section"},{"location":"man/Dists/#Functions-1","page":"Distance Measures","title":"Functions","text":"","category":"section"},{"location":"man/Dists/#","page":"Distance Measures","title":"Distance Measures","text":"Modules = [ChemometricsTools]\nPages = [\"DistanceMeasures.jl\"]","category":"page"},{"location":"man/Dists/#ChemometricsTools.Kernel-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.Kernel","text":"(K::Kernel)(X)\n\nThis is a convenience function to allow for one-line construction of kernels from a Kernel object K and new data X.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.AdjacencyMatrix-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.AdjacencyMatrix","text":"NearestNeighbors(DistanceMatrix)\n\nReturns the nearest neighbor adjacency matrix from a given DistanceMatrix.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.CauchyKernel-Tuple{Any,Any,Any}","page":"Distance 
Measures","title":"ChemometricsTools.CauchyKernel","text":"CauchyKernel(X, Y, sigma)\n\nCreates a Cauchy kernel from Arrays X and Y using hyperparameters sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.CauchyKernel-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.CauchyKernel","text":"CauchyKernel(X, sigma)\n\nCreates a Cauchy kernel from Array X using hyperparameters sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.EuclideanDistance-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.EuclideanDistance","text":"EuclideanDistance(X, Y)\n\nReturns the euclidean distance matrix of X and Y such that the columns are the samples in Y.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.EuclideanDistance-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.EuclideanDistance","text":"EuclideanDistance(X)\n\nReturns the Grahm aka the euclidean distance matrix of X.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.GaussianKernel-Tuple{Any,Any,Any}","page":"Distance Measures","title":"ChemometricsTools.GaussianKernel","text":"GaussianKernel(X, Y, sigma)\n\nCreates a Gaussian/RBF kernel from Arrays X and Y with hyperparameter sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.GaussianKernel-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.GaussianKernel","text":"GaussianKernel(X, sigma)\n\nCreates a Gaussian/RBF kernel from Array X using hyperparameter sigma.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.InClassAdjacencyMatrix","page":"Distance Measures","title":"ChemometricsTools.InClassAdjacencyMatrix","text":"InClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)\n\nComputes the in class Adjacency matrix with K nearest 
neighbors.\n\n\n\n\n\n","category":"function"},{"location":"man/Dists/#ChemometricsTools.LinearKernel-Tuple{Any,Any,Any}","page":"Distance Measures","title":"ChemometricsTools.LinearKernel","text":"LinearKernel(X, Y, c)\n\nCreates a Linear kernel from Arrays X and Y with hyperparameter C.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.LinearKernel-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.LinearKernel","text":"LinearKernel(X, c)\n\nCreates a Linear kernel from Array X and hyperparameter C.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.ManhattanDistance-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.ManhattanDistance","text":"ManhattanDistance(X, Y)\n\nReturns the Manhattan distance matrix of X and Y such that the columns are the samples in Y.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.ManhattanDistance-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.ManhattanDistance","text":"ManhattanDistance(X)\n\nReturns the Manhattan distance matrix of X.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.NearestNeighbors-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.NearestNeighbors","text":"NearestNeighbors(DistanceMatrix, N)\n\nReturns a matrix of dimensions DistanceMatrix rows, by N columns. 
Basically this goes through each row and finds the ones corresponding column which has the smallest distance.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.OutOfClassAdjacencyMatrix","page":"Distance Measures","title":"ChemometricsTools.OutOfClassAdjacencyMatrix","text":"OutOfClassAdjacencyMatrix(DistanceMatrix, YHOT, K = 1)\n\nComputes the out of class Adjacency matrix with K nearest neighbors.\n\n\n\n\n\n","category":"function"},{"location":"man/Dists/#ChemometricsTools.SquareEuclideanDistance-Tuple{Any,Any}","page":"Distance Measures","title":"ChemometricsTools.SquareEuclideanDistance","text":"SquareEuclideanDistance(X, Y)\n\nReturns the squared euclidean distance matrix of X and Y such that the columns are the samples in Y.\n\n\n\n\n\n","category":"method"},{"location":"man/Dists/#ChemometricsTools.SquareEuclideanDistance-Tuple{Any}","page":"Distance Measures","title":"ChemometricsTools.SquareEuclideanDistance","text":"SquareEuclideanDistance(X)\n\nReturns the squared Grahm aka the euclidean distance matrix of X.\n\n\n\n\n\n","category":"method"},{"location":"man/TimeSeries/#Time-Series-API-Reference-1","page":"Time Series","title":"Time Series API Reference","text":"","category":"section"},{"location":"man/TimeSeries/#Functions-1","page":"Time Series","title":"Functions","text":"","category":"section"},{"location":"man/TimeSeries/#","page":"Time Series","title":"Time Series","text":"Modules = [ChemometricsTools]\nPages = [\"TimeSeries.jl\"]","category":"page"},{"location":"Demos/RegressionExample/#Regression/Training-Demo:-1","page":"Regression","title":"Regression/Training Demo:","text":"","category":"section"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"This demo shows a few ways to build a PLS regression model and perform cross validation. 
If you want to see the gamut of regression methods included in ChemometricsTools, check the regression shootout example.","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"There are a few built-ins to make training models a snap. Philosophically, I decided that making wrapper functions to perform cross validation is not fair to the end user. There are many cases where we want specialized CVs, but we don't want to write nested for-loops that run for hours and then debug them... Similarly, most people don't want to spend their time hacking into rigid GridSearch objects or scouring Stack Exchange and package documentation, especially when it would be easier to write an equivalent, self-documenting approach from scratch. Instead, I used Julia's iterators to make K-fold validation convenient. Below is an example Partial Least Squares Regression CV.","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"#Split our data into two parts one 70% one 30%\n((TrainX,TrainY),(TestX, TestY)) = SplitByProportion(x, yprop, 0.7);\n#Preprocess it\nMSC_Obj = MultiplicativeScatterCorrection(TrainX);\nTrainX = MSC_Obj(TrainX);\nTestX = MSC_Obj(TestX);\n#Begin CV!\nLatentVariables = 22\nErr = repeat([0.0], LatentVariables);\n#Note this is the Julian way to nest two loops\nfor Lv in 1:LatentVariables, (Fold, HoldOut) in KFoldsValidation(20, TrainX, TrainY)\n PLSR = PartialLeastSquares(Fold[1], Fold[2]; Factors = Lv)\n Err[Lv] += SSE( PLSR(HoldOut[1]), HoldOut[2] )\nend\nscatter(Err, xlabel = \"Latent Variables\", ylabel = \"Cumulative SSE\", labels = [\"Error\"])\nBestLV = argmin(Err)\nPLSR = PartialLeastSquares(TrainX, TrainY; Factors = BestLV)\nRMSE( PLSR(TestX), TestY )","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"(Image: 
20folds)","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"That's great, right? But hey, that was kind of slow. Knowing what we know about ALS based models, we can do the same operation in linear time with respect to latent factors by fitting the model with the most latent variables first and only recomputing the regression coefficients. An example of this is below.","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"Err = repeat([0.0], 22);\nModels = []\nfor Lv in 22:-1:1\n for ( i, ( Fold, HoldOut ) ) in enumerate(KFoldsValidation(20, TrainX, TrainY))\n if Lv == 22\n push!( Models, PartialLeastSquares(Fold[1], Fold[2]; Factors = Lv) )\n end\n Err[Lv] += SSE( Models[i]( HoldOut[1]; Factors = Lv), HoldOut[2] )\n end\nend","category":"page"},{"location":"Demos/RegressionExample/#","page":"Regression","title":"Regression","text":"This approach is ~5 times faster on a single core (< 2 seconds), goes through 7Gb less data, and makes 1/5th the allocations (on this dataset at least). 
If you wanted you could distribute the inner loop (using Distributed.jl) and see drastic speed ups!","category":"page"},{"location":"man/Plotting/#Plotting-Tools-API-Reference-1","page":"Plotting Tools API Reference","title":"Plotting Tools API Reference","text":"","category":"section"},{"location":"man/Plotting/#Functions-1","page":"Plotting Tools API Reference","title":"Functions","text":"","category":"section"},{"location":"man/Plotting/#","page":"Plotting Tools API Reference","title":"Plotting Tools API Reference","text":"Modules = [ChemometricsTools]\nPages = [\"PlottingTools.jl\"]","category":"page"},{"location":"man/FullAPI/#API-1","page":"Full API","title":"API","text":"","category":"section"},{"location":"man/FullAPI/#","page":"Full API","title":"Full API","text":"CurrentModule = ChemometricsTools\nDocTestSetup = quote\n\tusing ChemometricsTools\nend","category":"page"},{"location":"man/FullAPI/#","page":"Full API","title":"Full API","text":"Modules = [ChemometricsTools]","category":"page"},{"location":"man/FullAPI/#ChemometricsTools.BlandAltman-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.BlandAltman","text":"BlandAltman(Y1, Y2; Confidence = 1.96)\n\nReturns a Plot object of a Bland-Altman plot between vectors Y1 and Y2 with a confidence limit of Confidence.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Bounds-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.Bounds","text":"Bounds(dims)\n\nConstructor for a Bounds object. Returns a bounds object with a lower bound of [lower...] and upper bound[upper...] with length of dims.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Bounds-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Bounds","text":"Bounds(dims)\n\nDefault constructor for a Bounds object. Returns a bounds object with a lower bound of [0...] and upper bound[1...] 
with length of dims.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.CORAL-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.CORAL","text":"CORAL(X1, X2; lambda = 1.0)\n\nPerforms CORAL to facilitate covariance based transfer from X1 to X2 with regularization parameter lambda. Returns a CORAL object.\n\nCorrelation Alignment for Unsupervised Domain Adaptation. Baochen Sun, Jiashi Feng, Kate Saenko. https://arxiv.org/abs/1612.01939\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.CORAL-Tuple{Any}","page":"Full API","title":"ChemometricsTools.CORAL","text":"(C::CORAL)(Z)\n\nApplies a the transform from a learned CORAL object to new data Z.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.CanonicalCorrelationAnalysis-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.CanonicalCorrelationAnalysis","text":"CanonicalCorrelationAnalysis(A, B)\n\nReturns a CanonicalCorrelationAnalysis object which contains (U, V, r) from Arrays A and B. 
Currently untested for correctness, but should compute.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Center-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Center","text":"(T::Center)(Z; inverse = false)\n\nCenters data in array Z column-wise according to learned mean centers in Center object T.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.CenterScale-Tuple{Any}","page":"Full API","title":"ChemometricsTools.CenterScale","text":"(T::CenterScale)(Z; inverse = false)\n\nCenters and scales data in array Z column-wise according to learned measures of central tendency in Scale object T.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ClassicLeastSquares-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.ClassicLeastSquares","text":"ClassicLeastSquares( X, Y; Bias = false )\n\nMakes a ClassicLeastSquares regression model of the form Y = AX with or without a Bias term. Returns a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ClassicLeastSquares-Tuple{Any}","page":"Full API","title":"ChemometricsTools.ClassicLeastSquares","text":"(M::ClassicLeastSquares)(X)\n\nMakes an inference from X using a ClassicLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.GaussianBand","page":"Full API","title":"ChemometricsTools.GaussianBand","text":"GaussianBand(sigma,amplitude,center)\n\nConstructs a Gaussian kernel generator.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.GaussianBand-Tuple{Float64}","page":"Full API","title":"ChemometricsTools.GaussianBand","text":"(B::GaussianBand)(X::Float64)\n\nReturns the scalar probability associated with a GaussianBand object (kernel) at a location in space(X).\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.KFoldsValidation-Tuple{Int64,Any,Any}","page":"Full 
API","title":"ChemometricsTools.KFoldsValidation","text":"KFoldsValidation(K::Int, x, y)\n\nReturns a KFoldsValidation iterator with K folds. Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY).\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LDA-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.LDA","text":"LDA(X, Y; Factors = 1)\n\nCompute's a LinearDiscriminantAnalysis transform from x with a user specified number of latent variables(Factors). Returns an LDA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LDA-Tuple{Any}","page":"Full API","title":"ChemometricsTools.LDA","text":"( model::LDA )( Z; Factors = length(model.Values) )\n\nCalling a LDA object on new data brings the new data Z into the LDA basis.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LSSVM-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.LSSVM","text":"LSSVM( X, Y, Penalty; KernelParameter = 0.0, KernelType = \"linear\" )\n\nMakes a LSSVM model of the form Y = AK with a bias term using a user specified Kernel(\"linear\", or \"gaussian\") and has an L2 Penalty. 
Returns a LSSVM Wrapper for a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LSSVM-Tuple{Any}","page":"Full API","title":"ChemometricsTools.LSSVM","text":"(M::LSSVM)(X)\n\nMakes an inference from X using a LSSVM object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LorentzianBand","page":"Full API","title":"ChemometricsTools.LorentzianBand","text":"LorentzianBand(gamma,amplitude,center)\n\nConstructs a Lorentzian kernel generator.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.LorentzianBand-Tuple{Float64}","page":"Full API","title":"ChemometricsTools.LorentzianBand","text":"(B::LorentzianBand)(X::Float64)\n\nReturns the probability associated with a LorentzianBand object (kernel) at a location in space(X).\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MultiCenter","page":"Full API","title":"ChemometricsTools.MultiCenter","text":"MultiCenter(Z, mode = 1)\n\nAcquires the mean of the specified mode in Z and returns a transform that will remove those means from any future data.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.MultiCenter-Tuple{Any}","page":"Full API","title":"ChemometricsTools.MultiCenter","text":"(T::MultiCenter)(Z; inverse = false)\n\nCenters data in Tensor Z mode-wise according to learned centers in MultiCenter object T.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MultiScale","page":"Full API","title":"ChemometricsTools.MultiScale","text":"MultiScale(Z, mode = 1)\n\nAcquires the standard deviations of the specified mode in Z and returns a transform that will scale by those standard deviations from any future data.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.MultiScale-Tuple{Any}","page":"Full API","title":"ChemometricsTools.MultiScale","text":"(T::MultiScale)(Z; inverse = false)\n\nScales data in Tensor Z mode-wise 
according to learned standard deviations in MultiScale object T.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MultiplicativeScatterCorrection-Tuple{Any}","page":"Full API","title":"ChemometricsTools.MultiplicativeScatterCorrection","text":"(T::MultiplicativeScatterCorrection)(Z)\n\nApplies MultiplicativeScatterCorrection from a stored object T to Array Z.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.OrthogonalSignalCorrection-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.OrthogonalSignalCorrection","text":"OrthogonalSignalCorrection(X, Y; Factors = 1)\n\nPerforms Thomas Fearn's Orthogonal Signal Correction to an endogenous X and exogenous Y. The number of Factors are the number of orthogonal components to be removed from X. This function returns an OSC object.\n\nTom Fearn. On orthogonal signal correction. Chemometrics and Intelligent Laboratory Systems. Volume 50, Issue 1, 2000, Pages 47-52.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.OrthogonalSignalCorrection-Tuple{Any}","page":"Full API","title":"ChemometricsTools.OrthogonalSignalCorrection","text":"(OSC::OrthogonalSignalCorrection)(Z; Factors = 2)\n\nApplies a the transform from a learned orthogonal signal correction object OSC to new data Z.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PCA-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PCA","text":"PCA(X; Factors = minimum(size(X)) - 1)\n\nCompute's a PCA from x using LinearAlgebra's SVD algorithm with a user specified number of latent variables(Factors). 
Returns a PCA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PCA-Tuple{Array}","page":"Full API","title":"ChemometricsTools.PCA","text":"(T::PCA)(Z::Array; Factors = length(T.Values), inverse = false)\n\nCalling a PCA object on new data brings the new data Z into or out of (inverse = true) the PCA basis.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PartialLeastSquares-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.PartialLeastSquares","text":"PartialLeastSquares( X, Y; Factors = minimum(size(X)) - 2, tolerance = 1e-8, maxiters = 200 )\n\nReturns a PartialLeastSquares regression model object from arrays X and Y.\n\nPartial Least-Squares Regression: A Tutorial. Paul Geladi and Bruce R. Kowalski. Analytica Chimica Acta, 186 (1986).\nMartens H., Næs T. Multivariate Calibration. Wiley: New York, 1989.\nRe-interpretation of NIPALS results solves PLSR inconsistency problem. Rolf Ergon. Published in Journal of Chemometrics 2009; Vol. 23/1: 72-75\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PartialLeastSquares-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PartialLeastSquares","text":"(M::PartialLeastSquares)\n\nMakes an inference from X using a PartialLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Particle-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.Particle","text":"Particle(ProblemBounds, VelocityBounds)\n\nDefault constructor for a Particle object. 
It creates a random uniformly distributed particle within the specified ProblemBounds, and limits its velocity to the specified VelocityBounds.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PrincipalComponentRegression-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PrincipalComponentRegression","text":"(M::PrincipalComponentRegression)( X )\n\nMakes an inference from X using a PrincipalComponentRegression object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PrincipalComponentRegression-Tuple{PCA,Any}","page":"Full API","title":"ChemometricsTools.PrincipalComponentRegression","text":"PrincipalComponentRegression(PCAObject, Y )\n\nMakes a PrincipalComponentRegression model object from a PCA Object and property value Y.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.QQ-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.QQ","text":"QQ( Y1, Y2; Quantiles = collect( 1 : 99 ) ./ 100 )\n\nReturns a Plot object of a Quantile-Quantile plot between vectors Y1 and Y2 at the desired Quantiles.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.QuantileTrim","page":"Full API","title":"ChemometricsTools.QuantileTrim","text":"(T::QuantileTrim)(X, inverse = false)\n\nTrims data in array X column-wise according to learned quantiles in QuantileTrim object T. This function does NOT have an inverse.\n\n\n\n\n\n","category":"type"},{"location":"man/FullAPI/#ChemometricsTools.QuantileTrim-Tuple{Any}","page":"Full API","title":"ChemometricsTools.QuantileTrim","text":"QuantileTrim(Z; quantiles::Tuple{Float64,Float64} = (0.05, 0.95) )\n\nTrims values above or below the specified column-wise quantiles to the quantile values themselves.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RangeNorm-Tuple{Any}","page":"Full API","title":"ChemometricsTools.RangeNorm","text":"(T::RangeNorm)(Z; inverse = false)\n\nScales and shifts 
data in array Z column-wise according to learned min-maxes in RangeNorm object T.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RidgeRegression-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.RidgeRegression","text":"RidgeRegression( X, Y, Penalty; Bias = false )\n\nMakes a RidgeRegression model of the form Y = AX with or without a Bias term and has an L2 Penalty. Returns a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RidgeRegression-Tuple{Any}","page":"Full API","title":"ChemometricsTools.RidgeRegression","text":"(M::RidgeRegression)(X)\n\nMakes an inference from X using a RidgeRegression object which wraps a ClassicLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RollingWindow-Tuple{Int64,Int64,Int64}","page":"Full API","title":"ChemometricsTools.RollingWindow","text":"RollingWindow(samples::Int,windowsize::Int,skip::Int)\n\nCreates a RollingWindow iterator from a number of samples and a static windowsize where every iteration skip steps are skipped. The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RollingWindow-Tuple{Int64,Int64}","page":"Full API","title":"ChemometricsTools.RollingWindow","text":"RollingWindow(samples::Int,windowsize::Int)\n\nCreates a RollingWindow iterator from a number of samples and a static windowsize. 
The iterator can be used in for loops to iteratively return indices of a dynamic rolling window.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RunningMean-Tuple{Any}","page":"Full API","title":"ChemometricsTools.RunningMean","text":"RunningMean(x)\n\nConstructs a running mean object with an initial scalar value of x.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RunningVar-Tuple{Any}","page":"Full API","title":"ChemometricsTools.RunningVar","text":"RunningVar(x)\n\nConstructs a RunningVar object with an initial scalar value of x. Note: RunningVar objects implicitly calculate the running mean.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Scale-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Scale","text":"(T::Scale)(Z; inverse = false)\n\nScales data in array Z column-wise according to learned standard deviations in Scale object T.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.TransferByOrthogonalProjection-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.TransferByOrthogonalProjection","text":"TransferByOrthogonalProjection(X1, X2; Factors = 1)\n\nPerforms Thomas Fearns Transfer By Orthogonal Projection to facilitate transfer from X1 to X2. Returns a TransferByOrthogonalProjection object.\n\nAnne Andrew, Tom Fearn. Transfer by orthogonal projection: making near-infrared calibrations robust to between-instrument variation. Chemometrics and Intelligent Laboratory Systems. 
Volume 72, Issue 1, 2004, Pages 51-56,\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.TransferByOrthogonalProjection-Tuple{Any}","page":"Full API","title":"ChemometricsTools.TransferByOrthogonalProjection","text":"(TbOP::TransferByOrthogonalProjection)(X1; Factors = TbOP.Factors)\n\nApplies a the transform from a learned transfer by orthogonal projection object TbOP to new data X1.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Universe-Tuple","page":"Full API","title":"ChemometricsTools.Universe","text":"(U::Universe)(Band...)\n\nA Universe objects internal \"spectra\" can be updated to include the additive contribution of many Band-like objects.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Universe-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.Universe","text":"Universe(mini, maxi; width = nothing, bins = nothing)\n\nCreates a 1-D discretized segment that starts at mini and ends at maxi. The width of the bins for the discretization can either be provided or inferred from the number of bins. Returns a Universe object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Universe-Tuple{Union{GaussianBand, LorentzianBand}}","page":"Full API","title":"ChemometricsTools.Universe","text":"(U::Universe)(Band::Union{ GaussianBand, LorentzianBand})\n\nA Universe objects internal \"spectra\" can be updated to include the additive contribution of any Band-like object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ALSSmoother-Tuple{Any}","page":"Full API","title":"ChemometricsTools.ALSSmoother","text":"ALSSmoother(X; lambda = 100, p = 0.001, maxiters = 10)\n\nApplies an assymetric least squares smoothing function to a 2-Array X. The lambda, p, and maxiters parameters control the smoothness. See the reference below for more information.\n\nPaul H. C. Eilers, Hans F.M. Boelens. 
Baseline Correction with Asymmetric Least Squares Smoothing. 2005\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.AssessHealth-Tuple{Any}","page":"Full API","title":"ChemometricsTools.AssessHealth","text":"AssessHealth( X )\n\nReturns a somewhat detailed Dict containing information about the 'health' of a dataset. What is included is the following: - PercentMissing: percent of missing entries (includes nothing, inf / nan) in the dataset - EmptyColumns: the columns which have only 1 value - RankEstimate: An estimate of the rank of X - (optional)Duplicates: returns the rows of duplicate observations\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ClassificationTree-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.ClassificationTree","text":"ClassificationTree(x, y; gainfn = entropy, maxdepth = 4, minbranchsize = 3)\n\nBuilds a CART object using either gini or entropy as a partioning method. Y must be a one hot encoded 2-Array. Predictions can be formed by calling the following function from the CART object: (M::CART)(x).\n\n*Note: this is a purely nonrecursive decision tree. The julia compiler doesn't like storing structs of nested things. 
I wrote it the recursive way in the past and it was quite slow; I think this is also true of interpreted languages like R/Python... So here it is, nonrecursive trees!\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ColdToHot-Tuple{Any,ChemometricsTools.ClassificationLabel}","page":"Full API","title":"ChemometricsTools.ColdToHot","text":"ColdToHot(Y, Schema::ClassificationLabel)\n\nTurns a cold encoded Y vector into a one hot encoded array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.DirectStandardization-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.DirectStandardization","text":"DirectStandardization(InstrumentX1, InstrumentX2; Factors = minimum(collect(size(InstrumentX1))) - 1)\n\nMakes a DirectStandardization object to facilitate the transfer from Instrument #2 to Instrument #1. The returned object can be used to transfer unseen data to the approximated space of instrument 1. The number of Factors used are those from the internal orthogonal basis.\n\nYongdong Wang and Bruce R. Kowalski, \"Calibration Transfer and Measurement Stability of Near-Infrared Spectrometers,\" Appl. Spectrosc. 46, 764-771 (1992)\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.EWMA-Tuple{Array,Float64}","page":"Full API","title":"ChemometricsTools.EWMA","text":"EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))\n\nConstructs an exponentially weighted moving average object from a vector of scalar property values Initial and the decay parameter Lambda. 
This computes the running statistics necessary for creating the EWMA model using the interval provided and updates the center value to the mean of the provided values.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.EWMA-Tuple{Float64,Float64}","page":"Full API","title":"ChemometricsTools.EWMA","text":"EWMA(Initial::Float64, Lambda::Float64) = ewma(Lambda, Initial, Initial, RunningVar(Initial))\n\nConstructs an exponentially weighted moving average object from an initial scalar property value Initial and the decay parameter Lambda. This defaults the center value to be the initial value.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.EmpiricalQuantiles-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.EmpiricalQuantiles","text":"EmpiricalQuantiles(X, quantiles)\n\nFinds the column-wise quantiles of 2-Array X and returns them in a 2-Array of size quantiles by variables. *Note: This copies the array... Use a subset if memory is the concern. 
*\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ExplainedVariance-Tuple{LDA}","page":"Full API","title":"ChemometricsTools.ExplainedVariance","text":"ExplainedVariance(lda::LDA)\n\nCalculates the explained variance of each singular value in an LDA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ExplainedVariance-Tuple{PCA}","page":"Full API","title":"ChemometricsTools.ExplainedVariance","text":"ExplainedVariance(PCA::PCA)\n\nCalculates the explained variance of each singular value in a pca object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ExtremeLearningMachine","page":"Full API","title":"ChemometricsTools.ExtremeLearningMachine","text":"ExtremeLearningMachine(X, Y, ReservoirSize = 10; ActivationFn = sigmoid)\n\nReturns an ELM regression model object from arrays X and Y, with a user specified ReservoirSize and ActivationFn.\n\nExtreme learning machine: a new learning scheme of feedforward neural networks. Guang-Bin Huang ; Qin-Yu Zhu ; Chee-Kheong Siew. \t2004 IEEE International Joint...\n\n\n\n\n\n","category":"function"},{"location":"man/FullAPI/#ChemometricsTools.FirstDerivative-Tuple{Any}","page":"Full API","title":"ChemometricsTools.FirstDerivative","text":"FirstDerivative(X)\n\nUses the finite difference method to compute the first derivative for every row in X. Note: This operation results in the loss of a column dimension.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.FractionalDerivative","page":"Full API","title":"ChemometricsTools.FractionalDerivative","text":"FractionalDerivative(Y, X = 1 : length(Y); Order = 0.5)\n\nCalculates the Grünwald-Letnikov fractional order derivative on every row of Array Y. Array X is a vector that has the spacing between column-wise entries in Y. X can be a scalar if that is constant (common in spectroscopy). Order is the fractional order of the derivative. 
Note: This operation results in the loss of a column dimension.\n\nThe Fractional Calculus, by Oldham, K.; and Spanier, J. Hardcover: 234 pages. Publisher: Academic Press, 1974. ISBN 0-12-525550-0\n\n\n\n\n\n","category":"function"},{"location":"man/FullAPI/#ChemometricsTools.HLDA-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.HLDA","text":"HLDA(X, YHOT; K = 1, Factors = 1)\n\nCompute's a Hierarchical LinearDiscriminantAnalysis transform from x with a user specified number of latent variables(Factors). The adjacency matrices are created from K nearest neighbors.\n\nReturns an LDA object. Note: this can be used with any other LDA functions such as Gaussian discriminants or explained variance.\n\nLu D, Ding C, Xu J, Wang S. Hierarchical Discriminant Analysis. Sensors (Basel). 2018 Jan 18;18(1). pii: E279. doi: 10.3390/s18010279.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.HighestVote-Tuple{Any}","page":"Full API","title":"ChemometricsTools.HighestVote","text":"HighestVote(yhat)\n\nReturns the column index for each row that has the highest value in one hot encoded yhat. Returns a one cold encoded vector.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.HighestVoteOneHot-Tuple{Any}","page":"Full API","title":"ChemometricsTools.HighestVoteOneHot","text":"HighestVoteOneHot(yhat)\n\nTurns the highest column-wise value to a 1 and the others to zeros per row in a one hot encoded yhat. 
Returns a one cold encoded vector.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.HotToCold-Tuple{Any,ChemometricsTools.ClassificationLabel}","page":"Full API","title":"ChemometricsTools.HotToCold","text":"HotToCold(Y, Schema::ClassificationLabel)\n\nTurns a one hot encoded Y array into a cold encoded vector.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.IntervalOverlay-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.IntervalOverlay","text":"IntervalOverlay(Spectra, Intervals, Err)\n\nDisplays the relative error(Err) of each interval ontop of a Spectra.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.IsColdEncoded-Tuple{Any}","page":"Full API","title":"ChemometricsTools.IsColdEncoded","text":"IsColdEncoded(Y)\n\nReturns a boolean true if the array Y is cold encoded, and false if not.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.KennardStone-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.KennardStone","text":"KennardStone(X, TrainSamples; distance = \"euclidean\")\n\nReturns the indices of the Kennard-Stone sampled exemplars (E), and those not sampled (O) as a 2-Tuple (E, O).\n\nR. W. Kennard & L. A. Stone (1969) Computer Aided Design of Experiments, Technometrics, 111, 137-148, DOI: 10.1080/00401706.1969.10490666\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.KernelRidgeRegression-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.KernelRidgeRegression","text":"KernelRidgeRegression( X, Y, Penalty; KernelParameter = 0.0, KernelType = \"linear\" )\n\nMakes a KernelRidgeRegression model of the form Y = AK using a user specified Kernel(\"Linear\", or \"Guassian\") and has an L2 Penalty. 
Returns a KRR Wrapper for a CLS object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LabelEncoding-Tuple{Any}","page":"Full API","title":"ChemometricsTools.LabelEncoding","text":"\" LabelEncoding(HotOrCold)\n\nDetermines if an Array, Y, is one hot encoded, or cold encoded by it's dimensions. Returns a ClassificationLabel object/schema to convert between the formats.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.LeaveOneOut-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.LeaveOneOut","text":"LeaveOneOut(x, y)\n\nReturns a KFoldsValidation iterator with leave one out folds. Because it's an iterator it can be used in for loops, see the tutorials for pragmatic examples. The iterator returns a 2-Tuple of 2-Tuples which have the following form: ((TrainX,TrainY),(ValidateX,ValidateY).\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Lifeform-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.Lifeform","text":"Lifeform(size, onlikelihood, initialscore)\n\nConstructor for a BinaryLifeForm struct. Binary life forms are basically wrappers for a binary vector, which has a likelihood for being 1(onlikelihood). Each life form also has a score based on it's \"fitness\". So the GA's in this package can be used to minimize or maximize this is an open parameter, but Inf/-Inf is a good initialscore.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Limits-Tuple{ChemometricsTools.ewma}","page":"Full API","title":"ChemometricsTools.Limits","text":"Limits(P::ewma; k = 3.0)\n\nThis function returns the upper and lower control limits with a k span of variance for an EWMA object P.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Logit-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Logit","text":"Logit(Z; inverse = false)\n\nLogit transforms (ln( X / (1 - X) ))) every element in Z. 
The inverse may also be applied. Warning: This can return Infs and NaNs if elements of Z are not suited to the transform.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MAE-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.MAE","text":"MAE( y, yhat )\n\nCalculates Mean Absolute Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MAPE-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.MAPE","text":"MAPE( y, yhat )\n\nCalculates Mean Absolute Percent Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ME-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.ME","text":"ME( y, yhat )\n\nCalculates Mean Error from vectors Y and YHat.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MSE-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.MSE","text":"MSE( y, yhat )\n\nCalculates Mean Squared Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Mean-Tuple{RunningMean}","page":"Full API","title":"ChemometricsTools.Mean","text":"Mean(rv::RunningMean)\n\nReturns the current mean inside of a RunningMean object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Mean-Tuple{RunningVar}","page":"Full API","title":"ChemometricsTools.Mean","text":"Mean(rv::RunningVar)\n\nReturns the current mean inside of a RunningVar object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MultiNorm-Tuple{Any}","page":"Full API","title":"ChemometricsTools.MultiNorm","text":"MultiNorm(T)\n\nComputes the equivalent of the Frobenius norm on a tensor T. 
Returns a scalar.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MultiPCA-Tuple{Any}","page":"Full API","title":"ChemometricsTools.MultiPCA","text":"MultiPCA(X; Factors = 2)\n\nPerforms multiway PCA aka Higher Order SVD aka Tucker, etc. The number of factors decomposed can be a scalar(repeated across all modes) or a vector/tuple for each mode.\n\nReturns a tuple of (Core Tensor, Basis Tensors)\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MulticlassStats-Tuple{Any,Any,Any}","page":"Full API","title":"ChemometricsTools.MulticlassStats","text":"MulticlassStats(Y, GT, schema; Microaverage = true)\n\nCalculates many essential classification statistics based on predicted values Y, and ground truth values GT, using the encoding schema. Returns a tuple whose first entry is a dictionary of averaged statistics, and whose second entry is a dictionary of the form \"Class\" => Statistics Dictionary ...\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.MulticlassThreshold-Tuple{Any}","page":"Full API","title":"ChemometricsTools.MulticlassThreshold","text":"MulticlassThreshold(yhat; level = 0.5)\n\nEffectively does the same thing as Threshold() but per-row across columns.\n\nWarning this function can allow for no class assignments. HighestVote is preferred\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Mutate","page":"Full API","title":"ChemometricsTools.Mutate","text":"Mutate( L::BinaryLifeform, amount = 0.05 )\n\nAssesses each element in the gene vector (inside of L). If a randomly drawn value has a binomial probability of amount the element is mutated.\n\n\n\n\n\n","category":"function"},{"location":"man/FullAPI/#ChemometricsTools.OneHotOdds-Tuple{Any}","page":"Full API","title":"ChemometricsTools.OneHotOdds","text":"OneHotOdds(Y)\n\nCalculates the odds of a one-hot formatted probability matrix. 
Returns a tuple.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PCA_NIPALS-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PCA_NIPALS","text":"PCA_NIPALS(X; Factors = minimum(size(X)) - 1, tolerance = 1e-7, maxiters = 200)\n\nComputes a PCA from X using the NIPALS algorithm with a user specified number of latent variables(Factors). The tolerance is the minimum change in the F norm before ceasing execution. Returns a PCA object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PSO-Tuple{Any,Bounds,Bounds,Int64}","page":"Full API","title":"ChemometricsTools.PSO","text":"PSO(fn, Bounds, VelRange, Particles; tolerance = 1e-6, maxiters = 1000, InertialDecay = 0.5, PersonalWeight = 0.5, GlobalWeight = 0.5, InternalParams = nothing)\n\nMinimizes function fn within the user specified Bounds via a Particle Swarm Optimizer. The particle velocities are limited to the VelRange. The number of particles is defined by the Particles parameter.\n\nReturns a Tuple of the following form: ( GlobalBestPos, GlobalBestScore, P ), where P is an array of the particles used in the optimization.\n\n*Note: if the optimization function requires an additional constant parameter, please pass that parameter to InternalParams. This will only work if the optimized parameter(o) and constant parameter(c) for the function of interest have the following format: F(o,c) *\n\nKennedy, J.; Eberhart, R. (1995). Particle Swarm Optimization. Proceedings of IEEE International Conference on Neural Networks. IV. pp. 1942–1948. 
doi:10.1109/ICNN.1995.488968\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PearsonCorrelationCoefficient-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.PearsonCorrelationCoefficient","text":"PearsonCorrelationCoefficient( y, yhat )\n\nCalculates The Pearson Correlation Coefficient from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PercentRMSE-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.PercentRMSE","text":"PercentRMSE( y, yhat )\n\nCalculates Percent Root Mean Squared Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PerfectSmoother-Tuple{Any}","page":"Full API","title":"ChemometricsTools.PerfectSmoother","text":"PerfectSmoother(X; lambda = 100)\n\nApplies an assymetric least squares smoothing function to a a 2-Array X. The lambda parameter controls the smoothness. See the reference below for more information.\n\nPaul H. C. Eilers. \"A Perfect Smoother\". Analytical Chemistry, 2003, 75 (14), pp 3631–3636.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Pipeline-Tuple{Any,Vararg{Any,N} where N}","page":"Full API","title":"ChemometricsTools.Pipeline","text":"Pipeline( X, FnStack... )\n\nConstruct a pipeline object from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Pipeline-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Pipeline","text":"Pipeline(Transforms)\n\nConstructs a transformation pipeline from vector/tuple of Transforms. 
The Transforms vector are effectively a vector of functions which transform data.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.PipelineInPlace-Tuple{Any,Vararg{Any,N} where N}","page":"Full API","title":"ChemometricsTools.PipelineInPlace","text":"PipelineInPlace( X, FnStack...)\n\nConstruct a pipeline object from vector/tuple of Transforms. The Transforms vector are effectively a vector of functions which transform data. This function makes \"inplace\" changes to the Array X as though it has been sent through the pipeline. This is more efficient if memory is a concern, but can irreversibly transform data in memory depending on the transforms in the pipeline.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RAFFT-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.RAFFT","text":"RAFFT(raw, reference; maxlags::Int = 500, lookahead::Int = 1, minlength::Int = 20, mincorr::Float64 = 0.05)\n\nRAFFT corrects shifts in the raw spectral bands to be similar to those in a given reference spectra through the use of \"recursive alignment by FFT\". It returns an array of corrected spectra/chromatograms. The number of maximum lags can be specified, the lookahead parameter ensures that additional recursive executions are performed so the first solution found is not preemptively accepted, the minimum segment length(minlength) can also be specified if FWHM are estimable, and the minimum cross correlation(mincorr) for a match can dictate whether peaks were found to align or not.\n\nNote This method works best with flat baselines because it repeats last known values when padding aligned spectra. It is highly efficient, and in my tests does a good job, but other methods definitely exist. 
Let me know if other peak Alignment methods are important for your work-flow, I'll see if I can implement them.\n\nApplication of Fast Fourier Transform Cross-Correlation for the Alignment of Large Chromatographic and Spectral Datasets Jason W. H. Wong, Caterina Durante, and, Hugh M. Cartwright. Analytical Chemistry 2005 77 (17), 5655-5661\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RMSE-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.RMSE","text":"RMSE( y, yhat )\n\nCalculates Root Mean Squared Error from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.RSquare-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.RSquare","text":"RSquare( y, yhat )\n\nCalculates R^2 from Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Remove!-Tuple{RunningMean,Any}","page":"Full API","title":"ChemometricsTools.Remove!","text":"Remove!(RM::RunningMean, x)\n\nRemoves an observation(x) from a RunningMean object(RM) and reculates the mean in place.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Remove-Tuple{RunningMean,Any}","page":"Full API","title":"ChemometricsTools.Remove","text":"Remove!(RM::RunningMean, x)\n\nRemoves an observation(x) from a RunningMean object(RM) and recuturns the new RunningMean object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SSE-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.SSE","text":"SSE( y, yhat )\n\nCalculates Sum of Squared Errors from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SSReg-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.SSReg","text":"SSReg( y, yhat )\n\nCalculates Sum of Squared Deviations due to Regression from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SSRes-Tuple{Any,Any}","page":"Full 
API","title":"ChemometricsTools.SSRes","text":"SSRes( y, yhat )\n\nCalculates Sum of Squared Residuals from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SSTotal-Tuple{Any}","page":"Full API","title":"ChemometricsTools.SSTotal","text":"SSTotal( y, yhat )\n\nCalculates Total Sum of Squared Deviations from vectors Y and YHat\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SampleSkewness-Tuple{Any}","page":"Full API","title":"ChemometricsTools.SampleSkewness","text":"SampleSkewness(X)\n\nreturns a measure of skewness for vector X that is corrected for a sample of the population.\n\nJoanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SavitzkyGolay-Tuple{Any,Any,Any,Int64}","page":"Full API","title":"ChemometricsTools.SavitzkyGolay","text":"SavitzkyGolay(X, Delta, PolyOrder, windowsize)\n\nPerforms SavitskyGolay smoothing across every row in an Array X. The window size is the size of the convolution filter, PolyOrder is the order of the polynomial, and Delta is the order of the derivative.\n\nSavitzky, A.; Golay, M.J.E. (1964). \"Smoothing and Differentiation of Data by Simplified Least Squares Procedures\". Analytical Chemistry. 36 (8): 1627–39. doi:10.1021/ac60214a047.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Scale1Norm-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Scale1Norm","text":"Scale1Norm(X)\n\nScales the columns of X by the 1-Norm of each row. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Scale2Norm-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Scale2Norm","text":"Scale2Norm(X)\n\nScales the columns of X by the 2-Norm of each row. 
Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ScaleInfNorm-Tuple{Any}","page":"Full API","title":"ChemometricsTools.ScaleInfNorm","text":"ScaleInfNorm(X)\n\nScales the columns of X by the Inf-Norm of each row. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ScaleMinMax-Tuple{Any}","page":"Full API","title":"ChemometricsTools.ScaleMinMax","text":"ScaleMinMax(X)\n\nScales the columns of X by the Min and Max of each row such that no observation is greater than 1 or less than zero. Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SecondDerivative-Tuple{Any}","page":"Full API","title":"ChemometricsTools.SecondDerivative","text":"FirstDerivative(X)\n\nUses the finite difference method to compute the second derivative for every row in X. Note: This operation results in the loss of two columns.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Shuffle!-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.Shuffle!","text":"Shuffle!( X, Y )\n\nShuffles the rows of the X and Y data without replacement in place. In place, means that this function alters the order of the data in memory and this function does not return anything.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Shuffle-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.Shuffle","text":"Shuffle( X, Y )\n\nShuffles the rows of the X and Y data without replacement. 
It returns a 2-Tuple of the shuffled set.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SinglePointCrossOver-Tuple{BinaryLifeform,BinaryLifeform}","page":"Full API","title":"ChemometricsTools.SinglePointCrossOver","text":"SinglePointCrossOver( L1::BinaryLifeform, L2::BinaryLifeform )\n\nCreates two offspring (new BinaryLifeForms) by mixing the genes from L1 and L2 after a random position in the vector.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Skewness-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Skewness","text":"Skewness(X)\n\nreturns a measure of skewness for a population vector X.\n\nJoanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis”. The Statistician 47(1): 183–189.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.SplitByProportion","page":"Full API","title":"ChemometricsTools.SplitByProportion","text":"SplitByProportion(X::Array, Y::Array,Proportion::Float64 = 0.5)\n\nSplits an X and Associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on the Proportion. The form of the output is the following: ( (X1, Y1), (X2, Y2) )\n\n\n\n\n\n","category":"function"},{"location":"man/FullAPI/#ChemometricsTools.SplitByProportion","page":"Full API","title":"ChemometricsTools.SplitByProportion","text":"SplitByProportion(X::Array, Proportion::Float64 = 0.5)\n\nSplits X Array along the observations dimension into a 2-Tuple based on the Proportion. The form of the output is the following: ( X1, X2 )\n\n\n\n\n\n","category":"function"},{"location":"man/FullAPI/#ChemometricsTools.StandardNormalVariate-Tuple{Any}","page":"Full API","title":"ChemometricsTools.StandardNormalVariate","text":"StandardNormalVariate(X)\n\nScales the columns of X by the mean and standard deviation of each row. 
Returns the scaled array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.StatsDictToDataFrame-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.StatsDictToDataFrame","text":"StatsDictToDataFrame(DictOfStats, schema)\n\nConverts a dictionary of statistics which is returned from MulticlassStats into a labelled dataframe. This is an intermediate step for automated report generation.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.StatsFromTFPN-NTuple{4,Any}","page":"Full API","title":"ChemometricsTools.StatsFromTFPN","text":"StatsFromTFPN(TP, TN, FP, FN)\n\nCalculates many essential classification statistics based on the numbers of True Positive(TP), True Negative(TN), False Positive(FP), and False Negative(FN) examples.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Threshold-Tuple{Any}","page":"Full API","title":"ChemometricsTools.Threshold","text":"Threshold(yhat; level = 0.5)\n\nFor a binary vector yhat this decides if the label is a 0 or a 1 based on it's value relative to a threshold level.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Update!-Tuple{RunningMean,Any}","page":"Full API","title":"ChemometricsTools.Update!","text":"Update!(RM::RunningMean, x)\n\nAdds new observation(x) to a RunningMean object(RM) in place.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Update!-Tuple{RunningVar,Any}","page":"Full API","title":"ChemometricsTools.Update!","text":"Update!(RV::RunningVar, x)\n\nAdds new observation(x) to a RunningVar object(RV) and updates it in place.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Update-Tuple{RunningMean,Any}","page":"Full API","title":"ChemometricsTools.Update","text":"Update!(RM::RunningMean, x)\n\nAdds new observation(x) to a RunningMean object(RM) and returns the new 
object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Variance-Tuple{ChemometricsTools.ewma}","page":"Full API","title":"ChemometricsTools.Variance","text":"Variance(P::ewma)\n\nThis function returns the EWMA control variance.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.Variance-Tuple{RunningVar}","page":"Full API","title":"ChemometricsTools.Variance","text":"Variance(rv::RunningVar)\n\nReturns the current variance inside of a RunningVar object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.VenetianBlinds-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.VenetianBlinds","text":"VenetianBlinds(X,Y)\n\nSplits an X and associated Y Array along the observations dimension into a 2-Tuple of 2-Tuples based on whether each row index is even or odd. The form of the output is the following: ( (X1,Y1), (X2, Y2) )\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.VenetianBlinds-Tuple{Any}","page":"Full API","title":"ChemometricsTools.VenetianBlinds","text":"VenetianBlinds(X)\n\nSplits an X Array along the observations dimension into a 2-Tuple based on whether each row index is even or odd. The form of the output is the following: ( X1, X2 )\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.boxcar-Tuple{Any}","page":"Full API","title":"ChemometricsTools.boxcar","text":"boxcar(X; windowsize = 3, fn = mean)\n\nApplies a boxcar function (fn) to each window of size windowsize to every row in X. Note: the function provided must support a dims argument/parameter.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.entropy-Tuple{Any}","page":"Full API","title":"ChemometricsTools.entropy","text":"entropy(v)\n\nCalculates the Shannon-Entropy of a probability vector v. Returns a scalar. 
A common gain function used in tree methods.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.findpeaks-Tuple{Any}","page":"Full API","title":"ChemometricsTools.findpeaks","text":"findpeaks( vY; m = 3)\n\nFinds the indices of peaks in a vector vY with a window span of 2m. Original R function by Stas_G:(https://stats.stackexchange.com/questions/22974/how-to-find-local-peaks-valleys-in-a-series-of-data) This version is based on a C++ variant by me.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.gini-Tuple{Any}","page":"Full API","title":"ChemometricsTools.gini","text":"gini(p)\n\nCalculates the GINI coefficient of a probability vector p. Returns a scalar. A common gain function used in tree methods.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.offsetToZero-Tuple{Any}","page":"Full API","title":"ChemometricsTools.offsetToZero","text":"offsetToZero(X)\n\nEnsures that no observation(row) of Array X is less than zero, by ensuring the minimum value of each row is zero.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.plotchem-Tuple{QQ}","page":"Full API","title":"ChemometricsTools.plotchem","text":"plotchem(QQ::{QQ, BlandAltman}; title )\n\nreturns either a QQ Plot or a Bland-Altman plot with the defined title\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.rbinomial-Tuple{Any,Vararg{Any,N} where N}","page":"Full API","title":"ChemometricsTools.rbinomial","text":"rbinomial( p, size... )\n\nMakes an N-dimensional array of size(s) size with a probability of being a 1 over a 0 of 1 p.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.sigmoid-Tuple{Any}","page":"Full API","title":"ChemometricsTools.sigmoid","text":"sigmoid(x)\n\nApplies the sigmoid function to a scalar value X. Returns a scalar. 
Can be broadcast over an Array.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ssd-Tuple{Any,Any}","page":"Full API","title":"ChemometricsTools.ssd","text":"ssd(p)\n\nCalculates the sum squared deviations from a decision tree split. Accepts a vector of values, and the mean of that vector. Returns a scalar. A common gain function used in tree methods.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.DirectStandardizationXform-Tuple{Any}","page":"Full API","title":"ChemometricsTools.DirectStandardizationXform","text":"(DSX::DirectStandardizationXform)(X; Factors = length(DSX.pca.Values))\n\nApplies the transform from a learned direct standardization object DSX to new data X.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ELM-Tuple{Any}","page":"Full API","title":"ChemometricsTools.ELM","text":"(M::ELM)(X)\n\nMakes an inference from X using an ELM object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.KRR-Tuple{Any}","page":"Full API","title":"ChemometricsTools.KRR","text":"(M::KRR)(X)\n\nMakes an inference from X using a KRR object which wraps a ClassicLeastSquares object.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ewma-Tuple{Any}","page":"Full API","title":"ChemometricsTools.ewma","text":"EWMA(P::ewma)(New; train = true)\n\nProvides an EWMA score for a New scalar value. If train == true the model is updated to include this new value.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.pipeline-Tuple{Any}","page":"Full API","title":"ChemometricsTools.pipeline","text":"(P::pipeline)(X; inverse = false)\n\nApplies the stored transformations in a pipeline object P to data in X. 
The inverse flag can allow for the transformations to be reversed provided they are invertible functions.\n\n\n\n\n\n","category":"method"},{"location":"man/FullAPI/#ChemometricsTools.ChangeCenter-Tuple{ChemometricsTools.ewma,Float64}","page":"Full API","title":"ChemometricsTools.ChangeCenter","text":"ChangeCenter(P::ewma, new::Float64)\n\nThis is a convenience function to update the center of a P EWMA model, to a new scalar value.\n\n\n\n\n\n","category":"method"},{"location":"#ChemometricsTools.jl-1","page":"Home","title":"ChemometricsTools.jl","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"A Chemometrics Suite for Julia.","category":"page"},{"location":"#","page":"Home","title":"Home","text":"This package offers access to essential chemometrics methods in a convenient and reliable way. It is a lightweight library written for performance and longevity. That being said, it's still a bit of a work in progress and if you find any bugs please make an issue!","category":"page"},{"location":"#Installation:-1","page":"Home","title":"Installation:","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"using Pkg\nPkg.add(\"ChemometricsTools\")","category":"page"},{"location":"#Support:-1","page":"Home","title":"Support:","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"This package was written in Julia 1.0.3 but should run fine in 1.1 or later releases. That's the beauty of from scratch code with minimal dependencies.","category":"page"},{"location":"#Ethos-1","page":"Home","title":"Ethos","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"Dependencies: Only base libraries (LinearAlgebra, StatsBase, Statistics, Plots) etc will be required. This is for longevity, and to provide a fast precompilation time. 
As wonderful as it is that other packages exist to do some of the internal operations this one needs, we won't have to worry about a breaking change made by an external author working out the kinks in a separate package. I want this to be long-term reliable without much upkeep. I'm a busy guy working a day job; I write this to warm up before work and unwind afterwards.","category":"page"},{"location":"#","page":"Home","title":"Home","text":"Arrays Only: In its current state, all of the algorithms available in this package operate exclusively on 1- or 2-dimensional Arrays. To be specific, the format of input arrays should be such that rows are the observations and columns are the variables. This choice was made out of convenience and my personal bias. If enough users want DataFrames, Tables, JuliaDB formats, maybe this will change.","category":"page"},{"location":"#","page":"Home","title":"Home","text":"Center-Scaling: None of the methods in this package will center and scale for you. This package won't waste your time deciding if it should auto-center/scale large chunks of data every time you do a regression/classification.","category":"page"},{"location":"#Why-Julia?-1","page":"Home","title":"Why Julia?","text":"","category":"section"},{"location":"#","page":"Home","title":"Home","text":"In Julia we can do mathematics like R or Matlab (no installations/imports), but write glue code as easily as Python, with the expressiveness of Scala, with (often) the performance of C/C++. Multidispatch makes recycling code painless, and broadcasting allows for intuitive application of operations across collections. I'm not a software engineer, but these things have made Julia my language of choice. 
Try it for a week on Julia 1.0.3, if you don't get hooked, I'd be surprised.","category":"page"}] } diff --git a/src/CurveResolution.jl b/src/CurveResolution.jl index 391834f..4ea4574 100644 --- a/src/CurveResolution.jl +++ b/src/CurveResolution.jl @@ -256,10 +256,12 @@ end """ MCRALS(X, C, S = nothing; norm = (false, false), Factors = 1, maxiters = 20, constraintiters = 500, nonnegative = (false, false), unimodalS = false, fixedunimodal = false ) -Performs Multivariate Curve Resolution using Alternating Least Squares on `X` taking initial estimates for `S` or `C`. -S or C can be constrained by their `norm`, or by nonnegativity using `nonnegative` arguments. S can be constrained by unimodality(EXPERIMENTAL). - +Performs Multivariate Curve Resolution using Alternating Least Squares on `X` taking initial estimates for either `S` or `C`. +The number of maximum iterations for the ALS updates can be set `maxiters`. +S or C can be constrained by their `norm`(true/false,true/false), or by nonnegativity by using `nonnegative` arguments (true/false,true/false). +S can also be constrained by unimodality(`unimodalS`). Two unimodal algorithms exist the `fixedunimodal`(true), and the quadratic (false). The number of resolved `Factors` can also be set. +The number of maximum iterations for constraints can be set by `constraintiters`. Tauler, R. Izquierdo-Ridorsa, A. Casassas, E. Simultaneous analysis of several spectroscopic titrations with self-modelling curve resolution.Chemometrics and Intelligent Laboratory Systems. 18, 3, (1993), 293-300. """