| Type: | Package | 
| Title: | Refined Modified Stahel-Donoho (MSD) Estimators for Outlier Detection (Parallel Version) | 
| Version: | 0.1.1 | 
| Suggests: | testthat (≥ 3.0.0) | 
| Depends: | R (≥ 2.10), stats | 
| Imports: | parallel, doParallel, foreach | 
| Description: | Provides RMSDp(), a parallel implementation of the modified Stahel-Donoho (MSD) estimators for multivariate outlier detection. RMSDp() is intended for elliptically distributed datasets and identifies outliers based on Mahalanobis distance. It targets higher-dimensional datasets that cannot be handled by the single-core function RMSD() in the 'RMSD' package. See Wada and Tsubaki (2013) <doi:10.1109/CLOUDCOM-ASIA.2013.86> for details of the algorithm. | 
| License: | GPL (≥ 3) | 
| Encoding: | UTF-8 | 
| Language: | en-US | 
| RoxygenNote: | 7.3.1 | 
| Config/testthat/edition: | 3 | 
| LazyData: | true | 
| NeedsCompilation: | no | 
| Packaged: | 2024-06-10 13:48:49 UTC; wada | 
| Author: | Kazumi Wada | 
| Maintainer: | Kazumi Wada <kazwd2008@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-06-12 21:00:21 UTC | 
Modified Stahel-Donoho Estimators (parallel version)
Description
This function performs multivariate outlier detection.
- version 0.0.1 (2013/06/15): related paper, DOI: 10.1109/CLOUDCOM-ASIA.2013.86
- version 0.0.2 (2021/11/15): outlier detection step added
- version 0.0.3 (2022/08/12): bug in random seed setting fixed
Usage
RMSDp(inp, cores = 0, nb = 0, sd = 0, pt = 0.999, dv = 10000)
Arguments
| inp | input data (a numeric matrix) | 
| cores | number of cores used for this function | 
| nb | number of bases | 
| sd | seed (for reproducibility) | 
| pt | threshold for outlier detection (probability) | 
| dv | maximum number of elements processed together on the same core | 
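The call below is a minimal sketch of how these arguments might be passed, assuming the 'RMSDp' package is installed and at least two cores are available. The wrapper `run_rmsdp_on_wine()` is a hypothetical helper introduced only for this illustration, and the values `cores = 2` and `sd = 1` are arbitrary choices, not recommendations from the package.

```r
# Sketch only: runs RMSDp() on the bundled `wine` data when the package is
# installed, and returns NULL otherwise. The helper name is hypothetical.
run_rmsdp_on_wine <- function(cores = 2, seed = 1) {
  if (!requireNamespace("RMSDp", quietly = TRUE)) {
    message("Package 'RMSDp' is not installed; skipping the example.")
    return(NULL)
  }
  # RMSDp() expects a numeric matrix, so convert the data frame first.
  x <- as.matrix(RMSDp::wine)
  # `sd` is the seed argument; nb, pt and dv keep their documented defaults.
  RMSDp::RMSDp(x, cores = cores, sd = seed)
}

res <- run_rmsdp_on_wine()
if (!is.null(res)) {
  print(res$cf)              # threshold used for detection
  print(which(res$ot == 2))  # row indices flagged as outliers
}
```

Fixing `sd` to a constant is what makes repeated runs comparable; the remaining arguments are left at the defaults shown in the Usage line.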
Value
A list with the following elements:
- u final mean vector 
- V final covariance matrix 
- wt final weights 
- mah squared Mahalanobis distances 
- cf threshold for detecting outliers (percentile point) 
- ot outlier flag (1:normal observation, 2:outlier) 
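To make the relationship between `mah`, `cf`, and `ot` concrete, the following base R sketch computes squared Mahalanobis distances, a cutoff, and a 1/2 flag on simulated data. It is an illustration under stated assumptions, not the package's method: the classical mean and covariance stand in for the robust MSD estimates, and a chi-square quantile is assumed as the cutoff, which may differ from what RMSDp() uses internally.

```r
# Illustration only: classical estimates stand in for the robust MSD ones.
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)  # 200 observations, 3 variables
x[1, ] <- c(10, 10, 10)                # plant one obvious outlier in row 1

u <- colMeans(x)  # stand-in for the final mean vector
V <- cov(x)       # stand-in for the final covariance matrix

# stats::mahalanobis() returns squared distances.
mah <- mahalanobis(x, center = u, cov = V)

# Assumed cutoff: chi-square quantile at probability pt (an assumption,
# not confirmed to be the rule RMSDp() applies).
pt <- 0.999
cf <- qchisq(pt, df = ncol(x))

# Same coding as the documented flag: 1 = normal observation, 2 = outlier.
ot <- ifelse(mah > cf, 2L, 1L)
which(ot == 2L)
```

With non-robust estimates a cluster of outliers can inflate the covariance and mask itself, which is the situation robust estimators such as MSD are meant to address.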
Wine dataset in UCI Machine Learning Repository
Description
The Wine dataset from the UCI Machine Learning Repository.
Usage
wine
Format
A data frame with 178 rows and 13 columns:
- Alcohol
- Malic acid
- Ash
- Alcalinity of ash
- Magnesium
- Total phenols
- Flavonoids
- Nonflavanoid phenols
- Proanthocyanins
- Color intensity
- Hue
- OD280/OD315 of diluted wines
- Proline
Source
<https://archive.ics.uci.edu/dataset/109/wine>