One problem in healthcare is collecting and storing the data that has been gathered. Data mining, a step in the process of Knowledge Discovery in Databases, is a method of unearthing information from large data sets.

Tutorial: Medical Mining

PAKDD 2016 – Medical Mining Tutorial

By: Myra Spiliopoulou, Pedro Pereira Rodrigues, Ernestina Menasalvas


Goodness-of-Fit Testing with SQL Server Part 7.3: The Anderson-Darling Test

By Steve Bolton

…………As mentioned in previous installments of this series of amateur self-tutorials, goodness-of-fit tests can be differentiated in many ways, including by the data and content types of the inputs and the mathematical properties, data types and cardinality of the outputs, not to mention the performance impact of the internal calculations in between them.

Goodness-of-Fit Testing with SQL Server Part 7.2: The Lilliefors Test

By Steve Bolton

…………Since I’m teaching myself as I go in this series of self-tutorials, I often have only a vague idea of the challenges that will arise when trying to implement the next goodness-of-fit test with SQL Server.

How NASA Experiments with Knowledge Discovery

Even in a mature and knowledge-driven organization like NASA, finding an answer to a common business issue can be frustrating.

What, Who, When, Where?

Title …

Goodness-of-Fit Testing with SQL Server Part 7.1: The Kolmogorov-Smirnov and Kuiper’s Tests

By Steve Bolton

…………“The names statisticians use for non-parametric analyses are misnomers too, in my opinion: Kruskal-Wallis tests and Kolmogorov-Smirnov statistics, for example. Good grief! These analyses are simple applications of parametric modeling that belie their intimidating exotic names.”[i]

Goodness-of-Fit Testing with SQL Server Part 6.2: The Ryan-Joiner Test

By Steve Bolton

…………In the last installment of this amateur series of self-tutorials, we saw how the Shapiro-Wilk Test will probably prove less useful to SQL Server users, despite the fact that it is one of the most popular goodness-of-fit tests among statisticians and researchers.

