NEWSGROUPS Dissimilarity dataset.
D = NEWSGROUPS

This is a small part of the so-called 20Newsgroups data, as considered by Roweis. A non-metric correlation measure for messages from four classes of newsgroups, 'comp.*', 'rec.*', 'sci.*' and 'talk.*', is computed on the occurrence of 100 words across 16242 postings.

Reference(s)
E. Pekalska and R.P.W. Duin, The Dissimilarity Representation for Pattern Recognition, Foundations and Applications, World Scientific, Singapore, 2005.

See also
prtools, datasets, distools, prdisdata