The similarity indices offered in iramuteq are those available in the proxy library written by David Meyer and Christian Buchta. The descriptions of the indices below are taken from that library's documentation.
id | var1 | var2 | var3 |
---|---|---|---|
1 | 1 | 0 | 1 |
2 | 0 | 1 | 1 |
3 | 1 | 1 | 0 |
4 | 0 | 0 | 1 |
5 | 0 | 1 | 1 |
Table 1
var2 \ var1 | 1 | 0 |
---|---|---|
1 | a | b |
0 | c | d |
Table 2
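n = a + b + c + d = number of rows in the data table (5 in Table 1). Here a counts the rows where both variables equal 1, d the rows where both equal 0, and b and c the rows where only one of the two variables equals 1.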
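As an illustration, the counts a, b, c and d can be obtained from two columns of Table 1 with a few lines of Python. This is only a sketch: the names `table1` and `contingency` are hypothetical and are not part of iramuteq or of the proxy library.

```python
# Table 1 from the text, one list per variable (rows id 1 to 5).
table1 = {
    "var1": [1, 0, 1, 0, 0],
    "var2": [0, 1, 1, 0, 1],
    "var3": [1, 1, 0, 1, 1],
}

def contingency(x, y):
    """Return (a, b, c, d, n) for two binary columns.

    Layout follows Table 2, with x in the role of var1 (columns)
    and y in the role of var2 (rows).
    """
    if len(x) != len(y):
        raise ValueError("columns must have the same length")
    pairs = list(zip(x, y))
    a = sum(1 for xi, yi in pairs if xi == 1 and yi == 1)  # both equal 1
    b = sum(1 for xi, yi in pairs if xi == 0 and yi == 1)  # only y equals 1
    c = sum(1 for xi, yi in pairs if xi == 1 and yi == 0)  # only x equals 1
    d = sum(1 for xi, yi in pairs if xi == 0 and yi == 0)  # both equal 0
    return a, b, c, d, a + b + c + d

print(contingency(table1["var1"], table1["var2"]))  # (1, 2, 1, 1, 5)
```

For var1 and var2 this gives a = 1, b = 2, c = 1, d = 1 and n = 5.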
a
see Russel
names | Jaccard, binary, Reyssac, Roux |
---|---|
type | binary |
loop | FALSE |
formula | a / (a + b + c) |
reference | Jaccard, P. (1908). Nouvelles recherches sur la distribution florale. Bull. Soc. Vaud. Sci. Nat., 44, pp. 223--270. |
description | The Jaccard Similarity (C implementation) for binary data. It is the proportion of (TRUE, TRUE) pairs, but not considering (FALSE, FALSE) pairs. So it compares the intersection with the union of object sets. |

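The "intersection over union" reading of the Jaccard index can be checked on var1 and var2 of Table 1. The sketch below is illustrative Python, not the C implementation mentioned in the description; the function names are hypothetical, and returning `None` for an empty union is an assumption made here for convenience.

```python
def jaccard_from_counts(a, b, c):
    """a / (a + b + c); None when a + b + c == 0 (assumed convention)."""
    union = a + b + c
    return a / union if union else None

def jaccard_from_sets(x, y):
    """The same index computed as |intersection| / |union| of two sets."""
    union = x | y
    return len(x & y) / len(union) if union else None

# Row ids of Table 1 in which each variable equals 1.
var1_rows = {1, 3}
var2_rows = {2, 3, 5}

print(jaccard_from_counts(1, 2, 1))             # 0.25
print(jaccard_from_sets(var1_rows, var2_rows))  # 0.25
```

Both computations agree because a is the size of the intersection and a + b + c is the size of the union; d never enters the formula.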
names | Kulczynski1 |
---|---|
type | binary |
loop | TRUE |
formula | a / (b + c) |
reference | Kurczynski, T.W. (1970). Generalized distance and discrete variables. Biometrics, 26, pp. 525--534. |
description | Kulczynski Similarity for binary data. Relates the (TRUE, TRUE) pairs to discordant pairs. |

names | Kulczynski2 |
---|---|
type | binary |
loop | TRUE |
formula | [a / (a + b) + a / (a + c)] / 2 |
reference | Kurczynski, T.W. (1970). Generalized distance and discrete variables. Biometrics, 26, pp. 525--534. |
description | Kulczynski Similarity for binary data. Relates the (TRUE, TRUE) pairs to the discordant pairs. |

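The two Kulczynski formulas behave differently: the first is an unbounded ratio, the second an average of two proportions and therefore stays between 0 and 1. A minimal Python sketch, using the counts a = 1, b = 2, c = 1 obtained for var1 and var2 in Table 1 (function names are hypothetical, and exact fractions are used only to make the results easy to read):

```python
from fractions import Fraction

def kulczynski1(a, b, c):
    """a / (b + c): (TRUE, TRUE) pairs relative to discordant pairs."""
    return Fraction(a, b + c)  # raises ZeroDivisionError when b + c == 0

def kulczynski2(a, b, c):
    """[a / (a + b) + a / (a + c)] / 2: mean of two proportions."""
    return (Fraction(a, a + b) + Fraction(a, a + c)) / 2

a, b, c = 1, 2, 1
print(kulczynski1(a, b, c))  # 1/3
print(kulczynski2(a, b, c))  # 5/12
```

Note that Kulczynski1 is undefined when the two variables never disagree (b + c = 0); how that case is handled is left open in this sketch.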
names | Mountford |
---|---|
type | binary |
loop | TRUE |
formula | 2a / (ab + ac + 2bc) |
reference | Mountford, M.D. (1962). An index of similarity and its application to classificatory problems. In P.W. Murphy (ed.), Progress in Soil Zoology, pp. 43--50. Butterworth, London. |
description | The Mountford Similarity for binary data. |

names | Fager, McGowan |
---|---|
type | binary |
loop | TRUE |
formula | a / sqrt((a + b)(a + c)) - 1 / 2 sqrt(a + c) |
reference | Fager, E. W. and McGowan, J. A. (1963). Zooplankton species groups in the North Pacific. Science, N. Y. 140: 453-460. |
description | The Fager / McGowan distance. |

names | Russel, Rao |
---|---|
type | binary |
loop | TRUE |
formula | a / n |
reference | Russell, P.F., and Rao T.R. (1940). On habitat and association of species of anopheline larvae in southeastern Madras. J. Malaria Inst. India 3, pp. 153--178. |
description | The Russel/Rao Similarity for binary data. It is just the proportion of (TRUE, TRUE) pairs. |

names | simple matching, Sokal/Michener |
---|---|
type | binary |
loop | TRUE |
formula | (a + d) / n |
reference | Sokal, R.R., and Michener, C.D. (1958). A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull., 39, pp. 1409--1438. |
description | The Simple Matching Similarity for binary data. It is the proportion of concordant pairs. |

names | Hamman |
---|---|
type | binary |
loop | TRUE |
formula | ([a + d] - [b + c]) / n |
reference | Hamann, U. (1961). Merkmalbestand und Verwandtschaftsbeziehungen der Farinosae. Ein Beitrag zum System der Monokotyledonen. Willdenowia, 2, pp. 639-768. |
description | The Hamman Matching Similarity for binary data. It is the proportion difference of the concordant and discordant pairs. |

names | Faith |
---|---|
type | binary |
loop | TRUE |
formula | (a + d/2) / n |
reference | Belbin, L., Marshall, C. & Faith, D.P. (1983). Representing relationships by automatic assignment of colour. The Australian Computing Journal 15, 160-163. |
description | The Faith similarity. |

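The four indices that divide by n (Russel/Rao, simple matching, Hamman and Faith) are closely related: from the formulas, Hamman equals 2 × simple matching − 1, and Faith is the mean of Russel/Rao and simple matching. The following Python sketch checks this on the counts a = 1, b = 2, c = 1, d = 1 obtained for var1 and var2 in Table 1; the function names are hypothetical.

```python
from fractions import Fraction

def russel_rao(a, b, c, d):
    return Fraction(a, a + b + c + d)                    # a / n

def simple_matching(a, b, c, d):
    return Fraction(a + d, a + b + c + d)                # (a + d) / n

def hamman(a, b, c, d):
    return Fraction((a + d) - (b + c), a + b + c + d)    # ([a + d] - [b + c]) / n

def faith(a, b, c, d):
    return (a + Fraction(d, 2)) / (a + b + c + d)        # (a + d/2) / n

counts = (1, 2, 1, 1)
rr = russel_rao(*counts)
sm = simple_matching(*counts)
ha = hamman(*counts)
fa = faith(*counts)
print(rr, sm, ha, fa)  # 1/5 2/5 -1/5 3/10

# Relations that follow directly from the formulas.
assert ha == 2 * sm - 1
assert fa == (rr + sm) / 2
```

Hamman is the only one of the four that can be negative, which happens whenever discordant pairs outnumber concordant ones.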
names | Tanimoto, Rogers |
---|---|
type | binary |
loop | TRUE |
formula | (a + d) / (a + 2b + 2c + d) |
reference | Rogers, D.J, and Tanimoto, T.T. (1960). A computer program for classifying plants. Science, 132, pp. 1115--1118. |
description | The Rogers/Tanimoto Similarity for binary data. Similar to the simple matching coefficient, but putting double weight on the discordant pairs. |

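The link with the simple matching coefficient SM = (a + d) / n can be made explicit: rewriting the denominator as 2n − (a + d) gives Rogers/Tanimoto = SM / (2 − SM). A small Python check, assuming the counts a = 1, b = 2, c = 1, d = 1 obtained for var1 and var2 in Table 1 (function names are hypothetical):

```python
from fractions import Fraction

def simple_matching(a, b, c, d):
    return Fraction(a + d, a + b + c + d)

def rogers_tanimoto(a, b, c, d):
    # Discordant pairs b and c are counted twice in the denominator.
    return Fraction(a + d, a + 2 * b + 2 * c + d)

counts = (1, 2, 1, 1)
sm = simple_matching(*counts)
rt = rogers_tanimoto(*counts)
print(sm, rt)  # 2/5 1/4

# Rogers/Tanimoto is a monotone transformation of simple matching.
assert rt == sm / (2 - sm)
```

Because the transformation is monotone, the two indices rank pairs of variables in the same order; only the spacing of the values differs.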
names | Dice, Czekanowski, Sorensen |
---|---|
type | binary |
loop | TRUE |
formula | 2a / (2a + b + c) |
reference | Dice, L.R. (1945). Measures of the amount of ecologic association between species. Ecology, 26, pp. 297--302. |
description | The Dice Similarity. |

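Like Jaccard, the Dice index ignores d, but it gives double weight to the (TRUE, TRUE) pairs. From the two formulas, Dice = 2J / (1 + J) where J = a / (a + b + c) is the Jaccard index. A short Python sketch with hypothetical function names, using the counts a = 1, b = 2, c = 1 obtained for var1 and var2 in Table 1:

```python
from fractions import Fraction

def jaccard(a, b, c):
    return Fraction(a, a + b + c)

def dice(a, b, c):
    return Fraction(2 * a, 2 * a + b + c)

a, b, c = 1, 2, 1
j = jaccard(a, b, c)
s = dice(a, b, c)
print(j, s)  # 1/4 2/5

# Dice is a monotone transformation of Jaccard.
assert s == 2 * j / (1 + j)
```

The Dice value is never smaller than the Jaccard value for the same pair of variables.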
names | Phi |
---|---|
type | binary |
loop | TRUE |
formula | (ad - bc) / sqrt[(a + b)(c + d)(a + c)(b + d)] |
reference | Sokal, R.R, and Sneath, P.H.A. (1963). Principles of numerical taxonomy. W.H. Freeman and Company, San Francisco. |
description | The Phi Similarity (= Product-Moment-Correlation for binary variables) |

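The equivalence with the product-moment correlation can be verified numerically on var1 and var2 of Table 1, for which a = 1, b = 2, c = 1, d = 1. The Python below is a sketch with hypothetical function names; the Pearson correlation is computed by hand so that no external library is assumed.

```python
import math

def phi(a, b, c, d):
    """(ad - bc) / sqrt[(a + b)(c + d)(a + c)(b + d)]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def pearson(x, y):
    """Product-moment correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

var1 = [1, 0, 1, 0, 0]
var2 = [0, 1, 1, 0, 1]

r_counts = phi(1, 2, 1, 1)
r_raw = pearson(var1, var2)
print(round(r_counts, 4), round(r_raw, 4))  # -0.1667 -0.1667
```

Both routes give −1/6. The index is undefined when one of the variables is constant, since a marginal total in the denominator is then zero.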
names | Stiles |
---|---|
type | binary |
loop | TRUE |
formula | log(n(abs(ad - bc) - 0.5n)^2 / [(a + b)(c + d)(a + c)(b + d)]) |
reference | Stiles, H.E. (1961). The association factor in information retrieval. Communications of the ACM, 8, 1, pp. 271--279. |
description | The Stiles Similarity (used for information retrieval). Identical to the logarithm of Krylov's distance. |

names | Michael |
---|---|
type | binary |
loop | TRUE |
formula | 4(ad - bc) / [(a + d)^2 + (b + c)^2] |
reference | Cox, T.F., and Cox, M.A.A. (2001). Multidimensional Scaling. Chapman and Hall. |
description | The Michael Similarity |

names | Mozley, Margalef |
---|---|
type | binary |
loop | TRUE |
formula | an / [(a + b)(a + c)] |
reference | Margalef, D.R. (1958). Information theory in ecology. Gen.Systems, 3, pp. 36--71. |
description | The Mozley/Margalef Similarity |

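One way to read the Mozley/Margalef formula: (a + b)(a + c) / n is the number of (TRUE, TRUE) pairs expected if the two variables were independent, so the index is the ratio of observed to expected co-occurrences. This interpretation follows from the formula itself rather than from the proxy documentation. A Python sketch with hypothetical function names, using the counts a = 1, b = 2, c = 1, d = 1 obtained for var1 and var2 in Table 1:

```python
from fractions import Fraction

def mozley_margalef(a, b, c, d):
    n = a + b + c + d
    return Fraction(a * n, (a + b) * (a + c))

def expected_cooccurrence(a, b, c, d):
    """Expected number of (TRUE, TRUE) pairs under independence."""
    n = a + b + c + d
    return Fraction((a + b) * (a + c), n)

counts = (1, 2, 1, 1)
mm = mozley_margalef(*counts)
exp_a = expected_cooccurrence(*counts)
print(mm, exp_a)  # 5/6 6/5

# Observed / expected co-occurrences gives the same value.
assert mm == counts[0] / exp_a
```

A value below 1, as here, means the two variables co-occur less often than independence would predict; a value above 1 means they co-occur more often.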
names | Yule |
---|---|
type | binary |
loop | TRUE |
formula | (ad - bc) / (ad + bc) |
reference | Yule, G.U. (1912). On measuring associations between attributes. J. Roy. Stat. Soc., 75, pp. 579--642. |
description | Yule Similarity |

names | Yule2 |
---|---|
type | binary |
loop | TRUE |
formula | (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc)) |
reference | Yule, G.U. (1912). On measuring associations between attributes. J. Roy. Stat. Soc., 75, pp. 579--642. |
description | Yule Similarity |

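The two Yule indices share the same description but are not equal; algebraically, Yule = 2·Yule2 / (1 + Yule2²), so they always have the same sign and Yule2 is the one closer to zero. A Python sketch under the usual assumptions (hypothetical function names, counts a = 1, b = 2, c = 1, d = 1 obtained for var1 and var2 in Table 1):

```python
import math

def yule(a, b, c, d):
    """(ad - bc) / (ad + bc)."""
    return (a * d - b * c) / (a * d + b * c)

def yule2(a, b, c, d):
    """(sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc))."""
    p, q = math.sqrt(a * d), math.sqrt(b * c)
    return (p - q) / (p + q)

counts = (1, 2, 1, 1)
q_val = yule(*counts)
y_val = yule2(*counts)
print(round(q_val, 4), round(y_val, 4))  # -0.3333 -0.1716

# The first index is a monotone function of the second.
assert math.isclose(q_val, 2 * y_val / (1 + y_val ** 2))
```

Both functions raise `ZeroDivisionError` when ad and bc are both zero; that case is deliberately left unhandled here.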
names | Ochiai |
---|---|
type | binary |
loop | TRUE |
formula | a / sqrt[(a + b)(a + c)] |
reference | Sokal, R.R, and Sneath, P.H.A. (1963). Principles of numerical taxonomy. W.H. Freeman and Company, San Francisco. |
description | The Ochiai Similarity |

names | Simpson |
---|---|
type | binary |
loop | TRUE |
formula | a / min{(a + b), (a + c)} |
reference | Simpson, G.G. (1960). Notes on the measurement of faunal resemblance. American Journal of Science 258-A: 300-311. |
description | The Simpson Similarity (used in Zoology). |

names | Braun-Blanquet |
---|---|
type | binary |
loop | TRUE |
formula | a / max{(a + b), (a + c)} |
reference | Braun-Blanquet, J. (1964): Pflanzensoziologie. Springer Verlag, Wien and New York. |
description | The Braun-Blanquet Similarity (used in Biology). |

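Ochiai, Simpson and Braun-Blanquet all divide a by a summary of the two marginal totals (a + b) and (a + c): the geometric mean, the minimum and the maximum respectively. Since the geometric mean lies between the minimum and the maximum, Braun-Blanquet ≤ Ochiai ≤ Simpson always holds. A Python sketch with hypothetical function names, using the counts a = 1, b = 2, c = 1 obtained for var1 and var2 in Table 1:

```python
import math

def ochiai(a, b, c):
    return a / math.sqrt((a + b) * (a + c))   # geometric mean of the totals

def simpson(a, b, c):
    return a / min(a + b, a + c)              # smaller total

def braun_blanquet(a, b, c):
    return a / max(a + b, a + c)              # larger total

a, b, c = 1, 2, 1
bb = braun_blanquet(a, b, c)
oc = ochiai(a, b, c)
si = simpson(a, b, c)
print(round(bb, 4), round(oc, 4), round(si, 4))  # 0.3333 0.4082 0.5

# Ordering implied by max >= geometric mean >= min.
assert bb <= oc <= si
```

The three values coincide when the two variables have the same number of 1s (b = c).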