In most commonly used ranking systems, some level of underlying transitivity is assumed. If transitivity exists in a system then information about pairwise comparisons can be translated to other linked pairs. For example, if typically A beats B and B beats C, this could inform us about the expected outcome between A and C. We show that in the seminal Bradley-Terry model knowing the probabilities of A beating B and B beating C completely defines the probability of A beating C, with these probabilities determined by individual skill levels of A, B and C. Users of this model tend not to investigate the validity of this transitive assumption, nor that some skill levels may not be statistically significantly different from each other; the latter leading to false conclusions about rankings. We provide a novel extension to the Bradley-Terry model, which accounts for both of these features: the intransitive relationships between pairs of objects are dealt with through interaction terms that are specific to each pair; and by partitioning the nݑ›nitalic_n skills into A+1≤nÝ´1ݑ›A+1\leq nitalic_A + 1 ≤ italic_n distinct clusters, any differences in the objects’ skills become significant, given appropriate AÝ´Aitalic_A. With nݑ›nitalic_n competitors there are n(n-1)/2ݑ›ݑ›12n(n-1)/2italic_n ( italic_n - 1 ) / 2 interactions, so even in multiple round robin competitions this gives too many parameters to efficiently estimate. Therefore we separately cluster the n(n-1)/2ݑ›ݑ›12n(n-1)/2italic_n ( italic_n - 1 ) / 2 values of intransitivity into KݾKitalic_K clusters, giving (A,K)ݴݾ(A,K)( italic_A , italic_K ) estimatable values respectively, typically with A+K
Keywords: baseball, Bayesian hierarchical modelling, Bradley-Terry, clustering, intransitivity, reversible jump Markov chain Monte Carlo.
The seminal Bradley-Terry model (Bradley and Terry,, 1952) is commonly used to rank objects from paired comparison data. Given a set â„â„\mathcalIcaligraphic_I of nݑ›nitalic_n objects with each object i∈â„ݑ–â„i\in\mathcalIitalic_i ∈ caligraphic_I having skill ri∈â„subscriptݑŸݑ–â„r_i\in\mathbbRitalic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R, then the Bradley-Terry model gives, for i≠j∈â„ݑ–ݑ—â„i\neq j\in\mathcalIitalic_i ≠italic_j ∈ caligraphic_I,
pij(BT)=Pri≻j:=[1+exp(-(ri-rj))]-1,subscriptsuperscriptÝ‘ÂBTݑ–ݑ—Prsucceedsݑ–ݑ—assignsuperscriptdelimited-[]1subscriptݑŸݑ–subscriptݑŸݑ—1p^(\textBT)_ij=\Pr\i\succ j\:=[1+\exp(-(r_i-r_j))]^-1,italic_p start_POSTSUPERSCRIPT ( BT ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = roman_Pr italic_i ≻ italic_j := [ 1 + roman_exp ( - ( italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) ] start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ,
where a≻bsucceedsÝ‘ÂŽÝ‘Âa\succ bitalic_a ≻ italic_b denotes preference for an object aÝ‘ÂŽaitalic_a over bÝ‘Âbitalic_b, and r1=0subscriptݑŸ10r_1=0italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 0 to avoid identifiability issues. The ranking of the objects is then given by simply sorting ݒ“:=ri∈â„:i∈â„assignݒ“conditional-setsubscriptݑŸݑ–â„ݑ–â„\boldsymbolr:=\r_i\in\mathbbR:i\in\mathcalI\bold_italic_r := italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R : italic_i ∈ caligraphic_I . Now consider a game of Rock-Paper-Scissors, where a player rݑŸritalic_r always picks Rock, a player pÝ‘Âpitalic_p always picks Paper, and a player sÝ‘ sitalic_s always picks Scissors, (and all players have an equal number of games against the other two players). In this situation, the Bradley-Terry model would completely fail in correctly modelling outcomes between any pairwise match up of rݑŸritalic_r, pÝ‘Âpitalic_p, and sÝ‘ sitalic_s. This is because the Bradley-Terry model disregards one key element - intransitivity. Put simply, if data of paired comparisons exhibits transitivity, then a ranking could be formed, according to which, every better ranked object would on average beat every worse ranked object. Then, any departure from this is an example of intransitivity. In the Rock-Paper-Scissors example therefore, there does not exist a single ranking that could describe the true relationships, due to the inherent intransitivity. In fact, any ranking system which does not consider intransitive relations, including the Bradley-Terry model, would simply find that rݑŸritalic_r, pÝ‘Âpitalic_p and sÝ‘ sitalic_s are all evenly ranked, and would therefore predict a 50/50 chance of winning in any given comparison between any of the pairs. Rock-Paper-Scissors is surely amongst the most simple of games, played by children the world over, and yet almost all commonly used ranking systems would fail to correctly model this system.
Of course, on average a worse ranked object could beat a better ranked object purely due to chance, i.e., an “upsetâ€, and distinguishing an upset from a true intransitive relationship is often challenging. In fact, intransitivity is commonly assumed as being due to inference variation or as arising due to errors in the dataset (Skinner and Freeman,, 2009; Kéri,, 2011), such as underlying incompleteness of preferences, and it is therefore treated as a nuisance which should be removed. However, in the case of Rock-Paper-Scissors this is clearly not the case - the intransitivity is built into the inherent structure of the competition. In this article therefore, we develop a ranking system which directly models these effects in more general competitions, acknowledging that intransitivity may be a real characteristic of the dataset.
To model intransitivity it is of course necessary to define it. Non-stochastic transitivity, which considers only deterministic pairwise comparisons, states that, given three objects i,j,kݑ–ݑ—ݑ˜i,j,kitalic_i , italic_j , italic_k, then if i≻j,j≻k⇒i≻kformulae-sequencesucceedsݑ–ݑ—succeedsݑ—ݑ˜⇒ݑ–succeedsݑ˜i\succ j,\;j\succ k\Rightarrow i\succ kitalic_i ≻ italic_j , italic_j ≻ italic_k ⇒ italic_i ≻ italic_k. Intransitivity therefore is a departure from this assumption and results in cyclic trios, as well as arbitrarily large preference cycles of n>3ݑ›3n>3italic_n >3 objects.
This article deals with probabilistic pairwise comparisons. There are three main definitions of stochastic intransitivity (Tversky,, 1969; Latta,, 1979), given any subset of three objects i,j,kݑ–ݑ—ݑ˜i,j,kitalic_i , italic_j , italic_k: weak stochastic transitivity, whereby
Pri≻j>0.5,Prj≻k>0.5⇒Pri≻k>0.5,formulae-sequencePrsucceedsݑ–ݑ—0.5Prsucceedsݑ—ݑ˜0.5⇒Prsucceedsݑ–ݑ˜0.5\Pr\i\succ j\>0.5,\;\Pr\j\succ k\>0.5\Rightarrow\Pr\i\succ k\>0.5,roman_Pr italic_i ≻ italic_j >0.5 , roman_Pr italic_j ≻ italic_k >0.5 ⇒ roman_Pr italic_i ≻ italic_k >0.5 ,
i.e., we can deduce preferences only; strong stochastic transitivity, whereby
Pri≻j>0.5,Prj≻k>0.5⇒Pri≻k>maxPri≻j,Prj≻k,formulae-sequencePrsucceedsݑ–ݑ—0.5Prsucceedsݑ—ݑ˜0.5⇒Prsucceedsݑ–ݑ˜Prsucceedsݑ–ݑ—Prsucceedsݑ—ݑ˜\Pr\i\succ j\>0.5,\;\Pr\j\succ k\>0.5\Rightarrow\Pr\i\succ k\>\max\\Pr% \i\succ j\,\Pr\j\succ k\\,roman_Pr italic_i ≻ italic_j >0.5 , roman_Pr italic_j ≻ italic_k >0.5 ⇒ roman_Pr italic_i ≻ italic_k >roman_max roman_Pr italic_i ≻ italic_j , roman_Pr italic_j ≻ italic_k ,
i.e., we can deduce something about the probability; and linear transitivity, whereby
Pri≻j=F(ri-rj),F:â„→[0,1],dF(x)dx>0,∀x,:Prsucceedsݑ–ݑ—ݹsubscriptݑŸݑ–subscriptݑŸݑ—ݹformulae-sequence→â„01dݹݑ¥dÝ‘Â¥0for-allÝ‘Â¥\Pr\i\succ j\=F(r_i-r_j),\;F:\mathbbR\rightarrow[0,1],\;\frac\mathop{% }\!\mathrmdF(x)\mathop{}\!\mathrmdx>0,\;\forall x,roman_Pr italic_i ≻ italic_j = italic_F ( italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) , italic_F : blackboard_R → [ 0 , 1 ] , divide start_ARG roman_d italic_F ( italic_x ) end_ARG start_ARG roman_d italic_x end_ARG >0 , ∀ italic_x ,
where ra∈â„subscriptݑŸݑŽâ„r_a\in\mathbbRitalic_r start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ∈ blackboard_R is some ability of object aÝ‘ÂŽaitalic_a, ∀a∈i,j,kfor-allݑŽݑ–ݑ—ݑ˜\forall a\in\i,j,k\∀ italic_a ∈ italic_i , italic_j , italic_k e.g., if F(x)=[1+exp(-x)]-1ݹݑ¥superscriptdelimited-[]1Ý‘Â¥1F(x)=\left[1+\exp(-x)\right]^-1italic_F ( italic_x ) = [ 1 + roman_exp ( - italic_x ) ] start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT then this is the Bradley-Terry model, so logit(Pri≻j)=ri-rj,∀i≠j∈â„formulae-sequencelogitPrsucceedsݑ–ݑ—subscriptݑŸݑ–subscriptݑŸݑ—for-allݑ–ݑ—â„\textlogit\left(\Pr\i\succ j\\right)=r_i-r_j,\;\forall i\neq j\in% \mathcalIlogit ( roman_Pr italic_i ≻ italic_j ) = italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , ∀ italic_i ≠italic_j ∈ caligraphic_I. Linear intransitivity is the strictest definition of transitivity and implies that given any three objects i≠j≠k∈â„ݑ–ݑ—ݑ˜â„i\neq j\neq k\in\mathcalIitalic_i ≠italic_j ≠italic_k ∈ caligraphic_I, and two probabilities pij(BT)=Pri≻jsubscriptsuperscriptÝ‘ÂBTݑ–ݑ—Prsucceedsݑ–ݑ—p^(\textBT)_ij=\Pr\i\succ j\italic_p start_POSTSUPERSCRIPT ( BT ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = roman_Pr italic_i ≻ italic_j , pik(BT)=Pri≻ksubscriptsuperscriptÝ‘ÂBTݑ–ݑ˜Prsucceedsݑ–ݑ˜p^(\textBT)_ik=\Pr\i\succ k\italic_p start_POSTSUPERSCRIPT ( BT ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_k end_POSTSUBSCRIPT = roman_Pr italic_i ≻ italic_k , the probability pjk(BT)=Prj≻ksubscriptsuperscriptÝ‘ÂBTݑ—ݑ˜Prsucceedsݑ—ݑ˜p^(\textBT)_jk=\Pr\j\succ k\italic_p start_POSTSUPERSCRIPT ( BT ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j italic_k end_POSTSUBSCRIPT = roman_Pr italic_j ≻ italic_k is completely determined by the other two probabilities. This will be proven in Section 3.
An immediate problem is in choosing which definition to use. Moreover, with any of the three definitions above there is no established measure of the magnitude of the intransitivity: the conditions are either satisfied or violated, with no reference as to how much they are violated by, so measuring the amount of intransitivity in a dataset is not currently well-defined. Typically in the literature, for example Makhijani and Ugander, (2019), the amount of intransitivity in a dataset is measured by recording either the proportion of intransitive triplets, defined as the proportion of times that the empirical pairwise probabilities violate weak stochastic transitivity, or the proportion of intransitive objects, which is the proportion of objects that are involved in an intransitive triplet.
We define a more concrete definition of intransitivity and introduce a distance metric between the assumed probability of paired comparisons under a Bradley-Terry model, and the observed probability, such that for any given dataset the magnitude of the present intransitivity is unambiguous. A novel model is then developed, which is an extension to the Bradley-Terry model, and whose structure contains the flexibility to incorporate varying degrees of intransitivity, as defined by our measure, through interaction terms that are specific to each pair. In general with nݑ›nitalic_n objects there are n(n-1)/2ݑ›ݑ›12n(n-1)/2italic_n ( italic_n - 1 ) / 2 interactions, of which n(n-3)/2ݑ›ݑ›32n(n-3)/2italic_n ( italic_n - 3 ) / 2 are identifiable. Even in multiple round robin competitions this gives too many parameters to efficiently estimate, therefore we cluster the n(n-1)/2ݑ›ݑ›12n(n-1)/2italic_n ( italic_n - 1 ) / 2 values of intransitivity into KݾKitalic_K clusters.
Moreover, the nݑ›nitalic_n skills are separately clustered into A+1≤nÝ´1ݑ›A+1\leq nitalic_A + 1 ≤ italic_n distinct skill clusters. Similarly to Masarotto and Varin, (2012), which clustered skills in a Bradley-Terry model via a lasso-type procedure, our model forces objects with similar skills into the same skill cluster. This avoids over-interpretation of insignificant differences between objects’ skills and thus a misinformed ranking. In the Rock-Paper-Scissors competition above, with unbalanced numbers of matches per pair, the Bradley-Terry model would incorrectly impose some ranking on the objects. By appropriately choosing a single skill cluster, our model would correctly identify all objects as having equal ability.
Since both the intransitivity of the pairs of objects, as well as objects skills are being clustered, parsimony is achieved, with it being anticipated that typically A+K
The article is set out as follows. Section 2 displays examples of intransitivity in real life pairwise comparisons and introduces other approaches to model it. Section 3 then introduces our novel measure of intransitivity and our full Bayesian hierarchical modelling strategy and prior specifications. Section 4 covers the RJMCMC algorithm, and Sections 5 and 6 compare this model with the Bradley-Terry model and other competitor models, on simulated and American League baseball data respectively.
2 Literature on Intransitive Modelling
2.1 Justifications for the Presence of Intransitivity
Although much literature on intransitivity focuses on removing or “correcting†for the intransitivity (Skinner and Freeman,, 2009; Kéri,, 2011), it has been shown experimentally that intransitivity can be a real feature of a system which cannot be accounted for by errors or natural variation (Tversky,, 1969; Montgomery,, 1977; Lindman and Lyons,, 1978). In recommender systems, some view intransitivity as arising due to the aggregation process from different judges’ underlying preferences, but that each judge’s underlying preferences are still transitive (Rendle et al.,, 2009; Pan and Chen,, 2013). Other work recognizes that intransitivity may be systematic even in a one judge system, which is where our focus sits, and some model intransitivity as arising from both sources (Chen et al.,, 2017).
Systematic intransitivity even under a one judge system is observable not just in artificial constructions, such as dice games (De Schuymer et al.,, 2003) or quantum games in physics (Makowski and Piotrowski,, 2006), but also occurs naturally, for example, in competition between bacteria (Reichenbach et al.,, 2007) or mating choices of lizards (Sinervo and Lively,, 1996). Pahikkala et al., (2010) contains many more examples, and argues that violation of weak stochastic transitivity can occur in any situation where the best strategy in a given comparison depends on the strategy of the opponent. Given this, it would not be surprising to find cases of intransitivity in sports. In fact, by drawing parallels with social choice theory, Smead, (2019) provides a philosophical argument as to why intransitivity is not only unsurprising, but is particularly likely to occur in sports.
Many existing works support the violation of strong stochastic transitivity (Dixon and Coles,, 1997; Carroll and De Soete,, 1991); however, supporting the violation of weak stochastic transitivity, and therefore intransitive triplets (or cycles), is less common. Early examples include the work of Tsai and Böckenholt, (2006), which uses a random effects model to capture intransitivity, and that of Pahikkala et al., (2010), which tackles the problem from a machine learning standpoint using kernel methods to estimate the intransitive relations.
2.2 Bradley-Terry Based Models
Several works attempt to generalise the Bradley-Terry model to capture intransitivity, making these the most similar to our approach. An early attempt is the 2-d Bradley-Terry model of Causeur and Husson, (2005), which has 2n+n(n-1)/22ݑ›ݑ›ݑ›122n+n(n-1)/22 italic_n + italic_n ( italic_n - 1 ) / 2 free parameters. They include two skills (ri,1,ri,2)∈â„2subscriptݑŸݑ–1subscriptݑŸݑ–2superscriptâ„2\left(r_i,1,r_i,2\right)\in\mathbbR^2( italic_r start_POSTSUBSCRIPT italic_i , 1 end_POSTSUBSCRIPT , italic_r start_POSTSUBSCRIPT italic_i , 2 end_POSTSUBSCRIPT ) ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT for each object i∈â„ݑ–â„i\in\mathcalIitalic_i ∈ caligraphic_I, and also a binary matrix Σ∈â„n×nΣsuperscriptâ„ݑ›ݑ›\Sigma\in\mathbbR^n\times nroman_Σ ∈ blackboard_R start_POSTSUPERSCRIPT italic_n × italic_n end_POSTSUPERSCRIPT which takes the value Σij=+1subscriptΣݑ–ݑ—1\Sigma_ij=+1roman_Σ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = + 1 if iݑ–iitalic_i beats jݑ—jitalic_j in the majority of comparisons, and Σij=-1subscriptΣݑ–ݑ—1\Sigma_ij=-1roman_Σ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = - 1 otherwise, and where ΣijsubscriptΣݑ–ݑ—\Sigma_ijroman_Σ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT is the (i,j)-limit-fromݑ–ݑ—(i,j)-( italic_i , italic_j ) -th entry of ΣΣ\Sigmaroman_Σ for any i≠j∈â„ݑ–ݑ—â„i\neq j\in\mathcalIitalic_i ≠italic_j ∈ caligraphic_I. They use a logit function, similarly to the standard Bradley-Terry model, to map to probabilities based on Euclidean distance of skills, such that
logit(Pri≻j)=Σij((ri,1-rj,1)2+(ri,2-rj,2)2)12.logitPrsucceedsݑ–ݑ—subscriptΣݑ–ݑ—superscriptsuperscriptsubscriptݑŸݑ–1subscriptݑŸݑ—12superscriptsubscriptݑŸݑ–2subscriptݑŸݑ—2212\textlogit\left(\Pr\i\succ j\\right)=\Sigma_ij\left(\left(r_i,1-r_j,1% \right)^2+\left(r_i,2-r_j,2\right)^2\right)^\frac12.logit ( roman_Pr italic_i ≻ italic_j ) = roman_Σ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( ( italic_r start_POSTSUBSCRIPT italic_i , 1 end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j , 1 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( italic_r start_POSTSUBSCRIPT italic_i , 2 end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j , 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG 2 end_ARG end_POSTSUPERSCRIPT .
This allows for intransitive relationships, but the discrete nature of the entries of ΣΣ\Sigmaroman_Σ mean that the pairwise probabilities are sensitive to small changes in the data. For example, it may only require an object iݑ–iitalic_i to win one extra match against jݑ—jitalic_j, and suddenly the probabilities that iݑ–iitalic_i beats jݑ—jitalic_j and jݑ—jitalic_j beats iݑ–iitalic_i are completely reversed.
The blade-chest model of Chen and Joachims, (2016) extends the Bradley-Terry model into dݑ‘ditalic_d dimensions by incorporating so-called blade and chest vectors ݒƒi,ݒ„i∈â„d:i∈â„:subscriptݒƒݑ–subscriptݒ„ݑ–superscriptâ„ݑ‘ݑ–â„\boldsymbolb_i,\boldsymbolc_i\in\mathbbR^d:\;i\in\mathcalIbold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT : italic_i ∈ caligraphic_I respectively for each player iݑ–iitalic_i. The blade-chest model has two variants: the blade-chest-distance (BCD) model, expressed as
logit(pij(BCD)):=||ݒƒj-ݒ„i||22-||ݒƒi-ݒ„j||22+ri-rj,assignlogitsubscriptsuperscriptÝ‘Âݵݶݷݑ–ݑ—superscriptsubscriptnormsubscriptݒƒݑ—subscriptݒ„ݑ–22superscriptsubscriptnormsubscriptݒƒݑ–subscriptݒ„ݑ—22subscriptݑŸݑ–subscriptݑŸݑ—\textlogit\left(p^(BCD)_ij\right):=||\boldsymbolb_j-\boldsymbolc_% i||_2^2-||\boldsymbolb_i-\boldsymbolc_j||_2^2+r_i-r_j,logit ( italic_p start_POSTSUPERSCRIPT ( italic_B italic_C italic_D ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) := | | bold_italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - | | bold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - bold_italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ,
and the blade-chest-inner (BCI) model is defined as
logit(pij(BCI)):=ݒƒiT⋅ݒ„j-ݒƒjT⋅ݒ„i+ri-rj.assignlogitsubscriptsuperscriptÝ‘Âݵݶݼݑ–ݑ—⋅superscriptsubscriptݒƒݑ–ݑ‡subscriptݒ„ݑ—⋅superscriptsubscriptݒƒݑ—ݑ‡subscriptݒ„ݑ–subscriptݑŸݑ–subscriptݑŸݑ—\textlogit\left(p^(BCI)_ij\right):=\boldsymbolb_i^T\cdot% \boldsymbolc_j-\boldsymbolb_j^T\cdot\boldsymbolc_i+r_i-r_j.logit ( italic_p start_POSTSUPERSCRIPT ( italic_B italic_C italic_I ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) := bold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT â‹… bold_italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - bold_italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT â‹… bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT + italic_r start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT .
The intuition behind the model is that the blade and chest parameters of all the objects can be viewed as features on a dݑ‘ditalic_d-dimensional latent space. Then, if an object iݑ–iitalic_i’s blade is close to object jݑ—jitalic_j’s chest, and simultaneously object iݑ–iitalic_i’s chest is far from object jݑ—jitalic_j’s blade, then object iݑ–iitalic_i has an additional advantage over object jݑ—jitalic_j. If d≥2ݑ‘2d\geq 2italic_d ≥ 2 then it is simple to construct a latent space which represents a Rock-Paper-Scissors scenario, by placing the blade of Rock near the chest of Scissors, the blade of Scissors near the chest of Paper, and the blade of Paper near the chest of Rock. It follows that by increasing dݑ‘ditalic_d, ever more complex relationships can be captured between the pairs of objects. Given nݑ›nitalic_n objects and ݒƒi,ݒ„i∈â„dsubscriptݒƒݑ–subscriptݒ„ݑ–superscriptâ„ݑ‘\boldsymbolb_i,\boldsymbolc_i\in\mathbbR^dbold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT for each object iݑ–iitalic_i, the model requires 2d(n-1)2ݑ‘ݑ›12d(n-1)2 italic_d ( italic_n - 1 ) independent parameters. The ݒ“ݒ“\boldsymbolrbold_italic_r parameters can be absorbed into the blade and chest parameters; however, the above parametrisation makes it clear that the Bradley-Terry model is a special case of the blade-chest model, when ݒƒi=ݒƒj=ݒ„i=ݒ„j,∀i,j∈â„formulae-sequencesubscriptݒƒݑ–subscriptݒƒݑ—subscriptݒ„ݑ–subscriptݒ„ݑ—for-allݑ–ݑ—â„\boldsymbolb_i=\boldsymbolb_j=\boldsymbolc_i=\boldsymbolc_j,\;% \forall i,j\in\mathcalIbold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = bold_italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = bold_italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , ∀ italic_i , italic_j ∈ caligraphic_I.
Duan et al., (2017) introduce a generalised model for intransitivity, with
logit(pij(G))=ÝÂiTΣÝÂj+ÝÂiTΓÝÂi-ÝÂjTΓÝÂj,logitsuperscriptsubscriptÝ‘Âݑ–ݑ—ݺsuperscriptsubscriptÝÂݑ–ݑ‡ΣsubscriptÝÂݑ—superscriptsubscriptÝÂݑ–ݑ‡ΓsubscriptÝÂݑ–superscriptsubscriptÝÂݑ—ݑ‡ΓsubscriptÝÂݑ—\textlogit\left(p_ij^(G)\right)=\boldsymbol\mu_i^T\Sigma% \boldsymbol\mu_j+\boldsymbol\mu_i^T\Gamma\boldsymbol\mu_i-% \boldsymbol\mu_j^T\Gamma\boldsymbol\mu_j,logit ( italic_p start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_G ) end_POSTSUPERSCRIPT ) = bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_Σ bold_italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT + bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_Γ bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - bold_italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_Γ bold_italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ,
where ÝÂi∈â„dsubscriptÝÂݑ–superscriptâ„ݑ‘\boldsymbol\mu_i\in\mathbbR^dbold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT is a dݑ‘ditalic_d dimensional strength vector for object iݑ–iitalic_i, ÝÂ:=ÝÂi:i∈â„assignÝÂconditional-setsubscriptÝÂݑ–ݑ–â„\boldsymbol\mu:=\\boldsymbol\mu_i:i\in\mathcalI\bold_italic_μ := bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT : italic_i ∈ caligraphic_I , and Σ,Γ∈â„d×dΣΓsuperscriptâ„ݑ‘ݑ‘\Sigma,\Gamma\in\mathbbR^d\times droman_Σ , roman_Γ ∈ blackboard_R start_POSTSUPERSCRIPT italic_d × italic_d end_POSTSUPERSCRIPT are so-called transitive matrices. The first matrix ΣΣ\Sigmaroman_Σ represents the interactions between objects, and ΓΓ\Gammaroman_Γ controls how an individual object’s strength components form the object’s overall strength. The number of parameters is d(2d+n-1)ݑ‘2ݑ‘ݑ›1d(2d+n-1)italic_d ( 2 italic_d + italic_n - 1 ), since there are two d×dݑ‘ݑ‘d\times ditalic_d × italic_d matrices (Σ,Γ)ΣΓ(\Sigma,\Gamma)( roman_Σ , roman_Γ ) and nݑ›nitalic_n dݑ‘ditalic_d-dimensional vectors ÝÂÝÂ\boldsymbol\mubold_italic_μ. They show that their model is a generalisation of the blade-chest and Bradley-Terry models. Specifically the BCI model arises when ÝÂi=(ݒƒi,ݒ„i),subscriptÝÂݑ–subscriptݒƒݑ–subscriptݒ„ݑ–\boldsymbol\mu_i=(\boldsymbolb_i,\boldsymbolc_i),bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ( bold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) , ÝÂj=(ݒƒj,ݒ„j)subscriptÝÂݑ—subscriptݒƒݑ—subscriptݒ„ݑ—\boldsymbol\mu_j=(\boldsymbolb_j,\boldsymbolc_j)bold_italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = ( bold_italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , bold_italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ), and ||ÝÂݒŠ||22=||ÝÂݒ‹||22superscriptsubscriptnormsubscriptÝÂݒŠ22superscriptsubscriptnormsubscriptÝÂݒ‹22||\boldsymbol\mu_i||_2^2=||\boldsymbol\mu_j||_2^2| | bold_italic_μ start_POSTSUBSCRIPT bold_italic_i end_POSTSUBSCRIPT | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = | | bold_italic_μ start_POSTSUBSCRIPT bold_italic_j end_POSTSUBSCRIPT | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, that is, the objects all have equal skill, and
Σ=[0Id/2-Id/20],Σdelimited-[]matrix0subscriptݼݑ‘2subscriptݼݑ‘20\Sigma=\left[\beginmatrix0&I_d/2\\ -I_d/2&0\endmatrix\right],roman_Σ = [ start_ARG start_ROW start_CELL 0 end_CELL start_CELL italic_I start_POSTSUBSCRIPT italic_d / 2 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL - italic_I start_POSTSUBSCRIPT italic_d / 2 end_POSTSUBSCRIPT end_CELL start_CELL 0 end_CELL end_ROW end_ARG ] ,
where ImsubscriptݼݑšI_mitalic_I start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT is the m×mݑšݑšm\times mitalic_m × italic_m identity matrix, then ÝÂiTΣÝÂj=ݒƒiT⋅ݒ„j-ݒƒjT⋅ݒ„i.superscriptsubscriptÝÂݑ–ݑ‡ΣsubscriptÝÂݑ—⋅superscriptsubscriptݒƒݑ–ݑ‡subscriptݒ„ݑ—⋅superscriptsubscriptݒƒݑ—ݑ‡subscriptݒ„ݑ–\boldsymbol\mu_i^T\Sigma\boldsymbol\mu_j=\boldsymbolb_i^T\cdot% \boldsymbolc_j-\boldsymbolb_j^T\cdot\boldsymbolc_i.bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_Σ bold_italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = bold_italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT â‹… bold_italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - bold_italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT â‹… bold_italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT .
This model of Duan et al., (2017) contains fewer parameters than the blade-chest model when d
Makhijani and Ugander, (2019) introduced a majority vote model. Here, an object iݑ–iitalic_i has a vector of dݑ‘ditalic_d skill attributes, such that ÝÂi=(μi,1,…,μi,d)subscriptÝÂݑ–subscriptݜ‡ݑ–1…subscriptݜ‡ݑ–ݑ‘\boldsymbol\mu_i=\left(\mu_i,1,\dots,\mu_i,d\right)bold_italic_μ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ( italic_μ start_POSTSUBSCRIPT italic_i , 1 end_POSTSUBSCRIPT , … , italic_μ start_POSTSUBSCRIPT italic_i , italic_d end_POSTSUBSCRIPT ), where dݑ‘ditalic_d is odd. Then, given a suitable choice of mapping function fݑ“fitalic_f, for example, logistic or Gaussian, define qijl=f(μi,l-μj,l),subscriptsuperscriptݑžݑ™ݑ–ݑ—ݑ“subscriptݜ‡ݑ–ݑ™subscriptݜ‡ݑ—ݑ™q^l_ij=f(\mu_i,l-\mu_j,l),italic_q start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_f ( italic_μ start_POSTSUBSCRIPT italic_i , italic_l end_POSTSUBSCRIPT - italic_μ start_POSTSUBSCRIPT italic_j , italic_l end_POSTSUBSCRIPT ) , ∀l∈1,…,dfor-allݑ™1…ݑ‘\;\forall l\in\1,\dots,d\∀ italic_l ∈ 1 , … , italic_d to be the probability of iݑ–iitalic_i beating jݑ—jitalic_j based only on their lݑ™litalic_lth attribute. Then, majority vote model says that the probability of iݑ–iitalic_i being preferred to jݑ—jitalic_j overall, is the probability that it wins across the majority of attributes. For example, with d=3ݑ‘3d=3italic_d = 3, then
Pri≻j=qij1qij2qij3+(1-qij1)qij2qij3+qij1(1-qij2)qij3+qij1qij2(1-qij3).Prsucceedsݑ–ݑ—subscriptsuperscriptݑž1ݑ–ݑ—subscriptsuperscriptݑž2ݑ–ݑ—subscriptsuperscriptݑž3ݑ–ݑ—1subscriptsuperscriptݑž1ݑ–ݑ—subscriptsuperscriptݑž2ݑ–ݑ—subscriptsuperscriptݑž3ݑ–ݑ—subscriptsuperscriptݑž1ݑ–ݑ—1subscriptsuperscriptݑž2ݑ–ݑ—subscriptsuperscriptݑž3ݑ–ݑ—subscriptsuperscriptݑž1ݑ–ݑ—subscriptsuperscriptݑž2ݑ–ݑ—1subscriptsuperscriptݑž3ݑ–ݑ—\Pr\i\succ j\=q^1_ijq^2_ijq^3_ij+(1-q^1_ij)q^2_ijq^3_% ij+q^1_ij(1-q^2_ij)q^3_ij+q^1_ijq^2_ij(1-q^3_ij).roman_Pr italic_i ≻ italic_j = italic_q start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_q start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_q start_POSTSUPERSCRIPT 3 end_POSTSUP
|