ANNALES SOCIETATIS MATHEMATICAE POLONAE, Series I: COMMENTATIONES MATHEMATICAE XIX (1976)
ROCZNIKI POLSKIEGO TOWARZYSTWA MATEMATYCZNEGO, Seria I: PRACE MATEMATYCZNE XIX (1976)
Robert Bartoszyński (Warszawa)
Some remarks on the secretary problem
In this note we consider a modification of the so-called secretary problem: we analyse policies aimed at finding either the best or the second best candidate. Two techniques of conditioning are employed in computing the relevant probabilities; comparing the results leads to some combinatorial identities generalizing those given in [1].
The secretary problem. The secretary problem may be formulated as follows. In response to a newspaper announcement, $n$ candidates appear for a vacant secretary position. If we could interview them all, we could rank them from the best (rank 1) to the worst (rank $n$), with no ties. However, we can interview them only one at a time, in a random order, and after seeing a candidate we can rank her only with respect to those already interviewed, not with respect to the others. After each interview we must make one of two decisions: employ the last candidate, whereupon the process ends, or let her go and interview the next one; in the latter case the decision cannot be reversed at any later stage, i.e. we cannot employ a candidate who has already been interviewed and passed.
In the original formulation (see for instance [2], [3] or [4]), the problem was to devise a policy maximizing the probability of getting the best candidate. The well-known answer is: "pass $r - 1$ candidates without stopping at any of them, and then stop at the first candidate (if any) who is better than all those already passed", with $r$ approximately equal to $n/e$ for large $n$.
As already stated, we shall treat the problem of finding an optimal policy when the object is to employ either the best or the second best candidate.
The optimal policy is then contained (see for instance [3], [4]) in the class of policies defined by pairs of integers $r, s$ ($r < s$): let the first $r - 1$ candidates go without stopping, and then stop at the first candidate with relative rank 1 (better than all the preceding ones), with the additional provision that, beginning with trial number $s$, one also stops at a candidate with relative rank 2 (i.e. second best among all those seen so far).
Call such a policy $(r, s)$, and let $p(r, s; n)$ be the probability of employing the candidate with true rank 1 or 2 under the policy $(r, s)$.
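The policy $(r, s)$ can be checked on small cases by exhaustive enumeration. The following Python sketch is illustrative only: the function names `stop_value` and `p_bruteforce` are hypothetical, and listing all $n!$ interview orders is a brute-force check rather than the method of this note, so it is feasible only for small $n$.

```python
from fractions import Fraction
from itertools import permutations


def stop_value(perm, r, s):
    """Return the true rank chosen by policy (r, s), or None if it never stops.

    perm[i - 1] is the true rank of the candidate interviewed at place i.
    """
    for i in range(r, len(perm) + 1):          # places 1..r-1 are always passed
        current = perm[i - 1]
        # relative rank of the current candidate among those seen so far
        rel_rank = sum(1 for x in perm[:i] if x <= current)
        if rel_rank == 1 or (i >= s and rel_rank == 2):
            return current
    return None


def p_bruteforce(r, s, n):
    """Exact p(r, s; n), obtained by enumerating all n! equally likely orders."""
    wins = total = 0
    for perm in permutations(range(1, n + 1)):
        total += 1
        if stop_value(perm, r, s) in (1, 2):
            wins += 1
    return Fraction(wins, total)


print(p_bruteforce(2, 3, 3))   # 5/6
print(p_bruteforce(3, 4, 4))   # 2/3
```

For $n = 3$ and the policy $(2, 3)$, the only losing order is 1, 2, 3, where the process never stops.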
Denote by $x_1, x_2, \ldots, x_n$ the random permutation of the ranks $1, 2, \ldots, n$. The rule $(r, s)$ tells us to pass $x_1, \ldots, x_{r-1}$ and then stop at the first index $i$ (if any) such that
(a) $i \ge r$ and $x_i$ equals the minimum among $x_1, \ldots, x_i$, or
(b) $i \ge s$ and $x_i$ equals the second in rank among $x_1, \ldots, x_i$.
The simpler solution. We shall now compute the probability $p(r, s; n)$ of winning (i.e. employing one of the two best candidates). Write $A = \{1, 2, \ldots, r-1\}$, $B = \{r, r+1, \ldots, s-1\}$ and $C = \{s, s+1, \ldots, n\}$, and let $i_k$ be the (random) place occupied by the number $k$ in the permutation $x_1, \ldots, x_n$ (i.e. we have $x_{i_k} = k$).
Now, if $i_1, i_2 \in A$ we lose; hence a necessary condition for winning is that at least one of $i_1, i_2$ is not in $A$. We shall consider several cases, depending on the positions of $i_1$ and $i_2$.
(a) $i_1 \in B$, $i_1 < i_2$. The probability that $i_1 = k$ is $1/n$, and given $i_1 = k$, the probability that $i_1 < i_2$ equals $(n-k)/(n-1)$. We win if among $x_1, \ldots, x_{k-1}$ the minimum occurs at one of the places numbered $1, 2, \ldots, r-1$, the probability of this event (given that $i_2 > i_1 = k \ge r$) being $(r-1)/(k-1)$. Summing over possible $k$ we get
(1) $P_1 = \frac{r-1}{n(n-1)} \sum_{k=r}^{s-1} \frac{n-k}{k-1}.$
(b) $i_1 \in C$, $i_1 < i_2$. As before, the probability that $i_1 = k$ and $i_1 < i_2$ is $\frac{1}{n} \cdot \frac{n-k}{n-1}$. Given $i_2 > i_1 = k \in C$, we win if the sequence $x_1, \ldots, x_{k-1}$ of elements preceding the element 1 satisfies two conditions: its minimum appears at one of the places numbered $1, \ldots, r-1$, and its second-ranked element appears at one of the places numbered $1, \ldots, s-1$. The probability of this event is $\frac{r-1}{k-1} \cdot \frac{s-2}{k-2}$, and summing over all possible $k$ ($= s, \ldots, n-1$) we obtain
(2) $P_2 = \frac{(r-1)(s-2)}{n(n-1)} \sum_{k=s}^{n-1} \frac{n-k}{(k-1)(k-2)}.$
The same reasoning applies to the case $i_2 \in B \cup C$, $i_2 < i_1$; hence $2(P_1 + P_2)$ is the probability of winning when neither of the elements 1, 2 appears among the first $r - 1$ terms of the sequence $x_1, \ldots, x_n$.
(c) $i_1 \in A$. In order to win it is now necessary that $i_2 \in C$. The probability of the joint event $i_1 \in A$, $i_2 = k \in C$ equals $\frac{r-1}{n(n-1)}$, and we win if the second best among the terms $x_1, \ldots, x_{k-1}$ (the minimum being 1) appears at one of the places numbered $1, \ldots, s-1$. The probability of this event is $(s-2)/(k-2)$, and summing over all possible $k$ we get
(3) $P_3 = \frac{(r-1)(s-2)}{n(n-1)} \sum_{k=s}^{n} \frac{1}{k-2}.$
(d) $i_2 \in A$. In this case we have either $i_1 \in B$ or $i_1 \in C$. If $i_1 \in B$, we win; the probability of this event is
(4) $P_4 = \frac{(r-1)(s-r)}{n(n-1)}.$
In the second case, we win if $i_1 = k \in C$ and the second best among $x_1, \ldots, x_{k-1}$ (the minimum being equal to 2) appears at one of the places $1, \ldots, s-1$. This yields the probability $P_3$.
Thus we have shown that the probability $p(r, s; n)$ of winning under the policy $(r, s)$ equals
$p(r, s; n) = 2(P_1 + P_2 + P_3) + P_4.$
This is essentially the formula given (without proof) in [4].
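The formula $2(P_1 + P_2 + P_3) + P_4$ can be evaluated exactly with rational arithmetic. The sketch below transcribes (1)-(4) directly; the names `p_formula` and `best_policy` are hypothetical, and the restriction to $2 \le r < s \le n$ is an assumption of this sketch (for $r = 1$ the denominators $k - 1$ in (1) vanish).

```python
from fractions import Fraction


def p_formula(r, s, n):
    """p(r, s; n) = 2(P1 + P2 + P3) + P4, assuming 2 <= r < s <= n."""
    c = Fraction(1, n * (n - 1))
    # (1): element 1 sits in B and precedes element 2
    P1 = (r - 1) * c * sum(Fraction(n - k, k - 1) for k in range(r, s))
    # (2): element 1 sits in C and precedes element 2
    P2 = (r - 1) * (s - 2) * c * sum(
        Fraction(n - k, (k - 1) * (k - 2)) for k in range(s, n))
    # (3): one of the elements 1, 2 sits in A, the other at place k in C
    P3 = (r - 1) * (s - 2) * c * sum(Fraction(1, k - 2) for k in range(s, n + 1))
    # (4): element 2 sits in A, element 1 in B
    P4 = (r - 1) * (s - r) * c
    return 2 * (P1 + P2 + P3) + P4


def best_policy(n):
    """Largest p(r, s; n) over 2 <= r < s <= n, returned as (p, r, s)."""
    return max((p_formula(r, s, n), r, s)
               for r in range(2, n) for s in range(r + 1, n + 1))


print(p_formula(2, 3, 3))   # 5/6
print(p_formula(3, 4, 5))   # 7/10
```

The search in `best_policy` covers only the pairs allowed by the assumption above; it says nothing about policies outside the class $(r, s)$.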
The more complicated solution. We shall now compute the probability $p(r, s; n)$ using a different method. It will be convenient to introduce the following notation: for $D \subset \{1, \ldots, n\}$,
(5) $I(D) = \{i_k : k \in D\}$
will be the (random) set of places at which the elements of $D$ appear in the permutation $x_1, \ldots, x_n$, and
(6) $X(D) = \{x_k : k \in D\}$
will be the (random) set of values appearing at places from $D$.
In the sequel we shall assume that $r \ge 3$. The set $A = \{1, \ldots, r-1\}$ (of places at which the elements are passed without stopping) then consists of more than one element. We shall condition on the two least elements in $X(A)$, i.e. on the two least elements among those which appear at places from $A$.
Clearly, if 1 and 2 are in $X(A)$, we cannot win. We shall partition the remaining cases as follows:
(a) the two least elements in $X(A)$ are $k$ and $m$ with $3 \le k < m$;
(b) the two least elements in $X(A)$ are 1 and $k \ge 3$;
(c) the two least elements in $X(A)$ are 2 and $k \ge 3$.
Case (a). The probability that the two least elements in $X(A)$ are $k$ and $m$ equals $\binom{n-m}{r-3} / \binom{n}{r-1}$, where $k$ may be equal to $3, 4, \ldots, n-r+2$, while $m$ may be $k+1, \ldots, n-r+3$. For fixed $k$ and $m$, denote $U = \{1, \ldots, k-1\}$ and $V = \{k+1, \ldots, m-1\}$ and consider further conditioning with respect to the positions of the sets $I(U)$ and $I(V)$, where $I(U)$ and $I(V)$ are defined by (5). We have here $I(U) \cup I(V) \subset B \cup C$, and the main point is that if $k$ and $m$ are the least elements of $X(A)$, then the only elements which may have relative rank 1 in $B \cup C$ are those in $U$, while the only elements which may have relative rank 2 in $B \cup C$ are those in $U \cup V$. The outcome then depends on the position of the sets $I(U)$ and $I(V)$ within $B \cup C$: the process stops at the first occurrence of an element of $U$ in $B$, or, if all elements of $U$ are in $C$, at the first occurrence of an element of $U \cup V'$, where $V'$ consists of all those elements of $V$ which are less than the minimal element of $V$ appearing in $B$.
More precisely, the systematic enumeration of all cases (within case (a)) is as follows. If $I(U) \cap B \ne \emptyset$, the position of the elements of $V$ plays no role: the process stops at the first appearance of an element of $U$ (since it is the first element in $B$ with relative rank 1). The probability of winning is then equal to $2/(k-1)$.
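Now, the probability that $I(U) \cap B \ne \emptyset$ equals 1 if $k - 1 > n - s + 1$ (the size of the set $C$), hence for $k \ge n - s + 3$. Otherwise, this probability equals
$1 - \binom{n-s+1}{k-1} \Big/ \binom{n-r+1}{k-1}.$
This yields the term
(7) $Q_1 = \sum_{k=3}^{n-r+2} \sum_{m=k+1}^{n-r+3} \frac{\binom{n-m}{r-3}}{\binom{n}{r-1}} \left[ 1 - \frac{\binom{n-s+1}{k-1}}{\binom{n-r+1}{k-1}} \right] \frac{2}{k-1},$
where the subtracted ratio vanishes for $k \ge n - s + 3$.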
Let us now consider the case $I(U) \subset C$ (still within the general case (a)). The positions of the elements of $V$ now play an essential role. If $I(V) \subset C$, there are no elements with relative rank 1 in the set $B$, and the process stops at the first occurrence of any of the elements $1, 2, \ldots, k-1, k+1, \ldots, m-1$; hence the probability of winning is $2/(m-2)$. Now, the probability that $I(U) \cup I(V) \subset C$ is $\binom{n-s+1}{m-2} \big/ \binom{n-r+1}{m-2}$, which yields the term
(8) $Q_2 = \sum_{k=3}^{n-r+2} \sum_{m=k+1}^{n-r+3} \frac{\binom{n-m}{r-3}}{\binom{n}{r-1}} \cdot \frac{\binom{n-s+1}{m-2}}{\binom{n-r+1}{m-2}} \cdot \frac{2}{m-2}.$
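There remains the most cumbersome case $I(V) \cap B \ne \emptyset$. This necessitates $m > k + 1$. The process is then stopped at the first of the elements $1, 2, \ldots, k-1, k+1, \ldots, t-1$, where $t$ is the least element of $V$ which falls into $B$.
Now, the probability that $I(U) \subset C$ is $\binom{n-s+1}{k-1} \big/ \binom{n-r+1}{k-1}$; this leaves $n - s + 1 - (k - 1) = n - s - k + 2$ places in the set $C$, and all $s - r$ places in the set $B$, for distributing the set $I(V)$. Altogether this set has $m - k - 1$ elements, and the probability that $j$ among them fall into the set $B$ and $m - k - 1 - j$ fall into the set $C$ equals
$\binom{s-r}{j} \binom{n-s-k+2}{m-k-1-j} \Big/ \binom{n-r-k+2}{m-k-1}.$
Given that $j$ elements of $I(V)$ fall into $B$, the probability that the least of the corresponding elements of $V$ equals $t$ is $\binom{m-t-1}{j-1} \big/ \binom{m-k-1}{j}$ (here $t$ may be equal to $k+1, \ldots, m-j$). In such a case, the probability of winning is $2/(t-2)$.
Combining it all together we get
(9) $Q_3 = \sum_{k=3}^{n-s+2} \sum_{m=k+2}^{n-r+3} \frac{\binom{n-m}{r-3}}{\binom{n}{r-1}} \cdot \frac{\binom{n-s+1}{k-1}}{\binom{n-r+1}{k-1}} \sum_{j=1}^{m-k-1} \frac{\binom{s-r}{j} \binom{n-s-k+2}{m-k-1-j}}{\binom{n-r-k+2}{m-k-1}} \sum_{t=k+1}^{m-j} \frac{\binom{m-t-1}{j-1}}{\binom{m-k-1}{j}} \cdot \frac{2}{t-2}.$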
Case (b). The least elements in $X(A)$ are 1 and $k \ge 3$; the probability of this event equals $\binom{n-k}{r-3} \big/ \binom{n}{r-1}$, where $k$ may be equal to $3, 4, \ldots, n-r+3$. Denote $W = \{3, 4, \ldots, k-1\}$. Clearly, no term in the set $B \cup C$ will have relative rank 1, hence the process may be stopped only at some place in the set $C$. In order to win, we must have $i_2 \in C$. Reasoning as before, we distinguish two cases: $I(W) \subset C$ and $I(W) \cap B \ne \emptyset$. In the first case, the probability of winning equals $1/(k-2)$, as the process stops at the first occurrence of one of the numbers $2, 3, \ldots, k-1$. The probability that $i_2 \in C$, $I(W) \subset C$ equals $\binom{n-s+1}{k-2} \big/ \binom{n-r+1}{k-2}$, which gives the term
(10) $Q_4 = \sum_{k=3}^{n-r+3} \frac{\binom{n-k}{r-3}}{\binom{n}{r-1}} \cdot \frac{\binom{n-s+1}{k-2}}{\binom{n-r+1}{k-2}} \cdot \frac{1}{k-2}.$
Suppose now that $i_2 \in C$ and $I(W) \cap B \ne \emptyset$ (this necessitates $k \ge 4$). If the least element of $W$ which is in $B$ equals $t$, the process stops at the first occurrence of one of the elements $2, 3, \ldots, t-1$, hence the probability of winning equals $1/(t-2)$.
Proceeding as in point (a), the probability that $i_2 \in C$, $j$ among the elements of $W$ are in $B$, and $k - 3 - j$ are in $C$ equals
$\frac{n-s+1}{n-r+1} \cdot \binom{s-r}{j} \binom{n-s}{k-3-j} \Big/ \binom{n-r}{k-3}.$
Given that $j$ elements of $I(W)$ appear in $B$, the probability that the least of the corresponding elements of $W$ is $t$ equals $\binom{k-t-1}{j-1} \big/ \binom{k-3}{j}$.
Combining it together we obtain
(11) $Q_5 = \sum_{k=4}^{n-r+3} \frac{\binom{n-k}{r-3}}{\binom{n}{r-1}} \cdot \frac{n-s+1}{n-r+1} \sum_{j=1}^{k-3} \frac{\binom{s-r}{j} \binom{n-s}{k-3-j}}{\binom{n-r}{k-3}} \sum_{t=3}^{k-j} \frac{\binom{k-t-1}{j-1}}{\binom{k-3}{j}} \cdot \frac{1}{t-2}.$
Case (c). The two least elements in $X(A)$ are 2 and $k \ge 3$; as before, the probability of this event equals $\binom{n-k}{r-3} \big/ \binom{n}{r-1}$. If $i_1 \in B$, we win. This gives the term
$Q_6 = \sum_{k=3}^{n-r+3} \frac{\binom{n-k}{r-3}}{\binom{n}{r-1}} \cdot \frac{s-r}{n-r+1}.$
If $i_1 \in C$, the situation depends on the position of the set $W = \{3, 4, \ldots, k-1\}$, and is analogous to that in case (b), which gives the probability of winning $Q_4 + Q_5$.
We have thus proved that for $r \ge 3$:
$p(r, s; n) = Q_1 + Q_2 + Q_3 + 2Q_4 + 2Q_5 + Q_6.$
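The decomposition into $Q_1, \ldots, Q_6$ can be evaluated numerically as a check. In the sketch below each term is written as the product of the conditional probabilities described in cases (a)-(c); the explicit summation limits, the skipping of terms whose leading factor vanishes, and the function name `q_terms` are assumptions of this sketch, not a transcription of the printed formulas.

```python
from fractions import Fraction as F
from math import comb


def q_terms(r, s, n):
    """Return (Q1, ..., Q6) as exact fractions, assuming 3 <= r < s <= n."""
    N = comb(n, r - 1)                                   # possible value sets X(A)
    bc, c_size, b_size = n - r + 1, n - s + 1, s - r     # |B u C|, |C|, |B|
    Q1 = Q2 = Q3 = Q4 = Q5 = Q6 = F(0)

    # Case (a): the two least elements of X(A) are k < m, with k >= 3.
    for k in range(3, n - r + 3):
        u_in_c = F(comb(c_size, k - 1), comb(bc, k - 1))     # P(I(U) inside C)
        for m in range(k + 1, n - r + 4):
            w = F(comb(n - m, r - 3), N)
            Q1 += w * (1 - u_in_c) * F(2, k - 1)
            Q2 += w * F(comb(c_size, m - 2), comb(bc, m - 2)) * F(2, m - 2)
            if u_in_c == 0:
                continue                                     # U cannot fit inside C
            for j in range(1, m - k):                        # j elements of V in B
                split = F(comb(b_size, j) * comb(c_size - k + 1, m - k - 1 - j),
                          comb(bc - k + 1, m - k - 1))
                for t in range(k + 1, m - j + 1):            # t = least element of V in B
                    least = F(comb(m - t - 1, j - 1), comb(m - k - 1, j))
                    Q3 += w * u_in_c * split * least * F(2, t - 2)

    # Cases (b), (c): the two least elements of X(A) are 1 (or 2) and k >= 3.
    for k in range(3, n - r + 4):
        w = F(comb(n - k, r - 3), N)
        Q4 += w * F(comb(c_size, k - 2), comb(bc, k - 2)) * F(1, k - 2)
        Q6 += w * F(b_size, bc)
        for j in range(1, k - 2):                            # j elements of W in B
            split = F(comb(b_size, j) * comb(n - s, k - 3 - j), comb(n - r, k - 3))
            for t in range(3, k - j + 1):                    # t = least element of W in B
                least = F(comb(k - t - 1, j - 1), comb(k - 3, j))
                Q5 += w * F(c_size, bc) * split * least * F(1, t - 2)
    return Q1, Q2, Q3, Q4, Q5, Q6


Q = q_terms(3, 4, 5)
print(Q[0] + Q[1] + Q[2] + 2 * Q[3] + 2 * Q[4] + Q[5])   # 7/10
```

For $n = 5$, $r = 3$, $s = 4$ the total agrees with the value $7/10$ obtained from $2(P_1 + P_2 + P_3) + P_4$.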
The results of comparison. We may now compare the two results. Clearly, the sum $Q_1 + Q_2 + Q_3$, with $Q_1$, $Q_2$ and $Q_3$ given by (7), (8) and (9), corresponds to the probability of winning if neither of the numbers 1, 2 appears in the set $A$; hence it must be equal to $2(P_1 + P_2)$, with $P_1$ and $P_2$ given by (1) and (2). After some transformations this yields the identity:
For $3 \le r < s \le n$:
[Displayed identity equating the explicit sums for $Q_1 + Q_2 + Q_3$ with those for $2(P_1 + P_2)$; the typeset formula is not recoverable from the source text.]
Similarly, $Q_4 + Q_5$, with $Q_4$ and $Q_5$ given by (10) and (11), represents the probability of winning if 1 appears in $A$; hence it must be equal to $P_3$ given by (3). This yields the identity $Q_4 + Q_5 = P_3$, written out by means of (3), (10) and (11).
Finally, comparison of $Q_6$ and $P_4$ leads only to a well-known identity for binomial coefficients.
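The identity meant here is presumably the summation formula $\sum_{i=a}^{b} \binom{i}{a} = \binom{b+1}{a+1}$. As a sketch, assuming that $Q_6$ is the sum over $k$ of the probability of case (c) multiplied by the conditional probability $(s-r)/(n-r+1)$ that $i_1 \in B$, the comparison with (4) runs as follows (the substitution $i = n - k$ gives the second equality):

```latex
\begin{aligned}
Q_6 &= \frac{s-r}{n-r+1}\,\binom{n}{r-1}^{-1}\sum_{k=3}^{n-r+3}\binom{n-k}{r-3}
     = \frac{s-r}{n-r+1}\,\binom{n}{r-1}^{-1}\sum_{i=r-3}^{n-3}\binom{i}{r-3} \\
    &= \frac{s-r}{n-r+1}\,\binom{n}{r-1}^{-1}\binom{n-2}{r-2}
     = \frac{s-r}{n-r+1}\cdot\frac{(r-1)(n-r+1)}{n(n-1)}
     = \frac{(r-1)(s-r)}{n(n-1)} = P_4 .
\end{aligned}
```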
Needless to say, the practical use of these identities is rather limited
(if not nil); they are presented here for their curiosity value.
References