
XC.3 (1999)

Numbers representable by five prime squares with primes in an arithmetic progression

by

Yonghui Wang (Beijing)

1. Introduction. A classical result in additive number theory is the five prime squares theorem proved by L. K. Hua: the diophantine equation

$$N = p_1^2 + p_2^2 + p_3^2 + p_4^2 + p_5^2 \tag{1.1}$$

is solvable for large odd $N$ satisfying $N \equiv 5 \pmod{24}$.

This theorem can be regarded as a nonlinear extension of the Goldbach ternary theorem (the Goldbach–Vinogradov theorem); it also gives a deep insight into the Lagrange four square theorem. In this paper we study the equation (1.1) with prime variables in an arithmetic progression, i.e. the prime variables satisfy $p_i \equiv b_i \pmod d$, $i = 1, \ldots, 5$, and $b = (b_1, \ldots, b_5) \in B(N, d)$, where

$$B(N, d) = \{ b \in \mathbb{N}^5 : 1 \le b_i \le d,\ (b_i, d) = 1,\ b_1^2 + \cdots + b_5^2 \equiv N \pmod{\sigma(d) d} \}, \tag{1.2}$$

with $\sigma(d) = 1, 4, 2$ for $2 \nmid d$, $2 \,\|\, d$ and $4 \mid d$ respectively. We will use this notation in the rest of the paper.
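The definition (1.2) can be explored by brute force for small moduli. The following is a minimal sketch, not part of the paper: the function names `sigma` and `residue_set` are hypothetical, and the enumeration is only feasible for small $d$ because it runs over all 5-tuples of reduced residues.

```python
from itertools import product
from math import gcd


def sigma(d):
    """sigma(d) = 1, 4, 2 for d odd, 2 || d, and 4 | d respectively, as in (1.2)."""
    if d % 2 == 1:
        return 1
    return 4 if d % 4 == 2 else 2


def residue_set(N, d):
    """Brute-force enumeration of B(N, d); hypothetical helper, small d only."""
    units = [b for b in range(1, d + 1) if gcd(b, d) == 1]
    modulus = sigma(d) * d
    return [b for b in product(units, repeat=5)
            if (sum(x * x for x in b) - N) % modulus == 0]


# N = 29 satisfies N ≡ 5 (mod 24); print the size of B(29, d) for a few d.
for d in (3, 5, 6):
    print(d, len(residue_set(29, d)))
```

For instance, every reduced residue modulo 3 has square 1, so all $2^5$ tuples lie in $B(29, 3)$.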

Our main result is

Theorem. There exists an effective positive constant $\delta$ such that the diophantine equation

$$N = p_1^2 + p_2^2 + p_3^2 + p_4^2 + p_5^2, \quad p_i \equiv b_i \pmod d,\ i = 1, \ldots, 5, \tag{1.3}$$

with prime variables is solvable for all positive integers $d \le N^{\delta}$, provided $N \equiv 5 \pmod{24}$ is a large odd integer with $B(N, d)$ nonempty.

It should be mentioned that this result implies the famous Linnik theorem on the least prime in an arithmetic progression.

1991 Mathematics Subject Classification: Primary 11P55, 11P32, 11L07, 11L05.


In the corresponding linear case, i.e. for the Goldbach ternary theorem with prime variables in an arithmetic progression, M. C. Liu and Tao Zhan [7] proved that there exists an effective positive constant $\delta > 0$ such that, for all positive integers $d \le N^{\delta}$, the diophantine equation

$$N = p_1 + p_2 + p_3, \quad p_j \equiv b_j \pmod d,\ j = 1, 2, 3,$$

where $(b_j, d) = 1$ and $\sum_{j=1}^{3} b_j \equiv N \pmod d$, is solvable for large odd $N$. Their result improved the work of Rademacher, Ayoub and Zulauf, whose results hold only for a fixed positive integer $d$ or for $d \le (\log N)^{C}$. Hua's method with minor modifications actually shows that the equation (1.3) is solvable for large $N$ with the set (1.2) nonempty and $d \ll \log^{A} N$, but it fails when we want to enlarge the scope of $d$ to $d \le N^{\delta}$, where $\delta$ is an absolute positive constant.

The difficulty lies in two respects. First, in the case of $d \le N^{\delta}$, we cannot use the Siegel–Walfisz theorem as usual to estimate the major arcs. Second, the restriction to an arithmetic progression requires finding a way to deal with exponential sums and Gauss sums over an arithmetic progression.

The second difficulty was overcome by Jianya Liu and Tao Zhan in [4, Lemma 2]. By using multiplicativity ingeniously, they transform the exponential sum over an arithmetic progression to the usual exponential sum with Dirichlet characters and Gauss sums. The starting point of this paper is a similar result for the quadratic case, i.e. Lemma 3.2. But we use a different method to deal with the first difficulty.

In 1975, Montgomery and Vaughan [8] diminished the exceptional set of the Goldbach problem from $O(x \log^{-A} x)$ to $O(x^{1-\delta})$. Their difficulty was likewise that the major arcs are wider and the Siegel–Walfisz theorem cannot be used. They solved it by using the Deuring–Heilbronn phenomenon and Gallagher's theorem. But if we use their method for our quadratic problem, we face too many cases and need a great deal of calculation. This is also the reason why we do not apply the method of M. C. Liu and Tao Zhan [7]. Hence we shall apply a modification of the method of Liu and Tsang [6]. The point is that we only need to estimate the singular series and the singular integral separately and only once. But as we are concerned with the quadratic case and the restriction to an arithmetic progression, we have to work harder from the beginning to estimate the complicated singular series (Lemma 4.8).

2. Notations and the minor arcs. Define

$$d := N^{\delta}, \quad Q := N^{21\delta}, \quad T := N^{\sqrt{\delta}}, \quad L := N/50, \quad \tau := N^{-1} T^{1/4}, \tag{2.1}$$

where $\delta$ is a small computable positive constant. Then $Q < T < L < N$. We write

$$L_2 = L^{1/2} \quad \text{and} \quad N_2 = N^{1/2}. \tag{2.2}$$

In the following, $\varepsilon > 0$ is a comparable very small constant, and the implied constants in the symbols $O$ and $\ll$ are computable, positive and depend at most on $\delta$, $\varepsilon$. We write $e(y)$ for $e^{2\pi i y}$ and $e_q(y)$ for $e(y/q)$.

For any $a$, $q$ such that $(a, q) = 1$, $1 \le a \le q \le Q$, let

$$m(a, q) := \left[ \frac{a - \tau}{q}, \frac{a + \tau}{q} \right].$$

We can easily see that these intervals are mutually disjoint and all lie in $[\tau, 1 + \tau]$. We call the union of these $m(a, q)$ the major arcs $M$ and $[\tau, 1 + \tau] \setminus M$ the minor arcs $M'$.

Define

$$S_i(\alpha) := S(\alpha, d, b_i) := \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d)}} \Lambda(n) e(\alpha n^2), \tag{2.3}$$

$$R(N) := \int_{\tau}^{1+\tau} \prod_{i=1}^{5} S_i(\alpha)\, e(-N\alpha)\, d\alpha. \tag{2.4}$$

Then the Theorem holds if $R(N) > 0$. By interval dissection we get

$$R(N) = \Big\{ \int_{M} + \int_{M'} \Big\} \prod_{i=1}^{5} S_i(\alpha)\, e(-N\alpha)\, d\alpha =: R_1(N) + R_2(N), \tag{2.5}$$

say. The integral over the minor arcs contributes the error term $R_2(N)$. We now estimate it by the following lemma from [11].

Lemma 2.1. For $\alpha$ satisfying $|\alpha - a/q| \le 1/q^2$, $(a, q) = 1$ and $h = (q, d)$, we have

(2.6) S

i

(α) N

₂^1+ε

h

dq

^1/2

+ N

₂^11/14

d

^1/2

+ N

₂^6/7

h

^3/4

d

^3/4

q

^1/4

+

q

^1/2

h

^1/2

+ q

^1/4

h

^1/4

N

₂^1/7

$N^{\varepsilon}$.

For any $\alpha \in M'$, by Dirichlet's lemma we see that there exist $q$, $a$ satisfying $Q \le q \le \tau^{-1}$, $(a, q) = 1$, $1 \le a \le q$ such that $|\alpha - a/q| \le \tau/q \le 1/q^2$. Notice that $d = N^{\delta}$, $Q = N^{21\delta}$ and $\delta$ is sufficiently small. We have

Corollary 2.2. For any $\alpha \in M'$, we have

$$S_i(\alpha) \ll N^{1/2} Q^{-1/2}$$

provided $N \ge N_0(\delta)$.

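Lemma 2.3. We have

$$R_2(N) \ll N^{3/2} Q^{-1/2} d^{-2} \log^{4+c} N$$

whenever $N \ge N_1(\delta, c)$.

Proof. By Corollary 2.2 applied to $S_5(\alpha)$, we have

$$R_2(N) \ll N^{1/2} Q^{-1/2} \int_{\tau}^{1+\tau} \Big| \prod_{i=1}^{4} S_i(\alpha) \Big| \, d\alpha \ll N^{1/2} Q^{-1/2} \sum_{i=1}^{4} \int_{\tau}^{1+\tau} |S_i(\alpha)|^4 \, d\alpha.$$

Clearly, the inner integral is equal to

$$\sum_{\substack{n_j \le N_2 \\ n_1^2 + n_2^2 = n_3^2 + n_4^2 \\ n_j \equiv b_i \,(\mathrm{mod}\, d)}} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\Lambda(n_4) \le \log^4 N \sum_{\substack{n_j \le N_2 \\ n_1^2 + n_2^2 = n_3^2 + n_4^2 \\ n_j \equiv b_i \,(\mathrm{mod}\, d)}} 1 = \log^4 N \int_0^1 \Big| \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d)}} e(\alpha n^2) \Big|^4 d\alpha.$$

Following the arguments in Hua's lemma [9], it is seen that there is a constant $c$ such that

$$\int_0^1 \Big| \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d)}} e(\alpha n^2) \Big|^4 d\alpha \ll \frac{N}{d^2} \log^c N,$$

whence our lemma follows.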

From (2.5), to obtain our Theorem it remains to find a lower bound for $R_1(N)$ such that $R_1(N) > |R_2(N)|$.

3. Notations and the major arcs. We shall use $\chi \pmod q$ and $\chi_0 \pmod q$ to denote a Dirichlet character and the principal character modulo $q$ respectively. It is known ([1], Chapter 14) that there exists a small $c_1$ such that there is at most one primitive character $\tilde{\chi}$ to a modulus $\tilde{r} \le T$ for which the corresponding $L$-function $L(s, \tilde{\chi})$ has a zero in the region $\sigma > 1 - c_1 (\log T)^{-1}$, $|t| \le T$; and if there is such an exceptional character, it is quadratic and the corresponding zero $\tilde{\beta}$, called the exceptional zero, is real, simple and unique. Furthermore we have

$$c_2 (\tilde{r}^{1/2} \log^2 \tilde{r})^{-1} \le 1 - \tilde{\beta} \le c_1 / \log T. \tag{3.1}$$

We write $\sum_{a=1}^{q}{}'$ or $\sum_{(a,q)=1}$ for a sum over integers $a$ satisfying $1 \le a \le q$ and $(a, q) = 1$. For any character $\chi \pmod{dq/h}$, $h = (d, q)$, define

$$\begin{aligned}
S(\chi, y) &:= S(\chi, y, d, q) := \sum_{L_2 \le n \le N_2} \chi(n) \Lambda(n) e(n^2 y), \\
I(y) &:= \int_{L_2}^{N_2} e(x^2 y)\, dx, \\
\tilde{I}(y) &:= \int_{L_2}^{N_2} x^{\tilde{\beta} - 1} e(x^2 y)\, dx, \\
I(\chi, y) &:= \sum_{|\gamma| \le T}{}' \int_{L_2}^{N_2} x^{\rho - 1} e(x^2 y)\, dx,
\end{aligned} \tag{3.2}$$

where $\Lambda(n)$ is the von Mangoldt function and $\sum'_{|\gamma| \le T}$ denotes the summation over all zeros $\rho = \beta + i\gamma$ of the function $L(s, \chi)$ lying in the region $1/2 \le \beta \le 1 - c_1 (\log T)^{-1}$, $|\gamma| \le T$ (hence excluding $\tilde{\beta}$, if it exists). By ([1], p. 120) we can easily deduce that

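Lemma 3.1. For any real $y$ and any $\chi \pmod{dq/h}$ with $dq/h \le T$, we have

$$S(\chi, y) = \delta_{\chi}^{0} I(y) - \delta_{\chi} \tilde{I}(y) - I(\chi, y) + O((1 + |y| N) N_2 T^{-1} \log^2 N), \tag{3.3}$$

where

$$\delta_{\chi}^{0} = \begin{cases} 1 & \text{if } \chi = \chi_0 \pmod{dq/h}, \\ 0 & \text{otherwise;} \end{cases} \qquad \delta_{\chi} = \begin{cases} 1 & \text{if } \chi = \tilde{\chi}\chi_0 \pmod{dq/h}, \\ 0 & \text{otherwise.} \end{cases}$$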

We next transform the exponential sum $S_i(\alpha)$ into character sums or integrals of the above forms. To do this we need some more notation.

For positive integers $d$, $q$ and $h(q) := (d, q)$, i.e. the greatest common divisor of $d$ and $q$, define positive integers $\alpha_i$, $\beta_i$, $\gamma_i$ according to

$$d = p_1^{\alpha_1} \cdots p_s^{\alpha_s} d_0, \quad q = p_1^{\beta_1} \cdots p_s^{\beta_s} q_0, \quad (d_0, q_0) = 1, \tag{3.4}$$

$$h(q) = p_1^{\gamma_1} \cdots p_s^{\gamma_s}, \tag{3.5}$$

hence $\gamma_i = \min(\alpha_i, \beta_i)$, $i = 1, \ldots, s$. Define

$$h_1(q) := p_1^{\delta_1} \cdots p_s^{\delta_s}, \quad \delta_i = \begin{cases} \alpha_i & \text{if } \beta_i > \alpha_i, \\ 0 & \text{otherwise,} \end{cases} \tag{3.6}$$

$$h_2(q) := h(q)/h_1(q). \tag{3.7}$$

For brevity, we write $h = h(q)$, $h_1 = h_1(q)$ and $h_2 = h_2(q)$. It is easily seen that $(h_1, h_2) = 1$ and $(d/h_1, q/h_2) = 1$.
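The splitting $h = h_1 h_2$ of (3.5)–(3.7) and the two coprimality claims can be checked mechanically. The sketch below is illustrative only; `valuation` and `split_gcd` are hypothetical names, and trial division is used because the inputs are assumed small.

```python
from math import gcd


def valuation(n, p):
    """Exponent of the prime p in the positive integer n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def split_gcd(d, q):
    """Return (h, h1, h2) with h = (d, q), following (3.5)-(3.7)."""
    h = gcd(d, q)
    h1, rest, p = 1, h, 2
    while rest > 1:
        if rest % p == 0:             # p is a prime dividing h
            alpha, beta = valuation(d, p), valuation(q, p)
            if beta > alpha:          # delta_i = alpha_i in (3.6)
                h1 *= p ** alpha
            while rest % p == 0:
                rest //= p
        p += 1
    return h, h1, h // h1


print(split_gcd(12, 18))  # (6, 3, 2): d/h1 = 4 and q/h2 = 9 are coprime
```

With $d = 12 = 2^2 \cdot 3$ and $q = 18 = 2 \cdot 3^2$, only the prime 3 has $\beta_i > \alpha_i$, so $h_1 = 3$ and $h_2 = 2$.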

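Lemma 3.2. For $\alpha = a/q + \lambda$, we have

$$S_i(\alpha) = \varphi^{-1}(d/h_1) \varphi^{-1}(q/h_2) \sum_{\zeta \,(\mathrm{mod}\, d/h_1)} \zeta(b_i) \sum_{\eta \,(\mathrm{mod}\, q/h_2)} G_i(a, \eta, q) S(\zeta\eta, \lambda) + O(\log^2 N), \tag{3.8}$$

where

$$G_i(a, \eta, q) := G(h, b_i, a, \eta, q) := \sum_{\substack{(c, q) = 1 \\ c \equiv b_i \,(\mathrm{mod}\, h)}} e(a c^2 / q)\, \eta(c), \tag{3.9}$$

and $\eta$, $\zeta$ are characters modulo $q/h_2$ and $d/h_1$ respectively.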

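Proof. We have

$$\begin{aligned}
S_i(\alpha) &= \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d)}} \Lambda(n) e(\alpha n^2) \\
&= \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d) \\ (n, q) = 1}} \Lambda(n) e(\alpha n^2) + O\Big( \sum_{\substack{p^k \le N_2 \\ p \mid q}} \log p \; e(p^{2k} \alpha) \Big) \\
&= \sum_{(c, q) = 1} e\Big(\frac{a c^2}{q}\Big) \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d) \\ n \equiv c \,(\mathrm{mod}\, q)}} \Lambda(n) e(n^2 \lambda) + O\Big( \log N \sum_{p \mid q} \log p \Big) \\
&= \sum_{\substack{(c, q) = 1 \\ c \equiv b_i \,(\mathrm{mod}\, h)}} e\Big(\frac{a c^2}{q}\Big) \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d) \\ n \equiv c \,(\mathrm{mod}\, q)}} \Lambda(n) e(n^2 \lambda) + O(\log^2 N).
\end{aligned}$$

In the last step we used that the inner sum of the main term is empty unless $c \equiv b_i \pmod h$, so we can add the restriction $c \equiv b_i \pmod h$ to the sum over $c$. On the other hand, under the condition $c \equiv b_i \pmod h$, the congruences

$$n \equiv b_i \pmod d, \quad n \equiv c \pmod q$$

are equivalent to

$$n \equiv b_i \pmod{d/h_1}, \quad n \equiv c \pmod{q/h_2}.$$

Then

$$\begin{aligned}
S_i(\alpha) &= \sum_{\substack{(c, q) = 1 \\ c \equiv b_i \,(\mathrm{mod}\, h)}} e\Big(\frac{a c^2}{q}\Big) \sum_{\substack{n \le N_2 \\ n \equiv b_i \,(\mathrm{mod}\, d/h_1) \\ n \equiv c \,(\mathrm{mod}\, q/h_2)}} \Lambda(n) e(n^2 \lambda) + O(\log^2 N) \\
&= \varphi^{-1}(d/h_1) \varphi^{-1}(q/h_2) \sum_{\substack{(c, q) = 1 \\ c \equiv b_i \,(\mathrm{mod}\, h)}} e\Big(\frac{a c^2}{q}\Big) \\
&\quad \times \sum_{\zeta \,(\mathrm{mod}\, d/h_1)} \zeta(b_i) \sum_{\eta \,(\mathrm{mod}\, q/h_2)} \eta(c) \sum_{n \le N_2} \zeta\eta(n) \Lambda(n) e(n^2 \lambda) + O(\log^2 N).
\end{aligned}$$

Hence we get the assertion.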
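The equivalence of the two systems of congruences used in the proof can be tested numerically. The sketch below is an illustration under stated assumptions: `split_gcd` and `congruences_agree` are hypothetical names, and $h_1$ is computed by a gcd iteration (the part of $d$ supported on primes dividing $q/h$), which is an equivalent shortcut rather than the paper's own formulation.

```python
from math import gcd


def split_gcd(d, q):
    """Return (h, h1, h2); h1 is the part of d built from primes dividing q/h."""
    h = gcd(d, q)
    k = q // h
    h1, rest = 1, d
    g = gcd(rest, k)
    while g > 1:
        h1 *= g
        rest //= g
        g = gcd(rest, k)
    return h, h1, h // h1


def congruences_agree(d, q, b, c):
    """Check, over one full period, that the two congruence systems coincide."""
    h, h1, h2 = split_gcd(d, q)
    if (c - b) % h != 0:
        raise ValueError("the claim assumes c ≡ b (mod h)")
    period = d * q // h               # lcm(d, q)
    for n in range(period):
        full = (n - b) % d == 0 and (n - c) % q == 0
        reduced = (n - b) % (d // h1) == 0 and (n - c) % (q // h2) == 0
        if full != reduced:
            return False
    return True


print(congruences_agree(12, 18, 5, 11))
```

Both systems are periodic modulo $dq/h$, so checking one period suffices.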

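By using the above lemmas, we now simplify $R_1(N)$ as follows.

For any $\alpha = a/q + \lambda \in m(a, q)$, we have $|\lambda| < \tau/q$ and $q \le Q$. By Lemmas 3.1 and 3.2,

$$\begin{aligned}
S_i(\alpha) = {} & \varphi^{-1}(d/h_1) \varphi^{-1}(q/h_2) \Big\{ G_i(a, \eta_0, q) I(\lambda) - \delta_q \tilde{\zeta}\zeta_0(b_i) G_i(a, \tilde{\eta}\eta_0, q) \tilde{I}(\lambda) \\
& - \sum_{\zeta \,(\mathrm{mod}\, d/h_1)} \sum_{\eta \,(\mathrm{mod}\, q/h_2)} \zeta(b_i) G_i(a, \eta, q) I(\zeta\eta, \lambda) \Big\} \\
& + O\Big( \varphi^{-1}(q/h_2) \sum_{\eta \,(\mathrm{mod}\, q/h_2)} |G_i(a, \eta, q)| (1 + |\lambda| N) N^{1/2} T^{-1} \log^2 N \Big) + O(\log^2 N),
\end{aligned}$$

where $\tilde{\zeta}\zeta_0 \pmod{d/h_1} \cdot \tilde{\eta}\eta_0 \pmod{q/h_2} = \tilde{\chi}\chi_0 \pmod{dq/h}$, $\tilde{\zeta}$, $\tilde{\eta}$ are primitive characters, and

$$\delta_q := \begin{cases} 1 & \text{if } \tilde{\chi} \pmod{\tilde{r}} \text{ exists and } \tilde{r} \mid dq/h, \\ 0 & \text{otherwise.} \end{cases} \tag{3.10}$$

Since $|\lambda| \le \tau/q$ and $|\lambda| N < T^{1/4} q^{-1}$, the trivial bound

$$\sum_{\eta \,(\mathrm{mod}\, q/h_2)} |G_i(a, \eta, q)| \ll \varphi(q/h_2) \varphi(q)$$

shows that the first $O$-term above is

$$\ll \varphi(q) T^{1/4} q^{-1} N_2 T^{-1} \log^2 N \le N_2 T^{-3/4} \log^2 N.$$

Hence, for $\alpha = a/q + \lambda \in m(a, q)$ we obtain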

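$$S_i(\alpha) = \varphi^{-1}(d/h_1) \varphi^{-1}(q/h_2) H_i(a, q, \lambda) + O(N_2 T^{-3/4} \log^2 N), \tag{3.11}$$

where

$$H_i(a, q, \lambda) := G_i(a, \eta_0, q) I(\lambda) - \delta_q \tilde{\zeta}\zeta_0(b_i) G_i(a, \tilde{\eta}\eta_0, q) \tilde{I}(\lambda) - F_i(a, q, \lambda), \tag{3.12}$$

$$F_i(a, q, \lambda) := \sum_{\zeta \,(\mathrm{mod}\, d/h_1)} \sum_{\eta \,(\mathrm{mod}\, q/h_2)} \zeta(b_i) G_i(a, \eta, q) I(\zeta\eta, \lambda). \tag{3.13}$$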

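To estimate $H_i(a, q, \lambda)$, we need the following lemma, which can be deduced similarly to Lemma 3.3 of [5].

Lemma 3.3. Let $I(\lambda)$, $\tilde{I}(\lambda)$ and $I(\chi, \lambda)$ be defined as in (3.2).

(a) For any real $y$, we have

$$I(y) \ll \min(N_2, L_2^{-1} |y|^{-1}), \qquad \tilde{I}(y) \ll \min(N_2^{\tilde{\beta}}, L_2^{\tilde{\beta} - 1} |y|^{-1})$$

and

$$I(\chi, y) \ll \begin{cases} N_2 & \text{for any real } y, \\ N_2 (L|y|)^{-1/2} & \text{for } |y| > L^{-1}, \\ L_2 (L|y|)^{-1} & \text{for } |y| > T/(\pi L). \end{cases}$$

(b) We have

$$\int_{-\infty}^{\infty} |I(y)|^2 \, dy \ll N_2 L_2^{-1}, \qquad \int_{-\infty}^{\infty} |\tilde{I}(y)|^2 \, dy \ll N_2^{\tilde{\beta}} L_2^{\tilde{\beta} - 2}$$

and

$$\int_{-\infty}^{\infty} |I(\chi, y)|^2 \, dy \ll N L^{-1} \log N.$$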

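By the trivial estimates for $I(\lambda)$, $\tilde{I}(\lambda)$ and $I(\chi, \lambda)$ in Lemma 3.3(a),

$$\varphi^{-1}(d/h_1) \varphi^{-1}(q/h_2) H_i(a, q, \lambda) \ll \varphi(q) N_2.$$

Then

$$\begin{aligned}
R_1(N) = {} & \sum_{q \le Q} \varphi^{-5}(d/h_1) \varphi^{-5}(q/h_2) \sum_{(a, q) = 1} \int_{-\tau/q}^{\tau/q} e(-N(a/q + \lambda)) \prod_{i=1}^{5} H_i(a, q, \lambda) \, d\lambda \\
& + O\Big( \sum_{q \le Q} \sum_{(a, q) = 1} \frac{\tau}{q} \sum_{k=0}^{4} (\varphi(q) N_2)^{k} (N_2 T^{-3/4} \log^2 N)^{5-k} \Big) \\
= {} & \sum_{q \le Q} \varphi^{-5}(d/h_1) \varphi^{-5}(q/h_2) \sum_{(a, q) = 1} e_q(-Na) \int_{-\tau/q}^{\tau/q} e(-N\lambda) \prod_{i=1}^{5} H_i(a, q, \lambda) \, d\lambda + O(N^{3/2} T^{-1/2} Q^{5}).
\end{aligned}$$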

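The product $\prod_{k=1}^{5} H_k(a, q, \lambda)$ is a sum of at most $(\varphi(dq/h) + 2)^5$ terms, each of the form $\prod_{i=1}^{5} E_i$, where $E_i$ is either $G_i(a, \eta_0, q) I(\lambda)$, $-\delta_q \tilde{\zeta}(b_i) G(a, \tilde{\eta}\eta_0, q) \tilde{I}(\lambda)$ or $-\zeta(b_i) G(a, \eta, q) I(\zeta\eta, \lambda)$.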

By comparing the estimates for $I(\lambda)$, $\tilde{I}(\lambda)$ and $I(\chi, \lambda)$ in Lemma 3.3(a) with $|\lambda| > \tau/q > L^{-1}$, it is easily seen that the weakest one among them is $N^{1/2} (L|\lambda|)^{-1/2}$

, since τ = N

⁻¹

T

^1/4

, L = N/25 and T = Q

^1/^√^δ

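. Then

$$\int_{\mathbb{R} \setminus [-\tau/q, \tau/q]} \Big| \prod_{i=1}^{5} E_i \Big| \, d\lambda \ll \varphi^{3}(q) |\tau/q|^{-3/2} \int_{-\infty}^{\infty} |E_1 E_2| \, d\lambda \ll \varphi^{5}(q) |\tau/q|^{-3/2},$$

by Cauchy's inequality and Lemma 3.3(b). Hence

$$\begin{aligned}
& \sum_{q \le Q} \varphi^{-5}\Big(\frac{d}{h_1}\Big) \varphi^{-5}\Big(\frac{q}{h_2}\Big) \sum_{(a, q) = 1} \int_{\mathbb{R} \setminus [-\tau/q, \tau/q]} e(-N\lambda) \prod_{i=1}^{5} H_i(a, q, \lambda) \, d\lambda \\
& \quad \ll \sum_{q \le Q} \varphi^{-5}\Big(\frac{d}{h_1}\Big) \varphi^{-5}\Big(\frac{q}{h_2}\Big) \varphi(q) \Big( \varphi\Big(\frac{dq}{h}\Big) + 2 \Big)^{5} \varphi^{5}(q) q^{3/2} N^{3/2} T^{-3/8} \\
& \quad \ll Q^{9} N^{3/2} T^{-3/8} \ll N^{3/2} Q^{-1}.
\end{aligned}$$

Therefore

$$R_1(N) = \sum_{q \le Q} \varphi^{-5}(d/h_1) \varphi^{-5}(q/h_2) \sum_{(a, q) = 1} e_q(-Na) \int_{-\infty}^{\infty} e(-N\lambda) \prod_{i=1}^{5} H_i(a, q, \lambda) \, d\lambda + O(N^{3/2} Q^{-1}). \tag{3.14}$$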

4. Some lemmas for singular series and singular integrals

Lemma 4.1. Let $\chi \pmod{p^{\beta}/h_2}$ be any character with $\beta \ge 0$, let $h_2 = h_2(p^{\beta})$ be defined as in (3.7), and let $\alpha$ be such that $p^{\alpha} \,\|\, d$. We have

(a) $G_i(a, \chi, p^{\beta}) = 0$ if $\chi \pmod{p^{\beta}}$ is primitive and $p \mid a$, $\beta > \alpha$.

(b) $G_i(a, \chi\eta_0, p^{t}) = 0$ if $\eta_0$ is modulo $p^{t}/h_2(p^{t})$, $p \nmid a$ and $t \ge \theta + \max\{\theta, \alpha, \beta\}$, where $\theta = 1 + [2/p]$.

(c) $|G_i(a, \chi, p^{\beta})| \le 2(2, p) p^{\beta/2}$ if $p \nmid a$.

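Proof. (a) Let $a' = a/p$. For $1 \le c \le p^{\beta}$, write $c = u + v p^{\beta - 1}$. Since $\beta > \alpha$, we have $h(p^{\beta}) = p^{\alpha}$, and the restriction $c \equiv b_i \pmod{p^{\alpha}}$ is equivalent to $u \equiv b_i \pmod{p^{\alpha}}$. Hence

$$G_i(a, \chi, p^{\beta}) = \sum_{\substack{c = 1 \\ c \equiv b_i \,(\mathrm{mod}\, p^{\alpha})}}^{p^{\beta}} e\Big(\frac{a' c^2}{p^{\beta - 1}}\Big) \chi(c) = \sum_{\substack{u = 1 \\ u \equiv b_i \,(\mathrm{mod}\, p^{\alpha})}}^{p^{\beta - 1}} e\Big(\frac{a' u^2}{p^{\beta - 1}}\Big) \sum_{v = 1}^{p} \chi(u + v p^{\beta - 1}) = 0,$$

since for a primitive character $\chi \pmod{p^{\beta}}$ the inner sum over $v$ is zero.

(b) For $t > \alpha$, we see from (3.7) that $h(p^{t}) = p^{\alpha}$ and $h_2(p^{t}) = 1$, so $\eta_0$ is modulo $p^{t}$. For $1 \le c \le p^{t}$, write $c = u + v p^{t - \theta}$. The restriction $c \equiv b_i \pmod{p^{\alpha}}$ is equivalent to $u \equiv b_i \pmod{p^{\alpha}}$, since $t \ge \theta + \max\{\theta, \alpha, \beta\}$. Moreover, we have $c^2 \equiv u^2 + 2uv p^{t - \theta} \pmod{p^{t}}$ and $c \equiv u \pmod{p^{\beta}}$. Hence

$$G_i(a, \chi\eta_0, p^{t}) = \sum_{\substack{u = 1 \\ u \equiv b_i \,(\mathrm{mod}\, p^{\alpha})}}^{p^{t - \theta}} \chi\eta_0(u)\, e\Big(\frac{a u^2}{p^{t}}\Big) \sum_{v = 1}^{p^{\theta}} e\Big(\frac{2auv}{p^{\theta}}\Big).$$

For each $u$ coprime to $p$, in view of $p \nmid a$, the inner sum over $v$ is zero. Hence (b) is proved.

(c) For $\beta \le \alpha$, $|G(a, \chi, p^{\beta})| \le 1$. For $\beta > \alpha$, we have

$$G(a, \chi, p^{\beta}) = \sum_{\substack{c = 1 \\ c \equiv b_i \,(\mathrm{mod}\, p^{\alpha})}}^{p^{\beta}} e\Big(\frac{a c^2}{p^{\beta}}\Big) \chi_{p^{\beta}}(c) = \varphi^{-1}(p^{\alpha}) \sum_{\chi_{p^{\alpha}} \,(\mathrm{mod}\, p^{\alpha})} \chi_{p^{\alpha}}(b_i) \sum_{c = 1}^{p^{\beta}} e\Big(\frac{a c^2}{p^{\beta}}\Big) \chi_{p^{\alpha}} \chi_{p^{\beta}}(c).$$

By Exercise 14 in Chapter 6 of [10], the inner sum over $c$ is at most $2(2, p) p^{\beta/2}$ in absolute value for $p \nmid a$. Hence we get (c).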
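Part (b) of Lemma 4.1 can be observed numerically in the simplest case where $\chi$ is trivial ($\beta = 0$), so that only the principal character $\eta_0$ appears. The sketch below is illustrative: the names `e`, `gauss_sum` and `threshold` are hypothetical, $d$ is taken to be a prime power $p^{\alpha}$, and floating-point sums are compared with zero up to rounding.

```python
import cmath
from math import gcd


def e(num, den):
    """e(num/den) = exp(2*pi*i*num/den), with num reduced modulo den first."""
    return cmath.exp(2j * cmath.pi * (num % den) / den)


def gauss_sum(a, q, b, d):
    """G(h, b, a, eta_0, q) of (3.9) for the principal character, h = (d, q)."""
    h = gcd(d, q)
    return sum(e(a * c * c, q) for c in range(1, q + 1)
               if gcd(c, q) == 1 and (c - b) % h == 0)


def threshold(p, alpha):
    """Smallest t covered by Lemma 4.1(b) when beta = 0."""
    theta = 2 if p == 2 else 1        # theta = 1 + [2/p]
    return theta + max(theta, alpha)


# (p, alpha, b): modulus p^t, d = p^alpha, residue class b coprime to p.
for p, alpha, b in [(3, 1, 1), (2, 0, 1), (2, 3, 3), (5, 2, 7)]:
    t = threshold(p, alpha)
    value = gauss_sum(1, p ** t, b, p ** alpha)
    print(p, alpha, t, round(abs(value), 9))
```

For $p = 2$, $d = 1$ the sum over odd $c$ modulo 8 has modulus 4, while modulo 16 it vanishes, so in this instance the threshold $t \ge 4$ cannot be lowered.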

We shall use the following sums to form the singular series:

$$Z(q) := Z(q; \eta_1, \ldots, \eta_5) := \sum_{(a, q) = 1} e_q(-Na) \prod_{i=1}^{5} G_i(a, \eta_i, q), \tag{4.1}$$

$$Y(q) := Y(q; \eta_1, \ldots, \eta_5) := \sum_{a = 1}^{q} e_q(-Na) \prod_{i=1}^{5} G_i(a, \eta_i, q), \tag{4.2}$$

where $\eta_i$ is modulo $q/h_2(q)$. We can also write

$$Y(q; \eta_1, \ldots, \eta_5) = q \sum\nolimits^{(q)} \eta_1(c_1) \cdots \eta_5(c_5), \tag{4.3}$$

where $\sum^{(q)}$ denotes the sum over $c_1, \ldots, c_5$ satisfying

$$1 \le c_1, \ldots, c_5 \le q, \quad c_i \equiv b_i \pmod{(d, q)}, \quad (c_i, q) = 1, \quad \sum_{i=1}^{5} c_i^2 \equiv N \pmod q. \tag{4.4}$$

Denote by $N(q)$ the number of solutions of the above congruence equation. By Hua's work on Tarry's problem [3, p. 162] and M. C. Liu, K. M. Tsang [6, (1.5)], we see that if $N \equiv 5 \pmod{24}$ and $N$ satisfies (1.2) then $N(q) \ge 1$ for all $q$. In the case where the $\eta_i$ are all principal characters, we see that

$$Y(q; \eta_0, \ldots, \eta_0) = q N(q). \tag{4.5}$$

Furthermore we put

$$A(q) := \varphi^{-5}(q (d, q)^{\infty} / h)\, Z(q; \eta_0, \ldots, \eta_0), \tag{4.6}$$

where $(d, q)^{\infty}$ has the same prime factors as $(d, q)$, and $(d, q)^{\infty} \,\|\, d$, which means that if $p^{\alpha} \,\|\, (d, q)^{\infty}$ then $p^{\alpha} \,\|\, d$.
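The identity (4.5) lends itself to a direct numerical check for small $q$. The following sketch is an illustration only: `e`, `admissible`, `Y` and `count_solutions` are hypothetical names, all characters are taken principal, and the count is done by exhaustive search over 5-tuples.

```python
import cmath
from itertools import product
from math import gcd


def e(num, den):
    """e(num/den) = exp(2*pi*i*num/den), with num reduced modulo den first."""
    return cmath.exp(2j * cmath.pi * (num % den) / den)


def admissible(q, b, d):
    """Residues c in [1, q] with (c, q) = 1 and c ≡ b (mod (d, q))."""
    h = gcd(d, q)
    return [c for c in range(1, q + 1) if gcd(c, q) == 1 and (c - b) % h == 0]


def Y(q, N, bs, d):
    """Y(q; eta_0, ..., eta_0) as in (4.2), via the Gauss sums (3.9)."""
    total = 0
    for a in range(1, q + 1):
        term = e(-N * a, q)
        for b in bs:
            term *= sum(e(a * c * c, q) for c in admissible(q, b, d))
        total += term
    return total


def count_solutions(q, N, bs, d):
    """N(q): the number of (c_1, ..., c_5) satisfying (4.4)."""
    choices = [admissible(q, b, d) for b in bs]
    return sum(1 for cs in product(*choices)
               if (sum(c * c for c in cs) - N) % q == 0)


bs = (1, 1, 1, 1, 1)
print(count_solutions(8, 5, bs, 1), round(Y(8, 5, bs, 1).real))
```

With $q = 8$, $N = 5$ and $d = 1$ every odd square is $\equiv 1 \pmod 8$, so all $4^5 = 2^{10}$ tuples are solutions and $Y(8) = 8 \cdot 2^{10}$.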

Lemma 4.2. Both $Z(q)$ and $Y(q)$ are multiplicative in the following sense. If $q = q_1 \cdots q_t$ with $(q_i, q_j) = 1$ for $i \ne j$, then for each $i = 1, \ldots, 5$, the decompositions $\eta_i \pmod{q/h_2(q)} = \prod_{j=1}^{t} \eta_{ij} \pmod{q_j/h_2(q_j)}$ are unique, and we have

$$Z(q; \eta_1, \ldots, \eta_5) = \prod_{j=1}^{t} Z(q_j; \eta_{1j}, \ldots, \eta_{5j})$$

and

$$Y(q; \eta_1, \ldots, \eta_5) = \prod_{j=1}^{t} Y(q_j; \eta_{1j}, \ldots, \eta_{5j}).$$

In particular, $N(q)$ and $A(q)$ are multiplicative functions of $q$.

Proof. It suffices to consider the case $t = 2$, and then use induction. Let $q = q_1 q_2$ with $(q_1, q_2) = 1$. It is easily seen from (3.7) that

$$h(q) = h(q_1) h(q_2), \quad h_i(q) = h_i(q_1) h_i(q_2), \quad (h_i(q_1), h_i(q_2)) = 1,$$

$$\frac{q}{h_i(q)} = \frac{q_1}{h_i(q_1)} \cdot \frac{q_2}{h_i(q_2)}, \quad \Big( \frac{q_1}{h_i(q_1)}, \frac{q_2}{h_i(q_2)} \Big) = 1, \quad i = 1, 2.$$

Then $\eta_i \pmod{q/h_2(q)} = \prod_{j=1}^{2} \eta_{ij} \pmod{q_j/h_2(q_j)}$ are uniquely determined.

Let $a = a_1 q_2 + a_2 q_1$. If $a_1$, $a_2$ run over reduced residue systems modulo $q_1$, $q_2$ respectively, then $a$ will run over a similar system modulo $q_1 q_2$. So

$$\begin{aligned}
Z(q_1 q_2) &= \sum_{(a, q) = 1} e_q(-Na) \prod_{i=1}^{5} G_i(a, \eta_i, q_1 q_2) \\
&= \sum_{(a_1, q_1) = 1} e_{q_1}(-N a_1) \sum_{(a_2, q_2) = 1} e_{q_2}(-N a_2) \prod_{i=1}^{5} G_i(a_1 q_2 + a_2 q_1, \eta_i, q_1 q_2).
\end{aligned} \tag{4.7}$$

In the same way, let $c = c_1 q_2 + c_2 q_1$. We see that the restriction $c \equiv b_i \pmod{h(q)}$ is equivalent to $c_1 q_2 \equiv b_i \pmod{h(q_1)}$ and $c_2 q_1 \equiv b_i \pmod{h(q_2)}$, so we have

$$\begin{aligned}
G_i(a_1 q_2 + a_2 q_1, \eta_i, q_1 q_2) &= \sum_{\substack{c = 1 \\ c \equiv b_i \,(\mathrm{mod}\, h(q))}}^{q_1 q_2} e\Big( \frac{(a_1 q_2 + a_2 q_1) c^2}{q_1 q_2} \Big) \eta_i(c) \\
&= \sum_{\substack{(c_1 q_2, q_1) = 1 \\ c_1 q_2 \equiv b_i \,(\mathrm{mod}\, h(q_1))}} \ \sum_{\substack{(c_2 q_1, q_2) = 1 \\ c_2 q_1 \equiv b_i \,(\mathrm{mod}\, h(q_2))}} e\Big( \frac{a_1 (c_1 q_2)^2}{q_1} \Big) \eta_{i1}(c_1 q_2)\, e\Big( \frac{a_2 (c_2 q_1)^2}{q_2} \Big) \eta_{i2}(c_2 q_1) \\
&= G_i(a_1, \eta_{i1}, q_1)\, G_i(a_2, \eta_{i2}, q_2).
\end{aligned}$$

Hence the right hand side of (4.7) is equal to $Z(q_1) Z(q_2)$ as desired. The proof of the multiplicativity of $Y(q)$ is similar.
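The multiplicativity of $Z(q)$ can be sanity-checked for principal characters and small coprime moduli. This is a hedged numerical sketch, not the paper's argument: `e`, `admissible` and `Z` are hypothetical names, and equality is tested up to floating-point error.

```python
import cmath
from math import gcd


def e(num, den):
    """e(num/den) = exp(2*pi*i*num/den), with num reduced modulo den first."""
    return cmath.exp(2j * cmath.pi * (num % den) / den)


def admissible(q, b, d):
    """Residues c in [1, q] with (c, q) = 1 and c ≡ b (mod (d, q))."""
    h = gcd(d, q)
    return [c for c in range(1, q + 1) if gcd(c, q) == 1 and (c - b) % h == 0]


def Z(q, N, bs, d):
    """Z(q; eta_0, ..., eta_0) as in (4.1): sum over reduced residues a."""
    total = 0
    for a in range(1, q + 1):
        if gcd(a, q) != 1:
            continue
        term = e(-N * a, q)
        for b in bs:
            term *= sum(e(a * c * c, q) for c in admissible(q, b, d))
        total += term
    return total


bs, N, d = (1, 1, 1, 1, 1), 29, 6
lhs = Z(12, N, bs, d)
rhs = Z(3, N, bs, d) * Z(4, N, bs, d)
print(round(lhs.real), round(rhs.real))
```

Here $d = 6$ and $b = (1, \ldots, 1)$ satisfy (1.2) for $N = 29$, since $5 \equiv 29 \pmod{24}$.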

Lemma 4.3. For any positive integer $q$, we have

$$\varphi^{-5}\Big( \frac{dq}{h(q)} \Big) Z(q) \ll \frac{h^{5}(q)}{d^{5}} q^{-3/2} L.$$

Proof. Let $q = \prod_{p \mid q} p^{\beta_p}$ be the prime factorization of $q$. By Lemma 4.1(c) and the multiplicativity of $Z(q)$, we have

$$|Z(q)| = \prod_{p \mid q} |Z(p^{\beta_p})| \ll \prod_{p \mid q} p^{\beta_p} \prod_{i=1}^{5} p^{\beta_p/2} \ll q^{7/2}.$$

Hence by $\varphi(q) \gg q / \log\log q$, we get the assertion.

Lemma 4.4. For $i = 1, \ldots, 5$, let $\chi_i \pmod{p^{\beta_i}}$ be primitive characters and $\beta = \max\{\beta_1, \ldots, \beta_5\}$. Define $\alpha := \alpha(p)$ such that $p^{\alpha(p)} \,\|\, d$. For brevity, write $Z(p^{t}) = Z(p^{t}; \chi_1\chi_0, \ldots, \chi_5\chi_0)$, where $\chi_0$ is modulo $p^{t}$. Then

(a) $Z(p^{\beta}) = Y(p^{\beta})$ if $\beta > \alpha$.

(b) $Z(p^{t}) = 0$ if $t \ge \theta + \max\{\theta, \beta, \alpha\}$, where $\theta = 1 + [2/p]$.

(c) $\sum_{v = \beta}^{t} \varphi^{-5}(p^{v}) Z(p^{v}) = \varphi^{-5}(p^{t}) Y(p^{t})$ for $\beta > \alpha$, or

$$\sum_{v = 0}^{\alpha} \varphi^{-5}(p^{\alpha}) Z(p^{v}) + \sum_{v = \alpha + 1}^{t} \varphi^{-5}(p^{v}) Z(p^{v}) = \varphi^{-5}(p^{t}) Y(p^{t})$$

for $\beta = 0$ and $t > \alpha$.

Proof. (a) Lemma 4.1(a) asserts that $G_i(a, \chi, p^{\beta}) = 0$ for $p \mid a$ and $\beta > \alpha$. Comparing (4.1) and (4.2), we see that the terms with $p \mid a$ contribute nothing to $Y(p^{\beta})$, hence (a) is proved.

(b) This follows readily from (4.1) and Lemma 4.1(b).

(c) The sum for $Z(p^{v})$ can be written as $\sum_{a=1}^{p^{v}} - \sum_{a=1,\, p \mid a}^{p^{v}}$. The first sum is $Y(p^{v})$. By setting $a = p a'$, we see that the second sum is equal to $p^{5} Y(p^{v-1})$ when $v \ge \max(\beta + 1, \alpha + 1, 2)$. Thus we have

$$\varphi^{-5}(p^{v}) Z(p^{v}) = \varphi^{-5}(p^{v}) Y(p^{v}) - \varphi^{-5}(p^{v-1}) Y(p^{v-1}).$$

For $\beta > \alpha$, summing both sides for $v = \beta + 1, \ldots, t$ and using (a), we get the first equality. In order to treat the case $\beta = 0$, $t > \alpha$, we still need to sum over those $t \le \alpha$ and prove that this yields $\varphi^{-5}(p^{\alpha}) Y(p^{\alpha})$. For $t \le \alpha$ we have $h(p^{t}) = p^{t}$ and $h_2(p^{t}) = p^{t}$, so

$$Z(p^{t}) = \sum_{(a, p^{t}) = 1} e\Big( \frac{-Na}{p^{t}} \Big) \prod_{i=1}^{5} \sum_{\substack{(c_i, p^{t}) = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, p^{t})}} e\Big( \frac{a c_i^2}{p^{t}} \Big) = \sum_{(a, p^{t}) = 1} e\Big( \frac{a (\sum_{i=1}^{5} b_i^2 - N)}{p^{t}} \Big) = \varphi(p^{t}),$$

since, by (1.2), $\sum b_i^2 \equiv N \pmod d$. Hence we have

$$\sum_{v = 0}^{\alpha} \varphi^{-5}(p^{\alpha}) Z(p^{v}) = \varphi^{-5}(p^{\alpha}) p^{\alpha}, \tag{4.8}$$

which is equal to $\varphi^{-5}(p^{\alpha}) Y(p^{\alpha})$. Hence we get (c).

Taking $\chi_1 = \ldots = \chi_5 = \chi_0$ and $\beta = 0$ in Lemma 4.4, we obtain

Corollary 4.5. Let $A(q)$, $N(q)$, $\alpha = \alpha(p)$ be defined as in (4.6), (4.5), and Lemma 4.4 respectively. Then

(a) $A(p^{t}) = 0$ for $p \ge 3$, $t \ge 1 + \alpha$, and $A(2^{t}) = 0$ for $t \ge 2 + \max\{2, \alpha\}$.

(b) $p^{t} \varphi^{-5}(p^{t}) N(p^{t}) = p^{\alpha} \varphi^{-5}(p^{\alpha}) N(p^{\alpha})$ for $p \ge 3$, $t \ge \alpha$.

(c) $2^{t} \varphi^{-5}(2^{t}) N(2^{t}) = 2^{\alpha'} \varphi^{-5}(2^{\alpha'}) N(2^{\alpha'})$ for $t \ge \alpha'$, $\alpha' = 1 + \max\{2, \alpha\}$.

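In view of the above corollary, we now define

$$s(p) := \sum_{0 \le t < \theta + \max\{\theta, \alpha(p)\}} A(p^{t}) = \varphi^{-5}(\sigma(p^{\alpha(p)}) p^{\alpha(p)})\, N(\sigma(p^{\alpha(p)}) p^{\alpha(p)})\, \sigma(p^{\alpha(p)}) p^{\alpha(p)}, \tag{4.9}$$

where $\sigma(q)$ is defined in (1.2). We now simplify $s(p)$.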

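Lemma 4.6. (a) $s(p) = \varphi^{-5}(p^{\alpha}) p^{\alpha}$ if $p \ne 2$, $\alpha = \alpha(p) \ge 1$.

(b)

$$s(2) = \begin{cases} 2^{3} & \text{if } \alpha(2) = 1, \\ \varphi^{-5}(2^{\alpha(2)}) 2^{\alpha(2) + 1} & \text{if } \alpha(2) \ge 2, \end{cases}$$

hence we can also write $s(2) = \varphi^{-5}(2^{\alpha}) 2^{\alpha} \sigma(d)$.

(c) $s(p) = 1 + A(p)$ if $p \ne 2$, $p \nmid d$; $s(2) = 1 + A(2) + A(2^{2}) + A(2^{3})$ if $2 \nmid d$.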

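Proof. (a) Use an argument similar to that of Lemma 4.4(c) and (4.8).

(b) By a similar argument, we have $A(2^{t}) = \varphi^{-5}(2^{\alpha}) \varphi(2^{t})$ for $t \le \alpha$.

If $\alpha = 1$, it remains to consider $t = 2, 3$. Since $\sum_{(a, p^{t}) = 1} = \sum_{a=1}^{p^{t}} - \sum_{a=1,\, p \mid a}^{p^{t}}$, we have

$$\begin{aligned}
A(2^{t}) &= \varphi^{-5}(2^{t}) \sum_{(a, 2^{t}) = 1} e\Big( \frac{-Na}{2^{t}} \Big) \prod_{i=1}^{5} \sum_{\substack{c_i = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, 2)}}^{2^{t}} e\Big( \frac{a c_i^2}{2^{t}} \Big) \\
&= \varphi^{-5}(2^{t}) \Big\{ \sum_{a = 1}^{2^{t}} \sum_{\substack{c_i = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, 2)}}^{2^{t}} e\Big( \frac{a (\sum_{i=1}^{5} c_i^2 - N)}{2^{t}} \Big) - \sum_{a' = 1}^{2^{t-1}} \sum_{\substack{c_i = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, 2)}}^{2^{t}} e\Big( \frac{a' (\sum_{i=1}^{5} c_i^2 - N)}{2^{t-1}} \Big) \Big\} \\
&= \varphi^{-5}(2^{t}) (2^{t} N(2^{t}) - 2^{t-1} 2^{5} N(2^{t-1})) \\
&= \varphi^{-5}(2^{t}) 2^{t} N(2^{t}) - \varphi^{-5}(2^{t-1}) 2^{t-1} N(2^{t-1}),
\end{aligned}$$

where $N(2^{t})$ is the number of solutions of the following congruence equation:

$$1 \le c_i \le 2^{t}, \quad (c_i, 2) = 1, \quad c_1^2 + \cdots + c_5^2 \equiv N \pmod{2^{t}}, \quad c_i \equiv b_i \pmod 2.$$

By an easy calculation, we see that $N(2^{3}) = 2^{10}$, $N(2) = 1$. Then

$$s(2) = 1 + A(2) + A(2^{2}) + A(2^{3}) = 1 + 1 - \varphi^{-5}(2)\, 2 N(2) + \varphi^{-5}(2^{3})\, 2^{3} N(2^{3}) = 2^{3}.$$

If $\alpha > 1$, it remains to consider $t = \alpha + 1$. We have

$$\begin{aligned}
A(2^{\alpha+1}) &= \varphi^{-5}(2^{\alpha+1}) \sum_{(a, 2^{\alpha+1}) = 1} e\Big( \frac{-Na}{2^{\alpha+1}} \Big) \prod_{i=1}^{5} \sum_{\substack{c_i = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, 2^{\alpha})}}^{2^{\alpha+1}} e\Big( \frac{a c_i^2}{2^{\alpha+1}} \Big) \\
&= \varphi^{-5}(2^{\alpha+1}) \Big\{ \sum_{a = 1}^{2^{\alpha+1}} \sum_{\substack{c_i = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, 2^{\alpha})}}^{2^{\alpha+1}} e\Big( \frac{a (\sum_{i=1}^{5} c_i^2 - N)}{2^{\alpha+1}} \Big) - \sum_{a' = 1}^{2^{\alpha}} e\Big( \frac{-N a'}{2^{\alpha}} \Big) \prod_{i=1}^{5} \sum_{\substack{c_i = 1 \\ c_i \equiv b_i \,(\mathrm{mod}\, 2^{\alpha})}}^{2^{\alpha+1}} e\Big( \frac{a' c_i^2}{2^{\alpha}} \Big) \Big\} \\
&= \varphi^{-5}(2^{\alpha+1}) 2^{\alpha+1} N(2^{\alpha+1}) - \varphi^{-5}(2^{\alpha}) 2^{\alpha},
\end{aligned}$$

where $N(2^{\alpha+1})$ is the number of solutions of the following congruence equation:

$$1 \le c_i \le 2^{\alpha+1}, \quad (c_i, 2) = 1, \quad c_1^2 + \cdots + c_5^2 \equiv N \pmod{2^{\alpha+1}}, \quad c_i \equiv b_i \pmod{2^{\alpha}}.$$

Obviously $N(2^{\alpha+1}) = 2^{5}$. Then

$$s(2) = \varphi^{-5}(2^{\alpha}) + A(2) + \cdots + A(2^{\alpha}) + A(2^{\alpha+1}) = \varphi^{-5}(2^{\alpha+1}) 2^{\alpha+1} N(2^{\alpha+1}) = \varphi^{-5}(2^{\alpha}) 2^{\alpha+1}.$$

Part (c) follows immediately from (4.9) and Corollary 4.5.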

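Lemma 4.7. We have:

(a) $|A(p)| < 30 p^{-2}$ for all $p \nmid d$.

(b) $\prod_{p} s(p)$ converges absolutely and $\prod_{p} s(p) \gg \varphi^{-5}(d)\, d\, \sigma(d)$.

(c)

$$\sum_{\substack{q = 1 \\ (q, r) = 1}}^{\infty} \varphi^{-5}(dq/h) Z(q; \eta_0, \ldots, \eta_0) = \prod_{p \nmid r} s(p) = \sigma(d/(d, r))\, \frac{d}{(d, r)}\, \varphi^{-5}(d/(d, r)) \prod_{\substack{p \nmid r \\ p \nmid d}} s(p).$$

(d)

$$\sum_{q \ge y} \varphi^{-5}(dq/h) Z(q; \eta_0, \ldots, \eta_0) \ll y^{-1} d^{-3} \log^{30}(y + 1).$$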

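Proof. (a) Since $p \nmid d$, we have $h(p) = 1$; let $g$ be a quadratic nonresidue modulo $p$. Then

$$\begin{aligned}
A(p) &= \varphi^{-5}(p) \sum_{a = 1}^{p-1} e\Big( \frac{-Na}{p} \Big) \prod_{i=1}^{5} \sum_{c_i = 1}^{p-1} e\Big( \frac{a c_i^2}{p} \Big) \\
&= \frac{1}{2} \varphi^{-5}(p) \sum_{a = 1}^{p-1} \Big\{ e\Big( \frac{-N a^2}{p} \Big) \prod_{i=1}^{5} C_p(a^2) + e\Big( \frac{-N g a^2}{p} \Big) \prod_{i=1}^{5} C_p(g a^2) \Big\},
\end{aligned}$$

where $C_p(a) = \sum_{c=1}^{p-1} e(a c^2 / p)$. It is well known [1] that

$$C_p(1) = \lambda - 1, \quad \lambda = \begin{cases} \sqrt{p} & \text{if } p \equiv 1 \pmod 4, \\ i\sqrt{p} & \text{if } p \equiv -1 \pmod 4. \end{cases}$$

Furthermore $C_p(1) + C_p(g) = 2 \sum_{c=1}^{p-1} e_p(c) = -2$, hence $C_p(g) = -\lambda - 1$. Plainly

$$C_p(a) = \begin{cases} C_p(1) & \text{if } a \text{ is a quadratic residue modulo } p, \\ C_p(g) & \text{otherwise.} \end{cases}$$

We get

$$A(p) = \frac{1}{2} \varphi^{-5}(p) \times \begin{cases} (p - 1)((\lambda - 1)^{5} + (-\lambda - 1)^{5}) & \text{if } p \mid N, \\ (\lambda - 1)^{6} + (-\lambda - 1)^{6} & \text{if } p \nmid N \text{ and } \big( \frac{N}{p} \big) = 1, \\ (-\lambda - 1)(\lambda - 1)^{5} + (\lambda - 1)(-\lambda - 1)^{5} & \text{if } p \nmid N \text{ and } \big( \frac{N}{p} \big) = -1, \end{cases}$$

$\big( \frac{N}{p} \big)$ being the Legendre symbol. Hence $|A(p)| < 30 p^{-2}$ for $p \nmid d$. The 30 comes from explicit computation, but it is unimportant for our application.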
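The bound of Lemma 4.7(a) can be observed numerically by evaluating $A(p)$ straight from its definition as a sum of quadratic Gauss sums, without using the case formula. The sketch is illustrative: `e`, `A` and `primes_below` are hypothetical names, and only small primes are sampled.

```python
import cmath


def e(num, den):
    """e(num/den) = exp(2*pi*i*num/den), with num reduced modulo den first."""
    return cmath.exp(2j * cmath.pi * (num % den) / den)


def A(p, N):
    """A(p) for a prime p not dividing d, computed directly from (4.1) and (4.6)."""
    total = 0
    for a in range(1, p):
        C = sum(e(a * c * c, p) for c in range(1, p))   # C_p(a)
        total += e(-N * a, p) * C ** 5
    return total / (p - 1) ** 5


def primes_below(n):
    """Primes less than n by trial division (adequate for small n)."""
    return [m for m in range(2, n)
            if all(m % k for k in range(2, int(m ** 0.5) + 1))]


# A(p) depends on N only through N mod p, so N = 0, ..., p - 1 covers all cases.
worst = max(abs(A(p, N)) * p * p for p in primes_below(60) for N in range(p))
print(round(worst, 4))
```

The printed quantity is the largest observed value of $p^2 |A(p)|$ over the sampled primes, to be compared with the constant 30 of the lemma.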

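(b) By Lemmas 4.6 and 4.7(a), we have

$$\begin{aligned}
\prod_{p} s(p) &= \prod_{p \mid d} s(p) \prod_{p \nmid d} (1 + A(p)) \\
&\gg \varphi^{-5}(2^{\alpha(2)}) \sigma(d) 2^{\alpha(2)} \prod_{\substack{p^{\alpha(p)} \| d \\ p \ne 2}} \varphi^{-5}(p^{\alpha(p)}) p^{\alpha(p)} \prod_{p \nmid d} (1 - 30 p^{-2}) \\
&\gg \sigma(d) \varphi^{-5}(d) d.
\end{aligned}$$

The proof of the convergence is similar.

(c) Let $q = q' q''$, $(q', q'') = 1$ and $q' \mid d^{\infty}$, $(q'', d) = 1$. By the multiplicativity of $Z(q)$, we have

$$\sum_{\substack{q = 1 \\ (q, r) = 1}}^{\infty} \varphi^{-5}\Big( \frac{dq}{h} \Big) Z(q) = \Big\{ \sum_{\substack{q' = 1 \\ (q', r) = 1 \\ q' \mid d^{\infty}}}^{\infty} \varphi^{-5}\Big( \frac{d q'}{h} \Big) Z(q') \Big\} \Big\{ \sum_{\substack{q'' = 1 \\ (q'', r) = 1 \\ (q'', d) = 1}}^{\infty} \varphi^{-5}(q'') Z(q'') \Big\}.$$

Hence by Corollary 4.5, (4.9) and Lemma 4.6, we obtain the result.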

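(d) Let $\delta = (\log(y + 1))^{-1}$. We have

$$\begin{aligned}
\sum_{q \ge y} \varphi^{-5}(dq/h) Z(q) &\le \sum_{q \ge y} |A(q)| \le y^{-1+\delta} \sum_{q = 1}^{\infty} q^{1-\delta} |A(q)| \\
&\ll y^{-1} \prod_{p \mid d} \sum_{t = 0}^{\theta + \max\{\theta, \alpha(p)\}} p^{t(1-\delta)} |A(p^{t})| \prod_{p \nmid d} (1 + 30 p^{-1-\delta}) \\
&\ll y^{-1} \frac{d}{\varphi^{5}(d)} \prod_{p \mid d} p^{\alpha(p)} \prod_{p} (1 - p^{-1-\delta})^{-30} \\
&\ll y^{-1} \frac{d^{2}}{\varphi^{5}(d)} \delta^{-30},
\end{aligned}$$

since $1 + nx \ll (1 - x)^{-n}$ and $\zeta(1 + \delta) \sim \delta^{-1}$.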
