On attitude representations for optimization-based Bayesian smoothing

(1)

Delft University of Technology

On attitude representations for optimization-based Bayesian smoothing

Lorenz, Michael ; Taetz, Bertram; Kok, Manon; Bleser, Gabriele

Publication date 2019

Document Version

Accepted author manuscript Published in

Proceedings of the 22nd International Conference on Information Fusion

Citation (APA)

Lorenz, M., Taetz, B., Kok, M., & Bleser, G. (2019). On attitude representations for optimization-based Bayesian smoothing. In Proceedings of the 22nd International Conference on Information Fusion IEEE .

Important note

To cite this publication, please use the final published version (if applicable). Please check the document version above.

Copyright

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy

Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology.

(2)

On Attitude Representations for Optimization-Based

Bayesian Smoothing

Michael Lorenz

1

, Bertram Taetz

1

, Manon Kok

2

and Gabriele Bleser

1

Abstract—This paper presents a study of the behavior of seven different attitude representations in a batchwise orientation smoothing problem. The representations include the well-known unit quaternions and rotation vectors(/axis-angle), as well as modified Rodrigues parameters (MRPs). We consider error states as well as direct orientation formulations for the orientation and propose two methods to handle the singularity of MRPs in the latter case. The Bayesian smoothing problem is posed as a maximum a posteriori estimate with Gaussian noise, which results in a non-linear weighted least squares problem. With this we estimate the trajectory for a single inertial measurement unit with only sparse magnetometer samples for two challenging scenarios. In the evaluation we mainly focus on the convergence but also consider the estimation errors. Monte Carlo simulations show that orientation error states, with rotation vectors or MRPs, in general converge faster than other representations. However, with an initialization of the optimization problem up to deviations of 90 degrees MRPs with one of the proposed singularity handlings performs similar, but with a much smaller memory consumption. The source code for this study is available online3_.

Index Terms—Bayesian smoothing, optimization, orientation estimation, attitude representations, inertial sensors.

I. INTRODUCTION

In applications where orientation estimations are involved in optimization problems such as in visual/inertial SLAM [17], [20], inertial human motion capture [8], [13] or com-puter vision [22], [24] attitude representations need to be chosen wisely. Desirable properties for representations in an optimization-based context are

1) a minimal parametrization with a low runtime and low memory consumption

2) that allows for numerically stable estimates and 3) fast convergence of the numerical optimization. Often the choice of the applied representation is not the best suited one but rather the most known. The rotation vectors(/axis-angles) representation is rarely used in applica-tions in such fields. Euler-Rodrigues symmetric parameters or unit quaternions are the common parametrizations used for orientation estimation [18], [21]. However, in the field of optimization-based Bayesian smoothing the direct application is rather unsuited due to the unity constraint and the fourth dimension [9].

*This work was supported by the BMBF (16SV7115)

1_{Department of Computer Science, Technische Universit¨at Kaiserslautern,}

Kaiserslautern, Germanysurname@cs.uni-kl.de

2_{Delft Center for Systems and Control (DCSC), Delft University of}

Technology, Delft, The NetherlandsM.Kok-1@tudelft.nl 3

https://git.cs.uni-kl.de/lorenz/attitudes-for-optimization-based-bayesian-smoothing

The, in contrast to unit quaternions rather unknown, mod-ified Rodriguez Parameters (MRPs) are also an alternative representation. They have been applied in Bayesian filtering and smoothing approaches [7], [3], [16], [12]. Recently [23] addressed the performance of different attitude representations, specifically MRPs, for the task of orientation interpolation and optimization problems. However, the optimizations solved only Wahba’s problem, i.e. no time dependency is involved. The effect of the different parametrizations for batch-wise optimization-based Bayesian smoothing approaches has, to the best of our knowledge, not been given so far.

In this work we study the convergence behavior of different attitude representations in the context of a common orientation estimation problem. The investigations are conducted on the example of a single inertial measurement unit (IMU) with a magnetometer, where the orientation trajectory is obtained by numerical optimization. Our contributions are two-fold:

1) We show how the different representations perform and we give recommendations for the choice of the attitude representation for optimization-based Bayesian smoothing.

2) We suggest two strategies that enable the employment of MRPs for optimization-based Bayesian smoothing problems.

The work is structured as follows: After the problem of optimization-based Bayesian smoothing is stated for the general case in Section II, we introduce the orientation parametrizations to be studied and point out their character-istics in the context of optimization-based smoothing. Having the attitude representations we are able to specify the gen-eral formulation of the problem for the task of orientation estimation with an IMU and a magnetometer in Section V. In connection to this we introduce models and formulate the weighted least squares (WLS) problem to be solved. Given the WLS problem we conduct simulation experiments with the introduced orientation parametrizations for different scenarios and illustrate the outcomes in Section VI. The experiments are followed by a discussion in Section VII and the conclusions in Section VIII. In this work we use the words attitude, ori-entation and rotation interchangeably. We provide the source

code online3_.

II. PROBLEMSTATEMENT

When dealing with Bayesian smoothing the goal is to find the best estimate of the trajectory of the states

x1:N = {x1, . . . , xN} for N subsequent time instances given

the measurement vectors y1:N = {y1, . . . , yN} [6], [5].

© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

(3)

From a probabilistic view we want to obtain insight into the

posteriori distribution p(x1:N|y1:N). To obtain a point estimate

of the complete trajectory from the posteriori a natural choice is to compute the maximum a posteriori (MAP) estimate given as

ˆ

x1:N = argmax

x1:N

p(x1:N|y1:N) . (1)

Applying Bayes rule and the assumption that the underlying models obey the Markov property, as well as exploiting the fact that the log does not change the extremum of the estimate, we can rearrange the equation into the well-known form

ˆ x1:N = argmax x1:N p(x1) N Y t=1 p(yt|xt) N −1 Y t=1 p(xt+1|xt) N Y t=2 p(xt) = argmin x1:N − log p(x1) | {z } prior − N X t=1 log p(yt|xt) | {z } sensor models − N −1 X t=1 log p(xt+1|xt) | {z } dynamic models − N X t=2 log p(xt) | {z } regularizations . (2)

When the distributions are assumed to be Gaussian we obtain a weighted least square problem. Such batch estimation prob-lems can be iteratively solved using standard algorithms like the Gauss-Newton method, which is our choice of a solver in this work. The different attitude representations (given in Section III) introduce different non-linearities into the estimation problem. In order to enforce these non-linearities during the optimization, we induce only sparse magnetometer measurements.

III. ORIENTATION PARAMETRIZATIONS FOR

OPTIMIZATION-BASEDBAYESIAN SMOOTHING

In the following we introduce parametrizations commonly used in the context of optimization-based Bayesian smoothing and point out their advantages and disadvantages in this field. For a general understanding of these we advise the work of [12], [18]. Conversions between the individual represen-tations can also be found there.

Rotation matrices are commonly used only for computations in euclidean space to rotate a vector. A rotation parametrized

by rotation matrices RN B _{∈ R}3x3 _{or other representations}

from the sensor body coordinate frame B into the navigation frame N is indicated by the superscript N B. The inverse

rotation can be given as RN B−1

= RBN_{. To indicate a}

conversion from an attitude representation to another we use bold font. For instance R(z) signifies a conversion from an arbitrary attitude representation z to a rotation matrix. Rotation matrices in 3D have nine variables and subsequent rotations can be computed with matrix multiplications of the corresponding matrices [1]. Although they are unique and do not have singularities they are because of several constraints rather unsuited for orientation estimation tasks.

A. Unit Quaternions (Quat)

Unit quaternions qN B = [q0 qvT]T , qN B ∈ R4 :

||qN B_||2

= 1 are the common attitude representation used in data processing applications involving orientations [18], [12]. Although they need four values, which have to obey the non-linear unity constraint, to represent an orientation, they have a number of advantageous properties. Especially the algebraic treatment is rather easy and does not require nonlinear functions. Subsequent rotations can be expressed as quaternion multiplications which only include linear opera-tions. Furthermore, unit quaternions are continuous and do not have singularities [18], [23]. This renders them especially attractive for simulations, when a trajectory is integrated over time. However, in the field of orientation estimation they intro-duce some problems. A major drawback is the nonlinear unit constraint. When dealing with it in an optimization framework one solution is the application of constrained optimization methods to obey the unity constraint. However, such methods typically introduce more variables and thereby the need of additional computational resources [15].

In case of unconstrained optimization with an iterative solver, as proposed in this work, typically a post processing step after each optimization iteration has to be carried out and regular-izations can be added. Another drawback is that quaternions posses the ambiguity that one and the same attitude can be represented by the quaternion q as well as by the antipodal −q. Given a bad initial guess of the optimization this can slow convergence behavior as we will see in Section VI-B2. B. Rotation Vector (RV)

A rotation vector is typically represented as a single vector

aN B _{∈ R}3_{, which is a axis vector n ∈ R}3 _{scaled by an}

angle α ∈ R as aN B _{= α n. The algebraic treatment, such}

as the conversion to rotation matrices includes trigonometric functions. This has the advantage that for small angles the represented rotation matrices are nearly linear. Rotation vec-tors are ambiguous in the case of adding a multiple of 2π to the angle results in the same orientation representation. If the estimate is far away from the real trajectory the non-linearity as well as the ambiguity can cause a bad convergence behavior. This is especially the case when the initialization state for the optimization is far away from the optimum as illustrated later in Section VI-B2.

To proceed subsequent rotations it is best practice to transform first the rotation vector to unit quaternions and then perform the rotations using a quaternions multiplication as

q(aN B₂ ) q aN B₁ . (3)

The resulting unit quaternion has to be converted back to obtain the rotation vector representation.

C. Modified Rodrigues Parameters (MRPs)

Modified Rodrigues parameters χN B∈ R3_{can be seen as a}

trade-off between unit quaternions and a minimal parametriza-tion. They are obtained by stereographic projection of the hypersphere of unit quaternions onto a hyperplane of three

(4)

Fig. 1. The upper plot shows a trajectory of q0for 400 samples, which passes

by the projecting pole (minus identity quaternion). The lower plot shows the euclidean norm of the to the upper plot corresponding MRP. The x-axis depicts the distance in degrees to the projecting pole. Note the singularity of the MRPs at sample 200 for distance of 0°.

dimensions, lying at the quaternion qχ = [0 χ1χ2χ3]T. For

a stereographic projection a pole is needed from which the mapping originates. Usually the pole is chosen as minus the

identity quaternion qp = −[1 0 0 0 ]T_{. A useful property of}

MRPs is that the mapping is bijective except for the case of the pole [10], [18]. A further advantage is that the algebra of subsequent rotations incorporates rational functions. Hence the derivates of rotation matrices or quaternions w.r.t. to MRPs are rational expressions [23]. However, the main disadvantage of MRPs is that they are singular at the projecting pole which is illustrated in Fig. 1. In the worst case this leads to an abortion of the numerical optimization procedure.

Two subsequent rotations can be computed as with rotation vectors in equation (3) or directly as given in [18].

D. Error States (ES)

The concept of Error States in filtering known e.g. from the MEKF [4], [11] or sometimes also called linearisation [9], [2] combine the advantages unit quaternions and a minimal parametrization. The basic idea is that an orientation is repre-sented by two subsequent rotations expressed in terms of unit quaternions as

q(z) ˘qN B. (4)

The main rotation is expressed as a constant term by an unit

quaternion ˘qN B_{, called linearisation quaternion. The other}

ro-tation expressed by a minimal parametrization z ∈ R3_contains

only errors or deviations from the linearisation quaternion and is meant to have small amplitudes only. To obtain the actual attitude the state z is converted into a unit quaternion and multiplied by the linearisation quaternion. In our case the states z are rotation vectors (ES-RV) or MRPs (ES-MRP). Although one has to store seven variables (four of the quater-nion and three of the minimal representation), the advantages over other direct attitude representations are:

On one hand the easy mathematical treatment of quaternions can be used. On the other hand the unity constraint can be

avoided by performing computations solely on a minimal parametrization. The attitude can be given as a unit quaternion and computations can be proceeded on the state z.

This means when solving a WLS derived from the general optimization problem (2) with an iterative solver like the Gauss-Newton method each Gauss-Newton step contains only the three dimensional error states. After each iteration in a postprocessing step the linearisation quaternion has to be updated with the deviation of the error states, so that the error states remain small in amplitude. Looking from an implementation perspective the combined usage of different parametrizations and the postprocessing step can make error states unattractive.

IV. DEALING WITH THE SINGULARITY OFMRPS

As pointed out before MRPs posses promising charac-teristics but suffer from the singularity which might cause instabilities during the numerical optimization procedure. In literature there have been several suggestions to overcome singularity problems. One possibility is to use two MRPs representing one attitude with two subsequent rotations like with error states [19], [14]. Another suggestion is the usage

of the so called shadowed MRP set χN Bs . It is often used in

orientation filtering approaches [3], [7]. The main idea is to

switch the projecting pole to antipodal pole qp:= −qp, when

the estimated MRP passes a certain threshold on the norm. The shadowed set can than be computed as

χN B_s,t = − χ N B t ||χN B t ||22 . (5)

The singularity is thereby avoided and the bijective mapping to and from unit quaternions is preserved. However, this procedure introduces a discontinuity at the threshold. Also one has to keep track of the currently applied sets, which requires additional data structures. Since in a filtering approach only one time instance is estimated sequentially, this structure has to monitor only one time instance. In an approach where the complete trajectory is estimated in a batch, this structure has to keep track of the sets for all time instances. Especially, when computations of different MRPs sets are involved simple solutions of the data structure are hard to achieve.

In the following we present two strategies that enable the application of MRPs in a general framework of Bayesian optimization based smoothing.

A. Re-projecting (MRP-RP)

Two antipodal quaternions represent the same attitude. Since MRPs are a bijective mapping of unit quaternions, two MRPs represent the same attitude. Our suggestion for a first strategy is to reduce the set of MRPs solely to the quaternions having

q0 ≥ 0. We therefore omit the antipodal unit quaternion set

with q0< 0 for the attitude representation.

The procedure behind is similar to the one as with shadowed

sets. Every time a MRP reaches values of ||χ||22 > 1 it is

re-projected using equation (5). This can be done in a post-processing step after each optimization iteration. In contrast

(5)

to the shadowed set, we do not monitor which of the sets is currently used and therefore loose the bijective nature. Like with the shadowed set we introduce a discontinuity but avoid numerical instabilities due to the singularity.

B. Regularizing MRPs (MRP-RG)

As pointed out in [23], for MRPs with ||χ||2

2 > 21 the

corresponding q0 value of the mapped quaternion is very

close to −1. Hence MRPs with very high values of the norm represent quaternions very close to the projecting pole. In our second strategy we propose an avoidance of the singularity by regularizing the norm of the MRP. We allow MRPs to be in a certain neighbourhood of the pole but not to reach the pole. It is clear that in some cases we thereby impose an error to the attitude representation.

The advantage of optimization-based Bayesian smoothing is that it allows to add a regularization into (2). If the euclidean norm of a MRP exceeds a given threshold , they are penalized by the regularization term. Due to the quadratic nature of the resulting WLS problem we propose to use a polynomial penality term based on the function

ρ(χ) = ||χ||m2/m. (6)

Here the degree of the polynomial is m and should be chosen as a positive even number. Other regularization functions like exponentials are also possible but showed in pilot experiments a worse convergence behavior. In addition to guarantee that the optimization runs numerically stable, when a trajectory near the singularity is estimated, we suggest to include a post-processing step. In this step after each optimization iteration it is checked whether the norm of the states exceed the threshold . Affected MRPs are renormalized to the value .

V. MODELLING

To study how the estimations behave when applying the introduced orientation parametrizations in (2), we consider a standard orientation estimation problem using an inertial measurement unit and a magnetometer. In order to focus mainly on the convergence behavior we use only common models for the modelling and for the Monte Carlo simulations. Moreover, we make use of the assumption that the IMU does not perceive any specific acceleration. Thus the inclination w.r.t. the gravity remains observable. We also assume all measurements to be bias-free or at least bias-compensated. The applied models are introduced in terms of unit

quater-nions qN Bt . To obtain the equations for other parametrizations

conversions and subsequent rotations have to be replaced by the corresponding formulations.

A. Sensor Models

The sensor models allow to extract the measurements. All measurements are given in the body frame. The applied models are taken in accordance with [5].

1) Accelerometer sensor: Assuming no specific

accelera-tion is acting on the measurement unit the accelerometer can be modelled as

ya,t= −R(qN B_t )Tg + va,t. (7)

The earth gravity vector is denoted as g. The accelerometer measurement noise is assumed to be Gaussian distributed as

va,t∼ N (0, Ra).

2) Magnetometer sensor: Denoting the vector of the local

earth magnet field as m the magnetometer can be modelled as

ym,t= R(qtN B)Tm + vm,t. (8)

The magnetometer measurement noise is also assumed to be

Gaussian distributed as vm,t∼ N (0, Rm).

B. Dynamic Models

The dynamic model couples orientations over time. A subsequent orientation can be computed by rotating the current orientation about the integrated angular velocity, given by the gyroscope measurement yω,t, as

q_t+TN B = qN B_t q (T (yω,t+ wω,t)) . (9)

The sampling time is given by T . We assume the process noise

wω,t to be distributed as wω,t∼ N (0, Qω).

C. Prior

In our formulation we make use of a prior as suggested

in [9], [13], where the prior noise e1 is given in terms of a

rotation vector distributed as e1∼ N (0, P1).

D. Regularizations

As already mentioned in Section IV-B for the case of MRPs it is for certain attitude representations beneficial to have an additional regularization. Such regularizations can be included into (1) as priors on the attitude representation and can be modelled as

ec,t = ρ(qN Bt ) (10)

We denote the regularizing function as ρ and the noise as

ec,t which is assumed to be distributed as ec,t ∼ N (0, Eq).

The subscript of the covariance matrix indicates the used attitude representation. For the case of unit quaternions the unity constraint can be included as a regularization.

E. Resulting WLS Problem

Given the prior, the dynamic model, the sensor models and if needed the model of the regularization they can be solved for noises and then inserted into the general MAP formulation (2). Since all distributions of the models are assumed to be Gaussian we obtain the following WLS problem

ˆ x1:N = argmin x1:N ||e1||2_P−1 1 + N X t=1 ||va,t||2_R−1 a + NK X k=1 ||vm,k||2_R−1 m + N −1 X t=1 ||wω,t||2_Q−1 ω + N X t=1 ||ec,t||2_E−1 q ! = argmin x1:N V (x1:N) ₍₁₁₎

(6)

Algorithm 1: MAP estimate of orientation trajectory for a IMU with magnetometer

Input:

• Chosen attitude representation.

• Inertial and magnetometer data {yω,1:N −1, ya,1:N, ym,1:NK} and their covariance matrices {Qω, Ra, Rm}.

• Prior on orientation ¯pN B1 and the prior covariance P1. • If needed, parameter regularizations and covariance

regularization Ep

• Initial guess of x01:N for numerical optimization. Output:

Estimate of the orientation trajectory ˆx1:N. iter := 0

while termination criteria are not fulfilled do 1) compute Gauss-Newton step as in [15]

of costfunction (11) dxiter1:N ← −h ˜H V (x iter 1:N) i−1 ∇V (xiter 1:N), where ˜H V (xiter

1:N) is the approximated Hessian 2) perform backtracking line search as in [15]

and compute α 3) update step ˜ xiter+11:N ← x iter 1:N+ α dx iter 1:N

4) if postprocessing function pp defined then perform postprocessing xiter1:N ← pp(˜x iter 1:N) else xiter 1:N ← ˜xiter1:N end 5) iter ← iter+1 end

To maintain a comprehensibility, a notation with dependencies of the residuals on the current orientation estimates and measurements, i.e. with parentheses and arguments, is omitted. Note that we assume to have N accelerometer measurements

and only NK N magnetometer measurements. The

result-ing algorithm is summarized in Algorithm 1.

VI. SIMULATIONEXPERIMENTS

The presented attitude representations were compared in two different scenarios. The first scenario was a rather realistic scenario with randomly generated priors and trajectories. The second scenario was a particularly challenging scenario for parametrizations of MRPs and rotation vectors. For each sce-nario 100 Monte Carlo simulations were processed. Randomly generated Gaussian noise with zero mean and covariances according to Table I was added to the measurements of the IMU and the magnetometer. For each simulation a trajectory over N = 400 samples was estimated. The sampling time was T = 0.01 seconds. The IMU measurements were available for each time instance, but only one magnetometer measurement was available at the 200th sample. We compared quaternions, axis angles, error states with rotation vectors and MRPs as well as direct MRPs and their two singularity avoidance strategies. A prior on the first attitude sample was obtained by deviating the real initial orientation by five degrees about a random axis. The applied covariance matrices are summarized in Table I.

TABLE I

COVARIANCE MATRICES OF THE SIMULATED NOISE AND OF THE REGULARIZATION.

P1 Qω Ra Rm Eχ

5 I 1.6 10−5I 0.0015I 0.0015I I

For the determination of the regularization threshold of (6), we executed 1000 Monte Carlo Simulations as described in Section VI-A. The threshold was chosen equivalently to

the lowest value obtained for q0 of all trajectories with

:= 200. This equals a deviation of about 1.14 degrees from

q = [−1 0 0 0]T_{. The degree of the polynomial of (6) was}

chosen as m := 2.

We denote ei ∈ R3_{, ||e}_i||2 _{= 1 as randomly generated}

vectors as with entries in the range of [−1, 1], and uniformly

distributed numbers in the range of [0, 1] as αi∈ R.

Typically a state as initial guess of the optimization x0

1:N is chosen to be in proximity of the estimated trajectory. In practice this initial state can be obtained by e.g. dead-reckoning. Since in our case the ground-truth data was given, each initial state was obtained by deviating the real trajectory by a rotation vector d edev, where d is the angle. We studied deviations for d ∈ {15°, 30°, 60°, 90°, 120°, 150°, 180°}. A. Random Trajectories

This scenario was inspired by movements occurring when a flying drone or a robot is turning. The angular velocities consisted of a constant and time varying periodic part, as

ωt= (α1π + π/9) e1 + α2π

2 sin(α310π t) e2. (12)

1) Experimental Setup: The initial orientation of the

ran-dom trajectories was generated using a ranran-dom rotation vector

as ¯aN B₁ = 2πα0e0. To obtain the real trajectory we integrated

the angular velocity from the initial orientation over time using

(9). For this scenario we chose the initial trajectory x0_1:N to

be smooth. Therefore we deviated each time instance of the real attitude by the same rotation vector d edev.

2) Experimental Results: We ran 100 simulations for each

deviation. The results are summarized in Table II and III. If an optimization converged the root mean square error (RMSE) was on average about 0.20° and did not vary much. However, the runtimes per iteration, number of iterations needed to converge and the number of diverged optimizations differed. Since results concerning the performance of a script or program are strongly depended on the implementation, the outcomes regarding runtime should be considered with care. The average runtime per optimization iteration based on our available implementation is summarized in Table II.

TABLE II

AVERAGERUNTIME PER OPTIMIZATION ITERATION IN SECONDS FOR ALL SIMULATIONS OF THE SCENARIO OF RANDOM TRAJECTORIES

Quat RV ES-RV ES-MRP MRP MRP-RP MRP-RG

(7)

25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] Quaternion 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] RotationVector 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] ErrorState-RotVec 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] ErrorState-MRP 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] MRP 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] MRP-reprojected 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] MRP-regularized Random scenario

Fig. 2. Convergence behavior of different attitude representations for the scenario of random trajectories with two different initial deviations. The plots show the RMSE in degrees per optimization iteration for each attitude representation. The light blue lines correspond to case of d = 90° deviation and the light red lines to d = 180°. The dark red and dark blue dashed lines depict the mean value of all RMSE for each iteration of the results.

The runtime per iteration of a parametrization did not vary between different deviations. In our implementation most time was needed to evaluate the residuals of the problem and particularly to compute the numerical Jacobian, rather than to compute the inversion of the approximated Hessian in the Gauss-Newton step. For this reason unit quaternions with their computational simple algebra needed least runtime per iteration. MRPs were slightly slower, since in general more operations are needed. Rotation Vectors suffered from their trigonometric functions and needed most time per iteration compared to other direct parametrizations. Since Error States combine different states and therefore needed more operations during an iteration, they had the greatest runtime per optimiza-tion iteraoptimiza-tion. Again the Error States with Rotaoptimiza-tion Vectors are much slower than with MRPs.

For deviations up to 60 degrees all attitude parametrizations behaved similarly regarding their mean value of iterations required to converge but differed in certain aspects. The attitude representations with error states converged slightly faster than the other ones. The standard MRPs and their regularized version varied more in their number of iterations than the other attitude representations. Although in two cases the regularized MRPs reached a RMSE of less than 0.5 degrees the optimization did not fulfil the convergence criteria. This can be explained by the postprocessing step. In critical cases, the procedure forces the trajectory not to reach the optimum and thereby reaches the maximum number of iterations.

From 90 degrees initial deviation, the convergence behavior differed significantly. Fig. 2 illustrates these differences. For every attitude representation except error states the variance in iterations to convergence increased. Furthermore the attitude estimation with plain MRPs showed more and more cases of numerical instabilities which resulted in an abortion of the numerical optimization.

From initial deviations of over 120 degrees one could

observe the following: Error states with both parametrizations always needed less than 20 iterations to converge. Quaternions and re-projected MRPs converged but needed significantly more iterations than error states and the variance w.r.t. the number of iterations increased. For the extreme case of 180 degrees deviation, re-projected MRPs did not manage to converge for seven cases and quaternions for three cases. Since the RMSE over iterations for these cases was decreasing, it can be assumed that with a higher number of iteration, these cases would also have converged. For deviations over 150 degrees, the unit quaternion representation needed on average two iterations less re-projected MRPs but showed a higher variance of the number of iterations to converge.

Plain and regularized MRPs performed similarly. The regular-ized version needed slightly more iterations to converge but did not suffer as much from numerical instabilities. For plain MRPs, the number of diverged optimizations was significantly higher due to numerical instabilities.

For large deviations, rotation vectors started to show their nonlinear behavior. They needed much more iterations to converge than MRPs and more optimizations diverged with an increased initial deviation.

B. Investigating the singularity

This scenario was a purely artificial scenario and was only constructed to study the behavior of MRPs near the singularity and the effect of random initial optimization guesses.

1) Experimental Setup: The unit quaternion trajectory was

the same as depicted in Fig. 1 and passed exactly by the projecting pole of the MRPs. From an initial quaternion, the trajectory was obtained by integrating a constant angular

velocity ω = π[1 1 1]T using (9).

The additional challenge for MRPs was that the magnetometer

TABLE III

CONVERGENCE BEHAVIOR IN THE SCENARIO OF RANDOM TRAJECTORIES FOR DIFFERENT ATTITUDES

d Quat RV ES-RV ES-MRP

15° 5/0.3 n.c.: 0 5/0.3 n.c.: 0 5/0 n.c.: 0 5/0 n.c.: 0 30° 6/0.5 n.c.: 0 6/0.6 n.c.: 0 5/0.5 n.c.: 0 5/0.5 n.c.: 0 60° 6/0.6 n.c.: 0 8/1.0 n.c.: 0 6/0.7 n.c.: 0 6/0.6 n.c.: 0 90° 7/0.8 n.c.: 0 9/2.0 n.c.: 0 7/0.9 n.c.: 0 7/0.8 n.c.: 0 120° 8/1 n.c.: 0 13/4.6 n.c.: 0 8/0.9 n.c.: 0 8/0.9 n.c.: 0 150° 12/10 n.c.: 0 24/14 n.c.: 3 9/0.9 n.c.: 0 9/0.9 n.c.: 0 180° 24/20 n.c.: 3 50/24 n.c.: 32 11/1.7 n.c.: 0 11/1.8 n.c.: 0 d MRP MRP-RP MRP-RG 15° 6/2.7 n.c.:0 5/0.4 n.c.:0 6/2.4 n.c.:0 30° 8/4.1 n.c.:0 6/0.6 n.c.:0 8/3.7 n.c.:1 60° 9/3.0 n.c.:2 7/1.2 n.c.:0 9/5.7 n.c.:1 90° 11/5.9 n.c.:4 9/1.7 n.c.:0 12/6.3 n.c.:0 120° 13/7.7 n.c.:14 11/3 n.c.:0 16/11 n.c.:0 150° 15/6.6 n.c.:12 14/4.8 n.c.:0 17/8.3 n.c.:1 180° 28/18 n.c.:29 26/14 n.c.:7 31/19 n.c.:11

The structure of the entries is : m/s nc: w. The mean number of iterations to fulfil the convergence criterion is given by m . The variable s

denotes the standard deviation of iterations to converge for all hundred simulations. The number of not converged optimizations is indicated by w.

(8)

measurement was induced at the singularity. In contrast to the random scenario the initial guess of the optimization of each simulation was a non-smooth trajectory. It was obtained by deviating every real orientation by the same angle d but different random axes for each time instance d edev,t.

2) Experimental Results: Like in the case of the random

trajectories, the mean RMSE over all converged optimizations did not differ much. Except for MRPs and their regularized version, the mean value was about 0.069° with standard deviations in the range of [0.0054° , 0.0061°].

We obtained for the average runtime per iterations similar results as in the previous scenario. Regarding the convergence performance this scenario was very challenging for rotation vectors and MRPs. The reasons for this are explained in the following and the results are summarized in Table IV.

Like in the scenario from Section VI-A, error states showed the fastest convergence. However, error states with MRPs needed for deviations lager than 90 degrees more iterations to converge than the ones with rotation vectors, which is due to the singularity. For deviations up to 90 degrees, unit quaternions needed on average less iterations than re-projected MRPs and performed similarly to error states. For deviations of more than 90 degrees, re-projected MRPs performed better than quaternions. Since in this scenario every time instance of the initial guess was deviated by a random axis unit quaternions suffered from their ambiguity for large initial deviations. Although unit quaternions needed more iterations on average to converge, in contrast to re-projected MRPs, all optimizations managed to converge.

Rotation vectors performed well for deviations up to 30 degrees. The ambiguities and non-linear nature significantly limited the convergence performance of rotation vectors for more than 60 degrees deviation for the initial guess.

Plain MRPs suffered from the singularity in this scenario. Only 8 of 700 optimizations converged. All remaining optimizations aborted due to numerical instabilities caused by the singularity.

TABLE IV

CONVERGENCE BEHAVIOR IN THE SCENARIO OF TRAJECTORY OVER THE SINGULARITY OF THE MRP FOR DIFFERENT ATTITUDES.

d Quat RV ES-RV ES-MRP

15° 6/0.1 n.c.: 0 6/0.2 n.c.: 0 6/0.1 n.c.: 0 6/0.1 n.c.: 0 30° 7/0.4 n.c.: 0 9/2.9 n.c.: 1 7/0.5 n.c.: 0 7/0.4 n.c.: 0 60° 10/0.3 n.c.: 0 21/12 n.c.: 95 11/1.2 n.c.: 0 11/1.1 n.c.: 0 90° 17/2.0 n.c.: 0 - / - n.c.: 100 16/1.5 n.c.: 0 16/1.8 n.c.: 0 120° 28/2.7 n.c.: 0 - / - n.c.: 100 17/1.7 n.c.: 0 25/2.5 n.c.: 0 150° 27/2.5 n.c.: 0 - / - n.c.: 100 17/1.5 n.c.: 0 24/2.1 n.c.: 0 180° 26/2.4 n.c.: 0 - / - n.c.: 100 17/1.6 n.c.: 0 24/2.5 n.c.: 0 d MRP MRP-RP MRP-RG 15° 18/2 n.c.:92 6/0.4 n.c.:0 - / - n.c.:100 30° - / - n.c.:100 9/1.5 n.c.:0 - / - n.c.:100 60° - / - n.c.:100 14/6.0 n.c.:1 100/0 n.c.:99 90° - / - n.c.:100 17/4.6 n.c.:3 95/0 n.c.:99 120° - / - n.c.:100 15/3.8 n.c.:0 - / - n.c.:100 150° - / - n.c.:100 17/6.9 n.c.:3 - / - n.c.:100 180° - / - n.c.:100 19/11 n.c.:1 - / - n.c.:100

The structure of the entries is the same as in Table III.

25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] Quaternion 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] RotationVector 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] ErrorState-RotVec 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] ErrorState-MRP 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] MRP 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] MRP-reprojected 25 50 75 optimization iteration [1] 0 50 100 150 RMS Er ror [°] MRP-regularized Singularity scenario

Fig. 3. Convergence behavior of different attitude representations for the scenario of singular trajectory with a initial deviations of d = 90° and of d = 180°. The description of the plots is analogous to the one of Fig. 2.

Regularized MRPs managed to decrease the RMSE error, but among 900 simulations only two optimizations converged. The reason for this behavior is the magnetometer measurement which was induced at the singularity. We explicitly forced an avoidance of the singularity, but since the magnetometer measurement forced Gauss-Newton step to be on the trajectory to pass by the singularity, the optimization criteria had trouble to be fulfilled. The explained observations are illustrated in Fig. 3.

VII. DISCUSSION

Based on the previous experiments we have indications for the following recommendations.

When a robust representation with fast convergence behavior is desired and computational effort as well as implementation questions do not matter, error states are the representation of choice. As they benefit from their structure they converged fastest in all cases and did not suffer from numerical in-stabilities. Error states with rotation vectors only performed better than the ones with MRPs in the challenging scenario for deviations greater than 90°, but needed in general more time for each iteration. Excluding this special cases, both error states representations seem to be equally well suited.

Given an application which can guarantee that an initial guess of the optimization does not exceed a deviation of 90 degrees from the real trajectory, other choices are possible. For such applications either unit quaternions or re-projected MRPs are suited. In our implementation a unit quaternion parametrization was slightly faster regarding the average run-time per optimization than projected MRPs. However re-projected MRPs performed fastest between other minimal parametrizations. Although in some extreme cases they did not perform as robust as unit quaternions, on average they converged almost equally fast. In contrast to unit quaternions, re-projected MRPs do not have to obey the nonlinear unity constraint. This might be beneficial if in a rather general case where additional nonlinear regularizations or constraining

(9)

models are included into (2). Like error states, re-projected MRPs and unit quaternions did not suffer from numerical instabilities and always showed a clear converging behavior. Rotation vectors can be used when the initial guess of the trajectory is rather smooth and the maximum deviation of the real trajectory is lower 30° .

Plain MRPs and regularized MRPs should only be applied when it can be guaranteed that during the N subsequent time instances of the estimate the trajectory does not reach the singularity. This can be the case when N is chosen sufficiently small, the movements contain only moderate angular velocities or when the prior is sufficiently far away from the singularity. Regularized MRPs can handle singularity problems but they are not guaranteed to fulfil the convergence criteria even though the estimated trajectory might be near to the real one. We have conducted experiments with an implementation of MRPs using a conversion to unit quaternions or a direct for-mulation as mentioned in Section III-C. Both implementations yielded the same results.

VIII. CONCLUSIONS

In this work we have studied the behavior of attitude parametrisations for optimization-based Bayesian smoothing. The main focus has been on the convergence behavior when solving the resulting WLS problem with the Gauss Newton method, but also estimation errors have been considered.

Besides direct attitude representations as rotations vectors, unit quaternions and modified Rodrigues parameters, also error states with rotation vectors and MRPs have been investigated. We suggested two different strategies to handle the singularity problems of MRPs during numerical optimizations.

The Bayesian smoothing problem has been formulated as a maximum a posteriori estimate, which with the assumption of Gaussian distributions, results in a nonlinear weighted least squares problem. The problem is solved with the standard Gauss-Newton method and an additional backtracking line search. With this we have estimated the complete trajectory of a single IMU in a batch given measurements for all time instances. In order to impede the estimation problem and to force non-linearities, we induced only sparse magnetome-ter measurements. In two different, scenarios Monte Carlo simulations showed that error states with rotation vectors or MRPs converged fastest. For initial deviations from the real trajectory of up to 90 degrees, re-projected MRPs show a good compromise between robust and fast convergence behavior. They are easy to implement and allow for minimal computational effort.

We plan to conduct additional studies on covariance estimates and to improve the regularized MRPs to allow for application in standard optimization toolboxes without numerical instabil-ities.

We provide the source code online1_.

1

https://git.cs.uni-kl.de/lorenz/attitudes-for-optimization-based-bayesian-smoothing

REFERENCES

[1] S. L. Altmann, Rotations, Quaternions and Double Groups. Oxford University Press New York, 1986.

[2] M. Bloesch, H. Sommer, T. Laidlow, M. Burri, G. Nuetzi, P. Fankhauser, C. D. Bellicoso, C. Gehring, S. Leutenegger, M. Hutter, and R. Y. Siegwart, “A Primer on the Differential Calculus of 3D Orientations,” ETH Zurich, Report, 2016.

[3] J. Crassidis and M. Landis, “Attitude Estimation Using Modified Ro-drigues Parameter,” Proceedings of the Flight Mechanics/Estimation Theory Symposium, pp. pp.71 – 83, May 1996.

[4] J. L. Crassidis, F. L. Markley, and Y. Cheng, “Survey of Nonlinear Atti-tude Estimation Methods,” Journal of Guidance, Control, and Dynamics, vol. 30, no. 1, pp. 12–28, Jan. 2007.

[5] F. Gustafsson, Statistical Sensor Fustion. Studentlitteratur, 2012. [6] A. H. Jazwinski, Stochastic Processes and Filtering Theory. Academic

Press New Yor and London, 1970.

[7] C. D. Karlgaard and H. Schaub, “Nonsingular Attitude Filtering Using Modified Rodrigues Parameters,” The Journal of the Astronautical Sciences, vol. 57, no. 4, pp. 777–791, Oct. 2009.

[8] M. Kok, J. D. Hol, and T. B. Sch¨on, “An optimization-based approach to human body motion capture using inertial sensors,” Proceedings of the 19th World Congress of the International Federation of Automatic Control (IFAC), pp. 79 – 85, Aug. 2014.

[9] ——, “Using Inertial Sensors for Position and Orientation Estimation,” Foundations and Trends® in Signal Processing, vol. 11, no. 1-2, pp. 1–153, 2017.

[10] S. Marandi and V. Modi, “A preferred coordinate system and the associ-ated orientation representation in attitude dynamics,” Acta Astronautica, vol. 15, no. 11, pp. 833–843, Nov. 1987.

[11] F. L. Markley, “Attitude error representations for Kalman filtering,” Journal of Guidance, Control, and dynamics, vol. 26(1), pp. 311–317, 2003.

[12] F. L. Markley and J. L. Crassidis, Fundamentals of Spacecraft Attitude Determination and Control, ser. Space Technology Library. New York: Springer, 2014, no. 33, oCLC: ocn882605422.

[13] M. Miezal, B. Taetz, and G. Bleser, “On Inertial Body Tracking in the Presence of Model Calibration Errors,” Sensors, vol. 16, no. 12, p. 1132, Jul. 2016.

[14] D. Mortari, M. Angelucci, and F. L. Markley, “Singularity and Attitude Estimation,” in Annual AIAA/AAS Space Flight Mechanics Metting, 2000.

[15] J. Nocedal and S. J. Wright, Numerical Optimization. Springer Series in Operations Research, 2006.

[16] M. L. Psiaki, “Backward-Smoothing Extended Kalman Filter,” Journal of Guidance, Control, and Dynamics, vol. 28, no. 5, pp. 885–894, Sep. 2005.

[17] T. Qin, P. Li, and S. Shen, “Vins-mono: A robust and versatile monocular visual-inertial state estimator,” IEEE Transactions on Robotics, vol. 34, no. 4, pp. 1004–1020, Aug 2018.

[18] M. D. Shuster, “A Survey of Attitude Representations,” The Journal of the Astronautical Sciences, vol. 41, no. 4, pp. 439–517, 1993. [19] M. D. Shuster and S. D. OH, “Three-axis attitude determination from

vector observations,” Journal of Guidance, Control, and Dynamics, vol. 4, no. 1, pp. 70–77, 1981.

[20] Z. Sjanic, M. A. Skoglund, and F. Gustafsson, “EM-SLAM With Inertial/Visual Applications,” IEEE Transactions on Aerospace and Electronic Systems, vol. 53, no. 1, pp. 273–285, Feb. 2017.

[21] M. A. Skoglund, Z. Sjanic, and M. Kok, “On orientation estimation using iterative methods in Euclidean space,” in Proceedings of the 20th International Conference on Information Fusion (Fusion). Xi’an, China: IEEE, Jul. 2017, pp. 1–8.

[22] N. Snavely, S. M. Seitz, and R. Szeliski, “Modeling the World from Internet Photo Collections,” International Journal of Computer Vision, vol. 80, no. 2, pp. 189–210, Nov. 2008.

[23] G. Terzakis, M. Lourakis, and D. Ait-Boudaoud, “Modified Rodrigues Parameters: An Efficient Representation of Orientation in 3D Vision and Graphics,” Journal of Mathematical Imaging and Vision, vol. 60, no. 3, pp. 422–442, Mar. 2018.

[24] C. Wefelscheid, R. H¨ansch, and O. Hellwich, “Three-Dimensional build reconstruction using images obtained by Unmanned Arieal Vechicles ,” ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XXXVIII-1/C22, pp. 183–188, Sep. 2012.