Global Refinement of Positions and Uncertainties
------------------------------------------------
(H.L. McCallon 981012)
The following outline describes a plan to refine 2MASS positions and
position uncertainties taking advantage of global overlap information as
contained in the database. The position corrections are obtained by
"Martinizing" the position data. The name comes from the fact that
Martin Weinberg originally proposed this type of approach a number of
years ago. Prior to that step, however, the original position uncertainties
must be adjusted to compensate for the fact that frame position uncertainty
contribution is currently set to a constant in the pipeline processing.
The described approach to achieve the needed uncertainty adjustments
resulted from a number of discussions between myself and John Fowler
and has elements in common with the position "Martinizing".
Before getting into the position refinement itself, a method for
extracting the needed data from the database and setting it up for
efficient access is described. This is quite important given the huge
volume of data involved. The technique steps through the database a
tile at a time, computing what it needs and storing the results in
memory. After all the needed data is accumulated, the refinement
is done using arrays in memory.
The position uncertainties of point sources coming out of the pipeline
will be adjusted to make them reflect actual discrepancy dispersions
by multiplying their error variances by a factor denoted "variance
factor". These variance factors will be determined for a dozen Dec
segments in each scan.
Positions at the scan segment level will be adjusted by a global
minimization of scan overlap discrepancies weighted against
ACT reference star match information.
This refinement technique could be applied at any point in the survey,
up to and including full sky coverage. The greater the contiguous
coverage, the better the results. A test case, using prototype code and
data from only a single night (971116n), showed a significant improvement.
The proposed steps are as follows:
I. Step through tiles and, for the primary scan covering each, compute
difference information w.r.t. ACT's and overlapping scans.
A. Pull positions of the four corners for the primary scan from
the database.
B. Pull all sources from any scan within the database which have
high quality extractions and are enclosed by the primary-scan
four corner points.
C. Compute average position variances (RA and Dec) for primary
scan and load into the arrays VarRA and VarDec indexed by
tile. In most instances each will be the average of a group
of constants. This occurs because:
1) Only "high quality extractions" (I-B) are used.
2) PROPHOT applies a floor to the position uncertainties
which generally clips the high quality extractions
at a minimum allowed value.
3) The contribution of position reconstruction of the
frames to the original variance is constant. This,
of course, is what gives rise to the need for the
variance factors.
D. Match primary-scan sources to overlapping-scan sources and
to ACT's.
E. Divide matches from step I-D into 12 segments by Dec. These
will henceforth be referred to as "tile segments" and will
be the basic unit of position adjustment.
F. For each tile-segment overlap check existing entries in the
idtsl and idtsh arrays to see if the overlap combination has
already been processed. Note that each overlap combination
will show up twice but need only be processed once. Assuming
the combination has not already been processed, compute the
following variables, increment the overlap index "iol" and load
into the indicated arrays (otherwise skip to G):
1) Number of matches => nMCHo(iol)
2) Sums of match differences in RA => SdRAo(iol)
3) Sums of match differences in Dec => SdDECo(iol)
4) Sums of match difference squares in RA => SdRA2o(iol)
5) Sums of match difference squares in Dec => SdDC2o(iol)
In order to provide easy access to the overlap data two additional
arrays with the same indexing are maintained. These are
6) Unique ID of lower tile segment in overlap => idtsl(iol)
7) Unique ID of higher tile segment in overlap => idtsh(iol)
The terms "lower" and "higher" here simply refer to the tile-segment
ID's themselves. The sign of the difference sums (SdRAo and SdDECo)
should always reflect the position of the tile-segment with the
larger ID minus that with the smaller ID.
G. For each tile-segment identify the ACT matches, compute the
following variables and load to tile-segment indexed (its) arrays
as indicated:
1) Number of ACT matches => nMCHa(its)
2) Sums of ACT match differences in RA => SdRAa(its)
3) Sums of ACT match differences in DEC => SdDECa(its)
II. Prepare difference data for efficient access.
A. Generate two sets of keys allowing indexed access to the
overlap arrays (I-F) in:
1) ascending idtsl order => array "idxtsl"
2) ascending idtsh order => array "idxtsh"
B. Step through the idtsl overlap array in ascending order
using the idxtsl keys and build a tile-segment indexed
array "tsl1st" giving the location of the first occurrence
of that tile-segment in idts1 order. Repeat to build a
tile-segment indexed array "tsh1st" for first occurrence in
idtsh order.
Note: One can now access tile-segment unique data directly
via tile-segment index. The shared overlap data can be
obtained by first using the tsl1st key to jump into the
idxtsl array at the first occurrence of that tile-segment
then pulling in all data until idtsl changes. The
remaining overlap data can be obtained in the same manner
using the tsh1st key to jump into idxtsh array at the first
occurrence of that tile-segment then pulling in all data until
idtsh changes.
III. Compute variance factors for each tile segment. The RA and Dec
factors can be computed independently. The steps described are
applied to RA and then Dec to generate separate sets of variance
factors. Since the purpose of the variance factors is to allow
the contribution of the frame position uncertainties to vary,
only that portion of the original variance should be multiplied
by the variance factor. This can be achieved by representing
the adjusted variance as follows:
var_adj = var_orig + V0*(vfac-1)
where: var_adj = adjusted variance
var_orig = original variance
V0 = constant frame variance used in pipeline
vfac = variance factor
Note: A variance factor of one indicates no adjustment. A
variance factor of zero would reduce the contribution
of the frame uncertainty to zero.
A. The summation of chi-squares over n in each overlap can be
written as:
sum_chisq/n = (sum_dx^2)/{N*[Var1+V0*(vf1-1) +Var2+V0*(vf2-1)]}
where: sum_dx^2 = sum of overlap differences squared
N = number of matched points in overlap
vf1 = variance factor of first tile segment
vf2 = variance factor of 2nd tile segment
Var1 = average original variance from first tile (I-C)
Var2 = average original variance from 2nd tile (I-C)
V0 = constant frame variance used in pipeline
B. Form sum "S" of differences which, if driven to zero, would drive
the sum_chisq/n values to one. It will not, of course, be
possible to drive "S" to zero, but it can be minimized. Each term
in "S" is associated with an overlap and is weighted by the
number of matches in that overlap.
S =SUM{Nij*[(sum_dx^2)ij -Nij*(Vari+V0*(vfi-1) +Varj+V0*(vfj-1))]^2}
where i ranges over the tile segments and j over the overlaps
C. Form a set of simultaneous linear equations by taking the
partial derivative of S w.r.t. each of the tile-segment variance
factors in turn and setting each partial equal to zero.
D. Solve simultaneous linear equations above for the desired variance
factors which minimize the sum_chisq/n differences from one.
This step is complicated by the fact that for the entire sky
(limiting the northern survey to > -12 deg and the southern
to < +18 deg) there are 893364 tile-segments. Although the
matrix to be inverted is extremely large (893364x893364)
it is also extremely sparse with typically only about 2-3 non-zero
elements per row. The number of non-zero elements goes up near the
poles but still remains very small.
1. The matrix will have to be stored and solved using sparse
matrix techniques. IMSL has some routines which look
promising assuming they can handle a problem this
large. Need to check with Martin Weinberg on the upper
limits of the sparse matrix approach.
2. Should the matrix prove too large even for sparse matrix
techniques, an alternate approach might be to solve the
problem as described but with manageable subsets of the
entire sky. Each subset could be made up of a number of
adjacent Dec bands. This should leave things close enough
to finish off with iteration.
E. Iteration could proceed as follows:
1. Determine what variance factor for the primary tile segment
would make each of the overlap chisq/n values go to one.
Let vfvij represent the vote of overlap j on primary variance
factor i; then:
vfvij = [(sum_dx^2)ij/Nij -Vari -Varj]/V0 -vfj_lst +2
where vfj_lst is the variance factor for tile-segment j from
the last iteration
2. Compute a weighted average of the vfvij overlap votes to
obtain what a full-step value (vfi_fs) for the new iteration
of vfi would be.
vfi_fs = (sum Nij*vfvij)/(sum Nij)
where the sums are over the j overlaps.
3. Take a portion of the indicated full step to get the new value
for vfi:
vfi_new = vfi_lst +alf*(vfi_fs -vfi_lst)
where alf is the factor which specifies the fraction of the
full step to use. It usually helps to start alf out low
and let it approach one as the iteration converges.
F. Clip variance factors to prevent using unreasonably low variances
as weights in the position correction (step IV). Define VarMin to
be the minimum reasonable frame variance. If vfi*V0 adjusted positions and associated uncertainties
2 => adjusted uncertainties for non-adjusted positions
nTile = tile number
RA = uncorrected RA
DEC = uncorrected DEC
SigRA = original RA sigma
SigDEC= original Dec sigma
Outputs: (for nflag=1)
RAc = corrected RA
DECc = corrected Dec
SigRAa= adjusted sigma for corrected RA
SgDECa= adjusted sigma for corrected Dec
(for nflag=2)
RAc = uncorrected RA
DECc = uncorrected Dec
SigRAa= adjusted sigma for uncorrected RA
SgDECa= adjusted sigma for uncorrected Dec