
3. Overview of Extended Source Processor

The last major subsystem to run in the 2MASS quasi-linear data reduction pipeline is the extended source processor, referred to as GALWORKS. The primary role of the processor is to characterize each detected source and decide which sources are "extended", or resolved with respect to the point spread function (PSF). Sources that are deemed "extended" are measured further (mostly photometry) and the information is output to a separate table. In addition to the tabulated source information, a small "postage stamp" image is extracted for each extended source from the corresponding J, H and Ks atlas images. The source lists and image data are stored in the 2MASS extended source database; see flow schematic below.

The extended source database contains several classes of "extended objects", including real galaxies, galactic nebulae and pieces of sources with large angular structure, galactic H II regions, multiple stars (mostly double stars), artifacts (pieces of bright stars, meteor streaks, etc.) and faint (mostly pointlike) sources of unknown nature. For extended sources, the ultimate goal of the 2MASS project is to produce a reliable catalog of real extended sources, predominantly galaxies. Additional ‘post-processing’ steps are therefore necessary to eliminate non-galaxies such as double stars. In section 4, we discuss in detail how the star-galaxy separation process is performed. For the GALWORKS processor, the emphasis is placed primarily on completeness; that is, we want to comprehensively detect and identify extended sources (re: galaxies) brighter than the level-1 specification limits, or K ~ 13.5, H ~ 14.3 and J ~ 15. Later, in the post-processing phase, the galaxy completeness will be relaxed (but still kept within the level-1 specifications) in order to achieve the desired reliability in our galaxy catalog. 2MASS is an all-sky project that will acquire over 10 Tb of data over its lifetime, which places severe runtime restrictions on the pipeline reduction software. Consequently, one important caveat is that most of the GALWORKS algorithms and flow structures were designed specifically to run as fast and as efficiently as possible.

By the time GALWORKS is run in the 2MASS pipeline, point sources have been fully measured (i.e., position refinement and photometry) and band-merged, coordinate positions calibrated, coadd images (re: atlas images) constructed, and the time-dependent PSF characterized. GALWORKS does, however, generate PSF ridgelines (see section 2.4) on finer time scales, depending on the number of sources available from the scan (i.e., the stellar number density); these are then used to parameterize sources and perform basic star-galaxy discrimination. The high-level steps that make up GALWORKS are: (1) removal of bright stars (and their associated features), (2) removal of large (>5’) cataloged galaxies, (3) atlas image background removal, (4) source parameterization and attribute measurements, (5) star-galaxy discrimination (discussed at length in section 4), (6) refined photometric measurements, and (7) source extraction; see flow schematic below.

The background removal operation is a particularly crucial step, since both parameterization (re: star-galaxy discrimination) and photometry rely upon predictable (e.g., zero, smooth and flat) background levels. This operation is described in detail below (section 3.3). Steps 4-6 are designed to isolate ‘normal’ galaxies and other relatively high surface brightness extended sources. There are, however, other kinds of extended sources that 2MASS is capable of detecting, including bright galactic (extended) stars (H II regions, T Tauri stars, etc.) and low surface brightness galaxies (and faint galactic nebulae).

3.1 Bright Extended (Fuzzy) Stars

Bright fuzzy stars are identified as such using a separate algorithm within GALWORKS. The basic method is to look for emission in and around the source at levels elevated above that expected for a bright star (characterized by the PSF). After nearby stars have been masked and the source itself has been removed (assuming the shape of the PSF), the remaining or residual emission is measured with respect to the mean background level of the coadd. A more effective indicator of enhanced emission is the root mean square of the residual emission, computed both about the mean background and about a zero background (i.e., assuming the true background level is zero). The rms values are then normalized by the measured noise for the coadd (atlas image) as a whole. Stars with associated emission (e.g., reflection nebulae) will clearly stand out compared to quiescent stars. This operation is referred to as the "bright extended source" processor. Extracted sources are stored in the 2MASS database, with a special catalog to be released at a later date. The completeness and reliability of this supplemental catalog are unknown at this time, and there are no set requirements for these kinds of objects. Examples of sources found with the technique are shown in Figure 1, from scans crossing the Orion Trapezium and the Large Magellanic Cloud. The upper row shows J-band postage stamp images, the middle row the H-band and the bottom row the K-band images. The integrated fluxes of the example sources range from 5th to 7th mag.
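The two rms indicators described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the GALWORKS code: the function name, its arguments and the use of a boolean mask for usable pixels are all assumptions made here.

```python
import numpy as np

def residual_rms_scores(cutout, psf_model, good, mean_background, noise):
    """Return (rms about the mean background, rms about zero), both in units of the coadd noise.

    cutout          : 2-D image section centred on the bright star
    psf_model       : 2-D PSF scaled to the star, same shape as cutout (hypothetical input)
    good            : boolean array, True for usable pixels (neighbouring stars masked out)
    mean_background : mean background level of the coadd
    noise           : measured pixel noise of the coadd as a whole
    """
    # Remove the star itself, keeping only unmasked pixels.
    residual = (cutout - psf_model)[good]
    rms_vs_mean = np.sqrt(np.mean((residual - mean_background) ** 2))
    rms_vs_zero = np.sqrt(np.mean(residual ** 2))
    return rms_vs_mean / noise, rms_vs_zero / noise
```

A quiescent star should give scores near unity in real (noisy) data, while a star embedded in nebulosity gives scores well above that; any actual threshold would have to be tuned on real scans.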

3.2 Low Central Surface Brightness Galaxies

Low surface brightness galaxies (e.g., dwarf ellipticals) present a different challenge to GALWORKS than the average ‘normal’ galaxy that 2MASS encounters. They are typically very faint (as measured in a standard aperture for ‘normal’ galaxies) and they do not have well-defined cores; see Figure 2 for examples of typical low central surface brightness galaxies found within 2MASS. The integrated fluxes of the example sources range from J=15 to 15.6 and K=13.8 to 15.2. The LSB galaxy nature of these sources was confirmed with deep optical images.

The galaxy core is an important component for star-galaxy separation, since many of the parametric measurements used for star-galaxy separation are anchored to the core or vertex of the galaxy. Because the primary 2MASS detector used to find point sources and galaxies alike is not specifically designed to find low surface brightness galaxies, a special detection step is necessary. The low central surface brightness detector (referred to as the LCSB processor) is executed last in the chain of operations that comprise GALWORKS (see flowchart). The input to the LCSB processor is a fully cleaned coadd image (per band), in which stars (brighter than some limit, typically K = 14.5) and galaxies have been entirely masked. The image is then blocked up (using three independent kernel sizes: 2X2, 4X4 and 8X8) and ‘boxcar’ smoothed to increase the signal to noise ratio for large (but faint) galaxies normally hidden in the 1" pixel noise. A block average is not the optimal method (compared to a Gaussian convolution, for example), but given the pipeline runtime constraints it is by far the best option.
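The blocking and boxcar-smoothing step can be illustrated with a short Python sketch. It is not the pipeline implementation: the function names, the 3-pixel boxcar width and the edge handling are assumptions chosen for the example, and masked pixels are ignored for brevity.

```python
import numpy as np

def block_average(image, n):
    """Average non-overlapping n x n blocks (n = 2, 4 or 8 in the text).

    Trailing rows/columns that do not fill a whole block are dropped.
    """
    ny, nx = image.shape
    ny, nx = ny - ny % n, nx - nx % n
    trimmed = image[:ny, :nx]
    return trimmed.reshape(ny // n, n, nx // n, n).mean(axis=(1, 3))

def boxcar_smooth(image, width=3):
    """Mean filter with a width x width box (width odd, an assumed default).

    Edges are handled by replicating the border pixels.
    """
    pad = width // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(width):
        for dx in range(width):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / width ** 2
```

For uncorrelated pixel noise, averaging n x n pixels lowers the noise by a factor of n, so a faint source that fills the block gains in SNR while a star confined to a pixel or two is diluted.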

The detection step consists of 3-sigma threshold isolation of local maxima or bumps in the blocked-up cleaned images. Source detections are then parameterized (using the blocked and smoothed image), with the primary measurements being: signal to noise ratio of the peak pixel, radial extent ("sh" score), integrated signal to noise, surface brightness, integrated flux, and SNR measurements using a J+H+K_s combined "super" image. The "super" image, in principle, provides the best medium in which to find faint LSB galaxies, given the effective increase in the signal to noise ratio. In practice, the "super" image only increases the SNR by approximately sqrt(2), owing to the significant background levels (which map directly into noise) at H and K, and assuming normal (i.e., J-K = 1) galaxy colors. Nevertheless, the peak SNR measurement coming from the blocked/smoothed "super" coadd provides the best discriminant between LSB galaxies and faint stars. Stars will have a relatively low SNR because most of their light is confined to a few pixels, which are subsequently smoothed away by the blocking and boxcar-smoothing step. The signal from galaxies, on the other hand, will add up since their light is distributed over a large area.
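A minimal Python sketch of the threshold-plus-local-maximum detection is given below. It assumes a background-subtracted image with a single noise value and an 8-neighbour definition of a local maximum; both are assumptions of this example rather than documented GALWORKS choices, and the function name is hypothetical.

```python
import numpy as np

def detect_local_maxima(image, noise, nsigma=3.0):
    """Return (row, col) of interior pixels that exceed nsigma * noise
    and are strictly brighter than all eight neighbours."""
    core = image[1:-1, 1:-1]
    is_peak = core > nsigma * noise
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # Same-sized view of the image shifted by (dy, dx).
            neighbour = image[1 + dy:image.shape[0] - 1 + dy,
                              1 + dx:image.shape[1] - 1 + dx]
            is_peak &= core > neighbour
    rows, cols = np.nonzero(is_peak)
    return list(zip((rows + 1).tolist(), (cols + 1).tolist()))
```

In use, the function would be applied to each of the blocked and smoothed images in turn, with the peak SNR of each detection recorded for the later cuts.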

The preliminary results for the LCSB processor reveal a reliability rate of about 80% using a threshold on the ‘maximum’ SNR (among the 2X2, 4X4 and 8X8 blockings) of the "super" coadd. The major contaminants are faint stars and diffuse emission associated with bright stars. However, if a meteor streak (or other transient phenomenon) passes through the coadd, then numerous false sources are picked up as LSB galaxies. Improving the reliability of sources coming from the LSB detector remains a future task. It is important to note that these sources are nearly always fainter than the level-1 specifications (K > 13.5, J > 15), which means that there are currently no requirements on their completeness and reliability. Note that the faintness level of the LSB detector implies that LSB galaxies that are bright in integrated flux are already being detected and processed by the ‘normal’ or standard 2MASS/GALWORKS operation (see below). Thus, we do not anticipate any completeness failure for LSB galaxies brighter than K ~ 13.5. The fainter LSBs, however, will have to be detected and processed with the LCSB processor described here. A special catalog of faint LSB galaxies will be released at a later date. The reliability of this catalog is to be determined. Further information and some early science results with 2MASS LSB galaxies can be found in Jarrett (1997) and Schneider et al. (1998).

3.3 Atlas Image Background Removal

In the near-infrared, the background "sky" emission has structure at all size scales, primarily due to upper atmospheric aerosol and hydroxyl emission (the so-called "airglow" emission; cf. Ramsey et al. 1992, MNRAS, 259, 751). The OH emission is the dominant component of the J (1.3 μm) and H-band (1.7 μm) backgrounds, while thermal continuum emission comprises the bulk of the K (2.2 μm) background; consequently, the J and H images tend to have more background ‘structure’. At times of severe airglow incidence, the background can have relatively high frequency (tens of arcseconds) features that resemble extended sources (re: galaxies), thus triggering false positive extended source detections. For the most part, however, the background variation in a given image (size 8.5’ X 16’) is smooth and can be modeled with a cubic polynomial. A third order polynomial is a good compromise between a planar fit (too stiff) and spline waves (too yielding). For extended sources, the primary objective of the 2MASS project is to find and characterize galaxies (and other extended objects) smaller than ~3’ in diameter. We therefore attempt to remove airglow features slightly larger than this limiting size scale, to minimize random and systematic photometric error from non-zero background structure. When the airglow frequency is higher than we can remove, the photometry (particularly at H band) is severely compromised and the quality of the data is downgraded accordingly.

The background removal process is applied separately to the J, H, and K coadd images (512 X 1024 pixels each). Given the "cross-scan" size of one coadd image, a cubic polynomial, ax³ + bx² + cx + d, provides an effective model for smooth background variations larger than ~3’. Moreover, the "in-scan" size (1024") allows a cubic fit to each half of the coadd (lower 512", upper 512"). In addition to fitting a cubic polynomial to each half of the coadd, we also apply a fit to the "central" 512X512 pixels in order to smoothly ‘join’ the boundaries of the two background solution fits. The final 512X1024 solution is generated from a weighted average of the three 512X512 block solutions.

The fitting procedure is preceded by an image "clean" operation. Stars and catalogued galaxies are masked from the image. Very bright stars (K < 6) require more complicated masking, including removal of their bright reflection halo, diffraction spikes, horizontal streaks, glints and persistence ghosts. Finally, in order to minimize contamination from faint stars and objects that escaped the masking procedure, we median filter the coadd with an 8X8 pixel filter (thus degrading the resolution to 8" chunks). The fitting schematic is illustrated in Figure 3. The 512X1024 coadd is represented by a thick-lined rectangle. Cubic fits are applied to the lower half, 512 X [1:512] pixels, the upper half, 512 X [513:1024] pixels, and the central half, 512 X [257:768] pixels, where we have first resampled the data with an 8X8 median filter.
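The 8X8 median resampling of a masked coadd can be sketched as follows. This is an illustrative Python sketch under stated assumptions: masked pixels are represented by a boolean array (True = masked), fully masked blocks are returned as NaN, and the function name is hypothetical.

```python
import numpy as np

def median_block_filter(image, mask, n=8):
    """Median of the unmasked pixels in each non-overlapping n x n block.

    mask is True where a pixel has been masked (stars, catalogued galaxies).
    Blocks with no usable pixels are returned as NaN.
    """
    ny, nx = image.shape
    data = np.where(mask, np.nan, image.astype(float))
    data = data[:ny - ny % n, :nx - nx % n]
    # Gather each block's n*n pixels along the last axis.
    blocks = (data.reshape(ny // n, n, nx // n, n)
                  .swapaxes(1, 2)
                  .reshape(ny // n, nx // n, n * n))
    out = np.full(blocks.shape[:2], np.nan)
    has_data = ~np.all(np.isnan(blocks), axis=2)
    out[has_data] = np.nanmedian(blocks[has_data], axis=1)
    return out
```

Applied to a 512 X 1024 coadd this yields a 64 X 128 grid of background samples, in which an isolated faint star that escaped masking has little influence on the block median.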

Using a least-squares technique, a cubic polynomial is then iteratively fit, with 3σ rejection, to each line (of the 512X512 block, with 8X8 median filtering). The line solutions are then used as input to the next step, in which we fit a cubic polynomial to each column, thereby coupling the line and column background solutions over the area of the block. The three block solution images are combined with a (1/Δr) weighting scheme. Here Δr refers to the relative radial ("in-scan") difference between any two given block solutions from some reference point. There are three "in-scan" reference points: 256, 512 and 768. So, for example, combining the lower and central blocks at some point, Y’, gives the respective weights [1 / (256 – Y’)] and [1 / (512 – Y’)]. With this technique we are able to smoothly combine the three independent solutions per coadd image. Note, however, that the "boundary" solutions for the upper and lower blocks are better constrained near the center of the image, owing to the weighted addition of the central block solution image. Conversely, the background solutions are not as well determined at the upper, >900, and lower, <128, "in-scan" image extremes.
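The line-then-column fitting and the inverse-distance blending can be sketched in Python as below. The sketch makes several assumptions that are not taken from the pipeline: `numpy.polyfit` is used for the least-squares step, rejection is capped at five iterations, and the blend weights use the absolute distance to each reference point and are normalized to sum to one. All function names are hypothetical.

```python
import numpy as np

def clipped_cubic_fit(x, y, nsigma=3.0, max_iter=5):
    """Least-squares cubic fit to y(x) with iterative nsigma rejection."""
    finite = np.isfinite(y)
    keep = finite.copy()
    coeffs = np.polyfit(x[keep], y[keep], 3)
    for _ in range(max_iter):
        # Non-finite samples get an infinite residual so they are never kept.
        resid = np.where(finite, y - np.polyval(coeffs, x), np.inf)
        sigma = resid[keep].std()
        new_keep = np.abs(resid) <= nsigma * sigma
        if np.array_equal(new_keep, keep) or new_keep.sum() < 4:
            break
        keep = new_keep
        coeffs = np.polyfit(x[keep], y[keep], 3)
    return coeffs

def fit_block_background(block):
    """Cubic fit along each line, then along each column of the line solutions."""
    ny, nx = block.shape
    x = np.arange(nx, dtype=float)
    y = np.arange(ny, dtype=float)
    lines = np.array([np.polyval(clipped_cubic_fit(x, row), x) for row in block])
    cols = np.array([np.polyval(clipped_cubic_fit(y, col), y) for col in lines.T])
    return cols.T

def blend_weights(y_prime, ref_a, ref_b):
    """Normalized 1/|distance| weights of two block solutions at in-scan position y_prime."""
    wa = 1.0 / max(abs(ref_a - y_prime), 1e-6)
    wb = 1.0 / max(abs(ref_b - y_prime), 1e-6)
    return wa / (wa + wb), wb / (wa + wb)
```

With reference points at 256 and 512, a position midway between them (Y’ = 384) receives equal weight from the lower and central solutions, and the weight shifts smoothly toward whichever reference point is nearer.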

Representative performance of the background removal operation is shown in Figures 4–6. The image data come from a fairly typical ‘photometric’ Northern Hemisphere night, although the "airglow" emission was rather severe during the period in which these data were acquired (see H-band, Fig 5). The figures show the raw image coadd, the resultant background solution and the residual (background subtracted) image. The gray-scale stretch ranges from -2σ to 5σ of the mean background level. The J and K raw images (Fig 4, 6) reveal fairly low level (smooth, but non-linear) background variations, while the corresponding residual images show very little (if any) background structure. However, airglow emission is much more prevalent in the H-band (Fig 5), with size scales smaller than ~1-2’, as is evident in the residual image. It is this residual structure in the background (with amplitude >10% of the mean background noise) that can induce systematics in the photometry, parameterization (e.g., azimuthal ellipse fitting), and reliability.

3.4 Source Parameterization and Shape-Attribute Measurements

Preliminary flux estimates come from the point source processor, which uses a characteristic PSF to derive total fluxes (assuming a point-like flux distribution). These measures are not optimal for extended sources since they systematically underestimate the flux. Hence, one of the first tasks for GALWORKS is to deduce the nature of a source using some simple radial profile attributes. The median radial shape, or "msh", is both easy to compute (re: fast runtime) and a robust discriminator between stars/double stars and galaxies (see section 4 for more details). Applying a threshold to the "msh" measure for each source (per band) eliminates a large fraction of the sources that would otherwise require more exhaustive testing for star-galaxy separation. It also provides a measure of the "extendedness" of a source: if the source has a high probability of being a galaxy (i.e., a large "msh" score), then its total flux is re-estimated using a fixed R=10" circular aperture.
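A fixed R=10" circular aperture measurement can be sketched in Python as follows. The sketch assumes a background-subtracted image with 1" pixels and counts whole pixels whose centers fall inside the aperture (no partial-pixel weighting); the function name and arguments are hypothetical.

```python
import numpy as np

def circular_aperture_flux(image, x0, y0, radius=10.0, pixel_scale=1.0):
    """Sum the pixels whose centres lie within `radius` arcsec of (x0, y0).

    x0, y0      : source position in pixel coordinates (column, row)
    pixel_scale : arcsec per pixel (1" assumed here)
    """
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0) * pixel_scale
    return float(image[r <= radius].sum())
```

On a uniform image of unit intensity the result approaches the aperture area, pi R² ≈ 314 square arcsec for R=10", which gives a quick sanity check of the geometry.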

Before the more time-consuming image attribute measurements are performed per source (e.g., elliptical shape fitting; adaptive aperture photometry), it is necessary to perform additional star-galaxy separation tests, particularly when the stellar number density is very high (i.e., galactic latitude, glat, < 10 deg). Thresholds on the "sh", "wsh", "r1", and "r23" radial shape attributes (see section 4) are applied to eliminate most non-extended sources (re: stars and double stars) from the scan/tile source list. For high glat fields, the remaining sources (in a typical scan) are mostly real galaxies intermixed with a few double stars, one or two isolated stars and faint objects whose nature is unknown. In quantitative terms, the reliability is from 50 to 80% at this juncture, and thus the star-galaxy separation process has reduced the ratio of stars to galaxies from 10:1 to approximately 1:1.

The orientation of disk (spiral) and spheroidal (elliptical) galaxies is estimated using a 2-D ellipse fit to a single isophote; the fit is used to compute various forms of aperture photometry (e.g., Kron, isophotal, etc.) and the symmetry parameters used for star-galaxy separation. Although galaxies can change orientation (e.g., ellipticity and position angle) with radius, the 2MASS pixel undersampling and runtime constraints limit measurements to one or two isophotes. Moreover, most 2MASS galaxies are small (<15"), which is not much larger than the actual angular resolution of ~2". It is to our advantage that, in the near-infrared, galaxies appear for the most part symmetric about the major axis, so using one isophote as representative of the average orientation is a good approximation. In order to minimize the influence of the PSF orientation (which is typically non-circular), the representative orientation isophote corresponds roughly to 3σ pixel values, or 20.09 mag at J, 19.34 mag at H and 18.55 mag at K. A cautionary note: like all isophotes used in 2MASS pipeline processing, these are uncalibrated magnitudes, i.e., prior to the adjustment of ~0.1 to 0.2 mag derived in the later calibration processing step. Consequently, the isophote at which the 2-D elliptical parameters are derived can vary from 2.6σ to 3.7σ, depending on the calibration correction.
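The spread in isophote level follows from the magnitude-flux relation. As a worked example, assume (the sign convention is an assumption made here, not stated in the text) that a zero-point adjustment of Δm rescales the intensity of the fixed isophote, relative to the pixel noise, by a factor of 10^(0.4 Δm). The nominal 3σ level then becomes:

```latex
n_\sigma = 3 \times 10^{0.4\,\Delta m}, \qquad
\Delta m = -0.15 \;\Rightarrow\; n_\sigma \approx 2.6, \qquad
\Delta m = +0.15 \;\Rightarrow\; n_\sigma \approx 3.4, \qquad
\Delta m = +0.20 \;\Rightarrow\; n_\sigma \approx 3.6 .
```

Adjustments of ~0.1 to 0.2 mag therefore move a nominal 3σ isophote over roughly the range of levels quoted above.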

The ellipse-fitting method was designed to run fast and to minimize confusion from nearby sources (i.e., stars) and correlated noise features that form ‘extended’ limbs and other ‘disconnected’ false features. Three elliptical parameters are derived from the isophote: axis ratio (b/a), position angle, φ (standard reference frame, east of north), and a goodness of fit. The goodness of fit is defined as follows:

where (r_semi)_i is the semi-major axis distribution for a given (axis ratio, φ) solution. That is to say, for each point along the isophote, the solution ellipse (axis ratio, φ) gives the corresponding major axis radius. If the ellipse (b/a, φ) is perfectly matched to the isophote, the variance in r_semi is identically zero. Therefore, the best-fit ellipse is found by minimizing the standard deviation of the distribution relative to its mean radius. In this fashion, the elliptical parameters were derived for each band. An additional fit was performed on the combined (J+H+K_s) "super" image. The "super" coadd represents the optimum signal to noise representation of the galaxy, assuming normal galaxy colors and minimal reddening. Accordingly, the derived "super" coadd 2-D ellipse serves as the "default" shape for cases in which the individual band flux is fainter than some given limit (14.4 at J, 13.9 at H and 13.5 at K), or the SNR of the galaxy is less than 5, based on the R=10" fixed circular aperture photometry. If the derived semi-major axis is less than 5" or greater than 70", the source is assumed to be round and the parameters are set accordingly. If the derived axial ratio is less than 0.10, the ellipse fit parameters are set to the corresponding fit from the "super" coadd. Finally, the "super" coadd values are also used when the individual band fit is, for one reason or another, not possible (e.g., when masked pixels are present within 1" of the peak pixel).
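The minimization described above can be sketched in Python as a brute-force grid search. The sketch is only an illustration: it assumes the goodness of fit is the standard deviation of (r_semi)_i divided by its mean (consistent with the description, but an assumption as to the exact expression), it measures the angle from the image x-axis rather than east of north, and the grid spacing and function names are hypothetical.

```python
import numpy as np

def semi_major_radii(x, y, axis_ratio, phi_deg):
    """Semi-major axis implied by each isophote point for a trial (b/a, phi).

    x, y are isophote pixel offsets from the source centre.
    """
    phi = np.radians(phi_deg)
    u = x * np.cos(phi) + y * np.sin(phi)    # offset along the trial major axis
    v = -x * np.sin(phi) + y * np.cos(phi)   # offset along the trial minor axis
    return np.hypot(u, v / axis_ratio)

def goodness_of_fit(x, y, axis_ratio, phi_deg):
    """Assumed metric: scatter of r_semi relative to its mean (0 for a perfect ellipse)."""
    r = semi_major_radii(x, y, axis_ratio, phi_deg)
    return r.std() / r.mean()

def fit_ellipse(x, y,
                ratios=np.arange(0.10, 1.001, 0.05),
                angles=np.arange(0.0, 180.0, 5.0)):
    """Return (axis ratio, angle in degrees, goodness of fit) of the best grid point."""
    best = min((goodness_of_fit(x, y, q, p), q, p) for q in ratios for p in angles)
    return best[1], best[2], best[0]
```

A coarse grid such as this keeps the runtime small and bounded, which matches the stated design goal; a finer grid or a local refinement around the best grid point would trade speed for precision.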

A final note regarding the ellipse fitting operation relates to stellar masking. For bright galaxies (K < 12.5) in which the inclination is large (>40 deg), there are apt to be multiple point source detections strung across the disk of the galaxy, falsely induced by the sharp intensity gradient of the disk with respect to the sky. Consequently, we do not perform any stellar masking or subtraction specific to the ellipse fitting step, except when the stellar number density is high (>2000 stars deg^-2 for K < 14), in which case it is more favorable to mask out nearby stars given the high probability of contamination. This ellipse-fitting detail is not to be confused with the general GALWORKS procedure of near-neighbor masking prior to photometric or symmetry measurements.

Once the general orientation of the galaxy is derived, various ‘symmetry’ measures are performed. The radial/azimuthal symmetry of an object is a good indicator of its true nature. Double stars appear asymmetric across the minor axis: if the elliptical axis is centered on the primary component of the double star, the resultant profile is highly asymmetric when comparing one side of the major axis to the other. This is also generally the case for triple stars, although there can be configurations of three or more stars in which the alignment is symmetric across both the minor and major axis.

One way to measure the "symmetry" of an object is to perform a bi-symmetric spatial autocorrelation. Divide the object along the minor axis. The integrated flux in each half gives the gross bi-symmetric flux ratio. Rotate one side 180 degrees with respect to the other and multiply the resultant pieces. The autocorrelation is then normalized by the original galaxy (squared). To minimize the effects of noise and the shape of the PSF, very low SNR points (< 1.5) and the inner 3" core are avoided in this procedure. In addition to the autocorrelation, we also compute a bi-symmetric cross-correlation reduced chi-square,

where p and p* are the points 180 deg apart that are being compared, N is the number of points being compared, and sigma is the pixel noise. This χ² measure has the advantages that its distribution is well understood statistically, with tabulated confidence ranges, that there are no asymmetries in the distribution like those introduced in a ratio comparison, and that it is insensitive to low SNR or data points near zero.
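The three symmetry measures can be sketched together in Python. Several points are assumptions of this sketch rather than statements about GALWORKS: the cutout is odd-sized and already rotated so that the minor axis runs along the central column, the reduced chi-square is taken as the mean of (p - p*)² / (2 sigma²) (the factor of 2 being the variance of a difference of two independent pixels), and the function name is hypothetical.

```python
import numpy as np

def bisymmetry_measures(cutout, sigma, core_radius=3.0, min_snr=1.5):
    """Return (flux ratio, normalized autocorrelation, reduced chi-square).

    cutout : odd-sized, background-subtracted image centred on the source,
             with the minor axis along the central column (assumed).
    sigma  : pixel noise.
    """
    ny, nx = cutout.shape
    cy, cx = ny // 2, nx // 2
    yy, xx = np.indices(cutout.shape)

    # Gross flux ratio of the two halves on either side of the minor axis.
    left, right = cutout[:, :cx].sum(), cutout[:, cx + 1:].sum()
    flux_ratio = min(left, right) / max(left, right)

    # Partner of each pixel after a 180 degree rotation about the centre.
    rotated = cutout[::-1, ::-1]
    use = ((np.hypot(xx - cx, yy - cy) > core_radius)   # skip the inner core
           & (cutout > min_snr * sigma)                 # skip low-SNR points
           & (rotated > min_snr * sigma)
           & (xx > cx))                                 # count each pair once
    p, p_star = cutout[use], rotated[use]
    if p.size == 0:
        return flux_ratio, float("nan"), float("nan")

    autocorr = (p * p_star).sum() / (p ** 2).sum()
    chi2 = ((p - p_star) ** 2).sum() / (2.0 * sigma ** 2) / p.size
    return flux_ratio, autocorr, chi2
```

For a symmetric galaxy the flux ratio and autocorrelation are close to one and the chi-square is close to one in noisy data (zero in the noise-free limit), whereas a double star centered on its primary gives a low flux ratio and a large chi-square.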

Finally, we perform an ellipse fit to the 5σ isophote (per band and "super" coadd). Comparison between the default 3σ and the 5σ fit parameters may indicate either real asymmetry due to stellar contamination or orientation changes as a function of radius. Likewise, the goodness of fit metric can indicate either problems with the fit (due to stellar contamination, or noise in the case of faint sources) or real asymmetry in the object.

3.5 Photometry

3.6 Source Extraction

info sent to tables (list tables)

postage stamp

Figures

Figure 1: Bright stars with associated nebulosity. The first five sources come from the Orion Trapezium region and the last three from the Large Magellanic Cloud. The upper row shows J-band postage stamp images, the middle row the H-band and the bottom row the K-band images. Each image is 50’ × 50’ in angular size. The integrated fluxes of the sources range from 5th to 7th mag.

Figure 2: Low central surface brightness galaxies. Typical set of galaxies detected and extracted with the LCSB processor. The upper row shows J-band postage stamp images, the middle row the H-band and the bottom row the K-band images. Each image is 25’ × 25’ in angular size. The integrated flux for each source is (reading left to right): (J=15.0, H=14.3, K=13.8), (15.1, 14.5, 14.0), (15.2, 14.5, 13.9), (15.6, 14.7, 13.9), (15.4, 14.5, 14.1), (15.1, 14.5, 14.3), (15.7, 14.8, 14.0), and (15.3, 15.1, 15.2).

Figure 3: 2MASS Atlas Image (coadd) decomposition schematic for background fitting. The J, H, K_s raw images have 512 × 1024 pixels (~8.5’ × 16’) each. The first step is to resample the image with an 8 × 8 median filter. A cubic polynomial is then fit to the surface defined by dividing the filtered image into three chunks: upper, middle and lower, with 50% overlap between the middle and upper segments, and between the middle and lower segments. The final background solution results from a weighted-average (overlap dependent) stitching of the three segments.

Figure 4: Example of a raw J-band image, the corresponding background solution and the residual (background subtracted) image. The gray-scale stretch ranges from –2σ to 5σ of the median background level.

Figure 5: Example of a raw H-band image, the corresponding background solution and the residual (background subtracted) image. The gray-scale stretch ranges from –2σ to 5σ of the median background level. Notice the prominent "airglow" background gradients in the raw image (left-most panel) and the low-level, high-frequency ridges in the residual image (right-most panel). This image is in fact rather typical for 2MASS data.

Figure 6: Example of a raw K_s-band image, the corresponding background solution and the residual (background subtracted) image. The gray-scale stretch ranges from –2σ to 5σ of the median background level. The background level is approximately 4× larger than that of J-band (due to thermal emission of the atmosphere).