VI. Analysis of the 2MASS Second Incremental Release Catalogs

1. Comparison of Achieved Performance of Second Incremental Release Catalogs with Level 1 Science Specification

i. Extended Source Reliability

i. The Nature of Galaxy Truth Table "Unknowns"

Introduction

In order to verify the results of the 2MASS galaxy processing, T. Jarrett has spent many months examining over 150,000 sources by hand and classifying them as galaxies, apparently single stars, double stars, triple stars, or artifacts. Jarrett used all the tools at his disposal: the parameters calculated by GALWORKS, the 2MASS images, and the DPOSS images. In addition, J. Huchra and J. Mader have examined tens of thousands of sources by hand.

However, at high latitudes and in the half magnitude bin above the XSC completeness specification, 4-10% of all sources defied categorization in that they could not clearly be placed into either the galaxy or single star classes. These sources were left as a class of "unknown" sources.

For years we have wondered what the true nature of the unknown sources is. A few unknowns have been followed up with higher resolution data, and most of them have turned out to be true galaxies. However, the number of such followups is extremely small, and because all of them have occurred in known galaxy clusters, that subset of the unknown sources might not be representative of the class as a whole. In the absence of definitive information, we have variously thought that the unknowns are nearly always true galaxies or that they are nearly always falsely-detected single stars.

We now have enough statistics to state with some confidence that most of the unknowns with J magnitudes between 14.5 and 15.0 mag found above |b| = 30° are likely true galaxies. It is possible that a significant minority (~15%) could still be false sources. This page summarizes the evidence for those statements.

In the following, the term single stars or just singles will always mean "detected extended sources" classified by T. Jarrett or J. Huchra/J. Mader as single stars.

Hypotheses

Since the unknowns are either true galaxies, falsely-detected single stars, or a mixture of the two classes, we can compare the characteristics of the unknowns to the two possible source classes. Thus, consider the following two hypotheses, and their implications:

Hypothesis 1: The unknowns are true galaxies whose non-nuclear components have become too faint to be detected by eye on the 2MASS or DPOSS images, and thus appear quasi-stellar. In other words, the fuzz around the nucleus has become too faint to be seen by anything other than the power of the GALWORKS processing.
This hypothesis predicts that the characteristics of the unknowns are very similar to those of true galaxies, except that the unknowns should be fainter on average and have somewhat lower measured source extents, accounting for the less visible fuzz.

Hypothesis 2: The unknowns are actually single stars which have been detected by GALWORKS for the same reasons that the sources classified as single stars were detected.
Unfortunately, we cannot predict from first principles the characteristics that this implies, since we haven't yet analyzed the single star classifications to find the main failure modes that allowed these stars to be detected as extended. The most likely suspect is a brief instance of untracked seeing that was either too brief to be reliably detected by the automatic see_track "seeing-tracking" processing, or that was associated with a marginally-poor see_track score that fell just under the threshold for rejecting the data. Untracked seeing is a well-known cause of extended sources being created out of true point sources.
Although we do not know the precise cause that creates the category of single stars, this hypothesis predicts that the characteristics of the unknowns should closely match the characteristics of the single stars picked up as extended sources. There must be some modification to account for the placement of each source into the unknown category rather than the single star category, but we cannot predict what those modifications would be without further understanding of the failure mode that produces the single stars. For example, if the single stars are due to untracked seeing, then one scenario consistent with this hypothesis is that the unknowns result from instances of worse seeing than the occasions when a single star classification was clear. The worse seeing would mimic a fuzzier source, preventing Jarrett or Huchra/Mader from confidently assigning the source to the single star category.

Analysis

With these two hypotheses in mind, it is now a simple matter to analyze all the main characteristics of these source populations and compare them to the predictions of each hypothesis. We will consider each characteristic in turn.

In all of the analysis below, we consider the set of classified sources with 14.5 < J < 15.0 mag, found at |b| > 30°, that are unconfused as indicated by their cc_flg. There are 3467 galaxies, 41 singles, 25 doubles, 0 triples and 178 unknowns. (See Section VI.3c.) Because the numbers of galaxies, singles and unknowns differ so greatly, in some of the plots below we scale the number of galaxies and singles to the number of unknowns, to make the comparison of the distributions clearer.
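The sample cuts described above can be sketched as a simple filter. This is only an illustration: the record layout, the field names, the class labels, and the choice of "0" as the cc_flg value meaning "unconfused" are all assumptions made for the example, not the actual truth-table format.

```python
from dataclasses import dataclass


@dataclass
class Source:
    j_mag: float   # total J magnitude
    glat: float    # Galactic latitude b, in degrees
    cc_flg: str    # confusion flag
    label: str     # hand classification, e.g. "galaxy", "single", "unknown"


# Assumption for this sketch: this flag value marks an unconfused source.
UNCONFUSED = "0"


def in_analysis_sample(src: Source) -> bool:
    """True if the source passes the magnitude, latitude and confusion cuts."""
    return (
        14.5 < src.j_mag < 15.0
        and abs(src.glat) > 30.0
        and src.cc_flg == UNCONFUSED
    )


def count_by_class(sources):
    """Count the sources that pass the cuts, grouped by hand classification."""
    counts = {}
    for src in sources:
        if in_analysis_sample(src):
            counts[src.label] = counts.get(src.label, 0) + 1
    return counts
```

Applied to the full truth table, a tally of this kind would be expected to give the class counts quoted above.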

Distribution on the sky

Galaxies have an approximately uniform distribution on the sky, whereas stars are highly concentrated toward the Galactic plane. If the unknowns follow one distribution rather than the other, this test is probably the most powerful of all, since it gives the answer directly.

The distribution of stars falsely identified as extended by untracked seeing will not be precisely the same as the stellar distribution, since untracked seeing is somewhat more likely to occur in low source density areas where the seeing is tracked with a lower frequency. However, seeing flare-ups do occur on time scales too short to be tracked even with the high stellar densities of the Galactic plane. In any case, the theoretical distribution is not needed - we have the observed distribution of the single stars.

Because we do not have readily available the area covered by this subset of the truth table, we instead simply compare the histogram of number of sources vs. density in Figure 1. (Density is the log number of sources with K_s < 14.0 mag.)

The histogram of unknowns vs. density is almost an exact match of the histogram of galaxies vs. density below density = 2.9, and definitely disagrees with the histogram of singles vs. density below density = 2.9. This is the most powerful evidence that most of the unknowns are, in fact, true galaxies.
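The histogram comparison described above can be sketched as follows: bin each class by density, then rescale the galaxy and single counts to the total number of unknowns so that the shapes can be compared directly. The bin range and width used here are hypothetical choices for the example, not the binning used for Figure 1.

```python
import math


def density_histogram(densities, lo=2.0, hi=4.0, width=0.1):
    """Count sources per density bin; values outside [lo, hi) are ignored."""
    nbins = round((hi - lo) / width)
    counts = [0] * nbins
    for d in densities:
        i = math.floor((d - lo) / width)
        if 0 <= i < nbins:
            counts[i] += 1
    return counts


def scale_to(counts, target_total):
    """Rescale a histogram so that its bins sum to target_total."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c * target_total / total for c in counts]
```

For example, scale_to(galaxy_counts, 178) would give the number of unknowns expected in each density bin if the unknowns followed the galaxy distribution exactly.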

Above density = 2.9 there is an excess of unknown sources, implying that a small percentage of unknowns may in fact be false sources in those high density areas. There are 27 more unknowns above density = 2.9 than predicted by the hypothesis that 4.4% of all galaxies fall into the unknown category. This implies that ~27/178 = 15% of the J unknowns are singles, and that ~85% of the J unknowns are true galaxies.
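The arithmetic behind these percentages can be checked with a short sketch. The counts are taken from the text; the function name is hypothetical, and the reading of the 4.4% figure as the ratio of the remaining unknowns to the classified galaxies is an assumption that happens to reproduce the quoted value, not a statement of how it was originally derived.

```python
def split_unknowns(n_unknowns, n_excess, n_galaxies):
    """Return (fraction of unknowns that may be singles,
    ratio of the remaining unknowns to classified galaxies)."""
    star_fraction = n_excess / n_unknowns
    galaxy_like_unknowns = n_unknowns - n_excess
    unknown_rate = galaxy_like_unknowns / n_galaxies
    return star_fraction, unknown_rate


# Counts quoted in the text: 178 unknowns, 27 excess, 3467 galaxies.
star_fraction, unknown_rate = split_unknowns(178, 27, 3467)
print(f"possible singles among unknowns: {star_fraction:.1%}")
print(f"remaining unknowns / galaxies:   {unknown_rate:.1%}")
```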

Figure 1

Distribution of Measured Extent Parameters: J shape

In the figures below, the unknowns will be separated into sources found in areas with density < 2.9 (blue) and with density > 2.9 (pink).

Galaxies have a range of measured extent parameters, ranging from small compact galaxies to large more diffuse galaxies. The hypothesis that unknowns are galaxies whose non-nuclear component has fallen below some visibility threshold implies that the unknowns should continue to show a variation in their extent parameters, but that the group of unknowns as a whole must be fainter and therefore have smaller extent parameters.

If these predictions are not obvious, consider two specific cases. First, consider a small compact galaxy whose fuzz is bright enough to be seen on the 2MASS images, and therefore classified as a galaxy. If that galaxy simply is moved farther away, it becomes smaller and fainter, with a smaller measured extent. Correspondingly, it gets harder to classify it as a galaxy. Second, consider a large diffuse galaxy also classified as a galaxy. If that galaxy is farther away, it also becomes smaller and fainter, with a smaller measured extent. However, this galaxy still has a large measured extent compared to the first galaxy, even though it too has become too faint to classify as a galaxy. Hence the group of galaxies as a whole has become fainter and smaller, but still retains some variation in the measured extent parameters.

The J shape score (j_sh_sc) is the measured source extent divided by the 1σ scatter in the shapes of point sources. Plotting that score vs. J magnitude directly shows the range in measured galaxy source size at a given total J magnitude, as well as the general decrease in measured source extent as the total J magnitude becomes fainter.
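As a minimal sketch of the definition above, the score is a ratio of the measured extent to the 1σ scatter of point-source shapes. The function and argument names are hypothetical, and estimating the scatter as a sample standard deviation over a list of point-source shape values is an assumption of this example; it is not the GALWORKS implementation.

```python
import statistics


def shape_score(measured_extent, point_source_shapes):
    """Measured extent in units of the 1-sigma scatter of point-source shapes."""
    sigma = statistics.stdev(point_source_shapes)
    if sigma == 0:
        raise ValueError("point-source shapes show no scatter")
    return measured_extent / sigma
```

A score of 20 then means that the measured extent is 20 times the scatter seen among point sources.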

The plot for the unknowns, shown in Figure 2, is just what would be predicted from the plot for galaxies, shown in Figure 3, and is very different from the plot for singles, shown in Figure 4. Only one single has a j_sh_sc above 20, whereas a considerable number of the unknowns have scores above 20. Further, the unknowns show a clear trend of declining score with fainter magnitude, which is not present in the singles.
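The two statements above, the number of sources with scores above 20 and the trend of score with magnitude, could be quantified along the following lines. This is a sketch with hypothetical function names; the original comparison was made by inspecting the plots, and a least-squares slope is simply one way to put a number on the trend.

```python
def count_above(scores, threshold=20.0):
    """Number of sources whose shape score exceeds the threshold."""
    return sum(1 for s in scores if s > threshold)


def least_squares_slope(mags, scores):
    """Slope of a straight-line fit of shape score against magnitude.

    A negative slope means the score declines toward fainter magnitudes.
    """
    n = len(mags)
    if n < 2 or n != len(scores):
        raise ValueError("need two or more paired values")
    mean_m = sum(mags) / n
    mean_s = sum(scores) / n
    sxx = sum((m - mean_m) ** 2 for m in mags)
    if sxx == 0:
        raise ValueError("all magnitudes are identical")
    sxy = sum((m - mean_m) * (s - mean_s) for m, s in zip(mags, scores))
    return sxy / sxx
```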

We conclude that we have strong confirming evidence that the bulk of the unknowns are true galaxies.

Figure 2 Figure 3 Figure 4

Distribution of Measured Extent Parameters: J vs. K_s Shape

The distributions of the J and K_s measured shapes are significantly different for galaxies, as shown in Figure 5, and for singles, as shown in Figure 6. Galaxies show a high correlation between their shapes measured at J and K_s, whereas the singles considered here, detected at J, tend not to have very high K_s shapes. This latter fact may be caused by the J and K_s fitted seeing curves differing significantly in areas with untracked seeing.
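One way to put a number on the "high correlation" between the J and K_s shapes is the Pearson correlation coefficient, computed separately for each class. This is a sketch; the function name is hypothetical and the original comparison rests on the plots rather than on this statistic.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the sequences is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)
```

Under the galaxy hypothesis, the coefficient for the unknowns would be expected to lie close to that of the galaxies and well above that of the singles.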

Regardless of the source of the difference between galaxies and singles, there is no doubt that the unknowns, shown in Figure 7, strongly resemble galaxies, and not singles.

Figure 5 Figure 6 Figure 7

Other Comparisons

For completeness, we present other possible comparisons that use the 2MASS data, but which are not expected on theoretical grounds to show clear differences between galaxies and stars.

The color-color plots for galaxies, singles, and unknowns, shown in Figures 8, 9, and 10, closely resemble each other, due to the color selection used to assign the "G" score. Nonetheless, it is noticeable that the singles include relatively more quite blue sources (J-H < 0.4) than are seen in the galaxy and unknown plots.

Figure 8 Figure 9 Figure 10

The histogram of number vs. magnitude for the three source classes, shown in Figure 11, is difficult to interpret, since several effects are at work. For example, if the unknowns are galaxies, their number is expected to increase at fainter magnitudes because of the selection effect that places them in the unknown class. The number of stars falsely classified as extended is also expected to increase rapidly at fainter magnitudes, due to the larger errors in derived parameters at fainter magnitudes. Thus, we draw no conclusions from this plot.

Figure 11

Conclusion

Three powerful comparisons strongly support the hypothesis that the unknowns are largely (but not necessarily entirely) galaxies and strongly reject the hypothesis that the unknowns are single stars. The small differences in the distribution of parameters for unknowns vs. galaxies are exactly those predicted by the hypothesis that the unknowns are simply galaxies that have become faint enough that their fuzz is not clearly distinguishable in human analysis of images.

These comparisons leave little doubt that the unknowns are largely galaxies. Additional confirming evidence may remove any doubt once we understand the origin of the singles. For example, if the singles are due to untracked seeing in one way or another, that gives another distribution to which the unknowns and galaxies can be compared.

Of course, the ultimate test is further followup of a number of the unknowns. The very limited work to date has found that the unknowns are indeed galaxies, but more followup is needed.

Again, it is possible that ~15% of the unknowns are false detections of single stars. Once the origin of the singles is known, it may be possible to accurately determine the percentage of the unknowns that are galaxies and the percentage that are single stars.

ii. Appendix: Postage Stamps

T. Jarrett has prepared postage-stamp images of the unknowns in Figures 12-28 below.

Figure 12 Figure 13 Figure 14 Figure 15