I need some guidance on the appropriate level of pooling to use for difference-of-means tests on time series data. I am concerned about temporal and sacrificial pseudoreplication, which seem to be in tension in this application. This is in reference to a mensurative study rather than a manipulative experiment.
Consider a monitoring exercise: A system of sensors measures dissolved oxygen (DO) content at many locations across the width and depth of a pond. Measurements for each sensor are recorded twice daily, as DO is known to vary diurnally. The two values are averaged to record a daily value. Once a week, the daily results are aggregated spatially to arrive at a single weekly DO concentration for the whole pond.
Those weekly results are reported periodically and aggregated further: weekly results are averaged to give a monthly DO concentration for the pond, monthly results are averaged to give an annual value, and annual averages are themselves averaged to report decadal DO concentrations for the pond.
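To make the aggregation chain concrete, here is a minimal Python sketch of one step (weekly to monthly). The DO values and the 4/5/4-week split are made up for illustration, and `block_means` is a hypothetical helper name. It also shows a side point: a mean of means equals the mean of the underlying values only when every block has the same size, which holds for 12 months per year but not for weeks per month.

```python
from statistics import mean

def block_means(values, sizes):
    """Average consecutive blocks of `values`; `sizes` gives each block's length."""
    out, start = [], 0
    for n in sizes:
        out.append(mean(values[start:start + n]))
        start += n
    return out

# Hypothetical weekly pond-wide DO values (mg/L) for three months
# containing 4, 5, and 4 weekly readings respectively.
weekly = [8.1, 7.9, 8.4, 8.0, 7.2, 7.5, 7.1, 7.4, 7.3, 6.8, 6.9, 7.0, 6.7]
monthly = block_means(weekly, [4, 5, 4])

mean_of_monthly = mean(monthly)  # weights each month equally
mean_of_weekly = mean(weekly)    # weights each week equally

print("monthly means:", monthly)
print("mean of monthly means:", mean_of_monthly)
print("mean of weekly values:", mean_of_weekly)
```

With unequal block sizes the two printed means differ slightly; with equal block sizes (such as 12 months in every year) they coincide.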
The goal is to answer questions such as: Was the pond's DO concentration in year X higher, lower, or the same as the concentration in year Y? Is the average DO concentration of the last ten years different from that of the prior decade? The DO concentrations in a pond respond to many inputs of large magnitude, and thus vary considerably, so a significance test is needed. The method is a t-test comparison of means. Given that the decadal values are the mean of the annual values, and the annual values are the mean of the monthly values, this seems appropriate.
Here’s the question: you can calculate the decadal means, and the t-values for the difference between them, from either the monthly DO values or the annual DO values. The means don’t change, of course, but the width of the confidence interval and the t-value do. Because the monthly values give an order of magnitude higher N, the CI often tightens considerably if you go that route. As a result, the same test on the same data can give opposite conclusions about the statistical significance of an observed difference in means, depending on whether monthly or annual values are used. What is the proper interpretation of this discrepancy?
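To illustrate the discrepancy, here is a minimal Python sketch using only the standard library. The data are simulated, not real pond measurements: each hypothetical year gets a random offset shared by its 12 months, which is one simple way to make months within a year non-independent. The helper names (`welch_t`, `simulate_decade`, `annual_means`), the noise levels, and the choice of Welch's unequal-variance form of the t statistic are all assumptions made for the example.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Return (t statistic, standard error of the difference, Welch df)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    se = math.sqrt(va + vb)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, se, df

def simulate_decade(rng, base):
    """Hypothetical decade: 10 years x 12 monthly DO values (mg/L)."""
    months = []
    for _ in range(10):
        year_effect = rng.gauss(0.0, 0.5)  # shared by all months in a year
        for _ in range(12):
            months.append(base + year_effect + rng.gauss(0.0, 0.3))
    return months

def annual_means(months):
    """Collapse 12 consecutive monthly values into one annual mean."""
    return [statistics.mean(months[i:i + 12]) for i in range(0, len(months), 12)]

rng = random.Random(0)  # fixed seed so the run is repeatable
monthly_1 = simulate_decade(rng, 8.0)
monthly_2 = simulate_decade(rng, 8.3)
annual_1, annual_2 = annual_means(monthly_1), annual_means(monthly_2)

t_m, se_m, df_m = welch_t(monthly_1, monthly_2)  # N = 120 per decade
t_a, se_a, df_a = welch_t(annual_1, annual_2)    # N = 10 per decade

print(f"monthly: t={t_m:.2f}, SE={se_m:.3f}, df={df_m:.1f}")
print(f"annual:  t={t_a:.2f}, SE={se_a:.3f}, df={df_a:.1f}")
```

The difference in decadal means is identical in both calculations; only the standard error and degrees of freedom change. In this setup the monthly calculation treats 120 values per decade as independent even though months within a year share an offset, which is exactly the situation the pseudoreplication question is about.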
If you use the monthly results to compute the test statistics for a difference in decadal means, are you running afoul of temporal pseudoreplication? If you use the annual results to calculate the decadal tests, are you sacrificing information and thus pseudoreplicating?