Is the forecast specifically for [suburb], or are you using the [city] forecast as a proxy? If the forecast is technically for [city], then the forecast numbers may have a warm bias because they reflect the urban “heat island” of [city]. (If you haven’t heard the term, “heat island” refers to heat emissions and the different materials in the urban environment keeping the temperature higher than in surrounding areas. It is a common phenomenon.)

If the forecast is, in fact, for [suburb], it’s probably valid for the city center. If your neighborhood is not close to the city center, then your neighborhood’s cold temperatures may be an example of “microscale” variability. For instance, if your neighborhood is in a small valley or depression, then you can get what is called “cold pooling” at night. As the air cools at night, the coldest, densest air collects in the valleys and depressions. Meanwhile, the neighboring higher ground can be in notably warmer air. If the city center happens to be on higher ground, then it will be systematically warmer, and hence the forecast will tend to be warmer than what you see in your neighborhood. Keep in mind that this can happen even if the variations in topography are very small.

Often, the forecast numbers are derived from multivariate linear regression algorithms. I do not have intimate knowledge of these, but my understanding is that these algorithms exhibit reversion to the mean. That is, as the forecast lead time increases and the skill of the regression diminishes, the forecast is designed to trend closer to the climatological value (i.e. what you call the ‘normal’ value). However, in my experience using these algorithms in various forecast contests, the mean reversion doesn’t become pronounced until long forecast lead times (the 5-7 day forecasts or thereabouts). If you’re getting your numbers from a 1-day forecast then I’d be surprised if you’re seeing mean reversion.
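
To illustrate the mean reversion in the simplest possible setting, here is a minimal sketch. It assumes a single-predictor least-squares regression, not the multivariate algorithms actually used operationally. In that simple case the fitted slope equals the correlation times the ratio of standard deviations, so as the correlation (the skill) drops with lead time, the forecast anomaly shrinks toward climatology. The function name, the skill value assigned to each lead time, and the temperatures are all made up for illustration.

```python
def regression_forecast(climatology, predictor_anomaly, correlation,
                        sigma_obs=1.0, sigma_pred=1.0):
    """Single-predictor least-squares forecast (hypothetical helper).

    For simple linear regression the slope is r * sigma_obs / sigma_pred,
    so a weaker correlation pulls the forecast toward climatology.
    """
    slope = correlation * sigma_obs / sigma_pred
    return climatology + slope * predictor_anomaly


# Illustrative numbers only (assumptions, not real forecast data).
CLIMATOLOGY = 10.0          # the "normal" temperature
PREDICTOR_ANOMALY = -8.0    # the predictor says "much colder than normal"
ASSUMED_SKILL = {1: 0.9, 3: 0.7, 5: 0.5, 7: 0.3}  # lead day -> correlation

forecasts = {
    lead: regression_forecast(CLIMATOLOGY, PREDICTOR_ANOMALY, r)
    for lead, r in ASSUMED_SKILL.items()
}

for lead, value in forecasts.items():
    print(f"day {lead}: forecast {value:.1f} (climatology {CLIMATOLOGY:.1f})")
```

With these assumed skill values, the same cold signal produces a forecast well below normal at day 1 and only slightly below normal at day 7, which is the damping described above.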

Also, our predictions often fail to capture extremes, but not because any smoothing is applied. It’s because of model errors. In fact, rather than smoothing our predictions, we’re doing the opposite. We are stochastically forcing our numerical prediction models so that they have greater variability. The stochastic forcing of prediction models, to account for model errors, is an active area of research at the laboratory, and one that my boss has some expertise in.
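
As a toy illustration of why stochastic forcing increases variability, consider the sketch below. It is not a weather model: it assumes a one-variable damped system, with hypothetical parameter names and values, and simply compares the ensemble spread with and without a random forcing term added at each step.

```python
import random
import statistics


def run_ensemble(n_members=20, n_steps=50, damping=0.9, noise_std=0.0, seed=0):
    """Step a toy damped model x -> damping * x + noise for each member."""
    rng = random.Random(seed)
    # Members start from slightly different initial states.
    states = [rng.gauss(0.0, 0.1) for _ in range(n_members)]
    for _ in range(n_steps):
        states = [damping * x + rng.gauss(0.0, noise_std) for x in states]
    return states


# Without forcing, damping collapses the members onto one another.
spread_plain = statistics.pstdev(run_ensemble(noise_std=0.0))
# With stochastic forcing, the ensemble keeps a finite spread.
spread_forced = statistics.pstdev(run_ensemble(noise_std=0.5))

print(f"spread without forcing: {spread_plain:.4f}")
print(f"spread with forcing:    {spread_forced:.4f}")
```

In this toy setup the unforced ensemble becomes overconfident (its spread decays toward zero), while the forced ensemble retains variability. That is the qualitative effect the stochastic forcing is meant to achieve.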

Last, regarding bias: this is an ongoing problem in weather prediction. The raw, unprocessed forecast distributions that we get from our prediction systems most definitely suffer from bias. What I mean here by “bias” is that the mean of our estimated forecast distribution is different from the mean of the “true” forecast distribution, i.e. what the distribution would be given a perfect model, perfect error statistics, and unlimited sampling. The problem is that the biases are thought to be strongly flow-dependent. That is, we might be in one weather “pattern” for a week or two and suffer one type of bias, and then the pattern will abruptly shift, as will the bias. This flow-dependence makes estimating and removing the bias a challenging problem.
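
The flow-dependence problem can be sketched with a few numbers. The “true” forecast distribution is not observable, so the sketch below assumes the usual stand-in: the average of forecast minus observation. The function names, the regime labels, and the data are hypothetical and synthetic, chosen only to show how a pooled bias estimate can hide opposite biases in two weather patterns.

```python
from collections import defaultdict
from statistics import mean


def mean_bias(forecasts, observations):
    """Average forecast-minus-observation error over all cases."""
    return mean(f - o for f, o in zip(forecasts, observations))


def bias_by_regime(forecasts, observations, regimes):
    """Same error, averaged separately within each weather pattern."""
    errors = defaultdict(list)
    for f, o, regime in zip(forecasts, observations, regimes):
        errors[regime].append(f - o)
    return {regime: mean(errs) for regime, errs in errors.items()}


# Synthetic example: warm bias in pattern "A", cold bias in pattern "B".
fcst = [12, 13, 11, 5, 6, 4]
obs = [10, 11, 9, 7, 8, 6]
pattern = ["A", "A", "A", "B", "B", "B"]

print("pooled bias:", mean_bias(fcst, obs))
print("bias by pattern:", bias_by_regime(fcst, obs, pattern))
```

Here the pooled bias is zero even though the forecasts are 2 degrees too warm in one pattern and 2 degrees too cold in the other, so a single correction estimated over the whole period would fix nothing.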

But the really interesting thing is that the most valuable weather forecasts are those *for extremes*. Energy companies, Departments of Transportation, etc. do their business planning based on normals, then adjust if there are extremes. So for planning, the economic value of a forecast for extremes is much higher than for days when it’s nearly normal. But ironically, those extreme days are the ones forecasters have the toughest time predicting.