Discovering Dynamic Dipoles in Climate Data
All content following this page was uploaded by Vipin Kumar on 16 January 2014.
The Southern Oscillation index (SOI) is measured as the difference in the pressure anomalies at Tahiti and Darwin, Australia and captures fluctuations in pressure around the tropical Indo-Pacific region that correspond to the El Niño Southern Oscillation (ENSO) climate phenomenon [13]. A high value of SOI indicates higher pressure anomalies in the eastern tropical Pacific around Tahiti and lower pressure anomalies around Indonesia and northern Australia, while a low value of SOI is associated with the reverse conditions. Figure 2 shows the time series of pressure anomalies at Tahiti (measured at 17.5S, 150W) and Darwin (measured at 12.5S, 130E).

[Figure 2: Pressure anomaly time series for the Southern Oscillation. Series: Tahiti; Darwin, Australia. x-axis: Years (1993 to 2003); y-axis: Monthly mean subtracted anomaly (hPa).]

dipoles. For example, the Dipole Mode Index (DMI) [4], which has been investigated in relation to the Indian Monsoon.

As mentioned, climate indices, including dipoles, are of great importance in understanding climate variability. Table 1 lists some dipoles that are well known to climate researchers. These dipoles have been discovered by observation, e.g., SOI and NAO, or by EOF analysis [12], e.g., AO. However, all these discoveries have required considerable research and insight on the part of the domain experts involved. Because of the amount of effort involved and the possibility of missing indices, an automated approach to climate index discovery could be quite useful.

One of the first attempts in this direction was Steinbach et al. [8, 9, 10]. The approach used a shared nearest neighbor (SNN) [2] clustering approach to find climate indices. More specifically, it built a graph of all locations on a latitude-longitude grid based on the positive pairwise correlations between the anomaly time series of temperature or pressure at these locations and then found clusters in this graph. The centroids of these clusters or the differences between two centroids were then used as candidate climate indices. Many of the resulting candidate indices showed a high correlation with known climate indices and were similar in their level of impact on land climate variables such as temperature.

Tsonis et al. [14] pioneered the use of complex networks to study climate systems. The authors constructed networks using nodes on a 5° x 5° grid on the globe, where the edges of the network were defined in terms of the (absolute) correlation values between the anomaly time series of climate variables (SST, SLP) of all the pairs of nodes. From this complete
$$x_y(m) = x_y(m) - \mu_m, \quad \forall y \in \{1948, \ldots, 2009\}$$

In this formula, start and end represent the start and end years to consider for the mean and define the base for computing the mean for subtraction (in our case 1948 and 2009). $\mu_m$ is the mean of the month $m$ and $x_y(m)$ represents the value of pressure for the month $m$ and year $y$. Once we remove the monthly means, the resulting values are the anomaly time series for that location.
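The monthly mean subtraction described above can be illustrated with a minimal sketch. It assumes the pressure values at one location are stored as a (years x 12) NumPy array; the function name `monthly_anomalies` and its parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def monthly_anomalies(values, first_year=1948, start=1948, end=2009):
    """Subtract the per-calendar-month mean from a (n_years, 12) array.

    values[i, m] is the value for year first_year + i and month m.
    start and end (inclusive) select the base years used for the mean.
    All names here are illustrative assumptions.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim != 2 or values.shape[1] != 12:
        raise ValueError("expected an array of shape (n_years, 12)")
    # Rows belonging to the base period [start, end].
    base = values[start - first_year : end - first_year + 1]
    # mu[m] is the mean of month m over the base years.
    mu = base.mean(axis=0)
    # Broadcasting subtracts mu[m] from every year's value for month m.
    return values - mu
```

When the base period covers every year in the array, each month's anomalies average to zero by construction, which removes the seasonal cycle from the series.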
3.3 Edge Weight Estimation
After we get the anomaly values for every node, the networks are constructed by looking at the similarity values

[Figure: y-axis: Frequency; x-axis: Correlation.]
in the NAO region. The North Atlantic Oscillation is seen very clearly in all the 9 networks of 20 year periods.

• Arctic Oscillation: The Arctic Oscillation is the pressure anomaly around the North pole and is defined on the basis of the first leading component of an EOF analysis using the region north of 20N latitude. It does not have a pair of physical locations associated with it. However, using our method we are able to find it in all the 9 networks with a very high correlation.

• Antarctic Oscillation: The Antarctic Oscillation measures the anomaly of pressure around the Antarctic region. This oscillation is the analog of the Arctic Oscillation in the southern hemisphere and is also defined by EOF analysis of locations south of 20S. We see the Antarctic Oscillation in all the climate networks. However, the climate indices data from the Climate Prediction Center is defined from 1979 onwards [19]. Hence we can only compare its correlation with known climate indices for the last two networks.

• Western Pacific Index: The Western Pacific index is a north-south dipole around the western Pacific with one end located over the Kamchatka peninsula and the other end in southeastern Asia and the subtropical north Pacific.

6 Experimental Evaluation
In order to evaluate the goodness of the dipole regions generated, we look at three things:

1. Strength of the negative correlation between the two regions of the dipole. Higher negative correlation implies a stronger dipole.

2. Correlation with known dipole indices. This highlights the ability to reproduce known dipoles.

3. Impact of the dipole indices on land by computing an area weighted correlation of land temperature anomalies with the dipole indices. This highlights the ability of data driven dipoles to potentially outperform known dipoles.

6.1 Negative Correlation within regions of Dipole
From the definition of the dipole, the two regions forming a dipole should be negatively correlated with each other. To compute the strength of the negative correlation across the two regions, we look at three values:

1. The mean value of the correlation between all the location pairs across the two regions constituting the dipole. We call this value the mean of all pairs.

2. The best correlation in the two regions of the dipole, represented by the most negative edge between the two regions. We call this value the best pair.

3. The correlation between the mean anomaly series of the two regions, where each mean is taken over all the locations in that region. We call this value the pair of means.

Table 2 shows the three correlation values of the dipole regions discovered by our algorithms. The table reports the mean values for all the 9 networks.
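The three values of Section 6.1 (mean of all pairs, best pair, pair of means) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: each region is assumed to be a (locations x time) array of anomaly series, and the function name `dipole_strength` is hypothetical.

```python
import numpy as np

def dipole_strength(region_a, region_b):
    """Return the three negative-correlation measures for a dipole.

    region_a: array (n_a, T) of anomaly series, one row per location.
    region_b: array (n_b, T) of anomaly series, one row per location.
    """
    a = np.asarray(region_a, dtype=float)
    b = np.asarray(region_b, dtype=float)
    n_a = a.shape[0]
    # Full correlation matrix over all locations; rows are variables.
    corr = np.corrcoef(np.vstack([a, b]))
    # Keep only the cross-region block (locations in A vs locations in B).
    cross = corr[:n_a, n_a:]
    return {
        # Average correlation over all cross-region location pairs.
        "mean_of_all_pairs": float(cross.mean()),
        # Most negative cross-region edge.
        "best_pair": float(cross.min()),
        # Correlation between the two regional mean anomaly series.
        "pair_of_means": float(np.corrcoef(a.mean(axis=0), b.mean(axis=0))[0, 1]),
    }
```

By construction the best pair is never larger than the mean of all pairs, since the minimum of a set cannot exceed its average.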
From the table it can be seen that all the regions are strongly negatively correlated, indicating that the regions indeed consist of strong opposing pressure polarities.

We performed a further analysis of the SOI region and found that the negative correlation between Tahiti and Darwin is not as strong as several other location pairs. Fig 9 shows the correlation between Tahiti and Darwin as well as the best pair results from our two dipole finding algorithms. This result indicates that the underlying phenomenon leading to the negative correlation is not fixed at Tahiti and Darwin and that SOI and other climate indices are perhaps better captured with dynamic clusters.

[Figure 9 (plot): series Tahiti−Darwin, A1 (best pair), A1 + Community (best pair); y-axis: Correlation.]

6.2 Comparison with known Climate Indices
In order to evaluate the goodness of the dipole clusters found, we compared them with some well known climate indices. For each of the 9 network periods, we generated a set of dipoles from the corresponding network. For every dipole belonging to a time period, we took the two clusters belonging to the dipole and computed their centroids by taking the mean of the anomaly at those locations during that time period. We computed the difference between the two cluster centroids to create a time series, which is then compared with all the climate indices over that period using linear correlation. We kept track of the best correlation to the climate indices during the period and recorded the dipole cluster that best matched each climate index. We performed this step for all the time periods. Table 3 shows the best correlation to each climate index of the dipoles found using the two variations of algorithm A1 with a bin size of 300. Although A1 + community shows weaker
correlated clusters whose difference correlates very well with the AO climate index (as high as 0.85).

To evaluate the sensitivity of our analysis to the choice of the dipole bin size, K, we looked at the mean correlation with the climate indices using different values of K. These results are shown in Fig 10. As expected, a small value of K gives very focused patterns for SOI and NAO and leads to better correlation because they are actually defined by single point locations. However, we see that at very small values of K, the correlation of the dipole cluster with AO or AAO is not as high as for larger values of K. This is because the AO and AAO patterns are not defined using single point locations, but instead are defined as a summary of the behavior of a large region.

ing shifted correlations will only improve the numbers further. From 4 we see that our algorithm is better than the existing approaches to find climate indices. Note that for A1, we report the mean values that we got from choosing K=300 as shown in Table 3.

6.3 Area weighted correlation with land temperature
From the previous sections we see that we can generate dipoles that dynamically change over time and from the results we see that their correlation with known climate indices is very high. In order to study the changes in the dipole clusters over time, we take their centroids and plot them on the globe. Fig 11 shows the plot of moving centroids of the Arctic Oscillation dipole. Fig 12 shows the plot of moving centroids of the North Atlantic Oscillation cluster.
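The comparison with known climate indices described in Section 6.2, where the difference between the mean anomaly series of a dipole's two clusters is correlated with each known index and the best match is recorded, can be sketched as below. The function names and the data layout are hypothetical. Ranking candidates by absolute correlation is an assumption made here because the sign of the difference depends on the order in which the two clusters are listed.

```python
import numpy as np

def dipole_index(cluster_a, cluster_b):
    """Difference between the mean anomaly series of two clusters.

    cluster_a, cluster_b: arrays (n_locations, T) of anomaly series.
    """
    a = np.asarray(cluster_a, dtype=float)
    b = np.asarray(cluster_b, dtype=float)
    return a.mean(axis=0) - b.mean(axis=0)

def best_matches(dipoles, known_indices):
    """For each known index, find the dipole whose index correlates best.

    dipoles: list of (cluster_a, cluster_b) pairs for one time period.
    known_indices: dict mapping an index name to a series of length T.
    Returns a dict: name -> (position of dipole in list, correlation).
    Assumption: "best" means largest absolute linear correlation.
    """
    best = {}
    for name, reference in known_indices.items():
        scores = [
            np.corrcoef(dipole_index(a, b), reference)[0, 1]
            for a, b in dipoles
        ]
        k = int(np.argmax(np.abs(scores)))
        best[name] = (k, float(scores[k]))
    return best
```

Running `best_matches` once per network period and collecting the results would give one best correlation per climate index and period, in the spirit of the per-period bookkeeping the text describes.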
[Figure 10: Effect of varying the region size K on A1. Series: SOI, NAO, AO, WP, AAO; x-axis: K (100, 300, 500, 1000, 2000); y-axis: Correlation.]

[Figure 11: Moving Centroids of Arctic Oscillation.]

[Figure (two panels): y-axis: Impact on land; x-axis: Networks (1 to 9).]
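The text does not spell out how the area weighted correlation with land temperature is computed, so the following is only one plausible sketch. It assumes that each land grid cell is weighted by the cosine of its latitude (a common proxy for cell area on a regular latitude-longitude grid) and that the impact is summarized as the weighted mean of the absolute correlations. Both choices, and the function name, are assumptions rather than the paper's definition.

```python
import numpy as np

def area_weighted_correlation(index, land_anomalies, latitudes_deg):
    """Weighted mean of |correlation| between an index and land series.

    index: series of length T (for example, a dipole index).
    land_anomalies: array (n_cells, T) of land temperature anomalies.
    latitudes_deg: array (n_cells,) with the latitude of each cell.
    Assumption: weights are cos(latitude), approximating cell area.
    """
    index = np.asarray(index, dtype=float)
    land = np.asarray(land_anomalies, dtype=float)
    weights = np.cos(np.deg2rad(np.asarray(latitudes_deg, dtype=float)))
    # Linear correlation of the index with each land cell's series.
    corr = np.array([np.corrcoef(index, series)[0, 1] for series in land])
    # Assumption: sign is ignored so opposite-signed impacts do not cancel.
    return float(np.sum(weights * np.abs(corr)) / np.sum(weights))
```

Under these assumptions the score lies between 0 and 1, and cells near the poles contribute less than cells near the equator.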
7 Conclusion and Future Work
This paper presents a novel approach to finding dipoles in climate data. The problem of finding dipoles has been of key interest to climate scientists as it helps in a greater understanding of teleconnections and several important extreme phenomena. Finding dipoles has been particularly interesting to the data mining community as the underlying data is not only large but also has a spatio-temporal nature, presenting challenges such as seasonality, high variability, autocorrelation, etc. In this setting, we propose a method based on greedy heuristics to identify dipoles. Our methodology seems to produce considerably better results than the current state-of-the-art algorithms.

The algorithm A1 proposed in the paper and its community version are effective and efficient to implement. Our community based approach, which first partitions the large network of all locations on the globe, narrows the search space for the A1 algorithm, generates fewer candidate dipoles, removes spurious connections and is able to match the performance of A1. However, further investigation is needed to determine if one of these algorithms is to be clearly preferred to the other.

A larger significance of this work, which might impact how climate scientists perceive the climate indices, is that it shows climate indices are better explained as centroids of dynamic clusters. So far, climate scientists have mostly considered climate indices to be fixed. The evidence that supports our claim is that the area weighted correlation of the SOI index with land temperature anomalies is improved by up to 90% by capturing the index as a centroid of moving clusters rather than fixed locations. Given the importance of the Southern Oscillation on the climate of the globe, this result has significant impact in terms of predictions in climate science. The Southern Oscillation is closely tied with the El Niño phenomenon, which drives extreme weather events like tropical cyclones, droughts, hurricanes, etc. A thorough evaluation of this is part of future work.

In addition to further evaluation and improvement of the approaches presented in the paper, we need to go beyond comparisons to current climate indices to see if any novel dipoles can be discovered. Although it is unlikely that any of these would be as significant as NAO or SOI, such dipoles could still be of great regional importance.

Acknowledgement
This work was supported by NSF grants III-0713227, IIS-0905581, and IIS-1029711. We also thank Dr. Stefan Liess and Dr. Shyam Boriah for their comments and feedback.

References
[1] Donges, J. F., Zou, Y., Marwan, N. Complex networks in climate dynamics. In European Physical Journal Special Topics, 174 (1), pp. 157–179, 2006.
[2] Ertoz, L., Steinbach, M., Kumar, V. A new shared nearest neighbor clustering algorithm and its appli-
[Figure 15: Area weighted correlation of land temperature anomalies using SOI index for network 2. Axes: Longitude, Latitude.]

[Figure 16: Area weighted correlation of land temperature anomalies using our dynamic index generated from A1 + Community for network 2. Axes: Longitude, Latitude.]