|
2 months ago | |
---|---|---|
data | 2 months ago | |
figures | 3 months ago | |
.gitignore | 4 months ago | |
0_prepare_data.ipynb | 3 months ago | |
1_earlier_measurements_images.ipynb | 3 months ago | |
2_Vostok_measurements_images.ipynb | 3 months ago | |
3_CAPE_T2_images.ipynb | 3 months ago | |
4_IP_simulations_temporal_images.ipynb | 3 months ago | |
5_IP_simulations_spatial_images.ipynb | 3 months ago | |
readme.md | 3 months ago |
readme.md
The seasonal variation of the DC global electric circuit
In this repository you can find the measurement and modelling data, the research code, and a comprehensive description of the code developed for our two papers on atmospheric electricity:
First paper (Part 1: A new analysis based on long-term measurements in Antarctica) focuses on the seasonal variation of the global electric circuit (GEC) based on electric field measurements at the Vostok station in Antarctica during 2006–2020. The key findings include:
- The annual behaviour of the GEC intensity is difficult to determine reliably, as seasonal variation patterns at specific UTC hours can significantly differ from the seasonal variation of the diurnal mean.
- Potential gradient data from the Vostok station indicate that the daily mean GEC intensity reaches its maximum during the Northern Hemisphere summer (with a small local minimum inside this summer maximum) and its minimum during the Northern Hemisphere winter.
- These seasonal changes are linked to the annual cycle of insolation, which influences the dynamics of global convection intensity, and to the distribution of land and ocean between the hemispheres.
Second paper (Part 2: Further analysis based on simulations) further analyses the seasonal variation of the GEC on the basis of simulations with models of atmospheric dynamics. The key findings are as follows:
- The equatorial contribution to the GEC exhibits two maxima following the equinoxes, while each of non-equatorial contributions has one maximum during the local summer.
- The overall annual variation of the GEC is rather subtle and represents a combination of complex patterns where different contributions partially offset each other, making it challenging to simulate.
- Simulations better agree with observations when surface air temperature is included alongside convection parameters in the ionospheric potential parameterisation.
Note: As soon as the articles are published, we will add links to them here.
A Short Description of the Scripts
Note: We use some data sets containing ionospheric potential values simulated via climate and weather forecasting models. Since these raw data sets are very large (around 350 Gb), we only upload preprocessed lower-dimensional data sets (around 30 Mb) to the repository. This preprocessing can be reproduced using the script
0_prepare_data.ipynb
, but this would require downloading the raw data sets from https://eee.ipfran.ru/files/seasonal-variation-2024/.
1_Earlier_measurements_images.ipynb
: the seasonal variation of the GEC based on the results of earlier studies (with the data taken from other sources)2_Vostok_measurements_images.ipynb
: the seasonal variation and a diurnal-seasonal diagram for the GEC on the basis of potential gradient data from Vostok (both the new data for 2006–2020 and the earlier data for 1998–2001)3_CAPE_T2_images.ipynb
: the seasonal variation of air temperature at the height of 2 m (T2
) averaged over different latitude ranges and the distribution of CAPE in the tropics in different models4_IP_simulations_temporal_images.ipynb
: the seasonal and diurnal variations of simulated ionospheric potential values for different models and parameterisations5_IP_simulations_spatial_images.ipynb
: the decomposition of the seasonal variation into contributions to the ionospheric potential from different latitude ranges and a new parameterisation of the ionospheric potential
Note: The scripts should be executed sequentially one after another; at the very least, scripts 4 and 5 should be run after script 2. This is necessary because script 2 saves intermediate arrays of preprocessed data from the Vostok station, which are used in scripts 4 and 5.
A Detailed Description of the Scripts
Note: Here and throughout the script descriptions, it will be specified which calculations correspond to which figures. The figure number is presented in the format 1.x or 2.x, where the first digit refers to the article (Part 1 or Part 2) and the second part indicates the figure number within the article. If the figure number starts with the letter "S" (e.g., 1.S1), it refers to a figure in the Supporting Information for the corresponding article.
1. The script 1_Earlier_measurements_images.ipynb
This script imports the digitised data of earlier measurements from other sources, required for constructing Figure 1.1.
The data analysis in this file is minimal, focusing only on calculating the amplitude of the seasonal variation as a percentage of the annual average value.
2. The script 2_Vostok_measurements_images.ipynb
Firstly, we present two data sets based on earlier publications: vostok_old_data
(adjusted potential gradient values for 1998–2001) is taken directly from Burns et al. (2012); vostok_old_data_unadjusted
(raw data for the same time period) is inferred from Burns et al. (2012) by subtracting from the adjusted values the linear corrections indicated in the same paper. These two data sets will be required to construct a series of figures comparing our results with earlier studies.
2.1. Preparing potential gradient data
Secondly, hourly averaged results of potential gradient measurements from the Vostok station are loaded into pandas dataframes for both new (dataframes df_10s
and df_5min
) and earlier (dataframe earlier_df_src
) data sets.
New measurements at the Vostok station are combined from hourly data derived from 10-second files and hourly data derived from 5-minute files; it should be noted that the data set primarily relies on the 10-second data, and the 5-minute data are only used when owing to technical reasons the 10-second data are unavailable. The combined series of new measurements is saved in the dataframe df_src
.
Note: When averaging the source data into the hourly series, we used only those hours for which at least 80% of the records were present, i.e. 288 for 10-second data or 10 for 5-minute data. Note that here we do not publish the code for preparing the hourly time series (it is very simple) and the original data with higher resolution.
2.2. Taking form factors into account
At the Vostok station potential gradient measurements should be corrected for the influence of the metal pole supporting the field mill by applying a form factor. The form factor accounting for the gradual reduction in the pole’s height due to the constant snow accumulation decreases linearly from 3 in the beginning of 2006 (for a 2.95 m high pole) to 2.4 at the end of 2024 (for a 1.75 m high pole).
To apply the correction, we add a Factor
column to the df_src
DataFrame. This column is then used to convert the sensor-measured values into absolute potential gradient values.
Note: Since the form factor for the field mill during the 1998–2001 period is unknown and the reported field values from the earlier Vostok data do not differ significantly from the corrected new data, we do not apply this procedure to the older data and leave them as they are.
2.3. Helper functions and fair weather criteria
Next, we introduce helper functions. Notably, the pass_fair_weather
function, when applied to a dataframe (like df_src
or earlier_df_src
), retains only those days when (1) there were no gaps, (2) the potential gradient did not exceed 300 V/m and was non-negative, and (3) the peak-to-peak amplitude was no more than 150% of the daily average value of the potential gradient.
The next helper functions to mention are calculate_seasonal_var_params
and std_error
.
They are structured such that the input to calculate_seasonal_var_params
is a dataframe with daily average values, and the function returns:
- An array of 12 monthly average values of the potential gradient,
- An array of 12 monthly counts of fair-weather days,
- An array of 12 monthly average values of the square of the daily mean potential gradient.
std_error
is designed to take the output from calculate_seasonal_var_params
and return 12 values of the standard error, one for each month.
Both described functions are used to compute values necessary for plotting graphs (mean value ± one standard error).
For both new and early Vostok data, we apply the pass_fair_weather
function, resulting in two data sets that contain only the hours of fair-weather days (in code named as df
and earlier_df
)
2.4. Resampling daily fair-weather data sets
Since in paper we focus on studying monthly mean values, using hourly dataframes for all calculations might be excessive. Therefore we average both dataframes (pre-filtered to include only fair-weather days) over days, so that each day corresponds to a single potential gradient value.
Such daily-resolution arrays are stored in the variables daily_df
and daily_edf
.
Note: At this stage we also save the average hourly values in the temporary file
vostok_diurnal_2006_2020.npy
to facilitate plotting later figures.
2.5. Constructing Figure 1.2, Figure 1.S1 and Figure 1.S2
Figure 1.2 illustrates the seasonal variation (based on the new data) across different multi-year periods. To create it, we used the prepared data (daily_df) and helper functions to compute the mean values, the number of fair-weather days and the standard errors for three data sets:
- The complete series of the new Vostok data (2006–2020).
- The same data for 2006–2012.
- The same data for 2013–2020.
Note: The data from this figure are saved in the temporary file
vostok_2006_2020_results.npz
for use in plotting later figures. This helps us to avoid duplicating the code or merging the code to build different entities in a single cumbersome file.
Figure 1.S1 illustrates the seasonal variation (based on the new data) for individual years. Using the same process as was employed in the case of Figure 1.2, we compute the mean values, the number of fair-weather days and standard errors, but this time for each year separately (we obtain 15 data sets, one for each year in 2006-2020).
Figure 1.S2 is based on a similar analysis performed separately for individual 3-year periods.
2.6. Constructing Figure 1.3
Figure 1.3 presents a diurnal-seasonal diagram. To construct it, we transform the Vostok data series into a 12-month × 24-hour matrix. This is achieved by grouping the original df
dataframe of fair-weather hours by month and hour, then calculating the mean value for all data points corresponding to each specific hour of a specific month (stored in the sd_df
dataframe).
For clarity, we also present cross-sections of this diurnal-seasonal diagram at 3, 9, 15 and 21 hours UTC.
Note: Renaming the axes of the multi-index resulting from grouping (
sd_df.index.set_names(['hour', 'month'], inplace=True)
) is not required for the code to function and can be commented out. However, it may be convenient for further work with the diurnal-seasonal dataframesd_df
.
2.7. Data adjustment
2.7.1. Adjustment for local meteorological influences
First, we load the meteorological data sets (temp_df
, wind_df
, pressure_df
), averaged over days (vostok_daily_temp
, vostok_daily_wind
, vostok_daily_pressure_mm_hg
). For further analysis, we use the meteo_df
dataframe, which is created by merging the meteorological dataframes with daily average potential gradient values (daily_df
).
Next, we compile arrays of potential gradient anomalies and anomalies for all meteorological parameters. The anomalies are calculated using a moving window of ±10 days. We then study the correlation between meteorological parameters and potential gradient values and estimate their statistical significance. The correlation with pressure turned out to be insignificant and was therefore excluded.
We then find multiple linear regression coefficients temp_coeff
and wind_coeff
for the anomalies of the potential gradient, temperature and wind speed. Using the obtained coefficients, we subtract a linear relationship with meteorological parameters (temperature and wind speed) from potential gradient values. The corrected potential gradient is saved in meteo_df["CorrectedField"]
.
2.7.2. Adjustment for the solar-wind-imposed potential difference
First, we load the data set containing the solar-wind-imposed potentials over Vostok (swip_df
) from a text file containing hourly values of the ionospheric potential contribution over the southern polar cap from 1998 to 2020. Note that these values were calculated using the publicly available Weimer (2005) model.
Similarly to how we combined potential gradient measurements and meteorological data into meteo_df
to facilitate their processing, here we also merge the original potential gradient data (daily_df
) with the solar-wind-imposed potentials into the solar_df
dataframe.
The procedure for identifying correlations, determining the regression coefficient, and removing the trend from the measured potential gradient values is nearly identical to the one described in the previous section.
We apply the solar-wind-imposed correction to two data sets: meteo_solar_df
(obtained from merging meteo_df
with swip_df
; here the potential gradient values are adjusted for both meteorological influences and the solar wind) and solar_df
(which takes into account only the correction for solar wind).
Note: The adjusted data are saved in the temporary file
vostok_2006_2020_results_adjusted.npz
for use in plotting later figures.
2.8. Constructing Figure 1.5 and Figure 1.S3
We construct Figures 1.5 and 1.S3 using the prepared data in the same manner as it was done for Figures 1.2 and 1.3.
Figure 1.5 presents the monthly mean fair-weather potential gradient at the Vostok station under different conditions:
- Unadjusted data from 1998–2001 (from Burns et al., 2012).
- Data from 1998–2001 adjusted for meteorological and solar wind influences (from Burns et al., 2012).
- Unadjusted data from 2006–2020.
- Data from 2006–2020 adjusted for meteorological influences.
- Data from 2006–2020 adjusted for solar wind influences.
- Data from 2006–2020 adjusted for both meteorological and solar wind influences.
Each subplot visualises the seasonal variation for the corresponding data set, with error bars indicating standard errors. The amplitude of the variation (as a percentage of the mean) is also annotated.
Figure 1.S2 shows the same data as panel 1 of Figure 1.5, but the analysis is based on our own calculations rather than on those by Burns et al. (2012).
2.9. Hourly data adjustment for future use
Finally, we perform procedures similar to those described in section 2.7 above to adjust hourly values of the potential gradient at Vostok for meteorological influences and for the solar-wind-imposed potential difference (we will need these data later). To this end, we employ a 6-hour time series of meteorological parameters (namely, temperature and wind speed at 0:00, 6:00, 12:00 and 18:00 UTC) and interpolate them to estimate values at 0:30, 1:30, 2:30 UTC and so on (full_temp_df
, full_wind_df
); we merge these data with hourly electric field values (corresponding to averages over 0:00–1:00, 1:00–2:00, 2:00–3:00 UTC and so on) and with the hourly data from swip_df
(also corresponding to 0:30, 1:30, 2:30 UTC and so on) to obtain the data set full_meteo_solar_df
. To remove the unnecessary variability from the potential gradient data, we use the same regression coefficients earlier determined for the diurnal mean values.
Note: The average adjusted hourly values are saved in the temporary file
vostok_diurnal_2006_2020_adjusted.npy
to facilitate plotting later figures.
3. The script 3_CAPE_T2_images.ipynb
This script calculates the seasonal variation of air temperature at the height of 2 m (T2
), derived from global simulations of atmospheric dynamics, and presents distributions of CAPE values in the tropics (more precisely, within 30°S–30°N) in different models.
3.1. Plotting Figures 1.4 and 2.4
In the script the average temperatures for different latitudes and months are loaded from WRF_T2_MONxLAT.npy
(see the data description below).
The temperature is then averaged across the latitude ranges of 20°S–20°N, 30°S–30°N, 40°S–40°N and 50°S–50°N, taking into account the area variation with latitude (1° wide cells at higher latitudes are weighted with a diminishing coefficient). The results of the averaging (the seasonal variation of temperature within the specified latitude bands) are presented in Figures 1.4 and 2.4, consisting of four panels each.
3.1. Plotting Figure 2.S1
The source data for CAPE distributions is downloaded from WRF_CAPE_HIST.npz
, INMCM_SCAPE_HIST.npz
, WRF_CAPE_RAIN_HIST.npz
and INMCM_SCAPE_RAIN_HIST.npz
(see the data description below).
On the basis of these data the script plots Figure 2.S1, which shows histograms of CAPE for all grid columns and hours within 30°S–30°N, as well as for those grid columns and hours within the same range for which hourly precipitation is non-zero.
4. The script 4_IP_simulations_temporal_images.ipynb
This script implements the calculation of the seasonal variation of the simulated ionospheric potential using the data from two different models of atmospheric dynamics (the WRF and the INMCM). Specifically, it utilises different data sets of the modelled ionospheric potential, various time periods and so on.
Note: Unlike previous scripts, this and subsequent scripts do not use pandas for data analysis. This is because all the input data for ionospheric potential simulations (and temperature) are stored in binary
.npy
(numpy) format.
The import of libraries and helper functions in this script is similar to the previous ones and will not be discussed in detail.
The first step in the script is loading the time series of the modelled ionospheric potential. Since there are quite a few of them (calculated for different models with various parameterisation settings), for consistency they are loaded into two dictionaries: wrf_hourly_total_ip
and inm_hourly_total_ip
. To facilitate temporal grouping, two auxiliary index arrays, <model>_dt_indices
, of the same length as the arrays in the dictionaries, are also created; these arrays store the series of dates corresponding to the respective ionospheric potential arrays.
Dictionary | Key | Ionospheric Potential Time Series |
---|---|---|
wrf_hourly_total_ip |
500 | WRF model, CAPE > 500 kJ/kg, new parameterisation with T2m > 25 °C |
wrf_hourly_total_ip |
600 | WRF model, CAPE > 600 J/kg |
wrf_hourly_total_ip |
800 | WRF model, CAPE > 800 J/kg |
wrf_hourly_total_ip |
1000 | WRF model, CAPE > 1000 J/kg |
wrf_hourly_total_ip |
1200 | WRF model, CAPE > 1200 J/kg |
inm_hourly_total_ip |
600 | INMCM model with CAPE > 600 J/kg |
inm_hourly_total_ip |
800 | INMCM model with CAPE > 800 J/kg |
inm_hourly_total_ip |
1000 | INMCM model with CAPE > 1000 J/kg |
inm_hourly_total_ip |
1200 | INMCM model with CAPE > 1200 J/kg |
4.1. Constructing Figure 2.2
Here we average the data from the loaded arrays in the dictionaries (data sets for both models with the usual parameterisation and CAPE thresholds of 600, 800, 1000 and 1200 J/kg) over months and construct eight plots.
Each of these plots represents the seasonal variation for a specific model and set of parameters. Note that monthly grouping for seasonal averaging is implemented using the auxiliary index arrays described above.
Eight plots use the modelling data, while the final two are based on the data from the Vostok station (before and after adjustments). It should be noted that here we use the output from the script 2_Vostok_measurements_images.ipynb
.
4.2. Constructing Figure 2.1
Here we plot average diurnal variaiton curves corrsponding to the same data sets which were earlier used to plot Figure 2.2. We also use some output from the script 2_Vostok_measurements_images.ipynb
.
4.3. Constructing Figures 2.S2 and 2.S3
Figures 2.S2 and 2.S3 are constructed similarly to Figure 2.1. However, here specific periods are selected for averaging prior to plotting.
5. The script 5_IP_simulations_spatial_images.ipynb
This script performs a somewhat more complex calculation. It uses the same time-averaging approaches as the previous one but additionally decomposes the ionospheric potential into contributions from different latitude ranges.
This script also introduces the results obtained using a new ionospheric potential parameterisation, which includes the surface air temperature (T2
).
Note: The description of some helper functions and helper arrays will not be provided here, as they are identical to those used in Scripts 1–4.
5.1. Helper class LatitudeBand
The LatitudeBand
class provides a convenient abstraction for extracting data corresponding to specific latitude ranges from two-dimensional arrays.
During initialisation, the model (either the WRF or the INMCM) is taken into account, recognising that these models are based on different grids. Also, the latitude range expressions such as 8N-8S
are parsed.
Useful attributes of this class include array slices, visualisation colours, as well as titles and labels for display purposes.
Usage example:
my_band = LatitudeBand("20N-20S", model="WRF")
seas_var = some_LATxMON_wrf_ip[my_band.slice].sum(axis=0)
This code returns seas_var
, a seasonal variation (12 values) of the contribution to the ionospheric potential from a specific latitude range between 20°N and 20°S.
The class implements the calculation of various fields used for generating figures. For example, my_band.title
will return the string 20° N–20° S
. For a more detailed understanding, it is recommended to review how class objects are utilised in the code, as the class definition itself may not be very intuitive.
5.2. Loading precalculated arrays and auxiliary calculations
The loading procedure is similar to that in Script 4. However, here we work not with the total one-dimensional ionospheric potential values but with contributions that are monthly averaged and summed within latitude bands of one grid step width.
The loaded arrays and dictionaries of arrays (see the description in Script 4) are named in the format <model>_LATxMON_<variable>
, highlighting their dimensions (number of the latitudes in the model grid × 12).
After loading these data, various seasonal variation characteristics are calculated for different latitude ranges. These include the contribution of each latitude range to the total ionospheric potential (as a percentage), the amplitude of the seasonal variation, the months when the minimum and maximum of the seasonal variation are attained.
5.3. Constructing Figures 2.3 and 2.5
For Figure 2.5 we decompose the seasonal variation of the ionospheric potential into contributions and show them cumulatively. The total value in the figure is composed of contributions from specific latitude ranges: ["0-9N", "0-9S", "9S-18S", "18S-90S", "9N-18N", "18N-90N"]
.
Figure 2.3 uses a similar approach to averaging the arrays; however, the contributions of individual latitude strips are plotted in separate subplots. Additionally, seasonal variation patterns for the average temperature across different latitude ranges are shown.
Note: While the ionospheric potential contributions from individual grid cells within latitude ranges can be summed (they are additive), this is not the case for temperature. For temperature we must calculate a weighted average, where the weighting function takes into account the variability in the grid cell area.
5.4. Constructing Figure 2.6
This figure, which illustrates the simulated diurnal and seasonal variations according to different parameterisations, introduces the data from the new parameterisation that takes into account surface air temperature.
The data processing procedure for the seasonal variation is similar to that used in Figure 2.5. For the diurnal variation the arrays are created based on the hourly time series of the total ionospheric potential (wrf_hourly_total_ip
), which are used exclusively for this figure.
Note that we again utilise the data from the Vostok station to construct the figure. Here we load two preprocessed data sets from previous scripts: one for the diurnal variation (vostok_diurnal_2006_2020_adjusted.npy
) and one for the seasonal variation (vostok_2006_2020_results_adjusted.npz
).
Description of the data files
./data/Vostok/
vostok_hourly_from_10_s_without_calibration_and_empty.tsv
,vostok_hourly_from_5_min_without_calibration_and_empty.tsv
: Text files containing two columns, one of which represents the date and time (columnDatetime
) and the other, hourly averaged potential gradient values on the basis of the measurements at the Russian Antarctic station Vostok (columnField
, the units are V/m).vostok_1998_2004_hourly_80percent_all.tsv
: Hourly potential gradient data for 1998–2004, based on earlier measurements with a different field mill. Contains two columns:Datetime
(date and time) andField
(potential gradient values in V/m).vostok_1998_2020_hourly_SWIP.txt
: Simulated (with the Weimer 2005 model) hourly solar wind imposed ionospheric potentials over the Vostok station for 1998–2020.vostok_daily_pressure_mm_hg.csv
,vostok_daily_temp.csv
,vostok_daily_wind.csv
: CSV files containing daily averaged atmospheric pressure, temperature and wind speed at the Vostok station.vostok_daily_pressure_mm_hg_6h.csv
,vostok_daily_temp_6h.csv
,vostok_daily_wind_6h.csv
: CSV files containing daily atmospheric pressure, temperature and wind speed at the Vostok station every 6 hours (the pressure data are not actually used here, given that it did not show a statistically significant correlation with potential gradient values).
./data/WRF/
WRF_HOURLY_TOTAL_IP_[500_T2_25/600/800/1000/1200].npy
:numpy
arrays containing hourly total ionospheric potential values simulated with the WRF model, assuming the CAPE threshold of 500, 600, 800, 1000 or 1200 J/kg. Note that for the threshold of 500 J/kg a new parameterisation is applied, which also assumes a 25 °C threshold in temperature; the same applies to the next item on the list.WRF_IP_[500_T2_25/600/800/1000/1200]_LATxMON.npy
:numpy
arrays with dimensions(latitude, 12)
, containing monthly average contributions to the ionospheric potential from different latitude ranges simulated with the WRF model, assuming the CAPE threshold of 500, 600, 800, 1000 or 1200 J/kg.WRF_NUMDAYS_MON.npy
: Anumpy
array indicating the number of fair-weather days included in the monthly averages.WRF_T2_LATxMON.npy
: Anumpy
array with the shape(180, 12)
, containing montly averaged temperatures at the height of 2 m (T2m
) for each 1° wide latitude band.T2m
are calculated with the WRF.WRF_CAPE_HIST.npz
,WRF_CAPE_RAIN_HIST.npz
: Values and 10-J/kg bin boundaries to plot histograms of CAPE within 30°S–30°N simulated with the WRF model (for all grid columns and hours and for those with non-zero precipitation)
./data/INMCM/
INMCM_HOURLY_TOTAL_IP_[600/800/1000/1200].npy
:numpy
arrays containing hourly total ionospheric potential values simulated with the INMCM model, assuming the CAPE threshold of 600, 800, 1000 or 1200 J/kg.INMCM_IP_[600/800/1000/1200]_LATxMON.npy
:numpy
arrays with dimensions(latitude, 12)
, containing monthly averaged contributions to the ionospheric potential from different latitude ranges simulated with the INMCM model, assuming the CAPE threshold of 600, 800, 1000 or 1200 J/kg.INMCM_NUMDAYS_MON.npy
: Anumpy
array indicating the number of fair-weather days included in the monthly averages.INMCM_SCAPE_HIST.npz
,INMCM_SCAPE_RAIN_HIST.npz
: Values and 10-J/kg bin boundaries to plot histograms of surface CAPE within 30°S–30°N simulated with the INMCM model (for all grid columns and hours and for those with non-zero precipitation)