Short Description of the Scripts
Note: The analysis uses simulated ionospheric potential data from climate models. Because these data are very large (around 350 GB), only preprocessed lower-dimensional data (around 20 MB) are uploaded to the repository. The data can be prepared with the script 0_prepare_data.ipynb, but this requires downloading large files from https://eee.ipfran.ru/files/seasonal-variation-2024/.
- 1_Earlier_measurements_images.ipynb plots seasonal variations from external sources
- 2_Vostok_measurements_images.ipynb plots seasonal variations and the seasonal-diurnal diagram using new and early Vostok PG measurements
- 3_WRF_T2_images.ipynb plots seasonal variation of T2m temperature averaged across different latitude bands
- 4_IP_simulations_temporal_images.ipynb plots seasonal variation of simulated IP grouped by datasets and different year ranges
- 5_IP_simulations_spatial_images.ipynb plots seasonal variation of simulated IP grouped by latitude ranges
Note: The scripts should be executed sequentially one after another; at the very least, scripts 4 and 5 should be run after script 2. This is necessary because script 2 saves intermediate arrays of preprocessed data from the Vostok station, which are used in scripts 4 and 5.
Detailed Description of the Scripts
Script 1_Earlier_measurements_images.ipynb
This program contains digitized data from external sources, necessary for constructing Figure 1.1.
At the beginning of the script, the necessary libraries are loaded and arrays with digitized data are declared; at the end, a graph is constructed.
Data analysis in this file is minimal - it calculates the amplitude of seasonal variation (as a percentage relative to the annual average value).
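A minimal sketch of such an amplitude calculation is given below. It assumes that the amplitude is the peak-to-peak range of the 12 monthly means and that the annual average is their unweighted mean; the notebook may define either quantity differently, and the function name is hypothetical.

```python
import numpy as np

def seasonal_amplitude_percent(monthly_means):
    """Peak-to-peak seasonal amplitude as a percentage of the annual mean.

    Assumption: the annual mean is the plain average of the monthly values.
    """
    monthly_means = np.asarray(monthly_means, dtype=float)
    annual_mean = monthly_means.mean()
    peak_to_peak = monthly_means.max() - monthly_means.min()
    return 100.0 * peak_to_peak / annual_mean

# Synthetic example: six months at 90 V/m and six months at 110 V/m
demo_amplitude = seasonal_amplitude_percent([90] * 6 + [110] * 6)
```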
Script 2_Vostok_measurements_images.ipynb
This script is fairly long (see the comments in the code for details).
Firstly, the digitized data are declared again in the code (here only the earlier data from the Vostok station, which are also used in the first script).
Preparing PG data
Secondly, measurement data from the Vostok station (pre-averaged by the hour) are loaded into Pandas dataframes: the new datasets (dataframes df_10s and df_5min) and the earlier dataset (dataframe earlier_df_src).
New measurements at the Vostok station are combined from hourly data derived from 10-second files and hourly data derived from 5-minute files; it should be noted that the dataset primarily relies on the 10-second data, and the 5-minute data are only used when the 10-second data were unavailable (there were 24 such hours in 2013, 312 in 2015, 1752 in 2017, and 3600 in 2020). The composite series of new measurements is saved in the dataframe df_src.
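One way to build such a composite series is sketched below. It assumes both hourly dataframes share a datetime index and a Field column, and it uses pandas combine_first, which may not be the method used in the notebook; the function and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

def combine_hourly(primary: pd.DataFrame, fallback: pd.DataFrame) -> pd.DataFrame:
    """Keep values from `primary`; use `fallback` only where `primary`
    has no row or holds NaN for that hour."""
    return primary.combine_first(fallback).sort_index()

# Synthetic example: hour 1 is NaN and hour 3 is absent in the 10-second series
hours = pd.date_range("2013-01-01", periods=4, freq=pd.Timedelta(hours=1))
hourly_from_10s = pd.DataFrame({"Field": [100.0, np.nan, 120.0]}, index=hours[:3])
hourly_from_5min = pd.DataFrame({"Field": [105.0, 130.0]}, index=hours[[1, 3]])

composite = combine_hourly(hourly_from_10s, hourly_from_5min)
```

In this example the composite takes hours 0 and 2 from the 10-second series and hours 1 and 3 from the 5-minute series.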
Next, we introduce helper functions. Notably, the pass_fair_weather function, when applied to a dataframe, retains only those days when (1) there were no gaps, (2) the potential gradient was non-negative and did not exceed 300 V/m, and (3) the peak-to-peak amplitude was no more than 150% of the average daily value of the potential gradient.
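The three criteria can be sketched as follows. This is an illustrative re-implementation, not the notebook's code: it assumes an hourly dataframe with a datetime index and a Field column, and it interprets "no gaps" as 24 non-missing hourly values per day. The function name is hypothetical.

```python
import numpy as np
import pandas as pd

def pass_fair_weather_sketch(df: pd.DataFrame, col: str = "Field") -> pd.DataFrame:
    """Return only the rows that belong to fair weather days."""
    day = df.index.floor("D")
    stats = df[col].groupby(day).agg(["count", "min", "max", "mean"])
    fair = (
        (stats["count"] == 24)                                       # (1) no gaps
        & (stats["min"] >= 0) & (stats["max"] <= 300)                # (2) 0 <= PG <= 300 V/m
        & ((stats["max"] - stats["min"]) <= 1.5 * stats["mean"])     # (3) peak-to-peak <= 150% of daily mean
    )
    return df[day.isin(stats.index[fair])]
```

Computing the per-day statistics once and then selecting rows by day keeps the filter free of explicit loops over days.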
The next helper functions to mention are calculate_seasonal_var_params and std_error.
The input to the first function is a dataframe with average daily values, and the function returns (1) an array of 12 average monthly values of PG, (2) an array of 12 counts of fair weather days per month, and (3) an array of 12 sums of squares of the average daily fair weather PG values divided by the number of fair weather days, given by the following formula:
sumₘ = Σᵢ (daily mean PG for the i-th fair weather day)² / (count of fair weather days in month m),
where m = 1...12 denotes the month number, and i iterates over all fair weather days for which the month of the date equals m.
The std_error function takes the output of the calculate_seasonal_var_params function and returns 12 values of the standard error, one for each month.
Both described functions are used to compute values necessary for plotting graphs (mean value ± standard error).
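The sketch below shows how these three arrays are sufficient to obtain a standard error. It assumes the standard error of the monthly mean is sqrt((sumₘ - meanₘ²) / (nₘ - 1)), which follows from the unbiased sample variance; the exact estimator used in the notebook may differ. The function names are hypothetical, and a Series of daily means stands in for the dataframe.

```python
import numpy as np
import pandas as pd

def seasonal_params_sketch(daily: pd.Series):
    """Return monthly means, day counts and mean squares (12 values each)."""
    month = daily.index.month
    mean = np.full(12, np.nan)
    count = np.zeros(12, dtype=int)
    mean_sq = np.full(12, np.nan)
    for m in range(1, 13):
        values = daily[month == m].dropna().to_numpy()
        count[m - 1] = values.size
        if values.size:
            mean[m - 1] = values.mean()
            mean_sq[m - 1] = (values ** 2).mean()
    return mean, count, mean_sq

def std_error_sketch(mean, count, mean_sq):
    """Standard error of each monthly mean; NaN where fewer than 2 days exist."""
    variance = np.maximum(mean_sq - mean ** 2, 0.0)  # guard against rounding below zero
    with np.errstate(divide="ignore", invalid="ignore"):
        se = np.sqrt(variance / (count - 1))
    return np.where(count > 1, se, np.nan)
```

Storing the mean of squares rather than the raw daily values keeps the intermediate results small while still allowing the variance to be recovered.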
For both new and early Vostok data, we apply the pass_fair_weather function, resulting in two datasets that contain only the hours of fair weather days (df and earlier_df).
Figure 1.2
To construct Figure 1.2, using the prepared data and helper functions, we calculate the mean values, the count of fair weather days and standard errors for three sets of data:
- The complete series of new Vostok data.
- The same series up to and including the year 2012.
- The same series after the year 2012.
Note: The data from this figure are saved in the temporary file vostok_2006_2020_results.npz for use in the second article. This avoids duplicating code or merging the code for different figures into a single cumbersome file.
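Saving and reloading such intermediate arrays can be sketched with np.savez. The array names (mean, counts, sqr) and values below are hypothetical placeholders, and the file is written to a temporary directory rather than to the repository.

```python
import os
import tempfile
import numpy as np

# Placeholder arrays standing in for the 12 monthly results
mean = np.linspace(100.0, 130.0, 12)
counts = np.arange(10, 22)
sqr = mean ** 2 + 25.0

path = os.path.join(tempfile.mkdtemp(), "vostok_2006_2020_results.npz")
np.savez(path, mean=mean, counts=counts, sqr=sqr)

# A later script reads the arrays back by the keys used when saving
with np.load(path) as data:
    loaded_mean = data["mean"]
    loaded_counts = data["counts"]
    loaded_sqr = data["sqr"]
```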
Figure 1.3
To construct Figure 1.3, we transform the Vostok data series into a matrix of 12 months × 24 hours. To do this, we group the original dataframe of fair weather hours by months and hours, and then find the mean value for all data points taken at a specific hour of a specific month (saved in dataframe sd_df).
For clarity, we also present slices of this diurnal-seasonal diagram at 3, 9, 15, and 21 hours UTC.
Note: Renaming the levels of the multi-index resulting from grouping (sd_df.index.set_names(['hour', 'month'], inplace=True)) is not necessary for the code and can be commented out; however, it may be convenient for further work with the diurnal-seasonal dataframe sd_df.
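The grouping step for Figure 1.3 can be sketched on synthetic data as follows. The sketch assumes an hourly dataframe with a datetime index and a Field column; the demo names are hypothetical, and the synthetic PG simply peaks at 19 UTC in every month.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series covering the whole of 2010
idx = pd.date_range("2010-01-01", periods=365 * 24, freq=pd.Timedelta(hours=1))
hours = idx.hour.to_numpy()
df_demo = pd.DataFrame({"Field": 100 + 10 * np.cos(2 * np.pi * (hours - 19) / 24)}, index=idx)

# Mean over all points taken at a given hour of a given month
sd_demo = df_demo.groupby([df_demo.index.hour, df_demo.index.month]).mean()
sd_demo.index.set_names(["hour", "month"], inplace=True)

# Rows are months 1..12, columns are hours 0..23
matrix = sd_demo["Field"].unstack(level="hour")

# Slices of the diagram at 3, 9, 15 and 21 UTC
slices = matrix[[3, 9, 15, 21]]
```

With named index levels, the unstack call can refer to "hour" directly, which is the convenience mentioned in the note above.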
Figure 1.5
Removal of field anomalies associated with meteorological parameters
First, we load the meteorological datasets (temp_df, wind_df, pressure_df), averaged by days (vostok_daily_temp, vostok_daily_wind, vostok_daily_pressure_mm_hg). For further analysis, we use the meteo_df dataframe, which is created by merging the meteorological dataframes with the dataframe of daily average potential gradient values (daily_df).
Next, we compile arrays of PG anomalies and anomalies for all meteorological parameters. The anomaly is calculated using a moving window of ±10 days.
We then find the regression coefficients temp_coeffs, wind_coeffs, and pres_coeffs between the PG anomaly and the corresponding meteorological parameter anomalies, and calculate some statistical characteristics.
Using the found regression coefficients, we remove the linear relationship with meteorological parameter anomalies. The corrected PG is saved in meteo_df["CorrectedField"].
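The anomaly and regression steps can be sketched for a single meteorological parameter as follows. The sketch makes several assumptions that may not match the notebook: the anomaly is the value minus a centered 21-day moving average, the daily series has no missing dates, and each parameter is regressed separately with an ordinary least-squares line. The function name is hypothetical.

```python
import numpy as np
import pandas as pd

def remove_linear_dependence(pg: pd.Series, param: pd.Series, half_window: int = 10):
    """Return the regression slope and PG corrected for the parameter anomaly."""
    window = 2 * half_window + 1  # +-10 days around each day
    pg_anom = pg - pg.rolling(window, center=True, min_periods=1).mean()
    param_anom = param - param.rolling(window, center=True, min_periods=1).mean()

    ok = pg_anom.notna() & param_anom.notna()
    slope, intercept = np.polyfit(param_anom[ok], pg_anom[ok], 1)

    # Subtract the part of PG that is linearly related to the parameter anomaly
    corrected = pg - slope * param_anom
    return slope, corrected

# Synthetic example: PG depends linearly on temperature with slope 3
rng = np.random.default_rng(0)
temp_demo = pd.Series(rng.normal(size=200))
pg_demo = 100 + 3 * temp_demo
slope_demo, corrected_demo = remove_linear_dependence(pg_demo, temp_demo)
```

In the synthetic example the fitted slope recovers the value 3, and the corrected series retains only the slowly varying part of the signal.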
Finally, we construct Figure 1.5 using the prepared data in the same manner as was done for Figures 1.2 and 1.3.
Script 3_WRF_T2_images.ipynb
This script calculates the seasonal variation of the 2m-level temperature (T2m) taken from climate modeling results (see article 1).
In the script, temperature data averaged by longitude and by month are loaded (see data description below) from WRF_T2_MONxLAT.npy.
Next, the temperature is averaged across latitude bands 20° S–20° N, 30° S–30° N, 40° S–40° N, and 50° S–50° N. The averaging takes into account the latitudinal area factor: 1° cells at higher latitudes enter the sum with a smaller weight. The results of the averaging (seasonal temperature variation in the specified latitude band) are displayed in a four-panel figure (Figures 1.4 and 2.3).
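An area-weighted band average can be sketched as follows. It assumes the 180 rows of the array cover latitudes from 90° S to 90° N in 1° steps and weights each band by the difference of the sines of its bounding latitudes, which is proportional to its area on a sphere; the notebook may use a different weight, such as the cosine of the band center. The function name and the synthetic input are hypothetical.

```python
import numpy as np

def band_mean(t2: np.ndarray, lat_limit: int) -> np.ndarray:
    """Area-weighted mean over lat_limit S .. lat_limit N for a (180, 12) array."""
    edges = np.deg2rad(np.arange(-90, 91))               # 181 band edges
    weights = np.sin(edges[1:]) - np.sin(edges[:-1])     # relative area of each 1-degree band
    rows = slice(90 - lat_limit, 90 + lat_limit)         # symmetric about the equator
    w = weights[rows]
    return (t2[rows] * w[:, None]).sum(axis=0) / w.sum()

# Synthetic stand-in for the contents of WRF_T2_MONxLAT.npy
t2_demo = np.full((180, 12), 280.0)
band_means = {limit: band_mean(t2_demo, limit) for limit in (20, 30, 40, 50)}
```

Because the bands are symmetric about the equator, the result does not depend on whether the rows run from south to north or from north to south.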
Script 4_IP_simulations_temporal_images.ipynb
...
Script 5_IP_simulations_spatial_images.ipynb
...
Description of the data files
- WRF_T2_MONxLAT.npy - a numpy array with the shape (180, 12), containing monthly averaged 2-meter level temperature (T2m) for each 1° latitude band across a full 360° longitude. T2m values are calculated with the Weather Research and Forecasting model (WRF) version 4.3.
- vostok_hourly <...>.txt - text files containing two columns, one of which represents the date and time (column Datetime) and the other, hourly averaged potential gradient (PG) values on the basis of the latest measurements at the Russian Antarctic station Vostok (column Field, the units are V/m).
- vostok_1998_2004_hourly_80percent_all.tsv - the same as the previous file, but these are early data collected by a different sensor during 1998-2004.
- vostok_daily <...>.txt - text files containing three columns: one for the date (column UTC), the second for the daily averaged meteorological parameter based on measurements at the Russian Antarctic station Vostok, and the third, column Count, indicating the number of measurements per day (entries with fewer than 4 measurements must be filtered out before analysis).
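Filtering on the Count column can be sketched as below. The sketch reads an in-memory sample instead of a real file and assumes a tab-separated layout with a header row; the parameter column name Temp and the values are hypothetical.

```python
import io
import pandas as pd

# In-memory stand-in for one of the daily meteorological files
sample = io.StringIO(
    "UTC\tTemp\tCount\n"
    "2010-01-01\t-32.5\t4\n"
    "2010-01-02\t-30.1\t3\n"
    "2010-01-03\t-31.0\t8\n"
)

daily_demo = pd.read_csv(sample, sep="\t", parse_dates=["UTC"], index_col="UTC")

# Keep only days with at least 4 measurements
daily_filtered = daily_demo[daily_demo["Count"] >= 4]
```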