
5 Useful Python Scripts for Time Series Analysis

Introduction

Working with time series data involves a consistent set of tasks. Raw data arrives at irregular intervals and must be resampled. Abnormal peaks must be identified before they distort any downstream analysis. Trends and seasonal patterns need to be separated from the noise. And when you have multiple series, understanding their relationships to each other requires more than just visual analysis.

These five Python scripts handle these common time series tasks. They are designed to work with standard CSV or Excel inputs, produce clear outputs, and be simple to configure for different datasets.

You can get all scripts on GitHub.

1. Resampling and aggregating irregular time series

The pain point

Real-world time series data rarely arrives at uniform intervals. Sensor readings, transaction logs, and event streams have gaps, duplicates, and inconsistent timestamps. Before any meaningful analysis, data must be aligned to a consistent frequency.

What the script does

Takes a CSV or Excel file with a datetime column and one or more value columns, resamples at a frequency you specify, and applies aggregation functions per column. Fills or flags gaps and writes a clean output file with a summary of what was changed.

How it works

The script parses the datetime column with pandas, sets it as the index, and uses resample() with configurable frequency strings. Column aggregation methods are defined in a configuration, so a temperature column can use the mean while a sales column uses the sum. Missing intervals after resampling are handled with forward filling, interpolation, or explicit NaN marking, depending on your setting. A gap report lists all intervals for which the original data had no observations.
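The core of this workflow can be sketched in a few lines of pandas. The column names, frequency, and aggregation choices below are illustrative assumptions, not the script's actual configuration:

```python
import pandas as pd

# Hypothetical irregular readings: a temperature sensor and a sales counter.
idx = pd.to_datetime([
    "2024-01-01 00:03", "2024-01-01 00:17", "2024-01-01 00:41",
    "2024-01-01 01:05", "2024-01-01 03:12",
])
df = pd.DataFrame({"temperature": [20.1, 20.4, 20.3, 21.0, 22.5],
                   "sales": [3, 1, 4, 2, 5]}, index=idx)

# Per-column aggregation config: mean for temperature, sum for sales.
agg = {"temperature": "mean", "sales": "sum"}
hourly = df.resample("1h").agg(agg)

# Record which intervals had no observations, then fill the gaps
# by interpolation (one of the fill strategies the script supports).
gap_report = hourly.index[hourly["temperature"].isna()]
hourly["temperature"] = hourly["temperature"].interpolate()
```

Here the 02:00 bucket has no source rows, so it appears in the gap report and its temperature is interpolated from the neighboring hours.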

Get the time series resampling script

2. Anomaly detection in time series data

The pain point

A single abnormal spike or drop in a time series can skew averages, break downstream patterns, and mask real trends. Manually identifying these points by inspecting plots or raw values is impractical at any significant data volume.

What the script does

Analyzes one or more numeric columns in a time series file and flags data points that fall outside expected limits using a choice of three detection methods: z-score, interquartile range (IQR), or rolling statistics. Generates an annotated file with anomaly flags and a separate summary report.

How it works

The z-score method flags points where the standardized value exceeds a configurable threshold (default ±3). The interquartile range (IQR) method flags points that fall more than 1.5 × IQR beyond the quartiles. The rolling method calculates a moving average and standard deviation over a configurable window and flags points that deviate significantly from their local context, which is useful for series with strong trends or seasonality. All three methods can be run together; the output column records which method flagged each point. An optional --plot flag saves a plot for each column with anomalies highlighted.
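The three detection methods can be sketched as follows. The series, window size, and thresholds are illustrative assumptions; the script's actual defaults may differ:

```python
import numpy as np
import pandas as pd

# Toy daily series with one injected spike at position 25.
rng = pd.date_range("2024-01-01", periods=50, freq="D")
values = np.sin(np.linspace(0, 6, 50)) + np.random.default_rng(0).normal(0, 0.1, 50)
values[25] = 8.0  # abnormal spike
s = pd.Series(values, index=rng)

# Z-score method: flag points more than 3 standard deviations from the mean.
z = (s - s.mean()) / s.std()
zscore_flags = z.abs() > 3

# IQR method: flag points more than 1.5 * IQR beyond the quartiles.
q1, q3 = s.quantile([0.25, 0.75])
iqr = q3 - q1
iqr_flags = (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)

# Rolling method: compare each point to its local window's mean and std.
roll_mean = s.rolling(7, center=True, min_periods=1).mean()
roll_std = s.rolling(7, center=True, min_periods=1).std()
rolling_flags = (s - roll_mean).abs() > 3 * roll_std
```

Note that the rolling variant includes each point in its own window, which inflates the local statistics around a spike; excluding the center point or using the median instead are common refinements.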

Get the anomaly detector script

3. Break down a series into trend, seasonality and residuals

The pain point

A time series is usually a combination of several elements: a long-term trend, a repeating seasonal pattern, and irregular residual noise. Analyzing the series as a whole makes it difficult to clearly understand any single component.

What the script does

Applies classic time series decomposition to a numeric column, separating the observed series into trend, seasonal, and residual components. Supports additive and multiplicative decomposition models. Exports each component as a column in the output file and saves a multi-panel chart.

How it works

The script uses statsmodels.tsa.seasonal.seasonal_decompose() on the target column, resampling to a constant frequency first if necessary. The decomposition period is configurable. Additive decomposition suits series where the seasonal variation is roughly constant in magnitude; multiplicative decomposition suits series where it grows with the trend level. The output Excel file contains the original series alongside the three extracted components, and the saved chart shows all four panels stacked.

Get the time series decomposition script

4. Forecasting with seasonal autoregressive integrated moving average (SARIMA)

The pain point

Producing a forecast from a time series typically involves model selection, parameter tuning, and validation steps that require statistical knowledge to get right. Setting this up from scratch each time is slow, and doing it informally produces forecasts that are hard to trust or reproduce.

What the script does

Fits a seasonal autoregressive integrated moving average (SARIMA) model to a time series column, generates a forecast for a configurable number of periods, and writes the results to an output file including forecast values, confidence intervals, and basic accuracy metrics over a held-out validation period. It can optionally select model parameters automatically by minimizing the Akaike Information Criterion (AIC).

How it works

The script uses statsmodels.tsa.statespace.sarimax.SARIMAX to build the model. When --auto-order is set, it performs a lightweight grid search over a configurable range of ARIMA and seasonal parameters and selects the combination with the lowest AIC. The series is split into a training set and a held-out test set spanning a configurable number of periods. Accuracy is reported on the test set using mean absolute error (MAE) and root mean square error (RMSE) before the final model is refitted on the full series to produce the forward forecast. Results include the point forecast and 95% confidence intervals. A forecast chart is saved showing the historical series, test-period actuals compared to their forecasts, and the forward forecast with confidence bands.

Get the SARIMA forecast script

5. Comparison of multiple time series

The pain point

When working with multiple related time series (different products, regions, sensors, or metrics), understanding how they move together requires more than visualizing them on the same graph. Correlation analysis, lag relationships, and aligned summary statistics all require calculations, and doing this on many pairs of series quickly becomes tedious.

What the script does

Takes a file with multiple time series columns, aligns them to a common frequency, and produces a multi-tab comparison report covering pairwise correlations, lag analysis (cross-correlation up to a configurable lag), and a side-by-side summary statistics table. Graphs are generated for the most correlated pairs.

How it works

The script uses pandas to align all columns to a shared datetime index after resampling. Pairwise Pearson and Spearman correlations are calculated and written to a correlation matrix tab. Cross-correlation is calculated for each pair up to a configurable maximum lag, identifying the lag at which each pair peaks, which is useful for finding lead/lag relationships. A summary tab includes the mean, standard deviation, minimum, maximum, and trend direction (positive/negative slope from a linear fit) for each series. The five most correlated pairs each receive a two-axis line graph in a dedicated graph tab.
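The lead/lag search at the heart of this report can be sketched with pandas alone. The two series below are constructed so that b trails a by three days; the column names and maximum lag are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Two hypothetical daily series where b lags a by 3 days.
rng = np.random.default_rng(3)
idx = pd.date_range("2024-01-01", periods=200, freq="D")
a = pd.Series(np.cumsum(rng.normal(0, 1, 200)), index=idx)
b = a.shift(3) + rng.normal(0, 0.1, 200)

df = pd.DataFrame({"a": a, "b": b}).dropna()

# Pairwise correlations, as in the report's correlation tab.
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

# Cross-correlation up to a maximum lag: correlate one series against
# shifted copies of the other and find the lag with peak correlation.
max_lag = 10
xcorr = {lag: df["a"].corr(df["b"].shift(-lag))
         for lag in range(-max_lag, max_lag + 1)}
best_lag = max(xcorr, key=xcorr.get)
```

Here best_lag recovers the three-day offset, which is the kind of lead/lag relationship the report surfaces for each pair.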

Get the multi-series comparison script

Conclusion

These five scripts cover the main tasks involved in working with time series data. They are designed to be used independently or sequentially: resample first, detect anomalies, decompose, forecast, then compare the series.

To get started, first download the script you plan to use and install all the dependencies listed in its README file. Next, update the configuration section at the top of the script to match your specific data and column names. Before running it on your full dataset, test the script on a small sample to confirm that the result is correct. Once you’re happy with the results, you can schedule it or integrate it into your existing data pipeline.

Happy analyzing!

Bala Priya C is an Indian developer and technical writer. She enjoys working at the intersection of mathematics, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She loves reading, writing, coding, and coffee! Currently, she is working on learning and sharing her knowledge with the developer community by creating tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
