Models
SSM: State Space Model
Univariate (Persistent temporal patterns): encompassing trends and seasonal patterns
Multivariate (Cross-variate information): correlations between different variables
Auxiliary (e.g., static time-varying features, future time-varying features, etc.)
| Name | Backbone | Type | Venue | Year | Paper | URL |
|---|---|---|---|---|---|---|
| Sonnet | Wavelet + Koopman + Attention | Multivariate | AAAI (Oral) | 2026 | Sonnet: Spectral Operator Neural Network for Multivariable Time Series Forecasting | Arxiv - GitHub |
| SimTS | Causal CNN (contrastive pre-training) | Multivariate | ICASSP | 2024 | SimTS: Rethinking Contrastive Representation Learning for Time Series Forecasting | Arxiv - IEEE - GitHub |
| Amplifier | MLP | Multivariate | AAAI | 2025 | Amplifier: Bringing Attention to Neglected Low-Energy Components in Time Series Forecasting | Arxiv - REF |
| Linear | MLP | Univariate | AAAI | 2023 | Are Transformers Effective for Time Series Forecasting? | Arxiv - REF |
| LinearIC | MLP | Univariate | AAAI | 2023 | Are Transformers Effective for Time Series Forecasting? | Arxiv - REF |
| NLinear | MLP | Univariate | AAAI | 2023 | Are Transformers Effective for Time Series Forecasting? | Arxiv - REF |
| DLinear | MLP | Univariate | AAAI | 2023 | Are Transformers Effective for Time Series Forecasting? | Arxiv - REF |
| DLinearIC | MLP | Univariate | AAAI | 2023 | Are Transformers Effective for Time Series Forecasting? | Arxiv - REF |
| DNGLinear | MLP | | | | Bridging Simplicity and Sophistication using GLinear: A Novel Architecture for Enhanced Time Series Prediction | Arxiv - REF |
| GLinear | MLP | | | | Bridging Simplicity and Sophistication using GLinear: A Novel Architecture for Enhanced Time Series Prediction | Arxiv - REF |
| FLinear | MLP | | NeurIPS | 2024 | Frequency Adaptive Normalization For Non-stationary Time Series Forecasting | Arxiv |
| FreTS | MLP | Multivariate | NeurIPS | 2023 | Frequency-domain MLPs are More Effective Learners in Time Series Forecasting | Arxiv - REF |
| LightTS | MLP | Multivariate | | | Less Is More: Fast Multivariate Time Series Forecasting with Light Sampling-oriented MLP Structures | Arxiv - REF |
| MTSD | MLP | Univariate | | | MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing | Arxiv - REF |
| MTSMatrix | MLP | Multivariate | | | MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing | Arxiv - REF |
| MTSMixer | MLP | Multivariate | | | MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing | Arxiv - REF |
| PaiFilter | MLP | | NeurIPS | 2024 | FilterNet: Harnessing Frequency Filters for Time Series Forecasting | Arxiv - REF |
| TexFilter | MLP | | NeurIPS | 2024 | FilterNet: Harnessing Frequency Filters for Time Series Forecasting | Arxiv - REF |
| RDLinear | MLP | | | | RDLinear: A Novel Time Series Forecasting Model Based on Decomposition with RevIN | IEEE |
| RLinear | MLP | | | | Revisiting Long-term Time Series Forecasting: An Investigation on Linear Mapping | Arxiv - REF |
| CrossLinear | MLP | Multivariate | KDD | 2025 | CrossLinear: Plug-and-Play Cross-Correlation Embedding for Time Series Forecasting with Exogenous Variables | Arxiv - REF |
| UMixer | MLP | Multivariate | AAAI | 2024 | U-Mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting | Arxiv - REF |
| TSMixer | MLP | Multivariate | | | TSMixer: An All-MLP Architecture for Time Series Forecasting | Arxiv - REF |
| SWIFT | MLP | Univariate | | | SWIFT: Mapping Sub-series with Wavelet Decomposition Improves Time Series Forecasting | Arxiv - REF |
| SparseTSF | MLP | Univariate | ICML | 2024 | SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters | Arxiv - REF |
| CMoS | MLP | Multivariate | ICML | 2025 | CMoS: Rethinking Time Series Prediction Through the Lens of Chunk-wise Spatial Correlations | Arxiv - REF |
| DishLinear | MLP | | AAAI | 2023 | Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting | Arxiv |
| FSMLP | MLP | | | | FSMLP: Frequency Simplex MLP for Time Series Forecasting | Arxiv |
| FITS | MLP | Univariate | ICLR | 2024 | FITS: Modeling Time Series with 10k Parameters | Arxiv - REF |
| RFITS | MLP | Univariate | ICLR | 2024 | FITS: Modeling Time Series with 10k Parameters | Arxiv - REF |
| LSTM | RNN | | Neural Computation | 1997 | Long Short-Term Memory | ACM - REF |
| GRU | RNN | | EMNLP | 2014 | Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation | Arxiv - REF |
| SegRNN | RNN | Univariate | | | SegRNN: Segment Recurrent Neural Network for Long-Term Time Series Forecasting | Arxiv - REF |
| RWKV4TS | RNN | | | | RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks | Arxiv - REF |
| DSSRNN | SSM | | | | DSSRNN: Decomposition-Enhanced State-Space Recurrent Neural Network for Time-Series Analysis | Arxiv - REF |
| MLCNN | CNN | Multivariate | AAAI | 2020 | Towards Better Forecasting by Fusing Near and Distant Future Visions | Arxiv - REF |
| SCINet | CNN | Multivariate | NeurIPS | 2022 | SCINet: Time Series Modeling and Forecasting with Sample Convolution and Interaction | Arxiv - REF |
| ModernTCN | CNN | | ICLR | 2024 | ModernTCN: A Modern Pure Convolution Structure for General Time Series Analysis | OpenReview - REF |
| xPatch | CNN | Univariate | AAAI | 2025 | xPatch: Dual-Stream Time Series Forecasting with Exponential Seasonal-Trend Decomposition | Arxiv |
| TimePoint | CNN | Multivariate | ICML | 2025 | TimePoint: Accelerated Time Series Alignment via Self-Supervised Keypoint and Descriptor Learning | OpenReview - Arxiv - REF |
| InfoTS | Causal CNN + AutoAUG | Multivariate | ICLR | 2023 | InfoTS: Information-Aware Time Series Meta-Contrastive Learning | Arxiv - REF |
| TimeKAN | KAN | | ICLR | 2025 | TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting | Arxiv - REF |
| MMK | KAN | | | | Are KANs Effective for Multivariate Time Series Forecasting? | Arxiv - REF |
| iTransformer | Transformer | Multivariate | ICLR | 2024 | iTransformer: Inverted Transformers Are Effective for Time Series Forecasting | Arxiv - REF |
| TimeMixer | MLP | Multivariate | ICLR | 2024 | TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting | Arxiv - REF |
| TimesNet | CNN | Multivariate | ICLR | 2023 | TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis | Arxiv - REF |
| TimeXer | Transformer | Multivariate | NeurIPS | 2024 | TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables | Arxiv - REF |
| TiDE | MLP | Multivariate | TMLR | 2023 | Long-term Forecasting with TiDE: Time-series Dense Encoder | Arxiv - REF |
| NonstationaryTransformer | Transformer | Multivariate | NeurIPS | 2022 | Non-stationary Transformers: Exploring the Stationarity in Time Series Forecasting | Arxiv - REF |
| MICN | CNN | Multivariate | ICLR | 2023 | MICN: Multi-scale Isometric Convolution Network for Long-term Time Series Forecasting | OpenReview - REF |
| Koopa | MLP | Multivariate | NeurIPS | 2023 | Koopa: Learning Non-stationary Time Series with Koopman Predictors | Arxiv - REF |
| ETSformer | Transformer | Multivariate | ICLR | 2022 | ETSformer: Exponential Smoothing Transformers for Time-series Forecasting | Arxiv - REF |
| MSGNet | GNN | Multivariate | AAAI | 2024 | MSGNet: Learning Multi-Scale Inter-Series Correlations for Multivariate Time Series Forecasting | Arxiv - REF |
| WPMixer | Wavelet | Multivariate | | 2025 | WPMixer: Wavelet Packet Mixer for Time Series Forecasting | Arxiv - REF |
| TimeFilter | GNN | Multivariate | AAAI | 2025 | TimeFilter: Scalable and Adaptive Graph Neural Network for Time Series Forecasting | Arxiv - REF |
| MultiPatchFormer | Transformer | Multivariate | | 2024 | MultiPatchFormer: Multi-scale Patch Transformer for Long-term Time Series Forecasting | Arxiv - REF |
| Reformer | Transformer | Multivariate | ICLR | 2020 | Reformer: The Efficient Transformer | OpenReview - REF |
| Transformer | Transformer | Multivariate | NeurIPS | 2017 | Attention Is All You Need | Arxiv - REF |
| Informer | Transformer | Multivariate | AAAI | 2021 | Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting | Arxiv - REF |
| Autoformer | Transformer | Multivariate | NeurIPS | 2021 | Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting | Arxiv - REF |
| FEDformer | Transformer | Multivariate | ICML | 2022 | FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting | Arxiv - REF |
| Pyraformer | Transformer | Multivariate | ICLR | 2022 | Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting | Arxiv - REF |
| CrossFormer | Transformer | Multivariate | ICLR | 2023 | Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting | OpenReview - REF |
| PatchTST | Transformer | Univariate | ICLR | 2023 | A Time Series is Worth 64 Words: Long-term Forecasting with Transformers | Arxiv - REF |
| CARD | Transformer | | ICLR | 2024 | CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting | Arxiv - REF |
| PAttn | Transformer | Univariate | NeurIPS | 2024 | Are Language Models Actually Useful for Time Series Forecasting? | Arxiv - REF |
| Timer | Foundation | | ICML | 2024 | Timer: Generative Pre-trained Transformers Are Large Time Series Models | Arxiv - REF |
| GPT4TS | LLM | | NeurIPS | 2023 | One Fits All:Power General Time Series Analysis by Pretrained LM | Arxiv - REF |
| T54TS | LLM | | ICLR | 2024 | TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting | Arxiv - REF |
| CALF | LLM | | AAAI | 2025 | CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning | Arxiv - REF |
| LLM_TPF | LLM | | IJCAI | 2025 | LLM-TPF: Multiscale Temporal Periodicity-Semantic Fusion LLMs for Time Series Forecasting | REF |
| VisionTS | Foundation | | ICML | 2025 | VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters | Arxiv - REF |
| ConvRNN | MLP | | | | Autoregressive Convolutional Recurrent Neural Network for Univariate and Multivariate Time Series Prediction | Arxiv - REF |
| S3 | MLP | | | | Segment, Shuffle, and Stitch: A Simple Layer for Improving Time-Series Representations | Arxiv - REF |
| FAN | MLP | | | | FAN: Fourier Analysis Networks | Arxiv - REF |
| FiLM | SSM | | NeurIPS | 2022 | FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting | Arxiv - REF |
| Leddam | Transformer | | ICML | 2024 | Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling | Arxiv - REF |
Transformer Parameters
The Transformer model (from LTSF-Linear) accepts time marks and exposes these optional CLI arguments:
| Argument | Type | Default | Description |
|---|---|---|---|
| --d_model | int | 512 | Model dimension |
| --n_heads | int | 8 | Number of attention heads |
| --e_layers | int | 2 | Number of encoder layers |
| --d_layers | int | 1 | Number of decoder layers |
| --d_ff | int | 2048 | Feed-forward dimension |
| --factor | int | 1 | Attention factor (1 = full attention) |
| --dropout | float | 0.05 | Dropout rate |
| --activation | str | gelu | Activation function (gelu or relu) |
| --label_len | int | 48 | Label length for decoder input |
| --embed_type | int | 0 | Embedding type (see below) |
| --embed | str | timeF | Temporal embedding strategy (see below) |
| --freq | str | h | Dataset granularity (see below) |
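As a rough illustration of how the arguments in the table fit together, the sketch below declares them with Python's standard `argparse` module. The function name `build_transformer_parser` is hypothetical, and the `choices` restrictions are assumptions drawn only from the values documented on this page; the project's actual parser may be organised differently or accept more values.

```python
import argparse


def build_transformer_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the defaults listed in the table above."""
    p = argparse.ArgumentParser(description="Transformer options (illustrative sketch)")
    p.add_argument("--d_model", type=int, default=512, help="Model dimension")
    p.add_argument("--n_heads", type=int, default=8, help="Number of attention heads")
    p.add_argument("--e_layers", type=int, default=2, help="Number of encoder layers")
    p.add_argument("--d_layers", type=int, default=1, help="Number of decoder layers")
    p.add_argument("--d_ff", type=int, default=2048, help="Feed-forward dimension")
    p.add_argument("--factor", type=int, default=1, help="Attention factor")
    p.add_argument("--dropout", type=float, default=0.05, help="Dropout rate")
    # Assumption: only the two activations named in the table are accepted.
    p.add_argument("--activation", type=str, default="gelu", choices=["gelu", "relu"])
    p.add_argument("--label_len", type=int, default=48, help="Label length for decoder input")
    # Assumption: embedding types are limited to the documented values 0..4.
    p.add_argument("--embed_type", type=int, default=0, choices=range(5))
    p.add_argument("--embed", type=str, default="timeF", choices=["timeF", "fixed", "learned"])
    p.add_argument("--freq", type=str, default="h", help="Dataset granularity")
    return p


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_transformer_parser().parse_args(
    ["--d_model", "128", "--n_heads", "4", "--e_layers", "1"]
)
print(args.d_model, args.n_heads, args.e_layers, args.d_ff)
```

Arguments that are not passed keep the defaults from the table, so `args.d_ff` stays at 2048 here.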
Embedding types (--embed_type):
| Value | Components | Description |
|---|---|---|
| 0 | token + positional + temporal | Full embedding (default) |
| 1 | token + positional + temporal | Full embedding (learned positional) |
| 2 | token + temporal | No positional encoding |
| 3 | token + positional | No temporal encoding |
| 4 | token only | No positional or temporal encoding |
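The table above can be read as a switch over three embedding components. The following is a minimal sketch of that reading, assuming the enabled components are combined by elementwise summation (a common convention in Informer-style code, but not confirmed here for this project). The names `EmbeddingComponents`, `EMBED_TYPE_COMPONENTS`, and `combine_embeddings` are hypothetical.

```python
from typing import List, NamedTuple


class EmbeddingComponents(NamedTuple):
    token: bool
    positional: bool
    temporal: bool


# Which components each --embed_type value enables, as listed in the table.
EMBED_TYPE_COMPONENTS = {
    0: EmbeddingComponents(token=True, positional=True, temporal=True),
    1: EmbeddingComponents(token=True, positional=True, temporal=True),
    2: EmbeddingComponents(token=True, positional=False, temporal=True),
    3: EmbeddingComponents(token=True, positional=True, temporal=False),
    4: EmbeddingComponents(token=True, positional=False, temporal=False),
}


def combine_embeddings(
    embed_type: int,
    token: List[float],
    positional: List[float],
    temporal: List[float],
) -> List[float]:
    """Sum the enabled component vectors (assumed combination rule)."""
    parts = EMBED_TYPE_COMPONENTS[embed_type]
    out = list(token)  # the token embedding is always present
    if parts.positional:
        out = [a + b for a, b in zip(out, positional)]
    if parts.temporal:
        out = [a + b for a, b in zip(out, temporal)]
    return out


# Toy vectors chosen so each component's contribution is easy to spot.
print(combine_embeddings(2, [1.0, 2.0], [10.0, 10.0], [100.0, 100.0]))
```

Types 0 and 1 enable the same components; per the table, they differ only in how the positional part is obtained, which this sketch does not model.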
Temporal embedding strategies (--embed):
| Value | Class | Description |
|---|---|---|
| timeF | TimeFeatureEmbedding | Linear projection of continuous time features (default) |
| fixed | TemporalEmbedding (fixed) | Fixed sinusoidal encoding on discrete time indices |
| learned | TemporalEmbedding (learned) | Learnable embedding table on discrete time indices |
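To make the `fixed` strategy more concrete, here is a small pure-Python sketch of a sinusoidal lookup table indexed by a discrete time value such as the hour of day. It uses the standard sinusoidal formula from the original Transformer paper. The function name `sinusoidal_table` and the table size are illustrative assumptions, not the project's `TemporalEmbedding` implementation.

```python
import math
from typing import List


def sinusoidal_table(num_positions: int, d_model: int) -> List[List[float]]:
    """Build a fixed (non-trainable) table: one d_model-sized row per index."""
    table = [[0.0] * d_model for _ in range(num_positions)]
    for pos in range(num_positions):
        for i in range(0, d_model, 2):
            # Even dimension i uses sin, the following odd dimension uses cos,
            # both with wavelength controlled by 10000 ** (i / d_model).
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


# Hypothetical use: one row per hour of day, looked up by the hour index.
hour_table = sinusoidal_table(num_positions=24, d_model=4)
hour_embedding = hour_table[13]
print(len(hour_table), len(hour_embedding))
```

A `learned` variant would replace this fixed table with trainable weights of the same shape, while `timeF` skips the lookup entirely and projects continuous features with a linear layer.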
Frequency (--freq):
| Value | Mark columns | Count |
|---|---|---|
| s | month, day, weekday, hour, minute, second | 6 |
| t | month, day, weekday, hour, minute | 5 |
| h | month, day, weekday, hour | 4 |
| d | month, day, weekday | 3 |
| w | month, day, week_of_year | 3 |
| mo | month | 1 |
| q | month | 1 |
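The frequency table can be expressed as a lookup from `--freq` value to mark columns. The sketch below extracts raw integer marks from a `datetime`. The names `FREQ_MARK_COLUMNS` and `time_marks` are hypothetical, and two details are assumptions: `weekday` is taken as Monday = 0 (Python's convention) and `week_of_year` as the ISO week number. An actual pipeline may also scale these values, for example into a continuous range for `timeF`.

```python
from datetime import datetime
from typing import List

# Mark columns per --freq value, copied from the table above.
FREQ_MARK_COLUMNS = {
    "s": ["month", "day", "weekday", "hour", "minute", "second"],
    "t": ["month", "day", "weekday", "hour", "minute"],
    "h": ["month", "day", "weekday", "hour"],
    "d": ["month", "day", "weekday"],
    "w": ["month", "day", "week_of_year"],
    "mo": ["month"],
    "q": ["month"],
}


def time_marks(ts: datetime, freq: str) -> List[int]:
    """Return the raw (unscaled) mark values for one timestamp."""
    values = {
        "month": ts.month,
        "day": ts.day,
        "weekday": ts.weekday(),            # assumption: Monday = 0
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
        "week_of_year": ts.isocalendar()[1],  # assumption: ISO week number
    }
    return [values[col] for col in FREQ_MARK_COLUMNS[freq]]


ts = datetime(2024, 1, 15, 13, 30, 45)  # a Monday
print(time_marks(ts, "h"))  # month, day, weekday, hour
```

The length of the returned list matches the Count column, which is the input width a temporal embedding would need for that frequency.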
Example:
python main.py --dataset ETDatasetHour --model Transformer --strategy FedAvg \
--d_model 128 --n_heads 4 --e_layers 1 --embed_type 0 --embed timeF --freq h