
State Space: Exogenous Vars - Conflicting dimensions for time (bug?) #424


Open
rOybOiii opened this issue Feb 11, 2025 · 4 comments · May be fixed by #451


@rOybOiii

rOybOiii commented Feb 11, 2025

Hey PyMC & @jessegrabowski

I’m trying to use state space’s forecast method with exogenous variables, but I keep getting this error:

ValueError: conflicting sizes for dimension 'time': length 1107 on the data but length 8 on coordinate 'time'

My scenario: I’m passing exogenous indicator variables as an array of 18 columns by 8 rows, for an 8-day forecast. The period I’m forecasting is not at the -very- end of my time series. I’m testing my exogenous variables’ ability to reduce variance on events and holidays.

Time series info:

Length: 1107 data points
Start: 2022-01-19
End: 2025-01-29
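
As a quick sanity check on the numbers in the error message, the arithmetic below assumes the series is daily with no gaps (an assumption, though it is consistent with the stated length). It shows that 1107 is the length of the full series and that the forecast start used below falls well inside it, so the 8 in the error is the forecast horizon. It is only a sketch of the date arithmetic and does not touch the state space code.

```python
from datetime import date

# Dates taken from the issue; daily frequency with no gaps is an assumption.
series_start = date(2022, 1, 19)
series_end = date(2025, 1, 29)
forecast_start = date(2024, 1, 25)
n_periods = 8

# Number of daily observations, counting both endpoints.
n_obs = (series_end - series_start).days + 1

# Zero-based position of the forecast start within the series.
start_pos = (forecast_start - series_start).days

# Observations from the forecast start through the end of the series.
remaining = n_obs - start_pos

print(n_obs, start_pos, remaining)  # 1107 736 371
```

So the forecast window starts at position 736 of 1107, with 371 observations from that date onward, far more than the 8 rows passed in the scenario.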

Relevant code:

import pandas as pd

# Define start date and forecast period
start_date, n_periods = pd.to_datetime("2024-01-25"), 8

# Extract exogenous indicator variables for the forecast period
scenario = {'data_exog': pd.DataFrame(df_holidays.loc[start_date:].iloc[:n_periods].to_numpy(dtype=float),
                                      columns=df_holidays.columns)}

# Generate the forecast
forecasts = ss_mod.forecast(trace, start=start_date, periods=n_periods, scenario=scenario)

My second line of code is extracting the relevant portion of my exogenous variables and assigning it to the scenario.
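
To rule out the slicing step itself, here is a minimal sketch that rebuilds it on a stand-in frame and checks that the result has exactly `n_periods` rows covering the intended dates. The stand-in `df_holidays` (two made-up columns, `holiday` and `event`, all zeros, daily index) is hypothetical and only mirrors the shape described above; it says nothing about how `forecast` handles the scenario internally.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_holidays: daily 0/1 indicators over the full series.
index = pd.date_range("2022-01-19", "2025-01-29", freq="D")
df_holidays = pd.DataFrame(
    np.zeros((len(index), 2)), index=index, columns=["holiday", "event"]
)

start_date, n_periods = pd.to_datetime("2024-01-25"), 8

# Same slice as in the issue: rows from start_date onward, first n_periods of them.
window = df_holidays.loc[start_date:].iloc[:n_periods]

# Converting to a NumPy array drops the datetime index, as in the original code.
exog = pd.DataFrame(window.to_numpy(dtype=float), columns=df_holidays.columns)

# The dates the forecast horizon is expected to cover.
expected_index = pd.date_range(start_date, periods=n_periods, freq="D")

print(exog.shape)
print(list(window.index) == list(expected_index))
```

If the shape is `(n_periods, n_columns)` and the dates match, the scenario data itself is the right size, which points the mismatch at how the 8-row scenario is combined with the 1107-point time coordinate rather than at the slice.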

Am I getting this error because I’m not forecasting at the very end of my time series? (edit: dumb question, I’ll test this later.)

Conda Environment-
PyMC Version: 5.20.1
PyMC Extras Version: 0.2.3
Python Version: Python 3.11.11

Thanks,
Roy

@rOybOiii
Author

Can offer data and Jupyter file. Lemme know.

-Roy

@jessegrabowski
Member

Can you provide a very simple example that reproduces the error?

@rOybOiii
Author

My model and some data are available on the other forum. I haven't yet had a moment to pare it down to something simpler, but Jonathan was able to run it in 41 minutes after adding some column names to the data.

I'll try to provide something even simpler soon - but work calls me now.

-Roy

@rOybOiii
Author

rOybOiii commented Feb 19, 2025

Hey @jessegrabowski :

I modified my model so it's a little simpler, with only 2 exogenous variables, and I deleted about 25-30% of the data points. Total run time is about 26 minutes for me, but with your M4 processor, I imagine it'll be closer to 8 minutes. Do you need something even simpler?

I ran this model using a fresh environment with pymc-extras 0.2.3.

Here's the Jupyter file

Here's the data

-Mike
