pandas.DataFrame.plot error for time series data #26921

alexgawrilow opened this issue Jun 18, 2019 · 3 comments

@alexgawrilow

Code Sample

import pandas as pd

idx = pd.date_range(start="2018", freq=pd.DateOffset(days=1), periods=2)
ts = pd.DataFrame({"A": [1, 2]}, idx)
ts.plot()

Problem description

I created the index of a time series using pd.date_range and provided the frequency as a DateOffset object. This index was passed to the DataFrame constructor to create a time series. When calling the plot method on this time series, pandas is not able to parse the frequency correctly:

    233     if isinstance(freq, DateOffset):
--> 234         freq = freq.rule_code
    235     else:
    236         freq = get_base_alias(freq)

~/venv/lib/python3.6/site-packages/pandas/tseries/offsets.py in rule_code(self)
    372     @property
    373     def rule_code(self):
--> 374         return self._prefix
    375 
    376     @cache_readonly

~/venv/lib/python3.6/site-packages/pandas/tseries/offsets.py in _prefix(self)
    368     @property
    369     def _prefix(self):
--> 370         raise NotImplementedError('Prefix not defined')
    371 
    372     @property

Expected Output

When the frequency is passed to pd.date_range as a string, plotting works fine, even though the two indices compare equal:

idx2 = pd.date_range(start="2018", freq="D", periods=2)
ts = pd.DataFrame({"A": [1, 2]}, idx2)
ts.plot()
idx.equals(idx2)

Output: True
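
The two indices compare equal, yet they carry different freq objects, which is the part the plotting code inspects. A minimal sketch of how one might check this, assuming that `equals` compares only the index values and that the `freq` attribute keeps the object type passed to `pd.date_range` (the variable names `idx_offset` and `idx_str` are illustrative, not from the report):

```python
import pandas as pd

# Same dates, built with two different ways of spelling "one day".
idx_offset = pd.date_range(start="2018", freq=pd.DateOffset(days=1), periods=2)
idx_str = pd.date_range(start="2018", freq="D", periods=2)

# The values are identical, so the indices compare equal ...
print(idx_offset.equals(idx_str))

# ... but the freq attribute holds different offset classes.
print(type(idx_offset.freq).__name__)
print(type(idx_str.freq).__name__)
```

If the class names differ as assumed here, that would explain why only one of the two otherwise equal indices fails in the plotting path.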

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.0-10-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 4.3.1
pip: 19.1.1
setuptools: 41.0.1
Cython: 0.29.6
numpy: 1.16.2
scipy: None
pyarrow: None
xarray: None
IPython: 7.3.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: 4.7.1
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@TomAugspurger
Contributor

Can you look to see what the plotting code expects about the freq? I think that at the spot where it raises, it's checking whether the freq is fixed or variable, and if it's variable it bails out.

I think for arbitrary DateOffsets, which may not have a fixed frequency, we can't expect to use pandas formatters. What do you think?

@kmw8551

kmw8551 commented Aug 15, 2019

I'll take a look

@MarcoGorelli
Member

Full traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/marco/pandas-dev/pandas/plotting/_core.py", line 907, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/__init__.py", line 61, in plot
    plot_obj.generate()
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/core.py", line 264, in generate
    self._make_plot()
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/core.py", line 1057, in _make_plot
    if self._is_ts_plot():
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/core.py", line 1049, in _is_ts_plot
    return not self.x_compat and self.use_index and self._use_dynamic_x()
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/core.py", line 1054, in _use_dynamic_x
    return _use_dynamic_x(self._get_ax(0), self.data)
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/timeseries.py", line 205, in _use_dynamic_x
    freq = get_period_alias(freq)
  File "/home/marco/pandas-dev/pandas/plotting/_matplotlib/timeseries.py", line 167, in get_period_alias
    freq = freq.rule_code
  File "pandas/_libs/tslibs/offsets.pyx", line 532, in pandas._libs.tslibs.offsets.BaseOffset.rule_code.__get__
    return self._prefix
  File "pandas/_libs/tslibs/offsets.pyx", line 528, in pandas._libs.tslibs.offsets.BaseOffset._prefix.__get__
    raise NotImplementedError("Prefix not defined")
NotImplementedError: Prefix not defined

Is this just the case of modifying _use_dynamic_x so that it would return False here, as it would in the case

import pandas as pd

idx = pd.date_range(start="2018", freq=pd.offsets.WeekOfMonth(), periods=2)
ts = pd.DataFrame({"A": [1, 2]}, idx)
ts.plot()

?
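
One way to picture the suggested change is a guard around the `rule_code` lookup. The helper below is hypothetical (`rule_code_or_none` is not a pandas function) and is only a sketch of the idea, not the actual fix: it treats an offset whose `rule_code` raises `NotImplementedError` as having no usable period alias, which a caller such as `_use_dynamic_x` could then map to returning False.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def rule_code_or_none(freq):
    """Hypothetical helper: return freq.rule_code, or None if the offset
    does not define a prefix (the NotImplementedError seen in the traceback)."""
    try:
        return freq.rule_code
    except NotImplementedError:
        return None


# A string alias resolves to an offset with a well-defined rule code.
print(rule_code_or_none(to_offset("D")))

# A generic DateOffset raised NotImplementedError in the versions reported
# in this thread; with the guard the caller gets a value back instead of an exception.
print(rule_code_or_none(pd.DateOffset(days=1)))
```

Whether falling back to the non-dynamic x-axis is the right behaviour for every arbitrary DateOffset is the open question in this thread; the sketch only shows where the exception could be caught.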
