Using df.at() for assignment changes views of certain columns, but not all of them #22372
Comments
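Quite a few things being thrown out there but unfortunately it's not explicit enough as to what issue(s) you are trying to address. Also the expectation that `test[['cost']]` and `test.cost` are the same is incorrect, as the former returns a frame whereas the latter returns a series.
Please refer to the contributing guide on how to open bug reports. The biggest thing here is to post a minimally reproducible example: https://pandas.pydata.org/pandas-docs/stable/contributing.html#bug-reports-and-enhancement-requests
If you can provide the minimally reproducible example and clarify what the issue is exactly feel free to reopen |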
I'm sorry, did you even run the code?
While I understand that `test[['cost']]` and test.cost are different
objects, they still should contain _the same values_. They don't.
I'll work on a more minimal example.
On Wed, Aug 15, 2018 at 7:09 PM, William Ayd ***@***.***> wrote:
Quite a few things being thrown out there but unfortunately it's not
explicit enough as to what issue(s) you are trying to address. Also the
expectation that test[['cost']] and test.cost are the same is incorrect,
as the former returns a frame whereas the latter returns a series
Please refer to the contributing guide on how to open bug reports. The
biggest thing here is to post a *minimally* reproducible example:
https://pandas.pydata.org/pandas-docs/stable/contributing.html#bug-reports-and-enhancement-requests
If you can provide the minimally reproducible example and clarify what the
issue is exactly feel free to reopen
|
@liesb pls post a minimal example - you have lots of things going on which are unrelated |
Minimal example: I'm changing the value in the
|
I'm adding the
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.23.4 |
OK thanks for the example. Just to remove unnecessary prints and make it more concise:
import numpy as np
import pandas as pd

df = pd.DataFrame(index=[0])
df['x'] = 1
df['cost'] = 2
idx = np.argmax(df.cost.values)
centroid = df.loc[[0]]
new_param = 4
df.at[idx, 'x'] = new_param
df.at[idx, 'cost'] = 789
print(df['cost'])  # Prints 789
print(df[['cost']])  # Prints 2
It is strange that the frame shows the original value from before the .at assignment. Note that almost any change that makes the construction more idiomatic or removes elements makes the problem disappear, for example even doing:
df = pd.DataFrame([[1, 2]], columns=['x', 'cost'], index=[0])
idx = np.argmax(df.cost.values)
centroid = df.loc[[0]]
new_param = 4
df.at[idx, 'x'] = new_param
df.at[idx, 'cost'] = 789
print(df['cost'])  # Prints 789
print(df[['cost']])  # Prints 789
My guess is there is some wonky state management going on. Investigation and PRs are always welcome |
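The comparison made in the reduced example can be wrapped in a small check. This is a hedged sketch rather than anything posted in the thread: `views_agree` is a hypothetical helper name, and the result depends on the installed pandas version (the report is against 0.23.4), so no expected output is given.

```python
import numpy as np
import pandas as pd


def views_agree(df, col):
    """Return True if df[col] and df[[col]][col] hold the same values.

    Hypothetical helper for illustration; not part of pandas.
    """
    as_series = df[col]        # Series access path
    as_frame = df[[col]][col]  # frame access path, then back to a Series
    return bool(as_series.equals(as_frame))


# Same steps as the reduced example in the thread.
df = pd.DataFrame(index=[0])
df['x'] = 1
df['cost'] = 2
idx = np.argmax(df.cost.values)
centroid = df.loc[[0]]  # list-based .loc lookup, as in the report
df.at[idx, 'x'] = 4
df.at[idx, 'cost'] = 789

# Report the version next to the result, since the behaviour is version-specific.
print(pd.__version__, views_agree(df, 'cost'))
```

A result of `False` would mean the two access paths disagree, which is the symptom described in this issue.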
df._item_cache needs to be cleared, not sure exactly where |
Looks like the |
_getitem_iterable calls self.obj._reindex_with_indexers which calls mgr.reindex_indexer which calls _consolidate_inplace |
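The two comments above point at a per-column cache that is not cleared when the underlying storage is consolidated. The following is a toy model of that kind of failure in plain Python, written only to illustrate the idea. `ToyFrame` and all of its methods are hypothetical, it does not use or reproduce pandas internals, and whether pandas fails in exactly this way is an assumption.

```python
class ToyFrame:
    """Toy stand-in for a frame with a per-column cache. NOT pandas code."""

    def __init__(self, columns):
        # Backing storage: column name -> list of values.
        self._blocks = {name: list(vals) for name, vals in columns.items()}
        # Cache of column objects handed out by get_column().
        self._item_cache = {}

    def get_column(self, name):
        # Analogue of df['cost']: reuse the cached object if there is one.
        if name not in self._item_cache:
            self._item_cache[name] = self._blocks[name]  # shares storage
        return self._item_cache[name]

    def get_subframe(self, names):
        # Analogue of df[['cost']]: always read from backing storage.
        return {name: list(self._blocks[name]) for name in names}

    def consolidate(self, clear_cache):
        # Rebuild backing storage as fresh copies; optionally drop the cache.
        self._blocks = {name: list(vals) for name, vals in self._blocks.items()}
        if clear_cache:
            self._item_cache.clear()

    def set_value(self, row, name, value):
        # Analogue of df.at[row, name] = value: write through the column object.
        self.get_column(name)[row] = value


def demo(clear_cache):
    frame = ToyFrame({'x': [1], 'cost': [2]})
    frame.get_column('cost')                    # populates the cache
    frame.consolidate(clear_cache=clear_cache)  # storage is replaced
    frame.set_value(0, 'cost', 789)
    return frame.get_column('cost'), frame.get_subframe(['cost'])


print(demo(clear_cache=False))  # ([789], {'cost': [2]})
print(demo(clear_cache=True))   # ([789], {'cost': [789]})
```

With the cache left in place, the write lands in the old column object while the subframe reads the new storage, so the two views disagree. Clearing the cache during consolidation keeps them in sync.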
Code Sample, a copy-pastable example if possible
Problem description
The `cost` column is displayed incorrectly when we use the command `test`. It is correct when we use `test.cost` but not when we use `test[['cost']]` or `test.iloc[4]`. The other columns are displayed correctly, it seems.
This doesn't happen when we use .loc() to assign new_param to the `x` and `y` columns.
It also doesn't happen when we comment out `centroid = df.loc[rng.choice(np.arange(5), size=1)].mean(axis=0)`.
The current behaviour is problematic because choosing different ways of viewing the data yields different answers, which shouldn't be the case.
Expected Output
`cost` column in `test` would correspond to `test.cost`.
`test[['cost']]` and `test.cost` give the same results
`test.iloc[4]` displays the 4th element of `test.cost`
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.23.4
pytest: None
pip: 18.0
setuptools: 36.4.0
Cython: None
numpy: 1.15.0
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None