Skip to content

Applying flag normalisation to local packages. #5287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 26, 2018

Conversation

merijn
Copy link
Collaborator

@merijn merijn commented Apr 26, 2018

Continuing efforts from #4247, apply similar recompilation avoidance to local packages as well. Need to look into how this interacts with new-haddock.

Please include the following checklist in your PR:

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.
  • If the change is docs-only, [ci skip] is used to avoid triggering the build bots.

Please also shortly describe how you tested your change. Bonus points for added tests!

merijn added 3 commits April 25, 2018 16:36
Move package flag normalisation code for the new-build hash into it's own
`normaliseConfiguredPackage` function and reuse it to detect whether the
configuration of a local package has changed and needs to be rebuild.
Strip empty arguments lists from the elabProgramArgs in
ElaboratedConfiguredPackage to eliminate spurious configuration changes being
found.
@merijn
Copy link
Collaborator Author

merijn commented Apr 30, 2018

Apparently not so straight-forward, as this seems to cause ghci to ignore arguments when running cabal new-repl

@merijn
Copy link
Collaborator Author

merijn commented May 2, 2018

Ping @dcoutts, @ezyang, @23Skidoo

You all seem to be responsible for the majority of the code in buildInplaceUnpackedPackage, so I was hoping you guys could weigh in on my problems.

Setting the Stage

There's a whole bunch of GHC flags that don't affect the output, so we don't want to uselessly rebuild. For dependencies this was fixed and merged by normalising the flags going into the dependency hash. However, local/inplace packages are handled differently, they are covered by a FileMonitor that triggers a rebuild when anything changes in them. Unfortunately, that FileMonitor just directly monitors the ElaboratedConfiguredPackage serialised to disk, which means that any trivial change to that will trigger a rebuild.

Why is that a problem? I have a new-build project, it has 5 local packages, I want to run cabal new-repl and pass in a custom .ghci script for the project, so I add --ghc-option=-ghc-script=foo. Whoops, flags changed! Rebuild everything!

Obviously this gets annoying quickly if those local packages take a non-trivial time to compile.

The Problem

"I know!", I said optimistically, "I'll just change the package FileMonitor to compare packages after normalising the flags! That's a trivial change!". And so I quickly hacked that in and things were good, no more spurious rebuilds of local packages! But what's that? cabal new-repl is acting weird... (as a side note, the normalisation currently also strips debug dump flags, which isn't always desirable for local packages either, I'll get to that at the end.

So, the FileMonitor comparison works fine, it ignores the same flags as I ignore for the dependency hash, but the cached ElaboratedConfiguredPackage doesn't get updated if the normalised version of the old and new one are the same. This is fine when local package is just used as a dependency, since the build output doesn't change, but breaks in all sorts of exciting ways with new-repl.

Why? After a lot of sleuthing I figured out that setupHsReplFlags, used to pass stuff into "old" cabal-install's repl command explicitly throws away program configuration (and thus flags) since those are "set at configure time" as the comments say. Except, if the cached ElaboratedConfiguredPackage isn't considered out-of-date, because the build outputs don't change, we never rerun configure. Later on the old repl command will just load the LocalBuildInfo from disk which hasn't been updated and ignore any command line flags. Whoops.

Possible Solutions

So here's where I need you guys to chime in. I've brainstormed a bunch of (partial) solutions, each with it's own trade-offs and I need some feedback.

  1. Disable the normalisation check in the FileMonitor if a package/component has a repltarget, so every configuration change is caught, causing configure to run, updating the LocalBuildInfo a properly picking up the flags.

  2. Keep the normalisation check in the FileMonitor, but just pass in all the elabProgramArgs and programOverrideArgs for GHC when invoking the repl, bypassing the whole configure step entirely.

  3. Decide that GHC (or compilers in general) are special, and add a second flag normalisation codepath specifically for compilers where we can classify flags as affecting build output or affecting repl invocation and do smarter filtering, so it correctly detects repl flag changes when invoking the repl. But this would involve somehow filtering out the repl flags from the on-disk configuration.

  4. Give up and cry ;)

Pros & Cons

Both 1 & 3 keep the indirection via serialised build info between cabal-install and repl by invoking configure. This is probably the most robust/least likely to break anything approach but comes at a cost. Configure is kinda slow, on my machine it takes about 1 sec to complete, which means that any repl invocation with additional flags will always incur a 1s slowdown for no good reason. (Do we really need to store the repl flags in the on disk configuration?)

2 is fast and trivial to implement. We can skip the configure step for the repl, just pass in all the GHC arguments directly. Downside is, some of the arguments we're passing in may already be in the stored build info on disk, so we end up passing them twice to ghci. I'm not sure whether that could cause problems or not, which is why I'm a bit hesitant about this approach.

Conclusion

I think we should try to keep the normalisation for local packages, because it does safe me a considerable amount of compile time. We should probably add a flag for new-build similar to GHC's -fforce-recomp that let's use manually override the FileMonitor normalisation so things are compiled anyway. This would help with displaying warning flags that have changed and with debug dump flags mentioned earlier.

Ideally, we should avoid calling configure just to store repl flags it slows things down for no good reason. But this would mean we need an entirely different approach to how the repl command gets it's configuration, since the current approach relies on those flag having been configured in. The simplest way would be to preserve/add repl specific flags in setupHsReplFlags/setupHsReplArgs, but then we have to think hard on whether that can break things.

@ezyang
Copy link
Contributor

ezyang commented May 3, 2018

Yes, (2) seems like the right approach. The mental model I have is that there are "slow" flags and there are "fast" flags; ostensibly you only want to reconfigure when a slow flag changes, but you want all invocations to be affected by changes to fast flags whenever. So those better be passed through and not scrubbed, as you noticed.

Perhaps the reason (2) is not the obvious solution is that, based on your description, it sounds like flag normalization doesn't really classify flags as fast or slow, it just munges the flags until you have a new "flag set" that is "normalized". So to "properly" fix this we may need to add a new concept. It's been a while since I've read the code so I am not sure.

@merijn merijn force-pushed the filter-flag-local branch from 55f3391 to 1fd6de2 Compare May 3, 2018 08:14
@merijn
Copy link
Collaborator Author

merijn commented May 3, 2018

Currently flag normalisation doesn't do any sort of classification, as I just did the simplest thing possible to reduce recompiles of dependencies. There's still a bunch of design space and trade-offs to talk about (for example, I currently let warning flags trigger rebuilds of dependencies if -Werror is present and no one was clear whether that should count as "affecting build output"), but there's no real good forum for that sorta design discussion. (Well, the original ticket, but no one commented on that issue).

I actually did have an epiphany after writing this up that would push things in the right direction. One of the issue here is that new-repl is fundamentally different from any of the other comments. At the moment Cabal doesn't support passing "ghci options" it just passes on GHC options, but I think that's pretty poor UI anyway. So I'd propose the following:

  1. Keep the normalisation of ElaboratedConfiguredPackage I introduced in the package FileMonitor this branch, to avoid triggering rebuilds of local packages. This would keep stripping GHCI flags from GHC-options, but this only affects new-repl
  2. Keep forwarding GHC options to the repl command (anything else would break the UI of old repl , which we should probably just leave alone at this point?)
  3. Extend new-repl with --ghci-option, these options would also get stripped out of ElaboratedConfiguredPackage in the package monitor, thus not triggering rebuilds and then directly forward the options specified via --ghci-option to setupHsReplFlags. This ensures they're always added to the repl invocation.

This gets rid of useless recompiles of local packages. Does not affect old repl behaviour at all. Code changes can be confined to the new-X part of cabal-install. Separating repl flags from compilation flags is better UX. This still leaves the issue of debug dump flags being filtered out and warnings not being produced due to not recompiling local packages, but I think those issues would be better served by having a dedicated flag/UI for forcing recompilation of local packages/components (something -force-rebuild-local and -force-rebuild-pkg=foo).

@merijn merijn force-pushed the filter-flag-local branch from 9afac27 to 59bd528 Compare June 8, 2018 13:19
@merijn
Copy link
Collaborator Author

merijn commented Jun 8, 2018

And finally we're getting somewhere! I still need to test and dogfood this a bit, but this'd be a good time for people to start looking over the code.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this has dogfooded well, I don't have objections to merging it. There aren't any BC-breaking changes to the ./Setup CLI interface, are there?

@@ -171,8 +172,9 @@ repl pkg_descr lbi flags suffixes args = do
let clbi = targetCLBI target
comp = targetComponent target
lbi' = lbiForComponent comp lbi
replFlags = toNubListR $ replReplOptions flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to be sane if, e.g., the repl options are -trust base -trust containers? Or is that somehow guaranteed not to happen and the only things in replReplOptions are safe to de-dupe?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I'd have to test that a bit. Most of the obvious REPL flags are probably not affected and I'm not sure "-trust" even makes sense here? (Since that seems like it should affect the build outcome?). Sadly there's no real "safe" way of deduping stuff (I'm not even sure the current use of NubListR is completely sensible in the presence of space separated flags?)

A simple solution might be to simply break the NubListR invariant by manually wrapping the list with the newtype, that avoids any doubt about whether things could break (but is rather hacky)...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can't say definitively if this is a problem either. It just looks suspicious to me in the context of #5369, which removed a load of nub operations on argument lists.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually, master no longer uses NubListR it seems. I only used that because I was forced by the types of the GHC config. So seems the problem is trivially solved by rebasing onto master and just eliminating NubListR entirely. I wasn't using it myself anyway.

@merijn
Copy link
Collaborator Author

merijn commented Jun 11, 2018

@ezyang About backwards compatiblity: I'm not 100% sure, because honestly I still don't get the internals well enough to say.

I added the --repl-options flag to "cabal repl" too (because all new-X commands end up getting re-parsed by the old ones) and extended the ReplFlags type. I don't think adding a flag should affect backwards compat, so the only issue might be the ReplFlags record type? But as long as people use the empty/default variable and initialise fields there, that shouldn't break compat either, right?

@merijn
Copy link
Collaborator Author

merijn commented Jun 11, 2018

Ok, so the only problematic bit (aside from the new flag for cabal repl? I'll wait for ezyang to comment on that) goes away if I rebase on master, so I'll keep testing this a bit locally for a few days and and then merge this after rebasing to master.

@merijn merijn closed this Jun 11, 2018
@merijn merijn force-pushed the filter-flag-local branch from 59bd528 to fb67b25 Compare June 11, 2018 12:27
@merijn merijn reopened this Jun 11, 2018
@merijn
Copy link
Collaborator Author

merijn commented Jun 11, 2018

Accidentally closed when fiddling with the branch...

@merijn merijn force-pushed the filter-flag-local branch from 4a16d11 to 079580d Compare June 11, 2018 12:33
@ezyang
Copy link
Contributor

ezyang commented Jun 12, 2018

Adding a new optional flag is good; signing off!

@merijn merijn force-pushed the filter-flag-local branch from f3b52f1 to 00347f8 Compare June 21, 2018 14:13
@@ -1,6 +1,10 @@
-*-change-log-*-

2.4.0.0 (current development version)
* Add '--repl-options' flag to 'cabal repl' and 'cabal new-repl'
commands. Passes it's arguments to the invoked repl, bypassing the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/it's/its/

@quasicomputational
Copy link
Contributor

Changelog notes look good. There's a new-repl section in Cabal/docs/nix-local-build.rst where the new flag should go, too. After that I think there's nothing to do but press the big green button?

[ci skip]
@merijn
Copy link
Collaborator Author

merijn commented Jun 21, 2018

Yeah, I've been testing it locally and a bunch of people told me they start using it and so far no one's complained, so if the latest CI finishes without any errors from rebasing I'm gonna merge it.

@merijn
Copy link
Collaborator Author

merijn commented Jun 21, 2018

Actually, maybe that's not useful to wait for. As far as I can tell a large amount of tests are just very broken in ways unrelated to my changes right now...

flags are now stripped from ``ghc-options``. As a result ``--ghc-options`` will
no longer (reliably) work to pass flags to ``ghci`` (or other repls). Instead,
you should use the new ``--repl-options`` flag to specify these options to the
invoked repl. (This flags also works on ``cabal repl`` and ``Setup repl`` on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/This flags/This flag/

no longer (reliably) work to pass flags to ``ghci`` (or other repls). Instead,
you should use the new ``--repl-options`` flag to specify these options to the
invoked repl. (This flags also works on ``cabal repl`` and ``Setup repl`` on
sufficiently new versions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing full stop. Should specify that it's versions of Cabal.

GHCi as interpreted bytecode. It takes the same flags as
``cabal new-build``.
GHCi as interpreted bytecode. In addition to the ``cabal new-build``'s flags,
it takes an additional ``--repl-options`` flags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/flag/flags/

it takes an additional ``--repl-options`` flags.

To avoid ``ghci`` specific flags from triggering unneeded global rebuilds these
flags are now stripped from ``ghc-options``. As a result ``--ghc-options`` will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got momentarily confused here between ghc-options the Cabal file field, and --ghc-options the command-line argument, which is what are being talked about here, right? Worth clarifying.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, it refers to both. --ghc-options the flag and ghc-options the field end up in the same data structure that gets filtered, so this applies in both cases.

@merijn
Copy link
Collaborator Author

merijn commented Jun 22, 2018

Still seeing some failures on Travis, but I don't understand at all why those are failing. I see all sorts of doctest warnings, warnings about unrecognised ghc 8.7 versions (in the tests for 7.4), so all-in-all I'm a bit lost how to get everything passing again.

@quasicomputational
Copy link
Contributor

#5389 is seeing similar weird failures, so it's probably not your code but something else somewhere that's broken. HEAD is currently green, so I guess that something is environmental. I do note that containers-0.6.0.1 was released a few days ago and is mentioned in some of the failures...

@merijn
Copy link
Collaborator Author

merijn commented Jun 22, 2018

Actually looks like wherever the tests fetch GHC HEAD from has updated. 3 out of 4 failures fail with "unsupported GHC in /opt/ghc/head/bin", looking at the reported version of GHC it was updated yesterday, so presumably all green tests ran before the remote source updated and every test since late yesterday is failing. I don't know what to do about this, though.

@quasicomputational
Copy link
Contributor

Oh, you're right. It looks like the proximal cause is that the error message has changed in GHC HEAD as well, in at least some of the tests. I think I can fix that.

@merijn
Copy link
Collaborator Author

merijn commented Jun 22, 2018

hvr was already looking at it too, after I asked on IRC.

@quasicomputational
Copy link
Contributor

I just put up #5392, which hopefully fixes the residual problems; hvr got to the 'unsupported GHC versions' error message ahead of me.

@merijn merijn force-pushed the filter-flag-local branch 2 times, most recently from 56eba60 to a9ab437 Compare June 22, 2018 17:40
@merijn
Copy link
Collaborator Author

merijn commented Jun 22, 2018

Seems to be one weird error left: https://travis-ci.org/haskell/cabal/jobs/395597958#L1131-L1133 I pushed another commit to print the actual failing value, but it's not printing anything, so it'd seems to be failing in the cabal that's trying to build cabal for testing. So I dunno if that's breakage in the cabal that ships with 8.2.2 or that I broke something...

@23Skidoo
Copy link
Member

Seems like we're using cabal-install 2.0 on Travis. Maybe updating to 2.2 will fix it.

@quasicomputational
Copy link
Contributor

On the basis that I can't see how the PR could've caused an error before any code was even being built and how it wasn't failing in this way earlier, I'm tempted to blame intermittent Travis weridness for this latest error, and I've restarted the build to see if it comes back good.

@quasicomputational
Copy link
Contributor

Nope, still happening. Quite baffling: this PR doesn't touch the package description at all, and other recent builds have succeeded so it's not some kind of global weirdness with the Hackage state. I can only assume it's something like some ordering-dependent bad case and this PR is unlucky enough that its component hashes are tickling it.

@merijn merijn force-pushed the filter-flag-local branch from 2364701 to a9ab437 Compare June 25, 2018 08:52
@merijn
Copy link
Collaborator Author

merijn commented Jun 25, 2018

Yeah, I'd restarted it a bunch of times already so far. Maybe rebasing it on master will churn it enough to get fixed again.

@merijn merijn force-pushed the filter-flag-local branch from a9ab437 to cbcfc43 Compare June 25, 2018 09:28
@merijn
Copy link
Collaborator Author

merijn commented Jun 26, 2018

Ha! After @hvr cleared the build cache this is finally in the realm of happy green checkmarks.

@merijn merijn merged commit 2ce4858 into haskell:master Jun 26, 2018
@merijn merijn deleted the filter-flag-local branch June 26, 2018 11:08
@merijn merijn restored the filter-flag-local branch September 25, 2018 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants