-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Issue 662 explore globset #883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 662 explore globset #883
Conversation
Signed-off-by: Simon (Desktop) <[email protected]>
Moving forward Vec<FilePattern> or similar should be compiled into a GlobSet, still need builder function `from_user` to create patterns for both filenames and fully specified directories Signed-off-by: Simon (Desktop) <[email protected]>
everything is now considered complex, so `exclude_simple` would never trigger, ditto ditto for extend_exclude Signed-off-by: Simon (Desktop) <[email protected]>
Signed-off-by: Simon (Desktop) <[email protected]>
Signed-off-by: Simon (Desktop) <[email protected]>
Signed-off-by: Simon (Desktop) <[email protected]>
charliermarsh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! It seems like the code both got simpler and there are meaningful performance gains here (I'm seeing a 10ms difference on the benchmark on my machine, which is a > 3% speed-up).
src/settings/user.rs
Outdated
| .map(|per_file_ignore| { | ||
| ( | ||
| Exclusion::from_file_pattern(per_file_ignore.pattern), | ||
| Exclusion::from_file_pattern(&per_file_ignore.pattern), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove Exclusion and just inline the string, now that basename and absolute are identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc32661 re-split basename and absolute, we should continue to use Exclusions
| .map(Exclusion::from_file_pattern) | ||
| .collect(), | ||
| exclude: configuration.exclude, | ||
| extend_exclude: configuration.extend_exclude, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points all around, I'll go over everything you have shared and take your advice on improvements, this especially should absolutely be cleaned up to something readable ( all those numbers are character codes so it shouldn't be too hard to get strings back out).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Thank you for your contribution :)
If at any point you feel like you won't have time to get to the comments, just let me know and we can figure out a plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e4d537d and 9e59346 - Had to keep an uncompiled version of the globs to get the strings needed but settings now look much cleaner
src/fs.rs
Outdated
| for x in ex { | ||
| build.add(x); | ||
| } | ||
| build.build().expect("bad") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we add more expansive expectations here? Like .expect("Failed to build GlobSet")?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the changes in f4660bd, this .expect is no longer needed. (or really it's now a ? to waterfall errors up)
src/settings/configuration.rs
Outdated
|
|
||
| for x in ex { | ||
| exclude_extend.add(x?); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a little tighter as:
let extend_exclude = {
let mut set = globset::GlobSetBuilder::new();
for path in options.extend_exclude.unwrap_or_default() {
set.add(from_user(&path, project_root)?);
}
set.build()?
};There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f4660bd - great suggestion
src/settings/configuration.rs
Outdated
| FilePattern::Simple("venv"), | ||
| ] | ||
| }); | ||
| static DEFAULT_EXCLUDE: Lazy<Vec<std::result::Result<globset::Glob, globset::Error>>> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should just be a GlobSet, rather than a vector of Glob?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a great idea, except that we need to use the globs to make settings look nice, but I did clean this up to not include Result anymore just a Lazy<Vec<Glob>>>
src/main.rs
Outdated
| for x in cli | ||
| .exclude | ||
| .iter() | ||
| .map(|path| from_user(path, project_root.as_ref())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we're going to for-loop here, maybe just make this part of the body, rather than doing a comprehension with the loop too. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, the map was just an artifact of old code, fixed in 708d408 and further improved by cc32661
src/settings/types.rs
Outdated
| let absolute = Pattern::new(&absolute_path.to_string_lossy())?; | ||
| let basename = if pattern.contains(std::path::MAIN_SEPARATOR) { | ||
| None | ||
| pub fn from_user( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name made more sense when it was part of FilePattern, since FilePattern::from_user had clear semantics. Maybe we rename to create_glob or glob_from_user_pattern?
src/settings/types.rs
Outdated
| } | ||
| std::borrow::Cow::Owned( | ||
| "{".to_owned() + &absolute_path.to_string_lossy() + "," + pattern + "}", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this merits a comment. Is this creating a pattern that's effectively a union of the absolute and base patterns?
Would it be clearer to instead return two Globs, one for each of those two patterns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 Globs made much more sense to me, so I basically reimplemented FilePattern::Complex in cc32661 🤣
|
Thank you for this PR! Let me know if you're up for tackling the comments. |
Signed-off-by: Simon (Desktop) <[email protected]>
Signed-off-by: Simon (Desktop) <[email protected]>
add flexibility to Exclusion to break up Multi-part globs change created for settings display purposes Signed-off-by: Simon (Desktop) <[email protected]>
these can be used to turn items into strings in UserConfigurations Signed-off-by: Simon (Desktop) <[email protected]>
Basically reimplement Complex, but keep GlobSet for faster checks Signed-off-by: Simon (Desktop) <[email protected]>
|
I think that's all your reviews addressed. Benchmarks for me are still showing ~10 ms improvement even after changes. Do LMK if there are any other changes you'd like to see. I won't be on here over Thanksgiving weekend, but we can continue going back and forth on changes after that if any are needed. |
|
Great, thanks! I'll try to get this merged in the next few days. |
|
@CelebrateVC - Thank you for this! I ended up making some tweaks to restore |

Thank you for submitting this pull request! We appreciate you spending the time to work on these changes.
What is the motivation?
In #662 Charlie was asking if globset could speed up runtimes of file matching. After discussing a test strategy for the use case was decided.
What does this change do?
swaps out
globforglobsetentirely, Strips code ofFilePatternas those are now compiledGlobs which instead of being stored inVec<FilePattern>s are grouped together in aGlobSet.is_excludedis modified to reflect this change and now takes in GlobSets.What is your testing strategy?
Compiled the base image using
globand compiled altered code, per Contributing guide, benchmarked using hyperfine. Based on the discussion in #662 kneecapped the linter.rs#lint_path function to immediately return default Diagnostic, in order to exclusively test path discovery. below are current code state benchmarks.Is this related to any issues?
#662 - as above mentioned
Have you read the Contributing Guidelines?