Update binlog mcp name #684

Merged
JanKrivanek merged 1 commit into main from dev/jankrivanek/update-mcp-name on May 22, 2026

@JanKrivanek
Member

No description provided.

@JanKrivanek JanKrivanek requested a review from ViktorHofer as a code owner May 22, 2026 13:39
Copilot AI review requested due to automatic review settings May 22, 2026 13:39
@JanKrivanek JanKrivanek requested a review from a team as a code owner May 22, 2026 13:39
@JanKrivanek
Member Author

/evaluate

Contributor

Copilot AI left a comment

Pull request overview

Updates the dotnet-msbuild plugin’s binlog MCP package name reference to align the MCP server installation command and the corresponding documentation/agent guidance.

Changes:

  • Renamed the binlog MCP package from AITools.BinlogMcp to Microsoft.AITools.BinlogMcp in the plugin MCP server command.
  • Updated all affected skill and agent documentation to reference the new MCP package name.
  • Updated the allowed external dependency comment to reflect the new package name.
Show a summary per file

| File | Description |
| --- | --- |
| plugins/dotnet-msbuild/skills/incremental-build/SKILL.md | Updates referenced binlog MCP package name in incremental-build guidance. |
| plugins/dotnet-msbuild/skills/eval-performance/SKILL.md | Updates referenced binlog MCP package name in evaluation-performance guidance. |
| plugins/dotnet-msbuild/skills/build-perf-diagnostics/SKILL.md | Updates referenced binlog MCP package name in perf diagnostics workflow. |
| plugins/dotnet-msbuild/skills/build-parallelism/SKILL.md | Updates referenced binlog MCP package name in parallelism analysis workflow. |
| plugins/dotnet-msbuild/skills/binlog-failure-analysis/SKILL.md | Updates referenced binlog MCP package name in failure analysis overview. |
| plugins/dotnet-msbuild/plugin.json | Updates MCP server dotnet dnx package argument to Microsoft.AITools.BinlogMcp. |
| plugins/dotnet-msbuild/agents/build-perf.agent.md | Updates referenced binlog MCP package name in agent instructions. |
| eng/allowed-external-deps.txt | Updates the dotnet-msbuild binlog MCP allowlist comment to match the new package name. |
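
A rename that touches eight files is easy to leave half-done, so a small check for leftover references can help. The following is a minimal sketch, not part of this PR: it assumes the only difference between the two package IDs is the `Microsoft.` prefix stated in the summary above, and the helper names `find_stale_references` and `rename_package` are hypothetical.

```python
import re

# Old and new package IDs, taken from the change summary above.
OLD_NAME = "AITools.BinlogMcp"
NEW_NAME = "Microsoft.AITools.BinlogMcp"

# Match the old ID only when it is not already part of the new, prefixed ID.
# The negative lookbehind rejects matches preceded by "Microsoft.".
STALE_REFERENCE = re.compile(r"(?<!Microsoft\.)\bAITools\.BinlogMcp\b")


def find_stale_references(text: str) -> list[int]:
    """Return 1-based line numbers that still mention the old package ID."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE_REFERENCE.search(line)
    ]


def rename_package(text: str) -> str:
    """Rewrite stale references to the new ID, leaving updated ones untouched."""
    return STALE_REFERENCE.sub(NEW_NAME, text)
```

Running `find_stale_references` over the contents of each file in the table should return an empty list once the rename is complete. Because of the lookbehind, `rename_package` is safe to apply twice: it never produces `Microsoft.Microsoft.AITools.BinlogMcp`.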

Copilot's findings

  • Files reviewed: 8/8 changed files
  • Comments generated: 0

@JanKrivanek JanKrivanek enabled auto-merge (squash) May 22, 2026 13:42
@JanKrivanek JanKrivanek merged commit 0724ccf into main May 22, 2026
40 of 42 checks passed
@JanKrivanek JanKrivanek deleted the dev/jankrivanek/update-mcp-name branch May 22, 2026 13:48
github-actions Bot added a commit that referenced this pull request May 22, 2026
github-actions Bot added a commit that referenced this pull request May 22, 2026
@github-actions
Contributor

Skill Validation Results

| Skill | Scenario | Quality | Skills Loaded | Overfit | Verdict |
| --- | --- | --- | --- | --- | --- |
| build-parallelism | Analyze build parallelism bottlenecks | 4.0/5 → 4.0/5 | ✅ build-parallelism; tools: skill, glob, binlog-binlog_overview, binlog-binlog_expensive_projects, binlog-binlog_expensive_targets, binlog-binlog_projects, binlog-binlog_diagnose, binlog-binlog_project_target_times / ⚠️ NOT ACTIVATED | ✅ 0.13 | [1] |
| incremental-build | Analyze incremental build issues | 3.0/5 → 3.7/5 🟢 | ✅ incremental-build; tools: skill, bash / ⚠️ NOT ACTIVATED | ✅ 0.14 | [2] |
| build-perf-diagnostics | Diagnose slow build for a small project | 4.0/5 → 4.3/5 🟢 | ⚠️ NOT ACTIVATED | 🟡 0.30 | [3] |
| eval-performance | Analyze MSBuild evaluation performance issues | 5.0/5 → 5.0/5 | ✅ eval-performance; tools: skill | ✅ 0.13 | [4] |
| binlog-failure-analysis | Diagnose build failures from binlog only (no source files) | 4.0/5 → 4.3/5 🟢 | ✅ binlog-failure-analysis; tools: binlog-binlog_overview, binlog-binlog_errors, binlog-binlog_items, binlog-binlog_search_files, binlog-binlog_files, skill / ⚠️ NOT ACTIVATED | ✅ 0.11 | |

[1] ⚠️ High run-to-run variance (CV=218%) — consider re-running with --runs 5
[2] ⚠️ High run-to-run variance (CV=371%) — consider re-running with --runs 5. (Isolated) Quality improved but weighted score is -23.2% due to: judgment, tokens (26591 → 72933), quality, time (18.6s → 31.3s), tool calls (4 → 6)
[3] ⚠️ High run-to-run variance (CV=529%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -8.8% due to: tokens (27481 → 211868), tool calls (5 → 14), time (25.2s → 70.9s)
[4] ⚠️ High run-to-run variance (CV=63%) — consider re-running with --runs 5. (Plugin) Quality unchanged but weighted score is -7.6% due to: tokens (41711 → 84999), quality
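
For readers unfamiliar with the CV figures in these footnotes, here is a minimal sketch of the textbook coefficient of variation (sample standard deviation divided by the absolute mean), alongside the relative change behind pairs such as "tokens (26591 → 72933)". The report does not say which per-run quantity the validator computes CV over or which estimator it uses, so the formula and both function names are assumptions for illustration only.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples: list[float]) -> float:
    """Sample standard deviation divided by the absolute mean, as a percentage.

    Assumption: this is the standard definition; the validator may differ.
    """
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate variance")
    average = mean(samples)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(samples) / abs(average) * 100.0


def relative_change(before: float, after: float) -> float:
    """Percentage change from the baseline value to the new value."""
    return (after - before) / before * 100.0
```

Under this definition a CV above 100% means the spread across runs exceeds the mean itself, which typically happens when the measured per-run quantity averages close to zero. That is consistent with the footnotes' suggestion to re-run with `--runs 5` before drawing conclusions.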

timeout — run(s) hit the (160s) scenario timeout limit; scoring may be impacted by aborting model execution before it could produce its full output (increase via timeout in eval.yaml)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

3 participants