
Thursday, 30 May 2013

How to debug silent crashes in .Net

Here’s a quick note to my future self, and any other interested parties in the present, on how to diagnose a particularly tricky kind of unhandled exception. I’m talking about those ninja exceptions which manage to evade any kind of unhandled exception logging you’ve rigged up around your .Net applications.

I was debugging just such a problem today. A customer would double click the application, and nothing would happen. Our application is configured to log any unhandled exceptions to a log file. But no log file was being created. There was, however, an entry written to the application event log: it told me that a TypeInitializationException had been thrown, and even gave me a stack trace. But it told me nothing about why the type had failed to initialize.

To complicate matters, this was one of those heisenbugs where, though I could replicate it under normal circumstances, it would go away if I attached a debugger to the process. What I needed was a crash dump – a snapshot of the entire state of the application at the moment the exception happened.

DebugDiag was my first port of call. That’s a nifty tool from Microsoft which can be configured to create dumps from your application under particular circumstances, including when it crashes. Handily, it will also analyse the dump file, and help you work out why the application crashed. Inexplicably, DebugDiag failed to capture any dumps for my application.

So I turned to Windows Error Reporting. From Windows Vista SP1 onwards you can tweak some flags in the registry, and have Windows automatically capture dumps for you – assuming you’re using .Net 4.0.

Capturing dump files with Windows Error Reporting

All you need to do is create a key at HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps\[Your Application Exe Name]. In that key, create a string value called DumpFolder, and set it to the folder where you want dumps to be written. Then create a DWORD value called DumpType with a value of 2.
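The same settings can be captured in a .reg file for easy reuse. In this sketch, MyApp.exe and C:\CrashDumps are placeholders for your own executable name and dump folder:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe]
"DumpFolder"="C:\\CrashDumps"
"DumpType"=dword:00000002
```

A DumpType of 2 requests a full dump; 1 gives you a mini dump.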


Now get those ninjas to crash your app.

You should see a *.dmp file appearing in the folder you chose. Double click it – and you’ll see some Visual Studio magic, introduced in VS 2010. You can debug a dump file almost as if it were a live application. When you see the Minidump File Summary screen, you just need to click Debug with Mixed.


Unmasking the ninja

So what was this ninja exception that was evading my logging routines? It was an exception being thrown by my exception handling code!

The original exception was a ConfigurationErrorsException, thrown by the configuration API when loading my application’s user settings from a corrupt user.config file. My exception handling code looked like this:

private void ReportUnhandledException(Exception exception)
{
    try
    {
        Tracing.TraceUnhandledError(exception);

        ReportExceptionToUser(exception);
    }
    catch (Exception fatalException)
    {
        Tracing.TraceUnhandledError(fatalException);
    }
}

The problem was caused by a static constructor in the Tracing class. We’re using the System.Diagnostics TraceSource API to log our exceptions, and the static constructor was setting all of that up. The TraceSource API, it turns out, also tries to load the application configuration, so it too was throwing an exception.
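That chain of events is easy to reproduce: when a type’s static constructor throws, the runtime wraps the original error in a TypeInitializationException – exactly what appeared in the event log. Here’s a minimal stand-in (the Tracing body below is illustrative, not the real class):

```csharp
using System;

// Illustrative stand-in for the post's Tracing class: its static
// constructor fails, as the real one did when TraceSource tried to
// read a corrupt configuration file.
static class Tracing
{
    static Tracing()
    {
        throw new InvalidOperationException("config load failed");
    }

    public static void TraceUnhandledError(Exception e) { }
}

class Program
{
    // Returns the message of the root cause buried inside the
    // TypeInitializationException.
    public static string DescribeFailure()
    {
        try
        {
            Tracing.TraceUnhandledError(new Exception("original error"));
            return "no exception";
        }
        catch (TypeInitializationException tie)
        {
            return tie.InnerException.Message;
        }
    }

    static void Main()
    {
        Console.WriteLine(DescribeFailure());
    }
}
```

This is why the event log showed only a TypeInitializationException with no hint of the real cause: the interesting detail lives in InnerException, which the dump file exposed.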

The solution? Use Environment.FailFast instead of trying to log the secondary exception. That simply writes a message to the event log then bails out, with no chance of causing mayhem by raising further exceptions.

private void ReportUnhandledException(Exception exception)
{
    try
    {
        Tracing.TraceUnhandledError(exception);

        ReportExceptionToUser(exception);
    }
    catch (Exception fatalException)
    {
        Environment.FailFast("An error occurred whilst reporting an error.", exception);
    }
}

Wednesday, 14 March 2012

Solved: Http Error returned for some route urls but not others in Asp.Net MVC or Web API

Here GoogleBing: remember this one for me so I don’t have to waste another two hours re-solving it.

I’m working on a project using the new Asp.Net Web API and hit a really strange problem yesterday. Every time I called the /search endpoint, I’d get back a Http Error 404.20 – Not Found. All the other controllers were working fine. There were no exceptions being thrown and, even stranger, if I set a break point in the SearchController it was never hit.

I spent two hours trying every which-way to get to the bottom of it.

The light-bulb lit up when my eye latched on to this:

(No, that’s not Visual Studio 11. I just faded all the rest to grey so you could see what I saw.)

There in the solution was a folder with the same name as the Web API Controller. A quick test confirmed that any route with the same url as a folder in my solution would give the same error.

The Solution

There’s the obvious solution: rename either your folder or your controller, so that there’s no conflict.

But if you’re rather attached to your Controller name, and you’re loath to change your folder structure, there is another solution. When you register your Routes, set

RouteTable.Routes.RouteExistingFiles = true;

This makes Asp.Net Routing try the routes in your RouteTable first, even if there are files and directories that match them.

This by itself was enough for us, because we don’t have any files on disk that we want to serve up directly. But if you do, add

routes.IgnoreRoute("folder/{*path}");

for each folder that you want to serve up, and everything should be hunky-dory.
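Putting both pieces together, the route registration might look something like this sketch (the "Content" folder name is a made-up example; substitute the folders you actually serve):

```csharp
public static void RegisterRoutes(RouteCollection routes)
{
    // Try registered routes first, even when a file or folder on disk
    // matches the url.
    routes.RouteExistingFiles = true;

    // Still serve static files from folders we do want exposed directly.
    routes.IgnoreRoute("Content/{*path}");

    routes.MapHttpRoute(
        name: "DefaultApi",
        routeTemplate: "{controller}/{id}",
        defaults: new { id = RouteParameter.Optional });
}
```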

Friday, 10 June 2011

XmlSerializerFormat is NOT supported by WCF Rest 4.0 in Medium Trust

Here’s one for you googlers (or bingites): Using the XmlSerializerFormat attribute on methods that you expose over WCF Rest (i.e. services using the webHttpBinding) is not supported if your service is running in Medium Trust (as it most likely is if you’re using Shared Hosting, like Rackspace Cloud Sites).

This is contrary to what you might expect from reading documents like this, which say that XmlSerializer is supported in partial trust. I guess they forgot to test the webHttpBinding scenario. I should be clear: webHttpBinding itself works fine in Medium Trust: it’s using XmlSerializerFormat with webHttpBinding that will give you problems.

That’s the short story.

Here’s the amplified version.

How I learned: the hard way

I’m building a web service that my friends running Mac OS X want to be able to use. Out in the Apple orchards they’re not so keen on SOAP as those of us who sit behind Windows. So I turned to WCF Web Http, now part of .Net 4.0, which lets you take control of your urls and keep your messages to the bare necessities.

With WCF REST you have the choice of XML or JSON. I chose XML, and by default WCF will use the DataContractSerializer. That produces reasonably clean XML, but it is picky about ordering of elements within messages, it annotates elements with type attributes that are often unnecessary, and it has a rather clunky syntax for serializing arrays.
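Those quirks are easy to demonstrate with a tiny standalone contract (Person, Name and Age here are made-up names, not from the real service):

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// A minimal data contract, just to show the XML shape that the
// DataContractSerializer emits by default.
[DataContract]
class Person
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int Age { get; set; }
}

class Program
{
    public static string Serialize()
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var buffer = new MemoryStream())
        {
            serializer.WriteObject(buffer, new Person { Name = "Ada", Age = 36 });
            return Encoding.UTF8.GetString(buffer.ToArray());
        }
    }

    static void Main()
    {
        Console.WriteLine(Serialize());
    }
}
```

Notice that Age is written before Name: with no explicit Order on the DataMember attributes, the serializer orders members alphabetically, and it expects that same order when reading messages back in.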

Then I read about the XmlSerializerFormat attribute. If you apply that to your Operation Contracts, it switches to using the XmlSerializer, and then, by decorating your data contracts with the appropriate attributes, you can take complete control of the xml which gets written to the wire.
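For illustration, the switch looks something like this sketch (IOrdersService and Order are hypothetical names; the System.Xml.Serialization attributes show the kind of control you gain):

```csharp
using System.ServiceModel;
using System.ServiceModel.Web;
using System.Xml.Serialization;

[ServiceContract]
public interface IOrdersService
{
    [OperationContract]
    [XmlSerializerFormat]   // use XmlSerializer instead of DataContractSerializer
    [WebGet(UriTemplate = "/orders/{id}")]
    Order GetOrder(string id);
}

[XmlRoot("order")]
public class Order
{
    [XmlAttribute("id")] public string Id { get; set; }
    [XmlElement("total")] public decimal Total { get; set; }
}
```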

All well and good – while the service was running on my machine. But when I pushed the bits to our web host, who run all websites under Medium Trust, it was a different story. All the calls to the service threw a ProtocolException with the helpful message: “(400) Bad Request”.

Turning on includeExceptionDetailInFaults, then spying on the interaction using Fiddler shed more light on the situation. I was getting back a stack trace with this at the top:

   at System.ServiceModel.Dispatcher.UnwrappedTypesXmlSerializerManager.BuildSerializers()
   at System.ServiceModel.Dispatcher.UnwrappedTypesXmlSerializerManager.GetOperationSerializers(Object key)
   at System.ServiceModel.Dispatcher.SingleBodyParameterXmlSerializerMessageFormatter.EnsureSerializers()
   at System.ServiceModel.Dispatcher.SingleBodyParameterXmlSerializerMessageFormatter.GetInputSerializers()
   at System.ServiceModel.Dispatcher.SingleBodyParameterMessageFormatter.ReadObject(Message message)
   at System.ServiceModel.Dispatcher.SingleBodyParameterMessageFormatter.DeserializeRequest(Message message, Object[] parameters)

A bit of hunting around using Reflector revealed that the BuildSerializers method always calls XmlSerializer.FromMappings, and if you check that out on MSDN, you’ll see that it requires Full Trust.

So there’s no getting away from it. You can’t use XmlSerializer together with WebHttpBinding under Medium Trust. I did try the usual work-around for getting XmlSerializer to work in Medium Trust – pre-generating the serialization assemblies at build time – but that was before Reflector showed me the clinching evidence that my efforts were futile.

So what are your options? Either stick with DataContractSerializer, or switch to JSON¹. Or, if time is on your side, wait for the next version of WCF, which has a new extensible Web API specially designed to let you take complete control of your message formats and such like.


  1. Decorate your methods like this to switch to JSON format:
[ServiceContract()]
public interface IService
{
    [OperationContract]
    [WebInvoke(UriTemplate = "/MyOperation", 
               Method = "POST",
               RequestFormat = WebMessageFormat.Json,
               ResponseFormat = WebMessageFormat.Json)]
    Response MyOperation(Dto dto);
}

Wednesday, 14 July 2010

C# 4 broke my code! The Pitfalls of COM Interop and Extension methods

A couple of weeks back I got the go-ahead from my boss to upgrade to Visual Studio 2010. What with Microsoft’s legendary commitment to backwards compatibility, and the extended beta period for the 2010 release, I wasn’t expecting any problems. The Upgrade Wizard worked its magic without hiccup, the code compiled, everything seemed fine.

Then I ran our suite of unit tests. 6145 passed. 15 failed.

When I perused the logs of the failing tests they all had this exception in common:

COMException: Type mismatch. (Exception from HRESULT: 0x80020005 (DISP_E_TYPEMISMATCH))

How could that happen? Nothing should have changed: though I was compiling in VS 2010, I was still targeting .Net 3.5. The runtime behaviour should be identical. Yet it looked like COM Interop itself was broken. Surely not?

A footprint in the flowerbed

I dug deeper. In every case, I was calling an extension method on a COM object – an extension method provided by Microsoft’s VSTO Power Tools, no less.

These Power Tools are painkillers that Microsoft made available for courageous souls programming against the Office Object Model using C#. Up until C# 4, calling a method with optional parameters required copious use of Type.Missing – one use for every optional parameter you didn’t wish to specify. Here’s an example (you might want to shield your eyes):

var workbook = application.Workbooks.Add(Missing.Value);
workbook.SaveAs(
    "Test.xls", 
    Missing.Value, 
    Missing.Value, 
    Missing.Value, 
    Missing.Value, 
    Missing.Value, 
    XlSaveAsAccessMode.xlNoChange, 
    Missing.Value, 
    Missing.Value, 
    Missing.Value, 
    Missing.Value, 
    Missing.Value);

The Power Tools library provides extension methods that hide the optional parameters:

using Microsoft.Office.Interop.Excel.Extensions;

var workbook = application.Workbooks.Add();
workbook.SaveAs(
    new WorkbookSaveAsArgs { FileName = "Test.xls"} );

Can you see what the problem is yet? I couldn’t. So I thought I’d try a work-around.

A big part of C# 4 is playing catch-up with VB.Net by improving support for COM Interop. The headline feature is that fancy new dynamic keyword that makes it possible to do late-bound COM calls, but the one that’s relevant here is the support for optional and named parameters. The nice thing is that, whereas using the dynamic keyword requires you to target .Net 4.0, optional and named parameters are a compiler feature, so you can make use of them even if you are targeting .Net 3.5.

That meant that I didn’t need Power Tools any more. In C# 4 I can just rewrite my code like this:

var workbook = application.Workbooks.Add();
workbook.SaveAs( Filename: "Test.xls" );

And that worked. COM Interop clearly wasn’t broken. But why was the code involving extension methods failing?

Looking again over the stack traces of the exceptions in my failed tests I noticed something very interesting. The extension methods didn’t appear. The call was going directly from my method into the COM Interop layer. I was running a Debug build, so I knew that the extension method wasn’t being optimised out.

The big reveal

Then light dawned.

Take a look at the signature of the SaveAs method as revealed by Reflector:

void SaveAs(
   [In, Optional, MarshalAs(UnmanagedType.Struct)] object Filename,
    /* 11 other parameters snipped */)

Notice in particular how Filename is typed as object rather than string. This always happens to optional parameters in COM Interop to allow the Type.Missing value to be passed through if you opt out of supplying a real value.

Now when C# 3.0 was compiling my code and deciding what I meant by

workbook.SaveAs(new WorkbookSaveAsArgs { ... } );

it had to choose between a call to the SaveAs instance method with 12 parameters or a call to the SaveAs extension method with a single parameter. The extension method wins hands down.

But the landscape changes in C# 4.0. Now the compiler knows about optional parameters. It’s as if a whole bunch of overloads have been added to the SaveAs method on Workbook with 0 through to 12 parameters. Which method does the compiler pick this time? Remember that it will always go for an instance method over an extension method. Since new WorkbookSaveAsArgs { … } is a perfectly good object the compiler chooses the SaveAs instance method, completely ignoring the extension method.

All seems well until we hit Run and Excel is handed the WorkbookSaveAsArgs instance by COM Interop. Not knowing how to handle it, it spits it back out in the form of a Type Mismatch exception.
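The resolution change is easy to reproduce in miniature, with no COM involved (all the names below are made up for the sketch):

```csharp
using System;

class SaveAsArgs { }

// Miniature analogue of the Workbook COM interface: the optional
// parameter is typed as object, just as COM Interop types it.
class Workbook
{
    public string SaveAs(object Filename = null) { return "instance"; }
}

static class WorkbookExtensions
{
    public static string SaveAs(this Workbook workbook, SaveAsArgs args) { return "extension"; }
}

class Program
{
    public static string Resolve()
    {
        var workbook = new Workbook();
        // A SaveAsArgs is a perfectly good object, so the instance
        // method is applicable, and instance methods always beat
        // extension methods in overload resolution.
        return workbook.SaveAs(new SaveAsArgs());
    }

    static void Main()
    {
        Console.WriteLine(Resolve());
    }
}
```

Under C# 3.0 rules an instance method requiring all twelve arguments wasn’t applicable to a one-argument call, so the extension method was the only candidate; with C# 4.0’s optional parameters the instance method becomes applicable and silently wins.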

Mystery solved.

The moral

So watch out if your code uses extension methods on COM objects and you’re planning on upgrading to VS 2010. Don’t assume that your code works the same as before just because it compiled OK. Unit tests rule!

Update: Microsoft have confirmed that this is indeed a breaking change that they didn’t spot before shipping.

Monday, 12 April 2010

Back from a bug hunt [or, Don’t Call Excel on a Background Thread]

I’m just back from a two-day bug hunt: an exhausting experience, but with a successful outcome that I’d like to share in case anybody else comes across this critter.

As he was testing our Excel Add-in, our tester, Rich, noticed some rather odd behaviour. Whenever he used the Scan Tables feature, then subsequently closed Excel, Excel.exe would refuse to vacate its slot in Task Manager until forcibly evicted.

The feature in question had my grubby fingerprints all over it, so I signed up to fix the bug. But where to start?

There are abundant examples of this kind of behaviour when automating Excel from the outside – usually caused by managed code not releasing its references to the COM objects. But my Add-in is inside Excel: any references it holds will be trashed when Excel is shut down.

What then? I clearly remembered testing for just this kind of problem a couple of weeks back, and I was sure the application closed cleanly then. This gave me my starting point for a spot of Binary-Search debugging.

The Find

I began by grabbing a version of the code as it was at the time I thought the code was bug-free. Running it confirmed that was indeed the case. Then I pulled down a version that lay about half-way between then and now. That one exhibited all the symptoms of infestation. Hah! So the bug must have slipped in some time between the first version I checked and the half-way version. And thus I proceeded, wading through version after version, at each step slashing in half the list of changes in which the bug must have concealed itself.

After a couple of hours searching I narrowed it down to two versions: version 3320 had the bug, version 3319 was bug free. The check-in comment on version 3320 was rather revealing:

“Moved long-running table-scan operation to a background thread, and introduced progress reporting”

Those two words “background thread” were a smoking gun if ever I saw one, but who pulled the trigger? (I did mention that this bug was wanted for murder, right?). Now to adjust my binary-search-magnifying glass to zoom in, not on versions, but on lines of code.

I started by hacking the method where I launched the BackgroundWorker that did the table-scan so that, once again, it did its stuff in the UI thread. Sure enough, the bug went away, Excel closing cleanly on request.

Then, having restored the BackgroundWorker, I chopped its DoWork method in two, stubbing out one half at a time so that I could see which bunch of code was causing the problem. DoWork called out to a couple of other components, and I subjected their methods to the same treatment. As method-call after method-call confirmed its innocence I closed in on the culprits: calls to the Excel Object model – specifically, calls to the Excel Object model made from a background thread.

Excel gave no indication that it had a problem being called in this way. When called on the background thread, it did exactly as when called on the foreground thread, with nary an exception. Seems it preferred to sulk, and protest in a more underhand way.

The Fix

Having found the problem, how was I to fix it?

I didn’t want to move all the work back on to the foreground thread: but clearly Excel wouldn’t listen if I talked to it from any other. Enter SynchronizationContext. SynchronizationContext is a handy little class that wraps up the complexities of executing code in another thread’s context. In the olden days of Windows Forms you’d use Control.BeginInvoke or Control.Invoke to do this kind of thing; these days WPF’s Dispatcher provides the same features. SynchronizationContext abstracts either Control or Dispatcher depending on what kind of message loop you’re running, and gives you two methods, Send and Post: Send executes code synchronously, Post asynchronously.

So all I needed to do was add a SynchronizationContext parameter to my two services which do all the talking to Excel (my IoC container kindly absorbed the consequences of this change – no ripple-down effect here).

Most places where I was calling Excel, I was querying some value. Like this, for example:

var result = (string)_application.Run("MyMacro");

After my change it looked like this:

var result = Query(() => (string)_application.Run("MyMacro"));

...

private TResult Query<TResult>(Func<TResult> query)
{
    return _synchronizationContext.SendQuery(query);
}

SendQuery is an extension method I created for SynchronizationContext that looks like this:

public static class SynchronizationContextExtensions
{
    public static TResult SendQuery<TResult>(this SynchronizationContext context, Func<TResult> query)
    {
        var result = default(TResult);
        context.Send(_ => result = query(), null);
        return result;
    }
}

And that nailed it good and proper.

The Finale

So, Bing, if anybody asks you why their Excel Process doesn’t close after running an Add-in (managed or otherwise), tell them to make sure they are not calling into its object model from a background thread.