Case of the .NET Service Hang – Patching WebClient Operation Timed Out

Continuing my series on .NET patching. The first two cases:

https://chentiangemalc.wordpress.com/2014/03/20/case-of-the-unknown-error-app-crash-debugging-patching-someone-elses-net-race-condition/

https://chentiangemalc.wordpress.com/2014/05/09/injecting-debug-tracing-in-a-net-exe/

The article will assume knowledge of .NET Reflector and Reflexil, both covered in previous articles, along with knowledge of creating Visual Studio class library project.

In this case a Windows Service, based on .NET Framework, would only work “once” then after that would stop providing the regular uploading of data the service was supposed to do.

Using Task Manager run as Administrator, I right clicked process and selected “Create Dump File” I then opened up the dump file in WinDbg from Windows SDK

Using SOS debugging extension (http://msdn.microsoft.com/en-us/library/bb190764(v=vs.110).aspx)

I dumped the CLR stack:

0:000> !EEstack

ChildEBP RetAddr  Caller, Callee
000000000040f2dc 63db6110 (MethodDesc 63d93f88 +0x40 DomainBoundILStubClass.IL_STUB_PInvoke(IntPtr))
000000000040f2f8 63db6110 (MethodDesc 63d93f88 +0x40 DomainBoundILStubClass.IL_STUB_PInvoke(IntPtr))
000000000040f304 63db797a (MethodDesc 63d94620 +0x1aa System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase[]))
000000000040f324 63db797a (MethodDesc 63d94620 +0x1aa System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase[]))
000000000040f350 00210434 (MethodDesc 0019385c +0x3c4 SophosWebDirectorService.Program.Main(System.String[]))
000000000040f5d4 6f2f9a55 (MethodDesc 6f176b54 +0x81 System.Security.Policy.Evidence+EvidenceLockHolder.Dispose())

 

Seems like nothing special, just what you would expect for a Service Starting. So I then dumped out all exception objects on the heap:

0:000> .foreach (ex {!dumpheap -type Exception -short}){.echo “********************************”;!pe ${ex} }

This resulted in a pile of timed out exceptions (one shown here):

Exception object: 01088f2c
Exception type:   System.IO.IOException
Message:          Unable to read data from the transport connection: A blocking operation was interrupted by a call to WSACancelBlockingCall.
InnerException:   System.Net.Sockets.SocketException, Use !PrintException 247bcc0d0108897c to see more.
StackTrace (generated):
    SP       IP       Function
    04F9F300 6E5F44C9 System_ni!System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32)+0x12d
    04F9F3D0 6E5F4387 System_ni!System.Net.PooledStream.Read(Byte[], Int32, Int32)+0x1b
    04F9F3EC 6E5F40AE System_ni!System.Net.Connection.SyncRead(System.Net.HttpWebRequest, Boolean, Boolean)+0x10a
 
StackTraceString: <none>
HResult: 80131620
********************************
Exception object: 010899bc
Exception type:   System.Net.WebException
Message:          The operation has timed out

InnerException:   <none>
StackTrace (generated):
    SP       IP       Function
    04F9F6C8 6E9C41B8 System_ni!System.Net.WebClient.UploadValues(System.Uri, System.String, System.Collections.Specialized.NameValueCollection)+0x1e8
    04F9F7AC 6E9C422B System_ni!System.Net.WebClient.UploadValues(System.Uri, System.Collections.Specialized.NameValueCollection)+0xb
    04F9F7B4 0021C92C SophosWebDirectorService!SophosWebDirectorService.WinService.UnassignedTimerEventFired()+0x10c
 
StackTraceString: <none>
HResult: 80131509
********************************
Exception object: 01089ad4
Exception type:   System.Net.WebException
Message:          The underlying connection was closed: A pipeline failure occurred.
InnerException:   System.Net.WebException, Use !PrintException 247bcc0d010899bc to see more.
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80131509

Inspecting the code with .NET reflector we could see the service startup code initialized two timed event handlers:

this.unassignedTimer.Elapsed += new ElapsedEventHandler(this.OnUnassignedElapsedTime);
this.reassignTimer.Elapsed += new ElapsedEventHandler(this.OnReassignElapsedTime);

When inspecting these event handlers, we could see they were responsible for calling the UploadValues that was throwing The Operation Has Timed Out

As I didn’t have control of the server (i.e. to improve the server performance) I had to look at what could be changed from client side, i.e. increase the timeout.

However the WebClient class by default doesn’t let you specify a time out for methods like UploadValues.

However methods like UploadValues rely on WebRequest internally, which we could override…

So I created a new C# class library project in Visual Studio, matching the target .NET framework of the EXE with the issue:

I created an “ExtendedWebClient” class with the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace System.Net
{
    public class ExtendedWebClient: WebClient
    {
        private int _timeout;

        public int Timeout
        {
            get
            {
                return _timeout;
            }

            set
            {
                _timeout = value;
            }
        }

        public ExtendedWebClient(int timeout=60000)
        {
            this.Timeout=timeout;
        }

        protected override WebRequest GetWebRequest(Uri address)
        {
            var objWebRequest = base.GetWebRequest(address);
            objWebRequest.Timeout = this.Timeout;
            return objWebRequest;
        }
    }
}

This would allow us to replace WebClient initialization code with our ExtendedWebClient code, allowing us to set the timeout in milliseconds. Then when WebClient methods called WebRequest method, our own version of WebRequest would be called, using our time out we had set.

To replace our extended class with straightforward:

1. We copied our compiled class library project (DLL) to the same path as the EXE we were patching
2. We opened the target EXE in .NET reflector
3. We opened our new DLL in .NET reflector as well

Then using Reflexil we then replaced:

newobj instance void [System.Net.WebClient::.ctor()

In both of our

with:

ldc.i4 0x1d4c0 ; load integer onto stack – 120,000 (ms)
newobj instance void [System.Net.ExtendedWebClient::.ctor(int32)  ; instantiate an instance of our extended web client class

The result looked like this:

image

We saved the patched EXE, made a backup of original, and replaced the original with our new EXE.

Started the Windows Service…and it worked perfectly, no more timeout exceptions, no more hang…kept on doing its job.

So now we have a proven solution we can suggest developer of the service to implement, or information to network teams, proxy management, or host of web server – please improve your performance!

About chentiangemalc

specializes in end-user computing technologies. disclaimer 1) use at your own risk. test any solution in your environment. if you do not understand the impact/consequences of what you're doing please stop, and ask advice from somebody who does. 2) views are my own at the time of posting and do not necessarily represent my current view or the view of my employer and family members/relatives. 3) over the years Microsoft/Citrix/VMWare have given me a few free shirts, pens, paper notebooks/etc. despite these gifts i will try to remain unbiased.
This entry was posted in .NET, Patching, WinDbg and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s