博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
一个有趣的.net程序死锁问题
阅读量:4690 次
发布时间:2019-06-09

本文共 7919 字,大约阅读时间需要 26 分钟。

最近遇到一个有趣的.net 2.0程序死锁问题,一般来说.net死锁问题都是应用程序显示的请求锁的过程出现锁访问顺序不一致导致的,但是本文中这个死锁则相对较为隐晦,隐藏的很深。

调试过程

.net的死锁我们可以通过sos.dll提供的syncblk来查看sync block来发现那些线程拥有锁,哪些线程等待锁。所以我们先通过syncblk来查看以下输出如何。

通过syncblk可以看到目前有一个syncblk已经被线程3(系统线程1814)所拥有。

0:005> .loadby sos mscorwks0:005> !syncblkIndex         SyncBlock MonitorHeld Recursion Owning Thread Info          SyncBlock Owner    3 0000000000f4e678            3         1 0000000000f58320  1814   3   0000000002ef6040 System.Object-----------------------------Total           3CCW             0RCW             0ComClassFactory 0Free            0

切换到3号线程,验证是否系统线程号为1814。输出其调用栈。原来该线程在等待进入另外一个CriticalSection。

0:005> ~3sntdll!ZwWaitForSingleObject+0xa:000007f8`04ba2c2a c3              ret0:003> ~..  3  Id: 88c.1814 Suspend: 1 Teb: 000007f5`ff396000 Unfrozen      Start: mscorwks!Thread::intermediateThreadProc (000007ff`e7d5f33c)      Priority: 0  Priority class: 32  Affinity: f0:003> kvLArgs to Child                                                           : Call Site00000000`00f535d0 00000000`000001f8 00000000`00000000 00000000`00f07c70 : ntdll!ZwWaitForSingleObject+0xa00000000`00000000 000007ff`43010042 00000000`00f535d0 00000000`00000001 : ntdll!RtlpWaitOnCriticalSection+0xea00000000`0000000a 00000000`00000000 00000000`00f1c5b0 00000000`00f535d0 : ntdll!RtlpEnterCriticalSectionContended+0x9400000000`00000a45 00000000`00000000 ffffffff`fffffffe 000007ff`88533480 : mscorwks!UnsafeEEEnterCriticalSection+0x2000000000`00000000 000007ff`e82758f0 ffffffff`fffffffe 00000000`00000000 : mscorwks!CrstBase::Enter+0x123ffffffff`00000001 00000000`00f4d920 00000000`00000000 00000000`00000001 : mscorwks!ListLockEntry::FinishDeadlockAwareEnter+0x2b00000000`1b81e100 00000000`00f48e80 00000000`00000000 00000000`00000000 : mscorwks!ListLockEntry::LockHolder::DeadlockAwareAcquire+0x32000007f8`01eb798a 000007ff`883f3ba0 00000000`00000003 000007ff`883f3958 : mscorwks!MethodTable::DoRunClassInitThrowing+0x6cb00000000`00000000 00000000`00000000 ffffffff`fffffffe 000007ff`e1261560 : mscorwks!MethodTable::CheckRunClassInitThrowing+0x6800000000`02ef8148 00000000`00f0c040 00000000`00000000 000007ff`883f3b90 : mscorwks!MethodDesc::DoPrestub+0x16200000000`02ef4e40 00000000`00000000 00000000`02ef5fe8 00000000`00f58320 : mscorwks!PreStubWorker+0x1fa00000000`1b81ec50 00000000`00000000 00000000`1b81ea50 00000000`00000000 : mscorwks!ThePreStubAMD64+0x8700000000`02ef6028 000007ff`e16334c0 00000000`1b81ed00 00000000`00000000 : StaticConstruction!StaticConstruction.Singleton.LockIt()+0x10c00000000`02ef4c20 000007ff`e16334c0 00000000`1b81ed00 00000000`00000000 : StaticConstruction!StaticConstruction.Program.Thread1Proc()+0x37 ... 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d

将该CriticalSection输出,查看一下拥有的线程是1994,即4号线程。

0:003> !cs 00f535d0 -----------------------------------------Critical section   = 0x0000000000f535d0 (+0xF535D0)DebugInfo          = 0x0000000000f402e0LOCKEDLockCount          = 0x1WaiterWoken        = NoOwningThread       = 0x0000000000001994RecursionCount     = 0x1LockSemaphore      = 0x1F8SpinCount          = 0x00000000020007d00:003> ~   0  Id: 88c.1f7c Suspend: 1 Teb: 000007f5`ff39e000 Unfrozen   1  Id: 88c.18a4 Suspend: 1 Teb: 000007f5`ff39c000 Unfrozen   2  Id: 88c.1438 Suspend: 1 Teb: 000007f5`ff39a000 Unfrozen.  3  Id: 88c.1814 Suspend: 1 Teb: 000007f5`ff396000 Unfrozen   4  Id: 88c.1994 Suspend: 1 Teb: 000007f5`ff394000 Unfrozen#  5  Id: 88c.174c Suspend: 1 Teb: 000007f5`ff1be000 Unfrozen

切换到4号线程查看调用栈,可以看到该调用栈正在等待StaticConstruction.Singleton.LockIt中的Monitor.Enter,即三号线程拥有的syncblk。

0:003> ~4sntdll!ZwWaitForMultipleObjects+0xa:000007f8`04ba319b c3              ret0:004> kLChild-SP          RetAddr           Call Site00000000`1b91d348 000007f8`01e812d2 ntdll!ZwWaitForMultipleObjects+0xa00000000`1b91d350 000007ff`e7c3e809 KERNELBASE!WaitForMultipleObjectsEx+0xe500000000`1b91d630 000007ff`e7c431f1 mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0xc100000000`1b91d6d0 000007ff`e7d403e5 mscorwks!Thread::DoAppropriateAptStateWait+0x4100000000`1b91d730 000007ff`e7c5e95c mscorwks!Thread::DoAppropriateWaitWorker+0x19100000000`1b91d830 000007ff`e7c9d17a mscorwks!Thread::DoAppropriateWait+0x5c00000000`1b91d8a0 000007ff`e7c20fe1 mscorwks!CLREvent::WaitEx+0xbe00000000`1b91d950 000007ff`e7d6e012 mscorwks!AwareLock::EnterEpilog+0xc900000000`1b91da20 000007ff`e817a825 mscorwks!AwareLock::Enter+0x7200000000`1b91da50 000007ff`88540657 mscorwks!JIT_MonEnterWorker_Portable+0xf500000000`1b91dc20 000007ff`885404c3 StaticConstruction!StaticConstruction.Singleton.LockIt()+0xa700000000`1b91dcc0 000007ff`e7ddd562 StaticConstruction!StaticConstruction.Static..cctor()+0x9300000000`1b91dd30 000007ff`e7d1a293 mscorwks!CallDescrWorker+0x8200000000`1b91dd70 000007ff`e7d1a3da mscorwks!CallDescrWorkerWithHandler+0xd300000000`1b91de10 000007ff`e7cfd437 mscorwks!DispatchCallDebuggerWrapper+0x3e00000000`1b91de70 000007ff`e7cf22bd mscorwks!MethodTable::RunClassInitEx+0x20700000000`1b91dfc0 000007ff`e8165f98 mscorwks!MethodTable::DoRunClassInitThrowing+0x74d00000000`1b91ea40 000007ff`e814f162 mscorwks!MethodTable::CheckRunClassInitThrowing+0x6800000000`1b91ea80 000007ff`e7cf72aa mscorwks!MethodDesc::DoPrestub+0x16200000000`1b91eb70 000007ff`e7ddd447 mscorwks!PreStubWorker+0x1fa00000000`1b91ec30 000007ff`88540300 mscorwks!ThePreStubAMD64+0x8700000000`1b91ed00 000007ff`e14e2bdb StaticConstruction!StaticConstruction.Program.Thread2Proc()+0x20...00000000`1b91f8c0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

因此我们可以看到是3号线程和4号线程互相得到了彼此需要请求的锁,因而造成了死锁。

其中4号线程很容易理解,他在等待monitor.enter,但是如何解释3号线程的调用栈,3号线程同样是调用StaticConstruction.Singleton.LockIt,其等待的CritcalSection来自何处又是做什么用的呢?

回头查看程序代码,

using System;using System.Collections.Generic;using System.Text;using System.Threading;namespace StaticConstruction{    class Program    {        static void Main(string[] args)        {            Thread t1 = new Thread(Thread1Proc);            Thread t2 = new Thread(Thread2Proc);            t1.Start(); // lockA -> wait for static construction            t2.Start(); // static consctruction -> lockA            Console.Read();        }        static void Thread1Proc()        {            Singleton.Instance.LockIt();        }        static void Thread2Proc()        {            Static.Foo();        }    }    class Singleton    {        private object lockA = new object();        private Singleton()        {        }        private static Singleton _instance = new Singleton();        public static Singleton Instance        {            get { return _instance; }        }        public void LockIt()        {            Console.WriteLine("Thread {0} waiting lock A", Thread.CurrentThread.ManagedThreadId);            lock (lockA)            {                Console.WriteLine("Thread {0} got lock A", Thread.CurrentThread.ManagedThreadId);                Thread.Sleep(10);                Static.Foo();            }            Console.WriteLine("Thread {0} released lock A", Thread.CurrentThread.ManagedThreadId);        }    }    class Static    {        private Static()        {        }        static Static()        {            Console.WriteLine("Static constructor begin by thread {0}", Thread.CurrentThread.ManagedThreadId);            Singleton.Instance.LockIt();            Console.WriteLine("Static constructor end by thread {0}", Thread.CurrentThread.ManagedThreadId);        }        public static void Foo()        {            Console.WriteLine("Static method Foo begin");            Console.WriteLine("Static method Foo end");        }    }}

 

原来3号线程在StaticConstruction.Singleton.LockIt得到锁之后紧接着要调用一个静态类Static的方法,而回到4号线程的调用栈可以看到该静态类Static构造函数还在执行过程中,在这个静态类被构造成功之前,3号线程当然无法调用该类上的方法,原来CLR通过在methodtable初始化过程中通过一个CriticalSection来确保线程安全。因此也使得在这种特殊情况下会有死锁情况发生。

明白了这一点,解决方法就很简单了,将锁的请求顺序统一,只要大家都按照同样的请求顺序来请求锁,死锁问题就不会发生了。

转载于:https://www.cnblogs.com/developersupport/archive/2013/04/20/dotnet-deadlock.html

你可能感兴趣的文章
poj3440--Coin Toss(几何上的概率)
查看>>
STL
查看>>
learning uboot source command
查看>>
svn diff 只显示文件名
查看>>
代码轻视频系列#001
查看>>
电脑蓝屏原因速查
查看>>
python3 输出内容格式化
查看>>
hdu 2089 不要62--数位dp入门
查看>>
谷歌地图自定义popup框
查看>>
机器学习实战——k-邻近算法:约会网站
查看>>
洛谷P3111 [USACO14DEC]牛慢跑Cow Jog_Sliver
查看>>
分银子
查看>>
fiddler抓包后Jmeter实现登录接口
查看>>
士兵杀敌(三)_RMQ(区间最值查询)
查看>>
实验十四:线程设计
查看>>
selenium自动化测试配置工具整理
查看>>
XE5 搭建DataSnap服务
查看>>
ADO BUG之'无法为更新定位行....' 解决之道
查看>>
多播技术总结
查看>>
hdu-5656 CA Loves GCD(dp+数论)
查看>>