is there a way to resume an ML.net cli training to where it was before a crash?
I have a lot of data in the folder C:\Users\wwww\AppData\Local\Temp\AutoML-NNI\Experiment-9K67B4 but I do not know how to make mlnet start from there.
Detail:
I used the cli, ie "mlnet classicfiaction...."
I trained for a few days, but I made a mistake which used a lot of memory on my computer, which stopped the mlnet process. I would like to start mlnet to where it left so it can continue from there
Note: I would love to be able to stop earlier and continue/resume a training from the cli.
Thanks
w
2022-08-10 15:03:24.3091 DEBUG System.InvalidOperationException: Event we were waiting on was subject to an exception
---> System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.Array.Resize[T](T[]& array, Int32 newSize)
at Microsoft.ML.Internal.Utilities.ArrayUtils.EnsureSize[T](T[]& array, Int32 min, Int32 max, Boolean keepOld, Boolean& resized)
at Microsoft.ML.Data.CacheDataView.ColumnCache.ImplOne`1.CacheCurrent()
at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter)
Relate to
https://github.com/dotnet/machinelearning/issues/6286
https://github.com/dotnet/machinelearning/issues/6287
Related
I'm using FlaUI library to automate desktop app.
I took an error on try to run code to take a window of launched program. Error: Could not find process with id: ***'
Details of error.
System.Exception
HResult=0x80131500
Message=Could not find process with id: 13536
Source=FlaUI.Core
StackTrace:
at FlaUI.Core.Application.FindProcess(Int32 processId)
at FlaUI.Core.Application.<WaitWhileMainHandleIsMissing>b__33_0()
at FlaUI.Core.Tools.Retry.<>c__DisplayClass12_0.<WhileTrue>b__0()
at FlaUI.Core.Tools.Retry.While[T](Func`1 retryMethod, Func`2 checkMethod, Nullable`1 timeout, Nullable`1 interval, Boolean throwOnTimeout, Boolean ignoreException, String timeoutMessage, Boolean lastValueOnTimeout, T defaultOnTimeout)
at FlaUI.Core.Application.WaitWhileMainHandleIsMissing(Nullable`1 waitTimeout)
at FlaUI.Core.Application.GetMainWindow(AutomationBase automation, Nullable`1 waitTimeout)
at Program.<Main>$(String[] args) in C:\Users\jekug\source\repos\FlaUI test\FlaUI test\Program.cs:line 10
This exception was originally thrown at this call stack:
[External Code]
ArgumentException: Process with an Id of 13536 is not running.
What can be wrong and are there other ways to take
a window in FlaUI?
Thank you
calc.exe on any newer windows is actually just a proxy executable to in the end run the "WindowsStoreApp" calculator which behaves quite differently. For those apps, you need to use LaunchStoreApp instead of Launch in order to get the correct process.
Event Viewer on my workstation have the following error log:
ERROR NServiceBus.Transports.Msmq.MsmqDequeueStrategy [(null)] - Error in receiving messages.
System.Transactions.TransactionAbortedException: The transaction has aborted. ---> System.Transactions.TransactionManagerCommunicationException: Communication with the underlying transaction manager has failed. ---> System.Runtime.InteropServices.COMException: The Transaction Manager is not available. (Exception from HRESULT: 0x8004D01B)
at System.Transactions.Oletx.IDtcProxyShimFactory.ConnectToProxy(String nodeName, Guid resourceManagerIdentifier, IntPtr managedIdentifier, Boolean& nodeNameMatches, UInt32& whereaboutsSize, CoTaskMemHandle& whereaboutsBuffer, IResourceManagerShim& resourceManagerShim)
at System.Transactions.Oletx.DtcTransactionManager.Initialize()
--- End of inner exception stack trace ---
at System.Transactions.Oletx.OletxTransactionManager.ProxyException(COMException comException)
at System.Transactions.Oletx.DtcTransactionManager.Initialize()
at System.Transactions.Oletx.DtcTransactionManager.get_ProxyShimFactory()
at System.Transactions.Oletx.OletxTransactionManager.CreateTransaction(TransactionOptions properties)
at System.Transactions.TransactionStatePromoted.EnterState(InternalTransaction tx)
--- End of inner exception stack trace ---
at System.Transactions.TransactionStateAborted.CheckForFinishedTransaction(InternalTransaction tx)
at System.Transactions.Transaction.Promote()
at System.Transactions.TransactionInterop.ConvertToOletxTransaction(Transaction transaction)
at System.Transactions.TransactionInterop.GetDtcTransaction(Transaction transaction)
at System.Messaging.MessageQueue.StaleSafeReceiveMessage(UInt32 timeout, Int32 action, MQPROPS properties, NativeOverlapped* overlapped, ReceiveCallback receiveCallback, CursorHandle cursorHandle, IntPtr transaction)
at System.Messaging.MessageQueue.ReceiveCurrent(TimeSpan timeout, Int32 action, CursorHandle cursor, MessagePropertyFilter filter, MessageQueueTransaction internalTransaction, MessageQueueTransactionType transactionType)
at System.Messaging.MessageQueue.Receive(TimeSpan timeout, MessageQueueTransactionType transactionType)
at NServiceBus.Transports.Msmq.MsmqDequeueStrategy.TryReceiveMessage(Func`1 receive, Message& message) in C:\BuildAgent\work\3206e2123f54fce4\src\NServiceBus.Core\Transports\Msmq\MsmqDequeueStrategy.cs:line 332
Facts:
Distributed Transaction Coordinator (DTC) Component Service ==> DTC Enabled
Distributed Transaction Coordinator (DTC) Service ==> Running (referenced to Answered Stack Oveflow Question)
Realtek Audio Universal Service ==> Disabled (as per reference in Fix for 0x8004d01b)
Can anyone recommend a direction on how to check/troubleshoot the cause of the issue?
After long ardous hours, the issue I encountered boils down to permission issue. The services involved in DTC which uses credentials of LocalSystem do not have enough permission and blocked by company-imposed policy implemented on the server.
Unfortunately, I only managed to find an alternative or workaround by using account that have enough permissions to operate and manage the services involved (including custom services).
#gnud - Thanks a lot for your inputs giving me a direction for investigation and that help me continue move forward on my investigation. I appreciate the help. Thanks.
I got a memory dump. I can get the normal callstack (with line number)
When I use Debug Diag to analyze the dump I got this callstack on thread 62.
.NET Call Stack
[[HelperMethodFrame_1OBJ] (System.Threading.WaitHandle.WaitOneNative)] System.Threading.WaitHandle.WaitOneNative(System.Runtime.InteropServices.SafeHandle, UInt32, Boolean, Boolean)
mscorlib_ni!System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean)+21
mscorlib_ni!System.Threading.WaitHandle.WaitOne(Int32, Boolean)+31
CaptureServices.GenericInfrastructure.ExportLogic.ChannelsThread.ChannelsStateThread()+bb
mscorlib_ni!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+15e
mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)+17
mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)+52
mscorlib_ni!System.Threading.ThreadHelper.ThreadStart()+52
[[GCFrame]]
[[DebuggerU2MCatchHandlerFrame]]
As I understand .NET has some mechanism to shows human readable names instead of adresses. Now I want this line in WinDbg:
CaptureUtilities.AudioProcessing.APProcessorThread.IterateAPStreamProcessorQueue()+49
I open WinDbg and load the dump. I execute ~62 k and get
Child-SP RetAddr Call Site
00000016`4965e0c8 00007ffc`b59113ed ntdll!NtWaitForMultipleObjects+0xa
00000016`4965e0d0 00007ffc`abde77be KERNELBASE!WaitForMultipleObjectsEx+0xe1
00000016`4965e3b0 00007ffc`abde7658 clr!WaitForMultipleObjectsEx_SO_TOLERANT+0x62
00000016`4965e410 00007ffc`abde7451 clr!Thread::DoAppropriateWaitWorker+0x1e4
00000016`4965e510 00007ffc`abdebd15 clr!Thread::DoAppropriateWait+0x7d
00000016`4965e590 00007ffc`a94ecdf1 clr!WaitHandleNative::CorWaitOneNative+0x165
00000016`4965e7c0 00007ffc`a94ecdc1 mscorlib_ni+0x48cdf1
00000016`4965e7f0 00007ffc`4cf2e97b mscorlib_ni+0x48cdc1
00000016`4965e830 00007ffc`a94e674e 0x00007ffc`4cf2e97b
00000016`4965e890 00007ffc`a94e65e7 mscorlib_ni+0x48674e
00000016`4965e960 00007ffc`a94e65a2 mscorlib_ni+0x4865e7
00000016`4965e990 00007ffc`a94ed1f2 mscorlib_ni+0x4865a2
00000016`4965e9e0 00007ffc`abc36a53 mscorlib_ni+0x48d1f2
00000016`4965ea20 00007ffc`abc36913 clr!CallDescrWorkerInternal+0x83
Ok, as I understand it is the same. Now we have
0x00007ffc`4cf2e97b
instead of
CaptureServices.GenericInfrastructure.ExportLogic.ChannelsThread.ChannelsStateThread()+bb
So I have Microsoft debug symbols, now I need to load my own symbols to see the callstack.
The question is - do I need to load all debug symbols for my projects or I need only debug symbols for dll which contains CaptureServices.GenericInfrastructure.ExportLogic?
Or maybe I need to load only part of my debug symbols to handle this thread?
Try !sosex.mk. It gives a user-friendly stack trace with interleaved managed and native frames. I do not believe that this is a symbol issue. Also, when you have a managed address, you can pass it to !sosex.mln to see what's located there, but I think you're already aware of this command.
The k command as in ~62k is a command for the native call stack. It does dot show any .NET stuff (except the native methods in clr.dll).
To see the .NET stack, you need to load the .NET extension for WinDbg:
.loadby sos clr
And then use a command of that extension to see the .NET call stack. Switch to thread 62 first
~62s
!clrstack
!dumpstack
IMHO those commands will load symbols from PDBs when needed. If you get symbol warnings, see How to fix symbols in WinDbg
You need the debug symbols of whatever library that function belongs to.
I'm looking into a mini-dump file where the main thread (c++) utilized CLR to launch a managed (C#.NET) window, an exception was thrown in the managed portion, and crashed the application. I've been searching around looking at techniques to examine an exception details for clues, however they're mainly for one or the other (an entirely unmanaged stack & thread or an entirely managed stack & thread).
The portion of the managed callstack is below, where I can see an exception was raised inside the .NET portion, but I'm not really sure of a method to digging into viewing the details of what was raised. I'm still fairly new at digging through a .dmp file, so any guidance is greatly appreciated.
001ddb04 68b92a42 KERNELBASE!RaiseException+0x58
001ddba8 68c655ef clr!RaiseTheExceptionInternalOnly+0x276
001ddbd8 68c6de52 clr!UnwindAndContinueRethrowHelperAfterCatch+0x83
001ddc6c 627528df clr!CEEInfo::resolveToken+0x59b
001ddc7c 62778872 clrjit!Compiler::impResolveToken+0x3a
001de3ac 62751d53 clrjit!Compiler::impImportBlockCode+0x29b3
001de42c 62751f48 clrjit!Compiler::impImportBlock+0x5f
001de444 62753405 clrjit!Compiler::impImport+0x235
001de464 62753635 clrjit!Compiler::compCompile+0x63
001de4a0 62753823 clrjit!Compiler::compCompileHelper+0x2fa
001de518 627536f6 clrjit!Compiler::compCompile+0x213
001de608 6275385f clrjit!jitNativeCode+0x1e3
001de62c 68a74710 clrjit!CILJit::compileMethod+0x25
001de67c 68a747a9 clr!invokeCompileMethodHelper+0x41
001de6bc 68a747eb clr!invokeCompileMethod+0x31
001de720 68a73684 clr!CallCompileMethodWithSEHWrapper+0x2a
001deab8 68a73920 clr!UnsafeJitFunction+0x3ca
001deb94 68a81e5e clr!MethodDesc::MakeJitWorker+0x36b
001dec08 68a550b6 clr!MethodDesc::DoPrestub+0x59d
001dec70 68a44279 clr!PreStubWorker+0xed
001deca0 16c5185a clr!ThePreStub+0x16
001deda4 5ae8f887 0x16c5185a
001dedc0 5ae20c9c MYDLL!CLoader::InvokeCSharpControl
0x16c5185a is an address in memory where the .NET code has been compiled by the JIT compiler. Due to the just-in-time compilation, there's no symbol like in C++ and you need different tools (extensions for WinDbg).
First, check if it's a .NET exception with .exr -1. Except for a few exceptions, the code should be 0xE0434F4D (.COM in ASCII characters).
If that's the case, load the SOS extension to analyze the .NET details: .loadby sos clr. Next, run the command !PrintException (!pe in short) to get details about the exception and !ClrStack (casing is not relevant) to get details about the .NET call stack.
There may be more details available if you have a good crash dump for .NET.
I am getting this error when loading large files into memory. What I don't understand is that my memory (as monitored by Task Manager still says only 7G used on a machine that has 32G). Is this memory exception referring to a constrained part of this memory? And, if so, how do I allocate more. The code producing the error is below.
System.OutOfMemoryException occurred
HResult=-2147024882
Message=Exception of type 'System.OutOfMemoryException' was thrown.
Source=mscorlib
StackTrace:
at System.IO.File.InternalReadAllBytes(String path, Boolean checkHost)
InnerException:
Active (x86) and 64bit Windows 7
Code:
public void LoadAllBinaries(string aKey)
{
if (msDatas != null)
return;
msDatas = new SortedDictionary<string, byte[]>();
var dataFiles = File.ReadAllLines(G.ConfigDir + #"\dates_out.txt");
foreach (var df in dataFiles)
{
try
{
string fn = G.DataDir + "\\n" + aKey + df + ".dft";
byte[] ba = File.ReadAllBytes(fn);
msDatas.Add(fn,ba);
ba = null;
}
catch (Exception e)
{
Console.WriteLine("OpenSLoadAllBinariestreams ERROR: " + e.Message);
}
}
}
You've either got a memory leak somewhere, or a handle leak. Both will give you the Out Of Memory error.
You can see if you've got a handle leak by opening TaskManager, go to View, Select Columns..., and turn on Handles. Start your process and let it run through a few iterations. Jot down the numbers you get. Then come back in a few hours and compare the numbers. If the handles numbers are considerably higher, you're not Disposing an object than you should be.
Also ensure that you dispose of your objects as well. You can also try use System.GC.Collect for garbage collection. Hope this helps.
Not exactly sure but your exception stacktrace says
at System.IO.File.InternalReadAllBytes(String path, Boolean checkHost)
To me it looks like the below line of code is the culprit and it's throwing OOM exception cause most most probably the size of dates_out.txt is high enough.
File.ReadAllLines(G.ConfigDir + #"\dates_out.txt")
What's the size of the file dates_out.txt you are reading? if it's in size of GB then try reading it line by line rather reading the entire file at once (as suggested by others over comment).
EDIT:
If you want to see current memory consumption of a program’s object, then you can get it using GC.GetTotalMemory method. You can also use windbg.exe tool or Microsoft’s CLR Profiler
long memoryUsed = GC.GetTotalMemory (true);
In visual studio go to Project Properties and Build screen. Check that where Any CPU is selected that the platform target checkbox for Prefer 32 bit is not checked.
You can also explicitly select x64 there.
This resolved the same error for me.