Scenario
I made a wrapper class for a C DLL so that I can call its functions from managed code and access them from a C# WCF service. Everything seems fine, but when the C library allocates a lot of memory, IIS does not seem to like it: it gives me a stack overflow exception.
Question
When I allocate memory in the C DLL, it breaks in IIS.
char stmt[163840+1]; // allocating a lot of memory
char stmt2[163840+1]; // allocating a lot of memory
Does IIS have a special setting that allows more memory to be allocated from the C module?
Code that exposes the C DLL functions
Steps:
1. use SetDllDirectory
2. LoadLibrary
3. then call my function with DllImport
4. FreeLibrary
The NativeClassWrapper code (simplified)
[SuppressUnmanagedCodeSecurity]
public static class NativeClassWrapper
{
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern IntPtr LoadLibrary(string hModule);
[DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
public static extern bool FreeLibrary(IntPtr hModule);
[DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
public static extern bool SetDllDirectory(string lpPathName);
[DllImport("myCDll.dll", EntryPoint = "MyFunction", ExactSpelling = false, SetLastError = true, CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Auto)]
public static extern int MyFunction(
);
}
C code
long MyFunction () {
char stmt[163840+1]; // allocating a lot of memory
char stmt2[163840+1];
long lReturn = 0L;
*stmt = '\0';
*stmt2 = '\0';
return lReturn;
}
char stmt[163840+1]; // allocating a lot of memory
char stmt2[163840+1];
These allocations are responsible for the stack overflow. You are attempting to allocate large arrays on the stack, and the stack is not large enough. The default stack size for a Windows application is 1 MB, and the arrays on their own will not overflow such a stack. However, it's quite plausible that IIS uses smaller stacks, or that there is other code that you have not shown that makes similar large stack allocations.
If you really need to allocate such large arrays, you should do so on the heap.
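As a minimal sketch of that suggestion, assuming the two buffers are only needed for the duration of the call, the function from the question could obtain them with malloc instead. The name MyFunctionHeap, the STMT_SIZE macro and the -1 error code are illustrative choices, not part of the original DLL.

```c
#include <stdlib.h>

/* Same size as the arrays in the question. */
#define STMT_SIZE (163840 + 1)

/* Hypothetical heap-based variant of MyFunction: the large buffers come
   from the heap, so only two pointers live on the thread's stack. */
long MyFunctionHeap(void)
{
    long lReturn = 0L;
    char *stmt = malloc(STMT_SIZE);
    char *stmt2 = malloc(STMT_SIZE);

    if (stmt == NULL || stmt2 == NULL) {
        /* Assumed error convention: -1 signals an allocation failure. */
        lReturn = -1L;
    } else {
        *stmt = '\0';
        *stmt2 = '\0';
        /* ... work with the buffers here ... */
    }

    free(stmt);   /* free(NULL) is a no-op, so this is safe on failure */
    free(stmt2);
    return lReturn;
}
```

With this shape the function's stack footprint no longer depends on the buffer size, so it should behave the same whether the calling thread has a 256 KB or a 1 MB stack.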
Thanks to David Heffernan for your response. I accepted his answer because it led me to this solution. The solution I chose was to run the code that communicates with the C DLL on a separate thread and set that thread's stack size to 1 MB instead of the default 256 KB.
public void StartNewThread()
{
const int stacksize = 1024*1024; // 1MB
var thread = new Thread(NativeDllProcess, stacksize);
thread.Start();
thread.Join(); // Wait until the thread has finished
// .. rest of code here
}
private void NativeDllProcess(object info)
{
// ..... Code that calls the C DLL functions
}
More information here:
maximum / default stack size IIS
By default, the maximum stack size of a thread that is created in a native IIS process is 256 KB
SUMMARY
By default, the maximum stack size of a thread that is created by a native Microsoft Internet Information Services (IIS) process is 256 KB prior to Windows Server 2008. For example, when Inetinfo.exe, DLLHost.exe, or W3wp.exe creates a thread in IIS 5.0 or IIS 6.0, the maximum stack size of the thread is 256 KB by default. You can also explicitly call the CreateThread function to specify the stack size of the thread. In Microsoft Windows 2000, if the Microsoft ASP.NET Worker Process (ASPNet_wp.exe) creates a thread, the maximum stack size of the thread is 1 MB. In Windows Server 2008 and higher, the maximum stack size of a thread running on 32-bit version of IIS is 256 KB, and on an x64 server is 512 KB.
NOTE: Internet Information Services is a multi-threaded web application platform that allows application code running inside of each worker process to utilize hundreds or more threads at once as necessary. Each thread is bound by the same stack size limit in order to keep the virtual memory usage of the process within manageable limits.
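To see why the 256 KB default matters for the function in the question, the sizes can be compared directly. The sketch below only does that arithmetic; the macro and function names are made up for illustration, and the check ignores every other frame on the stack, so it is a lower bound rather than a guarantee.

```c
#include <stddef.h>

/* Bytes taken by the two arrays declared in MyFunction:
   2 * 163,841 = 327,682 bytes, roughly 320 KB. */
#define LOCALS_BYTES (2u * (163840u + 1u))

/* Returns 1 if the two arrays alone fit in a stack of the given size,
   0 otherwise. Other frames and guard pages are not accounted for. */
int locals_fit_in_stack(size_t stack_bytes)
{
    return LOCALS_BYTES <= stack_bytes;
}
```

Under these numbers the arrays exceed a 256 KB stack (262,144 bytes) on their own, while they fit within 512 KB or 1 MB, which is consistent with the overflow appearing under IIS and disappearing on a thread created with a 1 MB stack.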
My project consists of measuring temperature under different "loads" placed on the Raspberry Pi, to see whether making the Raspberry Pi work "harder" affects the temperature sensor input. I am using Windows 10 on my Raspberry Pi 2 Model B and am having trouble finding enough sources about this, hence asking here.
Is it possible to place a load on the Raspberry Pi programmatically from Visual Studio, as a universal application, for example by forcing it to increase its usage of the available RAM or by limiting it? If so, what would be the best solution for this?
Is there any way to check programmatically how much RAM it is using in total, with functions already available in a universal application project?
Is there any other way to "place loads" on the Raspberry Pi and to measure how much load you are forcing on it?
Any type of help is very much appreciated. Thank you in advance for the effort put into answering these questions!
Is there any way to check, programmatically how much RAM it is using
in total, by already implemented functions in a universal application
project?
There is no direct API for getting the total RAM in use, but you can query the total and available memory. In C#, do it like this:
[StructLayout(LayoutKind.Sequential)]
private class MEMORYSTATUSEX
{
public uint dwLength;
public uint dwMemoryLoad;
public ulong ullTotalPhys;
public ulong ullAvailPhys;
public ulong ullTotalPageFile;
public ulong ullAvailPageFile;
public ulong ullTotalVirtual;
public ulong ullAvailVirtual;
public ulong ullAvailExtendedVirtual;
public MEMORYSTATUSEX()
{
this.dwLength = (uint)Marshal.SizeOf(typeof(MEMORYSTATUSEX));
}
}
[return: MarshalAs(UnmanagedType.Bool)]
[DllImport("kernel32.dll", SetLastError = true)]
static extern bool GlobalMemoryStatusEx([In, Out] MEMORYSTATUSEX lpBuffer);
// Alternate version using "ref". It is only valid if MEMORYSTATUSEX is
// declared as a struct rather than a class (that variant is not shown here).
[return: MarshalAs(UnmanagedType.Bool)]
[DllImport("kernel32.dll", EntryPoint = "GlobalMemoryStatusEx", SetLastError = true)]
static extern bool _GlobalMemoryStatusEx(ref MEMORYSTATUSEX lpBuffer);
void GetProcessUsage()
{
MEMORYSTATUSEX data = new MEMORYSTATUSEX();
GlobalMemoryStatusEx(data);
System.Diagnostics.Debug.WriteLine(data.ullTotalPageFile + "\t\t" + data.ullAvailPageFile);
}
Is there any other way to "place loads" on the raspberry and being
able to measure how much load you are forcing it to work?
You may try this:
List<byte[]> list = new List<byte[]>();
while (true)
{
var buf = new byte[1024 * 1024 * 50];
list.Add(buf);
System.Diagnostics.Debug.WriteLine("Allocating memory");
await Task.Delay(1000);
}
The memory available to an app is about 390 MB on a Raspberry Pi 3 with 1 GB of RAM; on 512 MB models it seems to be about 185 MB. You can read Windows.System.MemoryManager.AppMemoryUsageLimit to confirm the limit on your device.
I am currently trying to write a console application in C# with two screen buffers, which should be swapped back and forth (much like VSync on a modern GPU). Since the System.Console class does not provide a way to switch buffers, I had to P/Invoke several methods from kernel32.dll.
This is my current code, grossly simplified:
static void Main(string[] args)
{
IntPtr oldBuffer = GetStdHandle(-11); //Gets the handle for the default console buffer
IntPtr newBuffer = CreateConsoleScreenBuffer(0, 0x00000001, IntPtr.Zero, 1, 0); //Creates a new console buffer
/* Write data to newBuffer */
SetConsoleActiveScreenBuffer(newBuffer);
}
The following things occurred:
The screen remains empty, even though it should be displaying newBuffer.
When I write to oldBuffer instead of newBuffer, the data appears immediately. Thus, my way of writing into the buffer should be correct.
Upon calling SetConsoleActiveScreenBuffer(newBuffer), the error code is now 6, which means invalid handle. This is strange, as the handle is not -1, which the documentation describes as invalid.
I should note that I very rarely worked with the Win32 API directly and have very little understanding of common Win32-related problems. I would appreciate any sort of help.
As IInspectable points out in the comments, you're setting dwDesiredAccess to zero. That gives you a handle with no access permissions. There are some edge cases where such a handle is useful, but this isn't one of them.
The only slight oddity is that you're getting "invalid handle" rather than "access denied". I'm guessing you're running Windows 7, so the handle is a user-mode object (a "pseudohandle") rather than a kernel handle.
At any rate, you need to set dwDesiredAccess to GENERIC_READ | GENERIC_WRITE as shown in the sample code.
Also, as Hans pointed out in the comments, the declaration on pinvoke.net was incorrect, specifying the last argument as a four-byte integer rather than a pointer-sized integer. I believe the correct declaration is
[DllImport("kernel32.dll", SetLastError = true)]
static extern IntPtr CreateConsoleScreenBuffer(
uint dwDesiredAccess,
uint dwShareMode,
IntPtr lpSecurityAttributes,
uint dwFlags,
IntPtr lpScreenBufferData
);
I would like to check whether a process's threads (that is, the whole process) are suspended.
I'm obtaining each thread of the process with this code:
var threads = Proc.Threads;
for (int x = 0; x < threads.Count; x++) {
var thread = threads[x];
However, System.Diagnostics.ThreadState doesn't contain Suspended, but System.Threading.ThreadState does. How do I convert System.Diagnostics.ThreadState to System.Threading.ThreadState, or is there some other way to check this? I'm not trying to suspend or resume the threads; I just want to know how Process Hacker and Process Explorer do it.
Microsoft made a big mistake in .NET version 1.0, they added the Thread.Suspend() and Resume() methods. Those methods were widely abused, programmers used them to implement thread synchronization. For which they are entirely inappropriate. Problem was that it usually worked. But call Suspend() at an unlucky time and you'll freeze a thread while it is buried inside a Windows call, holding a global lock. And causing the entire program to deadlock.
It was not the only design mistake they made, the Synchronized method on the collection classes was quite a disaster as well. Widely misinterpreted as "returns a thread-safe collection".
Live and learn, this all got fixed in .NET 2.0. One big overhaul was that a Thread may not necessarily be an operating system thread anymore, that never actually got implemented. But explains why there are two ThreadState enumerations, one for Thread (the .NET version) and another for ProcessThread (the operating system version). And they closed the loophole on programmers abusing Suspend/Resume, the methods were declared obsolete. And they closed the backdoor as well, you can't find out from ProcessThread that a thread is suspended.
Feature, not a bug. Don't make the same mistake, knowing that a thread is suspended is useless knowledge, it may well not be suspended anymore a microsecond later.
This will help someone.
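Process proc = Process.GetProcessById(31448);
ProcessThread first = proc.Threads[0];
// WaitReason throws InvalidOperationException unless the thread is in the Wait state
if (first.ThreadState == System.Diagnostics.ThreadState.Wait &&
    first.WaitReason == ThreadWaitReason.Suspended)
{
    // the first thread of the process is suspended
}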
An operating system thread isn't the same as a .NET thread. Process.Threads returns OS threads, each of which may or may not correspond to a .NET thread.
You can look at ProcessThread.WaitReason, but it doesn't correspond to .NET wait states.
You could improperly use SuspendThread or Wow64SuspendThread to find out whether the thread was suspended, then use ResumeThread to restore the situation.
SuspendThread returns: "If the function succeeds, the return value is the thread's previous suspend count;"
Declarations:
[Flags] public enum ThreadAccess : int {
TERMINATE = (0x0001),
SUSPEND_RESUME = (0x0002),
GET_CONTEXT = (0x0008),
SET_CONTEXT = (0x0010),
SET_INFORMATION = (0x0020),
QUERY_INFORMATION = (0x0040),
SET_THREAD_TOKEN = (0x0080),
IMPERSONATE = (0x0100),
DIRECT_IMPERSONATION = (0x0200)}
[DllImport("kernel32.dll")]
static extern IntPtr OpenThread(
ThreadAccess dwDesiredAccess,
bool bInheritHandle,
uint dwThreadId);
[DllImport("kernel32.dll")]
static extern uint SuspendThread(IntPtr hThread);
[DllImport("kernel32.dll")]
static extern int ResumeThread(IntPtr hThread);
[DllImport("kernel32", CharSet = CharSet.Auto, SetLastError = true)]
static extern bool CloseHandle(IntPtr handle);
(The Wow64SuspendThread link is written out here because I need 10 reputation to post more than 2 links: https://msdn.microsoft.com/it-it/library/windows/desktop/ms687400(v=vs.85).aspx)
How do I set MinWorkingSet and MaxWorkingSet for a 64-bit .NET process?
P.S. I can set MinWorkingSet and MaxWorkingSet for a 32-bit process, as follows:
[DllImport("KERNEL32.DLL", EntryPoint = "SetProcessWorkingSetSize", SetLastError = true, CallingConvention = CallingConvention.StdCall)]
internal static extern bool SetProcessWorkingSetSize(IntPtr pProcess, int dwMinimumWorkingSetSize, int dwMaximumWorkingSetSize);
[DllImport("KERNEL32.DLL", EntryPoint = "GetCurrentProcess", SetLastError = true, CallingConvention = CallingConvention.StdCall)]
internal static extern IntPtr MyGetCurrentProcess();
// In main():
SetProcessWorkingSetSize(Process.GetCurrentProcess().Handle, int.MaxValue, int.MaxValue);
Don't P/Invoke this; just use the Process.GetCurrentProcess().MinWorkingSet property directly.
Very high odds that this won't make any difference. Soft page faults are entirely normal and resolved very quickly if the machine has enough RAM. Takes ~0.7 microseconds on my laptop. You can't avoid them; it is the behavior of a demand-paged virtual memory operating system like Windows. Very cheap, as long as there is a free page readily available.
But if it "blips" your program's performance then you need to consider the likelihood that a page isn't readily available and a hard page fault was triggered in another process. The page fault does get expensive if the RAM page must be stolen from another process: its content has to be stored in the paging file and the page has to be reset back to zero first. That can add up quickly; hundreds of microseconds isn't unusual.
The basic law of "there is no free lunch": you need to run fewer processes or buy more RAM. With the latter option the sane choice, 8 gigabytes sets you back about 75 bucks today. Complete steal.
All you have to do is change your declaration like so:
[DllImport("KERNEL32.DLL", EntryPoint = "SetProcessWorkingSetSize",
SetLastError = true, CallingConvention = CallingConvention.StdCall)]
internal static extern bool SetProcessWorkingSetSize(IntPtr pProcess,
long dwMinimumWorkingSetSize, long dwMaximumWorkingSetSize);
The reason is the definition of the SetProcessWorkingSetSize function:
BOOL WINAPI SetProcessWorkingSetSize(
_In_ HANDLE hProcess,
_In_ SIZE_T dwMinimumWorkingSetSize,
_In_ SIZE_T dwMaximumWorkingSetSize
);
Note that it doesn't use a DWORD (a 32-bit integer) but a SIZE_T, which is defined as:
The maximum number of bytes to which a pointer can point. Use for a
count that must span the full range of a pointer. This type is
declared in BaseTsd.h as follows:
typedef ULONG_PTR SIZE_T;
This means that it's a 64-bit value in a 64-bit process, hence the ability to change the parameters to long and have the function work on 64-bit systems. Also, from the section of MSDN titled "Common Visual C++ 64-bit Migration Issues":
size_t, time_t, and ptrdiff_t are 64-bit values on 64-bit Windows operating systems.
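A small portable sketch can make the width argument concrete. It uses the standard size_t as a stand-in for the Windows SIZE_T (an assumption, since SIZE_T itself needs the Windows headers), and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if size_t has the same width as a pointer on this platform,
   which is the property that makes SIZE_T 64 bits wide in a 64-bit process. */
int size_t_is_pointer_sized(void)
{
    return sizeof(size_t) == sizeof(void *);
}

/* Returns 1 if a size in bytes is representable as a 32-bit signed integer,
   i.e. whether an int-based declaration could carry the value at all. */
int fits_in_int32(uint64_t bytes)
{
    return bytes <= (uint64_t)INT32_MAX;
}
```

A working-set size of 4 GB, for instance, does not pass fits_in_int32, which is one way to see why the int-based declaration is not enough once the parameters are really pointer-sized.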
However, this presents a bit of a dilemma, in that you don't want to have to compile platform-specific assemblies (that would be a PITA). You can get around this by taking advantage of the EntryPoint field on the DllImportAttribute class (which you're already doing) to have two method declarations:
[DllImport("KERNEL32.DLL", EntryPoint = "SetProcessWorkingSetSize",
SetLastError = true, CallingConvention = CallingConvention.StdCall)]
internal static extern bool SetProcessWorkingSetSize32(IntPtr pProcess,
int dwMinimumWorkingSetSize, int dwMaximumWorkingSetSize);
[DllImport("KERNEL32.DLL", EntryPoint = "SetProcessWorkingSetSize",
SetLastError = true, CallingConvention = CallingConvention.StdCall)]
internal static extern bool SetProcessWorkingSetSize64(IntPtr pProcess,
long dwMinimumWorkingSetSize, long dwMaximumWorkingSetSize);
Now you have two separate signatures. However, knowing which signature to call is still an issue. You don't want to place conditional checks everywhere. To that end, I'd recommend creating a method that performs the check for you and call that.
You'll want to use the Is64BitProcess property on the Environment class to make this determination. Don't use the Is64BitOperatingSystem property. You want the former because 32-bit processes can be run on 64-bit operating systems, and you want to make sure that your code is resilient to that; just checking to see if the operating system is 64 bit doesn't give you the entire picture.
Background:
I've written a multi-threaded application in Win32, which I start from C# code using the Process class from the System.Diagnostics namespace.
Now, in the C# code, I want to get the name/symbol of the start address of each thread created in the Win32 application so that I can log thread-related information, such as CPU usage, to a database. Basically, the C# code starts multiple instances of the Win32 application, monitors them, kills them if needed, and then logs info/errors/exceptions/reasons/etc. to the database.
For this purpose, I've wrapped two Win32 APIs, namely SymInitialize and SymFromAddr, in a programmer-friendly API of my own, as listed below:
extern "C"
{
//wraps SymInitialize
DllExport bool initialize_handler(HANDLE hModule);
//wraps SymFromAddr
DllExport bool get_function_symbol(HANDLE hModule, //in
void *address, //in
char *name); //out
}
I then call these APIs from C# code using P/Invoke. But it does not work, and GetLastError gives error code 126, which means:
The specified module could not be found
I'm passing Process.Handle as hModule to both functions; initialize_handler seems to work, but get_function_symbol does not; it gives the above error. I'm not sure if I'm passing the correct handle. I tried passing the following handles:
Process.MainWindowHandle
Process.MainModule.BaseAddress
Both fail at the first step itself (i.e., when calling initialize_handler). I'm passing Process.Threads[i].StartAddress as the second argument, and that seems to be the cause of the failure, as ProcessThread.StartAddress seems to be the address of the RtlUserThreadStart function, not the address of the start function specific to the application. MSDN says about it:
Every Windows thread actually begins execution in a system-supplied function, not the application-supplied function. The starting address for the primary thread is, therefore, the same (as it represents the address of the system-supplied function) for every Windows process in the system. However, the StartAddress property allows you to get the starting function address that is specific to your application.
But it doesn't say how to get the starting function address specific to the application using ProcessThread.StartAddress.
Question:
My problem boils down to getting the start address of a Win32 thread from another application (written in C#); once I have it, I can get the name as well, using the above-mentioned APIs. So how do I get the start address?
I tested my symbol lookup API from C++ code. It works fine to resolve the address to a symbol, if given the correct address to start with.
Here is my p/invoke declarations:
[DllImport("UnmanagedSymbols.dll", SetLastError = true, CallingConvention= CallingConvention.Cdecl)]
static extern bool initialize_handler(IntPtr hModule);
[DllImport("UnmanagedSymbols.dll", SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
static extern bool get_function_symbol(IntPtr hModule, IntPtr address, StringBuilder name);
The key is to call the NtQueryInformationThread function. This is not a completely "official" function (possibly undocumented in the past?), but the documentation suggests no alternative for getting the start address of a thread.
I've wrapped it up into a .NET-friendly call that takes a thread ID and returns the start address as IntPtr. This code has been tested in x86 and x64 mode, and in the latter it was tested on both a 32-bit and a 64-bit target process.
One thing I did not test was running this with low privileges; I would expect that this code requires the caller to have the SeDebugPrivilege.
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
class Program
{
static void Main(string[] args)
{
PrintProcessThreads(Process.GetCurrentProcess().Id);
PrintProcessThreads(4156); // some other random process on my system
Console.WriteLine("Press Enter to exit.");
Console.ReadLine();
}
static void PrintProcessThreads(int processId)
{
Console.WriteLine(string.Format("Process Id: {0:X4}", processId));
var threads = Process.GetProcessById(processId).Threads.OfType<ProcessThread>();
foreach (var pt in threads)
Console.WriteLine(" Thread Id: {0:X4}, Start Address: {1:X16}",
pt.Id, (ulong) GetThreadStartAddress(pt.Id));
}
static IntPtr GetThreadStartAddress(int threadId)
{
var hThread = OpenThread(ThreadAccess.QueryInformation, false, threadId);
if (hThread == IntPtr.Zero)
throw new Win32Exception();
var buf = Marshal.AllocHGlobal(IntPtr.Size);
try
{
var result = NtQueryInformationThread(hThread,
ThreadInfoClass.ThreadQuerySetWin32StartAddress,
buf, IntPtr.Size, IntPtr.Zero);
if (result != 0)
throw new Win32Exception(string.Format("NtQueryInformationThread failed; NTSTATUS = {0:X8}", result));
return Marshal.ReadIntPtr(buf);
}
finally
{
CloseHandle(hThread);
Marshal.FreeHGlobal(buf);
}
}
[DllImport("ntdll.dll", SetLastError = true)]
static extern int NtQueryInformationThread(
IntPtr threadHandle,
ThreadInfoClass threadInformationClass,
IntPtr threadInformation,
int threadInformationLength,
IntPtr returnLengthPtr);
[DllImport("kernel32.dll", SetLastError = true)]
static extern IntPtr OpenThread(ThreadAccess dwDesiredAccess, bool bInheritHandle, int dwThreadId);
[DllImport("kernel32.dll", SetLastError = true)]
static extern bool CloseHandle(IntPtr hObject);
[Flags]
public enum ThreadAccess : int
{
Terminate = 0x0001,
SuspendResume = 0x0002,
GetContext = 0x0008,
SetContext = 0x0010,
SetInformation = 0x0020,
QueryInformation = 0x0040,
SetThreadToken = 0x0080,
Impersonate = 0x0100,
DirectImpersonation = 0x0200
}
public enum ThreadInfoClass : int
{
ThreadQuerySetWin32StartAddress = 9
}
}
Output on my system:
Process Id: 2168 (this is a 64-bit process)
Thread Id: 1C80, Start Address: 0000000001090000
Thread Id: 210C, Start Address: 000007FEEE8806D4
Thread Id: 24BC, Start Address: 000007FEEE80A74C
Thread Id: 12F4, Start Address: 0000000076D2AEC0
Process Id: 103C (this is a 32-bit process)
Thread Id: 2510, Start Address: 0000000000FEA253
Thread Id: 0A0C, Start Address: 0000000076F341F3
Thread Id: 2438, Start Address: 0000000076F36679
Thread Id: 2514, Start Address: 0000000000F96CFD
Thread Id: 2694, Start Address: 00000000025CCCE6
This matches the actual output apart from the notes in parentheses, since producing those requires extra P/Invokes.
Regarding the SymFromAddr "module not found" error, I just wanted to mention that one needs to call SymInitialize with fInvadeProcess = true OR load the module manually, as documented on MSDN.
I know you say this isn't the case in your situation, but I'll leave this in for the benefit of anyone who finds this question via those keywords.
Here's what my understanding of the problem is.
You have a C# app, APP1 that creates a bunch of threads.
Those threads, in turn, each create a process. I am assuming those threads stay alive and are each in charge of monitoring the process they spawned.
So for each thread in APP1, you want it to enumerate information on the threads spawned in the child process of that thread.
The way I would have done this back in the good old days would be:
Code all my Win32 thread monitoring of a given Win32 process into a DLL
Inject that DLL into the process I wanted to monitor
Use a named pipe or other RPC mechanism to communicate from the injected Win32 process to the host APP1
So in your main threadproc in C#, you would create and monitor a named pipe for your process to communicate once it has been injected.
In C++ land, the pseudo code would be to then create a suspended process, allocate some memory in that process, inject your DLL into the process, then create a remote thread that would execute your injected dll:
char * dllName = "your cool dll with thread monitoring stuff.dll"
// Create a suspended process
CreateProcess("your Win32 process.exe", ...CREATE_SUSPENDED..., pi)
// Allocate memory in the process to hold your DLL name to load
lpAlloc = VirtualAllocEx(pi.hProcess, ... MEM_COMMIT, PAGE_READWRITE)
// Write the name of your dll to load in the process memory
WriteProcessMemory(pi.hProcess, lpAlloc, dllName, ...)
// Get the address of LoadLibrary
fnLoadLibrary = GetProcAddress(GetModuleHandle("kernel32.dll"), "LoadLibraryA")
// Create a remote thread in the process, giving it the threadproc for LoadLibrary
// and the argument of your DLL name
hThread = CreateRemoteThread(pi.hProcess, ..., fnLoadLibrary, lpAlloc, ...)
// Wait for your dll to load
WaitForSingleObject(hThread)
// Go ahead and start the Win32 process
ResumeThread(pi.hThread)
In your DLL, you could put code into DLL_PROCESS_ATTACH that would connect to the named pipe you set up, and initialize all your stuff. Then fire a function to begin monitoring and reporting on the named pipe.
Your C# threadproc would monitor the named pipe for its process, and report it on up to APP1.
UPDATE:
I missed the fact that you control the code for the Win32 process. In that case, I would just pass an argument to the process that would control the RPC mechanism of your choice for communication (shared memory, named pipes, queue service, clipboard (ha), etc.).
That way, your C# threadproc sets up the RPC communication channel and monitoring, and then provides the "address" information to your Win32 process so it can "dial you back".
I'll leave the other stuff up there in case it is useful to anyone else wanting to monitor a Win32 process where they are not in charge of the code.
Well, this is definitely not the straightforward approach, but maybe it will help you somehow. You should be able to get the stack trace of another thread in the way used by this project (StackWalk64) and eventually see the name of the desired function. It has its own problems; in particular, the performance of this approach probably won't be great, but as I understand it this is a one-shot operation per thread. The question is whether it will generally be able to walk the stack of your (probably optimized) applications properly.
First, you can't really do this reliably: if you happen to access Thread.StartAddress before the thread executes the function pointer or after the function returns, you will have no way to know what the starting function actually is.
Secondly, the more likely answer is that there isn't a direct mapping to the starting function when the thread starting function is managed.