A question about making a C# class persistent during a file load - c#

Apologies for the nondescript title; it's the best I could think of at the moment.
Basically, I've written a singleton class that loads files into a database. These files are typically large and take hours to process. What I'm looking for is a way to have this class running, and to be able to call methods on it, even if its calling class is shut down.
The singleton class is simple. It starts a thread that loads the file into the database, while exposing methods to report on the current status. In a nutshell it's a little like this:
public sealed class BulkFileLoader {
    static BulkFileLoader instance = null;
    int currentCount = 0;

    BulkFileLoader() { }

    public static BulkFileLoader Instance
    {
        get
        {
            // Instantiate the instance if necessary, and return it
        }
    }

    public void Go() {
        // kick off 'ProcessFile' thread
    }

    public int GetCurrentCount() {
        return currentCount;
    }

    private void ProcessFile() {
        while (/* more rows in the import file */) {
            // insert the row into the database
            currentCount++;
        }
    }
}
The idea is that you can get an instance of BulkFileLoader and execute it, which will process a file to load, while at any time you can get real-time updates on the number of rows it's processed so far using the GetCurrentCount() method.
This works fine, except the calling class needs to stay open the whole time for the processing to continue. As soon as I stop the calling class, the BulkFileLoader instance is removed, and it stops processing the file. What I am after is a solution where it will continue to run independently, regardless of what happens to the calling class.
I then tried another approach. I created a simple console application that kicks off the BulkFileLoader, and then wrapped it in a process. This fixes one problem: now when I kick off the process, the file continues to load even if I close the class that called the process. However, now the problem is that I cannot get updates on the current count; if I try to get the instance of BulkFileLoader (which, as mentioned before, is a singleton), it creates a new instance rather than returning the one in the executing process. It would appear that singletons don't extend into the scope of other processes running on the machine.
In the end, I want to be able to kick off the BulkFileLoader and, at any time, find out how many rows it's processed, even if I close the application I used to start it.
Can anyone see a solution to my problem?

You could create a Windows Service which exposes, say, a WCF endpoint as its API. Through this API you'll be able to query the service's status and add more files for processing.

You should make your "Bulk Uploader" a service, and have your other processes speak to it via IPC.
You need a service because your upload takes hours. It sounds like you'd like it to run unattended if necessary, and you'd like it to be detached from the calling thread. That's what services do well.
You need some form of Inter-Process Communication because you'd like to send information between processes.
For communicating with your service, see NetNamedPipeBinding.
You can then send "Job Start" and "Job Status" commands and queries to your background service whenever you like.
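The idea can be sketched at a lower level with System.IO.Pipes (NetNamedPipeBinding layers a typed WCF contract over the same named-pipe transport). The pipe name "BulkFileLoaderStatus" and the hard-coded count below are invented for illustration; in the real service the server side would live inside the long-running loader process and report GetCurrentCount():

```csharp
// Minimal sketch of an IPC status channel, using raw System.IO.Pipes for
// brevity. The server side belongs in the background service process; the
// client side is any short-lived UI or console app querying its status.
using System;
using System.IO;
using System.IO.Pipes;
using System.Threading.Tasks;

public static class PipeStatusDemo
{
    public static async Task<string> QueryCount()
    {
        // Server side: owned by the long-running service process.
        var server = Task.Run(async () =>
        {
            using var pipe = new NamedPipeServerStream("BulkFileLoaderStatus");
            await pipe.WaitForConnectionAsync();
            using var writer = new StreamWriter(pipe) { AutoFlush = true };
            // Stand-in for BulkFileLoader.Instance.GetCurrentCount()
            await writer.WriteLineAsync("42");
        });

        // Client side: connects, reads one status line, disconnects.
        using var client = new NamedPipeClientStream(".", "BulkFileLoaderStatus", PipeDirection.InOut);
        await client.ConnectAsync(5000);
        using var reader = new StreamReader(client);
        string count = await reader.ReadLineAsync();
        await server;
        return count;
    }

    public static void Main() =>
        Console.WriteLine($"Rows processed so far: {QueryCount().GetAwaiter().GetResult()}");
}
```

The same shape scales to the WCF version: the service hosts the endpoint, and any number of short-lived clients can connect, query, and exit without affecting the load.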

Related

How to distribute work to a pool of computers

I have some data that needs to be processed. The data is a tree. The processing goes like this: Take a node N. Check if all of its children have already been processed. If not, process them first. If yes, process N. So we go from top to bottom (recursively) to the leaves, then process leaves, then the leaves' parent nodes and so on, upwards until we arrive at the root again.
I know how to write a program that runs on ONE computer that takes the data (i.e. the root node) and processes it as described above. Here is a sketch in C#:
// We assume data is already there, so I do not provide constructor/setters.
public class Data
{
    public object OwnData { get; }
    public IList<Data> Children { get; }
}

// The main class. We just need to call Process once and wait for it to finish.
public class DataManager
{
    internal ISet<Data> ProcessedData { get; init; }

    public DataManager()
    {
        ProcessedData = new HashSet<Data>();
    }

    public void Process(Data rootData)
    {
        new DataHandler(this).Process(rootData);
    }
}

// The handler class that processes data recursively by spawning new instances.
// It informs the manager about data processed.
internal class DataHandler
{
    private readonly DataManager Manager;

    internal DataHandler(DataManager manager)
    {
        Manager = manager;
    }

    internal void Process(Data data)
    {
        if (Manager.ProcessedData.Contains(data))
            return;
        foreach (var subData in data.Children)
            new DataHandler(Manager).Process(subData);
        // ... do some processing of OwnData
        Manager.ProcessedData.Add(data);
    }
}
But how can I write the program so that I can distribute the work to a pool of computers (that are all in the same network, either some local one or the internet)? What do I need to do for that?
Some thoughts/ideas:
The DataManager should run on one computer (the main one / the server?); the DataHandlers should run on all the others (the clients?).
The DataManager needs to know the computers by some id (what id would that be?) which are set during construction of DataManager.
The DataManager must be able to create new instances of DataHandler (or kill them if something goes wrong) on these computers. How?
The DataManager must know which computers currently have a running instance of DataHandler and which not, so that it can decide on which computer it can spawn the next DataHandler (or, if none is free, wait).
These are not requirements! I do not know if these ideas are viable.
In the above thoughts I assumed that each computer can just have one instance of DataHandler. I know this is not necessarily so (because CPU cores and threads...), but in my use case it might actually be that way: The real DataManager and DataHandler are not standalone but run in a SolidWorks context. So in order to run any of that code, I need to have a running SolidWorks instance. From my experience, more than one SolidWorks instance on the same Windows does not work (reliably).
From my half-knowledge it looks like what I need is a kind of multi-computer-OS: In a single-computer-setting, the points 2, 3 and 4 are usually taken care of by the OS. And point 1 kind of is the OS (the OS=DataManager spawns processes=DataHandlers; the OS keeps track of data=ProcessedData and the processes report back).
What exactly do I want to know?
Hints to words, phrases or introductory articles that allow me to dive into the topic (in order to become able to implement this). Possibly language-agnostic.
Hints to C# libraries/frameworks that are fit for this situation.
Tips on what I should or shouldn't do (typical beginners issues). Possibly language-agnostic.
Links to example/demonstration C# projects, e.g. on GitHub. (If not C#, VB is also alright.)
You should read up on microservices and queues, like RabbitMQ.
The producer/consumer approach.
https://www.rabbitmq.com/getstarted.html
If you integrate your microservices with Docker, you can do some pretty nifty stuff.
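Before reaching for RabbitMQ, the producer/consumer shape it implements can be tried in-process. A sketch (all names invented here): the DataManager side plays the producer, a pool of worker tasks plays the consumers; with RabbitMQ, each consumer would become a separate process or machine reading the same queue:

```csharp
// In-process sketch of the producer/consumer pattern that a message
// queue like RabbitMQ generalises across machines. BlockingCollection
// stands in for the broker's queue.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    public static int Run()
    {
        var queue = new BlockingCollection<int>(boundedCapacity: 100);
        var processed = new ConcurrentBag<int>();

        // Consumers: in the distributed version, one per computer.
        var workers = new Task[4];
        for (int i = 0; i < workers.Length; i++)
            workers[i] = Task.Run(() =>
            {
                foreach (int node in queue.GetConsumingEnumerable())
                    processed.Add(node); // stand-in for "process node N"
            });

        // Producer: in the real tree problem, a node would only be
        // enqueued once all of its children are done; here we just
        // enqueue twenty independent work items.
        for (int node = 0; node < 20; node++)
            queue.Add(node);
        queue.CompleteAdding();

        Task.WaitAll(workers);
        return processed.Count;
    }

    public static void Main() =>
        Console.WriteLine($"Processed {Run()} nodes");
}
```

The dependency ordering (children before parents) maps naturally onto this: the producer tracks completion acknowledgements and only publishes a node when its children have been acknowledged.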

How to force stop a different Android service from my app?

First of all, let me say that I know it is bad practice, not good, probably not allowed (technically), etc., to force stop another service from my app.
However, there ARE some use cases that warrant this need. In my case, for example, there is a 3rd party service that is installed by my app because of my reference to it ("it" is a barcode scanning SDK). The SDK states that I must make a call to something called
GetScannerService();
I have observed that this call will either start the service or grab the instance to it if it is already running.
Furthermore, there are some calls that have to be done during onStop and onDestroy of my app which effectively will stop this third party service.
All that said, I have seen cases where this service gets stuck in a weird state. I have no control over the code (and bugs) in this package. Yes, I have reached out to them, but I have been unsuccessful so far in getting them to fix the root cause. When it is stuck in this state, I can see it in the list of running services (and sometimes it is listed in the cached ones), but when my app calls GetScannerService, it throws an exception that basically states the service cannot be started...but it already is.
So, when this happens, if I manually go to the running services list, find it (again, sometimes it is in cached) and click force stop, this fixes things and my app works as expected again...until it happens again, that is.
So, I want and need to have my app control this service. The thought is on startup, when I do the first call to GetScannerService, if it returns the exception, I will basically force stop it so that I can then call again and have it started. In other words, I want to automate the force stop function.
I know technically this is not allowed but I have also read that there are ways to do it, even if you don't have root.
So far, I can get the list of all running services, and I can see the service in question in my list, which means I also have access to a lot of information about it. But what I have tried does not work: I have tried KillBackgroundProcesses, but the service is still in the list.
Here is what I have tried so far:
private void Button_Click(object sender, EventArgs e)
{
    // serviceListLimit, emdkServiceName, emdkClass and services are
    // fields defined elsewhere in this activity.
    var am = (ActivityManager)this.Application.ApplicationContext.GetSystemService(Context.ActivityService);
    var taskList = am.GetRunningServices(serviceListLimit);
    List<string> serviceNames = new List<string>();
    foreach (var t in taskList)
    {
        serviceNames.Add(t.Service.PackageName);
    }
    var adapter = new ArrayAdapter<string>(this, Android.Resource.Layout.SimpleListItem1, serviceNames);
    services.Adapter = adapter;
    if (serviceNames.Contains(emdkServiceName))
        testKillService(am, emdkClass);
}

private void testKillService(ActivityManager am, Class emdkClass)
{
    am.KillBackgroundProcesses(emdkServiceName);
}
So, I can list them and see it in the list as well as grab details about the item in the list. Anybody know how I can force stop it?
You can use the Process class:
void killProcess (int pid)
You can get the "pid" from the ActivityManager.RunningAppProcessInfo.

Can the same application be both publisher/subscriber with NServiceBus?

I am new at messaging architectures, so I might be going at this the wrong way. But I wanted to introduce NServiceBus slowly in my team by solving a tiny problem.
Appointments in agendas have states. Two users could be looking at the same appointment in the same agenda, in the same application. They start this application via a remote session on a central server. So if user 1 updates the state of the appointment, I'd like user 2 to see the new state in 'real time'.
To simulate this or make a proof of concept if you will, I made a new console application. Via NuGet I got both NServiceBus and NServiceBus.Host, because as I understood from the documentation I need both. And I know in production code it is not recommended to put everything in the same assembly, but the publisher and subscriber will most likely end up in the same assembly though...
In class Program method Main I wrote the following code:
BusConfiguration configuration = new BusConfiguration();
configuration.UsePersistence<InMemoryPersistence>();
configuration.UseSerialization<XmlSerializer>();
configuration.UseTransport<MsmqTransport>();
configuration.TimeToWaitBeforeTriggeringCriticalErrorOnTimeoutOutages(new TimeSpan(1, 0, 0));

ConventionsBuilder conventions = configuration.Conventions();
conventions.DefiningEventsAs(t => t.Namespace != null
                               && t.Namespace.Contains("Events"));

using (IStartableBus bus = Bus.Create(configuration))
{
    bus.Start();
    Console.WriteLine("Press key");
    Console.ReadKey();
    bus.Publish<Events.AppointmentStateChanged>(a =>
    {
        a.AppointmentID = 1;
        a.NewState = "New state";
    });
    Console.WriteLine("Event published.");
    Console.ReadKey();
}
In class EndPointConfig method Customize I added:
configuration.UsePersistence<InMemoryPersistence>();
configuration.UseSerialization<XmlSerializer>();
configuration.UseTransport<MsmqTransport>();
ConventionsBuilder conventions = configuration.Conventions();
conventions.DefiningEventsAs(t => t.Namespace != null
&& t.Namespace.Contains("Events"));
AppointmentStateChanged is a simple class in the Events folder like so:
public class AppointmentStateChanged : IEvent
{
    public int AppointmentID { get; set; }
    public string NewState { get; set; }
}
AppointmentStateChangedHandler is the event handler:
public class AppointmentStateChangedHandler : IHandleMessages<Events.AppointmentStateChanged>
{
    public void Handle(Events.AppointmentStateChanged message)
    {
        Console.WriteLine("AppointmentID: {0}, changed to state: {1}",
            message.AppointmentID,
            message.NewState);
    }
}
If I start up one console app, everything works fine; I see the handler handle the event. But if I try to start up a second console app, it crashes with System.Messaging.MessageQueueException (Timeout for the requested operation has expired). So I must be doing something wrong, which makes me second-guess whether I'm misunderstanding something at a higher level. Could anyone point me in the right direction, please?
Update
Everything is in the namespace AgendaUpdates, except for the event class, which is in the AgendaUpdates.Events namespace.
Update 2
Steps taken:
Copied AgendaUpdates solution (to AgendaUpdates2 folder)
In the copy I changed the Endpoint attribute of the MessageEndpointMappings in App.config to "AgendaUpdates2"
I got MSMQ exception: "the queue does not exist or you do not have sufficient permissions to perform the operation"
In the copy I added this line of code to EndPointConfig: configuration.EndpointName("AgendaUpdates2");
I got MSMQ exception: "the queue does not exist or you do not have sufficient permissions to perform the operation"
In the copy I added this line of code to the Main method in the Program class:
configuration.EndpointName("AgendaUpdates2");
Got original exception again after pressing key.
--> I tested it by starting two Visual Studio instances with the original and the copied solution, and then starting both console apps in the IDE.
I'm not exactly sure why you are getting that specific exception, but I can explain why what you are trying to do fails. The problem is not having publisher and subscriber in the same application (this is possible and can be useful); the problem is that you are running two instances of the same application on the same machine.
NServiceBus relies on queuing technology (MSMQ in your case), and for everything to work properly each application needs to have its own unique queue. When you fire up two identical instances, both are trying to share the same queue.
There are a few things you can tinker with to get your scenario to work and to better understand how the queuing works:
Change the EndPointName of your second instance
Run the second instance on a separate machine
Separate the publisher and subscriber into separate processes
Regardless of which way you go, you will need to adjust your MessageEndpointMappings (on the consumer/subscriber) to reflect where the host/publisher queue lives (the "owner" of the message type):
http://docs.particular.net/nservicebus/messaging/message-owner#configuring-endpoint-mapping
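As a sketch (NServiceBus 5-era XML configuration; the assembly, namespace and endpoint names are taken from this question), the subscriber's App.config mapping would point at the publisher's queue as the "owner" of the event types:

```xml
<UnicastBusConfig>
  <MessageEndpointMappings>
    <!-- The owner of the AgendaUpdates.Events types is the original
         AgendaUpdates endpoint, so the subscriber maps to its queue -->
    <add Assembly="AgendaUpdates"
         Namespace="AgendaUpdates.Events"
         Endpoint="AgendaUpdates" />
  </MessageEndpointMappings>
</UnicastBusConfig>
```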
Edit based on your updates
I know this is a test setup/proof of concept, but it's still useful to think of these two deployments (of the same code) as publisher/host and subscriber/client. So let's call the original the host and the copy the client. I assume you don't want to have each subscribe to the other (at least for this basic test).
Also, make sure you are running both IDEs as Administrator on your machine. I'm not sure if this is interfering or not.
In the copy I changed the Endpoint attribute of the MessageEndpointMappings in App.config to "AgendaUpdates2". I got the MSMQ exception: "the queue does not exist or you do not have sufficient permissions to perform the operation"
Since the copy is the client, you want to point its mapping to the host. So this should be "AgendaUpdates" (omit the "2").
In the copy I added this line of code to EndPointConfig: configuration.EndpointName("AgendaUpdates2"); I got MSMQ exception: "the queue does not exist or you do not have sufficient permissions to perform the operation"
In the copy I added this line of code to the Main method in the Program class: configuration.EndpointName("AgendaUpdates2"); I got the original exception again after pressing a key
I did not originally notice this, but you don't need to configure the endpoint twice. I believe your EndPointConfig is not getting called, as it is only used when hosting via the NSB host executable. You can likely just delete this class.
This otherwise sounds reasonable, but remember that your copy should not be publishing if it's the subscriber, so don't press any keys after it starts (only press keys in the original).
If you want the publisher to also be the receiver of the message, you need to specify this in configuration.
This is clearly explained in this article; the solution to your problem is at the very end of it.

Topshelf - handling loops

Generally with services, the task you want to complete is repeated, maybe in a loop or maybe a trigger or maybe something else.
I'm using Topshelf to complete a repeated task for me, specifically I'm using the Shelf'ing functionality.
The problem I'm having is how to handle the looping of the task.
When bootstrapping the service in Topshelf, you pass it a class (in this case ScheduledQueueService) and indicate which are its Start and Stop methods:
Example:
public class QueueBootstrapper : Bootstrapper<ScheduledQueueService>
{
    public void InitializeHostedService(IServiceConfigurator<ScheduledQueueService> cfg)
    {
        cfg.HowToBuildService(n => new ScheduledQueueService());
        cfg.SetServiceName("ScheduledQueueHandler");
        cfg.WhenStarted(s => s.StartService());
        cfg.WhenStopped(s => s.StopService());
    }
}
But in my StartService() method I am using a while loop to repeat the task I'm running. When I attempt to stop the service through Windows services, it fails to stop, and I suspect it's because the StartService() method never returned when it was originally called.
Example:
public class ScheduledQueueService
{
    bool QueueRunning;

    public ScheduledQueueService()
    {
        QueueRunning = false;
    }

    public void StartService()
    {
        QueueRunning = true;
        while (QueueRunning)
        {
            // do some work
        }
    }

    public void StopService()
    {
        QueueRunning = false;
    }
}
What is a better way of doing this?
I've considered using the .NET System.Threading.Tasks to run the work in and then maybe closing the thread on StopService()
Maybe using Quartz to repeat the task and then remove it.
Thoughts?
Generally, how I would handle this is have a Timer event, that fires off a few moments after StartService() is called. At the end of the event, I would check for a stop flag (set in StopService()), if the flag (e.g. your QueueRunning) isn't there, then I would register a single event on the Timer to happen again in a few moments.
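A sketch of that single-shot approach (using System.Threading.Timer here rather than a Timer event, for brevity; the class and method names mirror the question's): do one unit of work per tick, then re-arm the timer only if no stop has been requested:

```csharp
// Sketch of the "single-shot timer" approach: StartService arms one
// tick and returns immediately (so the service manager sees a fast
// start); each tick does a unit of work and re-arms itself unless a
// stop was requested.
using System;
using System.Threading;

public class ScheduledQueueService
{
    private readonly Timer timer;
    private volatile bool stopRequested;
    public int WorkDone; // exposed only so the demo can observe progress

    public ScheduledQueueService()
    {
        // Created disarmed; StartService schedules the first tick.
        timer = new Timer(_ => Tick(), null, Timeout.Infinite, Timeout.Infinite);
    }

    public void StartService()
    {
        stopRequested = false;
        timer.Change(dueTime: 10, period: Timeout.Infinite); // fire once
    }

    public void StopService() => stopRequested = true;

    private void Tick()
    {
        WorkDone++; // stand-in for "do some work"
        if (!stopRequested)
            timer.Change(10, Timeout.Infinite); // re-arm a single shot
    }
}
```

Because each tick is scheduled as a one-off, there is never more than one unit of work in flight, and StopService simply lets the chain of ticks die out.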
We do something pretty similar in Topshelf itself, when polling the file system: https://github.com/Topshelf/Topshelf/blob/v2_master/src/Topshelf/FileSystem/PollingFileSystemEventProducer.cs#L80
Now that uses the internal scheduler type instead of a Timer object, but generally it's the same thing. The fiber is basically which thread to process the event on.
If you have future questions, you are also welcomed to join the Topshelf mailing list. We try to be pretty responsive on there. http://groups.google.com/group/topshelf-discuss
I was working on some similar code today when I stumbled on https://stackoverflow.com/a/2033431/981 by accident, and it's been working like a charm for me.
I don't know about Topshelf specifically but when writing a standard windows service you want the start and stop events to complete as quickly as possible. If the start thread takes too long windows assumes that it has failed to start up, for example.
To get around this I generally use a System.Timers.Timer. This is set to call a startup method just once with a very short interval (so it runs almost immediately). This then does the bulk of the work.
In your case this could be your method that is looping. Then at the start of each loop check a global shutdown variable - if its true you quit the loop and then the program can stop.
You may need a bit more (or maybe even less) complexity than this depending on where exactly the error is but the general principle should be fine I hope.
Once again, though, I will disclaim that this knowledge is not based on Topshelf, just general service development.

Lock used in Cache Item callback and other method doesn't seem to lock

Simplest explanation I can produce:
In my .NET1.1 web app I create a file on disc, in the Render method, and add an item to the Cache to expire within, say, a minute. I also have a callback method, to be called when the cache item expires, which deletes the file created by Render. In the Page_Init method I try to access the file which the Render method wrote to disc. Both these methods have a lock statement, locking a private static Object.
Intention:
To create a page which essentially writes a copy of itself to disc, which gets deleted before it gets too old (or out of date, content-wise), while serving the file if it exists on disc.
Problem observed:
This is really two issues, I think. Requesting the page does what I expect, it renders the page to disc and serves it immediately, while adding the expiry item to the cache. For testing the expiry time is 1 minute.
I then expect that the callback method will get called after 60 seconds and delete the file. It doesn't.
After another minute (for the sake of argument) I refresh the page in the browser. Then I can see the callback method get called and place a lock on the lock object. The Page_Init also gets called and places a lock on the same object. However, both methods appear to enter their lock code block and proceed with execution.
This results in: Render checks the file is there, the callback method deletes the file, and the Render method tries to serve the now-deleted file.
Horribly simplified code extract:
public class MyPage : Page
{
    private static Object lockObject = new Object();

    protected void Page_Init(...)
    {
        if (File.Exists(...))
        {
            lock (lockObject)
            {
                if (File.Exists(...))
                {
                    Server.Transfer(...);
                }
            }
        }
    }

    protected override void Render(...)
    {
        if (!File.Exists(...))
        {
            // write file out and serve initial copy from memory
            Cache.Add(..., new CacheItemRemovedCallback(DoCacheItemRemovedCallback));
        }
    }

    private static void DoCacheItemRemovedCallback(...)
    {
        lock (lockObject)
        {
            if (File.Exists(...))
                File.Delete(...);
        }
    }
}
Can anyone explain this, please? I understand that the callback is, essentially, lazy and therefore only fires once I make a request, but surely the threading in .NET 1.1 is good enough not to let two threads enter the same lock() block simultaneously?
Thanks,
Matt.
Not sure why your solution doesn't work, but that might be a good thing, considering the consequences...
I would suggest a completely different route. Separate the process of managing the file from the process of requesting the file.
Requests should just go to the cache, get the full path of the file, and send it to the client.
Another process (not bound to requests) is responsible for creating and updating the file. It simply creates the file on first use/access and stores the full path in the cache (set to never expire). At regular/appropriate intervals, it re-creates the file with a different, random name, sets this new path in the cache, and then deletes the old file (being careful that it isn't locked by another request).
You can spawn this file managing process on application startup using a thread or the ThreadPool. Linking your file management and requests will always cause you problems as your process will be run concurrently, requiring you to do some thread synchronization which is always best to avoid.
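A stand-alone sketch of that split (ConcurrentDictionary standing in for the ASP.NET Cache; the key, paths and file contents are invented for illustration): requests only read the current path, while a single worker owns creating, rotating and deleting the file:

```csharp
// Sketch of separating file management from file serving: the worker
// writes a new copy under a fresh random name, atomically publishes
// the new path in the "cache", and only then deletes the old copy, so
// a concurrent reader always sees a path to a complete file.
using System;
using System.Collections.Concurrent;
using System.IO;

public static class FileManagerDemo
{
    public static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    public static void RegenerateFile()
    {
        // Write the new copy under a fresh random name first...
        string newPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        File.WriteAllText(newPath, $"rendered at {DateTime.UtcNow:O}");

        // ...then publish it and delete the old copy.
        Cache.TryGetValue("page", out string oldPath);
        Cache["page"] = newPath;
        if (oldPath != null && File.Exists(oldPath))
            File.Delete(oldPath);
    }

    public static void Main()
    {
        RegenerateFile();                 // first use
        string first = Cache["page"];
        RegenerateFile();                 // simulated scheduled refresh
        string second = Cache["page"];

        Console.WriteLine(File.Exists(second)); // current copy is served
        Console.WriteLine(File.Exists(first));  // old copy has been cleaned up
        File.Delete(second);
    }
}
```

In the real application, RegenerateFile would run on a background thread at your chosen interval, and the request path would only ever touch the cache lookup.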
First thing I would do is open the Threads window and observe which thread is the Page_Init is running on and which thread the Call Back is running on. The only way I know that two methods can place a lock on the same object is if they are running in the same thread.
Edit
The real issue here is how Server.Transfer actually works. Server.Transfer simply configures some ASP.NET internal details indicating that the request is about to be transfer to a different URL on the server. It then calls Response.End which in turn throws a ThreadAbortException. No actual data has been read or sent to the client at that time.
Now when the exception occurs code execution leaves the block of code protect by the lock. At this time the Call back function can acquire the lock and delete the file.
Now somewhere deep inside ASP.NET the ThreadAbortException is handled in some way and the request for the new URL is processed. At this time it finds the file has gone missing.
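The key mechanism here, that a lock block releases its monitor when an exception (including the ThreadAbortException from Response.End) unwinds through it, can be seen in a stand-alone program with any exception type:

```csharp
// Stand-alone illustration of why the callback could acquire the lock
// mid-request: a lock block releases its monitor as soon as an
// exception leaves the block, so another thread waiting on the same
// object proceeds immediately. An InvalidOperationException stands in
// for the ThreadAbortException thrown by Response.End.
using System;
using System.Threading;

public static class LockUnwindDemo
{
    static readonly object Gate = new object();

    public static bool SecondThreadGetsLock()
    {
        var secondThreadGotLock = new ManualResetEventSlim();

        var t = new Thread(() =>
        {
            lock (Gate) { secondThreadGotLock.Set(); }
        });

        try
        {
            lock (Gate)
            {
                t.Start(); // the second thread now blocks on Gate
                throw new InvalidOperationException("simulated Response.End");
            }
        }
        catch (InvalidOperationException)
        {
            // The monitor was already released when the exception left
            // the lock block, so the second thread is not stuck.
        }

        bool acquired = secondThreadGotLock.Wait(2000);
        t.Join();
        return acquired;
    }

    public static void Main() =>
        Console.WriteLine(SecondThreadGetsLock());
}
```

So the two lock blocks never run concurrently; the first one simply exits early via the exception, after which the callback's lock succeeds.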
