RDotNet vs R scripting - c#

When is it an advantage/disadvantage to be using RDotNet for making statistical calculations as opposed to generating an R script text file and running it from the application using e.g. Process.Start? Or is there some other better way?
I need to execute a large number of commands and have a feeling that sending them one by one to R takes a lot of time.

I'd say the following two scenarios are typical:
The .NET code and the R code are quite separate, and little interaction is needed between them. For example, the .NET code gathers some information, launches a processing script on it, and then picks up the results. In this case, spawning an R process (Process.Start) is a simple way to get this working.
A lot of interaction is needed between the .NET code and the R code, and the workflow often goes back and forth between the two. In this case, a more heavyweight, flexible solution such as RDotNet makes a lot of sense. RDotNet allows easier integration of the .NET code and the R code, at the price of often being harder to learn, harder to debug, and in need of updates for new versions of R.

R.NET can currently be initialized only once per process, so parallel execution will be problematic.
I would suggest using Rscript.
Our solution is based on this Stack Overflow answer: Call R (programming language) from .net
We made one minor change: we receive the R code as a string and save it to a temp file, since users run custom R code when needed.
public static void RunFromCmd(string batch, params string[] args)
{
// Not required, but our R scripts use almost all CPU resources if multiple instances run at once
lock (typeof(REngineRunner))
{
string file = string.Empty;
string result = string.Empty;
try
{
// Save R code to temp file
file = TempFileHelper.CreateTmpFile();
using (var streamWriter = new StreamWriter(new FileStream(file, FileMode.Open, FileAccess.Write)))
{
streamWriter.Write(batch);
}
// Get path to R
var rCore = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\R-core") ??
Registry.CurrentUser.OpenSubKey(@"SOFTWARE\R-core");
var is64Bit = Environment.Is64BitProcess;
if (rCore != null)
{
var r = rCore.OpenSubKey(is64Bit ? "R64" : "R");
var installPath = (string)r.GetValue("InstallPath");
var binPath = Path.Combine(installPath, "bin");
binPath = Path.Combine(binPath, is64Bit ? "x64" : "i386");
binPath = Path.Combine(binPath, "Rscript");
string strCmdLine = @"/c """ + binPath + @""" " + file;
if (args.Any())
{
strCmdLine += " " + string.Join(" ", args);
}
var info = new ProcessStartInfo("cmd", strCmdLine);
info.RedirectStandardInput = false;
info.RedirectStandardOutput = true;
info.UseShellExecute = false;
info.CreateNoWindow = true;
using (var proc = new Process())
{
proc.StartInfo = info;
proc.Start();
result = proc.StandardOutput.ReadToEnd();
}
}
else
{
result += "R-Core not found in registry";
}
Console.WriteLine(result);
}
catch (Exception ex)
{
throw new Exception("R failed to compute. Output: " + result, ex);
}
finally
{
if (!string.IsNullOrWhiteSpace(file))
{
TempFileHelper.DeleteTmpFile(file, false);
}
}
}
}
Full blog post: http://kostylizm.blogspot.ru/2014/05/run-r-code-from-c-sharp.html

With Process.Start you start a new R session every time. This can take a while, especially if your script loads several packages.
If you use R.NET, you can create one R instance and keep talking to it. So if you have built a web service that connects R with ASP, you don't want to start R on every request, as that is very costly. You start it just once and then work with it interactively.
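To illustrate the "start once, keep talking" approach, here is a minimal sketch. It assumes the R.NET NuGet package and a local R installation; the RSession class and its EvaluateNumber helper are hypothetical names for illustration, not part of R.NET.

```csharp
using System;
using RDotNet; // R.NET package; assumes R is installed on the machine

public static class RSession
{
    // The engine is created lazily and only once, because R.NET
    // can initialize R a single time per process.
    private static readonly Lazy<REngine> Engine = new Lazy<REngine>(() =>
    {
        REngine.SetEnvironmentVariables(); // let R.NET locate the R installation
        return REngine.GetInstance();
    });

    private static readonly object Gate = new object();

    // Hypothetical helper: evaluates an R expression that yields a numeric
    // vector and returns its first element.
    public static double EvaluateNumber(string rExpression)
    {
        lock (Gate) // serialize access; the embedded R engine is not thread-safe
        {
            return Engine.Value.Evaluate(rExpression).AsNumeric()[0];
        }
    }
}
```

A call such as RSession.EvaluateNumber("mean(c(1, 2, 3))") would reuse the same R session on every request, so package loading and startup cost are paid only on first use.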

Related

Getting "File already in Use by other process" when using File.Move, why/how can I fix this?

I want to move several files whose names are stored in an ObservableCollection<String> _collection, using this method:
string firstFolderThatContainsEveryFile = @"...\Folder\Files";
string secondFolderArchiv = @"...\Folder\Files\Archiv";
foreach (var item in _collection)
{
string firstFolder = System.IO.Path.Combine(firstFolderThatContainsEveryFile, item);
string secondFolder = System.IO.Path.Combine(secondFolderArchiv, item);
File.Move(firstFolder, secondFolder);
}
This works the first time, but if I load new files into firstFolderThatContainsEveryFile and run my move method again, I get an exception:
File already in Use by other process
These are the steps:
I open the program -> use the move method -> success -> close the program -> fill the folder with new files -> open the program -> use the move method -> exception!
How can I get the process name or process ID so I can close that process before I call my move method, or is there a better way to get around this?
To find out which process is using your file, you can use Microsoft's Handle tool, as proposed in this solution, and invoke it from C# with the following code.
public void ViewProcess(string filePath)
{
Process tool = new Process();
tool.StartInfo.FileName = "handle.exe";
tool.StartInfo.Arguments = filePath + " /accepteula";
tool.StartInfo.UseShellExecute = false;
tool.StartInfo.RedirectStandardOutput = true;
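tool.Start();
// Read the output before waiting: waiting first can deadlock if the tool fills the redirected pipe
string outputTool = tool.StandardOutput.ReadToEnd();
tool.WaitForExit();
string matchPattern = @"(?<=\s+pid:\s+)\b(\d+)\b(?=\s+)";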
foreach (Match match in Regex.Matches(outputTool, matchPattern))
{
try{
Console.WriteLine(match.Value); // this is the process ID using the file
}
catch(Exception ex)
{
}
}
}
If the file is being used by another program, you should figure out why it is using it. If it is being used by your own program, recheck your code to understand why.
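As a complement to the Handle approach, the sketch below checks from inside the program whether a file can currently be opened exclusively before File.Move is attempted. This is a different technique from the one in the answer: it only tells you that the file is locked, not which process holds it. The FileLockProbe class name is hypothetical.

```csharp
using System;
using System.IO;

public static class FileLockProbe
{
    // Returns true if the file exists but cannot be opened exclusively,
    // which usually means another handle to it is still open.
    public static bool IsLocked(string path)
    {
        if (!File.Exists(path))
        {
            throw new FileNotFoundException("File to probe was not found.", path);
        }
        try
        {
            // FileShare.None requests exclusive access; it fails while
            // any other open handle to the file exists.
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return false;
            }
        }
        catch (IOException)
        {
            return true;
        }
    }
}
```

The result is only a snapshot: another process can still open the file between the probe and the move, so File.Move should remain inside a try/catch for IOException.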

How to execute the threadpool after Queueing user work items in C#?

I am new to C# programming and I have a task to convert a process from single-threaded to multithreaded. I am using .NET 3.5 and implementing a thread pool in the code. I read about ThreadPool and made some changes, but it is not working. After searching the internet again, I think I have only written part of the code, up to queueing the user work items, and I don't understand how to actually execute the threads.
Here is the code I wrote. Please don't hesitate to correct me if it is wrong; I am very new to C# coding.
ThreadPool.SetMaxThreads(6, 6);
try
{
// Assign the values to the report parameters
for (int i = 0; i < aq.Count; i++)
{
object j = aq[i];
ThreadPool.QueueUserWorkItem(new WaitCallback(process), j);
}
}
private void process(object i)
{
List<Report> aq = new List<Report>();
ReportEnv env = null;
ParameterValue[] paramval;
List<Report> list = new List<Report>();
Report al = null;
using (OleDbDataAdapter oleDA = new OleDbDataAdapter())
using (DataTable dt = new DataTable())
{
oleDA.Fill(dt, i);
foreach (DataRow _row in dt.Rows)
{
al = new Report();
al.EmailAttachmentMsg = _row["AttachmentMsg"].ToString();
al.reportName = _row["Repo"].ToString();
al.AccountNumber = _row["Number"].ToString();
al.AccountGroupCode = _row["GroupCode"].ToString();
al.EmailTo = _row["To"].ToString().Split(';');
al.ReportScheduleId = _row["ReportScheduleId"].ToString();
al.Frequency = _row["Frequency"].ToString();
al.ColcoContactTelephone = _row["ColcoContactTelephone"].ToString();
list.Add(al);
}
}
// aq = Populatereport(Dts.Variables["vnSource_SQL_Result"].Value);
env = PopulateEnvironment(Dts.Variables["vnEnvironment"].Value);
aq = list;
paramval = new ParameterValue[2];
paramval[0] = new ParameterValue();
paramval[0].Name = "PRM_CustomerDetails";
paramval[0].Value = aq[0].AccountNumber;
paramval[1] = new ParameterValue();
paramval[1].Name = "PRM_Startdate";
paramval[1].Value = aq[0].StartDate;
//Rendering the report begins
ReportExecutionService rs = new ReportExecutionService();
rs.Credentials = System.Net.CredentialCache.DefaultCredentials;
rs.Url = env.SSRSServerUrl.ToString();
//Load the report options
rs.LoadReport(aq[0].ReportPath, null);
rs.SetExecutionParameters(paramval, aq[0].CultureCode);
// Set the filename
String filename = aq[0].Number + "_" + env.Code + "_" + "_" + aq[0].Name +
DateTime.UtcNow.ToString("_dd-MM-yyyy_hh-mm-ss.fff");
//Render the report and generate pdf
Byte[] results;
string encoding = String.Empty;
string mimeType = String.Empty;
string extension = String.Empty;
Warning[] warnings = null;
string[] streamIDs = null;
string deviceInfo = null;
results = rs.Render(aq[0].ReportFormat, deviceInfo, out extension, out encoding, out mimeType, out warnings, out streamIDs);
//Write the file into the directory
using (FileStream stream = File.OpenWrite(@env.wipPath + filename))
{
stream.Write(results, 0, results.Length);
}
if (SendEmail(env.From, aq[0].To, env.Subject, aq[0].Attachment, env.Server, false, filename, env.TimeOut) == true)
{
// Move report file from WIP to Processed
File.Move(@env.oldPath + filename, @env.newPath + filename);
}
}
One reason your code may not execute is a race condition between the pool thread running and your program ending. How long is your code? If you're just beginning to learn C#, I suspect you are writing a console app whose code sits mostly in the Main() method and consists of a few lines. If you call ThreadPool.QueueUserWorkItem() in a short application and the end of Main() is reached immediately, your work item may never execute, because thread pool threads are background threads and do not keep the process alive.
To avoid this, you can add a one-second sleep before the Main() method ends, e.g.:
Thread.Sleep(1000);
You don't have to do anything more. By calling QueueUserWorkItem you are saying that you want to execute a given method on a thread managed by the thread pool. However, you have no guarantee that it will be executed immediately; that is how a thread pool works. Your method will be executed when a thread pool thread becomes available.
In the first line of your code you call ThreadPool.SetMaxThreads(6, 6); because of that, no more than 6 thread pool threads will be active at the same time. All requests to execute something on the thread pool above this limit will be queued. So maybe you made so many requests that some of them are simply waiting for their turn.
Besides, keep in mind that other code may also be using the thread pool. In that case your requests have to compete for thread pool threads.
UPDATE (after discussion):
Try putting a breakpoint inside the process method. The debugger will stop there, which proves that the process method really is executed. However, there is probably some bug in your code, and that is why you don't see e-mails being sent.
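Instead of sleeping for a fixed time, the caller can block until every queued item has finished. The following is a minimal sketch of that idea using only types available in .NET 3.5 (ManualResetEvent and Interlocked); the WorkQueue class and RunAll method are hypothetical names, not part of the original code.

```csharp
using System;
using System.Threading;

public static class WorkQueue
{
    // Queues one work item per input and blocks until all of them have run.
    // Returns the number of items that completed without throwing.
    public static int RunAll(object[] inputs, WaitCallback work)
    {
        if (inputs.Length == 0) return 0;

        int pending = inputs.Length; // items still running or queued
        int completed = 0;           // items that finished successfully

        using (ManualResetEvent allDone = new ManualResetEvent(false))
        {
            foreach (object input in inputs)
            {
                ThreadPool.QueueUserWorkItem(delegate(object state)
                {
                    try
                    {
                        work(state);
                        Interlocked.Increment(ref completed);
                    }
                    catch (Exception ex)
                    {
                        // An unhandled exception on a pool thread would end the process.
                        Console.Error.WriteLine(ex);
                    }
                    finally
                    {
                        // The last item to finish releases the waiting caller.
                        if (Interlocked.Decrement(ref pending) == 0) allDone.Set();
                    }
                }, input);
            }
            allDone.WaitOne();
        }
        return completed;
    }
}
```

In the question's code this would correspond to passing the elements of aq as inputs and process as the callback, so that the calling method does not return before the reports are rendered.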

Blue screen when using Ping

I'm running into the bug where the machine blue-screens (BSOD) when I end debugging in the middle of a ping.
I have a few ways to disable pinging in my (WPF) application (where I ping continuously), but sometimes I forget to do so and get a BSOD.
I'd like to get around that, say by changing a global AllowRealPinging variable and sleeping for 2 seconds in a callback before exiting the debugger, so I don't BSOD.
This is a known bug in Windows 7: you get a BSOD with bug-check code 0x76, PROCESS_HAS_LOCKED_PAGES, in tcpip.sys when you terminate the process. The most relevant feedback article is here. It is also covered in this SO question. There are no great answers there; the only known workaround is to fall back to a .NET version earlier than 4.0, which uses another winapi function that doesn't trigger the driver bug.
Avoiding pinging while you debug is certainly the best way to avoid this problem. Your desired approach is not going to work: your program is entirely frozen when it hits a breakpoint, and it goes kaboom when you stop debugging.
The simplest way is to not start pinging in the first place when a debugger is attached. Use the System.Diagnostics.Debugger.IsAttached property to detect this in your code.
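A minimal sketch of that guard follows. The AllowRealPinging flag comes from the question; the SafePing class and its method names are hypothetical.

```csharp
using System.Diagnostics;
using System.Net.NetworkInformation;

public static class SafePing
{
    // Pinging is allowed only when the application flag permits it
    // and no debugger is attached to the process.
    public static bool ShouldPing(bool allowRealPinging)
    {
        return allowRealPinging && !Debugger.IsAttached;
    }

    // Returns the round-trip time in milliseconds, or null when the ping
    // was skipped (debugger attached or flag off) or did not succeed.
    public static long? TryPing(string host, int timeoutMs, bool allowRealPinging)
    {
        if (!ShouldPing(allowRealPinging)) return null;
        try
        {
            using (Ping ping = new Ping())
            {
                PingReply reply = ping.Send(host, timeoutMs);
                return reply.Status == IPStatus.Success ? (long?)reply.RoundtripTime : null;
            }
        }
        catch (PingException)
        {
            return null;
        }
    }
}
```

The check only covers a debugger that is already attached when the ping starts; attaching later, while a ping is in flight, is not covered by this sketch.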
This is a good way around:
private void GetPing(){
Dictionary<string, string> tempDictionary = this.tempDictionary; //Some adresses you want to test
StringBuilder proxy = new StringBuilder();
string roundTripTest = "";
string location;
int count = 0; //Count is mainly there in case you don't get anything
Process process = new Process{
StartInfo = new ProcessStartInfo{
FileName = "ping.exe",
UseShellExecute = false,
RedirectStandardOutput = true,
CreateNoWindow = true,
}
};
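for (int i = 0; i < tempDictionary.Count; i++){
proxy.Clear(); // start from an empty builder so each ping gets a single address
count = 0; // restart the failure counter for every address
proxy.Append(tempDictionary.Keys.ElementAt(i));
process.StartInfo.Arguments = proxy.ToString();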
do{
try{
roundTripTest = RoundTripCheck(process);
}
catch (Exception ex){
count++;
}
if (roundTripTest == null){
count++;
}
if (count == 10 || roundTripTest == null || roundTripTest.Trim().Equals("")){ // null check avoids a NullReferenceException
roundTripTest = "Server Unavailable";
}
} while (roundTripTest == null || roundTripTest.Equals(" ") || roundTripTest.Equals(""));
}
process.Dispose();
}
RoundTripCheck method, where the magic happens:
private string RoundTripCheck(Process p){
StringBuilder result = new StringBuilder();
string returned = "";
p.Start();
while (!p.StandardOutput.EndOfStream){
result.Append(p.StandardOutput.ReadLine());
if (result.ToString().Contains("Average")){
returned = result.ToString().Substring(result.ToString().IndexOf("Average ="))
.Replace("Average =", "").Trim().Replace("ms", "").ToString();
break;
}
result.Clear();
}
return returned;
}
I had the same problem, this solves it!

Out of memory exception. May be from webservice

I have been fighting an out-of-memory exception for 4 months. My client has a webservice and wants me to test it. The webservice exposes an upload function, which I test with 1500 users uploading at the same time. I tried forcing garbage collection (GC) from my test code. With a 2 MB file there is no exception, but with an 8 MB file I still get the out-of-memory exception. I have tried many times and many solutions, but it still happens. While the upload was running I watched the memory of all the test computers, and none of them ran out of memory, so I think the problem comes from the webservice and the server. My client, however, says I have to prove to them that the cause lies in the webservice and the server. Does anyone have a solution for this? In addition, the client does not publish their code, so I can only call the webservice's functions to test it. I also have to go through a VPS to reach their webservice, and the network is rather slow over that connection.
I have to make sure that my test script doesn't have any problem. Here is my test script for the upload function.
public void UploadNewJob(string HalID, string fileUID, string jobUID, string fileName, out List<string> errorMessages)
{
errorMessages = null;
try
{
int versionNumber;
int newVersionNumber;
string newRevisionTag;
datasyncservice.ErrorObject errorObj = new datasyncservice.ErrorObject();
PfgDbJob job = new PfgDbJob();
job.CompanyName = Constant.SEARCH_CN;
job.HalliburtonSalesOffice = Constant.SEARCH_SO;
job.HalliburtonOperationsLocation = Constant.SEARCH_OL;
job.UploadPersonHalId = HalID;
job.CheckOutState = Constant.CHECKOUT_STATE;
job.RevisionTag = Constant.NEW_REVISION_TAG;
var manifestItems = new List<ManifestItem>();
var newManifestItems = new List<ManifestItem>();
var manifestItem = new ManifestItem();
if (fileUID == "")
{
if (job.JobUid == Guid.Empty)
job.JobUid = Guid.NewGuid();
if (job.FileUid == Guid.Empty)
job.FileUid = Guid.NewGuid();
}
else
{
Guid JobUid = new Guid(jobUID);
job.JobUid = JobUid;
Guid fileUid = new Guid(fileUID);
job.FileUid = fileUid;
}
// Change the next line when we transfer .ssp files by parts
manifestItem.PartUid = job.FileUid;
job.JobFileName = fileName;
manifestItem.BinaryFileName = job.JobFileName;
manifestItem.FileUid = job.FileUid;
manifestItem.JobUid = job.JobUid;
manifestItem.PartName = string.Empty;
manifestItem.SequenceNumber = 0;
manifestItems.Add(manifestItem);
errorMessages = DataSyncService.Instance.ValidateForUploadPfgDbJobToDatabase(out newVersionNumber, out newRevisionTag, out errorObj, out newManifestItems, HalID, job, false);
if (manifestItems.Count == 0)
manifestItems = newManifestItems;
if (errorMessages.Count > 0)
{
if (errorMessages.Count > 1 || errorMessages[0].IndexOf("NOT AN ERROR") == -1)
{
return;
}
}
//upload new Job
Guid transferUid;
long a= GC.GetTotalMemory(false);
byte[] fileContents = File.ReadAllBytes(fileName);
fileContents = null;
GC.Collect();
long b = GC.GetTotalMemory(false);
//Assert.Fail((b - a).ToString());
//errorMessages = DataSyncService.Instance.UploadFileInAJob(out transferUid, out errorObj, job.UploadPersonHalId, job, manifestItem, fileContents);
DataSyncService.Instance.UploadPfgDbJobToDatabase(out errorObj, out versionNumber, job.UploadPersonHalId, job, false, manifestItems);
}
catch (Exception ex)
{
Assert.Fail("Error from Test Scripts: " + ex.Message);
}
}
Please review my test code. If there is no problem in it, I still have to prove that the cause is not in my test code T_T
My guess would be that you hit the 2 GB object size limit of .NET (1500 * 8 MB is roughly 12 GB, well above that limit).
You should consider moving to .NET 4.5 and using the large object mode (see here); the setting is called gcAllowVeryLargeObjects.

How to redirect output from the NASM command line assembler in C#

Brief Summary
I am creating a lightweight IDE for NASM development in C# (kind of an irony, I know). It is a bit like Notepad++ but simpler, with features that make it more than a source editor, since Notepad++ is really just a fancy source editor. I have already implemented features like project creation (using a project format similar to how Visual Studio organizes projects, with the extension .nasmproj). I am also working on hosting it somewhere open source (Codeplex). The program is far from finished and definitely cannot be used in a production environment yet. In addition, I am working on it alone at the moment, as a spare-time project, since I just finished my last summer final in Calculus I.
Problem
Right now I am facing a problem: I can build the project, but no output from NASM is being fed into the IDE. I have successfully built a project and was able to produce object files. I even introduced a syntax error to see if something would finally show up, but nothing did; when I checked the bin folder of the test project, no object file had been created. So NASM is definitely doing its work. Is it that NASM doesn't want me to see its output? Is there a solution? Any advice would be great. Here is the code that I think is causing the trouble.
Things to Note
I have already checked whether the events are raised. Yes, they are, but they return empty strings.
I have also checked the error data, with the same effect.
Code
public static bool Build(string arguments, out Process nasmP)
{
try
{
ProcessStartInfo nasm = new ProcessStartInfo("nasm", arguments);
nasm.CreateNoWindow = true;
nasm.RedirectStandardError = true;
nasm.RedirectStandardInput = true;
nasm.RedirectStandardOutput = true;
nasm.UseShellExecute = false;
nasmP = new Process();
nasmP.EnableRaisingEvents = true;
nasmP.StartInfo = nasm;
bool predicate = nasmP.Start();
nasmP.BeginOutputReadLine();
return true;
}
catch
{
nasmP = null;
return false;
}
}
//Hasn't been tested nor used
public static bool Clean(string binPath)
{
if (binPath == null || !Directory.Exists(binPath))
{
throw new ArgumentException("Either path is null or it does not exist!");
}
else
{
try
{
DirectoryInfo binInfo = new DirectoryInfo(binPath);
FileInfo[] filesInfo = binInfo.GetFiles();
for (int index = 0; index < filesInfo.Length; index++)
{
try
{
filesInfo[index].Delete();
filesInfo[index] = null;
}
catch
{
break;
}
}
GC.Collect();
return true;
}
catch
{
return false;
}
}
}
}
using (BuildDialog dlg = new BuildDialog(currentSolution))
{
DialogResult result = dlg.ShowDialog();
dlg.onOutputRecieved += new BuildDialog.OnOutputRecievedHandler(delegate(Process _sender, string output)
{
if (result == System.Windows.Forms.DialogResult.OK)
{
outputWindow.Invoke(new InvokeDelegate(delegate(string o)
{
Console.WriteLine("Data:" + o);
outputWindow.Text = o;
}), output);
}
});
}
Edits
I have tried reading synchronously instead of asynchronously, but the result is still the same (an empty string "" is returned). In the debugger, the stream is already at the end, so it looks like nothing has been written to it.
This is what I tried:
string readToEnd = nasmP.StandardOutput.ReadToEnd();
nasmP.WaitForExit();
Console.WriteLine(readToEnd);
Another interesting thing I tried was copying the arguments from the debugger and pasting them into the command-line shell. There I can see NASM compiling and giving the error that I wanted to see all along. So it is definitely not a NASM problem. Could it be a problem with my code or with the .NET Framework?
Here is a snapshot of the shell window (not technically proof, but this is what the output should look like in my IDE):
Alan made a very good point: check the subprocesses or threads. Are subprocess and thread synonymous? But here is the problem: almost all the properties, except a select few and the output/error streams, throw an invalid operation exception. Here is the debugger information as an image (I wish Visual Studio let you copy the entire information in one click):
Okay, I was finally able to do it. I found a control that redirects output from a process, looked at its source code, and saw what I needed to do. Here is the modified code:
public static bool Build(string command, out StringBuilder buildOutput)
{
try
{
buildOutput = new StringBuilder();
ProcessStartInfo startInfo = new ProcessStartInfo("cmd.exe");
startInfo.Arguments = "/C " + " nasm " + command;
startInfo.RedirectStandardError = true;
startInfo.RedirectStandardOutput = true;
startInfo.UseShellExecute = false;
startInfo.CreateNoWindow = true;
Process p = Process.Start(startInfo);
string output = p.StandardOutput.ReadToEnd();
string error = p.StandardError.ReadToEnd();
p.WaitForExit();
if (output.Length != 0)
buildOutput.Append(output);
else if (error.Length != 0)
buildOutput.Append(error);
else
buildOutput.Append("\n");
return true;
}
catch
{
buildOutput = null;
return false;
}
}
Here is how the output is formatted:
I also want to thank Alan for helping me debug my code, even though he didn't actually have my code. He really was helpful and I thank him for it.
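One caveat with the modified Build method: it reads standard output to the end before it starts reading standard error, which can block if the child process fills the error pipe in the meantime. The sketch below shows one way to capture both streams through the asynchronous events instead. It is an illustration under assumptions, not the author's code; ProcessCapture and Run are hypothetical names.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ProcessCapture
{
    // Runs a program, collects stdout and stderr separately, and returns the exit code.
    public static int Run(string fileName, string arguments, out string stdout, out string stderr)
    {
        StringBuilder outBuf = new StringBuilder();
        StringBuilder errBuf = new StringBuilder();

        ProcessStartInfo info = new ProcessStartInfo(fileName, arguments);
        info.RedirectStandardOutput = true;
        info.RedirectStandardError = true;
        info.UseShellExecute = false; // required for redirection
        info.CreateNoWindow = true;

        using (Process p = new Process())
        {
            p.StartInfo = info;
            // Handlers must be attached before the Begin*ReadLine calls;
            // e.Data is null when the stream reaches its end.
            p.OutputDataReceived += delegate(object sender, DataReceivedEventArgs e)
            {
                if (e.Data != null) lock (outBuf) outBuf.AppendLine(e.Data);
            };
            p.ErrorDataReceived += delegate(object sender, DataReceivedEventArgs e)
            {
                if (e.Data != null) lock (errBuf) errBuf.AppendLine(e.Data);
            };

            p.Start();
            p.BeginOutputReadLine();
            p.BeginErrorReadLine();
            // The parameterless WaitForExit also waits for both async readers to finish.
            p.WaitForExit();

            stdout = outBuf.ToString();
            stderr = errBuf.ToString();
            return p.ExitCode;
        }
    }
}
```

For the IDE, a call like ProcessCapture.Run("nasm", arguments, out output, out errors) would return both streams, so diagnostics written to the error stream can be shown even when standard output is also non-empty.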
