Ultra Fast Text to Speech (WAV -> MP3) in ASP.NET MVC - c#

This question is essentially about the suitability of Microsoft's Speech API (SAPI) for server workloads and whether it can be used reliably inside w3wp for speech synthesis. We have an asynchronous controller that uses the native System.Speech assembly in .NET 4 (not the Microsoft.Speech one that ships as part of Microsoft Speech Platform - Runtime Version 11) and lame.exe to generate MP3s as follows:
[CacheFilter]
public void ListenAsync(string url)
{
string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());
try
{
var t = new System.Threading.Thread(() =>
{
using (SpeechSynthesizer ss = new SpeechSynthesizer())
{
ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050, AudioBitsPerSample.Eight, AudioChannel.Mono));
ss.Speak("Here is a test sentence...");
ss.SetOutputToNull();
ss.Dispose();
}
var process = new Process() { EnableRaisingEvents = true };
process.StartInfo.FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe");
process.StartInfo.Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3"));
process.StartInfo.UseShellExecute = false;
process.StartInfo.RedirectStandardOutput = false;
process.StartInfo.RedirectStandardError = false;
process.Exited += (sender, e) =>
{
System.IO.File.Delete(fileName);
AsyncManager.OutstandingOperations.Decrement();
};
AsyncManager.OutstandingOperations.Increment();
process.Start();
});
t.Start();
t.Join();
}
catch { }
AsyncManager.Parameters["fileName"] = fileName;
}
public FileResult ListenCompleted(string fileName)
{
return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
}
The question is why does SpeechSynthesizer need to run on a separate thread like that in order to return (this is reported elsewhere on SO here and here) and whether implementing a STAThreadRouteHandler for this request is more-efficient/scalable than the approach above?
Second, what are the options for running SpeakAsync in an ASP.NET (MVC or WebForms) context? None of the options I've tried seem to work (see update below).
Any other suggestions for how to improve this pattern (i.e. two dependencies that must execute serially, each of which has async support) are welcome. I don't feel this scheme is sustainable under load, especially considering the known memory leaks in SpeechSynthesizer. I'm considering running this service on a different stack altogether.
Update:
Neither the Speak nor the SpeakAsync option appears to work under the STAThreadRouteHandler. The former produces:
System.InvalidOperationException: Asynchronous operations are not
allowed in this context. Page starting an asynchronous operation has
to have the Async attribute set to true and an asynchronous operation
can only be started on a page prior to PreRenderComplete event. at
System.Web.LegacyAspNetSynchronizationContext.OperationStarted() at
System.ComponentModel.AsyncOperationManager.CreateOperation(Object
userSuppliedState) at
System.Speech.Internal.Synthesis.VoiceSynthesis..ctor(WeakReference
speechSynthesizer) at
System.Speech.Synthesis.SpeechSynthesizer.get_VoiceSynthesizer() at
System.Speech.Synthesis.SpeechSynthesizer.SetOutputToWaveFile(String
path, SpeechAudioFormatInfo formatInfo)
The latter results in:
System.InvalidOperationException: The asynchronous action method
'Listen' cannot be executed synchronously. at
System.Web.Mvc.Async.AsyncActionDescriptor.Execute(ControllerContext
controllerContext, IDictionary`2 parameters)
It seems like a custom STA thread pool (with ThreadStatic instances of the COM object) is a better approach: http://marcinbudny.blogspot.ca/2012/04/dealing-with-sta-coms-in-web.html
Update #2: It doesn't seem like System.Speech.SpeechSynthesizer needs STA treatment, seems to run fine on MTA threads so long as you follow that Start/Join pattern. Here's a new version that is able to correctly use SpeakAsync (issue there was disposing it prematurely!) and breaks up the WAV generation and the MP3 generation into two separate requests:
[CacheFilter]
[ActionName("listen-to-text")]
public void ListenToTextAsync(string text)
{
AsyncManager.OutstandingOperations.Increment();
var t = new Thread(() =>
{
SpeechSynthesizer ss = new SpeechSynthesizer();
string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());
ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050,
AudioBitsPerSample.Eight,
AudioChannel.Mono));
ss.SpeakCompleted += (sender, e) =>
{
ss.SetOutputToNull();
ss.Dispose();
AsyncManager.Parameters["fileName"] = fileName;
AsyncManager.OutstandingOperations.Decrement();
};
CustomPromptBuilder pb = new CustomPromptBuilder(settings.DefaultVoiceName);
pb.AppendParagraphText(text);
ss.SpeakAsync(pb);
});
t.Start();
t.Join();
}
[CacheFilter]
public ActionResult ListenToTextCompleted(string fileName)
{
return RedirectToAction("mp3", new { fileName = fileName });
}
[CacheFilter]
[ActionName("mp3")]
public void Mp3Async(string fileName)
{
var process = new Process()
{
EnableRaisingEvents = true,
StartInfo = new ProcessStartInfo()
{
FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe"),
Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3")),
UseShellExecute = false,
RedirectStandardOutput = false,
RedirectStandardError = false
}
};
process.Exited += (sender, e) =>
{
System.IO.File.Delete(fileName);
AsyncManager.Parameters["fileName"] = fileName;
AsyncManager.OutstandingOperations.Decrement();
};
AsyncManager.OutstandingOperations.Increment();
process.Start();
}
[CacheFilter]
public ActionResult Mp3Completed(string fileName)
{
return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
}
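On the question of why the separate thread is needed: the first stack trace above shows SpeechSynthesizer reaching AsyncOperationManager.CreateOperation, which reports to the current SynchronizationContext (here the ASP.NET one), and that is where the exception is thrown. A freshly created thread normally starts without a SynchronizationContext, which would explain why the Start/Join pattern avoids the error. The helper below is a minimal sketch of that pattern, not part of the original code: DedicatedThread and Run are hypothetical names, and ExceptionDispatchInfo assumes .NET 4.5 or later.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

public static class DedicatedThread
{
    // Runs work on a fresh thread (which starts with no SynchronizationContext)
    // and blocks the caller until it finishes, like the Start/Join pattern above.
    public static T Run<T>(Func<T> work)
    {
        T result = default(T);
        ExceptionDispatchInfo failure = null;

        var thread = new Thread(() =>
        {
            try { result = work(); }
            catch (Exception ex) { failure = ExceptionDispatchInfo.Capture(ex); }
        });
        thread.IsBackground = true;
        thread.Start();
        thread.Join();

        // Rethrow on the calling thread instead of losing the exception
        // (an unhandled exception on the worker would end the process).
        if (failure != null) failure.Throw();
        return result;
    }
}
```

The request thread still blocks during Join, so this only isolates the synthesizer from the ASP.NET context; it does not by itself make the action more scalable.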

I/O is very expensive on a server. How many simultaneous streams of WAV writing do you think you can get on a server hard drive? Why not do it all in memory and only write the MP3 when it's fully processed? MP3s are much smaller, so the I/O will be engaged for a shorter time. You can even change the code to return the stream directly to the user instead of saving to an MP3 if you want.
How do can I use LAME to encode an wav to an mp3 c#

This question is a bit old now, but this is what I'm doing and it's been working great so far:
public Task<FileStreamResult> Speak(string text)
{
return Task.Factory.StartNew(() =>
{
using (var synthesizer = new SpeechSynthesizer())
{
var ms = new MemoryStream();
synthesizer.SetOutputToWaveStream(ms);
synthesizer.Speak(text);
ms.Position = 0;
return new FileStreamResult(ms, "audio/wav");
}
});
}
might help someone...

Related

C# Process.StandardInput.Write deadlocks/hangs when not using StreamWriter.Close

I have a C# program that wants to interact with an external process written in C++. I believe this C++ process is using correct standard input. I just can't seem to get my C# code to not hang when trying to write to Process.StandardInput.
I've seen countless examples using Process.StandardInput.Close() when done writing. Every StackOverflow answer I found says to use this, and it does work. The problem is I can't close the StreamWriter because I'm not done interacting with the process. The process is a state machine that holds variables created using stdin, parses expressions, and returns an evaluation. I am expected to keep giving the process input after each output.
Does anyone have an example where Process.StandardInput.WriteLine is used more than once without closing or restarting the process?
This is how the C++ process is reading input. This example simply echoes back the input and waits for another line.
int main () {
std::string input;
while (getline(std::cin, input)) {
std::cout << input << std::endl;
}
}
My C# program tries to interact with this process using this wrapper class.
public class Expression {
System.Diagnostics.Process p;
public Expression () {
p = new System.Diagnostics.Process();
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.RedirectStandardInput = true;
p.StartInfo.FileName = "InputEcho.exe";
p.Start();
p.StandardInput.AutoFlush = true;
}
public void Run (in string input, out string output) {
p.StandardInput.WriteLine(input);
// p.StandardInput.Close();
output = p.StandardOutput.ReadToEnd();
}
}
This works when I uncomment p.StandardInput.Close() but then subsequent calls to Expression.Run() won't work because the writer is closed.
Main program
Expression expn = new();
string output;
Console.WriteLine("Expression start");
expn.Run("Hello", out output);
Console.WriteLine(output);
expn.Run("Hi", out output);
Console.WriteLine(output);
Expected output
Expression start
Hello
Hi
Actual output
Expression start
EDIT:
@Matthew Andrews provided a really good answer that works, but it's not quite what I'm after. I didn't think about using event delegates to receive output data, and I see why: It's hard to implement this into the wrapper that I want to use to build a process-relevant API. What I mean by this is that I want to write some method that communicates with the process, give it input, receive the output, and return this data to the caller before doing anything else. My Expression.Run method exemplifies this perfectly.
Here's an example of what the root caller would look like in a greater C# program.
bool GetConditionEval (string condition, SomeDataType data) {
// Makes another call to 'Run' that commands the C++ process to store a variable
// Input looks like this: "variableName = true" (aka key/value pairs)
Expression.SetVar(data.name, "true");
// Don't ask why I'm using an external process to set variables using string expressions.
// It's a company proprietary thing.
string output;
Expression.Run(in condition, out output);
if (output.ToLower() == "true") return true;
else if (output.ToLower() == "false") return false;
else throw new Exception("Output is something other than true or false.");
}
This is why I'd like for Run to immediately return the output it receives from the process.
If not, I guess I could find a way for a delegate method to store the output in a global container and the GetConditionEval can just reach into that. I worry about race conditions though.
Side note:
Since I do expect the API that is contained in this C++ process to eventually take other forms, spinning this up as a standalone process and invoking the API via stdin is really a stopgap for now so I don't have to convert thousands of lines of C++ code into C#.
SOLUTION:
I figured out a solution using the asynchronous method Matthew suggested while keeping a linear flow of sending input and working immediately off the output in the same sequence. I reconfigured my wrapper class to queue each output received from the event listener. This sets up a pattern where I can call one method to send input, and then call another method right after to pop output data off the queue, if any. I compensated for the fact that output data might not be available immediately by simply waiting if the queue is empty and then moving forward once something is there. This unfortunately makes it a blocking call if it does have to wait, but it's the best I have so far. I also implemented a failsafe so it doesn't wait indefinitely.
public class Expression {
System.Diagnostics.Process p = new();
System.Collections.Generic.Queue<string> outputQ = new();
public Expression () {
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.RedirectStandardInput = true;
p.StartInfo.FileName = "C2E2.exe";
p.OutputDataReceived += (s, e) => {
outputQ.Enqueue(e.Data);
};
p.Start();
p.BeginOutputReadLine();
}
/// Returns custom exception object if error is encountered.
public GRLib.Exception Run (in string input) {
if (p == null) return GRLib.Exception.New("Expression Evaluator not operational.");
try {
p.StandardInput.WriteLine(input);
}
catch (Exception e) {
return GRLib.Exception.New(e.Message);
}
return null;
}
/// Returns error code 1 if timeout occurred.
/// Timeout is represented in milliseconds.
/// Blocking call.
public GRLib.Exception GetOutput (out string output, int timeout = 2000) {
/// Wait for something to show in the queue.
/// Waits indefinitely if timeout is 0.
/// If anyone knows a better way to implement this waiting loop,
/// please let me know!
int timeWaited = 0;
while (outputQ.Count == 0) {
System.Threading.Thread.Sleep(100);
if (timeout != 0 && (timeWaited += 100) > timeout) {
output = "ERR";
return GRLib.Exception.New(1, "Get timed out.");
}
}
output = outputQ.Dequeue();
return null;
}
...
}
Example usage
Expression expression = new();
var e = expression.Run("3 > 2");
if (e != null) { /* Handle error */ }
string output;
e = expression.GetOutput(out output);
if (e != null) { /* Handle error */ }
// 'output' should now be 'true' which can then be used in other parts of this program.
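On the request for a better waiting loop: OutputDataReceived is raised on a different thread from the one that calls GetOutput, and Queue<string> is not safe for that kind of concurrent use. One option, sketched here under that assumption, is a BlockingCollection<string>, whose TryTake blocks with a timeout so no Sleep loop is needed. OutputMailbox, Post, and TryGet are hypothetical names, not part of the original wrapper.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class OutputMailbox
{
    private readonly BlockingCollection<string> lines = new BlockingCollection<string>();

    // Call from the OutputDataReceived handler with e.Data.
    // e.Data is null once the stream closes, so nulls are ignored.
    public void Post(string line)
    {
        if (line != null) lines.Add(line);
    }

    // Blocks until a line arrives or the timeout elapses.
    // A timeout of -1 waits indefinitely.
    public bool TryGet(out string line, int timeoutMs = 2000)
    {
        return lines.TryTake(out line, timeoutMs);
    }
}
```

In the wrapper, the handler would become p.OutputDataReceived += (s, e) => mailbox.Post(e.Data); and GetOutput would return the timeout error when TryGet returns false.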
While the event listener in a standalone fashion works great, I need the output from the process to be returned in the same stack where the input is given because this is going to be part of a more complex call graph.
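The problem you're observing is due to the synchronous nature of Process.StandardOutput.ReadToEnd(): it does not return until the child process closes its standard output, which the echo loop never does while it is waiting for more input. Instead, you should listen for your output asynchronously by calling Process.BeginOutputReadLine() and utilizing the Process.OutputDataReceived event.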
Here is a quick example to get you started:
var p = new Process();
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardInput = true;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.FileName = @"ConsoleApplication1.exe";
p.OutputDataReceived += (s, e) =>
{
Console.WriteLine(e.Data);
};
p.Start();
p.BeginOutputReadLine();
while (true)
{
var readLine = Console.ReadLine();
p.StandardInput.WriteLine(readLine);
}
And here is the c++ I used for ConsoleApplication1.exe:
int main()
{
std::cout << "Hello World!\n";
std::string input;
while (std::getline(std::cin, input)) {
std::cout << input << std::endl;
}
}
Running my example will print Hello World! and then proceed to parrot whatever else you enter into the console.
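If the child is known to answer every input line with exactly one output line, as the echo program does, a blocking ReadLine (rather than ReadToEnd) keeps the call synchronous without closing stdin. The sketch below assumes that one-line-per-request protocol; LineChannel and Ask are hypothetical names. It takes a TextWriter and TextReader so it can wrap p.StandardInput and p.StandardOutput.

```csharp
using System;
using System.IO;

public sealed class LineChannel
{
    private readonly TextWriter toChild;
    private readonly TextReader fromChild;

    public LineChannel(TextWriter toChild, TextReader fromChild)
    {
        this.toChild = toChild;
        this.fromChild = fromChild;
    }

    // Sends one line and blocks until one line of reply is available.
    public string Ask(string input)
    {
        toChild.WriteLine(input);
        toChild.Flush();

        string reply = fromChild.ReadLine();
        if (reply == null) throw new EndOfStreamException("The child closed its output.");
        return reply;
    }
}
```

It would be constructed as new LineChannel(p.StandardInput, p.StandardOutput) after p.Start(). This cannot be combined with BeginOutputReadLine on the same process, and it will block forever if the child ever replies with zero lines, so the event-based approach remains the safer choice when the protocol is not strictly one line in, one line out.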

How can I get output from powershell in C# while a command is running?

I'm using PowerShell in C# with System.Management.Automation and I can access both the output and the error stream successfully. For most applications this is great, but I'm now in a situation where I need to get the output of a PowerShell command while it is running in C#, and I'm lost.
I've tried subscribing to outputCollection.DataAdded and to the PowerShell instance's verbose stream, but neither handler is called when PowerShell produces output.
Here's the code I have so far
public async Task<string> CMD(string script)
{
ps = PowerShell.Create();
string errorMsg = "";
string output;
ps.AddScript(script);
ps.AddCommand("Out-String");
PSDataCollection<PSObject> outputCollection = new();
ps.Streams.Error.DataAdded += (object sender, DataAddedEventArgs e) =>
{ errorMsg = ((PSDataCollection<ErrorRecord>)sender)[e.Index].ToString(); };
IAsyncResult result = ps.BeginInvoke<PSObject, PSObject>(null, outputCollection);
while (!result.IsCompleted)
{
await Task.Delay(100);
}
StringBuilder stringBuilder = new();
foreach (PSObject outputItem in outputCollection)
{
stringBuilder.AppendLine(outputItem.BaseObject.ToString());
}
output = stringBuilder.ToString();
//Clears commands added to runspace
ps.Commands.Clear();
Debug.WriteLine(output);
if (!string.IsNullOrEmpty(errorMsg))
MessageBox.Show(errorMsg, "Error");
return output.Trim();
}
I've also tried checking the outputcollection in the while loop but it doesn't give me the output until the command is done.
The command I'm trying to use is Connect-ExchangeOnline -Device
To simulate it in C# it would work the same as doing sleep 5;echo test;sleep 5
where I then want the program to display test after 5 seconds not after the full 10 seconds.
EDIT:
When using "Connect-ExchangeOnline -Device", PowerShell delivers the following output and waits for the user to complete the task. The issue is that I can't display this in C# because my C# code waits for the PowerShell command to finish, and outputCollection.DataAdded never seems to be called.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code CDWS27A56 to authenticate.
Unfortunately, Connect-ExchangeOnline is meant only for interactive console use, and specifically gets around allowing the output to be captured (Possibly by writing directly to the $host window it was called from?).
Normally, you could try using Tee-Object or Start-Transcript/Stop-Transcript with redirection to dump all output:
Connect-ExchangeOnline -Device *>&1 | Tee-Object -FilePath C:\temp\tee.txt -Append
# Then from another process:
Get-Content C:\temp\tee.txt
Or try starting a powershell Job, which keeps all of its output in the job object's properties:
$job = Start-ThreadJob -Name Connecting -ScriptBlock { Connect-ExchangeOnline -Device }
# Wait for prompt...
$job.Output
$job.Information
However, neither of these actually grab the device authentication code.
Currently to use -Device, you need to have a visible powershell window and have the user complete their device authentication there.
You can always use one of the other authentication types:
Connect-ExchangeOnline -UserPrincipalName username@domain.tld
This version will automatically launch a Modern Authentication prompt or a browser page. Depending on your use case, it is effectively the same.
I would pipe the script outputs to a file, then have your c# code read that file, and filter out the code.
Alternatively, you could use a class that exposes two StringBuilders as properties, through which you can get the script output and filter out the code:
using System.Diagnostics;
using System.IO;
using System.Text;
public sealed class ProcessOptions
{
public bool WaitForProcessExit { get; set; }
public bool Executing { get; internal set; } = true;
}
public class InvokePowershell
{
public static StringBuilder stdout { get; private set; } = null;
public static StringBuilder stderr { get; private set; } = null;
public static void Start(string script)
{
var process = new Process();
var options = new ProcessOptions()
{
WaitForProcessExit = true,
};
process.StartInfo.FileName = "powershell"; // (or pwsh for powershell core).
process.StartInfo.Arguments = script;
process.StartInfo.RedirectStandardOutput = true;
process.StartInfo.RedirectStandardError = true;
process.StartInfo.UseShellExecute = false; // must be false when redirecting stdout/stderr
process.StartInfo.CreateNoWindow = true;
process.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
process.StartInfo.WorkingDirectory = Directory.GetCurrentDirectory();
Shell(process, options);
}
private static void Shell(Process process, ProcessOptions options)
{
if (stdout is not null && stderr is not null)
{
stdout = null;
stderr = null;
}
process.OutputDataReceived += (_, e) =>
{
if (e.Data is null)
{
return;
}
if (stdout is null)
{
stdout = new StringBuilder();
}
else
{
stdout.AppendLine();
}
stdout.Append(e.Data);
};
process.ErrorDataReceived += (_, e) =>
{
if (e.Data is null)
{
return;
}
if (stderr is null)
{
stderr = new StringBuilder();
}
else
{
stderr.AppendLine();
}
stderr.Append(e.Data);
};
process.Start();
options.Executing = false;
if (process.StartInfo.RedirectStandardError)
process.BeginErrorReadLine();
if (process.StartInfo.RedirectStandardOutput)
process.BeginOutputReadLine();
if (options.WaitForProcessExit)
process.WaitForExit();
}
}
You could then fire and forget that code (so that the caller is not blocked until PowerShell exits) and do the following in your normal code:
while (InvokePowershell.stdout is null || InvokePowershell.stdout.Length < /* length you expect it to be */)
{
// Sleep for a few milliseconds to avoid wasting CPU cycles while waiting.
System.Threading.Thread.Sleep(10);
}
// strip the code from the stdout property then use it.
I find this approach much cleaner. It can also be ported to PowerShell Core by changing powershell in the Start function to pwsh, which is the program name for PowerShell Core; the code would then work on all platforms that PowerShell Core supports, which are Windows, macOS, and various Linux distributions.
For this to work, powershell (or pwsh, if PowerShell Core is wanted) must be in the PATH environment variable so that it can be invoked by name.
With the code above, you could theoretically skip waiting for the process to exit. However, I do not know whether the events would still trigger and populate the StringBuilders; the process instance would leave scope and could be garbage collected along with its event handlers, in which case the StringBuilders would never be assigned.
That is why I recommend calling InvokePowershell.Start(script); from a delegate as a fire-and-forget call, then looping while the output is null or shorter than the expected length, sleeping briefly on each iteration, and filtering the result once the loop has ensured that it is populated.
Edit: Instead of calling InvokePowershell.Start(script); as fire-and-forget, you can replace the call to process.WaitForExit() (and the if check around it) with the while loop shown above. Pass the expected length into the Start method and on to the Shell method as a parameter, and remove the options argument, its instantiation, and its type entirely. After the while loop ends (which gives the event handlers time to add what you need to the StringBuilders), you can call process.Kill(); to kill PowerShell or PowerShell Core.
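As an alternative to polling the StringBuilder for an expected length, the event handlers could feed each line to a small watcher that completes a task as soon as a line matches a pattern. This is only a sketch: FirstMatchWatcher, OnLine, and Found are hypothetical names, the pattern in the usage note is an assumption based on the sample message in the question, and it can only see text that actually reaches the redirected streams (which, per the first answer, the device code may not).

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public sealed class FirstMatchWatcher
{
    private readonly Regex pattern;
    private readonly TaskCompletionSource<string> found =
        new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);

    public FirstMatchWatcher(string pattern)
    {
        this.pattern = new Regex(pattern);
    }

    // Completes with capture group 1 of the first matching line.
    public Task<string> Found { get { return found.Task; } }

    // Call from OutputDataReceived / ErrorDataReceived with e.Data.
    public void OnLine(string line)
    {
        if (line == null) return; // stream closed
        Match match = pattern.Match(line);
        if (match.Success) found.TrySetResult(match.Groups[1].Value);
    }
}
```

A caller would create new FirstMatchWatcher(@"enter the code (\w+)"), subscribe process.OutputDataReceived += (_, e) => watcher.OnLine(e.Data); and then wait on watcher.Found with a timeout instead of sleeping in a loop.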
You can use the code below to get the output of a PowerShell command in real time from a C# application.
This uses a PowerShell Pipeline, which allows you to call a notification handler whenever the PowerShell command/script writes output into the Pipeline. I've implemented the solution below as an async enumerable but if you wanted something non-async you can also just use the Pipeline.Output.DataReady handler to trigger some code to read from the pipeline.
https://gist.github.com/OnKey/83cf98e6adafe5a2b4aaf561b138087b
static async Task Main(string[] args)
{
var script = @"
For ($i=0; $i -le 5; $i++) {
$i
Start-Sleep -s 1
}
";
var p = new Program();
await foreach (var item in p.PowerShellAsyncEnumerable(script))
{
Console.WriteLine(item);
}
}
private IAsyncEnumerable<PSObject> PowerShellAsyncEnumerable(string script)
{
var rs = RunspaceFactory.CreateRunspace();
rs.Open();
var pipeline = rs.CreatePipeline();
pipeline.Commands.AddScript(script);
return new PsAsyncEnumerable(pipeline);
}
internal class PsAsyncEnumerable : IAsyncEnumerable<PSObject>
{
private readonly Pipeline pipe;
public PsAsyncEnumerable(Pipeline pipe) => this.pipe = pipe;
public IAsyncEnumerator<PSObject> GetAsyncEnumerator(CancellationToken cancellationToken = new())
=> new PsAsyncEnumerator(this.pipe);
}
internal class PsAsyncEnumerator : IAsyncEnumerator<PSObject>
{
private readonly Pipeline pipe;
private TaskCompletionSource dataReady = new();
public PsAsyncEnumerator(Pipeline pipe)
{
this.pipe = pipe;
this.pipe.Output.DataReady += NotificationHandler;
this.pipe.Error.DataReady += NotificationHandler;
this.pipe.InvokeAsync();
}
private void NotificationHandler(object sender, EventArgs e)
{
this.dataReady.TrySetResult(); // Try*: a second notification before the reset must not throw
}
public ValueTask DisposeAsync()
{
this.pipe.Dispose();
return ValueTask.CompletedTask;
}
public async ValueTask<bool> MoveNextAsync()
{
while (!this.pipe.Output.EndOfPipeline)
{
var item = this.pipe.Output.NonBlockingRead(1).FirstOrDefault();
if (item != null)
{
this.Current = item;
return true;
}
await this.dataReady.Task;
this.dataReady = new TaskCompletionSource();
}
return false;
}
public PSObject Current { get; private set; }
}
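The hand-rolled enumerator above has to coordinate the event handler and MoveNextAsync through a TaskCompletionSource that is swapped out on each wait. A possible simplification, sketched here and not taken from the linked gist, is to let a System.Threading.Channels channel do that bridging. EventToAsyncStream, Publish, and Complete are hypothetical names, and .NET Core 3.0 or later is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

public sealed class EventToAsyncStream<T>
{
    // Unbounded: Publish never blocks the event handler thread.
    private readonly Channel<T> channel = Channel.CreateUnbounded<T>();

    // Call from the producer's event handler for each item.
    public void Publish(T item)
    {
        channel.Writer.TryWrite(item);
    }

    // Call once when the producer has finished.
    public void Complete()
    {
        channel.Writer.TryComplete();
    }

    // Consume with: await foreach (var item in stream.ReadAllAsync()) { ... }
    public IAsyncEnumerable<T> ReadAllAsync()
    {
        return channel.Reader.ReadAllAsync();
    }
}
```

With the pipeline above, the DataReady handler would publish whatever NonBlockingRead returns; deciding when to call Complete (for example when the pipeline reports that it has finished) is left out of this sketch.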
1. In C# Start BackgroundWorker bw;
2. In bw.DoWork(...) Start PowerShell.
3. In PowerShell write to a File and close it.
4. In the Main thread of C# read the File.
===
using System.ComponentModel;
BackgroundWorker bw = new();
bw.DoWork += Bw_DoWork;
private void Bw_DoWork(object sender, DoWorkEventArgs e)
{
<Start ps>
}
=== in ps
$obj = New-Object -ComObject Scripting.FileSystemObject
$outFile = "C:\File.txt"
$objFile = $obj.CreateTextFile($outFile, $true)
$objFile.Write("test string")
$objFile.Close() # it makes file accessible outside ps
=== In the Main thread
<read the C:\File.txt>
===

Command prompt from C# get stuck

I asked this question the other day, but I neither got an answer nor managed to make it work. So I have tried to slim it down, as there was a lot of noise in the original question.
The thing is, if I expose in a Web API a call to a method that runs cmd.exe, it works fine as long as I don't call it twice per request.
I mean, this code works fine:
public class FilesController : ApiController
{
private readonly IRunner _runner;
public FilesController(IRunner runner)
{
_runner = runner;
}
public string Get()
{
return _runner.GetFiles();
}
}
public class Runner : IRunner
{
public Runner()
{
//var cd = @"cd C:\DummyFolder";
//RunCmdPromptCommand(cd);
}
public string GetFiles()
{
var dir = @"cd C:\DummyFolder & dir";
//var dir = "dir";
return RunCmdPromptCommand(dir);
}
private string RunCmdPromptCommand(string command)
{
var process = new Process
{
StartInfo =
{
UseShellExecute = false,
CreateNoWindow = true,
WindowStyle = ProcessWindowStyle.Hidden,
RedirectStandardError = true,
RedirectStandardOutput = true,
FileName = @"cmd.exe",
Arguments = string.Format("/C {0}", command)
}
};
process.Start();
var error = process.StandardError.ReadToEnd();
if (!string.IsNullOrEmpty(error))
{
throw new Exception(error);
}
var output = process.StandardOutput.ReadToEnd();
process.WaitForExit();
return output;
}
}
But if I uncomment the commented-out lines (and comment out the first line of GetFiles), then when the code reaches RunCmdPromptCommand for the second time (i.e. with "dir"), it gets stuck on the line where it tries to read the standard error.
I don't know why, and I don't know how to force an exit whenever this happens (there might be other scenarios where it can happen).
Thanks,
This is because
process.StandardError.ReadToEnd();
is a synchronous operation.
Excerpt from MSDN:
The redirected StandardError stream can be read synchronously or
asynchronously. Methods such as Read, ReadLine, and ReadToEnd perform
synchronous read operations on the error output stream of the process.
These synchronous read operations do not complete until the associated
Process writes to its StandardError stream, or closes the stream.
In other words, as long as the process neither writes to its standard error stream nor closes it, the read will be stuck there forever.
To fix this, I recommend using the asynchronous BeginErrorReadLine. Excerpt from MSDN:
In contrast, BeginErrorReadLine starts asynchronous read operations on
the StandardError stream. This method enables a designated event
handler for the stream output and immediately returns to the caller,
which can perform other work while the stream output is directed to the event handler.
Which I think will be suitable for your need.
To use it, the example given on MSDN is pretty straightforward. Check out especially these lines:
netProcess.ErrorDataReceived += new DataReceivedEventHandler(NetErrorDataHandler); //note this event handler add
if (errorRedirect) //in your case, it is not needed
{
// Start the asynchronous read of the standard
// error stream.
netProcess.BeginErrorReadLine(); //note this
}
And how to define the event handler:
private static void NetErrorDataHandler(object sendingProcess,
DataReceivedEventArgs errLine)
{
// Write the error text to the file if there is something
// to write and an error file has been specified.
if (!String.IsNullOrEmpty(errLine.Data))
{
if (!errorsWritten)
{
if (streamError == null)
{
// Open the file.
try
{
streamError = new StreamWriter(netErrorFile, true);
}
catch (Exception e)
{
Console.WriteLine("Could not open error file!");
Console.WriteLine(e.Message.ToString());
}
}
if (streamError != null)
{
// Write a header to the file if this is the first
// call to the error output handler.
streamError.WriteLine();
streamError.WriteLine(DateTime.Now.ToString());
streamError.WriteLine("Net View error output:");
}
errorsWritten = true;
}
if (streamError != null)
{
// Write redirected errors to the file.
streamError.WriteLine(errLine.Data);
streamError.Flush();
}
}
}
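Applied to the RunCmdPromptCommand method from the question, one way to avoid blocking on a single stream is to drain standard output and standard error at the same time, so neither pipe can fill up while the caller is waiting on the other. This is a minimal sketch under that assumption, not the MSDN sample: StreamDrainer, ReadBoth, and RunCmd are hypothetical names, and whether it resolves the specific hang depends on what actually causes it.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class StreamDrainer
{
    // Reads both readers to the end concurrently and returns (stdout, stderr).
    public static Tuple<string, string> ReadBoth(TextReader stdout, TextReader stderr)
    {
        Task<string> outTask = Task.Run(() => stdout.ReadToEnd());
        Task<string> errTask = Task.Run(() => stderr.ReadToEnd());
        Task.WaitAll(outTask, errTask);
        return Tuple.Create(outTask.Result, errTask.Result);
    }

    // Hypothetical replacement for RunCmdPromptCommand (Windows only: uses cmd.exe).
    public static string RunCmd(string command)
    {
        var startInfo = new ProcessStartInfo("cmd.exe", "/C " + command)
        {
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardOutput = true,
            RedirectStandardError = true
        };
        using (Process process = Process.Start(startInfo))
        {
            Tuple<string, string> result =
                ReadBoth(process.StandardOutput, process.StandardError);
            process.WaitForExit();
            if (!string.IsNullOrEmpty(result.Item2)) throw new Exception(result.Item2);
            return result.Item1;
        }
    }
}
```

Note that each cmd.exe /C invocation is a separate process, so a cd issued in one call does not carry over to the next; chaining with & as in the working version of GetFiles is still needed.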

How to refresh ErrorDataReceived to process faster?

I'm using sox.exe to play some audio files.
This is how I'm calling it:
SoxPlayer = new Process
{
StartInfo = new ProcessStartInfo
{
CreateNoWindow = true,
RedirectStandardError = true,
UseShellExecute = false,
FileName = Play,
Arguments = arg,
WorkingDirectory = Application.StartupPath + "\\bin\\"
}
};
and this is the code that should be interpreting the StandardError output:
private void UpdatePlaybackTime(string output)
{
if (string.IsNullOrEmpty(output)) return;
if (!output.Contains("%") || !output.Contains("[")) return;
var index1 = output.IndexOf("%", StringComparison.Ordinal) + 1;
var index2 = output.IndexOf("[", StringComparison.Ordinal);
var time = output.Substring(index1, index2 - index1).Trim();
var times = time.Split(new[] { ":" }, StringSplitOptions.None);
var seconds = Convert.ToDouble(times[0]) * 3600;
seconds = seconds + (Convert.ToDouble(times[1]) * 60);
seconds = seconds + (Convert.ToDouble(times[2]));
if (seconds == 0 || seconds < PlaybackSeconds) return;
PlaybackSeconds = seconds;
}
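Separately from the buffering problem, the parser above relies on Convert.ToDouble, which uses the current culture; on a machine whose decimal separator is a comma, a value such as 30.50 may not parse as intended. The variant below is a sketch that parses with the invariant culture and returns false instead of throwing on unexpected input. SoxStatus and TryParseElapsedSeconds are hypothetical names, and the sample line in the usage note is an assumed shape of the sox status output, not captured output.

```csharp
using System;
using System.Globalization;

public static class SoxStatus
{
    // Extracts the hh:mm:ss.ff text between '%' and '[' and converts it to seconds.
    public static bool TryParseElapsedSeconds(string line, out double seconds)
    {
        seconds = 0;
        if (string.IsNullOrEmpty(line)) return false;

        int start = line.IndexOf('%');
        int end = line.IndexOf('[');
        if (start < 0 || end <= start) return false;

        string[] parts = line.Substring(start + 1, end - start - 1).Trim().Split(':');
        if (parts.Length != 3) return false;

        double h, m, s;
        NumberStyles style = NumberStyles.Float;
        CultureInfo invariant = CultureInfo.InvariantCulture;
        if (!double.TryParse(parts[0], style, invariant, out h)) return false;
        if (!double.TryParse(parts[1], style, invariant, out m)) return false;
        if (!double.TryParse(parts[2], style, invariant, out s)) return false;

        seconds = h * 3600 + m * 60 + s;
        return true;
    }
}
```

For an assumed line such as "In:12.5% 00:01:30.50 [00:03:29.50] Out:3.97M", the method yields 90.5 seconds; UpdatePlaybackTime could call it and keep its existing check against PlaybackSeconds.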
My goal is to get the playback time from the sox output as accurately as possible, rather than work (as I was doing before) with an internal timer that may lose sync with sox's own.
My first attempt was using this recommendation I found online:
SoxPlayer.ErrorDataReceived += (sender, args) => UpdatePlaybackTime(args.Data);
SoxPlayer.Start();
SoxPlayer.BeginErrorReadLine();
This current code "works" in that I get the information I want, but it seems like UpdatePlaybackTime() is being called every 5 seconds or so. When it's called, the info obtained is accurate, but obviously I want to update the playback info several times per second, not every 5 seconds.
My understanding is that what is happening is that UpdatePlaybackTime is being called when the StandardError buffer gets full. I've tried calling SoxPlayer.BeginErrorReadLine() with my player timer but it says it's already running asynchronously. I've tried SoxPlayer.StandardError.DiscardBufferedData() but it throws an exception because of the asynchronous process that is ongoing.
So, how can I manage to capture the playback information how I need? Thank you in advance!
EDIT:
After discussing this code and how it's not working because of buffering, I've also tried the following inside a separate BackgroundWorker thread, with the same result (i.e. updates only about every 5 seconds):
SoxPlayer.Start();
SoxTimer.RunWorkerAsync();
private void SoxTimer_DoWork(object sender, System.ComponentModel.DoWorkEventArgs e)
{
var sr = new StreamReader(SoxPlayer.StandardError.BaseStream);
while (sr.Peek() > 0)
{
var line = sr.ReadLine();
UpdatePlaybackTime(line);
}
}
private void SoxTimer_RunWorkerCompleted(object sender, System.ComponentModel.RunWorkerCompletedEventArgs e)
{
if (!SoxPlayer.HasExited)
{
SoxTimer.RunWorkerAsync();
}
}
When this BackgroundWorker completes, it checks if SoxPlayer.HasExited, and if it hasn't, it runs again. This has the same effect as my first attempt. PlaybackSeconds is only getting updated about every 5 seconds, at which point it updates to the right time, and then the rest of the code that acts based on the PlaybackSeconds value works as well.
I also tried achieving the same by creating a Thread to read the StandardError output. Every attempt gives the same result: a delay of about 5 seconds between calls to UpdatePlaybackTime(). When it is called, it iterates through all the output that was sent to StandardError since the last read, so it updates the PlaybackSeconds value very quickly in small increments and leaves it at the current value at that time. But as far as the user is concerned, there is still only one update every 5 seconds.
The SoX creators are adamant that the problem is not on their end. When SoX plays in a console window, output is constant (every 0.1 seconds, according to them). If I tell SoX to redirect its standard error to a text file, the same happens: the information in the text file is updated constantly. Yet when I read the StandardError stream itself, I get no acceptable results, and I have now spent the better part of two days on it.
Thank you for your help.
EDIT 2:
Following Peter's advice below, here's a brand new project. Didn't even change the default names for anything. Same behavior as described so far. So I'm going back to blame (ahem, discuss with) the SoX peeps.
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;
namespace WindowsFormsApplication1
{
public partial class Form1 : Form
{
private Process SoxPlayer;
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
var bin = Application.StartupPath + "\\bin\\";
SoxPlayer = new Process
{
StartInfo = new ProcessStartInfo
{
CreateNoWindow = true,
RedirectStandardError = true,
UseShellExecute = false,
FileName = bin + "sox.exe",
Arguments = "song.ogg -c 2 -d",
WorkingDirectory = bin
}
};
SoxPlayer.Start();
var thread = new Thread(() =>
{
int cch;
var rgch = new char[1];
while ((cch = SoxPlayer.StandardError.Read(rgch, 0, rgch.Length)) > 0)
{
var cch1 = cch;
label1.Invoke(new MethodInvoker(() => label1.Text = new string(rgch, 0, cch1)));
}
});
thread.Start();
}
private void button2_Click(object sender, EventArgs e)
{
SoxPlayer.Kill();
}
}
}
Here is a simple code example that does not reproduce the behavior you describe:
class Program
{
static void Main(string[] args)
{
Process process = new Process();
process.StartInfo.FileName = "SimpleStderrWriter.exe";
process.StartInfo.UseShellExecute = false;
process.StartInfo.RedirectStandardError = true;
process.StartInfo.RedirectStandardInput = true;
process.Start();
Thread thread = new Thread(() =>
{
int cch;
char[] rgch = new char[1];
Console.WriteLine("Reading stderr from process");
while ((cch = process.StandardError.Read(rgch, 0, rgch.Length)) > 0)
{
Console.Write(new string(rgch, 0, cch));
}
Console.WriteLine();
Console.WriteLine("Process exited");
});
Console.WriteLine("Press Enter to terminate process");
thread.Start();
Console.ReadLine();
process.StandardInput.WriteLine();
thread.Join();
}
}
Here is the code for the SimpleStderrWriter.exe program:
class Program
{
static void Main(string[] args)
{
bool exit = false;
Thread thread = new Thread(() =>
{
while (!Volatile.Read(ref exit))
{
Console.Error.Write('.');
Thread.Sleep(250);
}
});
thread.Start();
Console.ReadLine();
Volatile.Write(ref exit, true);
thread.Join();
}
}
This code example demonstrates clearly, by receiving and re-emitting the child process's stderr output as quickly as it's generated, that there is nothing in .NET that by default would cause the delay you are experiencing. The obvious conclusion is that either your advisors with respect to SoX are wrong and it does some buffering for some reason, or that you yourself have added something to your code that introduces the delay you are experiencing.
If you are positive the latter is not the case, then you need to go back to your SoX advisor and explain to them that they are mistaken. If you are positive that the SoX advisor is correct, then you need to post an example similar to the above, but which does reproduce the delay you are experiencing.

Stop thread until enough memory is available

Environment : .net 4.0
I have a task that transforms XML files with an XSLT stylesheet. Here is my code:
public string TransformFileIntoTempFile(string xsltPath,
string xmlPath)
{
var transform = new MvpXslTransform();
transform.Load(xsltPath, new XsltSettings(true, false),
new XmlUrlResolver());
string tempPath = Path.GetTempFileName();
using (var writer = new StreamWriter(tempPath))
{
using (XmlReader reader = XmlReader.Create(xmlPath))
{
transform.Transform(new XmlInput(reader), null,
new XmlOutput(writer));
}
}
return tempPath;
}
I have X threads that can launch this task in parallel.
Sometimes my input files are about 300 MB, sometimes only a few MB.
My problem: I get an OutOfMemoryException when my program tries to transform several big XML files at the same time.
How can I avoid these OutOfMemoryExceptions? My idea is to pause a thread before it executes the task until enough memory is available, but I don't know how to do that. Or is there some other solution (like running my task in a separate application)?
Thanks
I don't recommend blocking a thread. In the worst case, you'll just end up starving the task that could potentially free the memory you need, leading to deadlock or very bad performance in general.
Instead, I suggest you keep a work queue with priorities. Schedule the tasks from the queue fairly across a thread pool. Make sure no thread ever blocks on a wait operation; instead, repost the task to the queue (with a lower priority).
So what you'd do (e.g. on receiving an OutOfMemoryException) is post the same job/task onto the queue and terminate the current task, freeing up the thread for another task.
A simplistic approach is to use a FIFO queue, which ensures that a task reposted to the queue will have 'lower priority' than any other jobs already on that queue.
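The repost-instead-of-block idea could be sketched roughly as follows. This is a minimal illustration under assumptions, not the answerer's code: the `RepostingQueue` class, its `Run` method, and the `maxAttempts` limit are hypothetical names introduced here. Note also that catching OutOfMemoryException does not guarantee the process is still in a healthy state, so treat this as a sketch of the queueing pattern only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper (not from the answer): jobs that fail with
// OutOfMemoryException are put back at the end of a FIFO queue
// instead of blocking the worker thread.
public static class RepostingQueue
{
    private sealed class Job
    {
        public Action Work;
        public int Attempts;
    }

    // Runs every job on workerCount threads and returns the number of
    // jobs abandoned after maxAttempts failed attempts.
    public static int Run(IEnumerable<Action> work, int workerCount, int maxAttempts)
    {
        var queue = new ConcurrentQueue<Job>();
        foreach (Action w in work)
        {
            queue.Enqueue(new Job { Work = w });
        }

        int abandoned = 0;
        var workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = new Thread(() =>
            {
                Job job;
                while (queue.TryDequeue(out job))
                {
                    try
                    {
                        job.Work();
                    }
                    catch (OutOfMemoryException)
                    {
                        // Repost at the back of the queue so that jobs already
                        // waiting run first; give up after maxAttempts.
                        if (++job.Attempts < maxAttempts)
                        {
                            queue.Enqueue(job);
                        }
                        else
                        {
                            Interlocked.Increment(ref abandoned);
                        }
                    }
                }
            });
            workers[i].Start();
        }

        foreach (Thread t in workers)
        {
            t.Join();
        }
        return abandoned;
    }
}
```

The attempt limit is there because a job that is retried while nothing else is running will fail again for the same reason; without a limit it would be reposted forever.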
Since .NET Framework 4 there has been a managed API for memory-mapped files, a feature that has been available in the Win32 API for many years, so you can now use it from managed code.
The "persisted memory-mapped files" option is the better fit for your task.
MSDN:
Persisted files are memory-mapped files that are associated with a
source file on a disk. When the last process has finished working with
the file, the data is saved to the source file on the disk. These
memory-mapped files are suitable for working with extremely large
source files.
The documentation page for the MemoryMappedFile.CreateFromFile() method has a nice example showing how to create memory-mapped views of an extremely large file.
EDIT: Update in response to the notes in the comments
I just found the method MemoryMappedFile.CreateViewStream(), which creates a stream of type MemoryMappedViewStream, which inherits from System.IO.Stream.
I believe you can create an instance of XmlReader from this stream and then instantiate your custom implementation of the XslTransform using this reader/stream.
EDIT2: remi bourgarel (OP) has already tested this approach, and it looks like this particular XslTransform implementation (I wonder whether ANY would) won't work with a memory-mapped view stream in the way that was intended.
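For reference, wiring a memory-mapped view stream into an XmlReader could look roughly like the sketch below. The `MappedXml` class and `CountElements` method are hypothetical names used only for illustration. The view is created with an explicit size equal to the file length, on the assumption that a default-sized view may be rounded up to a page boundary and expose trailing zero bytes that an XML parser would reject. This only shows how the reader is created; an XSLT processor may still build a full in-memory document from that reader, which would be consistent with the OP's result.

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Xml;

// Hypothetical helper (not from the answer): reads an XML file through a
// read-only memory-mapped view and counts its elements.
public static class MappedXml
{
    public static int CountElements(string xmlPath)
    {
        // Bound the view to the real file length so the reader never sees
        // padding bytes past the end of the document.
        long length = new FileInfo(xmlPath).Length;

        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(
            xmlPath, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (MemoryMappedViewStream view = mmf.CreateViewStream(
            0, length, MemoryMappedFileAccess.Read))
        using (XmlReader reader = XmlReader.Create(view))
        {
            int count = 0;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    count++;
                }
            }
            return count;
        }
    }
}
```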
The main problem is that you are loading the entire Xml file. If you were to just transform-as-you-read the out of memory problem should not normally appear.
That being said I found a MS support article which suggests how it can be done:
http://support.microsoft.com/kb/300934
Disclaimer: I did not test this so if you use it and it works please let us know.
You could consider using a queue to throttle how many concurrent transforms are being done based on some sort of artificial memory boundary e.g. file size. Something like the following could be used.
This sort of throttling strategy can be combined with maximum number of concurrent files being processed to ensure your disk is not being thrashed too much.
NB: I have not included the necessary try/catch/finally around execution to ensure that exceptions are propagated to the calling thread and that wait handles are always released. I could go into further detail here.
public static class QueuedXmlTransform
{
private const int MaxBatchSizeMB = 300;
private const double MB = (1024 * 1024);
private static readonly object SyncObj = new object();
private static readonly TaskQueue Tasks = new TaskQueue();
private static readonly Action Join = () => { };
private static double _CurrentBatchSizeMb;
public static string Transform(string xsltPath, string xmlPath)
{
string tempPath = Path.GetTempFileName();
using (AutoResetEvent transformedEvent = new AutoResetEvent(false))
{
Action transformTask = () =>
{
MvpXslTransform transform = new MvpXslTransform();
transform.Load(xsltPath, new XsltSettings(true, false),
new XmlUrlResolver());
using (StreamWriter writer = new StreamWriter(tempPath))
using (XmlReader reader = XmlReader.Create(xmlPath))
{
transform.Transform(new XmlInput(reader), null,
new XmlOutput(writer));
}
transformedEvent.Set();
};
double fileSizeMb = new FileInfo(xmlPath).Length / MB;
lock (SyncObj)
{
if ((_CurrentBatchSizeMb += fileSizeMb) > MaxBatchSizeMB)
{
_CurrentBatchSizeMb = fileSizeMb;
Tasks.Queue(isParallel: false, task: Join);
}
Tasks.Queue(isParallel: true, task: transformTask);
}
transformedEvent.WaitOne();
}
return tempPath;
}
private class TaskQueue
{
private readonly object _syncObj = new object();
private readonly Queue<QTask> _tasks = new Queue<QTask>();
private int _runningTaskCount;
public void Queue(bool isParallel, Action task)
{
lock (_syncObj)
{
_tasks.Enqueue(new QTask { IsParallel = isParallel, Task = task });
}
ProcessTaskQueue();
}
private void ProcessTaskQueue()
{
lock (_syncObj)
{
if (_runningTaskCount != 0) return;
while (_tasks.Count > 0 && _tasks.Peek().IsParallel)
{
QTask parallelTask = _tasks.Dequeue();
QueueUserWorkItem(parallelTask);
}
if (_tasks.Count > 0 && _runningTaskCount == 0)
{
QTask serialTask = _tasks.Dequeue();
QueueUserWorkItem(serialTask);
}
}
}
private void QueueUserWorkItem(QTask qTask)
{
Action completionTask = () =>
{
qTask.Task();
OnTaskCompleted();
};
_runningTaskCount++;
ThreadPool.QueueUserWorkItem(_ => completionTask());
}
private void OnTaskCompleted()
{
lock (_syncObj)
{
if (--_runningTaskCount == 0)
{
ProcessTaskQueue();
}
}
}
private class QTask
{
public Action Task { get; set; }
public bool IsParallel { get; set; }
}
}
}
Update
Fixed bug in maintaining batch size when rolling over to next batch window:
_CurrentBatchSizeMb = fileSizeMb;
