Error reading two dates from a binary file - C#

When reading two dates from a binary file I'm seeing the error below:
"The output char buffer is too small to contain the decoded characters, encoding 'Unicode (UTF-8)' fallback 'System.Text.DecoderReplacementFallback'. Parameter name: chars"
My code is below:
static DateTime[] ReadDates()
{
    System.IO.FileStream appData = new System.IO.FileStream(
        appDataFile, System.IO.FileMode.Open, System.IO.FileAccess.Read);
    List<DateTime> result = new List<DateTime>();
    using (System.IO.BinaryReader br = new System.IO.BinaryReader(appData))
    {
        while (br.PeekChar() > 0)
        {
            result.Add(new DateTime(br.ReadInt64()));
        }
        br.Close();
    }
    return result.ToArray();
}
static void WriteDates(IEnumerable<DateTime> dates)
{
    System.IO.FileStream appData = new System.IO.FileStream(
        appDataFile, System.IO.FileMode.Create, System.IO.FileAccess.Write);
    using (System.IO.BinaryWriter bw = new System.IO.BinaryWriter(appData))
    {
        foreach (DateTime date in dates)
            bw.Write(date.Ticks);
        bw.Close();
    }
}
What could be the cause? Thanks

The problem is that you're using PeekChar - that's trying to decode binary data as if it were a UTF-8 character. Unfortunately, I can't see anything else in BinaryReader which allows you to detect the end of the stream.
You could just keep calling ReadInt64 until it throws an EndOfStreamException, but that's pretty horrible. Hmm. You could call ReadBytes(8) and then BitConverter.ToInt64 - that would allow you to stop when ReadBytes returns a byte array with anything less than 8 bytes... it's not great though.
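A minimal sketch of that ReadBytes approach, reusing appDataFile from the question (untested):
static DateTime[] ReadDates()
{
    List<DateTime> result = new List<DateTime>();
    using (var br = new System.IO.BinaryReader(
        System.IO.File.OpenRead(appDataFile)))
    {
        byte[] chunk;
        // ReadBytes returns fewer than 8 bytes only when the stream runs out
        while ((chunk = br.ReadBytes(8)).Length == 8)
        {
            result.Add(new DateTime(BitConverter.ToInt64(chunk, 0)));
        }
    }
    return result.ToArray();
}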
By the way, you don't need to call Close explicitly as you're already using a using statement. (That goes for both the reader and the writer.)

I think Jon is correct that it's PeekChar that chokes on the binary data.
Instead of streaming the data, you could get it all as an array and get the values from that:
static DateTime[] ReadDates()
{
    List<DateTime> result = new List<DateTime>();
    byte[] data = File.ReadAllBytes(appDataFile);
    // assumes the file length is a multiple of 8 (one Int64 of ticks per date)
    for (int i = 0; i < data.Length; i += 8)
    {
        result.Add(new DateTime(BitConverter.ToInt64(data, i)));
    }
    return result.ToArray();
}

A simple solution to your problem would be to explicitly specify ASCII encoding for the BinaryReader, that way PeekChar() uses only a single byte and this kind of exception (a .NET bug, actually) doesn't happen:
using (System.IO.BinaryReader br = new System.IO.BinaryReader(appData, Encoding.ASCII))
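Alternatively, if you'd rather not involve character decoding at all, you can compare the underlying stream's position against its length (a sketch, reusing the question's appData and result; valid here because FileStream is seekable):
using (System.IO.BinaryReader br = new System.IO.BinaryReader(appData))
{
    while (br.BaseStream.Position < br.BaseStream.Length)
    {
        result.Add(new DateTime(br.ReadInt64()));
    }
}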

Related

Reading a DAT file with BinaryReader in C#

I'm trying to read a DAT file with BinaryReader but I get an exception and don't see why. "Unable to read beyond the end of the stream" is the message I get. My code looks like this:
private void button1_Click(object sender, EventArgs e)
{
    OpenFileDialog openFileDialog = new OpenFileDialog();
    openFileDialog.Title = "Open File...";
    openFileDialog.Filter = "Binary File (*.dat)|*.dat";
    openFileDialog.InitialDirectory = @"C:\";
    if (openFileDialog.ShowDialog() == DialogResult.OK)
    {
        FileStream fs = new FileStream(openFileDialog.FileName, FileMode.Open);
        BinaryReader br = new BinaryReader(fs);
        label1.Text = br.ReadString();
        label2.Text = br.ReadInt32().ToString();
        fs.Close();
        br.Close();
    }
}
I had a certain DAT file with a lot of information and was hoping to be able to read it out, maybe even place it in a table and plot the data. But it's been a while since I worked with C#, so if anyone could help me I would highly appreciate it.
BinaryReader is very rarely a good choice for reading an external file; it only really makes sense when used in parallel with code that writes the file using BinaryWriter, since they use the same conventions.
I imagine that what is happening here is that your call to ReadString is trying to use conventions that aren't valid for your file - specifically, it will read a length prefix for the string (a 7-bit-encoded integer), and then try to read that many bytes as the string. But if that isn't what the file contents are, it could easily read gibberish, interpret it as a huge number, and then fail to read that many bytes.
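To see the convention in action, a round trip that does work because both sides agree (a sketch with a made-up file name):
// BinaryWriter writes a 7-bit-encoded length prefix before the string bytes...
using (var bw = new BinaryWriter(File.Create("sample.bin")))
{
    bw.Write("hello");
    bw.Write(42);
}
// ...so ReadString/ReadInt32 can reverse them, but only for files written this way
using (var br = new BinaryReader(File.OpenRead("sample.bin")))
{
    string s = br.ReadString(); // "hello"
    int i = br.ReadInt32();     // 42
}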
If you're processing an arbitrary file (nothing to do with BinaryWriter), then you really need to know a lot about the protocol/format. Given that file extensions are ambiguous, I'm not going to infer anything from ".dat" - what matters is: what is the data and where did it come from? Only with that information can a sensible comment on reading it be made.
From the comments, here's some (untested) code that should get you started in terms of parsing the contents as a span:
using System;
using System.Buffers; // for ArrayPool<T>
using System.IO;

public static YourResultType Process(string path)
{
    byte[] oversized = null;
    try
    {
        int len, offset = 0, read;
        // read the file into a leased buffer, for simplicity
        using (var stream = File.OpenRead(path))
        {
            len = checked((int)stream.Length);
            oversized = ArrayPool<byte>.Shared.Rent(len);
            while (offset < len &&
                   (read = stream.Read(oversized, offset, len - offset)) > 0)
            {
                offset += read;
            }
        }
        // now process the payload from the buffered data;
        // the rented buffer may be longer than len, so slice to len
        return Process(new ReadOnlySpan<byte>(oversized, 0, len));
    }
    finally
    {
        if (oversized is object)
            ArrayPool<byte>.Shared.Return(oversized);
    }
}

private static YourResultType Process(ReadOnlySpan<byte> payload)
    => throw new NotImplementedException(); // your code here
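Purely as an illustration of parsing the span once you do know the layout; this hypothetically assumes a 4-byte little-endian length prefix followed by that many ASCII bytes, which may well not match your file:
private static string ProcessExample(ReadOnlySpan<byte> payload)
{
    // hypothetical layout: 4-byte little-endian length, then ASCII bytes
    int len = System.Buffers.Binary.BinaryPrimitives.ReadInt32LittleEndian(payload);
    return System.Text.Encoding.ASCII.GetString(payload.Slice(4, len).ToArray());
}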

Read the first n characters of a big text file - C#

I have a very big text file, for example about 1 GB. I need to read just the first 100 characters and nothing more.
I searched Stack Overflow and other forums, but all of the solutions there first read the whole file and then return the first n characters of it.
I do not want to read and load the whole file into memory; I just need the first characters.
You can use StreamReader.ReadBlock() to read a specified number of characters from a file:
public static char[] ReadChars(string filename, int count)
{
    using (var stream = File.OpenRead(filename))
    using (var reader = new StreamReader(stream, Encoding.UTF8))
    {
        char[] buffer = new char[count];
        int n = reader.ReadBlock(buffer, 0, count);
        char[] result = new char[n];
        Array.Copy(buffer, result, n);
        return result;
    }
}
Note that this assumes your file has UTF-8 encoding. If it doesn't, you'll need to specify the correct encoding (in which case you could add an encoding parameter to ReadChars() rather than hard-coding it).
The advantage of using ReadBlock() rather than Read() is that it blocks until either all the requested characters have been read or the end of the file has been reached. However, for a FileStream this is of no consequence; just be aware that in the general case Read() can return fewer characters than asked for, even if the end of the stream has not been reached.
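For comparison, achieving the same with Read() would need a manual accumulation loop, something like this sketch (same reader, buffer and count as above):
int total = 0;
while (total < count)
{
    int n = reader.Read(buffer, total, count - total);
    if (n == 0) break; // end of file reached
    total += n;
}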
If you want an async version you can just call ReadBlockAsync() like so:
public static async Task<char[]> ReadCharsAsync(string filename, int count)
{
    using (var stream = File.OpenRead(filename))
    using (var reader = new StreamReader(stream, Encoding.UTF8))
    {
        char[] buffer = new char[count];
        int n = await reader.ReadBlockAsync(buffer, 0, count);
        char[] result = new char[n];
        Array.Copy(buffer, result, n);
        return result;
    }
}
Which you might call like so:
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

namespace Demo
{
    static class Program
    {
        static async Task Main()
        {
            string filename = "Your filename here";
            Console.WriteLine(await ReadCharsAsync(filename, 100));
        }
    }
}
Let's read with StreamReader:
char[] buffer = new char[100];
using (StreamReader reader = new StreamReader(@"c:\MyFile.txt"))
{
    // Technically, StreamReader can read fewer than buffer.Length characters
    // if the file is too short; in this case reader.Read returns the number
    // of characters actually read
    reader.Read(buffer, 0, buffer.Length);
}
FileStream.Read() does not read all the bytes at once; it reads some number of bytes and returns the number of bytes read. MSDN has a good example of how to use it:
http://msdn.microsoft.com/en-us/library/system.io.filestream.read.aspx
Reading the entire 1 GB of data into memory is really going to put a strain on your client's system; the preferred option would be to optimize it so that you don't need the whole file all at once.
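For completeness, a minimal sketch of that byte-level approach, reading only the first 100 bytes of a file at an assumed path (note this counts bytes, not characters, so it only lines up with a character count for single-byte encodings such as ASCII):
byte[] buffer = new byte[100];
int total = 0;
using (FileStream fs = File.OpenRead(path))
{
    int n;
    // Read may return fewer bytes than requested, so loop until full or EOF
    while (total < buffer.Length &&
           (n = fs.Read(buffer, total, buffer.Length - total)) > 0)
    {
        total += n;
    }
}
string text = Encoding.ASCII.GetString(buffer, 0, total);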

ToArray() function limitation

I am using the .ToArray() method to convert my string to a char array, whose size I have set with char[] buffer = new char[1000000]; but when I use the following code:
using (StreamReader streamReader = new StreamReader(path1))
{
    buffer = streamReader.ReadToEnd().ToCharArray();
}
// buffer = result.ToArray();
threadfunc(data_path1);
The size of the buffer ends up fixed at 8190, and it does not read the whole file after using .ToCharArray() or .ToArray().
What is the reason for this? Do .ToCharArray() or .ToArray() have size limitations? If I do not use these functions I am able to read the whole file as a string, but when I try to convert it into a char array with them I run into this size limit.
My guess is that the problem is ReadToEnd() needs to finish before you call ToCharArray(). This might help you. You don't need to pre-define buffer, since ToCharArray() creates a new char[] instance itself:
string content;
using (StreamReader streamReader = new StreamReader(path1))
{
    content = streamReader.ReadToEnd();
}
var buffer = content.ToCharArray();
ToCharArray() returns a new array instance, so your buffer will refer to the new instance, whose size matches the data returned by ReadToEnd.
If you want to keep buffer at the same size, copy the new array into the existing one:
char[] buffer = new char[1000000];
using (StreamReader streamReader = new StreamReader(path1))
{
    var tempArray = streamReader.ReadToEnd().ToCharArray();
    // note: CopyTo throws if tempArray is longer than buffer
    tempArray.CopyTo(buffer, 0);
}
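One caveat with that copy: CopyTo throws if the file yields more characters than the pre-sized buffer, so a guarded variant (a sketch, keeping the same names) might be:
int n = Math.Min(tempArray.Length, buffer.Length);
Array.Copy(tempArray, buffer, n); // copies at most buffer.Length characters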
If you just want to use the resulting array, you don't need to "predict" the size of the array - just use the one returned:
public char[] GetArrayFromFile(string pathToFile)
{
    string data;
    using (StreamReader streamReader = new StreamReader(pathToFile))
    {
        data = streamReader.ReadToEnd();
    }
    return data.ToCharArray();
}
var arrayFromFile = GetArrayFromFile(@"..\path.file");
You are probably using an incorrect encoding. By default, StreamReader(String) uses UTF-8 encoding:
"The complete file path is specified by the path parameter. This constructor initializes the encoding to UTF8Encoding and the buffer size to 1024 bytes."
Don't pre-allocate the buffer size, unless you have a specific need.
If your file is in ASCII format, you need to update your StreamReader constructor:
char[] buffer = null;
using (StreamReader streamReader = new StreamReader(path1, Encoding.ASCII))
{
    buffer = streamReader.ReadToEnd().ToCharArray();
}
// buffer = result.ToArray();
threadfunc(data_path1);
Does your file contain binary data? If it contains an EOF character and the stream is opened in text mode (which StreamReader does), that character will signal the end of the file, even if it is not actually the end of the file.
I can reproduce this by reading random .exe files in text mode.

GZipStream - write not writing all compressed data even with flush?

I've got a pesky problem with GZipStream targeting .NET 3.5. This is my first time working with GZipStream; however, I have modeled after a number of tutorials, including here, and I'm still stuck.
My app serializes a DataTable to XML and inserts it into a database, storing the compressed data in a varbinary(max) field along with the original length of the uncompressed buffer. Then, when I need it, I retrieve this data, decompress it, and recreate the DataTable. The decompression is what seems to fail.
EDIT: Sadly, after changing GetBuffer to ToArray as suggested, my issue remains. Code updated below.
Compress code:
DataTable dt = new DataTable("MyUnit");
//do stuff with dt
//okay... now compress the table
using (MemoryStream xmlstream = new MemoryStream())
{
    //instead of stream, use xmlwriter?
    System.Xml.XmlWriterSettings settings = new System.Xml.XmlWriterSettings();
    settings.Encoding = Encoding.GetEncoding(1252);
    settings.Indent = false;
    System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(xmlstream, settings);
    try
    {
        dt.WriteXml(writer);
        writer.Flush();
    }
    catch (ArgumentException)
    {
        //likely an encoding issue... okay, base64 encode it
        var base64 = Convert.ToBase64String(xmlstream.ToArray());
        xmlstream.Write(Encoding.GetEncoding(1252).GetBytes(base64), 0,
            Encoding.GetEncoding(1252).GetBytes(base64).Length);
    }
    using (MemoryStream zipstream = new MemoryStream())
    {
        GZipStream zip = new GZipStream(zipstream, CompressionMode.Compress);
        log.DebugFormat("Compressing commands...");
        zip.Write(xmlstream.GetBuffer(), 0, xmlstream.ToArray().Length);
        zip.Flush();
        float ratio = (float)zipstream.ToArray().Length / (float)xmlstream.ToArray().Length;
        log.InfoFormat("Resulting compressed size is {0:P2} of original", ratio);
        using (SqlCommand cmd = new SqlCommand())
        {
            cmd.CommandText = "INSERT INTO tinydup (lastid, command, compressedlength) VALUES (@lastid, @compressed, @length)";
            cmd.Connection = db;
            cmd.Parameters.Add("@lastid", SqlDbType.Int).Value = lastid;
            cmd.Parameters.Add("@compressed", SqlDbType.VarBinary).Value = zipstream.ToArray();
            cmd.Parameters.Add("@length", SqlDbType.Int).Value = xmlstream.ToArray().Length;
            cmd.ExecuteNonQuery();
        }
    }
}
Decompress Code:
/* This is an encapsulation of what I get from the database
public class DupUnit
{
    public uint lastid;
    public uint complength;
    public byte[] compressed;
}*/
//I have already retrieved my list of work to do from the database in a List<DupUnit> dupunits
foreach (DupUnit unit in dupunits)
{
    DataSet ds = new DataSet();
    //DataTable dt = new DataTable();
    //uncompress and extract to original datatable
    try
    {
        using (MemoryStream zipstream = new MemoryStream(unit.compressed))
        {
            GZipStream zip = new GZipStream(zipstream, CompressionMode.Decompress);
            byte[] xmlbits = new byte[unit.complength];
            //WHY ARE YOU ALWAYS 0!!!!!!!!
            int bytesdecompressed = zip.Read(xmlbits, 0, unit.compressed.Length);
            MemoryStream xmlstream = new MemoryStream(xmlbits);
            log.DebugFormat("Uncompressed XML against {0} is: {1}", m_source.DSN,
                Encoding.GetEncoding(1252).GetString(xmlstream.ToArray()));
            try
            {
                ds.ReadXml(xmlstream);
            }
            catch (Exception)
            {
                //it may have been base64 encoded... decode first.
                ds.ReadXml(Encoding.GetEncoding(1254).GetString(
                    Convert.FromBase64String(
                        Encoding.GetEncoding(1254).GetString(xmlstream.ToArray())))
                );
            }
            xmlstream.Dispose();
        }
    }
    catch (Exception e)
    {
        log.Error(e);
        Thread.Sleep(1000); //sleep a sec!
        continue;
    }
}
Note the comment above... bytesdecompressed is always 0. Any ideas? Am I doing it wrong?
EDIT 2:
So this is weird. I added the following debug code to the decompression routine:
GZipStream zip = new GZipStream(zipstream, CompressionMode.Decompress);
byte[] xmlbits = new byte[unit.complength];
int offset = 0;
while (zip.CanRead && offset < xmlbits.Length)
{
    while (zip.Read(xmlbits, offset, 1) == 0) ;
    offset++;
}
When debugging, sometimes that loop would complete, but other times it would hang. When I'd stop the debugging, it would be at byte 1600 out of 1616. I'd continue, but it wouldn't move at all.
EDIT 3: The bug appears to be in the compress code. For whatever reason, it is not saving all of the data. When I try to decompress the data using a third party gzip mechanism, I only get part of the original data.
I'd start a bounty, but I really don't have much reputation to give as of now :-(
Finally found the answer. The compressed data wasn't complete because GZipStream.Flush() does absolutely nothing to ensure that all of the data is out of the buffer; you need to use GZipStream.Close(), as pointed out here. Of course, if you get a bad compress, it all goes downhill: if you try to decompress it, you will always get 0 returned from Read().
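For reference, a minimal sketch of the compress step with the GZipStream closed (via using) before the MemoryStream is read; the Compress name and xmlBytes parameter are illustrative, and ToArray() still works on a disposed MemoryStream:
byte[] Compress(byte[] xmlBytes)
{
    using (MemoryStream zipstream = new MemoryStream())
    {
        using (GZipStream zip = new GZipStream(zipstream, CompressionMode.Compress))
        {
            zip.Write(xmlBytes, 0, xmlBytes.Length);
        } // Close/Dispose here is what flushes the final compressed block
        return zipstream.ToArray();
    }
}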
I'd say this line, at least, is the most wrong:
cmd.Parameters.Add("#compressed", SqlDbType.VarBinary).Value = zipstream.GetBuffer();
MemoryStream.GetBuffer:
Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method.
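A quick illustration of the difference, assuming nothing beyond the documented behavior:
MemoryStream ms = new MemoryStream();
byte[] test = Encoding.ASCII.GetBytes("test");
ms.Write(test, 0, test.Length);
byte[] raw = ms.GetBuffer(); // the whole internal buffer: its length is the capacity (e.g. 256), mostly unused
byte[] data = ms.ToArray();  // exactly 4 bytes: only the data actually written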
It should be noted that the zip format works by first locating data stored at the end of the file - so if you've stored more data than was required, the entries required at the "end" of the file won't be where they are expected.
As an aside, I'd also recommend a different name for your compressedlength column - I'd initially taken it (despite your narrative) as being intended to store, well, the length of the compressed data (and written part of my answer to address that). Maybe originalLength would be a better name?

Change a WAV file (to 16 kHz and 8-bit) using NAudio

I want to change a WAV file to 8 kHz and 8-bit using NAudio.
WaveFormat format1 = new WaveFormat(8000, 8, 1);
byte[] waveByte = HelperClass.ReadFully(File.OpenRead(wavFile));
using (WaveFileWriter writer = new WaveFileWriter(outputFile, format1))
{
    writer.WriteData(waveByte, 0, waveByte.Length);
}
but when I play the output file, the sound is just sizzle. Is my code correct, or what is wrong?
If I set WaveFormat to WaveFormat(44100, 16, 1), it works fine.
Thanks.
A few pointers:
You need to use a WaveFormatConversionStream to actually convert from one sample rate / bit depth to another - you are just putting the original audio into the new file with the wrong wave format.
You may also need to convert in two steps - first changing the sample rate, then changing the bit depth / channel count. This is because the underlying ACM codecs can't always do the conversion you want in a single step (a two-step sketch follows the example below).
You should use WaveFileReader to read your input file - you only want the actual audio data part of the file to get converted, but you are currently copying everything including the RIFF chunks as though they were audio data into the new file.
8 bit PCM audio usually sounds horrible. Use 16 bit, or if you must have 8 bit, use G.711 u-law or a-law
Downsampling audio can result in aliasing. To do it well you need to implement a low-pass filter first. This unfortunately isn't easy, but there are sites that help you generate the coefficients for a Chebyshev low pass filter for the specific downsampling you are doing.
Here's some example code showing how to convert from one format to another. Remember that you might need to do the conversion in multiple steps depending on the format of your input file:
using (var reader = new WaveFileReader("input.wav"))
{
    var newFormat = new WaveFormat(8000, 16, 1);
    using (var conversionStream = new WaveFormatConversionStream(newFormat, reader))
    {
        WaveFileWriter.CreateWaveFile("output.wav", conversionStream);
    }
}
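And if the one-step conversion throws, here is a hedged sketch of the two-step route mentioned above (change the sample rate first, then the channel count; the intermediate format is an assumption):
using (var reader = new WaveFileReader("input.wav"))
// step 1: change the sample rate only, keeping bit depth and channel count
using (var resampled = new WaveFormatConversionStream(
    new WaveFormat(8000, 16, reader.WaveFormat.Channels), reader))
// step 2: reduce to a single channel at the new rate
using (var mono = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), resampled))
{
    WaveFileWriter.CreateWaveFile("output.wav", mono);
}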
The following code solved my problem dealing with G.711 Mu-Law with a vox file extension to wav file. I kept getting a "No RIFF Header" error with WaveFileReader otherwise.
FileStream fileStream = new FileStream(fileName, FileMode.Open);
var waveFormat = WaveFormat.CreateMuLawFormat(8000, 1);
var reader = new RawSourceWaveStream(fileStream, waveFormat);
using (WaveStream convertedStream = WaveFormatConversionStream.CreatePcmStream(reader))
{
    WaveFileWriter.CreateWaveFile(fileName.Replace("vox", "wav"), convertedStream);
}
fileStream.Close();
OpenFileDialog openFileDialog = new OpenFileDialog();
openFileDialog.Filter = "Wave Files (*.wav)|*.wav|All Files (*.*)|*.*";
openFileDialog.FilterIndex = 1;

WaveFileReader reader = new NAudio.Wave.WaveFileReader(dpmFileDestPath);
WaveFormat newFormat = new WaveFormat(8000, 16, 1);
WaveFormatConversionStream str = new WaveFormatConversionStream(newFormat, reader);
try
{
    WaveFileWriter.CreateWaveFile("C:\\Konvertierten_Dateien.wav", str);
}
catch (Exception ex)
{
    MessageBox.Show(String.Format("{0}", ex.Message));
}
finally
{
    str.Close();
}
MessageBox.Show("Konvertieren ist Fertig!");
