I need to be able to insert audio data into existing ac3 files. AC3 files are pretty simple and can be appended to each other without stripping headers or anything. The problem I have is that if you want to add/overwrite/erase a chunk of an ac3 file, you have to do it in 32ms increments, and each 32ms is equal to 1536 bytes of data. So when I insert a data chunk (which must be 1536 bytes, as I just said), I need to find the nearest offset that is divisible by 1536 (like 0, 1536 (0x600), 3072 (0xC00), etc). Let's say I can figure that out. I've read about changing a particular character at a specific offset, but I need to INSERT (not overwrite) that entire 1536-byte data chunk. How would I do that in C#, given the starting offset and the 1536-byte data chunk?
Edit: The data chunk I want to insert is basically just 32ms of silence, and I have the hex, ASCII and ANSI text translations of it. Of course, I may want to insert this chunk multiple times to get 128ms of silence instead of just 32, for example.
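For reference, the offset alignment described above boils down to integer division, and the repeated silence is just the same 1536-byte frame concatenated. A minimal sketch (the names and the idea of pre-building the repeated frame are illustrative, not from any answer below):

```csharp
const int FrameSize = 1536; // one 32 ms AC3 frame, per the question

// Round an arbitrary byte position down to the nearest frame boundary.
static long AlignToFrame(long offset) => (offset / FrameSize) * FrameSize;

// Build N frames of silence (e.g. 4 frames = 128 ms) from a single 32 ms silence frame.
static byte[] RepeatFrame(byte[] silenceFrame, int count)
{
    var result = new byte[silenceFrame.Length * count];
    for (int i = 0; i < count; i++)
        Buffer.BlockCopy(silenceFrame, 0, result, i * silenceFrame.Length, silenceFrame.Length);
    return result;
}
```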
byte[] filebyte = File.ReadAllBytes(@"C:\abc.ac3");
byte[] tobeinserted = new byte[1536]; // fill this with your 32 ms silence frame, encoded however you like
int pos = 1;                          // choose which 1536-byte boundary to insert at
byte[] total = new byte[filebyte.Length + tobeinserted.Length];
for (int i = 0, j = 0; i < total.Length; )
{
    if (i == 1536 * pos) // reached the insertion offset
    {
        while (j < tobeinserted.Length)
            total[i++] = tobeinserted[j++];
    }
    else
    {
        total[i] = filebyte[i - j]; // copy the original byte, shifted back by what was inserted
        i++;
    }
}
File.WriteAllBytes(@"C:\abc.ac3", total);
Here is the helper method that will do what you need:
public static void Insert(string filepath, int insertOffset, Stream dataToInsert)
{
    var newFilePath = filepath + ".tmp";
    using (var source = File.OpenRead(filepath))
    using (var destination = File.OpenWrite(newFilePath))
    {
        CopyTo(source, destination, insertOffset);   // first copy the data before the insert point
        dataToInsert.CopyTo(destination);            // write the data that needs to be inserted
        CopyTo(source, destination, (int)(source.Length - insertOffset)); // copy the remaining data
    }
    // delete the old file and rename the new one:
    File.Delete(filepath);
    File.Move(newFilePath, filepath);
}
private static void CopyTo(Stream source, Stream destination, int count)
{
    const int bufferSize = 32 * 1024;
    var buffer = new byte[bufferSize];
    var remaining = count;
    while (remaining > 0)
    {
        var toCopy = remaining > bufferSize ? bufferSize : remaining;
        var actualRead = source.Read(buffer, 0, toCopy);
        if (actualRead == 0)
            break; // end of stream reached before count bytes were copied
        destination.Write(buffer, 0, actualRead);
        remaining -= actualRead;
    }
}
And here is an NUnit test with example usage:
[Test]
public void TestInsert()
{
    var originalString = "some original text";
    var insertString = "_ INSERTED TEXT _";
    var insertOffset = 8;
    var file = @"c:\someTextFile.txt";

    if (File.Exists(file))
        File.Delete(file);

    using (var originalData = new MemoryStream(Encoding.ASCII.GetBytes(originalString)))
    using (var f = File.OpenWrite(file))
        originalData.CopyTo(f);

    using (var dataToInsert = new MemoryStream(Encoding.ASCII.GetBytes(insertString)))
        Insert(file, insertOffset, dataToInsert);

    var expectedText = originalString.Insert(insertOffset, insertString);
    var actualText = File.ReadAllText(file);
    Assert.That(actualText, Is.EqualTo(expectedText));
}
Be aware that I have removed some checks for the sake of clarity - do not forget to check for null, file access permissions, and file size. For example, insertOffset can be larger than the file length; that condition is not checked here.
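As one illustration, a minimal validation sketch that could go at the top of Insert (this is my own addition, not part of the original answer; the exact exceptions and messages are just one possible choice):

```csharp
if (dataToInsert == null)
    throw new ArgumentNullException(nameof(dataToInsert));
if (insertOffset < 0 || insertOffset > new FileInfo(filepath).Length)
    throw new ArgumentOutOfRangeException(nameof(insertOffset),
        "Insert offset must lie within the existing file.");
```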
I am using the code below to break a stream into smaller chunks. However, my chunk size is constant and I want it to be variable: I want the program to read until it hits the symbol '$' and treat the position of '$' as the chunk boundary.
For example, let's say the txt file contains 01234583145329$34212349$2134567009$, so my first chunk size should be 14, the second 8, and the third 10. I did some research and found that this can be achieved with the IndexOf method, but I am not able to implement it with the code below. Please advise.
If there is a more efficient way than IndexOf, please let me know.
public static IEnumerable<IEnumerable<byte>> ReadByChunk(int chunkSize)
{
    IEnumerable<byte> result;
    int startingByte = 0;
    do
    {
        result = ReadBytes(startingByte, chunkSize);
        startingByte += chunkSize;
        yield return result;
    }
    while (result.Any());
}

public static IEnumerable<byte> ReadBytes(int startingByte, int byteToRead)
{
    byte[] result;
    using (FileStream stream = File.Open(@"C:\Users\file.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
    using (BinaryReader reader = new BinaryReader(stream))
    {
        int bytesToRead = Math.Max(Math.Min(byteToRead, (int)reader.BaseStream.Length - startingByte), 0);
        reader.BaseStream.Seek(startingByte, SeekOrigin.Begin);
        result = reader.ReadBytes(bytesToRead);
        // int chunkSize = IndexOf(...)  <-- this is where I am stuck: I don't know how to find the '$'
    }
    return result;
}
static void Main()
{
    int chunkSize = 8;
    foreach (IEnumerable<byte> bytes in ReadByChunk(chunkSize))
    {
        //more code
    }
}
You seem to care about characters, not bytes here, as you are trying to find $ characters. Just reading bytes will only work in the specific case of "each character is encoded with one byte". Therefore, you should use ReadChar and return IEnumerable<IEnumerable<char>> instead.
You seem to be creating a new reader and stream for each chunk, which I feel is quite unnecessary. You could just create one stream and one reader in ReadByChunk, and pass it to the ReadBytes method.
The IndexOf you found is probably for strings. I assume you want to lazily read from a stream, so reading everything into a string first and then using IndexOf seems to go against your intention.
For a text file, I would also recommend using StreamReader; BinaryReader is for reading binary files.
Here's my attempt:
public static IEnumerable<IEnumerable<char>> ReadByChunk()
{
    using (StreamReader reader = new StreamReader(File.Open(...))) {
        while (reader.Peek() != -1) { // while not at the end of the stream...
            yield return ReadUntilNextDollarSign(reader);
        }
    }
}

public static IEnumerable<char> ReadUntilNextDollarSign(StreamReader reader)
{
    char c;
    // while not at the end of the stream, and the next char is not a dollar sign...
    while (reader.Peek() != -1 && (c = (char)reader.Read()) != '$') {
        yield return c;
    }
}
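A possible usage sketch (my own illustration, not part of the original answer). Note that each inner sequence is lazy, so it should be fully enumerated - for example with ToArray - before moving on to the next chunk:

```csharp
// Example usage (requires System.Linq for ToArray):
foreach (IEnumerable<char> chunk in ReadByChunk())
{
    string chunkText = new string(chunk.ToArray()); // materialize the lazy chunk before moving on
    Console.WriteLine($"Chunk of length {chunkText.Length}: {chunkText}");
}
```

For the example input 01234583145329$34212349$2134567009$, this would print chunks of length 14, 8, and 10.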
When I searched for how to decompress a file using SharpZipLib, I found a lot of methods like this:
public static void TarWriteCharacters(string tarfile, string targetDir)
{
    using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
    {
        //some codes here
        using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
        {
            int size = 2048;
            byte[] data = new byte[2048];
            while (true)
            {
                size = s.Read(data, 0, data.Length);
                if (size > 0)
                {
                    fileWrite.Write(data, 0, size);
                }
                else
                {
                    break;
                }
            }
            fileWrite.Close();
        }
    }
}
The signature of FileStream.Write is:
FileStream.Write(byte[] array, int offset, int count)
Now I am trying to separate the read and write parts because I want to use a thread to speed up decompression on the write side, and I use dynamic collections (List<byte[]> and List<int>) to hold the file's data and chunk sizes, as shown below.
Read:
public static void TarWriteCharacters(string tarfile, string targetDir)
{
    using (TarInputStream s = new TarInputStream(File.OpenRead(tarfile)))
    {
        //some codes here
        using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
        {
            int size = 2048;
            List<int> SizeList = new List<int>();
            List<byte[]> mydatalist = new List<byte[]>();
            while (true)
            {
                byte[] data = new byte[2048];
                size = s.Read(data, 0, data.Length);
                if (size > 0)
                {
                    mydatalist.Add(data);
                    SizeList.Add(size);
                }
                else
                {
                    break;
                }
            }
            test = new Thread(() =>
                FileWriteFun(pathToTar, args, SizeList, mydatalist)
            );
            test.Start();
            streamWriter.Close();
        }
    }
}
Write:
public static void FileWriteFun(string pathToTar, string[] args, List<int> SizeList, List<byte[]> mydataList)
{
    //some codes here
    using (FileStream fileWrite = File.Create(targetDir + directoryName + fileName))
    {
        for (int i = 0; i < mydataList.Count; i++)
        {
            fileWrite.Write(mydataList[i], 0, SizeList[i]);
        }
        fileWrite.Close();
    }
}
Edit:
(1) Moved byte[] data = new byte[2048] into the while loop so each iteration assigns data to a new array.
(2) Changed int[] SizeList = new int[2048] to List<int> SizeList = new List<int>() because the fixed-size array limits the number of entries.
As Read on a stream is only guaranteed to return one byte (typically it will be more, but you can't rely on getting the full requested length each time), your solution could theoretically fail after 2048 reads, since the original int[2048] SizeList can only hold 2048 entries.
You could use a List to hold the sizes.
Or use a MemoryStream instead of inventing your own.
But the two main problems are:
1) You keep reading into the same byte array, overwriting previously read data. When you add your data byte array to mydatalist, you must assign data to a new byte array.
2) You close your stream before the second thread is done writing.
In general, threading is difficult and should only be used where you know it will improve performance. Simply reading and writing data is typically IO bound in performance, not CPU bound, so introducing a second thread will just give a small performance penalty and no gain in speed. You could use multithreading to ensure concurrent read/write operations, but most likely the disk cache will do this for you if you stick to the first solution - and if not, using async is easier than multithreading to achieve this.
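A minimal sketch of how the two fixes could look, using the names from the question (s, FileWriteFun, pathToTar, args); this restructuring is my own illustration, not the original poster's final code:

```csharp
var sizeList = new List<int>();
var dataList = new List<byte[]>();
while (true)
{
    byte[] data = new byte[2048];           // fix 1: fresh buffer per read, so earlier chunks are not overwritten
    int size = s.Read(data, 0, data.Length);
    if (size <= 0)
        break;
    dataList.Add(data);
    sizeList.Add(size);
}

var writer = new Thread(() => FileWriteFun(pathToTar, args, sizeList, dataList));
writer.Start();
writer.Join();                              // fix 2: wait for the writer thread before closing any shared streams
```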
I have a block of code that loads a custom storage file (data.00x) and dumps its file contents (several files...) [for this example we'll say the referenced index only contains data.001 files for export]
Example:
public void ExportFileEntries(ref List<IndexEntry> filteredIndex, string dataDirectory, string buildDirectory, int chunkSize)
{
    OnTotalMaxDetermined(new TotalMaxArgs(8));

    // For each set of dataId files in the filteredIndex
    for (int dataId = 1; dataId < 8; dataId++)
    {
        OnTotalProgressChanged(new TotalChangedArgs(dataId, string.Format("Exporting selected files from data.00{0}", dataId)));

        // Filter only entries with current dataId into temp index
        List<IndexEntry> tempIndex = GetEntriesByDataId(ref filteredIndex, dataId, SortType.Offset);

        // Determine the path of the data.xxx file being exported from
        string dataPath = string.Format(@"{0}\data.00{1}", dataDirectory, dataId);

        if (File.Exists(dataPath))
        {
            // Load the data.xxx into a filestream
            using (FileStream dataFs = new FileStream(dataPath, FileMode.Open, FileAccess.Read))
            {
                // Loop through the files to export
                foreach (IndexEntry indexEntry in tempIndex)
                {
                    int fileLength = indexEntry.Length;
                    OnCurrentMaxDetermined(new CurrentMaxArgs(fileLength));

                    // Set the filestream's position to the file entry's offset
                    dataFs.Position = indexEntry.Offset;

                    // Read the file into a byte array (buffer)
                    byte[] fileBytes = new byte[indexEntry.Length];
                    dataFs.Read(fileBytes, 0, fileBytes.Length);

                    // Define some information about the file being exported
                    string fileExt = Path.GetExtension(indexEntry.Name).Remove(0, 1);
                    string buildPath = string.Format(@"{0}\{1}\{2}", buildDirectory, fileExt.ToUpper(), indexEntry.Name);

                    // If needed, unencrypt the data (fileBytes buffer)
                    if (XOR.Encrypted(fileExt)) { byte b = 0; XOR.Cipher(ref fileBytes, ref b); }

                    // If no chunkSize is provided, generate a default
                    if (chunkSize == 0) { chunkSize = Math.Max(64000, (int)(fileBytes.Length * .02)); }

                    // If the build directory doesn't exist yet, create it.
                    if (!Directory.Exists(Path.GetDirectoryName(buildPath))) { Directory.CreateDirectory(Path.GetDirectoryName(buildPath)); }

                    using (FileStream buildFs = new FileStream(buildPath, FileMode.Create, FileAccess.Write))
                    {
                        using (BinaryWriter bw = new BinaryWriter(buildFs, encoding))
                        {
                            for (int byteCount = 0; byteCount < fileLength; byteCount += Math.Min(fileLength - byteCount, chunkSize))
                            {
                                bw.Write(fileBytes, byteCount, Math.Min(fileLength - byteCount, chunkSize));
                                OnCurrentProgressChanged(new CurrentChangedArgs(byteCount, ""));
                            }
                        }
                    }

                    OnCurrentProgressReset(EventArgs.Empty);
                    fileBytes = null;
                }
            }
        }
        else { OnError(new ErrorArgs(string.Format("[ExportFileEntries] Cannot locate: {0}", dataPath))); }
    }

    OnTotalProgressReset(EventArgs.Empty);
    GC.Collect();
}
The data.001 file stores about 12k files, most of them very small .jpg pictures and the like. For about the first half of the export process the GC collects just fine, but out of nowhere toward the last half of the export the GC just stops giving a crap.
If I don't issue GC.Collect() at the end of the method, the tool sits at around 255 MB of RAM, but if I do call it, memory drops to about 14 MB. What I'm asking is: are there any obvious improvements to the way I coded the method (to improve GC behavior)?
I am working with FileStream.Read: https://msdn.microsoft.com/en-us/library/system.io.filestream.read%28v=vs.110%29.aspx
What I'm trying to do is read a large file in a loop a certain number of bytes at a time; not the whole file at once. The code example shows this for reading:
int n = fsSource.Read(bytes, numBytesRead, numBytesToRead);
The definition of "bytes" is: "When this method returns, contains the specified byte array with the values between offset and (offset + count - 1) replaced by the bytes read from the current source."
I want to only read in 1 mb at a time so I do this:
using (FileStream fsInputFile = new FileStream(strInputFileName, FileMode.Open, FileAccess.Read))
{
    int intBytesToRead = 1024;
    int intTotalBytesRead = 0;
    int intInputFileByteLength = 0;
    byte[] btInputBlock = new byte[intBytesToRead];
    byte[] btOutputBlock = new byte[intBytesToRead];

    intInputFileByteLength = (int)fsInputFile.Length;

    while (intInputFileByteLength - 1 >= intTotalBytesRead)
    {
        if (intInputFileByteLength - intTotalBytesRead < intBytesToRead)
        {
            intBytesToRead = intInputFileByteLength - intTotalBytesRead;
        }

        // *** Problem is here ***
        int n = fsInputFile.Read(btInputBlock, intTotalBytesRead, intBytesToRead);
        intTotalBytesRead += n;
        fsOutputFile.Write(btInputBlock, intTotalBytesRead - n, n);
    }

    fsOutputFile.Close();
}
Where the problem area is stated, btInputBlock works on the first cycle because it reads in 1024 bytes. But then on the second loop, it doesn't recycle this byte array. It instead tries to append the new 1024 bytes into btInputBlock. As far as I can tell, you can only specify the offset and length of the file you want to read and not the offset and length of btInputBlock. Is there a way to "re-use" the array that is being dumped into by Filestream.Read or should I find another solution?
Thanks.
P.S. The exception on the read is: "Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection."
Your code can be simplified somewhat
int num;
byte[] buffer = new byte[1024];
while ((num = fsInputFile.Read(buffer, 0, buffer.Length)) != 0)
{
    //Do your work here
    fsOutputFile.Write(buffer, 0, num);
}
Note that Read takes the array to fill, the offset (which is the offset within the array where the bytes should be placed), and the (maximum) number of bytes to read.
That's because you're incrementing intTotalBytesRead and passing it as the offset, but that parameter is an offset into the array, not into the FileStream. In your case it should always be zero, so that each read overwrites the previous data in the array rather than appending after it.
int n = fsInputFile.Read(btInputBlock, intTotalBytesRead, intBytesToRead); //currently
int n = fsInputFile.Read(btInputBlock, 0, intBytesToRead); //should be
FileStream doesn't need an offset; every Read picks up where the last one left off. See https://msdn.microsoft.com/en-us/library/system.io.filestream.read(v=vs.110).aspx for details.
Your Read call should be Read(btInputBlock, 0, intBytesToRead). The 2nd parameter is the offset into the array you want to start writing the bytes to. Similarly for Write you want Write(btInputBlock, 0, n) as the 2nd parameter is the offset in the array to start writing bytes from. Also you don't need to call Close as the using will clean up the FileStream for you.
using (FileStream fsInputFile = new FileStream(strInputFileName, FileMode.Open, FileAccess.Read))
{
    int intBytesToRead = 1024;
    int intTotalBytesRead = 0;
    byte[] btInputBlock = new byte[intBytesToRead];

    while (fsInputFile.Position < fsInputFile.Length)
    {
        int n = fsInputFile.Read(btInputBlock, 0, intBytesToRead);
        intTotalBytesRead += n;
        fsOutputFile.Write(btInputBlock, 0, n);
    }
}
I calculate the binary size of a file with this function:
public static int BinarySize(string path)
{
    FileStream fs = new FileStream(path, FileMode.Open);
    int hexIn;
    string ret = "";

    for (int i = 0; (hexIn = fs.ReadByte()) != -1; i++)
    {
        ret += Convert.ToString(hexIn, 2);
    }

    fs.Close();
    return ret.Length;
}
An example of my problem: when I calculate the size of this simple black PNG image (10x10 pixels) with that function, I get 640 bits => 80 bytes, but Windows says the file size is 136 bytes.
Why this difference of 56 bytes? Is it security, permissions, or some private information that Windows attaches to every file?
Convert.ToString(hexIn, 2) does not always return 8 characters; it trims leading zeroes, so if hexIn is 4, it returns 100, not 00000100.
You might want to change it to Convert.ToString(hexIn,2).PadLeft(8, '0');.
Also, you'd want to use a StringBuilder instead of a string for the ret variable.
By the way, reading the whole file just to determine its size is a bit wasteful. Better to use the FileInfo class to get file information.
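A minimal sketch of both suggestions (my own illustration, assuming path holds the file path): FileInfo.Length gives the size without reading the contents, and PadLeft fixes the bit string if you really need it.

```csharp
// Size without reading the file at all:
long sizeInBytes = new FileInfo(path).Length;
long sizeInBits = sizeInBytes * 8;

// If you actually need the binary string, pad every byte to 8 digits:
var sb = new StringBuilder();
using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    int b;
    while ((b = fs.ReadByte()) != -1)
        sb.Append(Convert.ToString(b, 2).PadLeft(8, '0'));
}
string bits = sb.ToString(); // bits.Length == sizeInBits
```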