I am using the ZXing.Net library to encode and decode my video file with its Reed-Solomon (RS) encoder. Adding the parity on encoding and removing it after decoding works well. But when I write the decoded file back, "?" characters appear at various locations in the file that were not part of the original file. I don't understand why this happens when the file is written back.
Here is my code:
using ZXing.Common.ReedSolomon;
namespace zxingtest
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
string inputFileName = @"D:\JM\bin\baseline_30.264";
string outputFileName = @"D:\JM\bin\baseline_encoded.264";
string Content = File.ReadAllText(inputFileName, ASCIIEncoding.Default);
//File.WriteAllText(outputFileName, Content, ASCIIEncoding.Default);
ReedSolomonEncoder enc = new ReedSolomonEncoder(GenericGF.AZTEC_DATA_12);
ReedSolomonDecoder dec = new ReedSolomonDecoder(GenericGF.AZTEC_DATA_12);
//string s = "1,2,4,6,1,7,4,0,0";
//int[] array = s.Split(',').Select(str => int.Parse(str)).ToArray();
int parity = 10;
List<byte> toBytes = ASCIIEncoding.Default.GetBytes(Content.Substring(0, 500)).ToList();
for (int index = 0; index < parity; index++)
{
toBytes.Add(0);
}
int[] bytesAsInts = Array.ConvertAll(toBytes.ToArray(), c => (int)c);
enc.encode(bytesAsInts, parity);
bytesAsInts[1] = 3;
dec.decode(bytesAsInts, parity);
string st = new string(Array.ConvertAll(bytesAsInts.ToArray(), z => (char)z));
File.WriteAllText(outputFileName, st, ASCIIEncoding.Default);
}
}
}
And here is the Hex file view of H.264 bit stream
The problem is that you're handling a binary format as if it were a text file with an encoding. From what you are doing, you only seem to be interested in reading some bytes, processing them (encode, decode) and then writing the bytes back to a file.
If that is what you need, use the proper reader and writer for your files, in this case BinaryReader and BinaryWriter. Using your code as a starting point, this is my version using those readers/writers. With it, the bytes written to my output file match the bytes read from my input file.
string inputFileName = @"input.264";
string outputFileName = @"output.264";
ReedSolomonEncoder enc = new ReedSolomonEncoder(GenericGF.AZTEC_DATA_12);
ReedSolomonDecoder dec = new ReedSolomonDecoder(GenericGF.AZTEC_DATA_12);
const int parity = 10;
// open a file as stream for reading
using (var input = File.OpenRead(inputFileName))
{
const int max_ints = 256;
int[] bytesAsInts = new int[max_ints];
// use a binary reader
using (var binary = new BinaryReader(input))
{
for (int i = 0; i < max_ints - parity; i++)
{
//read a single byte, store them in the array of ints
bytesAsInts[i] = binary.ReadByte();
}
// parity
for (int i = max_ints - parity; i < max_ints; i++)
{
bytesAsInts[i] = 0;
}
enc.encode(bytesAsInts, parity);
bytesAsInts[1] = 3;
dec.decode(bytesAsInts, parity);
// create a stream for writing
using(var output = File.Create(outputFileName))
{
// write bytes back
using(var writer = new BinaryWriter(output))
{
foreach(var value in bytesAsInts)
{
// we need to write back a byte
// not an int so cast it
writer.Write((byte)value);
}
}
}
}
}
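One way the "?" characters can appear is sketched below. It assumes a plain ASCII encoding (`Encoding.ASCII`) for determinism; `ASCIIEncoding.Default` in the question actually resolves to `Encoding.Default`, whose behavior depends on the platform, so treat this as an illustration of the mechanism rather than an exact reproduction. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class TextRoundTripDemo
{
    // Decodes bytes as ASCII text and encodes them again, similar to what a
    // ReadAllText / WriteAllText pair does when given an ASCII encoding.
    public static byte[] RoundTripAsAscii(byte[] data)
    {
        string text = Encoding.ASCII.GetString(data);
        return Encoding.ASCII.GetBytes(text);
    }

    public static void Main()
    {
        byte[] original = { 0x00, 0x41, 0x80, 0xFF };
        byte[] roundTripped = RoundTripAsAscii(original);

        Console.WriteLine(BitConverter.ToString(original));     // 00-41-80-FF
        Console.WriteLine(BitConverter.ToString(roundTripped)); // 00-41-3F-3F
    }
}
```

Bytes above 0x7F have no ASCII character, so the decoder substitutes "?" (0x3F) and the original values are lost. Reading and writing raw bytes, as in the answer above, avoids the decode step entirely.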
I am trying to decompress a GZipped string which is part of response from a webservice. The string that I have is:
"[31,-117,8,0,0,0,0,0,0,0,109,-114,65,11,-62,48,12,-123,-1,75,-50,-61,-42,-127,30,122,21,111,-126,94,60,-119,-108,-72,102,44,-48,-75,-93,-21,100,56,-6,-33,-19,20,20,101,57,37,95,-14,94,-34,4,-63,-5,-72,-73,-44,-110,-117,-96,38,-88,26,-74,38,-112,3,117,-7,25,-82,5,24,-116,56,-97,-44,108,-23,28,24,-44,-85,83,34,-41,97,-88,24,-99,23,36,124,-120,94,99,-120,15,-42,-91,-108,91,45,-11,70,119,60,-110,21,-20,12,-115,-94,111,-80,-93,89,-41,-65,-127,-82,76,41,51,-19,52,90,-5,69,-85,76,-96,-128,64,22,35,-33,-23,-124,-79,-55,-1,-2,-10,-87,0,55,-76,55,10,-57,122,-9,73,42,-45,98,-44,5,-77,101,-3,58,-91,39,38,51,-15,121,21,1,0,0]"
I'm trying to decompress that string using the following method:
public static string UnZip(string value)
{
// Removing brackets from string
value = value.TrimStart('[');
value = value.TrimEnd(']');
//Transform string into byte[]
string[] strArray = value.Split(',');
byte[] byteArray = new byte[strArray.Length];
for (int i = 0; i < strArray.Length; i++)
{
if (strArray[i][0] != '-')
byteArray[i] = Convert.ToByte(strArray[i]);
else
{
int val = Convert.ToInt16(strArray[i]);
byteArray[i] = (byte)(val + 256);
}
}
//Prepare for decompress
System.IO.MemoryStream ms = new System.IO.MemoryStream(byteArray);
System.IO.Compression.GZipStream sr = new System.IO.Compression.GZipStream(ms,
System.IO.Compression.CompressionMode.Decompress);
//Reset variable to collect uncompressed result
byteArray = new byte[byteArray.Length];
//Decompress
int rByte = sr.Read(byteArray, 0, byteArray.Length);
//Transform byte[] unzip data to string
System.Text.StringBuilder sB = new System.Text.StringBuilder(rByte);
//Loop only over the number of bytes GZipStream actually read,
//not over every byte in the result array
for (int i = 0; i < rByte; i++)
{
sB.Append((char)byteArray[i]);
}
sr.Close();
ms.Close();
sr.Dispose();
ms.Dispose();
return sB.ToString();
}
The method is a modified version of the one in the following link:
http://www.codeproject.com/Articles/27203/GZipStream-Compress-Decompress-a-string
Sadly, the result of that method is a corrupted string. More specifically, I know that the input string contains a compressed JSON object and the output string has only some of the expected string:
"{\"rootElement\":{\"children\":[{\"children\":[],\"data\":{\"fileUri\":\"file:////Luciano/e/orto_artzi_2006_0_5_pixel/index/shapefiles/index_cd20/shp_all/index_cd2.shp\",\"relativePath\":\"/i"
Any idea what could be the problem and how to solve it?
Try
public static string UnZip(string value)
{
// Removing brackets from string
value = value.TrimStart('[');
value = value.TrimEnd(']');
//Transform string into byte[]
string[] strArray = value.Split(',');
byte[] byteArray = new byte[strArray.Length];
for (int i = 0; i < strArray.Length; i++)
{
byteArray[i] = unchecked((byte)Convert.ToSByte(strArray[i]));
}
//Prepare for decompress
using (System.IO.MemoryStream output = new System.IO.MemoryStream())
{
using (System.IO.MemoryStream ms = new System.IO.MemoryStream(byteArray))
using (System.IO.Compression.GZipStream sr = new System.IO.Compression.GZipStream(ms, System.IO.Compression.CompressionMode.Decompress))
{
sr.CopyTo(output);
}
string str = Encoding.UTF8.GetString(output.GetBuffer(), 0, (int)output.Length);
return str;
}
}
The MemoryStream doesn't "duplicate" the byteArray but is directly backed by it, so you can't reuse the byteArray.
I'll add that I find it funny that they "compressed" a JSON of 277 characters into a stringized byte array of 620 characters.
As a side note, the memory usage of this method is through the roof. Decompressing the 620-character string (which in truth is a 277-byte array) creates strings/arrays with a total size of 4887 bytes, including the initial 620-character string (disclaimer: the GC can reclaim part of this memory while the method executes). This is fine for byte arrays of 277 bytes, but for bigger ones the memory usage will become quite large.
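The signed-to-unsigned conversion used in the loop above can be checked on the first three values of the input string. This is a minimal sketch; the class and method names are hypothetical, and it uses `sbyte.Parse` with the invariant culture instead of `Convert.ToSByte` so the result does not depend on the current culture.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class SignedByteDemo
{
    // Parses text such as "[31,-117,8]" (signed byte values) into unsigned bytes.
    public static byte[] Parse(string value)
    {
        return value.Trim('[', ']')
                    .Split(',')
                    .Select(s => sbyte.Parse(s.Trim(), CultureInfo.InvariantCulture))
                    .Select(sb => unchecked((byte)sb)) // -117 wraps to 139 (0x8B)
                    .ToArray();
    }

    public static void Main()
    {
        byte[] header = Parse("[31,-117,8]");
        Console.WriteLine(BitConverter.ToString(header)); // 1F-8B-08
    }
}
```

The bytes 1F 8B are the gzip magic number and 08 is the deflate compression method, which is a quick way to confirm that the parsed array really is gzip data.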
Following on from Xanatos's answer in C# slightly modified to return a simple byte array. This takes a gzip compressed byte array and returns the inflated gunzipped array.
public static byte[] Decompress(byte[] compressed_data)
{
var outputStream = new MemoryStream();
using (var compressedStream = new MemoryStream(compressed_data))
using (System.IO.Compression.GZipStream sr = new System.IO.Compression.GZipStream(
compressedStream, System.IO.Compression.CompressionMode.Decompress))
{
sr.CopyTo(outputStream);
outputStream.Position = 0;
return outputStream.ToArray();
}
}
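A round trip is an easy way to check a method like this. The sketch below pairs a variant of the decompressor with a hypothetical `Compress` helper that is not part of either answer; it is only there to produce test input.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipRoundTrip
{
    // Hypothetical counterpart to Decompress, added only for this demo.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen: true keeps 'output' usable after the GZipStream is disposed;
            // disposing the GZipStream writes the final compressed block.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // CopyTo keeps reading until the end of the stream,
            // however many Read calls that takes.
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes(new string('a', 1000));
        byte[] compressed = Compress(original);
        byte[] restored = Decompress(compressed);

        Console.WriteLine(compressed.Length < original.Length); // True
        Console.WriteLine(restored.Length);                     // 1000
    }
}
```

The decompressed data is usually longer than the compressed input, so a result buffer sized to the compressed length, filled by a single `Read` call, can only ever hold part of the output.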
I have a Validate(Stream inputStream) method. This method calls several other validation methods by passing the inputStream to each one. Each of these creates a new TextFieldParser and reads/validates the file.
When the first ValidateA(inputStream) is called, it works. But, when the 2nd ValidateB(inputStream) is called, the parser.EndOfData is true so, it does not read the fields.
I've tried to clean up the code to its simplest form.
public int Validate(Stream inputStream, ref List<string> errors)
{
inputStream.Seek(0, SeekOrigin.Begin);
errors.AddRange(ValidateA(inputStream));
// The 2nd time, the EndOfData is true, so it doesn't read the fields
inputStream.Seek(0, SeekOrigin.Begin);
errors.AddRange(ValidateB(inputStream));
...
}
private List<string> ValidateA(Stream inputStream)
{
List<string> errors = new List<string>();
// Works fine the first time
using (var parser = new TextFieldParser(inputStream))
{
parser.TextFieldType = FieldType.Delimited;
parser.SetDelimiters(",");
parser.TrimWhiteSpace = true;
int lineNumber = 0;
while (!parser.EndOfData)
{
string[] fields = parser.ReadFields();
// Processing....
}
if (lineNumber < 2)
errors.Add(string.Format("There is no data in the file"));
}
return errors;
}
Here is where the problem occurs. The ValidateB method cannot process the file because the EndOfData field does not get reset.
private List<string> ValidateB(Stream inputStream)
{
List<string> errors = new List<string>();
using (var parser = new TextFieldParser(inputStream))
{
parser.TextFieldType = FieldType.Delimited;
parser.SetDelimiters(",");
parser.TrimWhiteSpace = true;
int LineNumber = 0;
while (!parser.EndOfData)
{
// Processing....
}
}
return errors;
}
The comment by @HansPassant is correct and led me to change the way I was passing data around. Instead of passing a Stream around, I converted the MemoryStream to a byte[].
Then, in the ValidateX(byte[] fileByteArray) method, I would create a new MemoryStream from the byte array and use it.
Example:
Stream stream = model.PostedFile.InputStream;
MemoryStream memStream = new MemoryStream();
stream.CopyTo(memStream);
byte[] data = memStream.ToArray();
var result = ValidateB(data);
And then,
private List<string> ValidateB(byte[] fileByteArray)
{
List<string> errors = new List<string>();
// A new MemoryStream starts at position 0, so no Seek is needed
MemoryStream ms = new MemoryStream(fileByteArray);
using (var parser = new TextFieldParser(ms))
{
// Processing...
}
return errors;
}
This prevented problems with the EndOfData and trying to access a Stream that was closed.
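The ownership behavior behind this can be seen in a small sketch. It uses StreamReader as a stand-in for TextFieldParser (an assumption made so the example needs no extra assembly): disposing a reader also closes the stream it wraps, so every pass gets its own MemoryStream over the same bytes. All names here are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamOwnershipDemo
{
    // Stand-in for one validation pass: counts lines using its own stream over the bytes.
    public static int CountLines(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new StreamReader(stream))
        {
            int lines = 0;
            while (reader.ReadLine() != null) lines++;
            return lines;
        }
    }

    // Reports whether a stream is still readable after a reader over it was disposed.
    public static bool ReadableAfterReaderDisposed(byte[] data)
    {
        var stream = new MemoryStream(data);
        using (var reader = new StreamReader(stream))
        {
            reader.ReadToEnd();
        }
        return stream.CanRead;
    }

    public static void Main()
    {
        byte[] data = Encoding.UTF8.GetBytes("a,b\nc,d\n");

        Console.WriteLine(CountLines(data));                  // 2
        Console.WriteLine(CountLines(data));                  // 2 (second pass works)
        Console.WriteLine(ReadableAfterReaderDisposed(data)); // False
    }
}
```

Because the byte array itself is never closed or repositioned, any number of validation passes can run over it independently.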
I'm trying to copy the contents of one Excel file to another Excel file while replacing a string inside of the file on the copy. It's working for the most part, but the file is losing 27 kb of data. Any suggestions?
public void ReplaceString(string what, string with, string path) {
List < string > doneContents = new List < string > ();
List < string > doneNames = new List < string > ();
using(ZipArchive archive = ZipFile.Open(_path, ZipArchiveMode.Read)) {
int count = archive.Entries.Count;
for (int i = 0; i < count; i++) {
ZipArchiveEntry entry = archive.Entries[i];
using(var entryStream = entry.Open())
using(StreamReader reader = new StreamReader(entryStream)) {
string txt = reader.ReadToEnd();
if (txt.Contains(what)) {
txt = txt.Replace(what, with);
}
doneContents.Add(txt);
string name = entry.FullName;
doneNames.Add(name);
}
}
}
using(MemoryStream zipStream = new MemoryStream()) {
using(ZipArchive newArchive = new ZipArchive(zipStream, ZipArchiveMode.Create, true, Encoding.UTF8)) {
for (int i = 0; i < doneContents.Count; i++) {
int spot = i;
ZipArchiveEntry entry = newArchive.CreateEntry(doneNames[spot]);
using(var entryStream = entry.Open())
using(var sw = new StreamWriter(entryStream)) {
sw.Write(doneContents[spot]);
}
}
}
using(var fileStream = new FileStream(path, FileMode.Create)) {
zipStream.Seek(0, SeekOrigin.Begin);
zipStream.CopyTo(fileStream);
}
}
}
I've used Microsoft's DocumentFormat.OpenXML and Excel Interop, however, they are both lacking in a few main components that I need.
Update:
using(var fileStream = new FileStream(path, FileMode.Create)) {
var wrapper = new StreamWriter(fileStream);
wrapper.AutoFlush = true;
zipStream.Seek(0, SeekOrigin.Begin);
zipStream.CopyTo(wrapper.BaseStream);
wrapper.Flush();
wrapper.Close();
}
Try the process without changing the string and see if the file size is the same. If so then it would seem that your copy is working correctly, however as Marc B suggested, with compression, even a small change can result in a larger change in the overall size.
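To make that comparison independent of compression, one option is to compare the uncompressed length of each entry instead of the archive size on disk. The sketch below builds small archives in memory for illustration; the helper names and the entry name `a.xml` are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipSizeDemo
{
    // Builds a one-entry zip archive in memory and returns its bytes.
    public static byte[] BuildZip(string entryName, string text)
    {
        using (var zipStream = new MemoryStream())
        {
            using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create, true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName);
                using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
                {
                    writer.Write(text);
                }
            }
            return zipStream.ToArray();
        }
    }

    // Returns the uncompressed length of the named entry.
    public static long UncompressedLength(byte[] zip, string entryName)
    {
        using (var archive = new ZipArchive(new MemoryStream(zip), ZipArchiveMode.Read))
        {
            return archive.GetEntry(entryName).Length;
        }
    }

    public static void Main()
    {
        string text = "hello hello hello";
        byte[] original = BuildZip("a.xml", text);
        byte[] replaced = BuildZip("a.xml", text.Replace("hello", "hi"));

        Console.WriteLine(UncompressedLength(original, "a.xml")); // 17
        Console.WriteLine(UncompressedLength(replaced, "a.xml")); // 8
        Console.WriteLine(original.Length + " vs " + replaced.Length + " archive bytes");
    }
}
```

If the uncompressed lengths of the copied entries match the originals when no replacement is made, the remaining difference in file size comes from how the archive was compressed rather than from lost data.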
I have, in my code, a ConcurrentBag<Point3DCollection>.
I'm trying to figure out how to serialize them. Of course I could iterate through or package it with a provider model class, but I wonder if it's already been done.
The Point3DCollections themselves are potentially quite large and could stand to be compressed to speed up reading and writing to and from the disk, but the response times I need for this are largely in the user interface scale. In other words, I prefer a binary formatting over a XAML-text formatting, for performance reasons. (There is a nice XAML-text serializer which is part of the Helix 3D CodeProject, but it's slower than I'd like.)
Is this a use case where I'm left rolling out my own serializer, or is there something out there that's already packaged for this kind of data?
Here are some extension methods that handle string and binary serialization of Point3DCollection bags. As I said in my comment, I don't think there is a best way of doing this in all cases, so you might want to try both. Also note that they take a Stream parameter, so you can chain them with a GZipStream or DeflateStream.
public static class Point3DExtensions
{
public static void StringSerialize(this ConcurrentBag<Point3DCollection> bag, Stream stream)
{
if (bag == null)
throw new ArgumentNullException("bag");
if (stream == null)
throw new ArgumentNullException("stream");
StreamWriter writer = new StreamWriter(stream);
Point3DCollectionConverter converter = new Point3DCollectionConverter();
foreach (Point3DCollection coll in bag)
{
// we need to use the english locale as the converter needs that for parsing...
string line = (string)converter.ConvertTo(null, CultureInfo.GetCultureInfo("en-US"), coll, typeof(string));
writer.WriteLine(line);
}
writer.Flush();
}
public static void StringDeserialize(this ConcurrentBag<Point3DCollection> bag, Stream stream)
{
if (bag == null)
throw new ArgumentNullException("bag");
if (stream == null)
throw new ArgumentNullException("stream");
StreamReader reader = new StreamReader(stream);
Point3DCollectionConverter converter = new Point3DCollectionConverter();
do
{
string line = reader.ReadLine();
if (line == null)
break;
bag.Add((Point3DCollection)converter.ConvertFrom(line));
// NOTE: could also use this:
//bag.Add(Point3DCollection.Parse(line));
}
while (true);
}
public static void BinarySerialize(this ConcurrentBag<Point3DCollection> bag, Stream stream)
{
if (bag == null)
throw new ArgumentNullException("bag");
if (stream == null)
throw new ArgumentNullException("stream");
BinaryWriter writer = new BinaryWriter(stream);
writer.Write(bag.Count);
foreach (Point3DCollection coll in bag)
{
writer.Write(coll.Count);
foreach (Point3D point in coll)
{
writer.Write(point.X);
writer.Write(point.Y);
writer.Write(point.Z);
}
}
writer.Flush();
}
public static void BinaryDeserialize(this ConcurrentBag<Point3DCollection> bag, Stream stream)
{
if (bag == null)
throw new ArgumentNullException("bag");
if (stream == null)
throw new ArgumentNullException("stream");
BinaryReader reader = new BinaryReader(stream);
int count = reader.ReadInt32();
for (int i = 0; i < count; i++)
{
int pointCount = reader.ReadInt32();
Point3DCollection coll = new Point3DCollection(pointCount);
for (int j = 0; j < pointCount; j++)
{
coll.Add(new Point3D(reader.ReadDouble(), reader.ReadDouble(), reader.ReadDouble()));
}
bag.Add(coll);
}
}
}
And a little console app test program to play with:
static void Main(string[] args)
{
Random rand = new Random(Environment.TickCount);
ConcurrentBag<Point3DCollection> bag = new ConcurrentBag<Point3DCollection>();
for (int i = 0; i < 100; i++)
{
Point3DCollection coll = new Point3DCollection();
bag.Add(coll);
for (int j = rand.Next(10); j < rand.Next(100); j++)
{
Point3D point = new Point3D(rand.NextDouble(), rand.NextDouble(), rand.NextDouble());
coll.Add(point);
}
}
using (FileStream stream = new FileStream("test.bin", FileMode.Create))
{
bag.StringSerialize(stream); // or Binary
}
ConcurrentBag<Point3DCollection> newbag = new ConcurrentBag<Point3DCollection>();
using (FileStream stream = new FileStream("test.bin", FileMode.Open))
{
newbag.StringDeserialize(stream); // or Binary
foreach (Point3DCollection coll in newbag)
{
foreach (Point3D point in coll)
{
Console.WriteLine(point);
}
Console.WriteLine();
}
}
}
}
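Chaining with a compression stream, as mentioned for the extension methods above, could look like the following sketch. It uses a plain `double[]` as a stand-in for Point3DCollection so it runs without the WPF assemblies; the count-then-values layout mirrors BinarySerialize, and all names are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressedBinaryDemo
{
    // Writes a count followed by the values, the same layout idea as BinarySerialize.
    public static void Write(double[] values, Stream stream)
    {
        var writer = new BinaryWriter(stream);
        writer.Write(values.Length);
        foreach (double v in values) writer.Write(v);
        writer.Flush();
    }

    // Reads back what Write produced.
    public static double[] Read(Stream stream)
    {
        var reader = new BinaryReader(stream);
        int count = reader.ReadInt32();
        var values = new double[count];
        for (int i = 0; i < count; i++) values[i] = reader.ReadDouble();
        return values;
    }

    // Serializes through a GZipStream into memory, then reads the data back.
    public static double[] RoundTrip(double[] values)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, true))
            {
                Write(values, gzip);
            } // disposing the GZipStream finishes the compressed data

            buffer.Position = 0;
            using (var gzip = new GZipStream(buffer, CompressionMode.Decompress))
            {
                return Read(gzip);
            }
        }
    }

    public static void Main()
    {
        double[] restored = RoundTrip(new[] { 1.5, 2.25, -3.0 });
        Console.WriteLine(string.Join(" ", restored));
    }
}
```

The compressing stream has to be disposed (or closed) before the underlying data is read back; flushing the writer alone does not emit the final compressed block.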
Compression could potentially take advantage of repeated coordinates. Serializers will often use references for repeat objects as well, although I'm not sure there are many set up to work with structs (like Point3D). Anyhow, here are some examples of how to serialize this. To use the standard formatters, you need to convert the data type to something most of them support: list/array. The code below uses Nuget packages NUnit and Json.NET.
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;
using NUnit.Framework;
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Windows.Media.Media3D;
namespace DemoPoint3DSerialize
{
[TestFixture]
class Tests
{
[Test]
public void DemoBinary()
{
// this shows how to serialize them as arrays with BinaryFormatter
var collection = CreateCollection();
var data = collection.Select(c => c.ToArray()).ToList(); // switch to serializable types
var formatter = new BinaryFormatter();
using (var ms = new MemoryStream())
{
formatter.Serialize(ms, data);
Trace.WriteLine("Binary of Array Size: " + ms.Position);
ms.Position = 0;
var dupe = (List<Point3D[]>)formatter.Deserialize(ms);
var result = new ConcurrentBag<Point3DCollection>(dupe.Select(r => new Point3DCollection(r)));
VerifyEquality(collection, result);
}
}
[Test]
public void DemoString()
{
// this shows how to convert them all to strings
var collection = CreateCollection();
IEnumerable<IList<Point3D>> tmp = collection;
var strings = collection.Select(c => c.ToString()).ToList();
Trace.WriteLine("String Size: " + strings.Sum(s => s.Length)); // eh, 2x for Unicode
var result = new ConcurrentBag<Point3DCollection>(strings.Select(r => Point3DCollection.Parse(r)));
VerifyEquality(collection, result);
}
[Test]
public void DemoDeflateString()
{
// this shows how to convert them all to strings
var collection = CreateCollection();
var formatter = new BinaryFormatter(); // not really helping much: could
var strings = collection.Select(c => c.ToString()).ToList();
using (var ms = new MemoryStream())
{
using (var def = new DeflateStream(ms, CompressionLevel.Optimal, true))
{
formatter.Serialize(def, strings);
}
Trace.WriteLine("Deflate Size: " + ms.Position);
ms.Position = 0;
using (var def = new DeflateStream(ms, CompressionMode.Decompress))
{
var stringsDupe = (IList<string>)formatter.Deserialize(def);
var result = new ConcurrentBag<Point3DCollection>(stringsDupe.Select(r => Point3DCollection.Parse(r)));
VerifyEquality(collection, result);
}
}
}
[Test]
public void DemoStraightJson()
{
// this uses Json.NET
var collection = CreateCollection();
var formatter = new JsonSerializer();
using (var ms = new MemoryStream())
{
using (var stream = new StreamWriter(ms, new UTF8Encoding(true), 2048, true))
using (var writer = new JsonTextWriter(stream))
{
formatter.Serialize(writer, collection);
}
Trace.WriteLine("JSON Size: " + ms.Position);
ms.Position = 0;
using (var stream = new StreamReader(ms))
using (var reader = new JsonTextReader(stream))
{
var result = formatter.Deserialize<List<Point3DCollection>>(reader);
VerifyEquality(collection, new ConcurrentBag<Point3DCollection>(result));
}
}
}
[Test]
public void DemoBsonOfArray()
{
// this uses Json.NET
var collection = CreateCollection();
var formatter = new JsonSerializer();
using (var ms = new MemoryStream())
{
using (var stream = new BinaryWriter(ms, new UTF8Encoding(true), true))
using (var writer = new BsonWriter(stream))
{
formatter.Serialize(writer, collection);
}
Trace.WriteLine("BSON Size: " + ms.Position);
ms.Position = 0;
using (var stream = new BinaryReader(ms))
using (var reader = new BsonReader(stream, true, DateTimeKind.Unspecified))
{
var result = formatter.Deserialize<List<Point3DCollection>>(reader); // doesn't seem to read out that concurrentBag
VerifyEquality(collection, new ConcurrentBag<Point3DCollection>(result));
}
}
}
private ConcurrentBag<Point3DCollection> CreateCollection()
{
var rand = new Random(42);
var bag = new ConcurrentBag<Point3DCollection>();
for (int i = 0; i < 10; i++)
{
var collection = new Point3DCollection();
for (int j = 0; j < i + 10; j++)
{
var point = new Point3D(rand.NextDouble(), rand.NextDouble(), rand.NextDouble());
collection.Add(point);
}
bag.Add(collection);
}
return bag;
}
private class CollectionComparer : IEqualityComparer<Point3DCollection>
{
public bool Equals(Point3DCollection x, Point3DCollection y)
{
return x.SequenceEqual(y);
}
public int GetHashCode(Point3DCollection obj)
{
return obj.GetHashCode();
}
}
private void VerifyEquality(ConcurrentBag<Point3DCollection> collection, ConcurrentBag<Point3DCollection> result)
{
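var first = collection.OrderBy(c => c.Count);
// compare against the deserialized result, not the original collection again
var second = result.OrderBy(c => c.Count);
// note: the comparison result is discarded here; wrap the call in Assert.IsTrue to enforce it
first.SequenceEqual(second, new CollectionComparer());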
}
}
}
Use protobuf-net. protobuf-net is an open source .NET implementation of Google's protocol buffers binary serialization format, which can be used as a replacement for the BinaryFormatter serializer. It is probably going to be the fastest solution and the easiest to implement.
Here is a link to the main project page for protobuf-net. On the left, you'll find the downloads for the most recent binaries.
https://code.google.com/p/protobuf-net/
Here is a great article that you might want to look at first to get a feel for how it works.
http://wallaceturner.com/serialization-with-protobuf-net
Here is a link to a discussion on the project's issue tracker about your specific problem. The answer is at the bottom of the page. That's where I got the code below, into which I substituted details from your post.
https://code.google.com/p/protobuf-net/issues/detail?id=354
I haven't used it myself but it looks like a very good solution to your stated needs. From what I gather, your code would end up some variation of this.
[ProtoContract]
public class MyClass {
public ConcurrentBag<Point3DCollection> Points {get;set;}
[ProtoMember(1)]
private Point3DCollection[] Items
{
get { return Points.ToArray(); }
// assign the Points property; assigning Items here would recurse forever
set { Points = new ConcurrentBag<Point3DCollection>(value); }
}
}
I wish you the best of luck. Take care.
For a large amount of data, why not consider SQLite or another small database system that can store structured data in a file?
I have seen many 3D programs use a database to store structure along with relations, which allows them to insert, update or delete parts of the data.
A benefit of SQLite or another database is multithreaded serialization to improve speed. However, you need to do a little work to enable multithreaded SQLite connections; otherwise you can use LocalDB of SQL Express or even SQL Compact.
Some of the workload of loading data can also be done through queries, which the database will index nicely. And most of the work can be done on a background worker without interfering with the user interface.
SQLite has limited multithreading support, which is described here: http://www.sqlite.org/threadsafe.html
SQL Compact is thread safe and can be installed without admin privileges. You can use Entity Framework with it as well.