Read Excel files dynamically, not depending on rows, and write JSON - C#

I'm trying to generate a JSON file from Excel files. I have several different Excel files, and I would like to read them and generate a JSON file from each. I imagine it must be quite easy, but I'm having some trouble.
OK, so I'm using the ExcelDataReader tool, as that is what my lead says we should use. I tried following this post: https://www.hanselman.com/blog/ConvertingAnExcelWorksheetIntoAJSONDocumentWithCAndNETCoreAndExcelDataReader.aspx
I always get the ReadTimeout and WriteTimeout errors. It also never reads my Excel file; it always writes null into my JSON document.
public static IActionResult GetData(
    [HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req,
    ILogger log)
{
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    var inFilePath = "C:\\Users\\a\\Desktop\\exelreader\\Wave.xlsx";
    var outFilePath = "C:\\Users\\a\\Desktop\\exelreader\\text.json";
    using (var inFile = File.Open(inFilePath, FileMode.Open, FileAccess.Read))
    using (var outFile = File.CreateText(outFilePath))
    {
        using (var reader = ExcelReaderFactory.CreateReader(inFile, new ExcelReaderConfiguration()
            { FallbackEncoding = Encoding.GetEncoding(1252) }))
        using (var writer = new JsonTextWriter(outFile))
        {
            writer.Formatting = Formatting.Indented; //I likes it tidy
            writer.WriteStartArray();
            reader.Read(); //SKIP FIRST ROW, it's TITLES.
            do
            {
                while (reader.Read())
                {
                    //peek ahead? Bail before we start anything so we don't get an empty object
                    var status = reader.GetString(1);
                    if (string.IsNullOrEmpty(status)) break;
                    writer.WriteStartObject();
                    writer.WritePropertyName("Source");
                    writer.WriteValue(reader.GetString(1));
                    writer.WritePropertyName("Event");
                    writer.WriteValue(reader.GetString(2));
                    writer.WritePropertyName("Campaign");
                    writer.WriteValue(reader.GetString(3));
                    writer.WritePropertyName("EventDate");
                    writer.WriteValue(reader.GetString(4));
                    //writer.WritePropertyName("FirstName");
                    //writer.WriteValue(reader.GetString(5).ToString());
                    //writer.WritePropertyName("LastName");
                    //writer.WriteValue(reader.GetString(6).ToString());
                    writer.WriteEndObject();
                }
            } while (reader.NextResult());
            writer.WriteEndArray();
        }
    }
    //never mind this return
    return null;
}
Can anybody help with this? The idea is to read the first row of each Excel file as the headers and the remaining rows as values, so I can write the JSON.

To convert Excel data to JSON, you could try reading the Excel data into a DataSet and then serializing that DataSet to JSON.
Try the code below:
public async Task<IActionResult> ConvertExcelToJson()
{
    var inFilePath = @"xx\Wave.xlsx";
    var outFilePath = @"xx\text.json";
    using (var inFile = System.IO.File.Open(inFilePath, FileMode.Open, FileAccess.Read))
    using (var outFile = System.IO.File.CreateText(outFilePath))
    {
        using (var reader = ExcelReaderFactory.CreateReader(inFile, new ExcelReaderConfiguration()
            { FallbackEncoding = Encoding.GetEncoding(1252) }))
        {
            var ds = reader.AsDataSet(new ExcelDataSetConfiguration()
            {
                ConfigureDataTable = (_) => new ExcelDataTableConfiguration()
                {
                    UseHeaderRow = true
                }
            });
            var table = ds.Tables[0];
            var json = JsonConvert.SerializeObject(table, Formatting.Indented);
            outFile.Write(json);
        }
    }
    return Ok();
}
For AsDataSet, install the package ExcelDataReader.DataSet. If you get an error related to Encoding.GetEncoding(1252), register the code-page provider below in Startup.cs:
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
Reference: ExcelDataReader
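Since the question asks for reading files dynamically, not depending on a fixed layout: the DataSet approach already adapts to whatever columns the header row declares, and it can be extended to cover every sheet in the workbook. A minimal sketch (the method name and the one-JSON-property-per-sheet shape are my own choices, not part of the original answer):

// Sketch: serialize every sheet, keyed by sheet name, so the output
// does not depend on a fixed number of rows, columns, or sheets.
// Needs System.Data, System.Linq, ExcelDataReader, ExcelDataReader.DataSet, Newtonsoft.Json.
public static void ConvertAllSheetsToJson(string inFilePath, string outFilePath)
{
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    using (var inFile = File.Open(inFilePath, FileMode.Open, FileAccess.Read))
    using (var reader = ExcelReaderFactory.CreateReader(inFile))
    {
        var ds = reader.AsDataSet(new ExcelDataSetConfiguration
        {
            ConfigureDataTable = _ => new ExcelDataTableConfiguration { UseHeaderRow = true }
        });
        // One JSON property per sheet; each sheet becomes an array of row objects.
        var sheets = ds.Tables.Cast<DataTable>().ToDictionary(t => t.TableName, t => t);
        File.WriteAllText(outFilePath, JsonConvert.SerializeObject(sheets, Formatting.Indented));
    }
}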

System.Text.Json.* & deserialization of streams

I have this original code:
public async Task<ActionResult> Chunk_Upload_Save(IEnumerable<IFormFile> files, string metaData)
{
    if (metaData == null)
    {
        return await Save(files);
    }
    MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(metaData));
    JsonSerializer serializer = new JsonSerializer();
    ChunkMetaData chunkData;
    using (StreamReader streamReader = new StreamReader(ms))
    {
        chunkData = (ChunkMetaData)serializer.Deserialize(streamReader, typeof(ChunkMetaData));
    }
    string path = String.Empty;
    // The Name of the Upload component is "files"
    if (files != null)
    {
        foreach (var file in files)
        {
            path = Path.Combine(WebHostEnvironment.WebRootPath, "App_Data", chunkData.FileName);
            //AppendToFile(path, file);
        }
    }
    FileResult fileBlob = new FileResult();
    fileBlob.uploaded = chunkData.TotalChunks - 1 <= chunkData.ChunkIndex;
    fileBlob.fileUid = chunkData.UploadUid;
    return Json(fileBlob);
}
I converted it to use only System.Text.Json.*:
public async Task<ActionResult> Chunk_Upload_Save(IEnumerable<IFormFile> files, string metaData)
{
    if (metaData == null)
    {
        return await Save(files);
    }
    var ms = new MemoryStream(Encoding.UTF8.GetBytes(metaData));
    ChunkMetaDataModel chunkData;
    using (var streamReader = new StreamReader(ms))
    {
        // Here is the issue
        chunkData = (ChunkMetaDataModel) await JsonSerializer.DeserializeAsync(streamReader, typeof(ChunkMetaDataModel));
    }
    // The Name of the Upload component is "files"
    if (files != null)
    {
        foreach (var file in files)
        {
            Path.Combine(hostEnvironment.WebRootPath, "App_Data", chunkData!.FileName);
            //AppendToFile(path, file);
        }
    }
    var fileBlob = new FileResultModel
    {
        uploaded = chunkData!.TotalChunks - 1 <= chunkData.ChunkIndex,
        fileUid = chunkData.UploadUid
    };
    return Json(fileBlob);
}
I get the error:
Argument 1: cannot convert from 'System.IO.StreamReader' to 'System.IO.Stream'.
By Argument 1, VS is pointing to the streamReader parameter and it's this line:
chunkData = (ChunkMetaData)serializer.Deserialize(streamReader, typeof(ChunkMetaData));
How do I convert this to the System.Text.Json API?
System.Text.Json is designed to deserialize most efficiently from UTF-8 byte sequences rather than UTF-16 strings, so there is no overload that deserializes from a StreamReader. Instead, deserialize directly from the MemoryStream ms using the following:
chunkData = await JsonSerializer.DeserializeAsync<ChunkMetaDataModel>(ms);
Notes:
There is no reason to use async deserialization when deserializing from a MemoryStream. Instead, use synchronous deserialization like so:
chunkData = JsonSerializer.Deserialize<ChunkMetaDataModel>(ms);
And since you already have a string metaData containing the JSON to be deserialized, you can deserialize directly from it using the Deserialize<TValue>(ReadOnlySpan<Char>, JsonSerializerOptions) overload:
chunkData = JsonSerializer.Deserialize<ChunkMetaDataModel>(metaData);
System.Text.Json will do the UTF-16 to UTF-8 conversion for you internally using memory pooling.
If you really must deserialize from a StreamReader for some reason (e.g. incremental integration of System.Text.Json with legacy code), see Reading string as a stream without copying for suggestions on how to do this.
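For completeness, a sketch of how the converted method's deserialization block collapses once you use the string overload (everything else in Chunk_Upload_Save stays the same):

// No MemoryStream or StreamReader needed: deserialize straight from the string.
ChunkMetaDataModel chunkData = JsonSerializer.Deserialize<ChunkMetaDataModel>(metaData);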

CSV appears to be corrupt on Double quotes in Headers - C#

I was trying to read a CSV file in C#.
I first tried File.ReadAllLines(path).Select(a => a.Split(';')), but that approach fails when a cell contains a multi-line (\n) value.
So I tried the following:
using LumenWorks.Framework.IO.Csv;
var csvTable = new DataTable();
using (TextReader fileReader = File.OpenText(path))
using (var csvReader = new CsvReader(fileReader, false))
{
    csvTable.Load(csvReader);
}
for (int i = 0; i < csvTable.Rows.Count; i++)
{
    if (!(csvTable.Rows[i][0] is DBNull))
    {
        var row1 = csvTable.Rows[i][0];
    }
    if (!(csvTable.Rows[i][1] is DBNull))
    {
        var row2 = csvTable.Rows[i][1];
    }
}
The above code throws this exception:
The CSV appears to be corrupt near record '0' field '5' at position '63'
This is because the CSV headers contain doubled double quotes, as below:
"Header1",""Header2""
Is there a way to ignore the doubled quotes and process these CSVs?
Update:
I have tried TextFieldParser as below:
public static void GetCSVData()
{
    using (var parser = new TextFieldParser(path))
    {
        parser.HasFieldsEnclosedInQuotes = false;
        parser.Delimiters = new[] { "," };
        while (parser.PeekChars(1) != null)
        {
            string[] fields = parser.ReadFields();
            foreach (var field in fields)
            {
                Console.Write(field + " ");
            }
            Console.WriteLine(Environment.NewLine);
        }
    }
}
(Screenshots of the output and of the sample CSV data omitted.)
Any help is appreciated.
Hope this works!
Replace the doubled double quotes in the CSV with single ones, as below:
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
{
    StreamReader sr = new StreamReader(fs);
    string contents = sr.ReadToEnd();
    // replace "" with "
    contents = contents.Replace("\"\"", "\"");
    // go back to the beginning of the stream
    fs.Seek(0, SeekOrigin.Begin);
    // adjust the length to make sure all the original
    // contents are overwritten
    fs.SetLength(contents.Length);
    StreamWriter sw = new StreamWriter(fs);
    sw.Write(contents);
    sw.Close();
}
Then use the same CSV helper:
using LumenWorks.Framework.IO.Csv;
var csvTable = new DataTable();
using (TextReader fileReader = File.OpenText(path))
using (var csvReader = new CsvReader(fileReader, false))
{
    csvTable.Load(csvReader);
}
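If you would rather not rewrite the file on disk, the same replacement can be done in memory before handing the text to CsvReader. A sketch (note that a blanket Replace also unescapes any legitimately doubled quotes inside quoted fields, so only use it if your data has none):

using LumenWorks.Framework.IO.Csv;
var csvTable = new DataTable();
// Fix the doubled quotes in memory, then parse from a StringReader.
string contents = File.ReadAllText(path).Replace("\"\"", "\"");
using (var csvReader = new CsvReader(new StringReader(contents), false))
{
    csvTable.Load(csvReader);
}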
Thanks.

Writing to a JSON File and reading from it

I need to save the information from an input page into a JSON file and output the information onto another page by reading from that JSON file. I've tried many things, and what seemed to work for me is using the special folder LocalApplicationData.
Now I don't quite understand how I can output the information, and also check whether the data is even being written correctly.
I previously used a StreamReader to output the information from the JSON file and then put it on a ListView, but this doesn't work if the file is in the special folder. It says "stream can't be null". The commented-out code is from my previous attempts.
Code:
ListPageVM (Read Page)
private ObservableCollection<MainModel> data;
public ObservableCollection<MainModel> Data
{
    get { return data; }
    set { data = value; OnPropertyChanged(); }
}
public ListPageVM()
{
    var assembly = typeof(ListPageVM).GetTypeInfo().Assembly;
    Stream stream = assembly.GetManifestResourceStream(Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "eintraege.json") /*"SaveUp.Resources.eintraege.json"*/);
    //var file = Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "eintraege.json");
    using (var reader = new StreamReader(stream))
    {
        var json = reader.ReadToEnd();
        List<MainModel> dataList = JsonConvert.DeserializeObject<List<MainModel>>(json);
        data = new ObservableCollection<MainModel>(dataList);
    }
}
MainPageVM (Write Page)
public Command Einfügen
{
    get
    {
        return new Command(() =>
        {
            // Data into the JSON
            _mainModels.Add(DModel);
            Datum = DateTime.Now.ToString("dd.MM.yyyy");
            //var assembly = typeof(ListPageVM).GetTypeInfo().Assembly;
            //FileStream stream = new FileStream("SaveUp.Resources.eintraege.json", FileMode.OpenOrCreate, FileAccess.Write);
            var file = Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "eintraege.json");
            //Stream stream = assembly.GetManifestResourceStream("SaveUp.Resources.eintraege.json");
            if (!File.Exists(file))
            {
                File.Create(file);
            }
            using (var writer = File.AppendText(file))
            {
                string data = JsonConvert.SerializeObject(_mainModels);
                writer.WriteLine(data);
            }
        });
    }
}
You are trying to read and write embedded resources, not files. That won't work. Instead, do this:
var path = Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "eintraege.json");
File.WriteAllText(path, myjson);
To read the data back:
var json = File.ReadAllText(path);
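Putting it together with the view models above, a minimal sketch (assuming MainModel and Newtonsoft.Json as in the question; note that the original code appends, which leaves several JSON documents in one file, so overwrite instead):

var path = Path.Combine(System.Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "eintraege.json");

// Write page: overwrite the file with the whole serialized collection.
File.WriteAllText(path, JsonConvert.SerializeObject(_mainModels));

// Read page: guard against a missing file on first start.
if (File.Exists(path))
{
    var dataList = JsonConvert.DeserializeObject<List<MainModel>>(File.ReadAllText(path));
    Data = new ObservableCollection<MainModel>(dataList);
}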

C# FileStream: how to use UTF-8 encoding

I am using Papa Parse as the CSV parser for my data, but I can't convert the data to UTF-8.
string name = dataitem.Headers.ContentDisposition.FileName.Replace("\"", "");
string newFileName = Guid.NewGuid() + Path.GetExtension(name);
File.Move(dataitem.LocalFileName, Path.Combine(rootPath, newFileName));
List<JObject> rows = new List<JObject>();
using (FileStream stream = File.OpenRead(Path.Combine(rootPath, newFileName)))
{
    Papa.parse(stream, new Config()
    {
        header = true,
        skipEmptyLines = true,
        encoding = Encoding.UTF8,
        complete = parsed =>
        {
            foreach (JObject jo in JArray.Parse(parsed.dataWithHeader.DumpAsJson()))
                rows.Add(jo);
            var dt = new DataTable();
            dt.Columns.Add("data");
            foreach (object jo in rows)
                dt.Rows.Add(jo.ToString());
            if (result.Rows[0]["Result"].ToString() == "False")
            {
                throw new Exception(result.Rows[0]["Message"].ToString());
            }
        }
    });
}
File.Delete(Path.Combine(rootPath, newFileName));
When Papa Parse parses the data, it turns Evelyn N Baliño into Evelyn N Bali�o, even though I already set the encoding to UTF8. What am I doing wrong? Should I specify the encoding in the FileStream?
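The � replacement character usually means the bytes on disk were not valid UTF-8 to begin with (for example, the file was saved as Windows-1252), so setting encoding = Encoding.UTF8 on the parser cannot recover them. A sketch of transcoding the file to UTF-8 before parsing, assuming the source really is Windows-1252 (verify that assumption against your files):

// Assumption: the uploaded CSV is Windows-1252, not UTF-8.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
string csvPath = Path.Combine(rootPath, newFileName);
string text = File.ReadAllText(csvPath, Encoding.GetEncoding(1252));
File.WriteAllText(csvPath, text, new UTF8Encoding(false)); // rewrite as UTF-8, no BOM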

Editing custom XML part in Word document sometimes corrupts document

We have a system that stores some custom templating data in a Word document. Sometimes, updating this data causes Word to complain that the document is corrupted. When that happens, if I unzip the docx file and compare the contents to the previous version, the only difference appears to be the expected change in the customXML\item.xml file. If I re-zip the contents using 7zip, it seems to work OK (Word no longer complains that the document is corrupt).
The (simplified) code:
void CreateOrReplaceCustomXml(string filename, MyCustomData data)
{
    using (var doc = WordprocessingDocument.Open(filename, true))
    {
        var part = GetCustomXmlParts(doc).SingleOrDefault();
        if (part == null)
        {
            part = doc.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
        }
        var serializer = new DataContractSerializer(typeof(MyCustomData));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, data);
            stream.Seek(0, SeekOrigin.Begin);
            part.FeedData(stream);
        }
    }
}
IEnumerable<CustomXmlPart> GetCustomXmlParts(WordprocessingDocument doc)
{
    return doc.MainDocumentPart.CustomXmlParts
        .Where(part =>
        {
            using (var stream = doc.Package.GetPart(part.Uri).GetStream())
            using (var streamReader = new StreamReader(stream))
            {
                return streamReader.ReadToEnd().Contains("Some.Namespace");
            }
        });
}
Any suggestions?
Since re-zipping fixes it, the content itself is presumably well-formed, which suggests the zip process is at fault. Open the corrupted docx in 7-Zip and take note of the values in the "Method" column (especially for customXML\item.xml).
Compare that value to a working docx: is it the same or different? Method "Deflate" works.
I faced the same issue, and it turned out it was due to encoding.
Do you already specify the same encoding when serializing and deserializing?
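If encoding is the culprit, one option (a sketch only; the serializer setup is taken from the question's code) is to write the part through an XmlWriter with an explicit UTF-8 encoding, so the XML declaration and the actual bytes agree:

using (var stream = new MemoryStream())
{
    var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) }; // UTF-8, no BOM
    using (var xmlWriter = XmlWriter.Create(stream, settings))
    {
        var serializer = new DataContractSerializer(typeof(MyCustomData));
        serializer.WriteObject(xmlWriter, data);
    }
    stream.Seek(0, SeekOrigin.Begin);
    part.FeedData(stream);
}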
A couple of suggestions:
a. Try doc.Package.Flush(); after you write the data back into the custom XML part.
b. You may have to delete all the custom parts and add a new one. We are using the following code and it seems to work fine.
public static void ReplaceCustomXML(WordprocessingDocument myDoc, string customXML)
{
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);
    CustomXmlPart customXmlPart = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
    using (StreamWriter ts = new StreamWriter(customXmlPart.GetStream()))
    {
        ts.Write(customXML);
        ts.Flush();
        ts.Close();
    }
}

public static MemoryStream GetCustomXmlPart(MainDocumentPart mainPart)
{
    foreach (CustomXmlPart part in mainPart.CustomXmlParts)
    {
        using (XmlTextReader reader =
            new XmlTextReader(part.GetStream(FileMode.Open, FileAccess.Read)))
        {
            reader.MoveToContent();
            if (reader.Name.Equals("aaaa", StringComparison.OrdinalIgnoreCase))
            {
                string str = reader.ReadOuterXml();
                byte[] byteArray = Encoding.ASCII.GetBytes(str);
                MemoryStream stream = new MemoryStream(byteArray);
                return stream;
            }
        }
    }
    return null; //result;
}
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(ms, true))
{
    StreamReader reader = new StreamReader(memStream);
    string FullXML = reader.ReadToEnd();
    ReplaceCustomXML(myDoc, FullXML);
    myDoc.Package.Flush();
    //Code to save file
}
