I'm trying to save .docx files in a database, and the code shown here is where I'm converting the .docx file into a byte array and then trying to save it into the database.
I'm getting an error
String or binary data would be truncated
I used a column of type varbinary(max) in the database, and the same code works for PDF and text files, but it's not working for .docx.
Please guide me.
Controller:
try
{
byte[] byteDocument = new byte[0];
if (file.Length > 0)
{
long length = file.Length;
using var fileStream = file.OpenReadStream();
byteDocument = new byte[length];
fileStream.Read(byteDocument, 0, (int)file.Length);
_attachmentDto = new ReviewAttachmentDto
{
ReviewId = reviewId,
DocumentType = file.ContentType,
Document = byteDocument
};
}
string requestBody = JsonConvert.SerializeObject(_attachmentDto);
// Call API
var _responseObj = await WebAPIHelper.PostDataToAPI(appSettings.SaveUrl, requestBody);
}
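As a side note (probably not the cause of the truncation error), Stream.Read is not guaranteed to fill the whole buffer in a single call. A safer pattern, sketched here against the same file variable used above, is to copy the upload into a MemoryStream:
byte[] byteDocument = Array.Empty<byte>();
if (file.Length > 0)
{
    using var fileStream = file.OpenReadStream();
    using var memoryStream = new MemoryStream();
    // CopyTo keeps reading until the source stream is exhausted,
    // so the resulting array always holds the complete document.
    fileStream.CopyTo(memoryStream);
    byteDocument = memoryStream.ToArray();
}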
Database save:
public void SaveAction(ReviewAttachment reviewAttachment)
{
Entities.Surveillance.ReviewAttachment reviewAttachmentDB = new Entities.Surveillance.ReviewAttachment();
reviewAttachmentDB.ReviewId = Int32.Parse(reviewAttachment.ReviewId);
reviewAttachmentDB.DocumentType = reviewAttachment.DocumentType;
reviewAttachmentDB.Document = reviewAttachment.Document;
context.Add(reviewAttachmentDB);
context.SaveChanges();
}
Since I doubt that your Word document is over 2 GB, I would suggest checking whether you have reached the size limit of your actual database file. You may need to enable autogrowth on your database. Check the answer here; that person was getting the same error message with a varbinary(max) column.
https://stackoverflow.com/a/11006473/1461269
Also, to find out the implications of enabling this option you can read about it on Microsoft's support site: https://support.microsoft.com/en-ca/help/315512/considerations-for-the-autogrow-and-autoshrink-settings-in-sql-server
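If you want to inspect the current file sizes and growth settings from code before changing anything, a rough sketch (the connection string is a placeholder) is to query sys.database_files:
// Requires System.Data.SqlClient. Sizes in sys.database_files are in 8 KB pages;
// growth is either pages or a percentage, depending on is_percent_growth.
using (var connection = new SqlConnection("<your connection string>"))
using (var command = new SqlCommand(
    "SELECT name, size, max_size, growth, is_percent_growth FROM sys.database_files",
    connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine(
                $"{reader.GetString(0)}: size={reader.GetInt32(1)} pages, " +
                $"max_size={reader.GetInt32(2)}, growth={reader.GetInt32(3)}");
        }
    }
}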
I'm using the Azure SDK for .NET to manipulate files on the data lake (Gen2).
Within an Azure Function, I would like to add some data to a csv file stored on the data lake.
I came up with this method, which should work according to the documentation (or I did not fully understand it).
The problem is that the data is not 'flushed' to the file; it keeps its original content.
I can't figure out what's going on here, I'm afraid :-(
Any tips ?
Regards,
Sven Peeters
PS : I must add data incrementally, otherwise the memory consumption can become an issue here.
public void AddFileContents(string fullPath, string content, string leaseId = null)
{
DataLakeFileClient dataLakeFileClient = GetFileSystemClient().GetFileClient(fullPath);
dataLakeFileClient.CreateIfNotExists();
long currentLength = dataLakeFileClient.GetProperties().Value.ContentLength;
byte[] byteArray = Encoding.UTF8.GetBytes(content);
MemoryStream mStream = new MemoryStream(byteArray);
long fileSize = mStream.Length;
dataLakeFileClient.Append(mStream, currentLength, leaseId: leaseId);
dataLakeFileClient.Flush(position: currentLength, close: true, conditions: new DataLakeRequestConditions() { LeaseId = leaseId });
}
According to the API documentation, you should change position: currentLength to position: currentLength + fileSize in the Flush method. The position parameter should equal the length of the file after all appends have completed.
To flush, the previously uploaded data must be contiguous, the position parameter must be specified and equal to the length of the file after all data has been written, and there must not be a request entity body included with the request.
Code:
public static void AddFileContents(string fullPath, string content, string leaseId = null)
{
DataLakeFileClient dataLakeFileClient = GetFileSystemClient().GetFileClient(fullPath);
dataLakeFileClient.CreateIfNotExists();
long currentLength = dataLakeFileClient.GetProperties().Value.ContentLength;
byte[] byteArray = Encoding.UTF8.GetBytes(content);
MemoryStream mStream = new MemoryStream(byteArray);
long fileSize = mStream.Length;
dataLakeFileClient.Append(mStream, currentLength, leaseId: leaseId);
dataLakeFileClient.Flush(position: currentLength + fileSize, close: true, conditions: new DataLakeRequestConditions() { LeaseId = leaseId });
}
I am trying to verify that a file is a .rar file through its bytes, for security purposes. The following code is mine; the only problem is that the sub-header does not match the one generated from the file, and I noticed it is different for each file. Could you please explain to me why?
static bool IsRARFile(string filePath)
{
bool isDocFile = false;
//
// File sigs from: http://www.garykessler.net/library/file_sigs.html
//
string msOfficeHeader = "52-61-72-21-1A-07-00-CF";
string docSubHeader = "64-2E-63-73";
using (Stream stream = File.OpenRead(filePath))
{
//get file header
byte[] headerBuffer = new byte[8];
stream.Read(headerBuffer, 0, headerBuffer.Length);
string headerString = BitConverter.ToString(headerBuffer);
if (headerString.Equals(msOfficeHeader, StringComparison.InvariantCultureIgnoreCase))
{
//get subheader
byte[] subHeaderBuffer = new byte[4];
stream.Seek(512, SeekOrigin.Begin);
stream.Read(subHeaderBuffer, 0, subHeaderBuffer.Length);
string subHeaderString = BitConverter.ToString(subHeaderBuffer);
if (subHeaderString.Equals(docSubHeader, StringComparison.InvariantCultureIgnoreCase))
{
isDocFile = true;
}
}
}
return isDocFile;
}
This is because you have just copied a function from somewhere that was written for a different file type, and not every file type has a notion of a "sub-header". In the case of RAR you only need to check the main header.
I also suggest fixing the variable names; it is quite a mishmash when the function says it checks for the RAR type while internally all the variables refer to DOCs.
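As a rough sketch of a header-only check (signature values taken from the same file-signature reference linked in the question's code), covering both the RAR 4.x and RAR 5.x signatures:
static bool IsRarFile(string filePath)
{
    // "Rar!" + 0x1A 0x07: RAR 4.x ends with 0x00, RAR 5.x with 0x01 0x00.
    byte[] rar4Signature = { 0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00 };
    byte[] rar5Signature = { 0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x01, 0x00 };

    using (Stream stream = File.OpenRead(filePath))
    {
        byte[] header = new byte[8];
        int bytesRead = stream.Read(header, 0, header.Length);

        return StartsWith(header, bytesRead, rar4Signature)
            || StartsWith(header, bytesRead, rar5Signature);
    }
}

static bool StartsWith(byte[] buffer, int length, byte[] signature)
{
    if (length < signature.Length) return false;
    for (int i = 0; i < signature.Length; i++)
    {
        if (buffer[i] != signature[i]) return false;
    }
    return true;
}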
I want to save the contents of my HTML editor in a database (MSSQL). Encoding the data and saving it works fine, but when I restore or open the data afterwards, nothing opens.
I think the problem is the encoding standard.
using (var output = new MemoryStream())
{
string docx = string.Empty;
byte[] bytesDocx = null;
this.ASPxHtmlEditorTemplate.Export(HtmlEditorExportFormat.Docx, output);//pass the data of the editor and assign to stream
output.Flush();
output.Position = 0;
using (var reader = new StreamReader(output))
{
docx = reader.ReadToEnd();// assign the stream to a STRING
bytesDocx = Encoding.UTF8.GetBytes(docx); // encode the STRING as UTF-8
using (var uow = new UnitOfWork())
{
//here some instructions to save the "bytes" to TEXT in a MSSQL DB
uow.CommitChanges();
}
}
}
I suggest you either store the *.docx as an XML document (because it is really XML):
XmlDocument xmldoc = new XmlDocument();
xmldoc.Load(docxStream);
string text = xmldoc.DocumentElement.InnerText;
or store the stream's bytes directly in the database without any "to-string" conversion, using the technique described in the Save byte[] into a SQL Server database from C# article.
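For the second option, a minimal sketch of writing the exported bytes straight into a varbinary(max) column with plain ADO.NET might look like this (the table and column names are placeholders, and output is the MemoryStream from the code above):
// Requires System.Data and System.Data.SqlClient.
byte[] bytesDocx = output.ToArray(); // raw *.docx bytes, no string round-trip

using (var connection = new SqlConnection("<your connection string>"))
using (var command = new SqlCommand(
    "INSERT INTO Documents (DocumentType, Content) VALUES (@type, @content)", connection))
{
    command.Parameters.AddWithValue("@type", "docx");
    // Length -1 maps the parameter to varbinary(max).
    command.Parameters.Add("@content", SqlDbType.VarBinary, -1).Value = bytesDocx;

    connection.Open();
    command.ExecuteNonQuery();
}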
I'm currently trying to use iTextSharp to do some PDF field mapping, but the challenging part right now is just saving the modified file in a varbinary(max) column. Then I later need to read that blob back and convert it into a PDF that I save to a file.
I've been all over looking at example code but I can't find exactly what I'm looking for, and can't seem to piece together the [read from file to iTextSharp object] -> [do my stuff] -> [convert to varbinary(max)] pipeline, nor the conversion of that blob back into a savable file.
If anyone has code snippet examples that would be extremely helpful. Thanks!
The need to deal with a PDF in multiple passes was not immediately clear when I first started working with them, so maybe this is of some help to you.
In the method below, we create a PDF, render it to a byte[], load it for post-processing, render the PDF again and return the result.
The rest of your question deals with getting a byte[] into and out of a varbinary(max) column, and saving a byte[] to a file and reading it back, which you can google easily enough; a rough sketch follows the method below.
public byte[] PdfGeneratorAndPostProcessor()
{
byte[] newPdf;
using (var pdf = new MemoryStream())
using (var doc = new Document(iTextSharp.text.PageSize.A4))
using (PdfWriter.GetInstance(doc, pdf))
{
doc.Open();
// do stuff to the newly created doc...
doc.Close();
newPdf = pdf.ToArray(); // ToArray() returns only the written bytes; GetBuffer() can include unused buffer space
}
byte[] postProcessedPdf;
var reader = new PdfReader(newPdf);
using (var pdf = new MemoryStream())
using (var stamper = new PdfStamper(reader, pdf))
{
var pageCount = reader.NumberOfPages;
for (var i = 1; i <= pageCount; i++)
{
// do something on each page of the existing pdf
}
stamper.Close();
postProcessedPdf = pdf.ToArray();
}
reader.Close();
return postProcessedPdf;
}
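For the varbinary(max) round trip itself, a rough sketch (table, column, and path names are placeholders) is a parameterized INSERT of the byte[] followed later by an ExecuteScalar read that is written back to disk:
// Requires System.Data, System.Data.SqlClient and System.IO.
byte[] pdfBytes = PdfGeneratorAndPostProcessor();

// Save the generated PDF into a varbinary(max) column.
using (var connection = new SqlConnection("<your connection string>"))
using (var insert = new SqlCommand(
    "INSERT INTO PdfStore (Name, Content) VALUES (@name, @content)", connection))
{
    insert.Parameters.AddWithValue("@name", "report.pdf");
    insert.Parameters.Add("@content", SqlDbType.VarBinary, -1).Value = pdfBytes;
    connection.Open();
    insert.ExecuteNonQuery();
}

// Later: read the blob back and save it as a .pdf file.
using (var connection = new SqlConnection("<your connection string>"))
using (var select = new SqlCommand(
    "SELECT Content FROM PdfStore WHERE Name = @name", connection))
{
    select.Parameters.AddWithValue("@name", "report.pdf");
    connection.Open();
    byte[] storedPdf = (byte[])select.ExecuteScalar();
    File.WriteAllBytes(@"C:\temp\report.pdf", storedPdf);
}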
I have an ASP .Net (3.5) website. I have the following code that uploads a file as a binary to a SQL Database:
Print("
protected void UploadButton_Click(object sender, EventArgs e)
{
//Get the posted file
Stream fileDataStream = FileUpload.PostedFile.InputStream;
//Get length of file
int fileLength = FileUpload.PostedFile.ContentLength;
//Create a byte array with file length
byte[] fileData = new byte[fileLength];
//Read the stream into the byte array
fileDataStream.Read(fileData, 0, fileLength);
//get the file type
string fileType = FileUpload.PostedFile.ContentType;
//Open Connection
WebSysDataContext db = new WebSysDataContext(Contexts.WEBSYS_CONN());
//Create New Record
BinaryStore NewFile = new BinaryStore();
NewFile.BinaryID = "1";
NewFile.Type = fileType;
NewFile.BinaryFile = fileData;
//Save Record
db.BinaryStores.InsertOnSubmit(NewFile);
try
{
db.SubmitChanges();
}
catch (Exception)
{
throw;
}
}");
The files that will be uploaded are PDFs. Can you please help me write the code to get the PDF out of the SQL database and display it in the browser? (I am able to get the binary file using a LINQ query, but I am not sure how to process the bytes.)
So are you really just after how to serve a byte array in ASP.NET? It sounds like the database part is irrelevant, given that you've said you are able to get the binary file with a LINQ query.
If so, look at HttpResponse.BinaryWrite. You should also set the content type of the response appropriately, e.g. application/pdf.
How big are the files? Huge buffers (i.e. byte[fileLength]) are usually a bad idea.
Personally, I'd look at things like this and this, which show reading/writing data as streams (the second shows pushing the stream as an HTTP response), but updated to use varbinary(max) ;-p
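For the "display it in the browser" part, a minimal WebForms sketch looks like this; GetPdfBytes is a placeholder name standing in for your existing LINQ lookup:
protected void ViewPdf_Click(object sender, EventArgs e)
{
    // Placeholder: your existing LINQ query that returns the stored byte[].
    byte[] pdfBytes = GetPdfBytes("1");

    Response.Clear();
    Response.ContentType = "application/pdf";
    // "inline" asks the browser to render the PDF instead of downloading it.
    Response.AddHeader("Content-Disposition", "inline; filename=document.pdf");
    Response.BinaryWrite(pdfBytes);
    Response.End();
}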
protected void Test_Click(object sender, EventArgs e)
{
WebSysDataContext db = new WebSysDataContext(Contexts.WEBSYS_CONN());
var GetFile = from x in db.BinaryStores
where x.BinaryID == "1"
select x.BinaryFile;
FileStream MyFileStream;
long FileSize;
MyFileStream = new FileStream(GetFile, FileMode.Open);
FileSize = MyFileStream.Length;
byte[] Buffer = new byte[(int)FileSize];
MyFileStream.Read(Buffer, 0, (int)FileSize);
MyFileStream.Close();
Response.Write("<b>File Contents: </b>");
Response.BinaryWrite(Buffer);
}
I tried this and it did not work. I get a compile error on the line "MyFileStream = new FileStream(GetFile, FileMode.Open);".
I'm not sure where I am going wrong; is it due to the way I have stored it?
When you store binary files in SQL Server it adds an OLE header to the binary data, so you must strip that header before actually reading the byte[] into a file. Here's how you do this.
// First strip out the OLE header
const int OleHeaderLength = 78;
byte[] rawData = (byte[])datarow["Field"];
int strippedDataLength = rawData.Length - OleHeaderLength;
byte[] strippedData = new byte[strippedDataLength];
Array.Copy(rawData, OleHeaderLength, strippedData, 0, strippedDataLength);
Once you run this code, strippedData will contain the actual file data. You can then use MemoryStream or FileStream to perform I/O on the byte[].
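For example (a short sketch; the output path is a placeholder), writing the stripped bytes to disk or wrapping them in a stream looks like this:
// Write the stripped bytes straight to a file...
File.WriteAllBytes(@"C:\temp\document.pdf", strippedData);

// ...or wrap them in a MemoryStream for further in-memory processing.
using (var memoryStream = new MemoryStream(strippedData))
{
    // e.g. hand the stream to a PDF viewer control or copy it to an HTTP response.
}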