using (FileStream readStream = File.OpenRead(filePath))
{
bool test = readStream.CanRead;
document = new XWPFDocument(readStream);
}
I want to read many .docx files with NPOI, but for some of them it reports "SettingsDocument parse failed".
The call new XWPFDocument(readStream) throws a System.ObjectDisposedException, and the message says that a closed file cannot be accessed.
I want to check whether a given text is present in an uploaded PDF file in ASP.NET (C#).
using (MemoryStream str = new MemoryStream(this.docUploadField.FileBytes))
{
using (StreamReader sr = new StreamReader(str, Encoding.UTF8))
{
string line = sr.ReadToEnd();
}
}
I am getting the below as the file content when I read the contents of the file.
Please help me with this.
You will need a PDF library for this.
The best known are:
iText (iTextSharp, for those who remember it): https://github.com/itext/itext7-dotnet
PDFsharp: https://github.com/empira/PDFsharp
There are many other free options as well.
With one of these you open the PDF file, read it, and extract the text you need.
They usually give you a collection of the PDF's elements (paragraphs, images, and so on), which you can loop through or search for what you need.
So I am able to open and dig through an xls (Excel 97-2003) file. However, the problem is when I try to save it. After I save it, and it succeeds, I then go and manually open the Excel file and get an error saying that it cannot open the Excel file and that it is corrupt. This happens regardless of whether I make any changes or not.
I am still able to, within the program, open the excel file and read through the data.
I have tried NPOI 2.2.1 and 2.3.0 (both installed via NuGet). Both versions give the same result.
string excelLocation = settings.GetExcelDirectory() + week.ExcelLocation;
HSSFWorkbook wbXLS;
// Try to open and read existing workbook
using (FileStream stream = new FileStream(excelLocation, FileMode.Open, FileAccess.Read))
{
wbXLS = new HSSFWorkbook(stream);
}
ISheet sheet = wbXLS.GetSheet("Schedule");
using (FileStream stream = new FileStream(excelLocation, FileMode.Create, FileAccess.Write))
{
wbXLS.Write(stream);
}
Do you have multi-line cell contents (with a line break)?
I just came across such a problem with my column headers in row 0.
Excel encodes a line break by replacing it with _x000a_ (i.e. line1_x000a_line2); NPOI doesn't do that.
I tested this with NPOI 2.3.0 from NuGet and an .xlsx file, but it might also help in your case.
I worked around the problem (and verified the cause) by replacing line feeds before saving:
// Encode LineFeeds in column headers (row 0)
IRow rowColHeaders = sheet.GetRow(0);
foreach (ICell cell in rowColHeaders.Cells)
{
string content = cell.StringCellValue;
if (content.Contains("\n"))
cell.SetCellValue(content.Replace("\n", "_x000a_"));
}
Try something like this:
FileStream sw = File.Create(excelLocation);
wbXLS.Write(sw);
sw.Close();
I have been using NPOI to read Excel files, and I now need to write out files. I am trying to use the WorkbookFactory, which doesn't show up in a lot of examples online (doesn't appear in the NPOI examples on CodePlex either). Here is the code:
this.FileStream = new FileStream(
this.FilePath,
FileMode.OpenOrCreate,
FileAccess.ReadWrite);
this.Workbook = WorkbookFactory.Create(
this.FileStream);
When it gets to the second statement, I get an ArgumentOutOfRangeException with the following message: "Non-negative number required.\r\nParameter name: value".
Next few lines in the call stack:
at System.IO.FileStream.set_Position(Int64 value)
at NPOI.Util.PushbackStream.set_Position(Int64 value)
at NPOI.POIXMLDocument.HasOOXMLHeader(Stream inp)
at NPOI.SS.UserModel.WorkbookFactory.Create(Stream inputStream)
The WorkbookFactory (link to POI documentation) reads existing file data from an input stream, and determines on the fly whether to create an HSSFWorkbook or an XSSFWorkbook (i.e. whether you are working with XLS or XLSX-like files).
From your code, it seems you are trying to create a new file using this class. That is not something the WorkbookFactory can help you with. To write files, use the following pattern:
var workbook = new XSSFWorkbook();
...
using (var fileData = new FileStream(@"path\filename.xlsx", FileMode.Create))
{
workbook.Write(fileData);
}
(In other words, WorkbookFactory is a class factory, not a file factory :-))
When I try to read a .doc file with the DocumentFormat.OpenXml DLL, it fails with the error "File contains corrupted data."
The same DLL reads .docx files without problems.
Can DocumentFormat.OpenXml be used to read .doc files?
string path = @"D:\Data\Test.doc";
string searchKeyWord = @"java";
private bool SearchWordIsMatched(string path, string searchKeyWord)
{
try
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(path, true))
{
var text = wordDoc.MainDocumentPart.Document.InnerText;
if (text.Contains(searchKeyWord))
return true;
else
return false;
}
}
catch (Exception)
{
// Rethrow without resetting the stack trace.
throw;
}
}
The old .doc files have a completely different format from the new .docx files. So, no, you can't use the OpenXml library to read .doc files.
To do that, you would either need to manually convert the files first, or you would need to use Office interop, instead of the Open XML SDK you're using now.
I'm afraid there won't be any better answer than the ones already given. The Microsoft Word DOC format is binary whereas OpenXML formats such as DOCX are zipped XML files. The OpenXml framework is for working with the latter only.
As suggested, the only other option is to use Word interop or a third-party library to convert DOC to DOCX, which you can then process with the OpenXml library.
A .doc file (if created with an older version of Microsoft Word) does not have the same structure as a .docx file (which is basically a zip file containing some XML documents).
To check, rename the .doc extension to .zip and try to open it. If it cannot be unzipped, you'll have to convert the .doc to a .docx manually.
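Instead of renaming the file, the same probe can be done in code by looking at the first bytes. The following is only a sketch: the class name WordFormatProbe and the returned labels are made up for illustration. It relies on two well-known signatures, the ZIP local file header (PK\x03\x04) and the OLE2 compound file header used by binary Office formats. A ZIP signature only shows that the file is a zip container, not that it is a valid .docx.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the Open XML SDK or NPOI).
public static class WordFormatProbe
{
    // ZIP local file header signature: "PK\x03\x04" (.docx is a zip container).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // OLE2 compound file signature (binary .doc and other legacy Office files).
    private static readonly byte[] Ole2Magic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns "zip", "ole2" or "unknown" based on the first bytes of the stream.
    public static string Probe(Stream stream)
    {
        byte[] header = new byte[8];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until full or EOF.
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) break;
            read += n;
        }
        if (StartsWith(header, read, ZipMagic)) return "zip";
        if (StartsWith(header, read, Ole2Magic)) return "ole2";
        return "unknown";
    }

    private static bool StartsWith(byte[] buffer, int length, byte[] magic)
    {
        if (length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (buffer[i] != magic[i]) return false;
        }
        return true;
    }
}
```

A caller could run WordFormatProbe.Probe(File.OpenRead(path)) before handing the file to WordprocessingDocument.Open, and route "ole2" files to a conversion step instead.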
You can use IFilterTextReader.
TextReader reader = new FilterReader(path);
using (reader)
{
txt = reader.ReadToEnd();
}
You can take a look at http://www.codeproject.com/Articles/13391/Using-IFilter-in-C
I use a stream to read an RTF file, but reading fails while the file is open in Microsoft Word.
Does anyone know how to solve this problem?
The proper way to read a RTF file for a rich text box (has to be of type System.Windows.Forms.RichTextBox) is like this:
myRichTextBox.LoadFile(myFilename);
But because Word holds a lock on the file, you have to open it with shared access (credit to @slaks):
myRichTextBox.LoadFile(new FileStream(myFilename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite), RichTextBoxStreamType.RichText);
And to save it, simply call this function:
myRichTextBox.SaveFile(myFilename);
Like this:
new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
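The share mode shown above can be wrapped in a small helper. This is a minimal sketch, assuming the file is text that decodes as UTF-8 (RTF is normally plain ASCII); the class and method names are made up for illustration. It only succeeds if the other process opened the file in a way that permits shared reading, so a file held with an exclusive lock will still fail.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for reading a file that another handle has open for writing.
public static class SharedRead
{
    // FileShare.ReadWrite tells the OS that this reader tolerates other
    // handles that read from or write to the same file.
    public static string ReadAllTextShared(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The returned string could then be assigned to the Rtf property of a RichTextBox, or the same FileStream arguments could be passed to LoadFile directly.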