How to find word hyperlink with OpenXML - c#

I am trying to fetch the hyperlink addresses from every paragraph of a Word document using OpenXML.
public static string GetAddressFromPara(Paragraph Paras)
{
IEnumerable<Hyperlink> hplk = Paras.Descendants<Hyperlink>();
if (hplk != null)
{
foreach (Hyperlink hp in hplk)
{
//string address = ???????;
}
}
}

I believe it should be
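foreach (Hyperlink hp in hplk)
{
// Collect the visible text of the hyperlink.
hyperlinkText = new StringBuilder();
foreach (Text text in hp.Descendants<Text>())
hyperlinkText.Append(text.InnerText);
// Links to bookmarks inside the document carry no relationship id; skip them.
if (hp.Id == null)
continue;
hyperlinkRelationshipId = hp.Id.Value;
// Hyperlink targets are exposed through HyperlinkRelationships;
// ExternalRelationships does not include them.
HyperlinkRelationship hyperlinkRelationship = doc
.MainDocumentPart
.HyperlinkRelationships
.Single(c => c.Id == hyperlinkRelationshipId);
hyperlinkUri = new StringBuilder(hyperlinkRelationship.Uri.AbsoluteUri);
}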

Related

My C# WPF webscraper returns error when more than one result is found

I am working on a WPF XAML application that scrapes certain websites for products. The search part works and finds what I'm looking for, but as soon as there is more than one result I get a System.InvalidOperationException. I use an ObservableCollection to put the results into a <ListBox>.
Here is the search method:
private static ObservableCollection<EntryModel> _entries = new ObservableCollection<EntryModel>();
public static ObservableCollection<EntryModel> LoadCollectionData
{
get { return _entries; }
set { _entries = value; }
}
public static void PrehmSearchResults(string SearchQuery)
{
HtmlWeb web = new HtmlWeb();
try
{
string ZoekOpdracht = SearchQuery.Replace(" ", "+");
HtmlDocument doc = web.Load("https://www.prehmshop.de/advanced_search_result.php?keywords=" + ZoekOpdracht);
var title = doc.DocumentNode.CssSelect("div.header_cell > a").Single().InnerText;
var links = doc.DocumentNode.CssSelect("a.product_link");
var productLink = new List<string>();
var productTitle = new List<string>();
foreach (var item in links)
{
if (item.Attributes["href"].Value.Contains(".html"))
{
productLink.Add(item.Attributes["href"].Value);
productTitle.Add(title);
}
}
var TitleAndLink = productLink.Zip(productTitle, (l, t) => new { productLink = l, productTitle = t });
foreach (var nw in TitleAndLink)
{
var product = new List<EntryModel>();
var adDetails = new EntryModel
{
Title = nw.productTitle,
Link = nw.productLink
};
Debug.Print(adDetails.ToString());
var ZoekOpdrachtInTitle = adDetails.Title.ToLower().Contains(ZoekOpdracht.ToLower());
if (ZoekOpdrachtInTitle)
{
_entries.Add(adDetails);
}
}
}
I found the solution without changing too much code, thanks to the help from @PaulSinnema.
Single() throws an InvalidOperationException as soon as the selector matches more than one element. The link is part of the title, so I only had to change
var title = doc.DocumentNode.CssSelect("div.header_cell > a").ToList();
And I had to change the foreach loop:
foreach (var item in title)
{
if (item.Attributes["href"].Value.Contains(".html"))
{
productLink.Add(item.Attributes["href"].Value);
productTitle.Add(item.InnerText);
}
}
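The behaviour behind the exception can be reproduced without any HTML at all. The following is a minimal sketch, assuming plain strings in place of the scraped nodes; the class name `SingleDemo` and the sample titles are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleDemo
{
    // Returns true when Single() rejects the sequence with InvalidOperationException.
    public static bool SingleThrows(IEnumerable<string> items)
    {
        try
        {
            items.Single();
            return false;
        }
        catch (InvalidOperationException)
        {
            // Raised for an empty sequence and for more than one element.
            return true;
        }
    }

    public static void Main()
    {
        var oneTitle = new List<string> { "Product A" };
        var twoTitles = new List<string> { "Product A", "Product B" };

        Console.WriteLine(SingleThrows(oneTitle));   // False
        Console.WriteLine(SingleThrows(twoTitles));  // True
        Console.WriteLine(twoTitles.ToList().Count); // 2
    }
}
```

ToList() keeps every match, which is why iterating over the list works for any number of results.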

What is the most efficient way to substring specific portions of a text to a list of objects

I have the following vCard text, my purpose is to parse the text to a list of vCard objects
BEGIN:VCARD
VERSION:2.1
N:Kleit;Ali;;;
FN:Ali Kleit
TEL;CELL:70101010
END:VCARD
BEGIN:VCARD
VERSION:2.1
N:Kleit;Saeed;;;
FN:Saeed Kleit
TEL;CELL:03494949
END:VCARD
The following is my code to do that:
List<string> cards = new List<string>();
if (text != null)
{
while (text.Length != 0)
{
int idx_begin = text.IndexOf("BEGIN:VCARD");
if (idx_begin == -1)
break;
string endToken = "END:VCARD";
int idx_end = text.IndexOf(endToken);
if (idx_end == -1)
break;
// Substring takes a length, not an end index, so subtract the start offset.
string card = text.Substring(idx_begin, idx_end + endToken.Length - idx_begin);
text = text.Substring(idx_end + endToken.Length);
cards.Add(card);
}
}
Next, I use the Thought.vCards .NET library to parse each vCard text that was found:
List<Thought.vCards.vCard> vCards = new List<Thought.vCards.vCard>();
List<string> failedStrings = new List<string>();
foreach (string card in cards)
{
using (TextReader sr = new StringReader(card))
{
var vCard = new Thought.vCards.vCard(sr);
if (vCard == null)
{
failedStrings.Add(card);
continue;
}
vCards.Add(vCard);
}
}
Is there a more efficient way to accomplish this, given that the text might be incorrectly formatted?
Something like this?
var vcards = File.ReadAllText(Path.Combine(Path.GetDirectoryName(Util.CurrentQueryPath), "Contacts.vcf"));
var vcardRe = new Regex(@"BEGIN:VCARD\s+(.+?)\s+END:VCARD", RegexOptions.Compiled | RegexOptions.Singleline);
var res = vcardRe.Matches(vcards)
.Cast<Match>()
.Select(x => x.Groups[0].Captures.Cast<Capture>().Select(c => c.Value).Last())
;
List<Thought.vCards.vCard> vCards = new List<Thought.vCards.vCard>();
List<string> failedStrings = new List<string>();
foreach(string card in res)
{
using (TextReader sr = new StringReader(card))
{
var vCard = new Thought.vCards.vCard(sr);
if (vCard == null)
{
failedStrings.Add(card);
continue;
}
vCards.Add(vCard);
}
}
vCards.Dump();
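The splitting step on its own can be tried without LINQPad or the Thought.vCards library. This is only a sketch under stated assumptions: the class name `VCardSplitter` and the sample text are hypothetical, and the pattern is a simplified variant of the one above that keeps the BEGIN and END lines in each result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class VCardSplitter
{
    // Lazily matches from BEGIN:VCARD up to the nearest END:VCARD.
    // Singleline lets '.' cross line breaks.
    private static readonly Regex CardPattern = new Regex(
        @"BEGIN:VCARD.*?END:VCARD",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns every complete card; text outside a BEGIN/END pair is ignored.
    public static List<string> Split(string text)
    {
        if (string.IsNullOrEmpty(text))
            return new List<string>();

        return CardPattern.Matches(text)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // Two complete cards, leading junk, and one card without END:VCARD.
        string sample =
            "junk\nBEGIN:VCARD\nVERSION:2.1\nFN:Ali Kleit\nEND:VCARD\n" +
            "BEGIN:VCARD\nVERSION:2.1\nFN:Saeed Kleit\nEND:VCARD\n" +
            "BEGIN:VCARD\nFN:Broken";

        foreach (string card in Split(sample))
            Console.WriteLine(card.Replace("\n", " | "));
    }
}
```

One limitation of this sketch: a card that is missing its END:VCARD and is followed by a complete card gets merged with that card into a single match, so the parser would still have to reject it.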

Obtain value from mergefield C# Word VSTO

The title says all, but I'll explain more in detail. Here is the thing.
I'm currently developing extra code in a Word VSTO add-in which uses mergefields in a template. The mergefields in the template are filled using data from an external file. What I need is to read the value from a Merge Field but I have totally no clue how to accomplish this. I've been searching for a few days now but none of the articles I read worked for me...
So the question:
How can I get the value from a specific merge field in Word using VSTO?
Mail merge is quite simple in VSTO. Here are the two magic lines that will do it:
//Pass in the path of external file
document.MailMerge.OpenDataSource(Name: vm.FilePath.FullName);
document.MailMerge.Destination = WdMailMergeDestination.wdSendToNewDocument;
I found another full example here
This codeblock retrieves all fields in the document
public static List<string> GetFieldsUsedInDocument(Document document)
{
var fields = new List<string>();
foreach (MailMergeField fld in document.MailMerge.Fields)
{
if (fld.Code != null)
{
fields.Add(fld.Code.Text.ToUpper());
}
}
return fields;
}
To get the merge field names from the list of fields returned by GetFieldsUsedInDocument above:
public static List<string> GetMergeFields(List<string> allFields)
{
var merges = new List<string>();
foreach (var field in allFields)
{
var isNestedField = false;
foreach (var fieldChar in field)
{
int charCode = fieldChar;
if (charCode == 19 || charCode == 21)
{
isNestedField = true;
break;
}
}
if (!isNestedField)
{
var fieldCode = field;
if (fieldCode.Contains("MERGEFIELD"))
{
var fieldName = fieldCode.Replace("MERGEFIELD", string.Empty).Replace('"', ' ').Trim();
var charsToGet = fieldName.IndexOf(" ");
if (charsToGet < 0)
charsToGet = fieldName.IndexOf(@"\");
charsToGet = charsToGet > 0 ? charsToGet : fieldName.Length;
fieldName = fieldName.Substring(0, charsToGet);
if (!merges.Contains(fieldName))
{
merges.Add(fieldName);
}
}
}
}
return merges;
}
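The name extraction can also be tried in isolation on raw field code strings. The sketch below swaps the manual string slicing for a regular expression, which is a different technique from the answer's code; the class name `MergeFieldCode` and the sample field codes are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MergeFieldCode
{
    // Captures either a quoted name (group 1) or a bare name (group 2) after MERGEFIELD.
    private static readonly Regex NamePattern = new Regex(
        @"MERGEFIELD\s+(?:""([^""]+)""|([^\s\\]+))",
        RegexOptions.IgnoreCase);

    // Returns the field name, or null when the code is not a MERGEFIELD.
    public static string ExtractName(string fieldCode)
    {
        Match m = NamePattern.Match(fieldCode ?? string.Empty);
        if (!m.Success)
            return null;
        return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
    }

    public static void Main()
    {
        // Sample field codes as they might appear in MailMergeField.Code.Text.
        Console.WriteLine(ExtractName(" MERGEFIELD  FirstName  \\* MERGEFORMAT ")); // FirstName
        Console.WriteLine(ExtractName(" MERGEFIELD \"Last Name\" "));               // Last Name
        Console.WriteLine(ExtractName(" PAGE ") == null);                           // True
    }
}
```

Unlike the loop above, this variant keeps spaces inside a quoted name and does not check for nested fields.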

Windows Phone 7 Gif to Png

I am building an RSS reader app and am having problems with GIFs.
Does anyone know how to build an ImageConverter class that converts GIF to PNG?
Converting the GIFs in code would also work for me.
My app takes everything from the feed, puts it in a list first, and then populates the ListBox (in the way the user chooses). GIFs leave blank images :(
So basically the converter would have to work with a link, not a direct data stream.
Here is some code showing how my data is updated:
ObservableCollection<FeedItem> slika0; //where the feeditems from the selected category go
public ObservableCollection<FeedItem> Slika0
{
get { return slika0; }
set
{
slika0 = value;
OnPropertyChanged("Slika0");
}
}
int broj_elemenata = 0;
bool nema_elemenata = false;
private void UpdateFeedList(string feedXML)
{
StringReader stringReader = new StringReader(feedXML);
XmlReader xmlReader = XmlReader.Create(stringReader);
SyndicationFeed feed = SyndicationFeed.Load(xmlReader);
List<string> kategorije = new List<string>();
foreach (SyndicationItem sitem in feed.Items)
{
foreach (SyndicationCategory sc in feed.Categories)
{
if (!kategorije.Contains(sc.Name))
{
kategorije.Add(sc.Name);
}
}
}
FeedItem FeedItem = new FeedItem();
Slika0 = new ObservableCollection<FeedItem>();
//SyndicationCategory tražena_kat = new SyndicationCategory("Trendy");
foreach (SyndicationItem item in feed.Items)
{
foreach (SyndicationCategory sc in item.Categories)
{
FeedItem.Content_List.Add(sc.Name);
foreach (SyndicationElementExtension ext in item.ElementExtensions)
{
XElement ele = ext.GetObject<XElement>();
if (ele.Name.LocalName == "encoded" && ele.Name.Namespace.ToString().Contains("content"))
{
FeedItem.Content_List.Add(item.Title.Text);
FeedItem.Content_List.Add(ele.Value);//takes the content of the feed
}
}
foreach (SyndicationLink link in item.Links)
{
FeedItem.Content_List.Add(link.Uri.AbsoluteUri); //takes the links for browsing and the image
}
}
}
IsolatedStorageSettings.ApplicationSettings["ContentList"] = FeedItem.Content_List;
Deployment.Current.Dispatcher.BeginInvoke(() =>
{
int i;
int x = 0;
Slika0.Clear();
foreach (var item in FeedItem.Content_List)
{
i = FeedItem.Content_List.IndexOf(item, x); //x = index of category in the list
if (item == "Trendy")
{
FeedItem FF = new FeedItem();
FF.Post_text = FeedItem.Content_List[i + 1];//title of article
FF.Content_link = FeedItem.Content_List[i + 2];//content
FF.Post_link = FeedItem.Content_List[i + 3];//the location of link for browsing
FF.Post_slika = FeedItem.Content_List[i + 4]; //location of image link
if (FF.Post_slika == "") //if there is no link for picture code is executed
{
FF.Post_slika = "Slike/zimo_veliki_tile.png";
}
Slika0.Add(FF);
this.lsSlika_P.ItemsSource = Slika0; //take
x = i + 5;
broj_elemenata++;
}
}
if (lsSlika_P.Items.Count <= 3)
{
scroll_panorama.VerticalScrollBarVisibility = ScrollBarVisibility.Disabled;
}
if (lsSlika_P.Items.Count == 0)
{
nema_elemenata = true;
}
else nema_elemenata = false;
});
pan_progres.Visibility = Visibility.Collapsed;//progress bar
}

Display values in Excel Spreadsheet rather than to console, Linq to xml, c#

I have this working code that parses values from XML files. Instead of writing the data to the console, how can I write data to an Excel spreadsheet? Any help please.
namespace TestCFG
{
class Program
{
public class XAxisCalib
{
public int Max1 { get; set; }
public int Min2 { get; set; }
public int Max3 { get; set; }
public int Min4 { get; set; }
public int Max5 { get; set; }
public int Min6 { get; set; }
}
static void Main(string[] args)
{
string[] fileEntries = Directory.GetFiles(@"c:\Sciclone UAC", "*.cfg*");
foreach (string fileName in fileEntries)
{
XDocument doc = XDocument.Load(fileName);
var query = from x in doc.Descendants("XAxisCalib")
select new
{
//Max1 = x.Attribute("Max").Value,
//Min2 = x.Attribute("Min").Value
MaxChild = x.Descendants("Max"),
MinChild = x.Descendants("Min")
};
foreach (var x in query)
{
foreach (var nextLevel in x.MaxChild)
{
Console.WriteLine("XMax: " + nextLevel.Value);
}
foreach (var nextLevel in x.MinChild)
{
Console.WriteLine("XMin: " + nextLevel.Value);
}
//Console.WriteLine("XAxisCalib");
}
var query2 = from y in doc.Descendants("YAxisCalib")
select new
{
//Max3 = x.Attribute("Max").Value,
//Min4 = x.Attribute("Min").Value
MaxChild = y.Descendants("Max"),
MinChild = y.Descendants("Min")
};
foreach (var y in query2)
{
foreach (var nextLevel in y.MaxChild)
{
Console.WriteLine("YMax: " + nextLevel.Value);
}
foreach (var nextLevel in y.MinChild)
{
Console.WriteLine("YMin: " + nextLevel.Value);
}
//Console.WriteLine("YAxisCalib");
var query3 = from z in doc.Descendants("ZAxisCalib")
select new
{
//Max5 = x.Attribute("Max").Value,
//Min6 = x.Attribute("Min").Value
MaxChild = z.Descendants("Max"),
MinChild = z.Descendants("Min")
};
foreach (var z in query3)
{
foreach (var nextLevel in z.MaxChild)
{
Console.WriteLine("ZMax: " + nextLevel.Value);
}
foreach (var nextLevel in z.MinChild)
{
Console.WriteLine("ZMin: " + nextLevel.Value);
}
//Console.WriteLine("ZAxisCalib");
}
}
}
}
}
}
Use the Office API for that.
Something like:
using System;
using System.Reflection;
using Microsoft.Office.Interop.Excel;
public class CreateExcelWorksheet
{
static void Main()
{
Microsoft.Office.Interop.Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
if (xlApp == null)
{
Console.WriteLine("EXCEL could not be started. Check that your office installation and project references are correct.");
return;
}
xlApp.Visible = true;
Workbook wb = xlApp.Workbooks.Add(XlWBATemplate.xlWBATWorksheet);
Worksheet ws = (Worksheet)wb.Worksheets[1];
if (ws == null)
{
Console.WriteLine("Worksheet could not be created. Check that your office installation and project references are correct.");
}
// Select the Excel cells, in the range c1 to c7 in the worksheet.
Range aRange = ws.get_Range("C1", "C7");
if (aRange == null)
{
Console.WriteLine("Could not get a range. Check to be sure you have the correct versions of the office DLLs.");
}
// Fill the cells in the C1 to C7 range of the worksheet with the number 6.
Object[] args = new Object[1];
args[0] = 6;
aRange.GetType().InvokeMember("Value", BindingFlags.SetProperty, null, aRange, args);
// Change the cells in the C1 to C7 range of the worksheet to the number 8.
aRange.Value2 = 8;
}
}
From
There are several ways to create a spreadsheet from XML.
- Use the Office API. This is a good, if heavy, approach. The API is very complex and overkill for simple operations, but necessary if you need formulas.
- Write out Excel XML, which is even more complex.
- Write out a CSV file, which is good for simple, non-formatted output. Watch out for values with commas, etc.
- Write out an HTML table, Excel will open everything in cells.
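For the CSV option, the main pitfall is values that contain commas or quotes. Below is a minimal sketch of the quoting step, assuming RFC 4180 style escaping; the class name `CsvSketch`, the column names, and the sample values are made up and do not come from the question's XML files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvSketch
{
    // Wraps a value in quotes when it contains a comma, quote, or line break,
    // and doubles any embedded quotes.
    public static string Escape(string value)
    {
        if (value == null)
            return string.Empty;

        bool needsQuotes = value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes
            ? "\"" + value.Replace("\"", "\"\"") + "\""
            : value;
    }

    // Joins already-stringified cell values into one CSV line.
    public static string ToLine(IEnumerable<string> values)
    {
        return string.Join(",", values.Select(Escape));
    }

    public static void Main()
    {
        Console.WriteLine(ToLine(new[] { "File", "Axis", "Max", "Min" }));
        Console.WriteLine(ToLine(new[] { "unit 1, rev \"B\".cfg", "X", "1200", "-15" }));
    }
}
```

In the question's program, each Console.WriteLine call could instead add a line built with ToLine to a list, and the list could be written once at the end with File.WriteAllLines; Excel opens the resulting .csv file directly.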
Using Json.NET, I believe you can do something like this:
string json = JsonConvert.SerializeObject(table, Formatting.Indented);
Here's a link to Json.NET:
http://json.codeplex.com/
