Dynamically assign ContentId and send email with embedded images in C#

How can I assign the img src values to content IDs dynamically in C#?
I have read "How to embed multiple images in email body using .NET",
but I did not get a clear idea from it.
When the form is submitted I can get the HTML source. How do I assign the cid values dynamically?
<html><body><img src="~/Upload/1.jpg"><br><img src="~/Upload/1.jpg" /><br><img
src="~/Upload/3.jpg"/><br><br>
thanks!!!</body></html>
I want to send an email with multiple images, so the HTML needs to be converted to:
<html><body><img src="cid:c1" /><br><img src="cid:c2" /><br><img
src="cid:c3" /><br><br>
thanks!!!</body></html>
This is what I have tried so far:
if (htmlString == null) return null;
var doc = new HtmlDocument();
doc.LoadHtml(htmlString);
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//img");
if (nodes == null) return null;
foreach (HtmlNode node in nodes)
{
    if (node.Attributes.Contains("src"))
    {
        string data = node.Attributes["src"].Value;
        // Extract the base64 payload from a data URI such as "data:image/png;base64,..."
        string base64Data = Regex.Match(data, @"data:image/(?<type>.+?),(?<data>.+)").Groups["data"].Value;
        if (base64Data != "")
        {
            string cid = Guid.NewGuid().ToString();
            byte[] binData = Convert.FromBase64String(base64Data);
            var stream = new MemoryStream(binData);
            string contenttype = "image/" +
                Regex.Match(data, @"data:image/(?<type>.+?);(?<data>.+)").Groups["type"]
                    .Value;
            var inline = new Attachment(stream, new ContentType(contenttype));
            inline.ContentDisposition.Inline = true;
            inline.ContentDisposition.DispositionType = DispositionTypeNames.Inline;
            inline.ContentId = cid;
            inline.ContentType.MediaType = contenttype;
            mailMessage.Attachments.Add(inline);
            node.Attributes["src"].Value = "cid:" + cid;
        }
    }
}
Is there a good way to assign the content IDs and add the linked resources and the alternate view dynamically?
The code below works for the static case:
string path = System.Web.HttpContext.Current.Server.MapPath("~/images/Logo.jpg"); // my logo is placed in images folder
//var path=""
var logo = new LinkedResource(path);
logo.ContentId = "companylogo";
logo.ContentType = new ContentType("image/jpeg");
//now do the HTML formatting
AlternateView av1 = AlternateView.CreateAlternateViewFromString(
"<html><body><img src=cid:companylogo/>" +
"<br></body></html>" + strMailContent,
null, MediaTypeNames.Text.Html);
//now add the AlternateView
av1.LinkedResources.Add(logo);
//now append it to the body of the mail
msg.AlternateViews.Add(av1);
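The data-URI branch of the first snippet runs two nearly identical regular expressions over the same src value. As a minimal sketch, assuming the src has the usual `data:image/<subtype>;base64,<payload>` shape, both parts can be captured in one pass. The `DataUri` class and `TryParse` method are hypothetical names introduced here for illustration; they are not part of System.Net.Mail or HtmlAgilityPack.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original post): splits a base64 image
// data URI into its media type and decoded bytes.
public static class DataUri
{
    // Matches e.g. "data:image/png;base64,iVBORw0..." and captures both parts at once.
    private static readonly Regex Pattern = new Regex(
        @"^data:(?<mime>image/[^;,]+);base64,(?<data>.+)$",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns true and fills mediaType/bytes when src is a base64 image data URI;
    // returns false for ordinary paths or malformed payloads.
    public static bool TryParse(string src, out string mediaType, out byte[] bytes)
    {
        mediaType = null;
        bytes = null;
        if (string.IsNullOrEmpty(src)) return false;

        Match m = Pattern.Match(src.Trim());
        if (!m.Success) return false;

        try
        {
            bytes = Convert.FromBase64String(m.Groups["data"].Value);
        }
        catch (FormatException)
        {
            return false; // payload was not valid base64
        }

        mediaType = m.Groups["mime"].Value.ToLowerInvariant();
        return true;
    }
}
```

Inside the loop, a call such as `DataUri.TryParse(data, out contenttype, out binData)` could then replace the two `Regex.Match` calls, with the false branch left for src values that are file paths rather than data URIs.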

Here is the complete solution for you.
string inputHtmlContent = "<Your Html Content containing images goes here>";
string outputHtmlContent = string.Empty;
var myResources = new List<LinkedResource>();
if (!string.IsNullOrEmpty(inputHtmlContent))
{
    var doc = new HtmlDocument();
    doc.LoadHtml(inputHtmlContent);
    HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//img");
    if (nodes != null)
    {
        foreach (HtmlNode node in nodes)
        {
            if (node.Attributes.Contains("src"))
            {
                // Map the virtual path (e.g. "~/Upload/1.jpg") to a physical file
                string data = node.Attributes["src"].Value;
                string imgPath = System.Web.HttpContext.Current.Server.MapPath(data);
                var imgLogo = new LinkedResource(imgPath);
                imgLogo.ContentId = Guid.NewGuid().ToString();
                imgLogo.ContentType = new ContentType("image/jpeg");
                myResources.Add(imgLogo);
                // Point the <img> at the linked resource
                node.Attributes["src"].Value = string.Format("cid:{0}", imgLogo.ContentId);
            }
        }
        // Serialize once, after every <img> has been rewritten
        outputHtmlContent = doc.DocumentNode.OuterHtml;
    }
    else
    {
        outputHtmlContent = inputHtmlContent;
    }
    AlternateView av2 = AlternateView.CreateAlternateViewFromString(outputHtmlContent,
        null, MediaTypeNames.Text.Html);
    foreach (LinkedResource linkedResource in myResources)
    {
        av2.LinkedResources.Add(linkedResource);
    }
    var msg = new MailMessage();
    msg.AlternateViews.Add(av2);
    msg.IsBodyHtml = true;
    // Set the other required fields (From, To, Subject, ...) and send the mail
    ...
}
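The answer's code labels every linked resource as image/jpeg, which would be inaccurate for a .png or .gif upload. A minimal sketch of one way to pick the media type from the file extension follows; the `ImageMediaType` class and its extension table are assumptions made for illustration and cover only a few common formats.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the original answer): maps a file path to an
// image media type based on its extension.
public static class ImageMediaType
{
    public static string FromPath(string path)
    {
        // Path.GetExtension returns null for a null path, hence the fallback
        string ext = (Path.GetExtension(path) ?? string.Empty).ToLowerInvariant();
        switch (ext)
        {
            case ".jpg":
            case ".jpeg": return "image/jpeg";
            case ".png": return "image/png";
            case ".gif": return "image/gif";
            case ".bmp": return "image/bmp";
            default: return "application/octet-stream"; // unknown type
        }
    }
}
```

With this in place, the hard-coded line could become `imgLogo.ContentType = new ContentType(ImageMediaType.FromPath(imgPath));`.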

Related

Syntax error in HAP SetAttributeValue or mistake in code

imgs = doc.DocumentNode.SelectNodes("//img");
foreach (HtmlNode img in imgs)
{
string imageIdString = image.Id.ToString();
img.SetAttributeValue("src", "/ImageBrowser/ImageById/" + imageIdString);
}
I get a proper value for the ID, but the img src stays unchanged and I can't figure out why.
I tried to handle it as described here:
Need to replace an img src attrib with new value
Edit1: The requested code:
string input = sectionEditModel.Content;
string htmlstring = sectionEditModel.Content;
string htmlstringdecoded = HttpUtility.HtmlDecode(htmlstring);
HtmlDocument doc = new HtmlDocument();
List<string> urls = new List<string>();
DbImgBrowser.Models.Image image = null;
doc.LoadHtml(htmlstringdecoded);
var files = new FilesRepository();
HtmlNodeCollection imgs = new HtmlNodeCollection(doc.DocumentNode);
imgs = doc.DocumentNode.SelectNodes("//img");
if (imgs != null && imgs.Count > 0)
{
foreach (HtmlNode img in imgs)
{
HtmlAttribute srcs = img.Attributes[@"src"];
urls.Add(srcs.Value);
{
foreach (string Value in urls){
string AttrVal = img.GetAttributeValue("src", null);
if(AttrVal.Contains("base64"))
{
byte[] data = Convert.FromBase64String(Value.Substring(Value.IndexOf(",") + 1));
var pFolder = files.GetFolderByPath(string.Empty);
if (pFolder != null)
{
image = new DbImgBrowser.Models.Image()
{
Name = Guid.NewGuid().ToString(),
Folder = pFolder,
Image1 = data
};
files.Db.Images.Add(image);
files.Db.SaveChanges();
string imageIdString = image.Id.ToString();
img.SetAttributeValue("src", "/ImageBrowser/ImageById/" + imageIdString);
files.Db.SaveChanges();
}
}
Edit2: Example paths. Before: a base64 example image.
Path by URL example: /ImageBrowser/Image?path=Test2.PNG
Wanted result: src="ImageBrowser/ImageById/ID", where ID is 1-1000
Edit3: The src attributes are still not changed.
The answer is very simple.
I was modifying a local copy of the document, but I had to write the modified HTML back to the section content and save the section:
SectionsRepository.SaveSection(Section sec)
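The point of this answer is that HtmlAgilityPack edits only its in-memory document, so the string that was loaded never changes on its own. A minimal sketch of the round trip, assuming the HtmlAgilityPack package is referenced; `ImgSrcRewriter` and `Rewrite` are hypothetical names, and saving the returned string is left to the caller.

```csharp
using HtmlAgilityPack;

// Hypothetical helper: rewrites every <img src> and returns the modified HTML.
public static class ImgSrcRewriter
{
    public static string Rewrite(string html, string newSrc)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches
        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img");
        if (imgs == null) return html;

        foreach (HtmlNode img in imgs)
        {
            img.SetAttributeValue("src", newSrc);
        }

        // The input string is untouched; the changes exist only in doc,
        // so the serialized document has to be returned and stored.
        return doc.DocumentNode.OuterHtml;
    }
}
```

In the question's code, that would mean assigning the returned string back to `sectionEditModel.Content` before the section is saved.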

C# - How to get full text content from rss feed?

I want the full content from an RSS feed, not just the description.
This is what I have:
string RssFeedUrl = "http://g1.globo.com/dynamo/rss2.xml";
List<feed> feeds = new List<feed>();
try
{
XDocument xDoc = new XDocument();
xDoc = XDocument.Load(RssFeedUrl);
var items = (from x in xDoc.Descendants("item")
select new
{
title = x.Element("title").Value,
link = x.Element("link").Value,
pubDate = x.Element("pubDate").Value,
description = x.Element("description").Value
});
if (items != null)
{
foreach (var i in items)
{
feed f = new feed
{
Titulo = i.title,
Link = i.link,
DataPublicada = i.pubDate,
Descricao = i.description
};
feeds.Add(f);
}
}
gvRss.DataSource = feeds;
gvRss.DataBind();
}
catch (Exception ex)
{
throw;
}
It only retrieves a short excerpt, but I want the full content text.
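One thing that may be worth checking: some feeds publish the full article in a `content:encoded` element from the RSS content module (namespace `http://purl.org/rss/1.0/modules/content/`). Whether the feed above does so is not known here; if it does not, the full text is simply not in the feed. The sketch below, with the hypothetical `RssFullText` class, reads that element when present and falls back to the description.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: prefers content:encoded over description for each item.
public static class RssFullText
{
    // Namespace of the RSS content module, which defines content:encoded
    private static readonly XNamespace Content = "http://purl.org/rss/1.0/modules/content/";

    public static List<string> ReadBodies(XDocument feed)
    {
        return feed.Descendants("item")
            .Select(item =>
                // Casting a missing (null) XElement to string yields null,
                // so the ?? chain falls through to the next candidate.
                (string)item.Element(Content + "encoded")
                ?? (string)item.Element("description")
                ?? string.Empty)
            .ToList();
    }
}
```

It could be called as `RssFullText.ReadBodies(XDocument.Load(RssFeedUrl))`.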

Web-scrape project writing too much information

I'm trying to modify the code below to scrape jobs from www.itoworld.com/careers. The jobs are in a table, and the code returns all the <td> values.
I believe it comes from the line:
var parentnode = node.ParentNode.ParentNode.ParentNode.FirstChild.NextSibling
However, I want it to write:
<a class="std-btn" href="http://www.itoworld.com/office-manager/">Office Manager</a>
Currently it is writing
<a href='http://www.itoworld.com/office-manager/' target='_blank'>Office ManagerOffice & AdminCambridgeFind out more</a>
I plan on 'brute force' modifying the output to remove unnecessary extras but was hoping there is a smarter way to do this. Is there a way for example to remove the second and third ParentNode after they have been called? (So they do not get written?)
public string ExtractIto()
{
string sUrl = "http://www.itoworld.com/careers/";
GlobusHttpHelper ghh = new GlobusHttpHelper();
List<Links> link = new List<Links>();
bool Next = true;
int count = 1;
string html = ghh.getHtmlfromUrl(new Uri(string.Format(sUrl)));
HtmlAgilityPack.HtmlDocument hd = new HtmlAgilityPack.HtmlDocument();
hd.LoadHtml(html);
var hn = hd.DocumentNode.SelectSingleNode("//*[@class='btn-wrapper']");
var hnc = hn.SelectNodes(".//a");
foreach (var node in hnc)
{
try
{
var parentnode = node.ParentNode.ParentNode.ParentNode.FirstChild.NextSibling;
Links l = new Links();
l.Name = ParseHtmlContainingText(parentnode.InnerText);
l.Link = node.GetAttributeValue("href", "");
link.Add(l);
}
}
string Xml = getXml(link);
return WriteXml(Xml);
For completeness below is the definition of ParseHtmlContainingText
public string ParseHtmlContainingText(string htmlString)
{
return Regex.Replace(Regex.Replace(WebUtility.HtmlDecode(htmlString), @"<[^>]+>| ", ""), @"\s{2,}", " ").Trim();
}
You just need to create a "name node" and use that for your parse method.
I tested with this code and it worked for me.
var parentnode = node.ParentNode.ParentNode.ParentNode.FirstChild.NextSibling;
var nameNode = parentnode.FirstChild;
Links l = new Links();
l.Name = ParseHtmlContainingText(nameNode.InnerText);
l.Link = node.GetAttributeValue("href", "");

Reading Specific text from a website

I am trying to build a database, but I need to get the info from a website: mainly the Title, Date, Length and Genre from the IMDb website. I have tried about 50 different things and none of them work.
Here is my code.
public string GetName(string URL)
{
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(URL);
var Attr = doc.DocumentNode.SelectNodes("//*[@id=\"overview - top\"]/h1/span[1]@itemprop")[0];
return Name;
}
When I run this it just gives me an XPathException. I just want it to return the title of a movie. For now I am using this movie as an example for testing, but I want it to work with all movies: http://www.imdb.com/title/tt0405422
I am using the HtmlAgilityPack.
The last bit of your XPath is not valid. Also, to get only a single element from the HtmlDocument you can use SelectSingleNode() instead of SelectNodes():
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load("http://www.imdb.com/title/tt0405422/");
var xpath = "//*[@id='overview-top']/h1/span[@class='itemprop']";
var span = doc.DocumentNode.SelectSingleNode(xpath);
var title = span.InnerText;
Console.WriteLine(title);
output :
The 40-Year-Old Virgin
demo link : *
https://dotnetfiddle.net/P7U5A7
*) the demo shows that the correct title is printed, along with an error specific to .NET Fiddle (you can safely ignore the error).
I am making something similar, and this is my code that gets info from the imdb.com website:
string html = getUrlData(imdbUrl + "combined");
Id = match(@"<link rel=""canonical"" href=""http://www.imdb.com/title/(tt\d{7})/combined"" />", html);
if (!string.IsNullOrEmpty(Id))
{
status = true;
Title = match(@"<title>(IMDb \- )*(.*?) \(.*?</title>", html, 2);
OriginalTitle = match(@"title-extra"">(.*?)<", html);
Year = match(@"<title>.*?\(.*?(\d{4}).*?\).*?</title>", html);
Rating = match(@"<b>(\d.\d)/10</b>", html);
Genres = matchAll(@"<a.*?>(.*?)</a>", match(@"Genre.?:(.*?)(</div>|See more)", html));
Directors = matchAll(@"<td valign=""top""><a.*?href=""/name/.*?/"">(.*?)</a>", match(@"Directed by</a></h5>(.*?)</table>", html));
Cast = matchAll(@"<td class=""nm""><a.*?href=""/name/.*?/"".*?>(.*?)</a>", match(@"<h3>Cast</h3>(.*?)</table>", html));
Plot = match(@"Plot:</h5>.*?<div class=""info-content"">(.*?)(<a|</div)", html);
Runtime = match(@"Runtime:</h5><div class=""info-content"">(\d{1,4}) min[\s]*.*?</div>", html);
Languages = matchAll(@"<a.*?>(.*?)</a>", match(@"Language.?:(.*?)(</div>|>.?and )", html));
Countries = matchAll(@"<a.*?>(.*?)</a>", match(@"Country:(.*?)(</div>|>.?and )", html));
Poster = match(@"<div class=""photo"">.*?<a name=""poster"".*?><img.*?src=""(.*?)"".*?</div>", html);
if (!string.IsNullOrEmpty(Poster) && Poster.IndexOf("media-imdb.com") > 0)
{
Poster = Regex.Replace(Poster, @"_V1.*?.jpg", "_V1._SY200.jpg");
PosterLarge = Regex.Replace(Poster, @"_V1.*?.jpg", "_V1._SY500.jpg");
PosterFull = Regex.Replace(Poster, @"_V1.*?.jpg", "_V1._SY0.jpg");
}
else
{
Poster = string.Empty;
PosterLarge = string.Empty;
PosterFull = string.Empty;
}
ImdbURL = "http://www.imdb.com/title/" + Id + "/";
if (GetExtraInfo)
{
string plotHtml = getUrlData(imdbUrl + "plotsummary");
}
//Match single instance
private string match(string regex, string html, int i = 1)
{
return new Regex(regex, RegexOptions.Multiline).Match(html).Groups[i].Value.Trim();
}
//Match all instances and return as ArrayList
private ArrayList matchAll(string regex, string html, int i = 1)
{
ArrayList list = new ArrayList();
foreach (Match m in new Regex(regex, RegexOptions.Multiline).Matches(html))
list.Add(m.Groups[i].Value.Trim());
return list;
}
Maybe you will find something useful

How to click buttons/links which are in iframe? (GeckoFx)

This code accesses the iframe and gets me its source code.
string content = null;
var iframe = browser.Document.GetElementsByTagName("iframe").FirstOrDefault() as Gecko.DOM.GeckoIFrameElement;
if (iframe != null)
{
var html = iframe.ContentDocument.DocumentElement as GeckoHtmlElement;
if (html != null)
content = html.OuterHtml;
textBox1.Text = content;
}
I tried adding some code to fill in an input inside the iframe:
string content = null;
var iframe = browser.Document.GetElementsByTagName("iframe").FirstOrDefault() as Gecko.DOM.GeckoIFrameElement;
if (iframe != null)
{
var html = iframe.ContentDocument.DocumentElement as GeckoHtmlElement;
if (html != null)
content = html.OuterHtml;
textBox1.Text = content;
GeckoElementCollection elements = browser.Document.GetElementsByName("username");
foreach (var element in elements)
{
GeckoInputElement input = (GeckoInputElement)element;
input.Value = "Auto filled!";
}
}
But it won't work, because the code doesn't find the elements. Any ideas?
I tried searching Google for iframe examples, but there doesn't seem to be any good documentation for it.
Why are you searching in the main document? You should search in the frame's document:
string content = null;
var iframe = browser.Document.GetElementsByTagName("iframe").FirstOrDefault() as Gecko.DOM.GeckoIFrameElement;
if (iframe != null)
{
var html = iframe.ContentDocument.DocumentElement as GeckoHtmlElement;
if (html != null)
content = html.OuterHtml;
textBox1.Text = content;
GeckoElementCollection elements = iframe.ContentDocument.GetElementsByName("username");
foreach (var element in elements)
{
GeckoInputElement input = (GeckoInputElement)element;
input.Value = "Auto filled!";
}
}
