HTMLEncodes Image URL and breaks the image

I have a simple script that saves news details such as the news title, news URL and news image URL. I noticed that the image doesn't show when its URL contains non-ASCII characters, for example http://www.bbj.hu/images2/201412/párizsi_ud_20141218113410452.jpg
It is stored in the database as is, but when I display it on the web page it breaks and shows as
http://www.bbj.hu/images2/201412/pa%c2%b4rizsi_ud_20141218113410452.jpg
When I debug my ASP.NET Web Forms page, the URL shows correctly in the code-behind:
protected String getImage(object imgSource)
{
string img = null;
img = imgSource.ToString();
return img;
// Debug show image url properly but it breaks on actual page
}
.aspx code
<asp:Image ID="NewsImage" ImageUrl='<%# getImage(Eval("NewsImageURL")) %>' runat="server" />
I tried different things, but it keeps showing up as http://www.bbj.hu/images2/201412/pa%c2%b4rizsi_ud_20141218113410452.jpg
How can I fix this?

The solution to your problem is likely to be one or more of the following:
C# URL Encode/Decode:
string encodedUrl = HttpUtility.UrlEncode(myUrl);
string sameMyUrl = HttpUtility.UrlDecode(encodedUrl);
Javascript URL Encode/Decode:
function myFunction() {
var uri = myUrl;
var uri_enc = encodeURIComponent(uri);
var uri_dec = decodeURIComponent(uri_enc);
}
C# HTML Encode/Decode:
string encodedHtml = HttpUtility.HtmlEncode(myHtml);
string sameMyHtml = HttpUtility.HtmlDecode(encodedHtml);
Javascript HTML Encode/Decode:
function htmlEncode(value) {
//create an in-memory div, set its inner text (which jQuery automatically encodes)
//then grab the encoded contents back out. The div never exists on the page.
return $('<div/>').text(value).html();
}
function htmlDecode(html) {
return $('<div>').html(html).text();
}
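One way to narrow this down is to decode the broken URL and compare it with what a correct encoding of á would look like. The sketch below is plain JavaScript (it runs in a browser console or in Node), and the variable names are only for illustration. The %c2%b4 in the broken URL decodes to a standalone acute accent (U+00B4) following a plain a, whereas a precomposed á (U+00E1) percent-encodes to %C3%A1. That suggests the stored value may not contain the character it appears to, which is worth checking before adding more encode or decode calls.

```javascript
// URL as it appears on the rendered page (taken from the question).
const brokenUrl =
  "http://www.bbj.hu/images2/201412/pa%c2%b4rizsi_ud_20141218113410452.jpg";

// Decode it to see which characters the page actually received.
const decoded = decodeURI(brokenUrl);

// U+00B4 is the standalone acute accent, not the letter á (U+00E1).
const hasStandaloneAccent = decoded.includes("a\u00b4");

// Assumed intended URL, written with the precomposed letter á (U+00E1).
const intendedUrl =
  "http://www.bbj.hu/images2/201412/p\u00e1rizsi_ud_20141218113410452.jpg";

// A standards-based encoder turns á into the UTF-8 bytes C3 A1.
const encodedIntended = encodeURI(intendedUrl);

console.log(hasStandaloneAccent); // true
console.log(encodedIntended);
```

If the check prints true for your own data, the fix probably belongs where the URL is saved (normalizing or correcting the text) rather than where it is displayed.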

Related

Action name being displayed in PDF toolbar in browser. Setting ContentDisposition does not affect it

Controller code:
[HttpGet]
public FileStreamResult GETPDF(string guid)
{
var stream = XeroHelper.GetXeroPdf(guid).Result;
stream.Position = 0;
var cd = new ContentDisposition
{
FileName = $"{guid}.pdf",
Inline = true
};
Response.AppendHeader("Content-Disposition", cd.ToString());
return File(stream, "application/pdf");
}
As you can see, the method's name is GETPDF, and I am setting the file name in the ContentDisposition header. However, the method name is used as the title in the viewer toolbar rather than the file name.
The file name does get carried through: when I click "Download", it is the default value in the file picker (note that I changed the name to hide the sensitive guid).
If anyone has any ideas on how to change the title in that toolbar, it would be greatly appreciated.
As an aside, this is NOT a duplicate of: C# MVC: Chrome using the action name to set inline PDF title as no answer was accepted and the only one with upvotes has been implemented in my method above and still does not work.
Edit- For clarification, I do not want to open the PDF in a new tab. I want to display it in a viewer in my page. This behavior is already happening with the code I provided, it is just the Title that is wrong and coming from my controller method name. Using the controller code, I am then showing it in the view like so:
<h1>Quote</h1>
<object data="@Url.Action("GETPDF", new { guid = @Model.QuoteGuid })" type="application/pdf" width="800" height="650"></object>

try something like this:
[HttpGet]
public FileResult GETPDF(string guid)
{
var stream = XeroHelper.GetXeroPdf(guid).Result;
using (MemoryStream ms = new MemoryStream())
{
stream.CopyTo(ms);
// Download
//return File(ms.ToArray(), "application/pdf", $"{guid}.pdf");
// Open **(use window.open in JS)**
return File(ms.ToArray(), "application/pdf");
}
}
UPDATE: based on mention of viewer.
To embed in a page you can try the <embed> tag or <object> tag
here is an example
Recommended way to embed PDF in HTML?
ie:
<embed src="https://drive.google.com/viewerng/viewer?embedded=true&url=[YOUR ACTION]" width="500" height="375">
You might need to try the File method with the third parameter to see which works.
If the file name is set there, it may be displayed as the title.
(I'm not sure what a download will do, though; maybe add a download link with the PDF name.)
UPDATE 2:
Another idea:
How are you calling the url?
Are you specifying: GETPDF?guid=XXXX
Maybe try: GETPDF/XXXX (you may need to adjust the routing for this or call the parameter "id" if this is the default)

You could do this simply by adding your file name as part of the URL:
<object data="@Url.Action("GETPDF/MyFileName", new { guid = @Model.QuoteGuid })" type="application/pdf" width="800" height="650"></object>
You should ignore MyFileName in the route config. Firefox's built-in viewer is PDF.js (Chrome ships its own PDFium-based viewer, which, as the question shows, also takes the title from the URL). PDF.js tries to extract the display name from the URL.
According to the PDF.js code, it uses the following functions to extract the display name from the URL:
function pdfViewSetTitleUsingUrl(url) {
this.url = url;
var title = pdfjsLib.getFilenameFromUrl(url) || url;
try {
title = decodeURIComponent(title);
} catch (e) {
// decodeURIComponent may throw URIError,
// fall back to using the unprocessed url in that case
}
this.setTitle(title);
}
function getFilenameFromUrl(url) {
const anchor = url.indexOf("#");
const query = url.indexOf("?");
const end = Math.min(
anchor > 0 ? anchor : url.length,
query > 0 ? query : url.length
);
return url.substring(url.lastIndexOf("/", end) + 1, end);
}
As you can see this code uses the last position of "/" to find the file name.
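To make the effect concrete, here is a minimal sketch that runs the getFilenameFromUrl function quoted above against two URLs. Both URLs are hypothetical (the /Quotes segment, the guid value and the Quote-123.pdf name are assumptions for illustration); the point is only which part of the URL ends up as the title.

```javascript
// Copy of the function quoted above, so the sketch is self-contained.
function getFilenameFromUrl(url) {
  const anchor = url.indexOf("#");
  const query = url.indexOf("?");
  const end = Math.min(
    anchor > 0 ? anchor : url.length,
    query > 0 ? query : url.length
  );
  return url.substring(url.lastIndexOf("/", end) + 1, end);
}

// Hypothetical URL where the action name is the last path segment.
const titleWithoutName = getFilenameFromUrl("/Quotes/GETPDF?guid=abc");

// Hypothetical URL with an extra, purely cosmetic file name segment.
const titleWithName = getFilenameFromUrl("/Quotes/GETPDF/Quote-123.pdf?guid=abc");

console.log(titleWithoutName); // GETPDF
console.log(titleWithName);    // Quote-123.pdf
```

The query string never reaches the title, which is why the file name has to be placed in the path rather than in a parameter.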
The following code is also from PDF.js. I don't know why PDF.js doesn't use it instead of getFilenameFromUrl: it looks for a file name in the path first and falls back to the query string and the fragment.
function getPDFFileNameFromURL(url, defaultFilename = "document.pdf") {
if (typeof url !== "string") {
return defaultFilename;
}
if (isDataSchema(url)) {
console.warn(
"getPDFFileNameFromURL: " +
'ignoring "data:" URL for performance reasons.'
);
return defaultFilename;
}
const reURI = /^(?:(?:[^:]+:)?\/\/[^\/]+)?([^?#]*)(\?[^#]*)?(#.*)?$/;
// SCHEME HOST 1.PATH 2.QUERY 3.REF
// Pattern to get last matching NAME.pdf
const reFilename = /[^\/?#=]+\.pdf\b(?!.*\.pdf\b)/i;
const splitURI = reURI.exec(url);
let suggestedFilename =
reFilename.exec(splitURI[1]) ||
reFilename.exec(splitURI[2]) ||
reFilename.exec(splitURI[3]);
if (suggestedFilename) {
suggestedFilename = suggestedFilename[0];
if (suggestedFilename.includes("%")) {
// URL-encoded %2Fpath%2Fto%2Ffile.pdf should be file.pdf
try {
suggestedFilename = reFilename.exec(
decodeURIComponent(suggestedFilename)
)[0];
} catch (ex) {
// Possible (extremely rare) errors:
// URIError "Malformed URI", e.g. for "%AA.pdf"
// TypeError "null has no properties", e.g. for "%2F.pdf"
}
}
}
return suggestedFilename || defaultFilename;
}

Get HTML string from web in C#, but does not contain data part

I'm trying to get data from a web page (https://finance.naver.com/sise/sise_trans_style.nhn) in my UWP app.
I wrote the following source code in my project.
public class MainPageViewModel : Observable
{
public string urlAddress = "https://finance.naver.com/sise/sise_trans_style.nhn";
public string data { get; set; }
public MainPageViewModel()
{
ButtonClick = new RelayCommand(Click);
}
public async void Click()
{
HttpClient httpClient = new HttpClient();
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
var result = await httpClient.GetStringAsync(new Uri(urlAddress));
data = result;
OnPropertyChanged("data");
}
public RelayCommand ButtonClick { get; set; }
}
The problem is that I'm not getting the document's data part. The following picture shows the part of the document that I want to get.
In the "data" variable I get the document without the data, which is the most important part to me. I only get the other HTML parts.
I was trying to figure out where the data comes from, or which source would let me get it, but I failed.
Is the data produced by JavaScript or Ajax? How can I get the data from that web page? And if I encounter this kind of problem next time, how can I figure out the reason?
(EDIT) Added the HTML source and more detail.
When I look at the HTML document retrieved by my code, I see the following content.
<div class="box_type_m">
<iframe name="time" src="/sise/investorDealTrendTime.nhn?bizdate=20181005&sosok=" width="100%" height="380" marginheight="0" bottommargin="0" topmargin="0" SCROLLING="no" frameborder="0" title="시간별 순매수"></iframe>
</div>
But the iframe has another HTML source (containing its own head and body).

The data you want is in iframes. These are loaded as pages within the page. You can see this in the source.
The actual URLs you should check out are:
https://ssl.pstatic.net/imgfinance/chart/sise/trendUitradeDayKOSPI.png?sid=1538753584555
https://finance.naver.com/sise/investorDealTrendTime.nhn?bizdate=20181005&sosok=
https://finance.naver.com/sise/investorDealTrendDay.nhn?bizdate=20181005&sosok=
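As a rough illustration of how such URLs can be found, the sketch below pulls the src attribute out of each iframe tag in the downloaded HTML and resolves it against the page address. It is written in JavaScript for brevity; the findIframeUrls helper is hypothetical, the HTML string is a shortened copy of the snippet from the question, and a regular expression is a simplification that assumes double-quoted attributes (a real HTML parser is more robust).

```javascript
// Address of the outer page (from the question).
const pageUrl = "https://finance.naver.com/sise/sise_trans_style.nhn";

// Shortened copy of the markup shown in the question.
const pageHtml =
  '<div class="box_type_m">\n' +
  '<iframe name="time" src="/sise/investorDealTrendTime.nhn?bizdate=20181005&sosok=" width="100%"></iframe>\n' +
  "</div>";

// Hypothetical helper: collect every iframe src and make it absolute.
function findIframeUrls(html, baseUrl) {
  const pattern = /<iframe\b[^>]*?\bsrc\s*=\s*"([^"]*)"/gi;
  const urls = [];
  for (const match of html.matchAll(pattern)) {
    // new URL(relative, base) resolves the relative src against the page.
    urls.push(new URL(match[1], baseUrl).href);
  }
  return urls;
}

const iframeUrls = findIframeUrls(pageHtml, pageUrl);
console.log(iframeUrls);
```

Each resulting URL can then be requested separately, in the same way the outer page was, to get the HTML that actually contains the data.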

C# Downloading Instagram Profile As HTML

I have been trying to download a public Instagram profile to fetch stats such as followers and bio. I have been doing this in a C# console application, downloading the HTML with HTML Agility Pack.
Code:
string url = @"https://www.instagram.com/" + Console.ReadLine() + @"/?hl=en";
Console.WriteLine();
HtmlWeb web = new HtmlWeb();
HtmlDocument document = web.Load(url);
document.Save(path1);
When I save it, though, all I get is a bunch of scripts and a blank screen.
I was wondering how to save the HTML once all the scripts have run and built the content.

When you retrieve content using a web request, it returns an HTML document, which is then rendered by the browser to display the content.
Right now, you're saving the HTML document given to you by the server. Instead, you need to render it before extracting the details. One way to do this is with a web browser control: set its URL to the Instagram URL, let the rendering engine handle it, and once the control fires its load event, you can get the rendered HTML output.
From there, you can deserialize it as an XmlDocument and identify exactly which details you need to retrieve from the rendered output.

public MainWindow()
{
InitializeComponent();
// Subscribe before navigating so the load event cannot be missed.
WB_1.LoadCompleted += wb_LoadCompleted;
WB_1.Navigate(@"https://www.instagram.com/" + Console.ReadLine() + @"/?hl=en");
}
void wb_LoadCompleted(object sender, NavigationEventArgs e)
{
// Read the rendered markup once the page has finished loading.
dynamic doc = WB_1.Document;
string htmlText = doc.documentElement.InnerHtml;
}

ANSWER
Thanks for the suggestions on how to download the HTML! I managed to return some instagram information in the end. Here is the code:
//(This was done using HTML Agility Pack)
string url = @"https://www.instagram.com/" + Console.ReadLine() + @"/?hl=en";
HtmlWeb web = new HtmlWeb();
HtmlDocument document = web.Load(url);
var metas = document.DocumentNode.Descendants("meta");
var followers = metas.FirstOrDefault(_ => _.HasProperty("name", "description"));
if (followers == null) { Console.WriteLine("Sorry, Can't Find Profile :("); return; }
var content = followers.Attributes["content"].Value.StopAt('-');
Console.WriteLine(content);
And here are HasProperty() and StopAt():
public static bool HasProperty(this HtmlNode node, string property, params string[] valueArray)
{
var propertyValue = node.GetAttributeValue(property, "");
var propertyValues = propertyValue.Split(' ');
return valueArray.All(c => propertyValues.Contains(c));
}
public static string StopAt(this string input, char stopAt)
{
int x = input.IndexOf(stopAt);
// Return the whole string when the character is not found.
return x < 0 ? input : input.Substring(0, x);
}
NOTE:
However, this is still not the answer I am looking for. I still have a mess of HTML that is not structured the same as the HTML I receive when I look at the page in Google Chrome. By searching through the content-less HTML I managed to salvage a meta tag that contains the content. This is okay for this case, but if I continue with this method of finding HTML content, other pages may not work the same way :(

How to store the url of a web page in an sql database

I have to store the link of a page in a database. The page is in my website.
For example:
I have to store the link of the Result.aspx page in the database. How could I do this? I know that google.com can easily be stored, and it works with google.com, but I want to know how to do this with Result.aspx.
Here is another example: there is an ASP panel in my website and I have to store the URLs of each row of the menu and submenu. These URLs are also in my website, like Default.aspx, Result.aspx, etc.
If you have any questions, please ask.

Your question is not clear to me, but if you want to save your current page URL,
feel free to use this.
string URL = Path.GetFileName(Request.Path);
string sqlIns = "INSERT INTO table (url) VALUES (@url)";
db.Open();
try
{
// "using" disposes the command even if the insert throws.
using (SqlCommand cmdIns = new SqlCommand(sqlIns, db.Connection))
{
// Pass the URL as a parameter instead of concatenating it into the SQL.
cmdIns.Parameters.AddWithValue("@url", URL);
cmdIns.ExecuteNonQuery();
}
}
finally
{
db.Close();
}

If I understood you correctly, the problem is not with the database itself but with the relative URL of the page. If your paths are http://myWebSite.com/Result.aspx and http://myWebSite.com/Default.aspx, then you should save strings such as "~/Default.aspx". If your path is like http://myWebSite.com/someRoute/Result.aspx, then you should save the string "~/someRoute/Result.aspx".
To get this route you can use the following code:
string path = HttpContext.Current.Request.Url.AbsolutePath; // /someRoute/Result.aspx

It is simple:
private HtmlGenericControl LIList(string innerHtml, string rel, string url)
{
HtmlGenericControl li = new HtmlGenericControl("li");
li.Attributes.Add("rel", rel);
li.InnerHtml = "" + innerHtml + "";
return li;
}
Here url is the link that is saved in the database.

redirection to other page with encoded url

I am using an image on my HTML page, and I am wrapping it in an anchor to make it a link.
The anchor's href is ~/LB/lct.aspx?pid=177&cat=Happily In Love
The main thing is that this URL comes from the database; I am not entering it manually.
So it is an invalid URL because of the spaces in Happily In Love.
I also tried HttpUtility.UrlEncode and UrlDecode, but the problem I am facing is that
the URL is encoded properly, yet when I click the image it does not redirect to the proper page because the encoded URL is not decoded.
How can I resolve this? Please help.

Here is the code:
string url = "~/LB/lct.aspx?pid=177&cat=Happily In Love"; //your input
string[] arr = url.Split('?');
// ParseQueryString returns a collection whose ToString() URL-encodes
// each value, so encoding the values again here would double-encode them.
var nameValues = HttpUtility.ParseQueryString(arr[1]);
url = arr[0] + "?" + nameValues.ToString(); //your output
Request.QueryString already returns decoded values, so they can be read directly:
string cat = Request.QueryString["cat"];

Simple answer, but if it's just the spaces that are causing problems, consider replacing them with a + using a string replace:
string newurl = url.Replace(" ","+");
Note that this is only really safe if the spaces are restricted to the contents of a querystring.
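To illustrate why the + replacement only applies inside a query string, here is a small JavaScript sketch using the standard URLSearchParams and encodeURIComponent functions. The pid and cat values come from the question; the variable names are assumptions for illustration. In a query string a space may be written as +, but in a path segment it has to be %20, because a + in a path is a literal plus sign.

```javascript
// Build the query string from the values in the question.
const params = new URLSearchParams({ pid: "177", cat: "Happily In Love" });
const query = params.toString();

// Link target as it would appear in the anchor's href.
const href = "/LB/lct.aspx?" + query;

// Parsing the query string again turns each + back into a space.
const roundTripped = new URLSearchParams(query).get("cat");

// For a path segment, spaces must be percent-encoded instead.
const pathSegment = encodeURIComponent("Happily In Love");

console.log(href);         // /LB/lct.aspx?pid=177&cat=Happily+In+Love
console.log(roundTripped); // Happily In Love
console.log(pathSegment);  // Happily%20In%20Love
```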
