I am trying to encode a product SKU in our Product Filter Module.
The Detailed Product View uses the following code to retrieve the appropriate product information. The problem arises when a SKU contains a forward slash: for example, with BD1115/35 the code below only detects the first part.
var prodCode = Request.QueryString["sku"];
var decodeprodCode = HttpUtility.UrlDecode(prodCode);
It was suggested that I encode the URL. Now I am trying to do this with Mustache which is a templating engine. Look at {{StockCode}} after SKU. This does not work.
<img class='responsive productimage' src='{{ProductImage}}' alt='{{StockDescription}}' />
I had a look at this question: Using Request.QueryString, slash (/) is added to the last querystring when it exists in the first querystring
Update
I have created a new Object in the Backend which is called QueryStringSKU and I am encoding it before it is replaced with Mustache. So the SKU BDF5555/45 will render in the href as BDF5555%2F45.
The problem now comes in when I try to Decode the URL. The querystring is now showing BDF5555&45.
Somehow DotNetNuke is rewriting the URL, and the 45 that is part of the Stock Keeping Unit (SKU) is still being ignored.
I ended up using this code:
string RawurlFromRequest = Request.RawUrl;
var cleanSKU = RawurlFromRequest.Split(new[] { "sku/" }, StringSplitOptions.None)[1];
var decodeprodCode = cleanSKU.Split(new[] { "&" }, StringSplitOptions.None)[0];
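A slightly more defensive variant of the same idea, as a minimal sketch. It assumes the SKU always follows a literal "sku/" segment in Request.RawUrl and was percent-encoded when the link was rendered (for example with Uri.EscapeDataString). The SkuUrl class and the ExtractSku name are hypothetical and not part of DotNetNuke.

```csharp
using System;

public static class SkuUrl
{
    // Hypothetical helper: returns the decoded value that follows "sku/" in a raw URL,
    // or an empty string when the marker is not present.
    public static string ExtractSku(string rawUrl)
    {
        const string marker = "sku/";
        int start = rawUrl.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (start < 0)
        {
            return string.Empty;
        }
        start += marker.Length;

        // Assumption: the SKU ends at the next '&' or '?', or at the end of the URL.
        int end = rawUrl.IndexOfAny(new[] { '&', '?' }, start);
        string encoded = end < 0
            ? rawUrl.Substring(start)
            : rawUrl.Substring(start, end - start);

        // Turns %2F back into '/', so BDF5555%2F45 becomes BDF5555/45.
        return Uri.UnescapeDataString(encoded);
    }
}
```

In the page this would be called as SkuUrl.ExtractSku(Request.RawUrl). Reading the raw URL avoids whatever rewriting happens before Request.QueryString is populated.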
Related
I'm having an issue with encoding when I retrieve information such as customer names or order info through a web service API. I'm using C# to call the API; here is an example of the encoding problem:
1) Here is the value as seen in the MySQL database (input source): TestAcctééa. The accented characters "éé" are displayed correctly.
2) Here is the value as seen when I retrieve the information through the API: TestAcct????a. This happens with all special characters (é, ç, ê, ...). I cannot display the string correctly in the console, and when I insert it into the target database (MSSQL), the question marks stay in place of the special characters.
Here is my example code to get this particular information with the API:
filters filters = new filters();
WebserviceApi service = new WebserviceApi();
string login = service.login("******", "********");
string test = null;
List<customerCustomerEntity> customers = service.customerCustomerList(login, filters).ToList();
foreach (var customer in customers)
{
    if (customer.email == "test@gmail.com")
    {
        test = customer.firstname;
    }
}
MessageBox.Show(test);
I have already tried different solutions from forums, such as changing or converting the encoding in C#, but none of them worked.
By the way, the encoding of the source database is UTF-8 Unicode (utf8).
Thanks for your help.
When running Veracode, it generated a bunch of errors pointing to the lines with InnerHtml.
For example, one of those lines is:
objUL.InnerHtml += "<tr><td></td><td class=\"libraryEdit\">" + HttpUtility.HtmlEncode(dtitems.Rows[currentitem]["content"].ToString()) + "</td>";
What alternatives exist to fix it without using HTML server controls?
What exactly are you trying to do, and what exactly does Veracode say?
Most likely, it is complaining that you could end up with an arbitrary code injection vulnerability if the data passed into your InnerHtml is untrusted and could contain malicious JavaScript.
The tool may not complain if you build each DOM element yourself with the JavaScript createElement function instead.
I have faced this issue in my ASP.NET Webforms application. The fix to this is relatively simple.
Install HtmlSanitizationLibrary from the NuGet Package Manager and reference it in your application.
In the code-behind, use the Sanitizer class as follows.
For example, if the current code looks something like this:
YourHtmlElement.InnerHtml = "Your HTML content" ;
Then, replace this with the following:
string unsafeHtml = "Your HTML content";
YourHtmlElement.InnerHtml = Sanitizer.GetSafeHtml(unsafeHtml);
This fix will remove the Veracode finding and make sure that the string is still rendered as HTML. Encoding the string in the code-behind would instead display it as literal text rather than raw HTML, because it is encoded before rendering begins.
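To illustrate the difference between encoding and sanitizing, here is a small sketch of what encoding does to markup. It uses System.Net.WebUtility.HtmlEncode from the framework; the MarkupDemo class and the AsLiteralText name are made up for this example.

```csharp
using System.Net;

public static class MarkupDemo
{
    // Returns the input with HTML-significant characters replaced by entities,
    // so a browser shows the markup as literal text instead of interpreting it.
    public static string AsLiteralText(string html)
    {
        return WebUtility.HtmlEncode(html);
    }
}
```

MarkupDemo.AsLiteralText("<b>Sale</b>") returns "&lt;b&gt;Sale&lt;/b&gt;", which the browser displays as the text <b>Sale</b> rather than as bold text. A sanitizer, by contrast, is meant to keep harmless tags and strip dangerous ones.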
I am getting data from a web service endpoint and placing it into a list in a foreach loop. The service gets its data from a WordPress website.
var list = new ItemList
(
(string)data.id.ToString(),
(string)data.name,
(string)subcategory
);
I then print this on the XAML page. The code works in that it successfully gets the data from the service and prints it on the page of my Windows 8 app.
However, in (string)data.name, which is the name of the item, if the name contains an "&" it shows up in the app as &#038;. Also, if an item name contains an apostrophe, as in "'s", it shows up as &#8217;.
E.g. D & G shows up as D &#038; G.
The "&" and "'" show up as these weird symbols.
How do I get rid of these so that they render correctly in the app?
I'm going to take the risk of giving you a wrong hint, because I guess you're talking about a Windows 8 Store App (XAML), thus you don't have access to every class on .NET, but...
What about decoding HTML entities?
Check this HttpUtility method: HttpUtility.HtmlDecode.
You'll need to add a reference to System.Web in your Visual Studio project.
Alternatively, check WebUtility.HtmlDecode, which is in System.dll and thus available for Windows 8 Store apps.
It looks like the service is returning XML-escaped entities. &#038; means a character with a code of (decimal) 38 (which is &). &#8217; is similar and means a code of 8217 (which is ’).
You can decode these using System.Web.HttpUtility.HtmlDecode(inputString), but that requires a reference to System.Web. If you don't want to or cannot reference that, you can try something like this:
var xml = new XmlDocument();
xml.LoadXml("<x>" + inputString + "</x>");
var output = xml.InnerText;
Given Testing &#8217;stuff&quot; &#038; things, it will return Testing ’stuff" & things.
I'd go with HtmlDecode() if you can, but absolutely try and avoid rolling your own decoder unless you have no other choice.
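For a Store app where System.Web cannot be referenced, a minimal sketch using System.Net.WebUtility.HtmlDecode might look like this. The EntityDemo class and the DecodeName name are invented for illustration, and the sample strings are the ones from the question.

```csharp
using System.Net;

public static class EntityDemo
{
    // Decodes numeric and named HTML entities in a value returned by the service.
    public static string DecodeName(string raw)
    {
        return WebUtility.HtmlDecode(raw);
    }
}
```

EntityDemo.DecodeName("D &#038; G") returns "D & G", and the &#8217; entity comes back as the ’ character. In the question's code this would wrap (string)data.name before it is passed to ItemList.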
You can use WebUtility.HtmlDecode Method (String)
Or, if you don't want to add additional libraries, you can use something like this:
public string Decode(string text)
{
    // Map each entity to the text it represents; extend as needed.
    var replacements = new Dictionary<string, string> {
        { "&#8217;", "’" },
        // ...etc
    };
    var sb = new StringBuilder(text);
    foreach (var entity in replacements.Keys) {
        sb.Replace(entity, replacements[entity]);
    }
    return sb.ToString();
}
I want to get book information such as author name, pages, publication year, etc. from Amazon using HtmlAgilityPack, but Amazon's web pages seem to have some problems and I can't access the appropriate fields.
Here is what I've done:
I use Firefox with Firebug + FirePath to retrieve the desired XPath, and then in my code I use HtmlAgilityPack to get the information with the XPath acquired from Firebug.
But no luck: so far I couldn't access the "Product Details" part of amazon.com.
This is my XPath (which works only with HtmlAgilityPack):
HtmlAgilityPack.HtmlNodeCollection cnt = doc.DocumentNode.SelectNodes("//*[@class='content']");
int i=1;
foreach (HtmlAgilityPack.HtmlNode content in cnt)
{
if (i != 3)
{
i++;
continue;
}
if (i == 3) // i==3 means I've reached the product details but I can't go any further :(
{
s = content.SelectSingleNode("").OuterHtml;
// break;
}
}
How can I access Product Details using an appropriate, understandable XPath for HtmlAgilityPack?
And why is the XPath syntax from Firebug + FirePath different from HtmlAgilityPack's?
As @Mystere said, I suggest using the API. But if you are doing this for testing purposes, or just because you want to use web scraping to obtain the info (I'm not sure whether Amazon allows it; you should check before doing this), here is the thing:
Why are you doing this?
s = content.SelectSingleNode("").OuterHtml;
The following is what you are looking for in case you want to get the HTML source of that part of the page.
s = content.OuterHtml;
When you are scraping, I suggest you try to identify the part you need to scrape and look at the particularities of that block of content.
If you use:
var node = doc.DocumentNode.SelectSingleNode("//td[@class='bucket']/div[@class='content']");
that will give you the Product Details block you are looking for.
If you want to get some fields like Paperback, Publisher, ... you can do:
string paperback = node.SelectSingleNode("./ul/li[1]/text()").InnerText;
string publisher = node.SelectSingleNode("./ul/li[2]/text()").InnerText;
string language = node.SelectSingleNode("./ul/li[3]/text()").InnerText;
...
If you want to be sure that the XPath you are using will be correct for HtmlAgilityPack, open the page in Internet Explorer 8 (or 9) and use the Developer Tools (F12) to get the XPath. The thing is that each browser renders the HTML in its own way. For example, you will always see <tbody> tags in Firefox right after a <table>, but HtmlAgilityPack may not, and that simple detail of adding /tbody/ to your XPath can make your program fail.
Why don't you just use Amazon's web service API, which is designed to do this?
In JavaScript:
encodeURIComponent("©√") == "%C2%A9%E2%88%9A"
Is there an equivalent for C# applications? For escaping HTML characters I used:
txtOut.Text = Regex.Replace(txtIn.Text, @"[\u0080-\uFFFF]",
m => @"&#" + ((int)m.Value[0]).ToString() + ";");
But I'm not sure how to convert the match to the correct hexadecimal format that JS uses. For example this code:
txtOut.Text = Regex.Replace(txtIn.Text, @"[\u0080-\uFFFF]",
m => @"%" + String.Format("{0:x}", ((int)m.Value[0])));
Returns "%a9%221a" for "©√" instead of "%C2%A9%E2%88%9A". It looks like I need to split the string up into bytes or something.
Edit: This is for a windows app, the only items available in System.Web are: AspNetHostingPermission, AspNetHostingPermissionAttribute, and AspNetHostingPermissionLevel.
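The "split the string up into bytes" idea can be sketched as follows, assuming UTF-8 is the target encoding (which is what encodeURIComponent uses). PercentEncoding and EncodeNonAscii are hypothetical names. Note that, like the regex above, this sketch only touches non-ASCII characters; it does not escape reserved ASCII characters such as "&" or "/".

```csharp
using System;
using System.Text;

public static class PercentEncoding
{
    // Percent-encodes every non-ASCII character as its UTF-8 bytes (%XX per byte)
    // and leaves ASCII characters unchanged.
    public static string EncodeNonAscii(string input)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(input))
        {
            if (b < 0x80)
            {
                sb.Append((char)b);                      // plain ASCII byte
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2")); // two uppercase hex digits
            }
        }
        return sb.ToString();
    }
}
```

PercentEncoding.EncodeNonAscii("©√") returns "%C2%A9%E2%88%9A". Uri.EscapeDataString lives in System.dll, gives the same result for this input, and also escapes reserved ASCII characters.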
Uri.EscapeDataString or HttpUtility.UrlEncode is the correct way to escape a string meant to be part of a URL.
Take for example the string "Stack Overflow":
HttpUtility.UrlEncode("Stack Overflow") --> "Stack+Overflow"
Uri.EscapeUriString("Stack Overflow") --> "Stack%20Overflow"
Uri.EscapeDataString("Stack + Overflow") --> "Stack%20%2B%20Overflow" (also encodes "+" as "%2B")
Only the last is correct when used as an actual part of the URL (as opposed to the value of one of the query string parameters).
HttpUtility.HtmlEncode / Decode
HttpUtility.UrlEncode / Decode
You can add a reference to the System.Web assembly if it's not available in your project
I tried to write a fully compatible analog of JavaScript's encodeURIComponent for C#, and after four hours of experiments I found this:
C# code:
string a = "!@#$%^&*()_+ some text here али мамедов баку";
a = System.Web.HttpUtility.UrlEncode(a);
a = a.Replace("+", "%20");
the result is:
!%40%23%24%25%5e%26*()_%2b%20some%20text%20here%20%d0%b0%d0%bb%d0%b8%20%d0%bc%d0%b0%d0%bc%d0%b5%d0%b4%d0%be%d0%b2%20%d0%b1%d0%b0%d0%ba%d1%83
After you decode it with JavaScript's decodeURIComponent(),
you will get this:
!@#$%^&*()_+ some text here али мамедов баку
Thank you for your attention.
System.Uri.EscapeUriString() didn't seem to do anything, but System.Uri.EscapeDataString() worked for me.
Try Server.UrlEncode(), or System.Web.HttpUtility.UrlEncode() for instances when you don't have access to the Server object. You can also use System.Uri.EscapeUriString() to avoid adding a reference to the System.Web assembly.
For a Windows Store App, you won't have HttpUtility. Instead, you have:
For a URI, before the '?':
System.Uri.EscapeUriString("example.com/Stack Overflow++?")
-> "example.com/Stack%20Overflow++?"
For a URI query name or value, after the '?':
System.Uri.EscapeDataString("Stack Overflow++")
-> "Stack%20Overflow%2B%2B"
For an x-www-form-urlencoded query name or value, in POST content:
System.Net.WebUtility.UrlEncode("Stack Overflow++")
-> "Stack+Overflow%2B%2B"
You can use the Server object in the System.Web namespace
Server.UrlEncode, Server.UrlDecode, Server.HtmlEncode, and Server.HtmlDecode.
Edit: The poster added that this was a Windows application and not a web one, as one might assume. The items listed above are available from the HttpUtility class in System.Web, which must be added as a reference to the project.