I was just thinking about the URLs of my current web project. The user can access different resources, like images, through the web site. The URLs look something like this: http://localhost:2143/p/AyuducjPnfnjZGfnNdpAIumehLiWaYQKbZLMeACUqgsYJfsqarTnDMRbwkIxWuDd
Now, I really need high performance, and one way to get it could be to skip the extra round trip to the database for authentication and just rely on the URL being unguessable.
Google does this with Picasa Web Albums: you can make an album private or unlisted. This secures the album but not the photos themselves. Take this photo of Skagen (Denmark): http://lh4.ggpht.com/_Um1gIFfF614/TQpVMvN3hPI/AAAAAAAANRs/GY5DxrDPHUE/s800/IMG_4074.JPG. It's actually in a private album, but you can all see it.
So what is your take on this? Is a 64 character long random string "secure" enough? Are there other approaches?
Let's say I choose to do authentication for each request to the resources. The users have logged in to the site on somedomain.com, where they access, let's say, their photo albums. A cookie is dropped to maintain their authentication.
Now the actual photos are served through some form of CDN or storage service on a completely different URL.
How would you maintain authentication across multiple domains? Let's say the content of two albums could be delivered from two different servers.
Do the math. 64 characters chosen cryptographically at random (NOT rand()!) from an alphabet of 62 possible values (26+26+10: uppercase/lowercase/digits) will yield 5.16e+114 possible strings (62^64). Trying a million combinations a second, it would take 1.63e+101 years (more than a googol) to guess the code. It's probably good enough. A shorter one is probably pretty good too.
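For illustration, here is a minimal C# sketch of generating such a code (the class and method names are made up; the important part is using a cryptographic RNG rather than System.Random):

using System;
using System.Security.Cryptography;
using System.Text;

static class TokenGenerator
{
    const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Returns a 64-character token drawn from the 62-character alphabet.
    public static string NewToken(int length = 64)
    {
        var bytes = new byte[length];
        using (var rng = RandomNumberGenerator.Create())
            rng.GetBytes(bytes);

        var sb = new StringBuilder(length);
        foreach (byte b in bytes)
            sb.Append(Alphabet[b % Alphabet.Length]); // tiny modulo bias; negligible at this length
        return sb.ToString();
    }
}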
64 characters * 6 bits of entropy each (Base-64 encoding, right?) is a 384-bit key. That is far beyond brute-force range by today's standards even if the key could be tested off-line (128 bits of real randomness is already considered unguessable). And since the key can only be tested against your live system, it will be quite effective in practice, and you can also add active countermeasures to block clients that try many bad keys.
You're probably at much higher risk of the keys becoming public through server logs, browser logs, referrer headers, transparent proxies, etc.
There is definitely a risk in only using an "unguessable" URL. It really depends on what sort of stuff you are trying to secure. Take Picasa: it's photos being secured, not bank records, so a random query string is fine. Also, the larger your website gets, the larger the attack surface you open up. It is one thing if there is only one such page; it could take a fair bit of scanning to figure out the single URL in use. But if you have hundreds of thousands of pages like that, attackers are far more likely to "guess" a right one.
So, I don't really have an answer for you, just some advice on the "unguessable" URL approach: don't do it. It's not secure.
Here are my 2 cents. I had a similar problem. Our initial approach was to rename the file with a random but unique name and run two-way encryption with a complex key over that name. But things eventually boiled down to the fact that once a URL is in someone's hands, you can't guarantee the content's privacy. We eventually went down the DB-based authentication route. See here
Edit#1:
On the CDN issue, I am not sure what the solution would be. But even if what martona says is correct, one of the purposes of a CDN is to reduce load on your main servers, and pinging back to the server for each resource is probably not a good idea.
There's no such thing as an unguessable URL, and even if there were the very first time you used it over a non-SSL connection it could be seen by anyone who wanted to, by ISPs and by proxies, caches, etc. Do you really want your users/customers to trust their private photos to "unguessability"?
Making URLs unguessable isn't a great approach to security, unless your unique URLs have a time limit on their usefulness (e.g. they're short-lived URLs).
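As a sketch of that idea, a short-lived URL can be built by signing the path plus an expiry timestamp with a server-side secret; the server recomputes the HMAC on each request and rejects expired or tampered links. (The names and key below are made up; this is one common way to do it under those assumptions, not a prescribed API.)

using System;
using System.Security.Cryptography;
using System.Text;

static class SignedUrl
{
    // Hypothetical secret; never ship it to clients.
    static readonly byte[] Key = Encoding.UTF8.GetBytes("replace-with-a-real-secret");

    // Produces e.g. /photo/42?expires=1700000000&sig=...
    public static string Sign(string path, TimeSpan lifetime)
    {
        long expires = DateTimeOffset.UtcNow.Add(lifetime).ToUnixTimeSeconds();
        string payload = path + "|" + expires;
        using (var hmac = new HMACSHA256(Key))
        {
            string sig = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
            return path + "?expires=" + expires + "&sig=" + Uri.EscapeDataString(sig);
        }
    }
}

The serving side re-runs the same computation on the received path and expires value and compares signatures before streaming the resource.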
I'm making a C# (WinForms) app that I want the user to be able to execute only a defined number of times (say 100 times). I know it's possible to add an XML or text file to store a count, but that would be easy to edit or "crack"... Is there any way to embed the counter in the code, or any other way that might not be easy to crack? And one that also makes it easy later to "update" the membership for another period of 100 executions?
Thanks in advance
There are lots of ways to store a variable. As you've noted, you can write it to a text or xml file. You could write it to the Registry. You could encrypt it and write it in a file somewhere.
Probably the most secure method is to write it on a server and have the application "call home" whenever it wants to run.
Preventing copying is a difficult balancing act - treat your legitimate customers too much like criminals and they'll leave you.
If you're talking about memberships, your application may be web connected. If that is the case, you could verify the instance against a web service on your server that holds and increments the count and issues an "OK/not OK to run" reply.
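A minimal sketch of that check (the endpoint, reply format, and key scheme here are invented for illustration):

using System;
using System.Net;

static class LicenseClient
{
    // Hypothetical endpoint that decrements the remaining-run count
    // and answers "OK" or "DENY".
    const string Endpoint = "https://example.com/license/use?key=";

    public static bool TryConsumeRun(string licenseKey)
    {
        try
        {
            using (var client = new WebClient())
            {
                string reply = client.DownloadString(Endpoint + Uri.EscapeDataString(licenseKey));
                return reply.Trim() == "OK";
            }
        }
        catch (WebException)
        {
            // No connection: fail closed, or allow a cached grace allowance.
            return false;
        }
    }
}

Renewing the membership then becomes a server-side change (reset the count for that key), with nothing to patch on the client.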
If you don't want to do this, I have heard of an application that uses steganography to hide relevant details in certain files - you could hide your count in some of your image resources.
Create multiple files containing the counter of how many times your app has run. Give these files different names and store them in different locations so they are hard for the user to locate, delete, or crack. The reason it is not just one file: if the user finds one of your files and alters or deletes it, you still have the other files containing the valid information about your app.
If your application is a commercial product, it might be worth having a look at security products from commercial vendors like SafeNet.com, for example.
A few years ago I used the HASP HL hardlock for a project, which worked just fine.
They offer hardware dongles for software protection as well as software based protection (using authentication services over the internet), and combinations of both.
Their products allow for very fine grained control of what you want to allow your users, e.g. how many times an application may be started before it expires (which would be just what you want) or time-expiration, or feature packages, or any combination of it all.
The downside is, that they have very "healthy" licensing prices.
Whether this is worth it will depend on the size and price of your own application.
I have an MVC3 app written in C# that I'd like to generate rel=canonical tags for. In searching SO for ways to achieve this automatically, I came across this post.
I implemented it in my dev environment and it works as intended and generates tags such as
<link href="http://localhost/" rel="canonical" />.
My question is, what good does this do? Shouldn't the canonical URL point to explicitly where I want it to (i.e. my production site), rather than whatever the URL happens to be?
The reason I bring this up is because my hosting provider (who shall remain nameless for now) also generates another URL that points to my site (same IP address, just a different hostname; I have no idea why, and they claim it's for reverse DNS purposes, but that's another subject). However, I've started seeing my page show up in Google search results under this mirrored URL. Not good for SEO, since it's "duplicate content". Now, I've fixed it by simply configuring my IIS site to respond only to requests for my site's domain; however, it seemed a good time to look at what kind of solution canonical URLs could have provided here.
Using the solution in the post above, the rel=canonical link tag would have output a canonical URL containing the MIRRORED URL if someone were to go to the mirrored site, which is not at all what I would want. It should ALWAYS be <link rel="canonical" href="http://www.productionsite.com" />, regardless of the URL in the address bar, right? I mean, isn't that the point of canonical URLs or am I missing something?
Assuming I'm correct, is there an accepted, generic way to generate canonical URLs for an MVC3 app? I can obviously define them individually for every page, or I can simply replace the rawUrl.Host parameter in the solution I linked with a hard-coded domain name, I'm just wondering why I see so many examples of people generating canonical URLs this way when it doesn't seem to fit the purpose (at least in my example). What problem are they trying to solve by just inserting the current URL into a rel=canonical link element?
Great question, and you're bang on regarding the mirrored site still getting marked as canonical. In fact, you've got to fix a couple problems before it hammers your "link juice" any harder.
I suspect the main reason is that MVC, by design, is a URL rewriting/routing system. Depending on the massaging applied to the originally requested URL, people set the canonical link to the "settled on" final URL format, post-rewriting.
That said, I think you've dialed in on an oversight most people make: what about URLs that reached the page but were NOT anticipated and rewritten into the valid, canonical path? The answer is to rewrite these "bad requests" as you discover them. For example: if you rewrote your ISP's mirrored-domain requests, then by the time the request reaches the loaded page it is a valid URL, because it was "fixed" by your rewrite rules. So you'll need to update your MVC routes to handle the bad route created by your ISP.
NOTE: when building the canonical link value, you MUST make sure you use the final, rewritten URL, not the originally requested one.
Continue on for my WWW vs. non-WWW tip, as well as a concern about something you mentioned regarding not processing the invalid URLs.
People also do this because your site already "mirrors" another domain that people always forget about: the "WWW" subdomain.
Believe it or not, although debated, many claim that having both www.yourdomain.com/mypage.htm and yourdomain.com/mypage.htm actually hurts your page ranking due to "duplicated" content. I suspect this is why people show the "same domain" there, because it's actually the domain stripped of the "WWW". (I use a rewrite rule to make www vs. no-www consistent.)
Also, be careful regarding "configuring my IIS site to respond only to requests to my site's domain", because if Google still sees links there and considers them part of your site, it might actually penalize you for having pages that fail to load (i.e. 404s). I recommend having a rewrite rule that sends them to your "real" domain, OR at least have the canonical link use only your "real" domain, with the WWW consistently there or not there. (It is argued which is better; I don't think it matters as long as you are consistent.)
What problem are they trying to solve by just inserting the current URL into a rel=canonical link element?
None! They just make things even worse! They have simply been misled.
There are misleading answers on this matter, here on Stack Overflow too, which have been accepted and up-voted!
The whole concept is to produce a unique link ID in the canonical tag for each page with different content.
So a good way to produce unique links for your canonical tags is to base them on controller name, action name, and language: those three variations produce different content.
Domain, protocol, and letter casing don't!
See the question and my answer here for a better understanding.
MVC Generating rel="canonical" automatically
Creating a canonical URL based on the current URL does NOT do any good. You should create the canonical URL from something static, like database information. For instance, say your URL includes the title of a book: pull that book title from the database and create the canonical URL from THAT, not from the current page's URL. That way, if part of the URL is missing and the page still displays, the canonical URL will always be the same.
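As a concrete illustration, a small MVC3 helper can pin the host to the production domain and take the path from static data rather than from the request (ProductionHost and the helper name are placeholders of mine, not from the linked post):

using System.Web.Mvc;

public static class CanonicalHelper
{
    // Hypothetical fixed production host; the point is that it is
    // never read from the incoming request.
    const string ProductionHost = "http://www.productionsite.com";

    // Usage in a view: @Html.CanonicalLink("/books/" + Model.Slug)
    public static MvcHtmlString CanonicalLink(this HtmlHelper html, string path)
    {
        var tag = new TagBuilder("link");
        tag.Attributes["rel"] = "canonical";
        tag.Attributes["href"] = ProductionHost + path;
        return MvcHtmlString.Create(tag.ToString(TagRenderMode.SelfClosing));
    }
}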
I am working on my mapper and I need to get the full map of newegg.com
I could try to scrape NE directly (which kind of violates NE's policies), but they have many products that are not available via direct NE search, only via google.com search; and I need those links too.
Here is the search string that returns 16 million results:
https://www.google.com/search?as_q=&as_epq=.com%2FProduct%2FProduct.aspx%3FItem%3D&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=newegg.com&as_occt=url&safe=off&tbs=&as_filetype=&as_rights=
I want my scraper to go over all results and log hyperlinks to all these results.
I can scrape all the links from Google search results, but Google has a limit of 100 pages per query (1,000 results), and again, Google is not happy with this approach. :)
I am new to this; could you advise or point me in the right direction? Are there any tools or methodologies that could help me achieve my goals?
I am new to this; Could you advise / point me in the right direction? Are there any tools/methodology that could help me to achieve my goals?
Google takes a lot of steps to prevent you from crawling their pages and I'm not talking about merely asking you to abide by their robots.txt. I don't agree with their ethics, nor their T&C, not even the "simplified" version that they pushed out (but that's a separate issue).
If you want to be seen, then you have to let google crawl your page; however, if you want to crawl Google then you have to jump through some major hoops! Namely, you have to get a bunch of proxies so you can get past the rate limiting and the 302s + captcha pages that they post up any time they get suspicious about your "activity."
Despite being thoroughly aggravated about Google's T&C, I would NOT recommend that you violate it! However, if you absolutely need to get the data, then you can get a big list of proxies, load them in a queue and pull a proxy from the queue each time you want to get a page. If the proxy works, then put it back in the queue; otherwise, discard the proxy. Maybe even give a counter for each failed proxy and discard it if it exceeds some number of failures.
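A rough sketch of the proxy-queue idea from the paragraph above (WebClient-based; the failure threshold and structure are just one reasonable shape for it):

using System;
using System.Collections.Generic;
using System.Net;

class ProxyRotator
{
    readonly Queue<WebProxy> _proxies = new Queue<WebProxy>();
    readonly Dictionary<WebProxy, int> _failures = new Dictionary<WebProxy, int>();
    const int MaxFailures = 3;

    public ProxyRotator(IEnumerable<string> addresses)
    {
        foreach (var a in addresses) _proxies.Enqueue(new WebProxy(a));
    }

    // Fetch a page through the next proxy; working proxies go back into
    // the queue, repeat offenders get discarded.
    public string Fetch(string url)
    {
        while (_proxies.Count > 0)
        {
            var proxy = _proxies.Dequeue();
            try
            {
                using (var client = new WebClient { Proxy = proxy })
                {
                    string page = client.DownloadString(url);
                    _proxies.Enqueue(proxy);
                    return page;
                }
            }
            catch (WebException)
            {
                int fails;
                _failures.TryGetValue(proxy, out fails);
                _failures[proxy] = ++fails;
                if (fails < MaxFailures) _proxies.Enqueue(proxy);
            }
        }
        throw new InvalidOperationException("No working proxies left.");
    }
}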
I've not tried it, but you can use Google's Custom Search API. Of course, it starts to cost money after 100 searches a day. I guess they must be running a business ;p
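For reference, the Custom Search JSON API is a plain HTTP GET; something like the following (the key and engine ID are placeholders you would get from the Google API console, and I haven't verified the quota details beyond what the answer above states):

using System;
using System.Net;

static class CustomSearch
{
    const string ApiKey   = "YOUR_API_KEY";   // placeholder
    const string EngineId = "YOUR_CSE_ID";    // placeholder

    // Returns a JSON document with up to 10 results; 'start' pages
    // through them (1, 11, 21, ...).
    public static string Search(string query, int start)
    {
        string url = "https://www.googleapis.com/customsearch/v1"
                   + "?key=" + ApiKey
                   + "&cx=" + EngineId
                   + "&q=" + Uri.EscapeDataString(query)
                   + "&start=" + start;
        using (var client = new WebClient())
            return client.DownloadString(url);
    }
}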
It might be a bit late, but I think it is worth mentioning that you can scrape Google professionally and reliably without causing problems.
Actually, scraping Google poses no threat that I know of.
It is challenging if you are inexperienced, but I am not aware of a single case of legal consequences, and I always follow this topic.
Maybe one of the largest cases of scraping happened some years ago, when Microsoft scraped Google to power Bing. Google was able to prove it by placing fake results which do not exist in the real world, and Bing suddenly picked them up.
Google named and shamed them; that's all that happened, as far as I remember.
Using the API is rarely a real option; it costs a lot of money to use for even a small number of results, and the free allowance is rather small (40 lookups per hour before ban).
The other downside is that the API does not mirror the real search results. In your case that is maybe less of a problem, but in most cases people want the real ranking positions.
Now, if you do not accept Google's TOS, or choose to ignore it (they did not care about your TOS when they scraped you in their startup days), you can go another route.
Mimic a real user and get the data directly from the SERPs.
The key here is to send around 10 requests per hour (this can be increased to 20) from each IP address (yes, you use more than one IP). That amount has proven to cause no problems with Google over the past years.
Use caching, databases, ip rotation management to avoid hitting it more often than required.
The IP addresses need to be clean, unshared and if possible without abusive history.
The originally suggested proxy list would complicate the topic a lot, as you receive unstable, unreliable IPs with questionable histories of abuse and sharing.
There is an open-source PHP project at http://scraping.compunect.com which contains all the features you need to get started. I used it for my work, which has now been running for some years without trouble.
That's a finished project, mainly built to be used as a customizable base for your own project, but it runs standalone too.
Also, PHP is not a bad choice: I was originally sceptical, but I ran PHP (5) as a background process for two years without a single interruption.
The performance is easily good enough for such a project, so I would give it a shot.
Otherwise, PHP code is like C/Java: you can see how things are done and repeat them in your own project.
I want to pass information to authenticate a user to an XBAP application running in a browser. It's a username and password, where the password is hashed.
I've figured out how to do it via GET request (i.e. just pass in the information in a query string and use BrowserInteropHelper.Source.Query to get the information).
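For the record, the GET variant looks roughly like this inside the XBAP (the parameter names are mine; in partial trust you may have to parse the query string by hand instead of referencing System.Web):

using System;
using System.Web;                      // HttpUtility; needs a reference to System.Web
using System.Windows.Interop;

static class XbapAuth
{
    // Reads ?user=...&hash=... from the URL the XBAP was launched with.
    public static void ReadCredentials(out string user, out string hash)
    {
        Uri source = BrowserInteropHelper.Source;   // null when not browser-hosted
        if (source == null) { user = hash = null; return; }
        var query = HttpUtility.ParseQueryString(source.Query);
        user = query["user"];
        hash = query["hash"];
    }
}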
However, that means exposing the data in the query string. Since the password is hashed, it's not like you can actually see it, but it feels like bad practice to me. I can't find any real information about whether it's possible to pass data in via POST or a cookie. From what I've gathered from the internet, cookies won't work for XBAP applications, but I might be wrong.
Does anyone know if and how it's possible to transfer this kind of data in a more secure way? It would also be nice to get confirmation that cookies indeed won't work in this scenario, or to learn how I would need to go about implementing them.
From what I could gather from various sources on the internet, GET really is the only way to go in this scenario.
POST doesn't seem to work at all. Also, XBAPs cannot access any session cookies, so that option is not feasible either.
(I would link to the sources, but it was more about collecting bits and pieces from everywhere and putting it together.)
We settled on passing the parameters via GET, but encrypting the whole query string. This is not an ideal solution, but it will have to do until we have the resources to implement a more complex and prettier solution that enables sharing authentication details between two completely separate applications, where one is a Java application and the other an XBAP.
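The encrypt-the-query-string approach can be as simple as AES over the whole user=...&hash=... payload, with the key shared between the launching site and the XBAP. Everything below, including the fixed key and IV, is illustrative only; a fixed IV in particular is a shortcut you would not keep in production, and the Base64 result must still be Uri.EscapeDataString-ed before going into a URL.

using System;
using System.Security.Cryptography;
using System.Text;

static class QueryCrypto
{
    // Placeholder 256-bit key and 128-bit IV, Base64-encoded.
    static readonly byte[] Key = Convert.FromBase64String("u0OpvSFtutPK0BW1GIcQOZ0pCTKpS7BSdnlFeVXSYHs=");
    static readonly byte[] IV  = Convert.FromBase64String("8zBwpmSNQ0ZyT9hB1lDXSg==");

    // "user=alice&hash=..." -> opaque Base64 blob for the URL.
    public static string Encrypt(string query)
    {
        using (var aes = Aes.Create())
        using (var enc = aes.CreateEncryptor(Key, IV))
        {
            byte[] plain = Encoding.UTF8.GetBytes(query);
            return Convert.ToBase64String(enc.TransformFinalBlock(plain, 0, plain.Length));
        }
    }

    public static string Decrypt(string blob)
    {
        using (var aes = Aes.Create())
        using (var dec = aes.CreateDecryptor(Key, IV))
        {
            byte[] cipher = Convert.FromBase64String(blob);
            return Encoding.UTF8.GetString(dec.TransformFinalBlock(cipher, 0, cipher.Length));
        }
    }
}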
I have a business requirement that forces me to store a customer's full credit card details (number, name, expiry date, CVV2) for a short period of time.
Rationale: If a customer calls to order a product and their credit card is declined on the spot you are likely to lose the sale. If you take their details, thank them for the transaction and then find that the card is declined, you can phone them back and they are more likely to find another way of paying for the product. If the credit card is accepted you clear the details from the order.
I cannot change this. The existing system stores the credit card details in clear text, and in the new system I am building to replace this I am clearly not going to replicate this!
My question, then, is how I can securely store a credit card for a short period of time. I obviously want some kind of encryption, but what's the best way to do this?
Environment: C#, WinForms, SQL-Server.
Basically, avoid by all means taking on the responsibility of saving the CC details on your side. However, I assume you are using a third-party service to do your transactions, such as PayPal/VeriSign or whatever; most of them have APIs that enable you to save CC credentials on their side, and they give you back a key that you can then use later to complete or initiate transactions. They take care of the hard part, while all you have to do is store this string key in your DB.
I don't believe it's actually illegal to store CVV info (in the sense that it's against any law), but it does violate Payment Card Industry rules, and they could impose any number of different sanctions. So, your requirements could actually result in you not being able to accept credit cards ;-(
Andrew, you need to understand the PCI-DSS, no small task. Personally, I find it extremely vague but here is what I understand.
First off, from the scenario you describe, I would attempt to authorize the card for the full amount and then, if that failed, store the customer's information (but not the cardholder data) so someone could contact the user. Where I used to work, some of our customers would charge only $1.00 and then void the transaction immediately, just to make sure the card was valid. They would then process all orders manually.
Where you will need to store the number is on a successful authorization. The only things you need to store then are the credit card number and the transaction code (at least with every gateway I have ever worked with).
The standard, last time I looked at it, is not specific about encryption algorithms; instead it makes clear that the encryption should be currently unbreakable.
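In C#, "currently unbreakable" in practice means something like AES-256 with a fresh IV per record. A sketch (the key here is a placeholder; in production it must live somewhere other than the database server, e.g. a config store, DPAPI, or an HSM):

using System;
using System.Security.Cryptography;
using System.Text;

static class CardCipher
{
    // Placeholder 256-bit key, Base64-encoded.
    static readonly byte[] Key = Convert.FromBase64String("u0OpvSFtutPK0BW1GIcQOZ0pCTKpS7BSdnlFeVXSYHs=");

    // Returns IV + ciphertext; a fresh IV per card means two identical
    // numbers never produce identical ciphertexts.
    public static byte[] Encrypt(string cardNumber)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = Key;
            aes.GenerateIV();
            using (var enc = aes.CreateEncryptor())
            {
                byte[] plain  = Encoding.UTF8.GetBytes(cardNumber);
                byte[] cipher = enc.TransformFinalBlock(plain, 0, plain.Length);
                byte[] result = new byte[aes.IV.Length + cipher.Length];
                Buffer.BlockCopy(aes.IV, 0, result, 0, aes.IV.Length);
                Buffer.BlockCopy(cipher, 0, result, aes.IV.Length, cipher.Length);
                return result;
            }
        }
    }
}

Decryption splits the first 16 bytes back off as the IV and reverses the process.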
Now, one thing you cannot do is store the CVV subsequent to authorization. My understanding is that you can store it prior to authorization, but I could never get anyone to put that in writing. Basically, once you authorize the card, you'd better wipe it.
And it is not illegal at this point, but if you get nailed they will bring the hammer down on you. They have it within their authority to levy heavy fines against you, but it seems like what they usually do is put you in remediation. If you don't comply, I don't know what happens, because everyone I have heard of this happening to complied. But then they really go up your booty with a microscope.
Ultimately, I believe the only real stick they have is to prevent you from accepting credit cards. Most merchants I have worked with were scared to death of exactly that.
If you just want to store the string for a short period of time in memory, you can take a look at System.Security.SecureString.
Taken from this answer:
SecureString values are stored encrypted (obfuscated, rather), but most importantly, they are never swapped to disk and can be disposed of immediately when you're done with them.
They're tricky to use because you can only build them one character at a time (to encourage you to build them by capturing keystrokes as the user types their password), and require three lines of code to recover and then wipe their plain text, but when used properly they can make a program more secure by avoiding the virtual-memory vulnerability.
At the end of the example the SecureString is converted into a regular managed string, which makes it vulnerable again (be sure to use the try-catch-finally pattern to Zero the string after you're done with it). SecureString's use is in reducing the surface-area of attack by limiting the number of copies the Garbage Collector will make of the value, and reducing the likelihood of being written to the swap file.
// Make a SecureString
// (requires: using System; using System.Security; using System.Runtime.InteropServices;)
SecureString sPassphrase = new SecureString();
Console.WriteLine("Please enter your passphrase");
ConsoleKeyInfo input = Console.ReadKey(true);
while (input.Key != ConsoleKey.Enter)
{
    sPassphrase.AppendChar(input.KeyChar);
    Console.Write('*');
    input = Console.ReadKey(true);
}
sPassphrase.MakeReadOnly();
// Recover plaintext from a SecureString
// Marshal is in the System.Runtime.InteropServices namespace
IntPtr ptrPassphrase = IntPtr.Zero; // declared outside try so finally can reach it
try
{
    ptrPassphrase = Marshal.SecureStringToBSTR(sPassphrase);
    string uPassphrase = Marshal.PtrToStringUni(ptrPassphrase);
    // ... use the string ...
}
catch
{
    // error handling
}
finally
{
    Marshal.ZeroFreeBSTR(ptrPassphrase); // wipe the unmanaged copy
}
If you are going to store credit card information you really need to be PCI compliant or you're just asking for trouble.
Having said that look at the cell level encryption available in SQL Server 2005 and above. Coincidentally :) I have recently given a presentation with T-SQL samples on encryption with SQL Server 2005/2008 available here: http://moss.bennettadelson.com/Lists/Events/Attachments/9/June2008.zip (Link location updated December 23, 2008)
Agreed that you should avoid storing the data if you can. But maybe you are that third party? If so, get familiar with PCI standards. Look around a bit on the site and you'll find the security measures you are required to implement.
It costs somewhere in the neighborhood of $30,000 to become properly compliant and to be able to do that kind of thing. You are better off using a third-party payment service. Personally, I recommend Element Express, which has a "hosted" solution that sidesteps PA-DSS certification. I've had to convert to this for my own applications, even a point-of-sale machine! It's a big pain, but we're a small company.
http://www.elementps.com/software-providers/our-security-edge/hosted-payments/PA-DSS-Certification-vs-Elements-Hosted-Payments/
The above link has some good information about the costs associated with becoming compliant. We have had customers ask us to store credit card numbers, and we won't do it because we could be fined as well. Not good. Don't open yourself up to liability.
Edit:
Additionally, if you DO decide to store the credit card information, you definitely need to consider the form of encryption you are going to use: symmetric or asymmetric?
If you use symmetric encryption (a single passkey), you open yourself up to some serious security vulnerabilities if the server (site) that holds the key (needed to encrypt) is compromised in any way. Remember, even compiled code won't hide a text key.
If you use asymmetric encryption (public/private key pairs), you run into some additional issues, but if the primary public-facing server is compromised, attackers will only have the public key; even if they also access your database, they won't be able to decrypt the contents.
The question then is: where do you store the private key? Do you have someone paste it in from their local computer when running admin functions, or have a separate desktop application to view orders, and so on?
There are a lot of things to take into consideration.
Final note: Use a payment gateway (Element Express, Authorize.NET, Paypal, etc.) and don't store any credit card info locally. :P
Here is a link about using X509 Asymmetric Encryption in C#: http://www.csharpbydesign.com/2008/04/asymmetric-key-encryption-with.html
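The shape of that asymmetric setup, sketched in C# (key distribution is up to you; the XML-string form is just the classic RSACryptoServiceProvider way of moving key material around):

using System;
using System.Security.Cryptography;
using System.Text;

static class AsymmetricCardCipher
{
    // The web tier holds only the public key; the private key lives on the
    // isolated back-office machine that handles declined orders.
    public static byte[] Encrypt(string cardNumber, string publicKeyXml)
    {
        using (var rsa = new RSACryptoServiceProvider(2048))
        {
            rsa.FromXmlString(publicKeyXml);   // public parameters only
            return rsa.Encrypt(Encoding.UTF8.GetBytes(cardNumber), true); // OAEP padding
        }
    }

    public static string Decrypt(byte[] cipher, string privateKeyXml)
    {
        using (var rsa = new RSACryptoServiceProvider(2048))
        {
            rsa.FromXmlString(privateKeyXml);  // full key pair
            return Encoding.UTF8.GetString(rsa.Decrypt(cipher, true));
        }
    }
}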
Let's look at the requirement a little differently. Currently it looks like this:
As a product owner for website X, I want the system to temporarily store a customer's CC details so that I can recover a sale that was declined by the CC company.
People tend to think like that and request features in that manner. Now, I think your requirement is more conveniently described as follows:
As a user, I want website X to be able to retry payment for my purchase so I don't have the hassle of going through the checkout process again, because that is a real pain in the...
So there's no explicit requirement for storing anything (on your side), is there? It's only implied.
Payment providers can provide programmatic APIs for your merchant account and the ability to attempt a re-auth on a declined attempt. I think #bashmohandes alluded to this earlier.
Not all payment providers can do this, however; I think it depends on their relationships with the banks involved. That's the stuff you want to avoid, i.e. having to build a close relationship with banks.
Scenario 1: assuming all I said is true
You don't have to store anything but a reference to the authorization attempt. Some payment providers even give you a sweet back-office tool, so you don't have to build your own to do re-auths. I think PayGate does this.
Your best bet, I believe, is to interview a number of payment providers; they should know this stuff like the back of their hands. This is potentially a zero-code solution.
Scenario 2: assuming I'm totally wrong, but legally this CC-storing stuff is OK
So you have to store that data somewhere temporarily. I advise:
use a two-way encryption method (naturally) that is non-vendor-specific, so you can use any language/platform to encrypt/decrypt
decouple the encrypt/decrypt service from your app and treat it like a black box
use public/private keys for authentication to this service
put this machine on a private network with its own elevated firewall rules (it doesn't have to be a hardware firewall, but hardware is better)
have your app servers communicate with this machine via SSL (you could get away with a self-signed cert since it's on your private LAN)
All I've suggested in scenario 2 is hurdles; eventually, persistence wins the race to your data. The only way to absolutely secure data is to unplug your server from the ether, but that option is a little radical :-)
Scenario 1 would be nice. Wouldn't it?
Consider your transaction logs!
If you explain to your customer the full impact (and remedial requirements if they are found out of compliance) then trust me, your 'business requirements' will change very quickly.
If you must store the credit card number (and I advance the thought here that there is no reasonable scenario where you should) and you intend to use a native encryption built-in to your database, then consider this: what about your transaction logs?
If your transaction logs could reflect a credit card number in the clear, then you are out of compliance and should budget for a $10,000 to $50,000 forensic audit at your site if you get caught. Budget for your own attorney in case your customer sues you because you should have known all this stuff.
So if you are going to store a credit card number, run the cipher in code so the transaction logs (insert or update) reflect a ciphered string, not the card number in the clear.
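Concretely, that means encrypting in application code and handing the database only the ciphertext, so neither the column value nor the INSERT in the transaction log ever contains the clear number (the table and column names below are invented; the byte[] is assumed to come from a strong cipher such as the AES sketch earlier):

using System.Data.SqlClient;

static class CardStore
{
    // cipheredCard was produced in application code (e.g. AES with a
    // per-record IV); SQL Server only ever sees the opaque bytes.
    public static void SavePendingCard(SqlConnection conn, int orderId, byte[] cipheredCard)
    {
        using (var cmd = new SqlCommand(
            "INSERT INTO PendingCards (OrderId, CardCipher) VALUES (@id, @cipher)", conn))
        {
            cmd.Parameters.AddWithValue("@id", orderId);
            cmd.Parameters.AddWithValue("@cipher", cipheredCard);
            cmd.ExecuteNonQuery();
        }
    }
}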
And don't even have a field or column in your database for CVV - encrypted or not - that forensic audit will reveal this (so will the logs) and then your customer is in BIG, BIG trouble. They will pay a fine and could lose their ability to accept credit cards. Your attorney will be very happy.
I have a blog post that deals with this exact situation of storing sensitive data in the database. The blog post uses a String Encryptor class that I built using a Triple DES algorithm but you can plug in your own if you would like.
The blog post contains the video and source code that was used. You can check it out at http://www.wrightin.gs/2008/11/how-to-encryptdecrypt-sensitive-column-contents-in-nhibernateactive-record-video.html. I think it will definitely solve your issue.