I am allowing users to upload files to my server. What possible security threats do I face and how can I eliminate them?
Let's say I am allowing users to upload images to my server, either from their system or from the net. Even just to check the size of these images, I have to store them in my /tmp folder. Isn't that risky? How can I minimize the risk?
Also, let's say I am using wget to download the images from the links that users submit in my form. I first have to save those files on my server to check whether they actually are images. And what if a prankster gives me a URL and I end up downloading an entire website full of malware?
First of all, realize that uploading a file means that the user is giving you a lot of data in various formats, and that the user has full control over that data. That's already a concern for a normal form text field; file uploads have the same problem and then some. The first rule is: Don't trust any of it.
What you get from the user with a file upload is the file data itself, a file name, and a MIME type. These are the three main components of the file upload, and none of them can be trusted.
Do not trust the MIME type in $_FILES['file']['type']. It's an entirely arbitrary, user supplied value.
Don't use the file name for anything important. It's an entirely arbitrary, user supplied value. You cannot trust the file extension or the name in general. Do not save the file to the server's hard disk using something like 'dir/' . $_FILES['file']['name']. If the name is '../../../passwd', you're overwriting files in other directories. Always generate a random name yourself to save the file as. If you want, you can store the original file name in a database as metadata.
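As a rough sketch of that advice: the helper names randomStorageName() and storeUpload(), and the idea of passing in a storage directory, are my own assumptions for illustration and not part of the question. The point is only that the name on disk is generated by you, while the user's name survives purely as metadata.

```php
<?php
// Sketch only: randomStorageName() and storeUpload() are hypothetical helpers.

/** Generate an unguessable on-disk name that carries no user input. */
function randomStorageName(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

/**
 * Move one entry of $_FILES into $storageDir under a random name.
 * Returns metadata to record in a database, or null on failure.
 */
function storeUpload(array $file, string $storageDir): ?array
{
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return null;
    }

    $storedName = randomStorageName();
    $target = rtrim($storageDir, '/') . '/' . $storedName;

    // move_uploaded_file() refuses anything that did not arrive as an HTTP upload.
    if (!move_uploaded_file((string) ($file['tmp_name'] ?? ''), $target)) {
        return null;
    }

    return [
        'stored_name'   => $storedName,
        // Kept for display only; never used to build a path.
        'original_name' => (string) ($file['name'] ?? ''),
    ];
}
```

You would call it with something like storeUpload($_FILES['file'], $yourStorageDir), where the directory sits outside the webroot, and write the returned array to your database.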
Never let anybody or anything access the file arbitrarily. For example, if an attacker uploads a malicious.php file to your server and you're storing it in the webroot directory of your site, a user can simply go to example.com/uploads/malicious.php to execute that file and run arbitrary PHP code on your server.
Never store arbitrary uploaded files anywhere publicly accessible. Always store them somewhere only your application has access to.
Only allow specific processes access to the files. If it's supposed to be an image file, only allow a script that reads images and resizes them to access the file directly. If this script has problems reading the file, it's probably not an image file, so flag it and/or discard it. The same goes for other file types. If the file is supposed to be downloadable by other users, create a script that serves the file up for download and does nothing else with it.
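A download script along those lines might look like the sketch below. Everything named here is an assumption for illustration: resolveStoredFile() and sendDownload() are hypothetical, the $index array stands in for a database lookup, and the 32-character hex ID format is just one possible choice for names you generate yourself.

```php
<?php
// Sketch only: $index stands in for a database lookup; function names are hypothetical.

/** Map a public file ID to a path inside $storageDir, or null if unknown. */
function resolveStoredFile(string $id, array $index, string $storageDir): ?string
{
    // Accept only IDs in the format the application generated itself.
    if (preg_match('/^[0-9a-f]{32}$/', $id) !== 1 || !isset($index[$id])) {
        return null;
    }
    return rtrim($storageDir, '/') . '/' . $id;
}

/** Send the file as an opaque download; it is never included or executed. */
function sendDownload(string $path, string $displayName): void
{
    // Replace anything that could break out of the quoted header value.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', $displayName);

    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="' . $safeName . '"');
    header('X-Content-Type-Options: nosniff');
    header('Content-Length: ' . filesize($path));
    readfile($path);
}
```

The user only ever supplies an ID. The path is built from your own storage directory, so a request for something like '../../etc/passwd' never reaches the file system.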
If you don't know what file type you're dealing with, detect the MIME type of the file yourself and/or try to let a specific process open the file (e.g. let an image resize process try to resize the supposed image). Be careful here as well: if there's a vulnerability in that process, a maliciously crafted file may exploit it, which may lead to security breaches (the most common example of such attacks is Adobe's PDF Reader).
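For images, detecting the type yourself could be sketched roughly as follows. It assumes the fileinfo extension is available; detectImageType() and the allow-list of three MIME types are my own illustrative choices.

```php
<?php
// Sketch only: detectImageType() and the allow-list are illustrative assumptions.

/**
 * Return the detected MIME type if the file looks like an allowed image,
 * or null otherwise. Only the file's own bytes are inspected.
 */
function detectImageType(string $path): ?string
{
    $allowed = ['image/jpeg', 'image/png', 'image/gif'];

    // Ask the fileinfo extension what the content actually is.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($path);
    if ($mime === false || !in_array($mime, $allowed, true)) {
        return null;
    }

    // Second opinion: an image parser must be able to read the header too.
    // The @ suppresses the notice getimagesize() can emit on unreadable data.
    $info = @getimagesize($path);
    if ($info === false || $info['mime'] !== $mime) {
        return null;
    }

    return $mime;
}
```

Note that this only reads headers. Letting a resize process actually re-encode the image is a stronger check, with the parser-vulnerability caveat mentioned above.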
To address your specific questions:
No, storing the uploads in /tmp is not risky in itself. Just storing data in a file in a temp folder is not risky if you're not doing anything with that data. Data is just data, regardless of its contents. It only becomes risky if you try to execute the data, or if a program with parsing flaws parses it and gets tricked into doing unexpected things.
Of course, having any sort of malicious data sitting around on the disk is more risky than having no malicious data anywhere. You never know who'll come along and do something with it. So you should validate any uploaded data and discard it as soon as possible if it doesn't pass validation.
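One way to make "discard as soon as possible" hard to forget is to tie the check and the cleanup together, as in this small sketch. keepIfValid() is a hypothetical helper, and the size limit in the usage comment is an arbitrary example value.

```php
<?php
// Sketch only: keepIfValid() is a hypothetical helper.

/**
 * Run $isValid on a freshly stored file and delete the file right away
 * if the check fails. Returns true when the file was kept.
 */
function keepIfValid(string $path, callable $isValid): bool
{
    if (is_file($path) && $isValid($path) === true) {
        return true;
    }
    if (is_file($path)) {
        unlink($path); // do not leave rejected data lying around
    }
    return false;
}

// Example use (5 MB is an arbitrary limit chosen for illustration):
// keepIfValid($tmpPath, function (string $p): bool {
//     return filesize($p) <= 5 * 1024 * 1024;
// });
```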
As for wget and the malicious URL: it's up to you what exactly you download. One URL will result in at most one blob of data. If you are parsing that data and downloading the content of more URLs based on that initial blob, that's your problem. Don't do it. But even if you did, well, then you'd have a temp directory full of stuff. Again, this is not dangerous if you're not doing anything dangerous with that stuff.
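To illustrate "one URL, one blob", here is a sketch that swaps PHP's own http stream wrapper in for wget. It assumes allow_url_fopen is enabled; the function names, the timeout, and the byte cap are all illustrative assumptions. It fetches a single response body, refuses anything larger than the cap, and never looks inside the data for further links.

```php
<?php
// Sketch only: function names and limits are illustrative assumptions;
// PHP's http stream wrapper is used here in place of wget.

/** Read at most $maxBytes from a stream; return null if there is more. */
function readCapped($stream, int $maxBytes): ?string
{
    // Ask for one extra byte so an oversized body can be detected.
    $data = stream_get_contents($stream, $maxBytes + 1);
    if ($data === false || strlen($data) > $maxBytes) {
        return null;
    }
    return $data;
}

/** Accept only plain http(s) URLs. */
function isHttpUrl(string $url): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], true);
}

/** Fetch exactly one URL as one blob; links inside it are never followed. */
function fetchOneBlob(string $url, int $maxBytes): ?string
{
    if (!isHttpUrl($url)) {
        return null;
    }
    $context = stream_context_create(['http' => [
        'timeout'         => 10, // seconds
        'follow_location' => 0,  // do not chase redirects
    ]]);
    $stream = @fopen($url, 'rb', false, $context);
    if ($stream === false) {
        return null;
    }
    try {
        return readCapped($stream, $maxBytes);
    } finally {
        fclose($stream);
    }
}
```

Whatever comes back is still untrusted data, so it goes through the same validation as a direct upload before you keep it.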