PHP form with bot deterrent

Question

I saw somewhere on here that it helps reduce spam by adding a dummy input field that you hide with display: none, and like if it's filled out, then it's obviously a bot sending the message. Well, I did something kind of like that but initially set it to = 0. So if it's not 0 then it doesn't send the form, since a bot would have changed the value to something else. I don't know if that will really work effectively or not. If you have a better way to do something like that, please share.
I don't know if my token validation stuff is anywhere close to how it should be. Please let me know if what I'm doing with that is even adding any security or not.
After the user submits the form, I put some echo's in so that what they entered is still shown in the form fields, in case there was an error so they don't lose what they typed. Is my use of htmlspecialcharacters needed there?

Please share any other comments, recommendations, improvements, etc, etc that you see with my code, because I really want to get better at this and your answers almost always help me understand things better. You know, the manual can be kind of confusing at times when your new so having a master coder review your code and explain things helps so much for me.

<?php

session_start();

function getIp() {
    if (!empty($_SERVER['HTTP_CLIENT_IP'])) {
        $ip = $_SERVER['HTTP_CLIENT_IP'];
    }
    elseif (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        $ip = $_SERVER['HTTP_X_FORWARDED_FOR'];
    }
    else {
         $ip = $_SERVER['REMOTE_ADDR'];
    }
    return $ip;
}

if($_POST['yourlastname'] === '0') {
    if ($_SESSION['token1'] == $_POST['token']) {
    echo "<p>checker: " . $_SESSION['token1'] . "  and the post token " . $_POST['token'] ."</p>";
        if(filter_var($_POST['youremail'], FILTER_VALIDATE_EMAIL)) {
            $name = htmlspecialchars($_POST['yourname']);
            $email = htmlspecialchars($_POST['youremail']);
            $from = $name . ' - ' . $email;
            $ip = getIp();
            $message = htmlspecialchars($_POST['yourmessage']) . "\r\n" .
                          'Name : ' . $name . "\r\n" .
                          'Email : ' . $email . "\r\n" .
                          'User IP: ' . htmlspecialchars($ip);
            $message = wordwrap($message, 60, "\r\n");
            $headers = 'From: ' . $from . "\r\n" .
                          'Reply-To: ' . $email;
            mail('[email protected]', 'Static Subject', $message, $headers);
        }
    }
}

$token1 = md5(microtime(true));
$_SESSION['token1'] = $token1;

?>

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Tittle</title>
</head>

<body>
<?php $token2 = md5(microtime(true)); ?>
<form method="post" name="form" action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>">

<input name="token" type="hidden" value="<?php echo htmlspecialchars($token2); ?>">

<label for="yourname">Your Name:</label>
<input class="required" id="yourname" name="yourname" type="text" value="<?php if(isset($_POST['yourname'])) { echo htmlspecialchars($_POST['yourname']); } ?>">

<label for="yourlastname" style="display: none;">Your Last Name:</label>
<input class="required" id="yourlastname" name="yourlastname" type="text" style="display: none;" value="0">

<label for="youremail">Your Email:</label>
<input class="required" id="youremail" name="youremail" type="text" value="<?php if(isset($_POST['youremail'])) { echo htmlspecialchars($_POST['youremail']); } ?>">

<label for="yourmessage">Your Message:</label>
<textarea class="required" id="yourmessage" name="yourmessage" cols="30" rows="8"><?php if(isset($_POST['yourmessage'])) { echo htmlspecialchars($_POST['yourmessage']); } ?></textarea>

<input type="submit" id="send" value="Send">

</form>
</body>
</html>

1. I don't think this will be particularly effective security. I wouldn't be surprised if automated tools simply recognize hidden fields and not touch them. You might want to look into hooking up to the Akismet API, which is the system used in Wordpress to filter comment spam. — Martijn, Commented Feb 7, 2014 at 14:14

Community · Accepted Answer · 2017-04-13 12:40:57Z

Right, I'll be adding to this answer later on, but for a kick-off, here's a few quick remarks/considerations:

On the hidden input fields
As Martijn said in his comment: most bots will probably parse the DOM, and filter out those fields that are marked as required (looking for labels with a * text node in them), and they'll simply ignore hidden fields. You've created regular input fields that are not displayed. That might work on some bots, but like I said, they'll probably parse the DOM and see that they have a style attribute that sets the display to none.
It's not hard to deal with that, and if I were to write a bot, it wouldn't take me 5 minutes to add the code to deal with those input fields, too.

You can keep those hidden input fields as a means of security (they don't do any harm, after all), but don't set their style attributes in the markup, use CSS classes, or better still: use JavaScript to set class/attributes on the load event. That way, the DOM doesn't reflect what the page will actually look like as much as it does now.
This may help a little, but all in all, this will only protect you from script-kiddies and amateurish attacks.

The token stuff
That's fine. Session tokens that should be posted back are basic security steps that everybody should take. How you get those tokens is up to you, but an md5 of the timestamp? A hash doesn't make your token more secure. Don't go thinking that md5 does anything else than add overhead, however minor it may be.
Of course I get that a hash string just makes the token look more important, and in a way less random than it actually is, but why not hash some of the clients data in that case, like the remote address or referrer or whatever... though far from reliable, it can give you something extra to check, and log, and compare to the access logs of your server. It may also give you a clue as to what tools are being used when your page is targeted.

However, you should add checks along the lines of if (isset($_POST['yourlastname'])) and isset($_SESSION['token1']), to ensure that you can actually validate the form submission. But that's as an asside.

htmlspecialchars
Is it needed? NO. In fact, they're capable of doing more harm than good in some cases. Particularly when they're applied to the email address, as you seem to be doing. You're using filter_var($email, FILTER_VALIDATE_EMAIL), which is great, because that filter can handle the complex and wacky email addresses can contain htmlspecial chars, including: <. Here's what I'd do:

$email = filter_var($_POST['email'], FILTER_SANITIZE_EMAIL);//sanitize
if (filter_var($email, FILTER_VALIDATE_EMAIL))
    //continue

As far as the sanitation of the other data that'll make-up your email message is concerned: set the MIME type to text,because it would appear that that's all you need, and take measures against email injection.
A subject I've already discussed in my answer here

Code issues
Just a quick update of my answer. On second glance, I may have spotted an issue with your code that could sometimes result in your script treating a valid form submission as an invalid one:

$token1 = md5(microtime(true));
$_SESSION['token1'] = $token1;//set token, then:
?>
<!DOCTYPE html>
<html>
<!-- setting HTML... -->
<?php $token2 = md5(microtime(true)); ?>
<form method="post" name="form" action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>">

<input name="token" type="hidden" value="<?php echo htmlspecialchars($token2); ?>">

The token value set in your form is a different variable than the one you've assigned to $_SESSION['token']! Why? The chances that the values of $token1 and $token2 are different are really quite high.
Why aren't you simply using $token1 in the form, too?

Next: htmlspecialchars? What's the point? $token1 and $token2 both are hash strings, which are all [a-z0-9] chars, to put it in regex terms. They don't need escaping, so this function call (which adds overhead) can, and should be dropped.

Also note that you don't have to specify the action attribute for the form, if the target is $_SERVER['PHP_SELF']. Leaving it blank triggers a form submissions default behaviour, which is to submit to self

<form method="post" name="form"> <!-- will do the trick -->

As an aside, also check my addition to the recommendations bit below on how you can make the mix-ins of your php in the markup, IMO at least, a bit more easy to read.

Recommendations
Given the fact you are looking into ways to ensure bots/spammers can't abuse your form, why not use a simple Captcha? It's proven technology, and rather effective anyway...

There are various scripts readily available, phpcaptcha is a free script you can download. It's easy, fairly well documented, actively maintained, there's a lot of plugins readily available for all major CMS/blog type of things and it's already been extensively tested.
I've just looked into the source of phpcaptcha now. I must say, there's a couple of things I don't much care for in there, but on the whole, it's not bad. I've even gone through the trouble of forking the repo, and I've started writing a little extra feature (which I'll call "mangler"). It'll transform words like transform into tr@nsf0rm, in such a way that both transform and tr@nsf0rm are accepted.
Though still work in progress, feel free to contribute ;-P

When echo-ing PHP variables inside HTML, you may find your code easier to read when using the short echo tag. This is not the same as the short open tag (<?) in that it does not require specific ini settings. In other words, it's always going to be available.
If nothing else, it at least requires less typing/code-chars

<input name="token" type="hidden" value="<?php echo htmlspecialchars($token2); ?>">
<!-- can be written as (I'm applying my critiques, too) -->
<input name="token" type="hidden" value="<?= $token1; ?>">

Not even the ; at the end of the line is required, but I prefer to have it there... it's a good habit.

Can anyone explain how does writing the token in a hidden form field and in $_SESSION, and later comparing the two and rejecting mis-matches protect against bots (the stated intent)? Wouldn't any bot who makes this request and then submit the form pass this test? — BeetleJuice, Commented Jun 17, 2016 at 0:55
@Patrick: a great deal of spam-bots won't actually submit the form itself, they'll simply parse the DOM, build the request and send it to the form action. These bots often ignore non-required fields (like hidden fields), and simply do not include hidden field data in the requests they'll send. Others might use proxies so as to not seem too suspicious, and they'll send the same requests over various proxies (and therefore various sessions). If they submit a single token value for all requests, irrespective of the proxy/session they use, a hidden field will stop most bogus requests — Elias Van Ootegem, Commented Jun 20, 2016 at 14:05
Of course, this catches out only some of the bots, but even so, you can change the token for each request, which means a bot that repeatedly sends the same request will only be able to send its spam once. Then you have people adding another layer of bot defences to this simple token approach by adding dummy token fields (containing gibberish or partial token values), and use JS to set the actual values after the DOM is loaded. If a bot merely parses the DOM, it will almost certainly generate invalid requests, or incomplete tokens. TL;DR: a token alone is not enough, but it does help — Elias Van Ootegem, Commented Jun 20, 2016 at 14:11
@Patrick: A more sophisticated bot would actually use a webdriver, and populate the fields and submit the form in the exact same way a human user would. Those kinds of bots would not be detected through hidden fields and tokens like this. You can defend against those requests in many different ways (IP tracking, either in your application, or at server level), timing between page request and form submission < 10 sec -> captcha (something similar to what SO does) or just add a captcha to all of your forms to begin with... — Elias Van Ootegem, Commented Jun 20, 2016 at 14:17

Stack Exchange Network

PHP form with bot deterrent

1 Answer 1

Not the answer you're looking for? Browse other questions tagged
php
beginner
security
form
session
or ask your own question.

Linked

Hot Network Questions

PHP form with bot deterrent

1 Answer 1

Not the answer you're looking for? Browse other questions tagged phpbeginnersecurityformsession or ask your own question.

Linked

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
php
beginner
security
form
session
or ask your own question.