Using Markdown in PHP Safely

You may want to let users do more than enter plain text when filling out a task/bug description or editing their profile description: create hyperlinks, create lists, embed images, and format their text. However, allowing users to enter raw HTML/JavaScript into your site and then displaying or executing that content is unsafe. The best middle ground is to let the user enter Markdown and parse it into the relevant HTML, rather than accepting HTML directly and trying to filter it safely yourself. An added bonus is that Markdown is faster for the user to write, and probably easier to learn, than HTML.

Using the michelf/php-markdown package, you could do something like:

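// Assumes the package was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

$input = $_POST['user_input'] ?? '';

// Escape any raw HTML before it reaches the Markdown parser.
$sanitizedInput = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Display the content to the user.
$parser = new \Michelf\MarkdownExtra();
$displayContent = $parser->transform($sanitizedInput);
echo $displayContent;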

Using htmlspecialchars prevents the user from injecting raw HTML or script tags that would be executed later. Instead, the markup is converted into a form that is displayed as text rather than executed.
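
To make the effect concrete, here is a minimal sketch of what the escaping step does to a script tag. The sample payload and variable names are made up for illustration; only htmlspecialchars itself comes from the example above.

```php
<?php
// Hypothetical malicious submission, standing in for $_POST['user_input'].
$input = '<script>alert("xss")</script>';

// Escape the HTML special characters, including both kinds of quotes.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

A browser renders the escaped string as visible text and never parses it as a tag.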

The only problem with this is that the parser expects Markdown code blocks to contain raw code, so it runs htmlspecialchars on their contents itself in order to make the code displayable. As a result, the contents get escaped twice and the code isn't displayed correctly. The solution is to write your own code block conversion function that simply returns its input, and pass it to the parser as shown below:

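// Assumes the package was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

$input = $_POST['user_input'] ?? '';

// Escape any raw HTML before it reaches the Markdown parser.
$sanitizedInput = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Display the content to the user.
$parser = new \Michelf\MarkdownExtra();

// The input is already escaped, so return code block content unchanged.
$parser->code_block_content_func = function ($code) { return $code; };
$displayContent = $parser->transform($sanitizedInput);
echo $displayContent;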
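
The double escaping is easy to reproduce without the library. The sketch below simply calls htmlspecialchars twice on a made-up code sample, on the assumption that this mirrors what happens when pre-escaped input reaches the parser's default code block handling.

```php
<?php
// Hypothetical code sample that a user might put inside a Markdown code block.
$code = '<b>bold</b>';

// First pass: our own sanitization before the parser sees the text.
$once = htmlspecialchars($code, ENT_QUOTES, 'UTF-8');

// Second pass: stands in for the escaping the parser applies to code blocks.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

echo $once, PHP_EOL;  // &lt;b&gt;bold&lt;/b&gt;
echo $twice, PHP_EOL; // &amp;lt;b&amp;gt;bold&amp;lt;/b&amp;gt;
```

A browser shows the second string as the literal text `&lt;b&gt;bold&lt;/b&gt;` instead of `<b>bold</b>`, which is the symptom described above. Returning the input untouched from the code block function skips that second pass.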

You may want to disable the ability for users to embed images through Markdown. A malicious user could exploit it by supplying a non-image URL, which causes every subsequent viewer's browser to send a GET request to that URL when it loads the page with that "image". Strictly speaking, this is closer to a cross-site request forgery (CSRF) attack than to XSS. My view is: just make sure your own site(s) are not vulnerable to this kind of forged GET request before enabling this functionality, and let the rest of the internet take care of itself.
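
If you do decide to disable images, one possible approach is to rewrite image syntax into ordinary links before the text reaches the parser. The helper below is a hypothetical sketch, not part of the michelf/php-markdown package, and it only covers inline images of the form `![alt](url)`; reference-style images would need separate handling.

```php
<?php
/**
 * Hypothetical helper: turn inline Markdown images into plain links,
 * so the URL is only requested if a viewer chooses to click it.
 */
function convertInlineImagesToLinks(string $markdown): string
{
    // Match ![alt](target) and drop the leading "!" to leave [alt](target).
    // Fall back to the original text if the regex engine reports an error.
    return preg_replace('/!\[([^\]]*)\]\(([^)]*)\)/', '[$1]($2)', $markdown) ?? $markdown;
}

echo convertInlineImagesToLinks('Look: ![cat](http://example.com/cat.png)'), PHP_EOL;
// prints: Look: [cat](http://example.com/cat.png)
```

Calling this on the user's input before `transform()` means viewers' browsers no longer fetch the URL automatically when the page loads.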

Author

Programster

Stuart is a software developer with a passion for Linux and open source projects.
