Good Apples, 5/100 Days of Code
A rotten apple spoils the bunch. Python has a library called Bleach that throws out the rotten parts of the data coming out of a database. What you’re left with is the crisp, delicious taste of safe data.
Script injection attacks are similar to SQL injections in that both slip malicious code in through user input. In this attack, malicious JavaScript is inserted into a database. The web server uses the content from the database to build the webpage displayed in a browser, and the browser runs the malicious JavaScript code. Though the code does not damage the database, it hijacks a user’s web browser to perform unauthorized actions. This is where Bleach comes in.
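To make the attack concrete, here is a minimal sketch of a page built from stored posts. The `render_page` function and the sample post are made up for illustration and are not taken from the course.

```python
# Hypothetical stored post: in a real attack this string would come from the database.
stored_post = "<script>alert('spam')</script>"


def render_page(posts):
    """Build a page by dropping each post's content straight into the HTML."""
    items = "".join(f"<li>{content}</li>" for content in posts)
    return f"<ul>{items}</ul>"


page = render_page([stored_post])
print(page)  # the <script> tag reaches the browser intact, so the browser would run it
```

Nothing here touches the database. The damage happens when the browser receives the page.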
Bleach is a Python library that sanitizes HTML, so malicious markup from attackers never reaches the browser. Similar to SQL injections, script injections can be safeguarded against by sanitizing data. Continuing where I left off in the Udacity Relational Databases Course, I sanitized the output from my database, not the input.
Output sanitization is the process of cleaning data as it comes out of the database, before it is used to display information to users. This method works on the assumption that the contents of a database might be unsafe.
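Here is a rough sketch of what output sanitization can look like in Python. The `sanitize_for_output` helper and the sample post are my own names, not from the course. It calls `bleach.clean` when Bleach is installed, and falls back to the standard library's `html.escape` as a stand-in when it is not.

```python
import html

try:
    import bleach  # third-party: pip install bleach
except ImportError:
    bleach = None  # fall back to the standard library below


def sanitize_for_output(content):
    """Clean one piece of stored content just before it goes into a page."""
    if bleach is not None:
        # By default, bleach.clean escapes any tag that is not on its allowlist.
        return bleach.clean(content)
    # Stand-in when Bleach is not installed: escape every HTML special character.
    return html.escape(content)


# Hypothetical row as it might come back from the posts table.
stored_post = "<script>alert('spam')</script>"
safe_post = sanitize_for_output(stored_post)
print(safe_post)
```

The stored row is left as it is. Only the copy that goes into the page is cleaned, so the browser shows the script as text instead of running it.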
To understand how Bleach works, the course offered two exercises for updating and cleansing the database. When I first launched the test script injection attack, at least 100 spam posts were added to the database. They literally said the word “spam”. To remedy this, I wrote a statement that would update the posts table using the UPDATE command in PostgreSQL.
The statement replaced the content of every row containing the word “spam” in the content column with the word “cheese”. Not only did it remove the large text, font, and neon colors from the messages, it also successfully replaced each spam post. After this, I created another statement using the DELETE command. It deleted every row containing the word “cheese” in the content column. Bleach itself works differently from these cleanup statements. Rather than checking the contents of a database against a dictionary of well-known malicious code, it takes a piece of text and an allowlist of safe HTML tags and attributes, then escapes or strips everything that is not on the list.
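The two cleanup statements might look something like the sketch below. I am using an in-memory SQLite database as a stand-in for the course's PostgreSQL setup so the example runs on its own. The posts table and content column come from the exercise, but the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for the course's PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (content TEXT)")
conn.executemany(
    "INSERT INTO posts (content) VALUES (?)",
    [("hello world",), ("<h1>spam spam spam</h1>",), ("buy spam now",)],
)

# Step 1: overwrite every post that contains the word "spam".
conn.execute("UPDATE posts SET content = 'cheese' WHERE content LIKE '%spam%'")

# Step 2: delete the rows that were just marked.
conn.execute("DELETE FROM posts WHERE content LIKE '%cheese%'")
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT content FROM posts")]
print(remaining)  # ['hello world']
```

One caveat with this approach: a legitimate post that happens to mention spam or cheese would be deleted too.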
Based on what I’ve seen so far, this is the best method of safeguarding against script injection attacks. Cleaning the database is faster and more efficient than analyzing every user input for malicious content before inserting. It’s better to store input and automate tasks that scan for malicious content on the back end. Any red flags can be identified and removed without compromising user safety.