Sunday 4 September 2016

RET_CHK: Protect against ROP exploits

I was exploiting a CTF challenge that required a ROP chain recently, and I had an idea that would make ROP exploitation impossible (or way more complicated at the very least).

Background:

There are a number of protections in place against buffer overflow exploits: NX stack, ASLR and stack cookies are the main ones. (If you're not sure what they are, let me introduce you to my friend google.com)

With all these turned on, buffer related exploits require one or two info leaks to get the stack cookie and some offsets to bypass ASLR. From there, use ROP gadgets (borrowed chunks of code that end in a ret, e.g. pop ebx; pop eax; ret) to pop a shell.

Theoretical New Idea:

RET_CHK (PASTEBIN_SECURE_RET was a bit of a mouthful)

To prevent ROP related exploits, extend the 'ret' instruction with a check that the return address points just after a call instruction. To stay compatible with all the weird hacks that people want to do, add an instruction, insecure-ret, that acts as a normal ret.
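
As a rough sketch of what the extended ret would check, here is a toy Python model. Everything in it is an assumption for illustration: memory is a bytes object, only the 5 byte direct near call (opcode 0xE8 plus a 4 byte displacement) is recognised, and the names ret_chk, insecure_ret and RetChkFault are made up. Real hardware would also have to recognise indirect calls, which have several encodings.

```python
CALL_REL32 = 0xE8      # x86 opcode for a direct near call with a 32-bit displacement
CALL_LEN = 5           # 1 opcode byte + 4 displacement bytes


class RetChkFault(Exception):
    """Hypothetical fault raised when a return address does not follow a call."""


def ret_chk(code: bytes, stack: list) -> int:
    """Pop a return address and refuse it unless a call sits right before it."""
    target = stack.pop()
    call_site = target - CALL_LEN
    if call_site < 0 or target > len(code) or code[call_site] != CALL_REL32:
        raise RetChkFault(f"return to {target:#x} is not preceded by a call")
    return target


def insecure_ret(code: bytes, stack: list) -> int:
    """Plain ret: pop the return address and go there, no questions asked."""
    return stack.pop()


# nop ; call +0 ; nop ; ret
code = bytes([0x90, 0xE8, 0x00, 0x00, 0x00, 0x00, 0x90, 0xC3])
print(ret_chk(code, [6]))        # 6, because the call at offset 1 ends at offset 6
print(insecure_ret(code, [7]))   # 7, nothing is checked
```

Returning to offset 7 through ret_chk raises RetChkFault in this model, because the byte 5 positions earlier is not a call opcode.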

My guess is that this will only add 1 or 2 clock cycles to every function call.

Clock cycles are really really fast, but programs do make a lot of function calls. The check can be left out of some functions that get called very frequently. That makes exploiting easier than full coverage would, but still much harder than not having the check at all.


How to sploit?

Of course, as with all new security features, there are ways to get around this.
Typically, once you get to the point where you control eip, you have boatloads of rop gadgets to choose from and if you've got libc you can probably do anything you want with only a little effort.
With RetCheck™ enabled, the spots you can jump to go from all of the code segment and libraries to the 0.01% of assembly instructions that directly follow a call instruction. (Actual percentage of call instructions may vary.)

Firstly, due to the already overwhelming pressure to make things fast, any implementation will probably only do something like "die if [ret_addr-5] != call_opcode" (a direct near call on x86 is the 0xE8 opcode plus a 4 byte offset, so 5 bytes in total). This extends our reach to the address 5 bytes past any byte that merely looks like a call opcode, even one in the middle of another instruction, as long as landing there doesn't break everything.
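
To get a feel for how much a one byte compare gives away, here is a small Python sketch that lists every address such a check would accept. The assumptions: only the 0xE8 direct call encoding is considered (opcode plus 4 byte displacement, so 5 bytes), the function names are invented, and the set of genuine call offsets is written by hand where a real tool would get it from a disassembler.

```python
CALL_REL32 = 0xE8
CALL_LEN = 5


def cheap_check_targets(code: bytes) -> set:
    """Addresses accepted by a check that only compares one opcode byte."""
    return {i + CALL_LEN for i, b in enumerate(code)
            if b == CALL_REL32 and i + CALL_LEN <= len(code)}


def bonus_targets(code: bytes, real_call_offsets: set) -> set:
    """Accepted addresses that do not follow a genuine call instruction."""
    genuine = {off + CALL_LEN for off in real_call_offsets}
    return cheap_check_targets(code) - genuine


# mov eax, 0x000000e8 ; call +0 ; nop ; ret
code = bytes([0xB8, 0xE8, 0x00, 0x00, 0x00,
              0xE8, 0x00, 0x00, 0x00, 0x00,
              0x90, 0xC3])
print(sorted(cheap_check_targets(code)))   # [6, 10]
print(sorted(bonus_targets(code, {5})))    # [6]
```

Offset 10 is the honest return address after the call at offset 5. Offset 6 is only accepted because the immediate of the mov happens to contain an 0xE8 byte, and it lands in the middle of the real call's displacement.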

Then comes the fun bit. We take our limited set of gadgets and profile them to find their net effect on system state. Pretty much glorified assembly fuzzing. Run them a bunch with different initial stacks and registers, then find the patterns. Look for those pesky "won't make it crash" requirements, the registers and memory regions it does and doesn't touch, as well as the effect it has on different values. This will be handled by my super smart deep learning neural network magic engine that I haven't written yet.
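
Minus the deep learning magic, the "run them a bunch and find the patterns" step could start out as something like this Python sketch. It is a toy under loud assumptions: gadgets are modelled as Python functions that mutate a dict of three registers, a Python exception stands in for a crash, and both the profile function and the example gadget are invented here. A real version would run the actual bytes in an emulator.

```python
import random

REGS = ("eax", "ebx", "ecx")


def profile(gadget, runs=200, seed=0):
    """Run a gadget on random states; report crash rate and touched registers."""
    rng = random.Random(seed)
    crashes, touched = 0, set()
    for _ in range(runs):
        before = {r: rng.randrange(0, 16) for r in REGS}
        after = dict(before)
        try:
            gadget(after)
        except Exception:          # stand-in for "the process crashed"
            crashes += 1
            continue
        touched |= {r for r in REGS if after[r] != before[r]}
    return {"crash_rate": crashes / runs, "touched": touched}


# Made-up gadget: divide eax by ebx, then copy the result into ecx.
def divide_gadget(state):
    state["eax"] = state["eax"] // state["ebx"]   # blows up when ebx == 0
    state["ecx"] = state["eax"]


report = profile(divide_gadget)
print(report["touched"], report["crash_rate"])
```

The report gives the "doesn't touch ebx" fact and a hint that some inputs crash. Working out that the crash condition is exactly ebx == 0 is the pattern-finding part that still needs the smart engine.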

Yes, I can hear you saying "why don't you just look at the code and work out what it does". Sure, you can do that: take all 10,000 potential gadgets and manually do a writeup of what input each one needs to not crash and the effect it has on all the registers and values on the stack etc. Good luck. With that.

Meanwhile, back in lazy land, I'll build my assembly-fuzzing-AI and have a general purpose tool forevers.

At this point we have a bunch of gadget blobs that each have a net effect on current state. By chaining these together with a stack of stuff we control we can find combinations that achieve what we want (maybe).
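
Once each gadget is boiled down to a net effect, finding a combination is a search problem. Here is a minimal Python sketch using breadth-first search. The state is a toy (eax, ebx) tuple, the three gadget effects are invented for the example, and find_chain is a hypothetical helper, not an existing tool. A gadget returns None where it would crash.

```python
from collections import deque


def find_chain(gadgets, start, goal, max_len=8):
    """Breadth-first search for a shortest gadget sequence reaching the goal.

    gadgets: dict of name -> function mapping a state tuple to a new state,
             or to None if the gadget would crash from that state.
    goal:    predicate on a state.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, chain = queue.popleft()
        if goal(state):
            return chain
        if len(chain) == max_len:
            continue
        for name, effect in gadgets.items():
            nxt = effect(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, chain + [name]))
    return None


# Toy state: (eax, ebx). These effects are made up for illustration.
GADGETS = {
    "inc_eax":      lambda s: (s[0] + 1, s[1]),
    "xchg_eax_ebx": lambda s: (s[1], s[0]),
    "double_eax":   lambda s: (s[0] * 2, s[1]) if s[0] < 1000 else None,
}

chain = find_chain(GADGETS, start=(0, 0), goal=lambda s: s == (3, 2))
print(chain)
```

The "(maybe)" still applies: if no sequence of at most max_len gadgets reaches the goal, find_chain returns None, and with thousands of gadgets and a real machine state the search space gets big fast.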

Popping a shell becomes quite hard if you don't have any easy wins like "call whatevs; call system". 

If you allow there to be any insecure-rets (even 1 is enough), then this becomes easier. Find a chain of valid ret-chk gadgets that ends in an insecure-ret. Use the insecure-ret to jump to any gadget of choice, and if that gadget ends in a ret-chk then add your chain of valid ret-chks again to get another gadget-of-choice. Just hope that the valid ret-chk chain doesn't ruin what the last do-what-you-want-gadget did.

With enough insecure-rets, RET_CHK becomes only a mild to severe nuisance: you just have to find a chain that doesn't break everything.
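
The rule behind that trick fits in a few lines of Python. This is a sketch with invented names (Gadget, chain_survives): each gadget records whether it starts right after a call and which kind of ret it ends with, and a chain survives if every hop out of a checked ret lands on a call-preceded gadget. How you enter the first gadget is left out of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gadget:
    name: str
    follows_call: bool   # True if its first instruction sits right after a call
    ends_with: str       # "ret_chk" or "insecure_ret"


def chain_survives(chain):
    """True if every transfer between consecutive gadgets passes the check.

    The hop out of a gadget is decided by how that gadget ends: a ret_chk
    only lets us land on a call-preceded gadget, an insecure_ret lets us
    land anywhere.
    """
    for current, nxt in zip(chain, chain[1:]):
        if current.ends_with == "ret_chk" and not nxt.follows_call:
            return False
    return True


bridge = Gadget("bridge", follows_call=True, ends_with="insecure_ret")
pop_eax = Gadget("pop_eax", follows_call=False, ends_with="ret_chk")

print(chain_survives([bridge, pop_eax, bridge, pop_eax]))  # True
print(chain_survives([pop_eax, pop_eax]))                  # False
```

Here "bridge" plays the role of the valid chain ending in an insecure-ret: putting it after every gadget of choice is what lets the next arbitrary gadget through.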

Either way, this makes exploitation harder at a not-too-unreasonable cost.

We'll probably have a smarter solution that totally solves buffer overflows in the near future anyway, hence this is a blog post, not a patent application.

The super smart machine genius assembly fuzzer AI thing would probably be a useful tool in current exploitation, but it's too much of a trek for me for now and is probably not necessary 99% of the time.

Hack the planet
-pasteBin