Tikalon Blog is now in archive mode.
An easily printed and saved version of this article, and a link
to a directory of all articles, can be found below: |
This article |
Directory of all articles |
Obfuscated Software
August 14, 2013
When I was in
graduate school, one of my fellow students commented to a
professor that he would be happy when he got his degree, since he wouldn't need to take any more tests after that. The professor replied that after his
Ph.D., every working day as a
scientist would be a test. You're always being
judged by your peers; and, most importantly, by the
funding agencies.
As a personal example, at the end of a staff meeting, our
research vice president distributed copies of an unsolicited
proposal which was routed to him for comment. This was our daily test. We scanned the proposal quickly, and one of the other
physicists started to explain that what was proposed violated
Heisenberg's uncertainty principle. Actually, he failed his test, since
quantum mechanics did not apply. I outlined the problem using
classical electrodynamics. As my reward for passing my daily test, I was the one who had to write the report!
When at school, your reward for passing a test is a good grade and a slight forward movement towards your degree. Outside of
academia, your reward is usually your
salary, occasionally the esteem of your
colleagues, and sometimes a monetary
award.
Mathematicians can reap a huge reward by solving one of the
Millennium Prize Problems, valued at a million dollars each, and there are large monetary awards in all major fields.
Often,
competitions are devised whose only award is the recognition of your colleagues. One such competition that I've followed for more than two
decades is the
International Obfuscated C Code Contest.[1] The goal of this contest is the creation of working
programs in the
C programming language which are nearly impossible to decipher. I could insert a joke here about my own C programs, but I won't.
C is a good language for such a contest, since it's oblivious to
whitespace, and it has several
language constructs that lend themselves to
obfuscation. I'll mention, also, an inherently obfuscating programming language,
Whitespace, for which any program looks like a blank page. The language is coded using unique sequences of whitespace characters as its
syntax.
The C programming language is still popular more than thirty years after its creation by Brian Kernighan and Dennis Ritchie of Bell Labs.
Cover of the 1978 first edition of Kernighan and Ritchie's book, The C Programming Language.[2]
(Scan of author's copy.
A graphical reproduction of this cover by Jules Mazur is available on Wikimedia Commons)
I've attached one example of an obfuscated C program,
dorssel.c, written by Frans van Dorsselaer of
Bakker Industrial Automation,
Leiden, The Netherlands. This short program, which compiles without error using the
GNU C compiler (gcc), but with a few warnings, converts
ASCII characters to
Morse code, and Morse code to ASCII characters. However, it's hard to discern its purpose by looking at the
source code. This program won the 1998 Obfuscated C award for "Obsolescent Feature."
Although such a program is obfuscated, its decoding is not impossible. After all, an experienced
programmer wrote the program, so
reverse-engineering by another experienced programmer should be possible. When we do obfuscation by an
algorithm, however, things will get harder if the algorithm is unknown to the decoder.
PHP is an
interpreted language, so the only way its programs can be distributed is by source code. There are obfuscator programs available which are designed to automatically obfuscate your PHP programs to prevent their being modified; and, perhaps redistributed by others claiming the program as their own.
These obfuscator programs still produce code that's decipherable, albeit with considerable difficulty. A team of
computer scientists from
UCLA,
IBM Research and the
University of Texas at Austin has just developed an obfuscator approach for which reverse-engineering requires the solution of
mathematical problems so complex that their solution would take hundreds of years.[3-4] Their approach is described in a 46-page paper scheduled for presentation at the
54th Annual IEEE Symposium on Foundations of Computer Science, October 27-29, 2013, at
Berkeley, California.[4]
Do you remember the order of operations?
To solve any of these equations, it's important to remember to multiply before adding. I always use parentheses in my code to make things more obvious.
(UCLA Engineering image.)
Their method is difficult to summarize, but the
authors liken it to a mathematical
jigsaw puzzle, as artistically pictured above.[3] Says coauthor
Amit Sahai, who specializes in
cryptography at UCLA's
Henry Samueli School of Engineering and Applied Science,
"The real challenge and the great mystery in the field was: Can you actually take a piece of software and encrypt it but still have it be runnable, executable and fully functional... It's a question that a lot of companies have been interested in for a long time... You write your software in a nice, reasonable, human-understandable way and then feed that software to our system... It will output this mathematically transformed piece of software that would be equivalent in functionality, but when you look at it, you would have no idea what it's doing."[3]
Professor Amit Sahai of UCLA.
(UCLA Engineering photograph.)
The research team has applied the same principles of obfuscation in the creation of a process called
functional encryption. Instead of sending an
encrypted message, you send an encrypted
function in its place. Functional encryption has the property that a single message could be transmitted to a group of people, and each person would receive a different decrypted message.[3-4]
This research was funded by the
National Science Foundation and other agencies.[3-4]
References:
- International Obfuscated C Code Contest Web Site.
- Brian W. Kernighan and Dennis M. Ritchie , "The C Programming Language," Prentice Hall (Englewood Cliffs, New Jersey), First Edition, 1978, ISBN-10: 0-13-110163-3, paperback: 228 pages.
- Matthew Chin, "Computer scientists develop 'mathematical jigsaw puzzles' to encrypt software," UCLA Press Release, July 29, 2013.
- Sanjam Garg, Craig Gentry, Shai Halevi, Mariana Raykova, Amit Sahai and Brent Waters, "Candidate Indistinguishability Obfuscation and Functional Encryption for all circuits," July 21, 2013, Paper prepared for the 54th Annual IEEE Symposium on Foundations of Computer Science, October 27-29, 2013, at Berkeley, California.
Permanent Link to this article
Linked Keywords: Graduate school; professor; Doctor of Philosophy; Ph.D.; scientist; peer review; judged by your peers; funding agency; research; vice president; research proposal; physicist; Heisenberg's uncertainty principle; quantum mechanics; classical electrodynamics; academia; salary; collegiality; colleague; award; Mathematician; Millennium Prize Problem; competition; decade; International Obfuscated C Code Contest; computer program; C programming language; whitespace character; language construct; obfuscation; Whitespace programming language; syntax; Brian Kernighan; Dennis Ritchie; Bell Labs; Jules Mazur; Wikimedia Commons; dorssel.c; Bakker Industrial Automation; Leiden, The Netherlands; GNU C compiler; gcc; ASCII character; Morse code; source code; programmer; reverse-engineering; algorithm; PHP; interpreted language; computer scientist; University of California, Los Angeles; UCLA; IBM Research; University of Texas at Austin; mathematics; mathematical problem; 54th Annual IEEE Symposium on Foundations of Computer Science; Berkeley, California; order of operations; equation; multiplication; multiply; addition; adding; parentheses; author; jigsaw puzzle; Amit Sahai; cryptography; Henry Samueli School of Engineering and Applied Science; functional encryption; encryption; encrypted; function; National Science Foundation; Brian W. Kernighan and Dennis M. Ritchie , "The C Programming Language," Prentice Hall (Englewood Cliffs, New Jersey), First Edition, 1978, ISBN-10: 0-13-110163-3, paperback: 228 pages.