Author: InvisibleUser Team
Categories: Open-Source Software
You cannot trust a program if it is not open-source software! Always go for free software solutions if privacy and security are crucial. This article analyses the problems that come with the lack of transparency around closed-source software and explains how proprietary applications can be reviewed.
Closed-source software is shipped in a format that allows you to install the program or run it directly on your computer or mobile device. The technical term for making executables from source code is to “compile” or “build” a program. There is no direct way to access the source code of a program that has been pre-compiled by the developer. We cannot easily convert an executable back to source code, since the developer compiled it to binary code. That means ones and zeros, like “01011000”, for example, which stands for the letter “X” in ASCII. Binary code is meant to be read by machines and is used to give the CPU instructions. 1 and 0, true and false, switching the electrical current on and off.
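To make the “01011000” example concrete, here is a minimal Python sketch that converts between a character and its 8-bit ASCII pattern. The helper names char_to_bits and bits_to_char are our own and purely illustrative.

```python
def char_to_bits(ch: str) -> str:
    """Return the 8-bit binary pattern for a single ASCII character."""
    return format(ord(ch), "08b")


def bits_to_char(bits: str) -> str:
    """Interpret a string of ones and zeros as one ASCII character."""
    return chr(int(bits, 2))


print(char_to_bits("X"))         # 01011000
print(bits_to_char("01011000"))  # X
```

The same eight bits could just as well be a number (88) or part of a CPU instruction. The bits themselves do not say which, and that ambiguity is part of what makes compiled programs hard to read.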
If you do not have the source code directly from the developer, the only way to get to it is reverse engineering. That means trying to find out how a program works and what code was probably used to write it. Reverse engineering is difficult and time-consuming. It is also not always successful, although you learn a lot about how a program executes along the way.
Reverse engineering works by retracing the steps that created the executable, in reverse order. It often starts with disassembling the code. That is a process where binary machine code is translated back into assembly language code. Assembly is a very low-level programming language, but still readable for humans, unlike pure binary code composed of ones and zeros.
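The core idea of disassembly can be sketched in a few lines of Python. This is a toy under strong assumptions, not how Radare2 or any real tool works: it only knows five single-byte x86-64 opcodes, while real instruction sets have thousands of encodings, most of them several bytes long. The OPCODES table and the disassemble function are our own illustrative names.

```python
# Tiny lookup table of single-byte x86-64 opcodes (illustrative subset).
OPCODES = {
    0x55: "push rbp",
    0x5D: "pop rbp",
    0x90: "nop",
    0xC3: "ret",
    0xCC: "int3",
}


def disassemble(machine_code: bytes) -> list[str]:
    """Translate each byte into a mnemonic, or emit it as a raw data byte."""
    listing = []
    for offset, byte in enumerate(machine_code):
        # Bytes the toy does not recognise are listed as "db" (define byte).
        mnemonic = OPCODES.get(byte, f"db 0x{byte:02x}")
        listing.append(f"{offset:04x}: {mnemonic}")
    return listing


for line in disassemble(bytes([0x55, 0x90, 0x5D, 0xC3])):
    print(line)
```

Each output line pairs a byte offset with the mnemonic found in the table. The toy decodes every 0x55 byte as push rbp, even where that byte was really the letter “U” stored as data. Real disassemblers face the same ambiguity on a much larger scale.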
There are very good disassemblers available, such as the open-source Radare2 or the commercial Binary Ninja, but the assembly code they produce can still be far away from the original. The disassembled listing might not behave the way the original program does. Oftentimes, jumps from one location of the original program to another (distance measured in bytes) cannot be captured correctly by the disassembler. The same applies to branches in the original code or conditional statements like if, else and switch. (switch statements in C++ are known to be heavily changed by an optimising compiler to increase performance.)
The main reason for the difficulties is the changes that compilers make: They do not simply translate to machine language, but process a statement and its “neighbours” together or mix data into the assembly file’s .code section (or, vice versa, code into .data). While this optimises file size and speed, it makes deciphering almost impossible, since it is hard to separate code from data, and information is lost by “mixing” statements close to each other.
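The loss of information can be observed on a small scale with Python's own bytecode compiler. This is only an analogy: CPython produces bytecode, not machine code, and the optimisations of a C++ compiler go much further. The sketch assumes a current CPython 3 interpreter, and the variable name seconds_per_day is made up for illustration.

```python
import dis

source = "seconds_per_day = 60 * 60 * 24"
code = compile(source, "<example>", "exec")

# The constants table of the compiled code holds the pre-computed result.
print(code.co_consts)

# The bytecode listing loads that result directly; the two
# multiplications written in the source are no longer present.
dis.dis(code)
```

Someone who only sees the compiled form finds the value 86400 and cannot tell whether the author wrote 60 * 60 * 24, 3600 * 24 or the plain number. The compiler merged neighbouring operations and the original structure is gone.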
Even when the disassembler has done a great job, there are still obstacles to overcome. Most programmers have to invest a lot of time to understand the resulting assembly code, or cannot read it at all. The assembly code you get is still far away from the original source code in high-level languages like Python, Java or C++. It might help with reviewing the software, but it does not give you all the answers you were looking for. Assembly can be very cryptic, and even programmers who are used to reading it can get stuck. This is especially true for very large and complex programs.
Therefore, disassembling closed-source software does not necessarily increase transparency. It cannot replace open-source programs. Software developers who do not release their source code do this for understandable reasons. They are well aware of how difficult or impossible it is to recover source code after compiling. Only providing binaries helps them protect their work, although it does not stop piracy at all.
The issue we have with closed-source software is that you could hide almost anything in it. Collecting usage statistics without consent is a “harmless” example. The hidden source files could just as easily contain backdoors or direct eavesdropping, like in Skype, which we describe in our article “Microsoft help Police wire-tap Skype”. You will never know, and that is the problem.
For this reason, we cannot accept closed-source software, even if we have used the procedures described above to verify that it is not harmful. There is open-source software available for almost all applications, from development and graphics design to office programs, so you should definitely use it.
Besides, disassembling and reverse engineering software is often illegal under copyright law. This does not stop some black sheep in the software industry from trying, however. Take the MS Office clone WPS Office, for example. It is an almost identical replica of Microsoft’s software suite, both in looks and in functions, which is hardly possible without reverse engineering at least parts of the original.