71

I really want to learn assembly. I'm pretty good at C/C++, but want a better understanding of what's going on at a lower level.

I realize that assembly related questions have been asked before, but I'm just looking for some direction that's particular to my situation:

I'm running Windows 7, and am confused about how I should start working with assembly. Do I have to start with x64 because I'm running Windows 7? Some people have said 'start with 32 bit first' - how do I go about doing this? What does my operating system have to do with my ability to write assembly for '32' or '64' bit? In fact, what does 'n bit' assembly mean, where n is a number?


Edit:

Here are some links that have helped me get started with assembly; others who are just getting started may find them helpful. I'll keep updating this list as I continue on my assembly journey :)

Note: As I've been learning, I've decided to focus on programming with masm32. Therefore most of the below resources focus on that.

  • tag wiki (beginner guides, reference manuals, ABI documentation, and more.)
  • www.masm32.com
  • X86 Assembly WikiBook
  • X86 Disassembly WikiBook (great for understanding some conventions, and the basics of how higher level code translates into assembly)
  • WinAsm IDE (plays nicely with masm32)
  • Intro: Assembly for Windows (all code examples are for masm32)
  • List of Interrupts
  • Assembly Tutorial (great for helping to understand core concepts)
  • x86 Assembly Guide
  • Agner Fog's Software optimization resources, including some good stuff about calling conventions on different platforms (Windows vs. Linux/OS X), as well as a lot of examples of how to do specific things efficiently. Not great for total beginners, but great for intermediate to advanced readers.

    (He also has detailed performance info for each instruction for Intel and AMD CPUs, excellent for serious performance micro-optimization. Some beginners might want to look at some of that to get started thinking about how CPUs work, and why you might do something one way instead of another.)

2
  • 2
    Consider "Programming from the Ground Up"
    – user1831086
    Commented Mar 1, 2010 at 5:28
  • 1
    Good luck dude. Writing assembly is a real drag. Not trying to discourage it, but damn it's quite the undertaking.
    Commented Aug 29, 2017 at 1:04

5 Answers

45

When people refer to 32-bit and 64-bit assembly, they're talking about which instruction set you'll use. In the Intel case, which I presume you're asking about, these are also called IA-32 and x64. There is a lot more going on in the 64-bit case, so starting with 32-bit is probably good; you just need to make sure you're assembling your program with a 32-bit assembler into a 32-bit binary. Windows will still know how to run it.
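If you're not sure which kind of binary your toolchain is producing, one quick check is to look at the pointer width from C. This is only a minimal sketch: `pointer_bits` is a name made up for illustration, and it assumes an ordinary flat-memory target where the pointer size matches the address width.

```c
#include <limits.h>

/* Hypothetical helper: width of a data pointer in bits.
   Typically 32 in a 32-bit x86 build and 64 in an x86-64 build. */
static int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

Printing the result of `pointer_bits()` from a small test program tells you whether the compiler is targeting 32-bit or 64-bit code, regardless of which Windows you are running it on.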

What I really recommend for getting started with assembly would be something with a simpler instruction set to get a handle on. Go learn MIPS assembly - the spim simulator is great and easy to use. If you really want to dive straight into the Intel assembly world, write yourself a little C program that calls your assembly routines for you; doing all the setup and teardown for a 'real program' is a big mess, and you won't even be able to get started there. So just write a C wrapper with main() in it, and compile and link that with the object files you get from writing your assembly code.
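The C-wrapper approach can be sketched roughly as follows. The routine name `asm_add`, its signature, and the helper `run_wrapper` are made up for illustration. In a real project the body of `asm_add` would live in a separate assembly source file, and only the prototype would stay in C; a C stand-in is defined here purely so the sketch compiles on its own.

```c
#include <stdio.h>

/* Prototype the C side sees. With a real assembly routine you would keep
   only this declaration and link against the assembled object file. */
int asm_add(int a, int b);

/* Stand-in body (assumption, for illustration only): delete this once
   the routine is provided by your assembly object file. */
int asm_add(int a, int b)
{
    return a + b;
}

/* The wrapper logic: C handles setup, I/O, and teardown, so the assembly
   routine only has to do its one job. Returns 0 on success. */
int run_wrapper(void)
{
    int result = asm_add(2, 3);
    printf("asm_add(2, 3) = %d\n", result);
    return result == 5 ? 0 : 1;
}
```

A `main()` that simply returns `run_wrapper()` completes the program. The C runtime then takes care of process startup and exit, which is exactly the messy part you get to skip.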

Please don't get in the habit of writing inline assembly in your C code - it's a code portability nightmare, and there's no reason for it.

You can download all of the Intel 64 and IA-32 Architectures Software Developer's Manuals to get started.

9
  • 1
    This is helpful, thanks. The instruction set difference makes sense... is that the only difference? Like, is there any difference in the way a program written in 32-bit will run as opposed to a 64-bit program? If not, why are they called 32-bit/64-bit, as opposed to 'instruction set A' and 'instruction set B', for example?
    – Cam
    Commented Feb 28, 2010 at 18:24
  • 4
    @incrediman, the instruction set is a pretty huge difference. The instruction sets do have different names, but people just use 32-bit/64-bit as shorthand. In addition, there are different calling conventions (ABI) between the two instruction sets, and even two competing 64-bit ABIs.
    – Carl Norum
    Commented Feb 28, 2010 at 18:33
  • 2
    @incrediman, yes. The 64-bit architecture has 64-bit registers (and memory addressing), and the 32-bit architecture has 32-bit registers. That's one of the reasons people use the 64/32 names to differentiate the two.
    – Carl Norum
    Commented Feb 28, 2010 at 19:13
  • 3
    Definitely consider getting some microcontroller emulator and learning assembly for it. Assembly is pretty much dead for PC (most compilers can produce better ASM code from higher-language than programmers could by hand), but it's still very strong on microcontrollers, and various cool projects you could achieve with them are really worth the effort. Plus you get to write without the OS, which in assembly gets in your way more often than helps.
    – SF.
    Commented Mar 1, 2010 at 14:28
  • 2
    I think I'd differ with these guys and go for 64 bit -- 64 bit assembly has twice as many registers which gives more room for learning.
    – Joel
    Commented May 17, 2010 at 19:09
34

I started writing assembly in 1977 by taking the long route: first learning basic operations (and, or, xor, not) and octal math before writing programs for the DEC PDP-8/E with OS/8 and 8k of memory.

Since then I have discovered a few tricks on how to learn assembly for architectures I am unfamiliar with. It's been a few: 8080/8085/Z80, x86, 68000, VAX, 360, HC12, PowerPC and V850. I seldom write stand-alone programs; it's usually functions that are linked with the rest of the system, which is usually written in C.

So first of all I must be able to interface to the rest of the software which requires learning the parameter passing, stack layout, creating the stack frame, parameter positions, local variable positions, discarding the stack frame, returned values, return and stack cleanup. The best way to do this is to write a function that calls another function in C and examine the code listing generated by the compiler.
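As a concrete starting point for that exercise, something like the following is enough to make the calling convention visible. The names `callee` and `caller` are arbitrary. With GCC or Clang, the `-S` option emits the assembly listing (other compilers have their own listing option), and compiling without optimization keeps the stack frame easy to follow.

```c
/* Takes three parameters and has one local, so the listing shows
   parameter passing, a local variable slot, and the return value. */
int callee(int a, int b, int c)
{
    int local = a * b;
    return local + c;
}

/* Calls callee(), so the listing also shows how the arguments are set up
   before the call and how the stack is cleaned up afterwards. */
int caller(int x)
{
    return callee(x, 2, 3) + 1;
}
```

Comparing the listing for a 32-bit build with the one for a 64-bit build of the same file is a quick way to see how differently the two pass parameters.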

To learn the assembly language itself I write some simple code, seeing what the compiler generates and single-stepping through it in a raw debugger. I have the instruction set manuals close by so I can look up instructions I am unsure of.

A good thing to get to know (in addition to the stack handling mentioned previously) is how the compiler generates machine code given a certain high-level language construct. One such sequence is how indexed arrays/structures are translated into pointers. Another is the basic machine code sequence for loops.
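To watch both of those constructs at once, a pair of functions like this can be compiled to a listing and compared. This is just a sketch with made-up names (`sum_indexed`, `sum_pointer`); the exact instructions you see will depend on your compiler and optimization level.

```c
#include <stddef.h>

/* Indexed form: what you would normally write in C. */
int sum_indexed(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Pointer form: closer to what the machine does. values[i] means
   *(values + i), i.e. the base address plus i * sizeof(int). */
int sum_pointer(const int *values, size_t count)
{
    int total = 0;
    const int *end = values + count;
    for (const int *p = values; p != end; p++) {
        total += *p;
    }
    return total;
}
```

Both functions compute the same result, so any differences in the generated code come purely from how the compiler translates indexing and the loop test.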

So what is a "raw debugger?" To me it's a debugger that is part of a simple development package and that doesn't try to protect me from the hardware like the Visual debugger(s). In it I can easily switch between source and assembly debugging. It also starts quickly from inside the development IDE. It doesn't have three thousand features, more likely thirty and those will be the ones you use 99.9% of the time. The development package will typically be part of an installer where you click once for license approval, once for approving the default setup (don't you love it when someone has thought about and done that work for you?) and a last time for install.

I have one favorite simple development environment for x86-32 (IA-32) and that is OpenWatcom. You can find it at openwatcom.org.

I am fairly new to x86-64 (AMD64) but the transition seems straightforward (much like when moving from x86-16 to x86-32) with some extra gimmicks such as the extra registers r8 to r15 and that the main registers are 64 bits wide. I just recently ran across a development environment for XP/64, Vista/64 and 7/64 (probably works for the server OSes as well) and it is called Pelles C (pellesc.org). It is written and maintained by one Pelle Orinius in Sweden and from the few hours I've spent with it I can say that it is destined to become my favorite for x86-64. I've tried the Visual Express packages (they install so much junk - do you know how many uninstalls you need to do afterwards? more than 20) and also tried to get gcc from one place to work with an IDE (eclipse or something else) from another.

Once you've come this far and you come across a new architecture you will be able to spend an hour or two looking at the generated listing and after that pretty much know what other architecture it resembles. If the index and loop constructs appear strange you can look over the source code generating them and perhaps also the compiler optimization level.

I think I should warn you that once you get the hang of it you will notice that at desks close by, at the coffee machine, in meetings, in fora and lots of other places there will be individuals waiting to scorn you, make fun of you, throw incomplete quotes at you and give uninformed/incompetent advice because of your interest in assembly. Why they do this I don't know. Perhaps they themselves are failed assembly programmers, perhaps they only know OO (C++, C# and Java) and simply don't have a clue as to what assembler is about. Perhaps someone they "know" (or whom a friend of theirs knows) who is "really good" may have read something in a forum or heard something at a conference and therefore can deliver an absolute truth as to why assembly is a complete waste of time. There are plenty of them here at Stack Overflow.

2
  • Great answer (thanks for adding it despite the question's age), but you didn't have to make it community wiki - you deserve some rep! :)
    – Cam
    Commented Feb 16, 2011 at 15:54
  • 3
    Thanks Cam. I felt the question needed something more ... howto in practice!
    Commented Feb 16, 2011 at 20:56
3

Get IDA Pro. It's the bee's knees for working with assembly.

I personally don't see much of a difference between 32-bit and 64-bit. It is not about the bits but the instruction set. When you talk about assembly you talk about instruction sets. Perhaps they are implying that a 32-bit instruction set is better to learn from. However, if that is your goal I suggest Donald Knuth's books on algorithms -- they teach algorithms in terms of a 7-bit instruction set assembly :D

For portability issues, I suggest that instead of inline assembly you learn how to use compiler intrinsics -- for optimization work outside of embedded code, they will usually be the best tool. :D
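As a rough illustration of what using an intrinsic instead of inline assembly looks like, here is a sketch built around `__builtin_popcount`, which GCC and Clang provide as a built-in function. The wrapper name `count_bits` is made up for this example, and the fallback branch is just one simple way to keep the code compiling on other compilers.

```c
/* Hypothetical helper: count the set bits in an unsigned value. */
unsigned count_bits(unsigned value)
{
#if defined(__GNUC__) || defined(__clang__)
    /* GCC/Clang built-in; the compiler can lower it to a single
       instruction when the target CPU has one. */
    return (unsigned)__builtin_popcount(value);
#else
    /* Portable fallback: clear the lowest set bit until none remain. */
    unsigned count = 0;
    while (value != 0) {
        value &= value - 1;
        count++;
    }
    return count;
#endif
}
```

The calling code never mentions a register or an opcode, so it stays readable and still builds when the compiler or target changes.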

7
  • 1
    From what I'm reading, it's a disassembler... so, actually writing/coding some assembly language code isn't possible, right? If so, this is only a half-answer.
    Commented Feb 28, 2010 at 18:23
  • 1
    Working with assembly these days (even in embedded land) is about making adjustments to code generated by a C/C++ compiler. IDA makes this job as painless as possible.
    Commented Feb 28, 2010 at 18:27
  • That may be but it still didn't respond to my original question that well :)
    – Cam
    Commented Feb 28, 2010 at 18:28
  • 1
    @Hassan, what if I need to write code that runs before a C/C++ runtime exists?
    – Carl Norum
    Commented Feb 28, 2010 at 18:30
  • @Hassan Yeah, true. Maybe I was a bit harsh... sorry!
    – Cam
    Commented Feb 28, 2010 at 18:31
1

but want a better understanding of what's going on at a lower level

If you really want to know everything that's going on at a lower level on x86/x64 processors/systems, I would really recommend starting with the basics, that is, 286/386 real mode code. For example, in 16-bit code you are forced to use memory segmentation which is an important concept to understand. Today's 32-bit and 64-bit operating systems are still started in real mode, then switch to/between the relevant modes.
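To make the segmentation idea concrete: in real mode, the CPU forms an address by shifting the 16-bit segment left by four bits and adding the 16-bit offset. The sketch below shows only that arithmetic; the function name `real_mode_linear` is made up for illustration.

```c
#include <stdint.h>

/* Real-mode address translation: the 16-bit segment is shifted left by
   4 bits (multiplied by 16) and the 16-bit offset is added. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Different segment:offset pairs can name the same byte. For example, `0x07C0:0x0000` and `0x0000:0x7C00` both translate to `0x7C00`.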

But if you're interested in application/algorithm development, you might not want to learn all the low-level OS stuff. Instead you can start right off with x86/x64 code, depending on your platform. Note that 32-bit code will also run on 64-bit Windows, but not the other way round.

4
  • 3
    Boot time isn't the only lower-level way to interact with a system; I think writing native assembly for OS programs is a good way to start. Writing and debugging boot systems is not for the faint of heart.
    – Carl Norum
    Commented Feb 28, 2010 at 18:29
  • 3
    Understanding 16 bit segments is about as useful as learning how Roman numerals work. And as far as starting in real mode to bootstrap your own OS, that would take a couple of years of study unless it's just going to be printing out "The BIOS handed me these register values on screen xxxx xxxx". Low level stuff like reading/writing hardware ports in device drivers would be a good use for assembly code even if you're not an asm genius.
    Commented Feb 28, 2010 at 23:32
  • 1
    Segmentation in x86-64 long mode is mostly vestigial. For most of the segment registers, the base is fixed at 0, so the only option is a flat memory model (just like all the major OSes use in 32-bit mode). Learning 16-bit segmentation after you understand the easier flat memory model may help with understanding how fs or gs are used for thread-local storage in modern systems. But I would recommend against trying to learn that first. Even internally, modern CPUs special-case the segment-base=0 case, and otherwise have higher latency for loads. So even internally, segmentation isn't happening.
    Commented Jul 13, 2017 at 2:14
  • 1
    If you mostly want to do asm for helping the compiler do a good job for user-space code (e.g. notice that it did a bad job and change the source to help it emit better asm), you don't need to know anything about segmentation for the vast majority of cases. Segments in modern x86 are how a 64-bit OS runs either 64 or 32-bit processes (different CS descriptors), and how they do TLS. (Different fs base for each thread). Until you're ready to learn that, ignore segments.
    Commented Jul 13, 2017 at 2:17
0

Starting with C (not C++ or C#) will help you get a basic understanding of what is needed to 'do it all yourself', like registers, stack frames, and data processing. I did a master's in computer science and one of my favourite topics is compiler building (yes, yacc and lex!), which helped me understand higher-level languages at a deep, intimate level. I still cherish those moments defining my own language and compiling it to low-level constructs. Indeed, I designed an object-oriented language to be executed on a virtual processor.

So: there are no shortcuts to learning assembler. It can be tedious. But very satisfying.
