So in this unit we finally come to the point where we describe how to develop project six, which is implementing the assembler. To remind you, you have two options at this stage. If you want to write a program that actually implements an assembler using a programming language, you are welcome to listen carefully to this unit and use the guidelines that we provide to develop this program. If you don't have a programming background, there's another unit that describes an equivalent project that implements an assembler without actually writing code. But I think that for every one of you, even if you don't write the assembler using a programming language, it may pay off to listen to this unit and see an example of how to put together and carry out a complex software development project.
So, without further ado, we have to develop a Hack assembler, and here's the contract. We decided that we will call our assembler HackAssembler. This is an arbitrary decision. This program should translate Hack assembly programs, that is, programs written in symbolic Hack code, into executable Hack binary code. We assume that the source program is supplied in a text file called Xxx.asm, where Xxx is some name. Given this input, your HackAssembler program should generate a new file with the same name as the input file but a different extension, Xxx.hack. Once you create this file, you should be able to load it into the Hack computer and actually execute it. Now, we're making a big assumption, which is that Xxx.asm, the supplied input file, is error free, so there are no syntax errors in this file. In the last unit of this week, we'll talk about whether or not this assumption makes sense.
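The naming part of this contract can be sketched in a few lines. Python is used here only for illustration, since the lecture leaves the implementation language open, and the function name `output_path` is a hypothetical helper rather than part of any supplied API.

```python
from pathlib import Path

def output_path(asm_path):
    """Return the .hack file name that corresponds to an .asm file name."""
    # Keep the base name and swap only the extension (hypothetical helper).
    return str(Path(asm_path).with_suffix(".hack"))

print(output_path("Xxx.asm"))  # Xxx.hack
```

Opening the resulting path in write mode would replace any earlier output file, which matches the edit, reassemble, repeat cycle that the contract assumes.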
So, with all that in mind, you have to develop an assembler that follows this contract. Just for the sake of argument, let's assume that you did it in Java, although you can do it in any other high-level language. If you did it in Java and you want to actually use the assembler, then we assume that you are operating in some shell environment. You type java, the name of the Java virtual machine, then the name of your assembler, HackAssembler, and then you give your assembler an argument, which is the name of the input file that you want to translate. Once you provide this command and hit enter, your assembler goes to work. It translates the supplied file Xxx.asm and creates a new Hack file that contains the binary code. If this file already exists, it overwrites it. This is what you normally do, right? You write a program and you assemble it. If you don't like the result, you make some corrections to the source code and reassemble it, so it makes sense to overwrite the same target file again and again. So that's what we have to do in this project.
Now, how do we do it? Once again, you are welcome to write this assembler in any high-level language that you please, and yet we recommend that you follow a certain software architecture. We described this architecture in the previous unit, so I don't want to spend too much time on it. I just want to remind you that the architecture we propose consists of four different software modules. Each of these should be a stand-alone software module that can also be unit tested in isolation from the other modules. We need a parser that unpacks each instruction into its underlying fields. We need a module called Code that translates the symbolic fields, the mnemonics, into their corresponding binary values. And by the way, I should say it more precisely.
It's not that the module translates. Rather, the module has a set of methods that are used to carry out this translation. We need a symbol table module that contains a set of methods that create and manage a symbol table. And finally we need the main module, which actually drives the show: it opens the input file, creates an output file, and goes through the entire translation process by calling methods from the previously described modules. This main module will probably be named HackAssembler or something like that, depending on the language that you end up using. So this is the architecture that we recommend you use in order to develop the assembler.
Now, how should you actually go about the implementation? We recommend that you follow what is sometimes called staged development. We want you to develop the assembler in stages. First of all, as I described in previous units, we want you to write a basic assembler that can handle Hack assembly programs that contain no symbols. Once you write this assembler, we want you to test it in isolation, that is, test it before you move on to other things in this project. Once this basic assembler works to your satisfaction, you can go on and develop the symbol table module, and you should verify that you can create a symbol table and manage it in isolation from the other pieces of the project. Finally, once you have these two abilities in place, you can put them together and create an assembler that can translate any given assembly program. So this is yet another example of divide and conquer. You take a complex task, you split it into more manageable tasks, and you complete and unit test each one of them in isolation.
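As a rough illustration of the first stage, here is a minimal sketch of a basic assembler for symbol-free instructions. The binary codes are those of the Hack machine language; everything else is an assumption made for this example: Python as the language, the function names `parse_c_instruction`, `encode_comp`, and `translate`, and the simplification that input lines are already stripped of white space and that dest mnemonics use their standard spelling (for example `MD`, not `DM`).

```python
# comp codes for the a=0 column; the a=1 (M) variants reuse the A codes.
COMP = {
    "0": "101010", "1": "111111", "-1": "111010",
    "D": "001100", "A": "110000", "!D": "001101", "!A": "110001",
    "-D": "001111", "-A": "110011", "D+1": "011111", "A+1": "110111",
    "D-1": "001110", "A-1": "110010", "D+A": "000010", "D-A": "010011",
    "A-D": "000111", "D&A": "000000", "D|A": "010101",
}
# The list position of each mnemonic is its 3-bit code ("" means no dest/jump).
DEST = ["", "M", "D", "MD", "A", "AM", "AD", "AMD"]
JUMP = ["", "JGT", "JEQ", "JGE", "JLT", "JNE", "JLE", "JMP"]

def parse_c_instruction(instr):
    """Parser role: split 'dest=comp;jump' into its three fields."""
    dest, _, rest = instr.rpartition("=")   # dest is "" when there is no "="
    comp, _, jump = rest.partition(";")     # jump is "" when there is no ";"
    return dest, comp, jump

def encode_comp(comp):
    """Code role: return the 7 bits 'a cccccc' for a comp mnemonic."""
    a_bit = "1" if "M" in comp else "0"
    return a_bit + COMP[comp.replace("M", "A")]

def translate(instr):
    """Translate one symbol-free instruction into a 16-bit string."""
    if instr.startswith("@"):               # A-instruction: 0 + 15-bit value
        return format(int(instr[1:]), "016b")
    dest, comp, jump = parse_c_instruction(instr)
    return ("111" + encode_comp(comp)
            + format(DEST.index(dest), "03b")
            + format(JUMP.index(jump), "03b"))

for line in ["@2", "D=A", "@3", "D=D+A", "@0", "M=D"]:
    print(translate(line))
```

Keeping the parsing and the encoding in separate functions mirrors the proposed split between the parser and the Code module, so each piece can be unit tested on its own before the symbol table is added.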
Now, in order to carry out this unit testing, we provide a set of seven test programs that you are welcome to use to make sure that your assembler is actually working. The first program, called Add.asm, is very simple, and it's provided in one flavor only. Every one of the next three programs is supplied in two different versions: with symbols and without symbols. The without-symbols version of each program has a capital L suffix, which stands for "less symbols": Max less symbols, Rectangle less symbols, and so on. In the next few slides, I'm going to talk about each of these programs in isolation, and that's exactly what you should also do in your project. You should test your assembler on each of these programs at different stages of the development. This will help you, once again, manage the process in a good way.
All right. So, here's the Add program. It is a very simple Hack program. It contains no symbols, just a few A- and C-instructions, and it is designed to verify that your assembler can handle white space and instructions. That's it. Now, when you look at this program, you probably tell yourself that this is a very limited test, right? For one thing, we didn't even test all the possible white space options, because we also have inline comments that we don't see in this program. And indeed, we don't claim that the programs we provide amount to exhaustive testing of your assembler. It's just a very basic test, and you are welcome and encouraged to take this program, Add.asm, and make it more complicated: add more comments, add more white space, add more instructions if you want. Just make sure that your assembler can handle these kinds of simple programs without any problems. So once again, the programs that we provide are just the minimum, which you are welcome to extend with your own testing as well.
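One way to deal with white space before any translation happens is a small cleaning pass, sketched below in Python. The function name `clean` and the sample text are made up for this illustration; the sample is not the supplied Add.asm, and removing spaces inside an instruction is a design choice of this sketch.

```python
def clean(source):
    """Return the instructions in `source`, without comments or white space."""
    instructions = []
    for raw in source.splitlines():
        code = raw.split("//", 1)[0]   # drop full-line and inline comments
        code = "".join(code.split())   # drop spaces and tabs, also inner ones
        if code:                       # skip lines that are now empty
            instructions.append(code)
    return instructions

sample = """// A made-up test with extra white space

   @2      // inline comment
D = A
\t@3
D=D+A
"""
print(clean(sample))
```

Feeding the assembler variations like this one, with more comments, blank lines, and indentation, is the kind of extended testing suggested above.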
All right, so this is the Add program. The next program is called Max, and as you can see from the documentation, it is designed to compute the maximum of two values. This is really not interesting at all from a translator's perspective. The translator has no idea what the program is trying to do; the translator is interested only in the syntax. But I'm saying it because I want you to know what is going on here. So, this is the Max program, and here is another version of this program without symbols. The only difference between these two programs is that in the second program we replaced every symbol with its numeric meaning. Once again, once you write your basic assembler, you should test it on MaxL.asm, and only later, when you add the ability to handle symbols, should you test your assembler on Max.asm.
In a very similar fashion, we also provide a program called Rectangle that draws a rectangle on the screen, as we see here in this snapshot from the CPU emulator. Here's the code of the Rectangle program. Once again, we supply two different versions of the same program. On the left-hand side, you see the full-blown version with symbols. On the right-hand side, you see the same program without symbols; we basically replaced every symbol with its numeric meaning. And once again, when you write your basic assembler, you should test it on RectangleL first, and only later, when you write the full-blown assembler, test it on Rectangle.asm.
The last program that we provide is called Pong, and this is quite an elaborate program. It implements a Pong game, and the simplest way to describe it is to give you a demo of how it works in the CPU emulator. So that's what we'll do next. Here we are in the CPU emulator, and we would like to demonstrate running a Pong game in it. So, we go ahead and load the Pong program.
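The step that turns a program with symbols into its symbol-free twin, replacing every symbol with its numeric meaning, could be sketched as a two-pass routine like the one below. The predefined symbols and the starting variable address of 16 follow the Hack language conventions; the function name `resolve_symbols`, the use of a plain dictionary as the symbol table, and the tiny input program are assumptions made for this example, which also expects lines that are already free of comments and white space.

```python
def resolve_symbols(lines):
    """Return the instructions in `lines` with every symbol replaced by a number."""
    # Predefined symbols of the Hack language.
    table = {"R%d" % i: i for i in range(16)}
    table.update(SP=0, LCL=1, ARG=2, THIS=3, THAT=4, SCREEN=16384, KBD=24576)

    # First pass: a label (XXX) names the address of the next instruction.
    instructions = []
    for line in lines:
        if line.startswith("("):
            table[line[1:-1]] = len(instructions)
        else:
            instructions.append(line)

    # Second pass: unknown symbols are variables, allocated from address 16.
    next_variable = 16
    resolved = []
    for instr in instructions:
        if instr.startswith("@") and not instr[1:].isdigit():
            name = instr[1:]
            if name not in table:
                table[name] = next_variable
                next_variable += 1
            instr = "@" + str(table[name])
        resolved.append(instr)
    return resolved

program = ["@i", "M=1", "(LOOP)", "@i", "D=M", "@LOOP", "D;JGT", "@SP"]
print(resolve_symbols(program))
```

With a routine of this kind in place, the output for a program with symbols can be checked against its L version, since both should reduce to the same symbol-free instructions.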
So, let us go to projects, and within projects, we go to project six. We see that we have a folder called pong. We open it, and we see that, as usual in this project, we have two versions of the same program, named Pong.asm and PongL.asm. Now, at the CPU emulator level, it doesn't really matter whether you load a program with or without symbols, because the CPU emulator is going to resolve the symbols into physical addresses on the fly. In other words, the CPU emulator has a built-in assembler. When you load a symbolic program into the ROM, which by the way is something that you cannot really do, and we can do it only because it's an emulator, the emulator translates it into binary code. To prove it, let's select this version of the program and load it into the ROM. I see that I get a symbolic program, but there are no symbolic labels in it. All the symbols have been resolved; this one, for example, was probably a symbolic reference in the source, and it now appears as @3. Once again, this is just for illustration purposes, because in reality the program looks like this, in binary. But then, obviously, it's very difficult to follow. So, in the emulator we allow looking at the program both symbolically and numerically.
All right, so this is a program that implements a simple Pong game, and it's quite a huge program. If you scroll down, you will see that this program is, lo and behold, something like 27,000 lines of code, which is a lot, right? But it's a little bit deceiving, because the logic of the Pong game itself is much shorter. It's only, I don't exactly remember, something like 200 lines of high-level language code.
But by the time you translate this code down, and now I'm talking about the compiler translating from a high-level language into assembly, you get many more lines of code. Still, that alone doesn't give us so many lines. The reason we have so much code here is that what you see is not only an implementation of the Pong game, but also an implementation of the entire operating system that enables us to control the screen, the keyboard, mathematical operations, and numerous other things that are needed when you implement and run high-level language programs. So, just to summarize, it's important to note that the code that we see here was originally written in the Jack high-level language, and then it was translated by a Jack compiler, eventually, into the code that you see here. So this code was not written by a human being. It was written by a compiler. Of course, this whole business of writing high-level programs, compilers, operating systems, and so on is not covered in Nand to Tetris part one; it is covered in Nand to Tetris part two. And yet, when you write an assembler, all this information is completely irrelevant. What you have is a file that consists of numerous lines of symbolic code, and you simply have to translate it into binary code.
All right, so, having said all that, let us try to run the program. I click fast forward, so to speak, and there seems to be a lot of action going on, but nothing appears on the screen. I'm a bit disappointed, because I expected to play a Pong game, and instead I see something that looks like a program executing, but nothing is really happening. Indeed, you have to realize that what is happening here is the execution of a lot of setup code. The operating system is initializing all sorts of data structures, all sorts of drivers are being loaded, and so on and so forth.
And it will take a while before the programmer's code proper actually starts running. So if we lose our patience, as I'm sure every one of you has done already, we can stop this processing, we can rewind everything if we want, and we can tell the emulator that we don't want to see the program flow. We do it by selecting this control here, no animation. What we mean here is no animation of the code or its execution. So let's do this and then run the program again, and hopefully we will get to see some fun action. All right? And hallelujah! There's a Pong game going on. That's quite amazing. Okay, so I'm playing Pong now. I'm not very good at it. Oh, dear! Let me try another game. All right, maybe I'll make some improvement now. Wait, I have to do this. Okay. Playing Pong, let's see if I'm any better than before. Oh, hallelujah! I got a score of two. Three! That's my world record so far! And four! Very impressive. And game over.
So what you saw here is the Pong game in progress. I'm always excited to see programs like this working on the Hack computer, because this is a program that was written by us in Jack. It was compiled using a compiler that we wrote, and then the executable code ran on a computer that we actually built, or a computer that you actually built, throughout this course. So this is, I think, very, very satisfying. This has been a demo of the Pong game. You're welcome to play Pong on your own computer using the CPU emulator, which is installed in your tools folder. And once again I want to emphasize that as far as the assembler writer is concerned, all this story about playing Pong is a very nice background story, but it has no relevance whatsoever to writing an assembler. An assembler could not care less if it has to translate 12 instructions or 12,000 instructions. It's a computer program.
It will process whatever file we give it to process, and that's it. So, I hope you enjoyed playing with our Pong game. It's time to open the black box and look at the Pong code. Here is the beginning of the Pong.asm program, which we also supply as one of the test programs that your assembler has to process. I'd like to make some observations about Pong.asm, because it's quite different in some respects from the previous programs that we saw in this project.
First of all, the code of Pong was originally written in a language called Jack. Jack is a simple, Java-like, object-based language that we introduce in the second part of this course. In part two, we introduce this language and then we write a compiler that translates from this language all the way down to Hack assembly code. What you see here is the result of this translation. There's also a virtual machine in the middle that I don't have time to talk about. Writing these software layers is quite an elaborate undertaking, but we do it in a very similar style to what we did in this course, one step at a time, and this is done in the second part of the course. The result of all this translation effort is the .asm program that you see in front of you. So the special thing about this program is that it was automatically generated by the compiler and by the virtual machine. The resulting code is about 28,000 instructions long. If you wonder why it's so long, it is because it also includes the Jack operating system, and the operating system is another thing that we'll develop in the second part of the course. Taken together, we have an assembly language program, and if you translate it into binary code, then, bingo, you have something that causes your computer to play Pong.
And so, we thought that it's important that as part of this project you will also translate a so-called industrial-strength program, and Pong is one such program. Now, we recommend that you look at this code, at least briefly. You don't have to understand what's going on here; it's very difficult to understand code just from looking at its assembly version. But you will realize that, first of all, we don't have any white space, and that is because all the white space was lost in translation. You will also see in this code all sorts of strange things. For example, you see some strange addresses, like 256. Now, where does this 256 come from? Well, it comes from the way our virtual machine is implemented. You may ask yourself, what is a virtual machine? Well, to find out you have to take the second part of the course. You also see all sorts of strange labels, like, for example, END_EQ. Where did this END_EQ come from? Well, it was generated automatically by the compiler when it translated the Jack code that was written by the programmer. And you also see all sorts of predefined symbols that we haven't seen before, like SP. What is SP? Well, SP stands for stack pointer, and once again, this is something that we discuss in the second part of the course.
So, when you look at this code, it's a little bit like reading genetic code, DNA. We have all sorts of strange things that were created by previous generations, so to speak. And indeed the code that you see here was created by several layers of evolution. It's not exactly evolution, but we have a compiler that added some stuff to the code, and we have a virtual machine that added some stuff to the code. And finally, you have something that becomes somewhat cryptic. But you can hand it to your assembler, and the assembler will translate it into machine code.
And then finally you will have something that actually runs on your Hack computer. Okay, so this has been a long explanation of the Pong.asm program. As far as your assembler is concerned, all this explanation is irrelevant. It just has to take this file and translate it into binary code.
So, how should you test the results of your work? First of all, you have to use your assembler to generate Hack programs written in binary code. Then you should test these programs and make sure that they actually do what they're supposed to do. One way to do it is to invoke the supplied hardware simulator and load into it the built-in Hack computer chip, or the chip that you wrote, it doesn't matter, although we recommend that you use the built-in chip to avoid errors. Then load the Hack code into this chip and run it. That's one testing option. Another testing option is to do the same, but use the CPU emulator instead of the hardware simulator. This is a more user-friendly way to carry out the test. And finally there is the option of choice, the one we recommend you actually use. We provide a working assembler, which is available on your personal computer if you downloaded our software suite. Our assembler is called, quite simply, Assembler. You can use our assembler to translate any one of the supplied .asm programs, and then you can compare the code that our assembler generated to the code that your assembler generated. If the two are the same, then you know that your assembler is at least as good as ours, maybe better. Okay. So, we do recommend that you use this third option in order to test your work, and here is how you actually do it. What we see here is a snapshot of the assembler that we supply on the course website, actually on the Nand to Tetris website.
So, you invoke this program called Assembler, this nice window pops up, and then you can load into the assembler the .asm program, which you see here in the left pane. Then you click a button, and our assembler translates the symbolic code into binary code; you see the resulting binary code in the center pane in this snapshot. Then you have an option to load into this program what we call a compare file. A compare file is the Hack program that was generated by your assembler. Once you do this, the program will not only load the compare file, it will also compare the two files. You will then get a nice message saying that the comparison was successful, or a somewhat nasty message saying that the comparison was not successful, which means that your assembler produces code which is different from the code that our assembler produces. In that case, something is probably wrong in your assembler, and you have to test and fix it. Okay, so this is the recommended way to make the final test that your assembler actually works. This is the source file, this is the Hack file produced by our assembler, and this is the Hack file produced by your assembler.
So this basically sums up what you have to do. All the resources that you need are available on the nand2tetris.org website, and the website describes the supplied files. There's no need to download anything, because if you downloaded the software suite at the beginning of the course, then all the files that you need for this project are available on your personal computer in the projects/06 directory. There's another set of resources that you might find useful, like the supplied assembler, the supplied CPU emulator, the tutorial, a proposed assembler API, and so on. All these things are available on the website, and you are welcome to use and consult them.
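The same comparison idea can also be scripted. The sketch below is only an illustration in Python and makes no claim about how the supplied Assembler performs its check; the function name `first_mismatch` is hypothetical, and the two arguments are assumed to be the full text of the reference .hack file and of the .hack file produced by your own assembler.

```python
def first_mismatch(expected_text, actual_text):
    """Return the 1-based number of the first differing line, or None if equal."""
    # Ignore blank lines and surrounding white space (a choice of this sketch).
    expected = [line.strip() for line in expected_text.splitlines() if line.strip()]
    actual = [line.strip() for line in actual_text.splitlines() if line.strip()]
    for number, (want, got) in enumerate(zip(expected, actual), start=1):
        if want != got:
            return number
    if len(expected) != len(actual):
        # One file is a prefix of the other: the mismatch is the first extra line.
        return min(len(expected), len(actual)) + 1
    return None

reference = "0000000000000010\n1110110000010000\n"
candidate = "0000000000000010\n1110110000010001\n"
print(first_mismatch(reference, candidate))  # 2
```

Reporting the line number of the first difference points directly at the instruction to investigate, which is convenient for a large file such as the translated Pong program.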
So, this has been the unit that gave you, hopefully, all the information that you need in order to build the assembler on your own. The next unit will sum up everything that we did this week.