C and assembler on Linux

From Noisebridge
This is a page for the C and assembler on linux class, Tuesdays at 5:30 PM in Church.


'''Here's a write-up that covers the first half of
the first C on Linux class that I gave last Tuesday,
20120619, in the Church classroom from 5:30 to 7 PM.'''
  I hope to write up the balance of last Tuesday's
class before the weekend's out.
  Note the To: list, please; if you know of anyone
who's missing, please let them and me know.
  Complaints, suggestions, sarcasms, all are welcome.


jim
415 823 4590 my cellphone, call anytime


 
[[Category:Events]]
Learning C programming on Linux
 
* The C programming language is a specification that defines keywords,
operators, and rules of syntax.
  This may sound stupidly obvious or like useless knowledge, but if you
really get into using C, you may find that it's a practical concept:
useful, and intelligently obvious.
 
* A C compiler is a software program that implements the C specification:
a parser, keywords, operators, and syntax rules.
  The practical point of this idea is that there are different C
compilers for different machines and for different purposes. If you're
just starting to learn C, this idea will seem pretty nearly as useless
as the idea that C is a specification.
 
 
  The tools you use to write C programs include an editor and a C
compiler at minimum. There are a lot more tools available, such as
debuggers and profilers and more.
 
  The process you follow is to use a text editor to write some ASCII
text that complies with the rules of the C language then use a C
compiler to read your ASCII file and create a new file that contains
executable machine code.
  Look for C compiler-generated error messages. If there are any, even
one, then the compiler does not make an executable file; you have to fix
all errors. You may also see warning messages, which indicate that the
compiler found something questionable but was able to continue. Warnings
alone do not stop the compiler from making the executable file, however
many there are, unless you ask it to treat warnings as errors (gcc's
-Werror option does this).
  If you get an executable file, run it and see if it works as you
expect. If it does, you probably won't learn anything more from this
exercise. If it doesn't, you get to learn about runtime and logic
errors: you wrote a program that is correct according to the C language
but incorrect in terms of implementing what you hoped it would do.
 
  The following commands exemplify the process using a bash shell:
$ vi myfile.c
$ gcc myfile.c
$ ls
a.out  myfile.c
$ chmod 755 a.out    # optional: gcc already marks a.out as executable
$ ./a.out
 
 
  You use a text editor such as vi to create a file of text that
conforms to the rules of the C specification.
 
  You run the C compiler so that it reads what you wrote. The C
compiler sees your program file as an ASCII character stream that it
interprets as a token stream.
  So, what is a "token"? A token is one or more ASCII characters that
the compiler sees as a meaningful thing. To compare with the English
language, think of a token as a word or a word ending or punctuation or
some other element that's meaningful.
 
  The C compiler is a software program that conforms to a particular
design: the design for interpreters and compilers. Generally, any
compiler or interpreter includes an input stage that parses the incoming
ASCII (token) stream and also has a set of keywords and operators that
are reserved ASCII character(s) and a set of rules that the compiler
applies to the tokens it reads.
  When the compiler begins, it sets itself to a neutral state, which
is to say that it will examine the first ASCII characters to verify that
it can parse them as a stream of tokens.
  When the compiler identifies the first token, it verifies that that
token is of a class that can be a first token and then resets its (the
compiler's) state so that the following token must be one of a limited
set of tokens. For example:
1+2
  The compiler reads the 1 and then the '+' character, at which point
it determines that it has at least one valid token: 1. The compiler
continues reading and sees the 2 and determines that it now has two
tokens, 1 and '+'. The 1 token is integer data with the value 1. The '+'
token, because it occurs between the 1 and the 2, represents the
addition operator. The compiler continues reading, finds only
whitespace, and can then identify the ASCII stream as a set of three
tokens (a value, an operator, and a value) that together form an
expression.
  An expression is at least one operand and zero or more operators
that must be resolved to a single value.
  The compiler resolves the expression 1+2 to be a single value of 3.
  If you write a C program that is exactly 1+2 and nothing else, it's
very likely your compiler will generate an error message (remember, a
compiler implements the C programming language specification, and does
so in its own way; the C specification is deliberately permissive in
some aspects of implementation).
  If you get an error message, very likely it will be a complaint that
there's not a complete statement, or that there's a problem at the end
of the file, or some such.
 
  The C compiler is designed to read statements. A statement is a set
of valid tokens that follow the rules of the C programming language and
end with a statement termination character, which is the ; character.
  Try revising your program to read
1+2;
  The 1+2 is an expression: the C compiler sees 1 followed by +
followed by 2 and verifies that this is a valid sequence of tokens that
makes an expression. It interprets the ; character as a statement
terminator, which means the compiler creates the machine code for the
expression and resets itself to a neutral state, ready to read the next
statement (ASCII character stream of valid tokens).
  Whether this compiles depends on where the statement sits. gcc, for
one, still rejects a file that contains only 1+2; because C requires
statements to appear inside a function body. Placed inside a function
such as main, the statement compiles, usually with a warning that the
statement has no effect, and the compiler makes a new file that is named
a.out. You may think that the compiler would leave the 1+2 in the file
as data and machine instructions that the CPU runs to create the sum, 3.
Instead, the compiler typically does the addition itself while it is
compiling and, because nothing uses the result, may drop it altogether.
That the compiler does the arithmetic before the program ever runs is a
matter of optimization.
 
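  The C compiler generally runs in four different phases:
1 preprocessor
2 compiler
3 optimizer
4 linker
  (gcc's own documentation names the stages preprocessing, compilation
proper, assembly, and linking; optimization happens during compilation
proper. The four-phase list above is the simplified view used in this
class.)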
 
  Consider the program:
1+2;
  The preprocessor runs and sees nothing to do.
  The compiler runs and translates the ASCII to data and machine code,
which properly is a set of 1 bits and 0 bits that represent integer 1,
integer 2, and the operation of addition.
  The optimizer recognizes that this expression can be resolved now
without doing any harm to any other parts of the program, so the
optimizer replaces the code with the integer value of 3.
  The linker runs and does nothing: there is no code to which to link
this module.
 
  Consider the following program:
1+2
3 + 4 ;
  How many statements do you see? How many expressions? How many
tokens?
 
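  There are seven tokens: 1, +, 2, 3, +, 4, and ; (we're not counting
the space characters or the newline characters). There is only one
statement terminator, so the compiler treats everything up to the ; as
a single statement that contains two expressions. The newline after 1+2
does not end anything.
  As written, though, the statement is not valid C: the 2 is followed
directly by the 3 with no operator between them, so the compiler
reports a syntax error. Each expression needs its own ; to stand as a
statement.
  Note that the C compiler sees 1+2 and 3 + 4 identically: two
expressions that add two integer values together. Once each has its own
terminator, the optimizer pass very likely reduces them to 3 and 7.
  Note that the 3 and the 7 are then values the program computes but
does nothing with.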
  Now it may be that the optimizer of your compiler detects that
nothing ever uses these values and eliminates them altogether. gcc does
this for an expression statement whose result is unused. Data that you
want to keep, for example in a file that contains only data and that
you link to one or more other programs, has to be declared as a
variable.
 
  The discussion so far includes the terms ASCII stream, token stream,
values, operands, operators, expressions, statements, and the four
compiler passes: preprocessor, compiler, optimizer, and linker.

Latest revision as of 23:19, 9 July 2019
