Arduino Pro Mini Primer

I am a frequent shopper at http://www.sparkfun.com. I had looked at the LilyPad/Arduino products for a while but never bought any until recently. That time is behind me now: I ordered an Arduino Pro Mini - 3.3V/8MHz board (http://www.sparkfun.com/commerce/product_info.php?products_id=8824) and this cool little USB serial port to go with it (http://www.sparkfun.com/commerce/product_info.php?products_id=8772).

Solder a 6 pin header to the end of the blue board (the end with 6 holes), then line the two boards up so that when you plug them together the corner marked GRN matches the corner marked GRN and the corner marked BLK matches the corner marked BLK. Then "all you have to do" is plug in USB to provide power. The board comes programmed with a blinking light routine. You should also see a new serial port on your computer. I run Linux and at the moment I have no other USB serial devices plugged in, so it showed up as /dev/ttyUSB0. One quick way to figure out which USB serial port you have is to unplug the device, run ls /dev/ttyUSB*, then plug in the device and repeat; whatever name is new is the FTDI board. dmesg and lsusb are also useful in determining the serial port name.

I am not sure of the history at this point, but the Arduino is a very open source project hosted at http://www.arduino.cc. They have a getting started guide, which for the moment doesn't work on 64 bit Ubuntu Linux (8.10), so I had to install a 32 bit Linux to play with it. I was able to use the getting started guide, the Java based software and the hacked avrdude programmer to replace the blinky light program on the board with the example blinky light program. The first thing I couldn't figure out was this: if I wrote my own assembler program and assembled it, how do I just load it? I don't think you can. You are completely confined to their sandbox, how fun...

Let the hacking begin. As of this writing, the one time I tried it, the command line the Java program used to run avrdude was:
avrdude -C avrdude.conf -q -q -p m168 -cstk500v1 -P /dev/ttyUSB0 -b19200 -D -U flash:w:Blink.hex:i
I had to press the reset button just before running avrdude; I am not sure what was up with that. I am also not sure what they did to modify the avrdude programmer or the config file. Either way, the ability to program a .hex file without the Java program is available. So that makes me happy...but not completely. I don't want a GUI programmer, in fact I don't want avrdude. I want to see the protocol, I want to see the bootloader source, I want to get my hands dirty. In this case I didn't have to mess with the bootloader, they did the right thing. It boots up, waits a couple of seconds in case you want to program it, and if not it jumps to the program loaded in flash. I do not think the AVR Butterfly worked that way; as I remember it, I had to modify the bootloader.

As shipped from sparkfun, the board responds to the STK500 communications protocol described in the AVR061 (doc2525.pdf) app note. It only takes a handful of command bytes to put the board in a program state and read or write the flash.
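To give a feel for how small that handful of bytes is, here is a minimal sketch in C that only builds a few STK500 (version 1) command frames into a buffer. This is not progard.c; the function names are made up for illustration, and the byte values are the ones I recall from the AVR061 command set, so check them against the app note before trusting them. The sketch assumes flash addresses are sent as word addresses, low byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte values as recalled from AVR061; verify against the app note. */
enum {
    STK_OK        = 0x10,
    STK_INSYNC    = 0x14,
    CRC_EOP       = 0x20,   /* every command frame ends with this byte */
    STK_GET_SYNC  = 0x30,
    STK_LOAD_ADDR = 0x55,
    STK_PROG_PAGE = 0x64
};

/* "Get sync": sent until the bootloader answers INSYNC, OK. */
size_t stk_build_sync(uint8_t *out)
{
    out[0] = STK_GET_SYNC;
    out[1] = CRC_EOP;
    return 2;
}

/* "Load address": assumed to take a word address, low byte first. */
size_t stk_build_load_address(uint8_t *out, uint16_t byte_addr)
{
    uint16_t word_addr = byte_addr / 2;
    out[0] = STK_LOAD_ADDR;
    out[1] = (uint8_t)(word_addr & 0xFF);
    out[2] = (uint8_t)(word_addr >> 8);
    out[3] = CRC_EOP;
    return 4;
}

/* "Program page" for flash: command, 16 bit size, memory type, data, end. */
size_t stk_build_prog_page(uint8_t *out, const uint8_t *data, uint16_t len)
{
    size_t n = 0;
    assert(len > 0);
    out[n++] = STK_PROG_PAGE;
    out[n++] = (uint8_t)(len >> 8);     /* size, high byte first */
    out[n++] = (uint8_t)(len & 0xFF);
    out[n++] = 'F';                     /* memory type: flash */
    for (uint16_t i = 0; i < len; i++)
        out[n++] = data[i];
    out[n++] = CRC_EOP;
    return n;
}

/* A reply to a command without return data is INSYNC followed by OK. */
int stk_reply_ok(const uint8_t *reply, size_t len)
{
    return len == 2 && reply[0] == STK_INSYNC && reply[1] == STK_OK;
}
```

A loader would write each frame to the serial port and read the reply before sending the next one; the serial port handling is left out here on purpose.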

Currently my loader is very crude, but I will work on it over time. I agree that it is very cool that they wired the FTDI (red) board such that you can programmatically reset the AVR; this way you don't have to touch the board to program it. My loader takes advantage of that. I still have some problems, though: sometimes the program starts up as desired after loading, and sometimes you have to press the reset button after loading to get the new program.

EDIT: I made some changes to progard and ser, and it now seems to program a lot more reliably on the UNO, Pro 5V, Pro Mini 5V and LilyPad 3.3V.

progard.c
ser.c
ser.h

progard was updated 2008-12-22 to allow for programs larger than 256 bytes.
progard was updated 2011-02-04 to work with the uno and pro

gcc -o progard progard.c ser.c
Will compile the program. To run it do something like this:
./progard blink1.hex
I have hardcoded the serial port in ser.c, so you will likely need to change it before you (compile and) use it.

Blinking the led

The LED is tied to pin 13 using their nomenclature. It is really PORT B pin 5, or PB5. The other thing that many will prefer but I don't is that they have created a wonderful C language environment for you to work in. Follow the getting started guide and try the digital blink example program to see what I mean. The primary purpose of this web page is so that you have another choice. I don't necessarily want to use C, and if I did I wouldn't want that much hidden; I want to touch and feel every bit in every register. YMMV, if you prefer the high level interface then this product is for you as well...

To blink the led on this board you need to enable PB5 as an output pin. This is done by setting bit 5 in register DDRB (the direction bit). Writing bit 5 in PORTB then turns the led on or off. I have not bothered to take care of the neighboring bits in the example program below; essentially I am taking over all of port B, not just PB5. Ideally you want to do a read-modify-write so that you only change the one bit of interest.

For the moment I am using avra on Linux (apt-get install avra), which is for my purposes compatible with the AVR assembler from Atmel (Windows command line). You can, or at least could, run the Atmel assembler under wine on Linux (get AVR Studio 3: it is a much smaller download, doesn't require the registration step and has what you need, but not an include file). You will notice when you look at the datasheet for the ATmega168 that the addresses are next to the port descriptions (search for DDRB for example). You DO NOT need to include a file that you cannot always find to provide defines for these addresses. I have attempted below to create programs that are completely contained in a single file. That way you won't end up like me in a few years going: what about the port addresses, what is the syntax for that, why didn't I put that file on my project page?
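For reference, the read-modify-write mentioned above looks roughly like this when written out in C on a plain byte. It is only a sketch of the bit arithmetic, with made-up helper names; on the AVR itself the same thing would be an in, an ori/andi and an out, or a single sbi/cbi for I/O addresses below 0x20 such as DDRB and PORTB.

```c
#include <assert.h>
#include <stdint.h>

#define LED_BIT 5u   /* PB5, the "pin 13" led */

/* Return reg with one bit set and every other bit left alone. */
uint8_t set_bit(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg | (1u << bit));
}

/* Return reg with one bit cleared and every other bit left alone. */
uint8_t clear_bit(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg & ~(1u << bit));
}

/* Return reg with one bit flipped and every other bit left alone. */
uint8_t toggle_bit(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg ^ (1u << bit));
}
```

For example, set_bit(0x03, LED_BIT) gives 0x23: the led bit goes on and the two low bits stay as they were. The assembly listing that follows does not do this; it writes the whole port.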
.device ATmega168
.equ  DDRB       = 0x04
.equ  PORTB      = 0x05

.org 0x0000
    rjmp RESET

RESET:
    ldi R16,0x20
    out DDRB,R16

    ldi R18,0x00
    ldi R17,0x00
    ldi R20,0x20
Loop:

    ldi R19,0xE8
aloop:
    inc R17
    cpi R17,0x00
    brne aloop

    inc R18
    cpi R18,0x00
    brne aloop

    inc R19
    cpi R19,0x00
    brne aloop

    eor R16,R20
    out PORTB, R16
    rjmp Loop
To assemble and load the above program:
avra -fI blink1.asm
progard blink1.hex
The led should be on for a bit under a second and then off for a bit under a second. This is a counter timed loop: I turn on the led, count to some number, turn off the led, count to some number, turn on the led... The central R17 loop counts to 256; R17 goes through all the counts from 0x01 to 0xFF, then rolls over to 0x00 and is done. R18 does the same 256 counts, but it wraps around the R17 loop so you get 256*256 or 65536 counts. R19 starts at 0xE8 and so counts to 24, making the grand total 0x180000 or a meg and a half. The avr is supposedly running at 8MHz, so this is about right: the inner loop costs about 4 clocks per count (inc and cpi take one each, a taken brne takes two), which works out to roughly 0.8 seconds between toggles. What this tells me is that from boot, this arduino board is already running at full speed. So many chips/boards come up at a minimal clock rate and you have to program the clock dividers yourself to get to the maximum advertised rate (or at least something not so slow). Likewise the bootloader configures the uart so we don't have to (so long as we stick with the default speed).
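The loop arithmetic is easy to get wrong by hand, so here is a small C sketch that redoes it. The function names are mine, and the cycles per count is left as a parameter because it is an estimate: 4 assumes inc and cpi at one cycle each plus a taken brne at two, and ignores the little bit of overhead in the two outer loops.

```c
#include <assert.h>
#include <stdint.h>

/* Passes made by an 8 bit register that counts up from start until it wraps to 0. */
uint32_t passes_until_wrap(uint8_t start)
{
    return 256u - start;   /* 0x00 gives 256, 0xE8 gives 24 */
}

/* Total inner loop counts: R19 passes times 256 (R18) times 256 (R17). */
uint32_t delay_counts(uint8_t r19_start)
{
    return passes_until_wrap(r19_start) * 256u * 256u;
}

/* Whole milliseconds taken by counts passes at cycles_per_count cycles each. */
uint32_t delay_ms(uint32_t counts, uint32_t cycles_per_count, uint32_t f_cpu_hz)
{
    assert(f_cpu_hz > 0);
    return (uint32_t)((uint64_t)counts * cycles_per_count * 1000u / f_cpu_hz);
}
```

With the 0xE8 starting value from blink1.asm, delay_counts(0xE8) is 0x180000, and delay_ms(0x180000, 4, 8000000) comes out at 786. Plugging in 5 cycles per count instead gives 983.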


UART Echo

A real simple uart echo program. If there is a byte in the rx buffer put it in the tx buffer.
.device ATmega168

.equ  UDR       = 0xC6
.equ  UBRRH     = 0xC5
.equ  UBRRL     = 0xC4
.equ  UCSRC     = 0xC2
.equ  UCSRB     = 0xC1
.equ  UCSRA     = 0xC0

mainloop:

getit:
    lds r20,UCSRA
    andi r20,0x80
    breq getit
    lds r18,UDR

sendit:
    lds r20,UCSRA
    andi r20,0x20
    breq sendit
    sts UDR,r18

    rjmp mainloop

IR blink

My initial goal is to do something with IR (remote controls). I like using IR modules such as the Sharp GP1UE27XK (DigiKey part number 425-1904-ND). I orient the receiver so that pin 3 (GND) is in the GND hole next to PD2, PD2 gets pin 2 (VCC) and PD3 gets pin 1 (VOUT). I set PD2 as an output to power the IR module, then configure PD3 as an input and enable its pull up resistor. PB5 is configured as an output to blink the led. From there the code simply reads the state of the IR module and sets the led accordingly. Be very careful: the IR modules are normally three pin, but the location of GND, VCC and VOUT varies. It is pretty easy to let the smoke out if you get it wrong.


.device ATmega168
.equ    DDRB    = 0x04
.equ    PORTB   = 0x05

.equ    PORTD   = 0x0B
.equ    DDRD    = 0x0A
.equ    PIND    = 0x09

.org 0x0000
    rjmp RESET

RESET:
    ; led is PB5 output
    ldi R16,0x20
    out DDRB,R16

    ;PD2 is out, vcc for ir module, hi
    ;PD3 is input, vout for ir module, pull up

    ldi R16,0x00
    out PORTD,R16
    ldi R16,0x04
    out DDRD,R16
    ldi R16,0x0C
    out PORTD,R16

    ldi R16,0x20
    ldi R17,0x08
    ldi R18,0x00

mainloop:
    in R19,PIND
    and R19,R17
    brne is_set
    mov R19,R16
is_set:
    out PORTB,R19
    rjmp mainloop

blink2.asm

Simple timer based blinking. I don't think there is a prescaler acting upon the I/O clock after leaving the bootloader. The program below puts timer0 in normal mode (a simple no frills counter) using the I/O clock divided by 1024. The R17 loops wait for the 8 bit counter to roll over, which divides that clock by another 256, and the R18 loop divides it by another 16. This results in roughly a half second on, half second off blink (the led looks like it blinks once a second). 16*1024*256 = 4*1024*256*4 = 4*1024*1024, or a divide by 4 meg. Now that's meg as in bytes, not Hz, so I am a little off. The outer loop should have divided by about 15.26, not 16. Close enough for this purpose, and close enough to know that there are no prescalers inline, post-bootloader.

.device ATmega168
.equ  DDRB       = 0x04
.equ  PORTB      = 0x05
.equ  TCCR0A     = 0x24
.equ  TCCR0B     = 0x25
.equ  TCNT0      = 0x26

.org 0x0000
    rjmp RESET

RESET:

    ldi R16,0x00
    out TCCR0A,R16
    ldi R16,0x05
    out TCCR0B,R16

    ldi R16,0x20
    out DDRB,R16

    ldi R17,0x00
    ldi R20,0x20
Loop:


    ldi R18,0xF0
aloop:
    in R17,TCNT0
    cpi R17,0x00
    brne aloop

bloop:
    in R17,TCNT0
    cpi R17,0x00
    breq bloop

    inc R18
    cpi R18,0x00
    brne aloop

    eor R16,R20
    out PORTB, R16
    rjmp Loop
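The divider chain in blink2.asm above can be double checked with a few lines of C. This is only a sketch of the arithmetic with names I made up; it assumes the 8MHz clock and the three dividers described above (timer prescaler 1024, 8 bit rollover 256, outer loop 16).

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles between led toggles for a chain of three dividers. */
uint32_t cycles_per_toggle(uint32_t prescale, uint32_t wrap_counts,
                           uint32_t outer_passes)
{
    return prescale * wrap_counts * outer_passes;
}

/* Convert a cycle count to whole milliseconds at f_cpu_hz. */
uint32_t cycles_to_ms(uint32_t cycles, uint32_t f_cpu_hz)
{
    assert(f_cpu_hz > 0);
    return (uint32_t)((uint64_t)cycles * 1000u / f_cpu_hz);
}

/* Outer loop passes needed for a toggle every toggle_ms, in hundredths of a pass. */
uint32_t ideal_outer_passes_x100(uint32_t f_cpu_hz, uint32_t prescale,
                                 uint32_t wrap_counts, uint32_t toggle_ms)
{
    uint64_t divider = (uint64_t)prescale * wrap_counts;
    assert(divider > 0);
    return (uint32_t)((uint64_t)f_cpu_hz * toggle_ms / 10u / divider);
}
```

cycles_per_toggle(1024, 256, 16) is 4194304, which cycles_to_ms turns into 524 at 8MHz, and ideal_outer_passes_x100(8000000, 1024, 256, 500) returns 1525, meaning a little over 15.25 passes for an exact half second.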

blink3.asm

Found the pre-scaler. Unless there is a way to disconnect it, the problem with this is that its output is used both for the timer and the uart, so if you divide the system clock down too much you can't use the uart.

CLKPR is used to divide the 8MHz down by 128. Then the timer prescaler is set for 1/1024. The result is 61 and change ticks per second. I sample the baseline timer count in R17, then sample it again in R18 and subtract R17 from R18 to get the difference (yes, this works even when the timer rolls over). When the difference reaches 30 counts, toggle the led. The led is on for about half a second and off for about half a second, which makes it look like it is blinking once a second.

.device ATmega168
.equ DDRB       = 0x04
.equ PORTB      = 0x05

.equ TCCR0A     = 0x24
.equ TCCR0B     = 0x25
.equ TCNT0      = 0x26

.equ CLKPR      = 0x61

.org 0x0000
    rjmp RESET

RESET:
    ldi R16,0x80
    ldi R17,0x07 ;0x7 1/128  0x8 1/256
    sts CLKPR,R16
    sts CLKPR,R17

    ldi R16,0x00
    out TCCR0A,R16
    ldi R16,0x05 ;0x5 1/1024 0x06 1/256
    out TCCR0B,R16

    ldi R16,0x20
    out DDRB,R16

    ldi R17,0x00
    ldi R20,0x20
Loop:

    in R17,TCNT0
aloop:
    in R18,TCNT0
    sub r18,r17
    cpi r18,0x1E
    brlo aloop

    eor R16,R20
    out PORTB, R16

    rjmp Loop
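The "works even when the timer rolls over" part of blink3.asm above is worth seeing with numbers. Here is a minimal C sketch of the same 8 bit subtraction; the function names are made up, and the tick rate helper just repeats the 8MHz / 128 / 1024 arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Timer ticks per second after the system and timer prescalers (integer part). */
uint32_t timer_ticks_per_second(uint32_t f_cpu_hz, uint32_t clkpr_div,
                                uint32_t timer_div)
{
    assert(clkpr_div > 0 && timer_div > 0);
    return f_cpu_hz / clkpr_div / timer_div;
}

/* Ticks elapsed between two samples of a free running 8 bit counter.
   The cast back to uint8_t makes the subtraction wrap modulo 256,
   just like sub r18,r17 does on the AVR. */
uint8_t ticks_elapsed(uint8_t start, uint8_t now)
{
    return (uint8_t)(now - start);
}

/* Nonzero once at least interval ticks have passed since start. */
int interval_done(uint8_t start, uint8_t now, uint8_t interval)
{
    return ticks_elapsed(start, now) >= interval;
}
```

If the baseline sample was 250 and the counter now reads 24, ticks_elapsed(250, 24) is 30 even though the counter wrapped in between. This only holds as long as the interval being measured is shorter than one full rollover (256 ticks).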

sirc1.asm

This is the first of at least two example programs that complete my initial project for this board. An IR (Remote Control) to serial receiver.

Using http://www.sbprojects.com/knowledge/ir/ir.htm as a reference. Look at the Sony SIRC protocol. This is the protocol decoded by this program and discussed here.

IR remotes in general, if not all IR devices, are modulated on some carrier frequency. In this case 40kHz. Basically the ON periods are a square wave of ons and offs to the IR led at a rate of 40kHz, and the OFF periods leave the led off. The IR module listed above (irblink1.asm), and the others like it, work best at a matching frequency (buy a 40kHz receiver for a 40kHz based protocol), although you can often get away with using the wrong frequency if you allow for more error in the on and off periods. The IR module flattens out or wraps or envelopes the ON period. So if you had 1ms of blinks at 40kHz, instead of 40 blinks the output of the module is a single 1ms pulse. This IR module, and perhaps all of them, drives its output low while receiving IR and lets it go high when there is no incoming IR; the code here relies on that, idling while PD3 reads high. What comes out of the IR receiver for the SIRC protocol will be 2.4ms or 1.2ms or 0.6ms pulses with 0.6ms gaps in between.

I like the protocols that start with a sync pattern. It makes decoding easier for one thing, as well as other benefits. In this case the sync pattern is an ON period lasting 2.4ms. At 8MHz 2.4ms is 19200 clock cycles. Since we have one and know how to use it, we might as well use one of the 8 bit timer/counters. This means we need to get 19200 plus a tolerance within an 8 bit counter, with enough accuracy to tell the difference between 2.4ms, 1.2ms and 0.6ms. So to fit 19200 in 8 bits or 256 counts the divisor must be greater than 19200/256 or 75. Looking at the clock select bits in timer/counter 0, a divide by 256 is the next larger divisor. This means our timer would run at 8MHz/256 or 31250Hz. 2.4ms at 31250Hz is 75 counts, 1.2ms is 37.5 counts and 0.6ms is 18.75 counts. So that should work.
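The same prescaler reasoning, written out as a small C sketch so the numbers can be rechecked for other clock rates or pulse widths. The function names are mine; the list of divisors is the set timer/counter 0 offers on this part (1, 8, 64, 256, 1024). Counts are returned in hundredths to avoid floating point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divisors selectable with the timer/counter 0 clock select bits. */
static const uint16_t prescalers[] = {1, 8, 64, 256, 1024};

/* Smallest prescaler that squeezes cycles into an 8 bit count, or 0 if none does. */
uint16_t pick_prescaler(uint32_t cycles)
{
    for (size_t i = 0; i < sizeof prescalers / sizeof prescalers[0]; i++)
        if (cycles / prescalers[i] < 256u)
            return prescalers[i];
    return 0;
}

/* Timer counts for a pulse of pulse_us microseconds, in hundredths of a count. */
uint32_t counts_x100(uint32_t pulse_us, uint32_t f_cpu_hz, uint16_t prescale)
{
    assert(prescale > 0);
    return (uint32_t)((uint64_t)pulse_us * f_cpu_hz * 100u
                      / ((uint64_t)prescale * 1000000u));
}
```

pick_prescaler(19200) returns 256, and counts_x100 gives 7500, 3750 and 1875 for 2400us, 1200us and 600us at 8MHz, which are the 75, 37.5 and 18.75 counts above.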

The blink3.asm example above shows how to use the timer/counter to measure time. So we can begin. If we don't see any IR, sit in an infinite loop (set1). Once we get some IR, grab the current time (counter/timer) and set the LED just for fun. Then wait for the IR to stop, grab the time, and subtract start from end to get the total time. If that time is not 75 counts plus or minus a margin then it is not a valid sync pattern, so go back and wait for another. The web page that describes the protocol above has you thinking about the 0.6ms off periods as being AFTER the on period. Other than the sync pattern, I am thinking of the off period as BEFORE the on period. So we can go into a loop that repeats the same tasks 12 times, once for each data bit we expect. Each pass of this loop measures the off period, ensuring that it is 0.6ms, then measures the on period. The protocol dictates that a ONE is 1.2ms and a ZERO is 0.6ms. If any measurement is not within tolerance, go back to the top of the main loop and wait for another sync pattern. I use registers r0 and r1 to hold these 12 bits. For each new bit I shift the register pair left one, and if the new bit is a 1 I set the lower bit of r1.
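The decode steps just described can be modeled on a desktop machine, which makes it easier to play with the tolerances. The C sketch below is my own restatement, not a translation of the assembly: it assumes the pulse widths have already been measured into an array of timer counts (sync on-time first, then 12 pairs of off-time and on-time), and it uses the same window constants as the listing, with the lower bound inclusive and the upper bound exclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tolerance windows in timer counts at 31250Hz, same values as the asm constants. */
enum { MIN24 = 67, MAX24 = 82, MIN12 = 33, MAX12 = 41, MIN06 = 17, MAX06 = 21 };

/* True when min <= t < max. */
static int in_window(uint8_t t, uint8_t min, uint8_t max)
{
    return t >= min && t < max;
}

/* d[0] is the sync on-time, followed by 12 (off-time, on-time) pairs.
   Returns the 12 bit code, first received bit in the top position,
   or -1 if any width falls outside its window. */
int sirc_decode(const uint8_t *d, size_t n)
{
    uint16_t code = 0;

    assert(d != NULL);
    if (n < 25 || !in_window(d[0], MIN24, MAX24))
        return -1;

    for (size_t bit = 0; bit < 12; bit++) {
        uint8_t off = d[1 + 2 * bit];
        uint8_t on  = d[2 + 2 * bit];

        if (!in_window(off, MIN06, MAX06))
            return -1;                              /* bad 0.6ms spacer */
        if (in_window(on, MIN12, MAX12))
            code = (uint16_t)((code << 1) | 1u);    /* 1.2ms: a ONE */
        else if (in_window(on, MIN06, MAX06))
            code = (uint16_t)(code << 1);           /* 0.6ms: a ZERO */
        else
            return -1;                              /* neither width */
    }
    return code;
}
```

Feeding it a sync of 75 counts followed by spacers of 19 and on-times of 19 or 37 in the bit order 0111 0101 0000 returns 0x750.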

This receiver is a generic Sony SIRC receiver. The code sent out on the uart (to the host) uses a packet format that is easy to parse in ASCII or binary, etc. The packet starts with an equal sign (0x3D), ends with a CRLF, and the three bytes in the middle are the ASCII hex digits representing the 12 code bits. Once decoded there are countless ways to transmit the code to the user; this six byte thing just happens to be the one I chose. As of this writing I do not know enough about LIRC or any other IR software on the host side that I can interface to for operating things like mythtv. I hope to learn more and put this code to use (the goal of this immediate project).
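To make the six byte packet concrete, here is a C sketch of both ends: formatting a 12 bit code the way the AVR sends it, and parsing it back on the host. The function names are made up, and the parser is an assumption about what a host program might do; nothing on the host side exists yet. Uppercase hex digits are assumed, matching what sendhex produces.

```c
#include <assert.h>
#include <stdint.h>

/* Fill out[0..5] with '=', three uppercase hex digits, CR, LF. */
void sirc_format_packet(uint16_t code, char out[6])
{
    static const char hex[] = "0123456789ABCDEF";

    assert(code <= 0x0FFF);
    out[0] = '=';
    out[1] = hex[(code >> 8) & 0xF];
    out[2] = hex[(code >> 4) & 0xF];
    out[3] = hex[code & 0xF];
    out[4] = '\r';
    out[5] = '\n';
}

/* Value of one uppercase hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Host side: turn one six byte packet back into a code, or -1 if malformed. */
int sirc_parse_packet(const char p[6])
{
    int hi = hex_value(p[1]), mid = hex_value(p[2]), lo = hex_value(p[3]);

    if (p[0] != '=' || p[4] != '\r' || p[5] != '\n')
        return -1;
    if (hi < 0 || mid < 0 || lo < 0)
        return -1;
    return (hi << 8) | (mid << 4) | lo;
}
```

The leading '=' gives the host something to resynchronize on if it starts listening in the middle of a packet.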

After my first successful IR receiver many years ago, I found another PIC based receiver that was universal and had different goals. I found this single protocol approach to be superior for my goals. The range on the receiver was noticeably longer and decoding was more accurate. My feeling is/was that programmable remotes are common and support hundreds or thousands of flavors of the common protocols, so within a single remote there are dozens to hundreds of choices that will not conflict with your existing equipment but that I can receive. Basically my goal here is not to receive any and every protocol; my goal is to provide a way to receive commands from a significant portion of the remotes out there, in particular ones that you use often. Looking at the LIRC protocol for example (LIRC was not the receiver mentioned above), you could easily rip most of this code out and create a LIRC receiver. Or you can leave most of the code here and provide a perfect bytestream to a LIRC daemon. Since we have done the work it seems a waste to make the PC do it again.
.device ATmega168
.equ    DDRB    = 0x04
.equ    PORTB   = 0x05

.equ    PORTD   = 0x0B
.equ    DDRD    = 0x0A
.equ    PIND    = 0x09

.equ    TCCR0A  = 0x24
.equ    TCCR0B  = 0x25
.equ    TCNT0   = 0x26

.equ  UDR       = 0xC6
.equ  UBRRH     = 0xC5
.equ  UBRRL     = 0xC4
.equ  UCSRC     = 0xC2
.equ  UCSRB     = 0xC1
.equ  UCSRA     = 0xC0


.equ SETLED     = 0x20
.equ RESETLED   = 0x00

.equ MAX24 = 82
.equ MIN24 = 67

.equ MAX12 = 41
.equ MIN12 = 33

.equ MAX06 = 21
.equ MIN06 = 17


.org 0x0000
    jmp RESET

RESET:

    ldi r16,0x00
    out TCCR0A,r16
    ldi r16,0x04 ; 0x04 divide by 256
    out TCCR0B,r16

    ; led is PB5 output
    ldi R16,0x20
    out DDRB,R16

    ;PD2 is out, vcc for ir module, hi
    ;PD3 is input, vout for ir module, pull up

    ldi R16,0x00
    out PORTD,R16
    ldi R16,0x04
    out DDRD,R16
    ldi R16,0x0C
    out PORTD,R16

    ldi R16,0x20
    ldi R17,0x08


mainloop:
    ;call ResetLed
    ldi r20,RESETLED
    out PORTB,r20

set1:           ;while set
    in r5,PIND
    and r5,r17
    brne set1
    in r8,TCNT0
    ;call SetLed
    ldi r20,SETLED
    out PORTB,r20

reset1:         ;while not set
    in r5,PIND
    and r5,r17
    breq reset1
    in r19,TCNT0
    ;call ResetLed
    ldi r20,RESETLED
    out PORTB,r20

    mov r7,r19
    sub r19,r8
    mov r8,r7



    ;2.4ms sync pattern
    cpi r19,MAX24
    brlo reset1a
    jmp mainloop
reset1a:

    cpi r19,MIN24
    brsh reset1b
    jmp mainloop
reset1b:





    eor r0,r0 ;rcode hi
    eor r1,r1 ;rcode lo
    ldi r20,0x0C
    mov r2,r20 ;count to 12

twelveloop:




twset1:         ;while set
    in r5,PIND
    and r5,r17
    brne twset1
    in r19,TCNT0
    ;call SetLed
    ldi r20,SETLED
    out PORTB,r20


    mov r7,r19
    sub r19,r8
    mov r8,r7



    ;0.6ms spacer
    cpi r19,MAX06
    brlo twset1a
    jmp mainloop
twset1a:

    cpi r19,MIN06
    brsh twset1b
    jmp mainloop
twset1b:



twreset1:       ;while not set
    in r5,PIND
    and r5,r17
    breq twreset1
    in r19,TCNT0
    ;call ResetLed
    ldi r20,RESETLED
    out PORTB,r20

    mov r7,r19
    sub r19,r8
    mov r8,r7

    ;0.6ms or 1.2ms data bit
    cpi r19,MAX12
    brlo twreset1a
    jmp mainloop
twreset1a:

    cpi r19,MIN06
    brsh twreset1b
    jmp mainloop
twreset1b:


    cpi r19,MIN12
    brsh new_one_bit

    cpi r19,MAX06
    brlo new_zero_bit

    jmp mainloop

new_one_bit:
    lsl r1
    rol r0
    inc r1
    jmp twelvebottom

new_zero_bit:
    lsl r1
    rol r0
    jmp twelvebottom

twelvebottom:

    dec r2
    breq new_code
    jmp twelveloop

new_code:



    ldi r18,0x3D
    call sendbyte

    mov r18,r0
    call sendhex

    mov r18,r1
    lsr r18
    lsr r18
    lsr r18
    lsr r18
    call sendhex

    mov r18,r1
    call sendhex


    ldi r18,0x0D
    call sendbyte

    ldi r18,0x0A
    call sendbyte

    jmp mainloop



sendhex:
    andi r18,0x0F
    cpi r18,0x0A
    brlo not_big
    ldi r20,0x07
    add r18,r20
not_big:
    ldi r20,0x30
    add r18,r20
sendbyte:
    lds r20,UCSRA
    andi r20,0x20
    breq sendbyte
    sts UDR,r18

    ret

sirc2.asm

This is the same as sirc1.asm above, but instead of being generic and sending the raw IR code in a packet, I examine the code for a specific Sony remote and a specific set of buttons/commands, and what is sent to the computer is single bytes, each byte being a specific command. The commands being decoded are defined in this table.

CODESTRUCT codes[NCODES]=
{
    {"on",0x750},
    {"off",0xF50},
    {"cup",0x090},
    {"cdn",0x890},
    {"vup",0x490},
    {"vdn",0xC90},
    {"mute",0x290},
};
I wrote a simple program to take the above table and create avr code. You can see the patterns, and the code could have been cleaner had I focused on shortcuts in the patterns, but there is enough program memory and speed to waste the cycles here. Note that with the remote I am using (a DirecTV remote) this code is fast enough to see three patterns per button press. Depending on how you look at it, this is a success or a weakness of this protocol: you get the same pattern over and over. Some protocols toggle a bit to show you a repeated keypress. Other protocols send a single code pattern, then a completely different pattern that means repeat the last code pattern. So it is up to you to figure out how to distinguish a single volume up from two volume up presses. I would probably put a delay after the first successful decode, long enough to miss or ignore the repeats. That way you decode at least one but not more than one per button press on the remote.
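One way the repeat handling could look, sketched in C on the host side rather than in the AVR code. Everything here is an assumption for illustration: the struct, the function name, and the idea that the host timestamps each received code in milliseconds. It is also a slight variation on a plain delay, since it ignores a code for as long as the same code keeps arriving without a gap of at least holdoff_ms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Remembers the last code seen and when it arrived. */
typedef struct {
    int      last_code;      /* -1 until the first code arrives */
    uint32_t last_time_ms;
} repeat_filter;

/* Returns 1 if code should be reported, 0 if it looks like an auto-repeat
   (same code again, less than holdoff_ms after the previous frame). */
int repeat_filter_accept(repeat_filter *f, int code, uint32_t now_ms,
                         uint32_t holdoff_ms)
{
    int is_repeat;

    assert(f != NULL);
    is_repeat = f->last_code == code
             && (uint32_t)(now_ms - f->last_time_ms) < holdoff_ms;

    /* Track every frame so a held button keeps being suppressed. */
    f->last_code = code;
    f->last_time_ms = now_ms;
    return !is_repeat;
}
```

With a holdoff of, say, 100ms, three frames of 0x490 arriving close together would be reported once, and a second press of the same button a little later would be reported again. The holdoff value is a guess and would need tuning against a real remote.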

.device ATmega168
.equ    DDRB    = 0x04
.equ    PORTB   = 0x05

.equ    PORTD   = 0x0B
.equ    DDRD    = 0x0A
.equ    PIND    = 0x09

.equ    TCCR0A  = 0x24
.equ    TCCR0B  = 0x25
.equ    TCNT0   = 0x26

.equ  UDR       = 0xC6
.equ  UBRRH     = 0xC5
.equ  UBRRL     = 0xC4
.equ  UCSRC     = 0xC2
.equ  UCSRB     = 0xC1
.equ  UCSRA     = 0xC0


.equ SETLED     = 0x20
.equ RESETLED   = 0x00

.equ MAX24 = 82
.equ MIN24 = 67

.equ MAX12 = 41
.equ MIN12 = 33

.equ MAX06 = 21
.equ MIN06 = 17


.org 0x0000
    jmp RESET

RESET:

    ldi r16,0x00
    out TCCR0A,r16
    ldi r16,0x04 ; 0x04 divide by 256
    out TCCR0B,r16

    ; led is PB5 output
    ldi R16,0x20
    out DDRB,R16

    ;PD2 is out, vcc for ir module, hi
    ;PD3 is input, vout for ir module, pull up

    ldi R16,0x00
    out PORTD,R16
    ldi R16,0x04
    out DDRD,R16
    ldi R16,0x0C
    out PORTD,R16

    ldi R16,0x20
    ldi R17,0x08


    ldi r18,0x30
    jmp sendbyte

mainloop:
    ;call ResetLed
    ldi r20,RESETLED
    out PORTB,r20

set1:           ;while set
    in r5,PIND
    and r5,r17
    brne set1
    in r8,TCNT0
    ;call SetLed
    ldi r20,SETLED
    out PORTB,r20

reset1:         ;while not set
    in r5,PIND
    and r5,r17
    breq reset1
    in r19,TCNT0
    ;call ResetLed
    ldi r20,RESETLED
    out PORTB,r20

    mov r7,r19
    sub r19,r8
    mov r8,r7



    ;2.4ms sync pattern
    cpi r19,MAX24
    brlo reset1a
    jmp mainloop
reset1a:

    cpi r19,MIN24
    brsh reset1b
    jmp mainloop
reset1b:





    eor r0,r0 ;rcode hi
    eor r1,r1 ;rcode lo
    ldi r20,0x0C
    mov r2,r20 ;count to 12

twelveloop:




twset1:         ;while set
    in r5,PIND
    and r5,r17
    brne twset1
    in r19,TCNT0
    ;call SetLed
    ldi r20,SETLED
    out PORTB,r20


    mov r7,r19
    sub r19,r8
    mov r8,r7



    ;0.6ms spacer
    cpi r19,MAX06
    brlo twset1a
    jmp mainloop
twset1a:

    cpi r19,MIN06
    brsh twset1b
    jmp mainloop
twset1b:



twreset1:       ;while not set
    in r5,PIND
    and r5,r17
    breq twreset1
    in r19,TCNT0
    ;call ResetLed
    ldi r20,RESETLED
    out PORTB,r20

    mov r7,r19
    sub r19,r8
    mov r8,r7

    ;0.6ms or 1.2ms data bit
    cpi r19,MAX12
    brlo twreset1a
    jmp mainloop
twreset1a:

    cpi r19,MIN06
    brsh twreset1b
    jmp mainloop
twreset1b:


    cpi r19,MIN12
    brsh new_one_bit

    cpi r19,MAX06
    brlo new_zero_bit

    jmp mainloop

new_one_bit:
    lsl r1
    rol r0
    inc r1
    jmp twelvebottom

new_zero_bit:
    lsl r1
    rol r0
    jmp twelvebottom

twelvebottom:

    dec r2
    breq new_code
    jmp twelveloop

new_code:







    ;on 0x750
    ldi r20,0x07
    cp r0,r20
    brne not_on
    ldi r20,0x50
    cp r1,r20
    brne not_on
    ldi r18,0x31 ; on
    rjmp sendbyte
not_on:

    ;off 0xF50
    ldi r20,0x0F
    cp r0,r20
    brne not_off
    ldi r20,0x50
    cp r1,r20
    brne not_off
    ldi r18,0x32 ; off
    rjmp sendbyte
not_off:

    ;cup 0x090
    ldi r20,0x00
    cp r0,r20
    brne not_cup
    ldi r20,0x90
    cp r1,r20
    brne not_cup
    ldi r18,0x33 ; cup
    rjmp sendbyte
not_cup:

    ;cdn 0x890
    ldi r20,0x08
    cp r0,r20
    brne not_cdn
    ldi r20,0x90
    cp r1,r20
    brne not_cdn
    ldi r18,0x34 ; cdn
    rjmp sendbyte
not_cdn:

    ;vup 0x490
    ldi r20,0x04
    cp r0,r20
    brne not_vup
    ldi r20,0x90
    cp r1,r20
    brne not_vup
    ldi r18,0x35 ; vup
    rjmp sendbyte
not_vup:

    ;vdn 0xC90
    ldi r20,0x0C
    cp r0,r20
    brne not_vdn
    ldi r20,0x90
    cp r1,r20
    brne not_vdn
    ldi r18,0x36 ; vdn
    rjmp sendbyte
not_vdn:

    ;mute 0x290
    ldi r20,0x02
    cp r0,r20
    brne not_mute
    ldi r20,0x90
    cp r1,r20
    brne not_mute
    ldi r18,0x37 ; mute
    rjmp sendbyte
not_mute:

    ldi r18,0x30 ; UNKNOWN VALID CODE
    rjmp sendbyte

sendbyte:
    lds r20,UCSRA
    andi r20,0x20
    breq sendbyte
    sts UDR,r18

    jmp mainloop


This is basically the blink example with a mixture of C and asm. I needed to apt-get install gcc-avr to get the compiler. I am sure there are other preferred ways to hit the out or sbi/cbi instructions. I prefer to have the assembler do it and prefer not to be tied to header files, linker scripts, etc. Some avr-gcc docs said that parameters are passed starting with r25 and working down; well, this one used r24 (an 8 bit parameter ends up in the low register of the r25:r24 pair). Check test.list before you load the file to make sure. The dummy routine is there because otherwise the optimizer would remove the useless count to N loops. This blinks much faster than blink1.asm only because the count down loops are shorter. The three files below are part of the example. startup.s is both the entry point of the program (address zero in memory) and contains the few asm routines. test.c is the main program. And the last file is the Makefile showing how to build and generate a hex file. You can then use progard above or avrdude or roll your own loader.

startup.s


.global _start
_start:
    rjmp notmain

.global OUT_DDRB
OUT_DDRB:
    out 0x04,r24
    ret

.global OUT_PORTB
OUT_PORTB:
    out 0x05,r24
    ret

.global dummy_ret
dummy_ret:
    ret

test.c

extern void OUT_DDRB ( unsigned char );
extern void OUT_PORTB ( unsigned char );
extern void dummy_ret ( unsigned char );
void notmain  (void )
{
    unsigned short ra;

    OUT_DDRB(0x20);

    while(1)
    {
        ra=0; while(--ra) dummy_ret(ra&0xFF);
        OUT_PORTB(0x20);
        ra=0; while(--ra) dummy_ret(ra&0xFF);
        OUT_PORTB(0x00);
    }
}

Makefile


all : test.hex

startup.o : startup.s
	avr-as startup.s -o startup.o

test.o : test.c
	avr-gcc -O2 -c test.c -o test.o

test.elf : startup.o test.o
	avr-ld -Ttext 0 -nostdlib -nostartfiles  startup.o test.o -o test.elf

test.hex : test.elf
	avr-objdump -D test.elf > test.list
	avr-objcopy -O ihex test.elf test.hex