Ready, Set, Oscillate! The Fastest Way to Change Arduino Pins

Posted in Tutorials by Bill
18 Aug 2010

There are many ways to change an output pin. The way we know and love is the famous digitalWrite() function. (Spoiler: Want a faster digitalWrite? Download Here!)

But even the Arduino Reference says that it is not the most efficient. The Arduino functions do a lot of error checking to make sure the pin is configured right, and they have to map Arduino pin numbers to actual IO ports. All this costs processor cycles, and time. But how much? This article is not meant to teach you how to use IO registers; you can read about that on the Arduino Port Manipulation page. It covers exactly how inefficient the Arduino functions are.

I ran some tests to find out. The test platform was a 16 MHz Arduino and a very nice oscilloscope watching one of its output pins. I ran two tests: one setting the pin to known values (on or off), and one that ‘flips’ the pin from its previous state. My code was inside a never-ending for loop, so the result would always be a square wave whose frequency I could measure.

The estimated number of CPU cycles is half the measured waveform period divided by the period of the 16 MHz clock (62.5 ns), since it takes two write operations to complete one full period of the waveform. Setting a bit may take a different number of machine instructions than clearing one, so this is somewhat rough.
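The estimate described above can be written out as a small calculation. This is only a sketch: the function name `cycles_per_write` and the constant `CPU_HZ` are made up for illustration, and the frequencies in the usage note are arithmetic examples, not measurements from the tests.

```c
#include <stdint.h>

/* Clock frequency of the test board (16 MHz). The name is illustrative. */
#define CPU_HZ 16000000.0

/*
 * Estimate the CPU cycles spent per pin write from the frequency of the
 * square wave seen on the scope. One full period contains two writes
 * (one high, one low), so:
 *   cycles = (period / 2) / (1 / CPU_HZ) = CPU_HZ / (2 * measured_hz)
 */
double cycles_per_write(double measured_hz)
{
    return CPU_HZ / (2.0 * measured_hz);
}
```

For example, `cycles_per_write(4000000.0)` gives 2.0, so a 4 MHz square wave would correspond to 2 cycles per write, and `cycles_per_write(125000.0)` gives 64.0.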

The 3 methods I tested were:

  • digitalWrite(pin, LOW); digitalWrite(pin, HIGH);
  • CLR(PORTB, 0); SET(PORTB, 0);
  • PORTB |= _BV(0); PORTB &= ~(_BV(0));

The macros used:

#define CLR(x,y) ((x) &= ~(1 << (y)))

#define SET(x,y) ((x) |= (1 << (y)))

#define _BV(bit) (1 << (bit))
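To see what these macros do without any hardware, here is a minimal sketch that applies them to an ordinary 8-bit variable standing in for PORTB. The helper names `set_bit` and `clear_bit` are hypothetical, and the macro arguments are parenthesized as a precaution. Each macro is a complete assignment on its own, so it is used as a statement such as `SET(PORTB, 0);` rather than as `PORTB = SET(PORTB, 0);`.

```c
#include <stdint.h>

/* Same operations as the article's macros, with arguments parenthesized. */
#define CLR(x, y) ((x) &= ~(1 << (y)))
#define SET(x, y) ((x) |= (1 << (y)))

/* Set one bit of a register value; reg is a stand-in for PORTB. */
uint8_t set_bit(uint8_t reg, uint8_t bit)
{
    SET(reg, bit);   /* the macro already performs the assignment */
    return reg;
}

/* Clear one bit of a register value. */
uint8_t clear_bit(uint8_t reg, uint8_t bit)
{
    CLR(reg, bit);
    return reg;
}
```

With these helpers, `set_bit(0x00, 0)` returns 0x01 and `clear_bit(0xFF, 0)` returns 0xFE; all other bits are left alone.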

The results

As you can see, digitalWrite takes around 56 cycles to complete, while direct port addressing takes 2 cycles. That’s a big difference in time for programs that have lots of IO operations!

Next I tested just flipping a pin. By this I mean changing the pin state without knowing or testing what the initial state was. The methods I tested were:

  • digitalWrite(pin, !digitalRead(pin))
  • PORTB ^= (_BV(0))
  • sbi(PINB,0)

#define sbi(port,bit) (port)|=(1<<(bit))

The results

Wow, the Arduino method takes a whopping 121 cycles to flip a pin! The sbi() macro using the PIN register is a neat trick for what is usually a read-only register, and is the fastest at only 2 cycles.
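The two direct toggling methods can be modeled on a plain variable. This is a sketch, not code for the board: `toggle_with_xor` mirrors `PORTB ^= _BV(0)`, and `model_pin_write` is a hypothetical model of the hardware behavior on the ATmega48/88/168/328 family, where writing a 1 to a bit of the PIN register toggles the matching PORT bit. Older AVR parts may not behave this way, so check the datasheet for your chip.

```c
#include <stdint.h>

/* Software toggle: read the port value, XOR one bit, write it back. */
uint8_t toggle_with_xor(uint8_t port, uint8_t bit)
{
    return (uint8_t)(port ^ (1u << bit));
}

/*
 * Model of a write to PINx (an assumption about the hardware, see above):
 * every 1 bit in `value` flips the corresponding bit of PORTx, and every
 * 0 bit leaves it unchanged.
 */
uint8_t model_pin_write(uint8_t port, uint8_t value)
{
    return (uint8_t)(port ^ value);
}
```

Both give the same end state, for example `toggle_with_xor(0x05, 0)` and `model_pin_write(0x05, 0x01)` each return 0x04. The difference on real hardware is that the XOR version is a read-modify-write in software, while the PIN write can be a single instruction. Note that the `|=` form in sbi() depends on the compiler emitting a single-bit instruction; writing only the mask, as in `PINB = _BV(0);`, is often suggested as the safer spelling.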

So you see, the Arduino functions take much, MUCH longer to complete pin operations than direct port IO. But there is a reason why: Arduino does a lot of error checking, and has to look up the mapping from pin numbers to actual ATmega pins. Direct port access is not for the faint of heart, but it can be much faster when you are ready to take off some of the Arduino training wheels.

Credit to Webbot for showing us the PIN register trick.

UPDATE: The attention this article received has made me aware of a neat-o library for Arduino that keeps the code simple but runs just as fast as direct port manipulation. Arduino Forum post here. I just tested the digitalWriteFast2() function, and it also seems to require only 2 cycles to complete.


Trackbacks / Pingbacks

  1. Todays Arduino Moment - Hack a Day

42 Comments

    • OddBot says:

      Occasionally I find the need to speed up a program and I often use similar test methods to test which method is quicker.

      For those who do not have an oscilloscope or frequency counter, I suggest putting the function to be tested in a loop of, say, 1000 iterations. Use the millis() function to measure the time before and after.

      This method will not tell you exactly how many cycles the processor takes but it will tell you which method is quicker and by roughly how much.

    • Bill says:

      Very true, OddBot. I noticed people have posted results using similar methods of recording time, but none of them take into account how much time a call to millis() itself takes. I wanted hard numbers on how long the pin changes took.

      But the method does work for comparing sections of code to see which one is better. The time is relative in that case.

    • John says:

      Please also let us know the version of the IDE you used for this test – I understand that the latest version is much faster.

    • […] for your convenience. The first tip we received was for some hints provided by [Bill] on some digitalWrite() alternatives. Similar to some previous research we covered, this tip also includes some tips on how to make the […]

    • Michael says:

      It seems like you are comparing apples to oranges when you compare the performance of PORTB = CLR(PORTB, 0) ; with the performance of PORTB |= _BV(0); The first clears a bit (sets the pin LOW), and the second sets a bit (sets the pin HIGH). Of course clearing a bit using the CLR operation is slower because it involves two bitwise operations (& and ~). Setting a bit is faster because it involves only one bitwise operation (|). You should compare
      PORTB = SET(PORTB, 0);
      with
      PORTB |= _BV(0);

      and compare
      PORTB = CLR(PORTB, 0) ;
      with
      PORTB &= ~(_BV(0));

      I doubt you will find a difference. The C preprocessor will replace the macros with equivalent C code which should result in the same assembly.

    • NsN says:

      […]but they all don’t take into account how much time a call to millis() itself takes. […]
      There is an easy way around that: do one loop with 1000 iterations and measure the time, then do another one with 2000 iterations and measure the time. The difference between the two times should be the time 1000 iterations take to execute. The more iterations you do, the more exact your measurements should become.

      Of course, if you have a nice scope, that is probably easier…
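The differencing idea in this comment comes down to one line of arithmetic. The sketch below assumes the two loop times were captured in microseconds (on an Arduino, for instance, with micros() before and after each loop); the function name `per_iteration_us` and the numbers in the usage note are invented for illustration.

```c
#include <stdint.h>

/*
 * Time per loop iteration from two timed runs of different lengths.
 * Any fixed overhead (the timer calls, loop setup) appears in both
 * totals and cancels out in the subtraction.
 */
double per_iteration_us(uint32_t t_short_us, uint32_t n_short,
                        uint32_t t_long_us, uint32_t n_long)
{
    return (double)(t_long_us - t_short_us) / (double)(n_long - n_short);
}
```

If, hypothetically, 1000 iterations took 3508 µs and 2000 iterations took 7008 µs, then `per_iteration_us(3508, 1000, 7008, 2000)` gives 3.5 µs per iteration, with the 8 µs of fixed overhead removed. The result still includes the loop's own increment and branch, which is the issue raised in the next comment.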

    • Charlie says:

      Did you account for the jump back to the start of the loop?
      Would a bunch of unrolled
      PORTB |= _BV(0); PORTB &= ~(_BV(0));
      PORTB |= _BV(0); PORTB &= ~(_BV(0));
      PORTB |= _BV(0); PORTB &= ~(_BV(0));
      …..
      give a different frequency? I’d try it if I had a scope….

    • Bill says:

      Thanks for the comments, guys. I have revisited the tests and eliminated the delay caused by the for loop. Results have been updated. I also found I was using the SET() and CLR() macros wrong.

    • P18F4550 says:

      I tried some similar tests also; here are the results:
      http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1279055639/10

    • KI4MCW says:

      Another =horrible= time waster is the good old MOD operator. In some AVR-GCC code on an ATmega328, I found that this:

      // z = remainder (MOD) of x/y
      z = x ;
      while ( z >= y ) { z -= y ; }

      …was roughly 100 times faster (for my purposes) than…

      z = x % y ;

      On a quad-core monster machine you might never notice a difference, but on an 8-bit MCU working with 16-bit or 32-bit unsigned integers, the difference is huge.
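For reference, the two approaches from this comment can be put side by side as functions. This is a sketch with hypothetical names (`mod_by_subtraction`, `mod_by_operator`) and 16-bit unsigned operands chosen for illustration. The speedup is the commenter's observation "for my purposes": the AVR core has no hardware divide instruction, so `%` becomes a library routine, but the subtraction loop runs x / y times and only wins when that quotient is small.

```c
#include <stdint.h>

/*
 * Remainder by repeated subtraction.
 * y must be nonzero, otherwise the loop never ends.
 * The loop body runs x / y times, so this is only cheap when x is
 * not much larger than y.
 */
uint16_t mod_by_subtraction(uint16_t x, uint16_t y)
{
    uint16_t z = x;
    while (z >= y) {
        z -= y;
    }
    return z;
}

/* The same result using the built-in operator, for comparison. */
uint16_t mod_by_operator(uint16_t x, uint16_t y)
{
    return x % y;
}
```

Both return the same value, for example 2 for x = 17 and y = 5. A typical good fit is wrapping a counter that exceeds its limit by at most one or two multiples.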

    • Paul says:

      Nearly 1 year ago, I wrote an optimized digitalWrite() that compiles to only a single 2-cycle instruction when the inputs are compile-time constants. I’ve been shipping it as part of Teensyduino since November (Teensyduino also has non-const optimizations). The const-case single instruction code has also been sitting in Arduino issue #140, maybe someday to become part of Arduino proper.

    • Bill says:

      Sounds like the same approach used in the library I posted in my update. Can you confirm?

    • You may be interested in an Arduino runtime library replacement I have been developing over the last few months, which addresses this issue, and many others too, such as the ability to send serial data asynchronously.

      Full source & tutorials are available at: http://www.makehackvoid.com/group-projects/mhvlib-efficiency-oriented-library-avr-microcontrollers

    • You might also want to take account of the following from the avr-gcc FAQ:

      http://www.nongnu.org/avr-libc/user-manual/FAQ.html#faq_intpromote

      ———-
      Why does the compiler compile an 8-bit operation that uses bitwise operators into a 16-bit operation in assembly?

      Bitwise operations in Standard C will automatically promote their operands to an int, which is (by default) 16 bits in avr-gcc.

      To work around this use typecasts on the operands, including literals, to declare that the values are to be 8 bit operands.

      This may be especially important when clearing a bit:

      var &= ~mask; /* wrong way! */

      The bitwise “not” operator (~) will also promote the value in mask to an int. To keep it an 8-bit value, typecast before the “not” operator:

      var &= (unsigned char)~mask;
      ———

      You’d have to look at the assembler output to check if the char -> int promotion was happening. And in fact you don’t need a scope to figure out how fast you can do pin IO, just disassemble the code 😉
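The promotion described in the FAQ excerpt can be observed on any C compiler, not just avr-gcc. The sketch below uses hypothetical function names; only the width of int differs between platforms (16 bits by default on avr-gcc, usually 32 on a desktop), and the examples are chosen so the results are the same either way.

```c
#include <stdint.h>

/* ~ promotes its operand to int first, so the result is an int. */
int promoted_not(uint8_t mask)
{
    return ~mask;            /* for mask = 0x01 this is -2, not 0xFE */
}

/* Casting the result back keeps the value in 8 bits. */
uint8_t narrow_not(uint8_t mask)
{
    return (uint8_t)~mask;   /* for mask = 0x01 this is 0xFE */
}

/* Clearing bits the way the FAQ recommends. */
uint8_t clear_bits(uint8_t var, uint8_t mask)
{
    var &= (uint8_t)~mask;
    return var;
}
```

Here `promoted_not(0x01)` is -2 while `narrow_not(0x01)` is 0xFE, and `clear_bits(0xFF, 0x01)` returns 0xFE. The uncast `var &= ~mask;` produces the same stored value; the FAQ's concern is the extra 16-bit work the compiler may generate, which only the assembler output will confirm.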
