Intel's Atom Architecture: The Journey Begins
by Anand Lal Shimpi on April 2, 2008 12:05 AM EST - Posted in CPUs
Instructions Gone Wild: Safe Instruction Recognition
The biggest fear with conventional in-order architectures is what happens if you have a high latency instruction that needs a piece of data that isn't available in the caches.
Since in-order microprocessors have to execute the instructions in order, the execution units remain idle until the CPU is able to retrieve the data it needs from main memory - a process that could easily take over a hundred clock cycles. The problem is that during these clock cycles, power is expended but no work is getting done - which is the exact opposite of what we want in an ultra low power microprocessor.
Out-of-order processors get around this problem by scheduling around the dependent instruction. The scheduler simply selects the next instruction that is ready for execution, and work progresses while the data-dependent instruction waits for its data from main memory. We've already established that a full OoOE core would be too power hungry for Atom, but relying on a pure in-order design also has the potential to be inefficient. Intel's Austin team found a clever middle ground for Atom.
It's called the Safe Instruction Recognition (SIR) algorithm, and it works like this. If Atom is executing a long latency floating point operation followed by a short latency integer op, you would traditionally stall until the FP op is complete (as we described above). The SIR algorithm looks at the two instructions and determines whether or not there are any data dependencies between the two (e.g. C = A + B followed by D = C + F, where the second operation needs the result of the first). If there aren't, Atom will allow the "younger", shorter latency operation to proceed ahead of the longer FP operation.
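To make the check concrete, here is a minimal Python sketch of the decision as described above. It is an illustration only, not Intel's implementation: the `Instr` type, the register names, and the latency numbers are all hypothetical, and the hardware's actual checks aren't spelled out here. The article's example is a read-after-write dependency; the write-after-write and write-after-read checks are an added assumption about what a conservative dependency test would include.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record, invented for this illustration."""
    name: str
    dest: str        # register written
    srcs: tuple      # registers read
    latency: int     # cycles to complete (illustrative values only)


def has_dependency(older: Instr, younger: Instr) -> bool:
    """Return True if the younger op must wait for the older one."""
    # Read-after-write: younger reads what older writes (the C = A + B,
    # D = C + F case from the text).
    raw = older.dest in younger.srcs
    # Assumption beyond the article: also treat name conflicts as unsafe.
    waw = younger.dest == older.dest
    war = younger.dest in older.srcs
    return raw or waw or war


def can_proceed_ahead(older: Instr, younger: Instr) -> bool:
    """A shorter-latency younger op may pass the older op only if independent."""
    return younger.latency < older.latency and not has_dependency(older, younger)


# Long latency FP op: C = A + B
fp_add = Instr("fadd", dest="C", srcs=("A", "B"), latency=5)
# Dependent integer op: D = C + F (needs C, so it has to wait)
dependent = Instr("add", dest="D", srcs=("C", "F"), latency=1)
# Independent integer op: G = E + F (touches nothing the FP op uses as output)
independent = Instr("add", dest="G", srcs=("E", "F"), latency=1)

print(can_proceed_ahead(fp_add, dependent))    # False
print(can_proceed_ahead(fp_add, independent))  # True
```

The point of the sketch is how narrow the check is: it only ever compares one older, long latency op against the op right behind it, which is a far cry from the full scheduler an out-of-order core would need.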
SIR addresses a very specific case, but it sprinkles a little bit of out-of-order goodness into Atom's otherwise very strict in-order design. I wouldn't be too surprised if future iterations of Atom expand the situations in which these sorts of out-of-order tricks are allowed.
46 Comments
lopri - Thursday, April 3, 2008
This article is as much propaganda-ish as it is technical. Did you read the last page of the article?
clnee55 - Friday, April 4, 2008
Since Anand wrote this article, I'll let him answer your accusation.
GulWestfale - Wednesday, April 2, 2008
i believe that the graphics core in the chipset is a powerVR gen5 derivative; intel already uses some of their tech in its existing mainboards and wikipedia states that intel has licensed gen5 tech for one of its chipsets, the GMA500 (which is the same as poulsbo?) gen5 is also DX10-capable, which matches the info in your article.
http://en.wikipedia.org/wiki/PowerVR#Series_5_.28S...
yyrkoon - Wednesday, April 2, 2008
and wikipedia has been known to be wrong . . . a lot lately it seems.
My point here *is*, I would probably trust anandtech more than wikipedia nowadays, as it seems any Joe can put up a 'reference' without citation.
jones377 - Wednesday, April 2, 2008
Following the references link from the Wiki article...
http://www.imgtec.com/News/Release/index.asp?NewsI...
Poulsbo uses a PowerVR 3D core
Anand Lal Shimpi - Wednesday, April 2, 2008
Yep, you guys are correct, I wasn't aware that it was public yet :) I've updated the article.
Take care,
Anand