What is the meaning of the data32 data32 nopw %cs:0x0(%rax,%rax,1) instruction in gcc inline asm?

March 15, 2015

While running some tests with the -O2 optimization of the gcc compiler, I observed the following instruction in the disassembled code of a function:

data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)

What does this instruction do?

In more detail: I was trying to understand how the compiler optimizes useless recursion like the code below with -O2 optimization:

    int foo(void)
    {
        return foo();
    }

    int main(void)
    {
        return foo();
    }

The above code causes a stack overflow when compiled without optimization, but works when compiled with -O2 (it loops forever instead of crashing).

I think that with -O2 the compiler completely removed the pushing onto the stack for the function foo, but why is the data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1) needed?

    0000000000400480 <foo>:
    foo():
      400480:       eb fe                   jmp    400480 <foo>
      400482:       66 66 66 66 66 2e 0f    data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
      400489:       1f 84 00 00 00 00 00

    0000000000400490 <main>:
    main():
      400490:       eb fe                   jmp    400490 <main>
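The jmp 400480 <foo> in the listing is what a tail call looks like once it has been turned into a jump. Since foo() above can never return, a bounded analogue is easier to experiment with. The sketch below is only an illustration: sum_to_recursive and sum_to_loop are hypothetical names, and the loop version shows roughly what an optimizer may produce from the recursive one, not what gcc is guaranteed to emit.

```c
/* Hypothetical example: a tail-recursive sum of 1..n.
   The recursive call is the last thing the function does, so nothing
   in the current stack frame has to survive it. */
unsigned long sum_to_recursive(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to_recursive(n - 1, acc + n);   /* tail call */
}

/* The same computation with the tail call replaced by a jump back to
   the top of the function. No new stack frame is created per step. */
unsigned long sum_to_loop(unsigned long n, unsigned long acc)
{
    for (;;) {
        if (n == 0)
            return acc;
        acc += n;
        n -= 1;
    }
}
```

Both functions return the same value for the same arguments, for example 55 for n = 10 and acc = 0. If the exit condition is removed, only the jump remains, which is the shape of foo in the listing.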

What you see is padding that the toolchain emits with the CPU's instruction fetch and pipeline in mind.

Although it is an empty loop, gcc tries to optimize this as well :-).

The CPU you are running on has a superscalar architecture. This means it has a pipeline in it, and different phases of the execution of consecutive instructions happen in parallel. For example, if there is a

    mov eax, ebx ;(#1)
    mov ecx, edx ;(#2)

then the loading and decoding of instruction #2 can happen while #1 is being executed.
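The gain from this overlap can be put into numbers with a toy model. The sketch below assumes an idealized pipeline in which every instruction passes through the same number of one-cycle stages; the function names are hypothetical and the model does not describe any real x86 core.

```c
/* Idealized model: every instruction passes through `stages` stages,
   and each stage takes exactly one cycle. */

/* Without overlap, an instruction must finish all its stages before
   the next one may start. */
unsigned long cycles_sequential(unsigned long instructions,
                                unsigned long stages)
{
    return instructions * stages;
}

/* With a full pipeline, a new instruction enters every cycle. The first
   one needs `stages` cycles, each further one completes one cycle later. */
unsigned long cycles_pipelined(unsigned long instructions,
                               unsigned long stages)
{
    if (instructions == 0)
        return 0;
    return stages + (instructions - 1);
}
```

For the two mov instructions and an assumed three stages (load, decode, execute), the model gives 6 cycles without overlap and 4 cycles with it.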

Pipelining has major problems to solve in the case of branches, even if they are unconditional.

For example, while the jmp is being decoded, the next instruction has already been prefetched into the pipeline. But the jmp changes the location of the next instruction. In such cases, the pipeline needs to be emptied and refilled, and a lot of valuable CPU cycles are lost.
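The cost of emptying and refilling can be sketched with a toy model as well. The function below is a hypothetical illustration: it assumes one-cycle stages and a fixed number of lost cycles per taken jump. Real CPUs predict jumps, so the actual penalty is often smaller, and the numbers here are assumptions rather than measurements.

```c
/* Idealized model: a pipeline with `stages` one-cycle stages completes
   one instruction per cycle once it is full. Every taken jump throws
   away `penalty` cycles of prefetched work. */
unsigned long cycles_with_flushes(unsigned long instructions,
                                  unsigned long stages,
                                  unsigned long taken_jumps,
                                  unsigned long penalty)
{
    if (instructions == 0)
        return 0;
    return stages + (instructions - 1) + taken_jumps * penalty;
}
```

With 10 instructions, 5 stages, and an assumed penalty of 3 cycles, two taken jumps raise the total from 14 to 20 cycles in this model.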

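The no-op after the jmp is never executed, because the jmp at 400480 jumps to itself. It is filler: the jmp is 2 bytes long, so foo ends at 400482, and the 14-byte no-op pads the gap so that main starts at 400490, a multiple of 16. An aligned function entry is cheaper for the CPU to fetch and decode, and the gap is filled with one long no-op rather than many short ones so that it costs as little as possible if it is ever fetched. The data32 prefixes and the %cs: override do nothing except make that single instruction longer.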
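The addresses in the listing allow a quick check of this: the no-op spans exactly the bytes between the end of foo's jmp and the start of main. The sketch below assumes a 16-byte function alignment, and padding_to_boundary is a hypothetical helper, not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Number of filler bytes needed so that the next item starts on a
   multiple of `alignment`. `alignment` must be a power of two. */
uint64_t padding_to_boundary(uint64_t address, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Offset of `address` inside its aligned block, then the distance
       to the next boundary; an already aligned address needs 0 bytes. */
    return (alignment - (address & (alignment - 1))) & (alignment - 1);
}
```

Calling padding_to_boundary(0x400482, 16) gives 14, which matches the 14 encoded bytes of the no-op in the listing (66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00), and padding_to_boundary(0x400490, 16) gives 0, since main is already on a boundary.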

Earlier DEC Alphas could even segfault on such things, and empty loops had to have a lot of no-ops in them. An x86 would only be slower, because it must stay compatible with the Intel 8086.

Here you can read a lot about the handling of branching instructions in pipelines.
