It is not fucking true.
The days of being able to study up on Abrash, take a look at instruction timings, and unroll loops by hand have been gone for over a decade. For fuck’s sake, even when you write in asm you’re not really writing asm. You’re writing in an asm layer that the CPU decodes into micro-ops.
Micro-ops that you don’t see.
Unless you know the current CPU’s architecture by rote, know how out-of-order execution works, what the branch prediction algorithm is, what a failed prediction costs, how the caches work, how much a cache miss costs, and about 18 gazillion other things, the best, THE BEST, you can possibly hope to accomplish with your “hand rolled asm” is to not make it slower than what your compiler’s optimizer did.
Unless you can write an optimizer, you can’t write faster asm by hand.
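
If you want to check this yourself instead of taking my word for it, here is a minimal sketch in C. The function names `sum_plain` and `sum_unrolled` are made up for illustration, and the four-way unroll is just one example of the old-school trick, not anyone's canonical version. Both functions compute the same sum; one leaves the loop alone, the other unrolls it by hand.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain loop: unrolling and vectorization are left to the optimizer. */
uint64_t sum_plain(const uint32_t *a, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hypothetical hand-unrolled version: four accumulators, four elements
   per iteration, then a cleanup loop for whatever is left over. */
uint64_t sum_unrolled(const uint32_t *a, size_t n)
{
    uint64_t t0 = 0, t1 = 0, t2 = 0, t3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        t0 += a[i];
        t1 += a[i + 1];
        t2 += a[i + 2];
        t3 += a[i + 3];
    }
    for (; i < n; i++)   /* leftover elements when n is not a multiple of 4 */
        t0 += a[i];
    return t0 + t1 + t2 + t3;
}
```

Build both with optimization turned on (for example `-O2` or `-O3` on GCC or Clang), then look at the generated assembly and time them on your own machine. Current compilers will typically unroll the plain loop, and often vectorize it, without your help. I'm not quoting numbers because they depend entirely on your CPU and compiler, which is rather the point.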