dzaima

joined 1 year ago
[–] dzaima@discuss.tchncs.de 2 points 2 months ago* (last edited 2 months ago)

There appears to be dedicated silicon for e.g. ADD AH, BL, see uops.info showing it having 1 uop across multiple microarchitectures (e.g. 1*p0156 being notation that it takes one uop on any port between 0/1/5/6, i.e. theoretical throughput of 4 instrs/cycle; I think the displayed 0.4 is just an artifact of it only testing 3 different destination registers despite there being a dependency on it). The newer Alder Lake actually has less throughput, but still takes only one uop.