Talk:PS2 Emulation: Difference between revisions

From PS4 Developer wiki

Revision as of 14:39, 5 January 2023

TODO: Please remove unneeded uppercase letters not at the start of sentences.

  • This Is Not Elon Musk Here :P - Roxanne

Regs

The VF regs you (Scalerize) described are VU0/COP2 only. Right after the VF regs you can find the VI regs (210+). There are only 32 VI regs x 32 bit (vi00 to vi15, plus 16 control/special regs). Edit: they are mapped at 0x10 intervals, though. You can find a similar array of regs for VU1 at 1040000000 or 1050000000; I don't know exactly where. This is a virtual mapping and I don't own a PS4 to really test it. --Kozarovv (talk) 16:54, 2 January 2023 (UTC)

Will work on this stuff when I get the time! Thank you so much! -- Scalerize
Edit: I do not know what the registers for VU1 are, since PCSX2 does not use them, so here are dumps that I hope will help you figure it out! It's a dump from Rayman M during the language select screen. Why Rayman M? Because the values in the registers do not change at this screen, so it's the same values for both of us!

SLES_504.57_7z

  • Thanks. PCSX2 does use the VU1 regs; you just can't see them in the debugger because for VU1 that would be pointless. :) From your dumps:
    • 1040000000: VU1 regs, mapped like on VU0.
    • 1050000000: VU1 micro data memory (1100C000 on a real PS2 and in the PCSX2 debugger), size 0x4000.
    • 1050004000: VU1 micro data memory mirror (1100C000 on a real PS2 and in the PCSX2 debugger), size 0x4000. Likely mirrored 2 more times, at 8000 and C000.
    • 104000C000: the emulator places here VU1 constants used in popular operations: EATAN/EEXP constants, masks for clamping, etc. A similar array can be found in PCSX2 (mVU_Globals), DobieStation (atan_const, etc.), and Play! (GenerateEATAN, etc.).
    • 1030004000: the emulator places here VU0 constants used in popular operations. Same as above (VU0 doesn't have an EFU, so placing EFU constants for EATAN/EEXP there is pointless, but there they are).

--Kozarovv (talk) 09:37, 5 January 2023 (UTC)

Misc info

Some data that eventually needs to be posted on the main emulation page. All data posted here was obtained from the Jak TPL (so-called v1) emulator. All of it is confirmed in the code itself, no guessing (unless said otherwise). Time to start releasing that old work to the public.

Misc misc info

  • Both settings do the same thing:
--external-hdd-fix
--cdvd-determinism
--ee-kernel-hle
--ee-injection-kernel
  • Setting takes an unused value:
--ee-cache-breaks-block
No matter which value is used, 1 is set.

Few popular misunderstandings

  • vu-xgkick-delay takes an integer between 0 and 31 (confirmed on both the emu and the compiler side), not a float (0.5 is invalid and will probably be truncated to 0).
  • COP2 rounding in PCSX2 is governed by the "EE/FPU" rounding setting, not by VU or VU0.
  • COP2 clamping is hardcoded in PCSX2 as far as I know; if not, it is likely also governed by the EE/FPU setting, not VU/VU0.
  • The xx-no-clamping setting is not really the "no clamping" known from PCSX2. It is a special mode which can be used regardless of other clamp commands. For comparison, PCSX2 has a similar mode only for the FPU (Full); to fully mimic that mode we still need fpu-to-double enabled.

ee-native-function

The emulator has a set of predefined functions used in popular PS2 SDK libraries. Those functions are highly optimized to run natively on x64.
--ee-native-function=name,address: under the hood this hooks the selected address and replaces it with a jump to the predefined function. Functions available in the Jak TPL emu:

memset      | fptoui           | ieee754_sinf
memcpy      | fptodp           | ieee754_cosf
strlen      | dptofp           | ieee754_sqrtf
strcmp      | fabs             | asinf
strcasecmp  | fabsf            | acosf
litodp      | ieee754_atan2f   | sinf
dptoli      | ieee754_asinf    | cosf
floatdidf   | ieee754_acosf    | sqrtf

This drastically reduces the emitted code size for the selected function. Additionally, there is no need to recompile it at all; the emulator just emits a jump to a label, and that's all. The emulator also advances the delta clock to compensate for the cycles which would normally be taken by the original function.
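
To make the hooking idea a bit more concrete, here is a toy Python sketch of the mechanism described above. It is not the emulator's code: the class ToyRecompiler, the table NATIVE_TABLE, the cycle costs, the example address 0x2F36A0, and the assumption that the address is written in hex are all invented for illustration. It only shows the general shape: parse name,address, remember the hook, and when the guest reaches that address run the host function and credit some cycles to the delta clock.

```python
# Toy model of --ee-native-function=name,address (illustrative only).
# name -> (host implementation, cycles credited to the guest clock).
# The cycle counts below are made up; the real values are not known here.
NATIVE_TABLE = {
    "floatdidf": (lambda x: float(x), 120),
    "fabs": (lambda x: abs(x), 10),
    "strlen": (lambda s: len(s), 40),
}


class ToyRecompiler:
    def __init__(self):
        self.hooks = {}        # guest address -> native function name
        self.delta_clock = 0   # cycles credited for skipped guest code

    def add_native_function(self, option_value):
        """Register a hook from a 'name,address' string."""
        name, address_text = option_value.split(",", 1)
        name = name.strip()
        if name not in NATIVE_TABLE:
            raise ValueError("unknown native function: " + name)
        # Assumption: the address may be given as hex (0x...) or decimal.
        address = int(address_text.strip(), 0)
        self.hooks[address] = name
        return address

    def call(self, address, *args):
        """Simulate the guest jumping to 'address'."""
        name = self.hooks.get(address)
        if name is None:
            # A real emulator would recompile the guest code here instead.
            raise LookupError("address is not hooked")
        func, cycles = NATIVE_TABLE[name]
        self.delta_clock += cycles  # compensate for the skipped function
        return func(*args)
```

With this toy model, registering "floatdidf,0x2F36A0" and then calling address 0x2F36A0 runs the host version directly and advances delta_clock by the (invented) cost, without touching any guest instructions.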

Example ee_native_floatdidf

vcvtsi2sd    xmm0, xmm0, rdi
vmovq        rax, xmm0
retn

This is what the real floatdidf originally looks like in PS2 MIPS. You can imagine that the recompiled x64 code will be much longer, since every single instruction is translated/recompiled separately.

addiu        $sp, -0x30
sd           $s0, 0x20+saved_s0($sp)
move         $s0, $a0
sd           $s1, 0x20+saved_s1($sp)
li           $s1, 0x81E0
dsll32       $s1, 15
dsra32       $a0, $s0, 0
sd           $ra, 0x20+saved_ra($sp)
jal          litodp
nop
move         $a1, $s1
jal          dpmul
move         $a0, $v0
move         $a1, $s1
jal          dpmul
move         $a0, $v0
move         $s1, $v0
lui          $v0, 0xFFFF
dsrl32       $v0, 0
and          $s0, $v0
dsll32       $s0, 0
dsra32       $s0, 0
jal          litodp
move         $a0, $s0
bgez         $s0, loc_2F3734
move         $a0, $s1
li           $a1, 0x83E0
dsll32       $a1, 15
jal          dpadd
move         $a0, $v0
move         $a0, $s1
jal          dpadd
move         $a1, $v0
ld           $ra, 0x20+saved_ra($sp)
ld           $s1, 0x20+saved_s1($sp)
ld           $s0, 0x20+saved_s0($sp)
jr           $ra
addiu        $sp, 0x30

This is a corner-case example, as floatdidf converts a 64-bit signed integer to an IEEE double, and PS2 developers generally had no reason to use doubles (the FPU/VU operate on 32-bit floats). But you can see that the whole conversion is practically done in 1 opcode, while the PS2 takes a massive function to do this. Other functions are usually less optimized, but still really worth it.
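
For readers who don't want to trace the MIPS listing by hand, below is a rough Python reading of what it appears to do. This is my interpretation of the listing, not code taken from the emulator or the SDK. The helper names (to_s32, soft_floatdidf) are made up, and the host's float arithmetic stands in for litodp/dpmul/dpadd, so soft-float rounding details on a real PS2 may differ. The two constants are built exactly as in the listing: 0x81E0 shifted left by 47 bits is the bit pattern of 65536.0, and 0x83E0 shifted left by 47 bits is the bit pattern of 4294967296.0.

```python
import struct

# Constants as built in the listing: "li reg, imm" then "dsll32 reg, 15"
# is a shift left by 32 + 15 = 47 bits.
TWO_16 = struct.unpack(">d", (0x81E0 << 47).to_bytes(8, "big"))[0]
TWO_32 = struct.unpack(">d", (0x83E0 << 47).to_bytes(8, "big"))[0]


def to_s32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def soft_floatdidf(x):
    """Convert a signed 64-bit integer to a double, the long way."""
    bits = x & 0xFFFFFFFFFFFFFFFF
    hi = to_s32(bits >> 32)   # dsra32 $a0, $s0, 0
    lo = to_s32(bits)         # and / dsll32 / dsra32 on $s0
    # litodp(hi), then two dpmul by 65536.0: hi * 2**32
    high = float(hi) * TWO_16 * TWO_16
    # litodp(lo) treats lo as signed, so a negative lo gets 2**32 added
    # back (the path taken when bgez does not branch)
    low = float(lo)
    if lo < 0:
        low = low + TWO_32
    return high + low         # final dpadd
```

On the host this gives the same result as float(x) for any value in the signed 64-bit range, which is exactly what the single vcvtsi2sd in ee_native_floatdidf computes.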