OK, so I still have nothing to show for it, but I've finally had a minor breakthrough: I've found where (some of?) the code that gets loaded into RAM lives on the disk. I got lucky here; the code could have been compressed or encrypted in some way, with the game only translating it into native code at runtime. That would have made things substantially more difficult, since I wouldn't even have known what to look for in the ROM. Fortunately, the code seems to exist in pretty much the same form on disk as it does in RAM.
I'm still not sure how much of the code is actually copied into RAM at a given time, but this should be fairly inconsequential as long as I'm careful not to change the way the game loads the code.
I'm currently trying to modify the game to prevent Klonoa from ever losing lives, just because it should be relatively easy. I'm not quite there yet (my version crashes as it tries to start the first level), but I at least know roughly where the code that controls lives is, and what assembly code I need to replace it with to make this work.
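To make the idea of "replacing" code concrete, here is a minimal sketch of how a patch like this could be applied to a copy of the image. Everything specific in it is an assumption for illustration: the offset, the toy image contents, and the instruction being replaced (a MIPS `addiu $v0, $v0, -1`, standing in for "decrement the lives counter") are all hypothetical and are not the actual code from the game. The one fixed fact it relies on is that the PlayStation's CPU is little-endian MIPS, where an all-zero 32-bit word is a no-op.

```python
def patch_bytes(image: bytearray, offset: int, expected: bytes, replacement: bytes) -> None:
    """Overwrite `expected` at `offset` with `replacement`.

    Refuses to write if the bytes currently at `offset` are not what we
    expect, which guards against patching the wrong place.
    """
    if len(expected) != len(replacement):
        raise ValueError("replacement must be the same length as the original")
    found = bytes(image[offset:offset + len(expected)])
    if found != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {found.hex()}")
    image[offset:offset + len(replacement)] = replacement


# Hypothetical instruction: addiu $v0, $v0, -1 (0x2442FFFF), stored little-endian.
DECREMENT = bytes.fromhex("ffff4224")
# An all-zero word is a NOP on MIPS.
MIPS_NOP = bytes(4)

# Toy data standing in for the disc image; a real image would be read from a file.
image = bytearray(b"\xAA\xBB" + DECREMENT + b"\xCC")
patch_bytes(image, 2, DECREMENT, MIPS_NOP)
print(image.hex())  # aabb00000000cc
```

Keeping the replacement the same length as the original matters here: inserting or removing bytes would shift everything after the patch and change how the game loads the code.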
In retrospect, finding the code was actually pretty easy. Using the PSX 1.13 emulator from earlier (I still haven't gotten BizHawk to work exactly how I want), all I needed to do was add a breakpoint that fires when the "lives" value is written to, look at the instructions that are executing when the breakpoint is hit, dump the memory that holds those instructions to disk, and then search for that same code in the ROM itself (which should be doable with any hex editor; I used Visual Studio even though it's certainly not the ideal tool for this).
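The last step, searching the ROM for the dumped instructions, doesn't strictly need a hex editor. As a rough sketch, a few lines of Python can do the same thing. The file names in `search_files` are placeholders, and the toy data at the bottom is made up so the example runs on its own.

```python
from pathlib import Path


def find_all(haystack: bytes, needle: bytes) -> list[int]:
    """Return every offset at which `needle` occurs in `haystack`."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


def search_files(image_path: str, dump_path: str) -> list[int]:
    """Search a disc image file for the contents of a RAM dump file."""
    image = Path(image_path).read_bytes()
    dump = Path(dump_path).read_bytes()
    return find_all(image, dump)


# Toy data standing in for the real files (e.g. search_files("game.bin", "dump.bin")).
ram_dump = bytes.fromhex("ffff4224")
disc_image = b"\x00" * 5 + ram_dump + b"\x11" * 3 + ram_dump
matches = find_all(disc_image, ram_dump)
print([hex(m) for m in matches])  # ['0x5', '0xc']
```

If the full dump isn't found, searching for a shorter slice of it (a few instructions' worth) is a cheap thing to try first, and more than one match means more context is needed to tell the copies apart.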
The main pain point I ran into was that, even though the code is essentially the same in both places, the bytes weren't ordered in exactly the same way in the ROM as they were in my RAM dump, and I couldn't find any documentation about exactly what the difference was. It took a lot of trial and error, but eventually I managed to figure it out. Now that I understand the difference, it shouldn't be a problem anymore, which is pretty exciting.
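This kind of trial and error can be partly automated. The sketch below is not the difference I actually found in this game; it only assumes, for illustration, that the mismatch might be one of a few common reorderings (bytes swapped within 16-bit or 32-bit groups) and tries each one in turn. The candidate list and the toy data are made up.

```python
def swap_groups(data: bytes, width: int) -> bytes:
    """Reverse the bytes inside each `width`-byte group.

    Any trailing partial group is left unchanged.
    """
    usable = len(data) - len(data) % width
    out = bytearray()
    for i in range(0, usable, width):
        out += data[i:i + width][::-1]
    out += data[usable:]
    return bytes(out)


# Hypothetical candidate orderings to try; extend this as new guesses come up.
CANDIDATES = {
    "unchanged": lambda d: d,
    "swap16": lambda d: swap_groups(d, 2),
    "swap32": lambda d: swap_groups(d, 4),
}


def try_orderings(image: bytes, dump: bytes) -> dict[str, int]:
    """Return the first match offset for each candidate ordering that is found."""
    hits = {}
    for name, transform in CANDIDATES.items():
        offset = image.find(transform(dump))
        if offset != -1:
            hits[name] = offset
    return hits


# Toy data: the "image" holds the dump with each 32-bit word byte-reversed.
dump = bytes.fromhex("0102030405060708")
image = b"\x00" * 3 + bytes.fromhex("0403020108070605") + b"\x00" * 3
print(try_orderings(image, dump))  # {'swap32': 3}
```

One caveat with this approach: the grouping starts at the first byte of the dump, so if the dump doesn't begin on a group boundary, trimming a byte or two off the front before trying again may be needed.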
I'm hoping I can find the time soon to document this process in a bit more detail, as I think you'll be pleasantly surprised at how easy it actually is. Once you've seen it done a single time, you'll probably be able to do it yourself without too much trouble. Unfortunately, my free time is still quite limited, so I don't want to promise that this will happen any time soon.
Actually reading and writing assembly code can definitely be tricky/annoying/slow, particularly since assembly code pulled from a binary doesn't have any of the comments, labels, etc. that developers put in to make the code much easier to follow. There are also a lot of conventions about how (usually) to do certain things and store certain things (arguments that get passed to subroutines, where to return after leaving a subroutine, etc.), and being familiar with them makes the code a bit easier to understand. That said, the basic instructions, registers, etc. that are the building blocks for all of these conventions aren't too scary. Also note that I'm by no means an especially good assembly programmer; it's not something that often comes up in my day-to-day work, since I essentially never work closely enough with hardware to need it.