PPCenter :: devblog

PPcenter. Arts and craft for my Sega Saturn. Since 1847 :D

Bit shift in Saturn minimalistic C program

Written by cafealpha 4 comments
While making a Saturn program running directly from ROM, constraint is usually to have program as small as possible. Why ? Because I would like to make Pseudo Saturn Kai continue supporting Action Replay, which have limited ROM size

So, in that situation, I use homemade, minimalistic stdlib, containing usually only memcpy and memset functions, and everything goes well.
But for some reason, current program was begging for __ashlsi3 and __ashrsi3 functions.
... What's these things ? __ashlsi3 is an assembly function doing left shifts of data, and as you probably guessed, __ashrsi3 does the same, but with right shifts

No problem, let's ask my friend google, and copy & paste theses functions without minding about their contents Theses functions are standard functions provided by C library, so sources can be found here and there.


An example of what __ashrsi3 function looks like

A day after that, here comes the time to do basic testing of this ROM code
Let's first start with simple things such as screen display ...


Text display ... not as expected ?!

Hrm, it's probably a problem in my printchar routine ?

printchar source code : this is the core thing
behind text display in my programs


But that's a very long time I didn't touched this routine, which works well on Pseudo Saturn Kai, and works even better in vdp1ex from Charles MacDonald sample programs, because all my printchar routines come from there
But just in case of, let's try to tune one " <<3 " into " *8 " in the code above, and see if that changes something. Theses two operations are equivalent, so this should give the same result, but ... Saturn screen becomes all black

Okay, there's definitely something weird regarding that printchar, but I just re-used it as-is from other project, so what's going wrong with my it ?!
[...]
And then (finally) a light-bulb above my head appears Maybe there's something wrong regarding the __ashlsi3 and __ashrsi3 routines ?
And, yes there was something wrong regarding theses routines ... I wrote a paragraph about them on the first half of this article, so it would be surprising the problem came from somewhere else

In more accurate words, routines are OK, which is normal since they are written by people smarter than me But the calling convention of __ashrsi3 was not as my gcc was expecting ?
Well, that's just a guess from the display results, because it seems to display the same pixel 8 times on each row of a given character, which may happen when ignoring shift parameter in __ashrsi3 function.

Anyway, rather than finding in details what's wrong, let's fix the problem First, with assembly by hand, in desperately trying to change input parameter handling ... I don't remember exactly what I wrote, but that was something like pushing a register to stack, moving parameter to it, finally restoring register, putting nop instructions everywhere, etc ... nothing difficult, but over my extent in assembly programming

The result is ... a mixed success :

Trying to modify __ashrsi3 routine by hand ... well, it seems that
some shift values are not correctly handled


Okay, so let's re-ask google about that __ashrsi3 routine ... and a different implementation arrives in search results. Let's try it


__ashrsi3

And the execution result :

Execution result with second version of __ashrsi3 function

Yeah, it works this time !!! And don't ask me why, because I didn't took time in understanding what's different in that second version of __ashrsi3 function
And, rather than understanding why it works ... let's close that assembly pandora box before some other mystic bugs pop from it

I will probably see about this in the future, but that was absolutely out of the scope of today's programming, and in software world, too much digression leads to freeze of projects, which I would like to avoid in order to go forward to next pending project 

As a example, I started to adapt yeti3d engine to Saturn in 2010 (7 years ago !), and this ended in ... developing a SD card based memory cartridge for Saturn !
The point above is is not a joke : menus used in my yeti3d adaptation are the origin of menus used in Pseudo Saturn Kai, which shows the continuity (co-consanguinity ? ) in my projects. And after getting yeti3d working a bit, I really would like to load levels from something else than CD-ROM, which was from PC (via adequate link cable) on a first time, and which will (should ?) then evolve to SD card.
That would be cool to make a Saturn game not requiring to burn CD-ROM somedays, but before that I need to finish neighboring side quests


Evolution of my Saturn projects

To conclude this article, let's say it was terribly fun to see some bits of the internals behind C language
By the past, I remember I did something similar with my yeti3d adaptation, but remained at "C language level" : optimization done at that time was (IIRC, ) to avoid 4 bits shifts, because theses are not available in a single SH-2 CPU instruction. Avoiding 4 bits shifts was simply done by merging two fixed point operation in a single one (or something like that : I did this 7 years ago !), and this actually gave some improvements in 3D scene rendering Theses were the good times ... I sometimes think "I'm Getting Too Old For This Shit", but that's only to motivate to finish my old projects

But before that, Pseudo Saturn Kai, and then Kicad and Quartus are waiting for me


BTW, in the case you wondered about what was today's programming session, you probably guessed it was about exception handler addition to Pseudo Saturn Kai. This not a new feature, since it was available in Pseudo Saturn 0.83x ... in fact, I grabbed some sources from Pseudo Saturn 0.83x in order to implement this exception handler (thank you CyberWarriorX !) This exception handler will be available as a small add-don to cheat codes feature in Pseudo Saturn Kai next major release.

4 comments

#1  - antime said :

You can also link against libgcc if you don't want to implement the various support routines yourself.

Reply
#2  - cafealpha said :

Thank you for visiting my blog !
I actually hesitated to do the -nodefaultlibs way which would be simpler, but the way of doing in my article just required to add implementation for left/right bit shift, and implementation for memcpy, and would probably make the code a bit more optimized.
In the future, after I update gcc or change compilation settings, there are no guarantee my code will continue linking without begging for additional dependency, but since I'm the only maintainer of my projects, and OK to do such dirty maintenance, that's not a real problem :)

Reply
#3  - vbt said :

*buf++ += ((w[offs] * v)>>3);
vs
*buf++ = ((w[offs] * v) / 8);

who wins ?

Reply
#4  - cafealpha said :

Both expressions behave differently (" += " on the first expresion vs " = " on the second one), but if you're asking about the speed difference between " >>3 " and " /8 ", then I think " /8 " would win because (IIRC) a 3 bits shifts operation requires two instructions (2+1 bits shift) vs one instruction for the division.
It's late night (or early morning) here, and I didn't verified anything regarding cycles usage for both expressions, so maybe I'm wrong however.

Basically, using bit shifts in divisions is advantageous when combining or simplifying two arithmetic operations : after being sure that result won't overflow and/or precision of the result is not lost (or slightly lost, but still acceptable), then merging several operations and using bit shift in the same occasion usually gives some performance gain :)

Reply

Write a comment

Please enter this homepage's webmaster favorite video game console name.
It fits in 6 letters and is case insensitive.

Rss feed of the article's comments