|
Post by mborrmann on Aug 7, 2016 15:00:21 GMT -5
Hey folks,
since I am constantly a little shocked at how incredibly cycle-hungry "print_str" is, I tried to look into its code.
It seems, though, that the main loop, which fetches the bitmap data from the chargen tables, needs an exact cycle count to display each line of a char, or things look very distorted. So there's no good way to speed things up at this most crucial point of the code (I tried unrolling the loop by hand and giving it a fixed length).
I could unroll the seven lines that each char is made of, but I guess the few cycles saved by not having to branch won't make a big splash either.
So, does anyone have an idea how to speed up print_str, or is it simply a stupid idea?
|
|
|
Post by Malban on Aug 7, 2016 17:55:48 GMT -5
Hi,
for more information see the Vide Blog, or here: vectrex-revisions
The above-mentioned Moonlander routines I can make available (see below).
For a test:
"THIS IS A TESTSTRING" with the "ordinary" Print_Str: 6980 cycles
"THIS IS A TESTSTRING" with the "Moonlander" Print_Str: 4227 cycles
The gist of the speedup is that the string is printed bidirectionally, although Thomas mentioned that it does not look right on all machines. On my three vectri it looks OK (some emulators have problems, though).
You can download the Moonlander routines: print.asm.zip (8kb)
(This was quickly put into a zip file; because of the macro usage it is not all THAT readable...)
They can easily be assembled and run in Vide, but should also work with AS09.
If you intend to do further optimizations, you can try Vide as an emulator; it is pretty good at timing issues, but you also MUST test on a real machine.
Also remember that a "shift cycle" for raster printing should always be exactly 18 cycles.
Regards
Malban
PS: You might also find this interesting: in the Vide documentation, a corrected, documented Print_Str.
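To put the two measurements above in perspective, here is a small back-of-the-envelope sketch. The cycle counts are the ones quoted in the post; the frame budget of about 30,000 cycles is an assumption that is not stated in the post (1.5 MHz CPU, one frame per 50 Hz refresh), so treat the "frame share" figures as rough.

```python
# Cycle counts quoted in the post for "THIS IS A TESTSTRING".
ORDINARY_CYCLES = 6980
MOONLANDER_CYCLES = 4227

# Assumption (not from the post): a 1.5 MHz CPU refreshed at 50 Hz
# leaves roughly 30,000 cycles per frame.
FRAME_BUDGET = 1_500_000 // 50


def summarize(before, after, budget=FRAME_BUDGET):
    """Compare two cycle counts and relate them to one frame's budget."""
    saved = before - after
    return {
        "saved_cycles": saved,
        "saved_percent": round(100 * saved / before, 1),
        "speedup": round(before / after, 2),
        "frame_share_before": round(100 * before / budget, 1),
        "frame_share_after": round(100 * after / budget, 1),
    }


if __name__ == "__main__":
    for key, value in summarize(ORDINARY_CYCLES, MOONLANDER_CYCLES).items():
        print(f"{key}: {value}")
```

Under that assumption a single 20-character string costs a noticeable slice of a frame either way, which is why the saving matters.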
|
|
|
Post by mborrmann on Aug 8, 2016 0:52:29 GMT -5
Hello Malban, thanks for the fast reply.
I tried out the Moonlander code; unfortunately I get some kind of garbled output with it. Not sure yet what's wrong. I terminated my string with $81 on both sides.
I tried to use VIDE, but unfortunately it's slow as molasses on my 2010 MacBook Pro. I would love to have the option to properly debug my code; at the moment I am using all kinds of tricks and my own experience to get along.
Greets, Michael
|
|
|
Post by Malban on Aug 8, 2016 1:26:07 GMT -5
Oh my,
I think I was sleeping already yesterday... Yes, I get garbled output too. Something went wrong when I put together the zip file.
Please download and try again.
Yep, Vide can be slow. The best way to make it faster is: in the configuration, switch off "glow" and switch off "ringbuffer active"; those two options usually double the speed :-).
Malban
|
|
|
Post by mountaingoat on Aug 17, 2016 5:59:48 GMT -5
I know this is not the answer you are looking for but I basically stopped using Print_str in tight loops.
In Sub Wars I replaced all text with graphics for status indication. In SGJ I am actually creating vector texts so that they can be drawn faster.
|
|
|
Post by Malban on Aug 21, 2016 4:37:24 GMT -5
Hi,
I just wrote something about print str, might be interesting in this context: Print String
Regards
Malban
|
|
|
Post by gauze on Dec 27, 2016 9:12:51 GMT -5
I found this thread while looking for information on profiling in vectrex dev ... how are you getting these cycle counts?
|
|
|
Post by Malban on Dec 27, 2016 10:06:50 GMT -5
Hehe, there is a program called Vide. With that it is (if you know what to do) quite easy.
First hint: vectrex.malban.de/preliminary/53d0cc79.html
In dissi that is the first button of the "window" buttons. If you use WaitRecal of the BIOS, then it might just work out of the box; otherwise you have to figure out the addresses from where to where you would like to measure.
But since cycle counting is so essential in the Vectrex business, there are also other ways. In dissi use the command line parameters (h = help). Commands like:
"Cycles [#number]" print current cycle count of running vectrex, #number sets the current cycle count [c]
"RunCycles ####" Run program until at least #### cycles have been processed (all instructions are fully processed) [rc]
"CountCycles $xxxx $yyyy" next time pc executes $xxxx it starts counting cycles and prints out the count after $yyyy is executed (only in currently emulated bank!) [cc]
are your friends...
And one of the FAQ questions touches on the topic:
Tracki - some programs do not show any tracki information, why is that?
They do too! Most probably you didn't configure tracki correctly. Tracki measures cycles between two memory locations the CPU (pc - program counter) passes. The default configuration is set up to measure cycles between the exit and the entry of the BIOS routine "WaitRecal". Many programs use that routine to do vector recalibration at the end of their game loop. Therefore, for these programs the given locations measure the cycles of one round of the game loop. For programs which do not use the "WaitRecal" routine for their recalibration, the default setup does not show any results. You must figure out what the program you are interested in uses for a game loop and/or recalibration. You can enter the addresses found this way into the supplied start/end addresses of tracki; then you can measure those programs. Also: remember to switch the "always update" button to "on", otherwise you won't see anything either :-)
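As a rough illustration of what "measuring cycles between two memory locations" means, here is a minimal sketch. It is not how Vide implements tracki or CountCycles; the trace format, the function name `count_cycles`, and the addresses are all hypothetical. The sketch assumes that both the instruction at the start address and the one at the end address are included in the count.

```python
def count_cycles(trace, start_pc, end_pc):
    """Return one cycle total per start_pc..end_pc pass through the trace.

    trace: iterable of (pc, cycles) pairs, one per executed instruction.
    Counting begins when start_pc executes and ends after end_pc has executed.
    """
    results = []
    running = None  # None means "not measuring right now"
    for pc, cycles in trace:
        if running is None and pc == start_pc:
            running = 0
        if running is not None:
            running += cycles
            if pc == end_pc:
                results.append(running)
                running = None
    return results


if __name__ == "__main__":
    # Hypothetical trace: two passes through a small loop body.
    trace = [
        (0x0000, 5),
        (0x0010, 8), (0x0013, 3), (0x0015, 2),
        (0x0020, 5),
        (0x0010, 8), (0x0013, 3), (0x0015, 2),
    ]
    print(count_cycles(trace, 0x0010, 0x0015))
```

If the start address is never reached, the result is simply empty, which mirrors the "does not show any results" case described above for programs that do not go through the configured addresses.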
Regards Malban
|
|
|
Post by gauze on Dec 27, 2016 11:21:16 GMT -5
Ah, I didn't know about tracki, cool! I was using the cycle counter at the top of regi, but it's not very "friendly" for doing quick tests unless you are better than me at visual math between 2 numbers, one held in your mind.
It's quite eye-opening to see how slow the BIOS routines are. I just did a couple of hours of quick game framework and it's already over 49K cycles during minimal action! Thanks!
|
|
|
Post by Malban on Dec 27, 2016 11:37:44 GMT -5
Hehe - one more to appreciate Frogger/Robot Arena :-).
|
|
|
Post by bob on Dec 28, 2016 7:27:55 GMT -5
I included a vector font in the latest version of Vectrex32. I had hoped it would draw text faster than Print_Str but I think it ended up being slower. (*)
Print_Str is actually not that bad when you think about it. It's drawing an entire string with just seven lines. That's a lot if all you want is one character, but it's very efficient for three characters or more. And it's easy to get carried away when you're putting text on the screen.
(*) I'm still glad I added the vector font; it has a very different style than the Vectrex font and it more closely matches the text of classic arcade games.
- Bob
|
|
|
Post by gauze on Dec 28, 2016 12:08:58 GMT -5
a lot of cycles if you plan on keeping the score and Print_Ships running on every frame!
|
|
|
Post by christophertumber on Dec 28, 2016 15:43:17 GMT -5
I generally use vector fonts designed with as few lines as possible and draw ~4 (I don't remember exactly) characters per reset0ref. (as many as possible before drift becomes an issue)
For most in game screens I avoid using any text at all. I'd rather spend those cycles on game objects.
High score tables can actually be pretty cycle heavy. There's potential for a lot of characters there if you allow players to enter names and your scores have non-trivial significant digits. Don't remember if the most recent Space Death build I posted had the high score tables active but if so, you'll see what I mean there.
|
|
|
Post by Malban on Jan 2, 2017 8:48:53 GMT -5
Just another 2 points.
1) gauze: in dissi, the command line " c 0" resets the cycle counter to 0; from then on it is very easy to use the regi cycle count :-) since the first operand of your difference is 0 :-)
2) In Karl Quappe I used these tricks for efficiency for the in-game strings:
a) I used a custom font with only 5 "lines" (instead of 7)
b) the level counter (1-16) is only one char. That is possible since the "ONE" character is so slim that you can fit both digits in 7 bits
c) I naturally use my "own" printStr version, which is also optimized in other parts (but not bidi)
numbers:
; 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 SPACE
db %00001100, %00001100, %00001100, %00001100, %00010000, %00001110, %00001100, %00011110, %00001100, %00001100, %11001100, %11001100, %11001100, %11001100, %11010000, %11001110, %11001100, %00000000
db %00010010, %00000100, %00010010, %00010010, %00010000, %00001000, %00010000, %00000100, %00010010, %00010010, %01010010, %01000100, %01010010, %01010010, %01010000, %01001000, %01010000, %00000000
db %00010010, %00000100, %00000100, %00000100, %00010100, %00001100, %00001100, %00001000, %00001100, %00001100, %01010010, %01000100, %01000100, %01000100, %01010100, %01001100, %01001100, %00000000
db %00010010, %00000100, %00001000, %00010010, %00011110, %00000010, %00010010, %00001000, %00010010, %00000010, %01010010, %01000100, %01001000, %01010010, %01011110, %01000010, %01010010, %00000000
db %00001100, %00000100, %00011110, %00001100, %00000100, %00001100, %00001100, %00001000, %00001100, %00001100, %01001100, %01000100, %01011110, %01001100, %01000100, %01001100, %01001100, %00000000
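One way to see how the slim "ONE" lets a two-digit level number fit in a single char: in the table, the columns for 10 to 15 equal the matching single-digit column with a narrow "1" OR-ed into the top two bits. The sketch below reproduces that. The names `DIGIT_ROWS`, `SLIM_ONE`, `two_digit` and `render` are made up for illustration, and whether the table was actually built this way is an assumption.

```python
# Five rows per glyph, copied from the "numbers" table (digits 0..6 only).
DIGIT_ROWS = {
    0: [0b00001100, 0b00010010, 0b00010010, 0b00010010, 0b00001100],
    1: [0b00001100, 0b00000100, 0b00000100, 0b00000100, 0b00000100],
    2: [0b00001100, 0b00010010, 0b00000100, 0b00001000, 0b00011110],
    3: [0b00001100, 0b00010010, 0b00000100, 0b00010010, 0b00001100],
    4: [0b00010000, 0b00010000, 0b00010100, 0b00011110, 0b00000100],
    5: [0b00001110, 0b00001000, 0b00001100, 0b00000010, 0b00001100],
    6: [0b00001100, 0b00010000, 0b00001100, 0b00010010, 0b00001100],
}

# The narrow leading "1" as it appears in the 10..15 columns:
# two pixels wide in the top row, one pixel wide below.
SLIM_ONE = [0b11000000, 0b01000000, 0b01000000, 0b01000000, 0b01000000]


def two_digit(n):
    """Build the single-char glyph for 10..16 from the slim one and a digit."""
    return [one | digit for one, digit in zip(SLIM_ONE, DIGIT_ROWS[n - 10])]


def render(rows):
    """Turn glyph rows into text, MSB on the left, '#' for a set bit."""
    return ["".join("#" if row & (0x80 >> bit) else "." for bit in range(8))
            for row in rows]


if __name__ == "__main__":
    for n in (10, 14, 16):
        print(n)
        print("\n".join(render(two_digit(n))))
```

Every composed row still has a 0 in the least significant bit, so the glyph uses at most 7 bits, as described in b).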
Also, when you design your own "chars", remember to end each byte with a 0 in the LSB:
(Text I am writing for Vide again (bitmaps), but not finished.)
The Vectrex uses the VIA 6522, and the models of the VIA it uses have some bugs. One of these bugs is that a complete shift cycle does not shift 8 times for one byte, but 8+1 times (in the mode used by raster output). The last shifted value is repeated, so that e.g. 0101 0101 [e.g. part of a bitmap]
results in an output like:
0101 01011
In the normal "text" routines you do not see that, since letters always end with a "0", and a doubled "00" just results in a little more space between letters. In a "continuous" bitmap you must align or tweak the bitmap so that you do not (or only barely) see the doubled bits.
It is certainly even more difficult if you are working with "diagonals", since you cannot get rid of a double step.
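The 8+1 behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch of the visible effect only (eight bits, most significant bit first, then the last bit once more), not a model of the VIA itself; the function names are made up.

```python
def shifted_bits(byte):
    """Bits as they show up on screen: 8 bits MSB first, last bit repeated."""
    bits = [(byte >> (7 - i)) & 1 for i in range(8)]
    return bits + [bits[-1]]


def show(byte):
    """Format the 9 output bits like the example in the text."""
    text = "".join(str(bit) for bit in shifted_bits(byte))
    return text[:4] + " " + text[4:]


if __name__ == "__main__":
    print(show(0b01010101))  # bitmap byte ending in 1: the doubled 1 is visible
    print(show(0b00001100))  # char byte ending in 0: only extra blank space
```

With a byte that ends in 0 the repeated bit is just one more dark pixel, which is why the rule of keeping the LSB clear hides the effect in text.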
... tbc
Malban
|
|