Re: Any good examples of 3D vector programming?

Posted: Thu Dec 23, 2021 10:30 am
by uglifruit
That's excellent!
Really impressive.

Re: Any good examples of 3D vector programming?

Posted: Thu Dec 23, 2021 7:42 pm
by catmeows
Art wrote: Wed Dec 22, 2021 6:42 pm After several months, i had some time to continue on my 3D game. Since the last month, I finally learn Z80 assembler and I try to convert my Basic program to a machine code version. It's my first real assembler program (except some little 10-byte routines in the past), so I am sure it can be smaller and faster, but I am quite satisfied.
Well, it is a very impressive piece of code.
I made it in the MRS 09/2 assembler on an emulated Spectrum and it was a surprisingly comfortable experience. MRS doesn't work in the Fuse emulator (I don't understand why), but it works in old emulators, so I used X128 in a DOS emulator. Yes, a bit crazy, but I wanted to experience the programming style of the Spectrum era.
Heh, quite a masochistic approach, but why not.
In the present version, I have the screen buffer below 32768, so if I understand well, I copy 6912 bytes from contended memory to contended memory, which is probably slow. I don't really understand how contended and uncontended memory work and which part of a program should go in which part of memory.
Short story: the video circuit (ULA) reads memory to build the signal that is sent to the TV. The CPU reads memory too, and the memory is not fast enough to serve both the ULA and the Z80. Since the video signal cannot be interrupted, the ULA has higher priority, and while it reads memory it makes the Z80 wait on any access to that memory.
So the crucial question is when the Z80 accesses memory. There are three cases: it fetches the opcode of the next instruction, an instruction reads memory, or an instruction writes memory.
For example:

Code:

LD A,B     ; read opcode: one access
LD A,7     ; read opcode, read immediate parameter: two accesses
LD IX,0    ; read IX prefix, read opcode, read low byte of parameter, read high byte of parameter: four accesses
DEC (HL)   ; read opcode, read value from (HL), write result to (HL): three accesses
But the ULA slows down only accesses to RAM between 16384 and 32767, inclusive.
So assume HL holds 20000 (in contended memory).
Now, if you put the instruction DEC (HL) at 28000, all three accesses (read the opcode at 28000, read the contents of (HL), write to (HL)) point to contended memory.
But the same instruction at address 40000 makes only two accesses to contended memory, because the opcode fetch goes to uncontended memory.

So the rule of thumb is to place time-critical code at 32768 or above, because every instruction needs to be fetched (the CPU needs to read the opcode). If you have space left, then also move frequently used data above 32768 (like sprite graphics or a back buffer). On the other hand, code dispatching the game menu, or even level data, can be placed below 32768 without too much worry.

The actual delay caused by the ULA is predictable; for example, the ULA will not slow memory access while it generates the signal for the border. There are even some instruction sequences that are known to kind of "self-sync" with the ULA, so the contention is not such a big issue. But that is quite a complicated topic.
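
As a rough sketch of that rule of thumb, the buffer copy Art describes could be laid out like this. The addresses and the copy_buffer label are assumptions for illustration (a 6912-byte back buffer at 49152, code assembled at 32768), not Art's actual layout.

```z80
        org 32768           ; uncontended on a 48K Spectrum: opcode fetches are not delayed

copy_buffer:
        ld hl, 49152        ; source: back buffer in uncontended RAM (assumed address)
        ld de, 16384        ; destination: screen memory, always contended
        ld bc, 6912         ; 6144 bytes of pixels + 768 bytes of attributes
        ldir
        ret
```

Only the accesses that touch screen memory can still be delayed by the ULA; the opcode fetches and the reads from the buffer are not.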

Re: Any good examples of 3D vector programming?

Posted: Thu Dec 23, 2021 10:03 pm
by Art
Thank you for the explanation. It's still a bit complicated for me, but I'll think about it when I decide which part of the program goes in which part of memory. For the present, I moved the screen buffer to 49152 and I have several ideas how to speed up drawing a bit.

I also switched to another compiler, this time on a PC. But surprisingly, my program doesn't work correctly when I compile it this way. Maybe it's because of how it is executed, I don't know. When I was using MRS, I always started the compiled program directly from MRS, but now I compile it to a TAP file, load it into a Spectrum emulator and use RANDOMIZE USR. When I don't touch anything, the program runs without problems. When I program a camera animation, everything moves normally. But when I use the control keys for moving, several bytes of program memory are overwritten, which corrupts the graphics. I spent the whole day trying to figure out what is going on, but I only saw that the memory is probably overwritten from ROM (maybe some ROM routine wrote to RAM).

Is it possible that reading from the keyboard can cause writing to RAM? Should I do something special when I start a machine code program from Basic (apart from CLEAR address-1)? I use something like this to read the keyboard:

Code:

ld bc,57342
in a,(c)
push af
call nc,right
pop af

Re: Any good examples of 3D vector programming?

Posted: Fri Dec 24, 2021 10:48 pm
by Art
Finally I found the bug! It was hard, because I had no idea what was going on or what I was looking for. I use the IY register to read and write some 3D coordinates and I didn't know that the keyboard routine in ROM uses the IY register. When I was writing and testing the program in MRS, it never happened, because MRS doesn't use ROM, so it was completely new information for me. So after several days full of failures, searching, testing, unnecessary changes, debugging and reading ROM, I learned something new and finally I will be able to continue my game. This is what happens if you learn assembler from The Complete Machine Code Tutor like me. :) It's a shame that it lacks this very important information, though it contains chapters about IX, IY and interrupts.

So, just two little questions: When I want to use the IY register, I need to use DI before and EI after that part. Is that the right way? And is a program faster when it runs with interrupts disabled?

Re: Any good examples of 3D vector programming?

Posted: Sat Dec 25, 2021 10:43 am
by patters
I'm not a Z80 assembly programmer, but wouldn't the answer to the first question just be to PUSH IY to the stack and make sure you POP IY back when you have finished using IY?

Re: Any good examples of 3D vector programming?

Posted: Sat Dec 25, 2021 10:49 am
by AndyC
DI, use IY for what you want and then EI is certainly one way, as long as you put the correct value back into IY (it's always the same value, so it's not necessary to PUSH/POP it). Redirecting interrupts using IM 2 is another option.

Will code run faster with interrupts disabled? Well, slightly, since the ROM handler won't be called every fiftieth of a second, so you'll save that time. It won't speed up short routines that run in-frame though.
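
A minimal sketch of that DI / EI pattern, assuming the 48K ROM's usual IY value of 23610 (the ERR NR system variable). The use_iy and coords labels and the three data bytes are made up for illustration, and the directive names (equ, db) depend on the assembler.

```z80
ERR_NR  equ 23610           ; $5C3A: the value the 48K ROM expects in IY

use_iy:
        di                  ; no ROM interrupt while IY points elsewhere
        ld iy, coords       ; 'coords' is a hypothetical data label
        ld a, (iy+0)        ; read x
        add a, (iy+1)       ; add y
        ld (iy+2), a        ; store the sum in the third byte
        ld iy, ERR_NR       ; restore the ROM's value before enabling interrupts
        ei
        ret

coords: db 10, 20, 0        ; x, y, result
```

PUSH IY before and POP IY after the block would work the same way in place of reloading the constant.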

Re: Any good examples of 3D vector programming?

Posted: Sat Dec 25, 2021 2:52 pm
by Joefish
My advice is always to leave BASIC behind and write your own interrupt routine. Even if it's just a single RET instruction and you just use it for timing with EI and HALT each time you want to synchronise your code with the display.
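
One common way to set that up is sketched below. It assumes the 257 bytes from 65024 and the byte at 65021 are free (not used by the program or the stack); setup_im2 and wait_frame are hypothetical names.

```z80
setup_im2:
        di
        ld hl, 65024        ; $FE00: start of a 257-byte vector table
        ld de, 65025
        ld bc, 256
        ld (hl), 253        ; fill with $FD so every vector reads as $FDFD
        ldir
        ld a, 201           ; $C9 = RET
        ld (65021), a       ; the whole handler at $FDFD is one RET
        ld a, 254           ; $FE: high byte of the table address
        ld i, a
        im 2
        ret                 ; interrupts stay disabled until the next EI

wait_frame:
        ei                  ; the handler never re-enables interrupts itself
        halt                ; sleep until the next frame interrupt
        ret
```

With this in place the ROM interrupt handler no longer runs, so IY is free for the program's own use.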

Of course, you will need to learn how to use IN A,(C) (which actually can use a 16-bit IN port address in BC) to read the keyboard yourself.
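
A small sketch of that kind of keyboard read, assuming port 57342 (the P, O, I, U, Y half-row, with P on bit 0 and a pressed key reading as 0). The read_keys, move_right and move_left labels are placeholders. Note that IN A,(C) does not change the carry flag, so the bit has to be rotated into carry before a CALL NC can test it.

```z80
read_keys:
        ld bc, 57342        ; B = $DF selects the half-row, C = $FE is the ULA port
        in a, (c)
        rra                 ; bit 0 (P) -> carry, carry clear = pressed
        push af             ; keep the remaining bits across the call
        call nc, move_right
        pop af
        rra                 ; bit 1 (O) -> carry
        call nc, move_left
        ret

move_right:                 ; placeholder handlers
        ret
move_left:
        ret
```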

Re: Any good examples of 3D vector programming?

Posted: Sat Dec 25, 2021 4:54 pm
by Art
Thanks for the answers. Yes, I also decided not to use any ROM calls and I used IN A,(C) to read the keyboard. I just didn't know that the Spectrum still automatically calls the ROM keyboard routine even if I never use it, and moreover that it uses the IY register without backing it up. Well, still learning. I will also look at how interrupt routines work.

Re: Any good examples of 3D vector programming?

Posted: Wed Dec 29, 2021 7:11 pm
by Art
Some progress to show:

It should be ready to make a game around it, but it's quite slow with more objects. I think that the code could be a little faster, so I'll try to improve it. It's still for Spectrum 48K, but maybe the screen buffer of Spectrum 128K could help a bit, although I don't know yet how it works. Or maybe look-up tables for some computations (I use it only for sine now).

Re: Any good examples of 3D vector programming?

Posted: Wed Dec 29, 2021 8:53 pm
by catmeows
I would find a decent emulator with a code profiler and check what is the actual bottleneck.

Re: Any good examples of 3D vector programming?

Posted: Thu Dec 30, 2021 11:19 pm
by Art
catmeows wrote: Wed Dec 29, 2021 8:53 pm I would find a decent emulator with a code profiler and check what is the actual bottleneck.
Unfortunately, I don't know yet how it works. There is a profiler in the Fuse emulator, but without any documentation. It makes a file with memory addresses and some numbers, which can also be converted to some other format, but it's not clear what it's good for.

I also tried several ways to make the program faster (copying with PUSH/POP, unrolling multiplication and division cycles etc.), but maybe it's 5% faster now, nothing significant. I was also thinking about multiplication and division using look-up tables. Are there any examples in Z80 assembler somewhere?

Re: Any good examples of 3D vector programming?

Posted: Fri Dec 31, 2021 8:49 am
by catmeows
Art wrote: Thu Dec 30, 2021 11:19 pm I was also thinking about multiplication and division using look-up tables. Are there any examples in Z80 assembler somewhere?

What is missing there is log-based multiplication/division (i.e. using the identities log(x*y) = log(x) + log(y), log(x/y) = log(x) - log(y)), but that is not very accurate.

But I would bet the real issue is fill rate, especially clearing the inner area of a polygon. If you still use a simple loop to clear a line in a polygon, you should consider introducing some variation of Duff's device.
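
On the Z80, the usual take on Duff's device is to jump into the middle of an unrolled run of stores. A minimal sketch, assuming DE points at the first byte, B holds a count from 1 to 8, and the run does not cross a 256-byte boundary (only E is incremented); clear_run and unrolled are hypothetical names, and a real routine would unroll further.

```z80
clear_run:
        ld a, 8
        sub b               ; A = number of store pairs to skip (0..7)
        add a, a            ; each pair below is 2 bytes long
        ld c, a
        ld b, 0
        ld hl, unrolled
        add hl, bc          ; HL = entry point inside the unrolled run
        xor a               ; A = 0, the byte to write
        jp (hl)

unrolled:
        ld (de), a          ; pair 1
        inc e
        ld (de), a          ; pair 2
        inc e
        ld (de), a          ; pair 3
        inc e
        ld (de), a          ; pair 4
        inc e
        ld (de), a          ; pair 5
        inc e
        ld (de), a          ; pair 6
        inc e
        ld (de), a          ; pair 7
        inc e
        ld (de), a          ; pair 8
        inc e
        ret
```

Each byte then costs one LD (DE),A and one INC E, with no loop counter or jump per byte.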

Re: Any good examples of 3D vector programming?

Posted: Fri Dec 31, 2021 9:02 am
by Art
Thank you for the table examples, I will try it.

Yes, filling polygons would really slow it down, but I don't fill or clear any polygons. I use two height buffers, 256 + 256 bytes.

Re: Any good examples of 3D vector programming?

Posted: Fri Dec 31, 2021 12:51 pm
by catmeows
Maybe you could share your draw line routine to see how efficient it is.

Re: Any good examples of 3D vector programming?

Posted: Fri Dec 31, 2021 1:23 pm
by Art
OK. First, I will try to improve multiplication. I found several cases where I can convert (16-bit * 8-bit number) to (8-bit * 8-bit number) with some program branching and changing the sign, so it can help. Then I will try to use the look-up table for some multiplications. If it is still too slow, I will rewrite the line drawing parts to a readable format and share them. As it's my first assembler program, it's a bit of a mess and the line drawing consists of several parts, because I use several line types with several types of buffering.

Re: Any good examples of 3D vector programming?

Posted: Sun Jan 02, 2022 5:12 pm
by Art
So, I used "Square Table 8-bit * 8-bit Signed", which I slightly adapted for my program and it works well. I would also use an "8-bit * 8-bit unsigned", but unfortunately it wouldn't be possible this way using just register pairs.

I also wanted to use "Restoring 16-bit / 8-bit Unsigned", which I also found on several other sites, but it doesn't work for some numbers. For example 1280/180 gives 0. Is there such a routine which divides 16-bit / 8-bit number and works well? I can use 16-bit / 16-bit which works, but I was thinking about something faster.

Re: Any good examples of 3D vector programming?

Posted: Sun Jan 02, 2022 6:05 pm
by catmeows
You are correct, the 16 / 8 division is bugged. Or it seems to me it is rather 16 / 7. It is an interesting problem; I will try to come up with a working 16 / 8.

Re: Any good examples of 3D vector programming?

Posted: Sun Jan 02, 2022 6:37 pm
by Art
Yes, when it divides by smaller values, it works, but when I traced it for 1280 / 180, it shifted hl 16 times to the left, so hl was 0. I use a division for the perspective projection, so near objects are OK, but distant objects are distorted. It's interesting that this routine can be found in several collections without any remarks about this problem.

Re: Any good examples of 3D vector programming?

Posted: Sun Jan 02, 2022 9:20 pm
by catmeows
For a looped version, I don't think I can save substantial time compared with unsigned 16 / 16.

However, for an unrolled version the code may work and be somewhat faster than 16/16. Anyway, I'm not in the mood to test it today.

Code:

  ld a, 255
  ld c, a
  ;HL = dividend
  ;will have result in AC
  or a
  sbc hl, de
  jr nc, trylesser1
  add hl, de
  res 7, a
  or a
  sbc hl, de
  jr nc, trylesser2
  add hl, de
  res 6, a
  or a
  sbc hl, de
  jr nc, trylesser3
  add hl, de
  res 5, a
  ;and so on, second octet will reset bits in c
  or a
  sbc hl, de
  jr nc, trylesser15
  add hl, de
  res 1, c
  or a
  sbc hl, de
  jr nc, exit
  add hl, de
  res 0, c

Re: Any good examples of 3D vector programming?

Posted: Mon Jan 03, 2022 11:38 am
by Art
Thank you for your work! I don't know how to test its speed, but I will examine it.

Re: Any good examples of 3D vector programming?

Posted: Mon Jan 03, 2022 3:16 pm
by Art
Art wrote: Sun Jan 02, 2022 5:12 pm
So, I used "Square Table 8-bit * 8-bit Signed", which I slightly adapted for my program and it works well. I would also use an "8-bit * 8-bit unsigned", but unfortunately it wouldn't be possible this way using just register pairs.

I also wanted to use "Restoring 16-bit / 8-bit Unsigned", which I also found on several other sites, but it doesn't work for some numbers. For example 1280/180 gives 0. Is there such a routine which divides 16-bit / 8-bit number and works well? I can use 16-bit / 16-bit which works, but I was thinking about something faster.
By the way, "Square Table 8-bit * 8-bit Signed" on that site is also bugged, but I found an easy fix while tracing the routine:
"jp pe,Plus" should be "jp po,Plus". Maybe the author of this routine collection tested them only with small numbers.

Re: Any good examples of 3D vector programming?

Posted: Tue Jan 04, 2022 10:38 pm
by Art
catmeows wrote: Sun Jan 02, 2022 9:20 pm For a looped version, I don't think I can save substantial time compared with unsigned 16 / 16.

However, for an unrolled version the code may work and be somewhat faster than 16/16. Anyway, I'm not in the mood to test it today.

I tried it, but unfortunately, during my tests, the result was always 0.

Re: Any good examples of 3D vector programming?

Posted: Wed Jan 05, 2022 3:10 am
by presh
Art wrote: Sun Jan 02, 2022 5:12 pm Is there such a routine which divides 16-bit / 8-bit number and works well?
Have you tried this one? ... B_fast.z80

Loads of good stuff on @ZedaZ80's GitHub, worth having a dig around to see if anything else inspires you!

Re: Any good examples of 3D vector programming?

Posted: Wed Jan 05, 2022 3:17 pm
by Art
presh wrote: Wed Jan 05, 2022 3:10 am
Art wrote: Sun Jan 02, 2022 5:12 pm Is there such a routine which divides 16-bit / 8-bit number and works well?
Have you tried this one? ... B_fast.z80

Loads of good stuff on @ZedaZ80's github, worth having a dig around to see if anything else inspires you!
Thank you for the link, it looks like a useful collection. I tried the HL_Div_B_fast now, but unfortunately it works only if the result is an 8-bit number. For example 1327/2 doesn't work.

Re: Any good examples of 3D vector programming?

Posted: Wed Jan 05, 2022 5:10 pm
by catmeows
Art wrote: Wed Jan 05, 2022 3:17 pm
presh wrote: Wed Jan 05, 2022 3:10 am
Have you tried this one? ... B_fast.z80

Loads of good stuff on @ZedaZ80's github, worth having a dig around to see if anything else inspires you!
Thank you for the link, it looks like a useful collection. I tried the HL_Div_B_fast now, but unfortunately it works only if the result is an 8-bit number. For example 1327/2 doesn't work.
This is starting to be quite a comedy :) Here is tested code:

Code:

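	.ORG 40000

	ld hl, 0	;dividend (placeholder, load the real value here)
	ld c, 0		;divisor (placeholder)
	ld b, 16	;16 quotient bits
	xor a		;A = running remainder
loop:
	add hl, hl	;11, shift dividend left, top bit -> carry
	rla 		;4, shift that bit into the remainder
	jr c, divide1   ;7, remainder overflowed 8 bits: always subtract
	cp c		;4
	jr c, divide1a  ;7, remainder < divisor: quotient bit stays 0
divide1:
	sub c           ;4
	inc hl          ;6 -> 43 ticks, set quotient bit
divide1a:
	djnz loop
	ld c, l		;quotient HL -> BC (the value USR returns to BASIC)
	ld b, h		;remainder is left in A
	ret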
I believe you are able to unroll the code.
Speedwise, the longest branch is a few ticks shorter than the shortest branch of "Restoring 16 / 16 Unsigned", so there may be a slight improvement.